Harish-as-harry committed
Commit f75bcae · verified · 1 Parent(s): 38e7499

Delete checkpoint-200
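
For reference, a folder deletion like this "Delete checkpoint-200" commit can also be produced programmatically with `huggingface_hub`. A minimal sketch, assuming a valid write token and a placeholder repo id (the real repo id is not restated here):

```python
from huggingface_hub import HfApi, CommitOperationDelete

api = HfApi()  # picks up the token from HF_TOKEN or the local login cache

# Remove the whole checkpoint folder in a single commit.
# "your-username/your-model-repo" is a placeholder, not taken from this page.
api.create_commit(
    repo_id="your-username/your-model-repo",
    operations=[CommitOperationDelete(path_in_repo="checkpoint-200/", is_folder=True)],
    commit_message="Delete checkpoint-200",
)
```
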
checkpoint-200/README.md DELETED
@@ -1,202 +0,0 @@
- ---
- base_model: Qwen/Qwen2.5-Coder-7B
- library_name: peft
- ---
-
- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-
-
-
- ## Model Details
-
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
-
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]
- ### Framework versions
-
- - PEFT 0.12.0
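
The deleted card's "How to Get Started with the Model" section was left as a placeholder. For context, a minimal sketch of how a PEFT LoRA checkpoint like this one is typically loaded on top of the `Qwen/Qwen2.5-Coder-7B` base model; the adapter directory below is a local placeholder, not a published repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Coder-7B"
adapter_dir = "./checkpoint-200"  # placeholder: a local copy of the folder removed by this commit

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the LoRA adapter weights (adapter_config.json + adapter_model.safetensors).
model = PeftModel.from_pretrained(base_model, adapter_dir)
model.eval()

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```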
 
checkpoint-200/adapter_config.json DELETED
@@ -1,34 +0,0 @@
- {
- "alpha_pattern": {},
- "auto_mapping": null,
- "base_model_name_or_path": "Qwen/Qwen2.5-Coder-7B",
- "bias": "none",
- "fan_in_fan_out": false,
- "inference_mode": true,
- "init_lora_weights": true,
- "layer_replication": null,
- "layers_pattern": null,
- "layers_to_transform": null,
- "loftq_config": {},
- "lora_alpha": 32,
- "lora_dropout": 0.1,
- "megatron_config": null,
- "megatron_core": "megatron.core",
- "modules_to_save": null,
- "peft_type": "LORA",
- "r": 16,
- "rank_pattern": {},
- "revision": null,
- "target_modules": [
- "q_proj",
- "k_proj",
- "up_proj",
- "down_proj",
- "gate_proj",
- "o_proj",
- "v_proj"
- ],
- "task_type": "CAUSAL_LM",
- "use_dora": false,
- "use_rslora": false
- }
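
As context for the settings above, a minimal sketch of the equivalent `peft.LoraConfig` that would reproduce this adapter configuration when fine-tuning; the values are copied from the deleted file, while the variable name is illustrative only:

```python
from peft import LoraConfig, TaskType

# Mirrors the deleted checkpoint-200/adapter_config.json:
# rank 16, alpha 32 (effective scaling alpha/r = 2.0), dropout 0.1,
# LoRA applied to all attention and MLP projection matrices.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "up_proj", "down_proj", "gate_proj"],
)
```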
 
checkpoint-200/adapter_model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:49840fc86515e540dc4067a705e4e947842a7182a45cd391701fbde2cea8c64f
- size 161533192
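
Note that the diff above shows the Git LFS pointer file, not the adapter weights themselves: the real `adapter_model.safetensors` (roughly 160 MB) lives in LFS storage and is identified by the SHA-256 oid. A small sketch of how such a pointer can be checked against a locally downloaded copy (the path is a placeholder):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large checkpoints don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the oid recorded in the LFS pointer above.
expected = "49840fc86515e540dc4067a705e4e947842a7182a45cd391701fbde2cea8c64f"
print(sha256_of("adapter_model.safetensors") == expected)  # placeholder local path
```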
 
checkpoint-200/added_tokens.json DELETED
@@ -1,24 +0,0 @@
- {
- "</tool_call>": 151658,
- "<tool_call>": 151657,
- "<|box_end|>": 151649,
- "<|box_start|>": 151648,
- "<|endoftext|>": 151643,
- "<|file_sep|>": 151664,
- "<|fim_middle|>": 151660,
- "<|fim_pad|>": 151662,
- "<|fim_prefix|>": 151659,
- "<|fim_suffix|>": 151661,
- "<|im_end|>": 151645,
- "<|im_start|>": 151644,
- "<|image_pad|>": 151655,
- "<|object_ref_end|>": 151647,
- "<|object_ref_start|>": 151646,
- "<|quad_end|>": 151651,
- "<|quad_start|>": 151650,
- "<|repo_name|>": 151663,
- "<|video_pad|>": 151656,
- "<|vision_end|>": 151653,
- "<|vision_pad|>": 151654,
- "<|vision_start|>": 151652
- }
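
These entries record the added/special tokens and their fixed IDs at the top of the Qwen2 vocabulary (151643-151664). A quick, hedged way to confirm the mapping from a loaded tokenizer, sketched with a placeholder path:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./checkpoint-200")  # placeholder local path

# Should print the IDs recorded in added_tokens.json, e.g. 151643 and 151644.
print(tokenizer.convert_tokens_to_ids("<|endoftext|>"))
print(tokenizer.convert_tokens_to_ids("<|im_start|>"))
```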
 
checkpoint-200/merges.txt DELETED
The diff for this file is too large to render. See raw diff
 
checkpoint-200/optimizer.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:a265c9dd28ae5933c7e78b76975a0e2d10aa210e2cb3259fa8a0708b52b4324a
- size 323290986
 
checkpoint-200/rng_state_0.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:b44a482d877d1e427529771f4afc5b8148e12ae31b70daa89b564cb90f2e0365
- size 14512
 
checkpoint-200/rng_state_1.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:55d2765fb13b4f4f1821493596447940bdb6e6eba3d84dba8e4cb81cd6c91592
- size 14512
 
checkpoint-200/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:a27b052646dcb561cbd68156c30bd466ce59bda64cf3c8eba9c3c1113af9827c
- size 1064
 
checkpoint-200/special_tokens_map.json DELETED
@@ -1,31 +0,0 @@
- {
- "additional_special_tokens": [
- "<|im_start|>",
- "<|im_end|>",
- "<|object_ref_start|>",
- "<|object_ref_end|>",
- "<|box_start|>",
- "<|box_end|>",
- "<|quad_start|>",
- "<|quad_end|>",
- "<|vision_start|>",
- "<|vision_end|>",
- "<|vision_pad|>",
- "<|image_pad|>",
- "<|video_pad|>"
- ],
- "eos_token": {
- "content": "<|endoftext|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
- "pad_token": {
- "content": "<|endoftext|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- }
- }
 
checkpoint-200/tokenizer.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
- size 11421896
 
checkpoint-200/tokenizer_config.json DELETED
@@ -1,208 +0,0 @@
- {
- "add_bos_token": false,
- "add_prefix_space": false,
- "added_tokens_decoder": {
- "151643": {
- "content": "<|endoftext|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151644": {
- "content": "<|im_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151645": {
- "content": "<|im_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151646": {
- "content": "<|object_ref_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151647": {
- "content": "<|object_ref_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151648": {
- "content": "<|box_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151649": {
- "content": "<|box_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151650": {
- "content": "<|quad_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151651": {
- "content": "<|quad_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151652": {
- "content": "<|vision_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151653": {
- "content": "<|vision_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151654": {
- "content": "<|vision_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151655": {
- "content": "<|image_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151656": {
- "content": "<|video_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151657": {
- "content": "<tool_call>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151658": {
- "content": "</tool_call>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151659": {
- "content": "<|fim_prefix|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151660": {
- "content": "<|fim_middle|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151661": {
- "content": "<|fim_suffix|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151662": {
- "content": "<|fim_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151663": {
- "content": "<|repo_name|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151664": {
- "content": "<|file_sep|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- }
- },
- "additional_special_tokens": [
- "<|im_start|>",
- "<|im_end|>",
- "<|object_ref_start|>",
- "<|object_ref_end|>",
- "<|box_start|>",
- "<|box_end|>",
- "<|quad_start|>",
- "<|quad_end|>",
- "<|vision_start|>",
- "<|vision_end|>",
- "<|vision_pad|>",
- "<|image_pad|>",
- "<|video_pad|>"
- ],
- "bos_token": null,
- "chat_template": "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% endif %}{% if system_message is defined %}{{ system_message + '\n' }}{% endif %}{% for message in loop_messages %}{% set content = message['content'] %}{% if message['role'] == 'user' %}{{ 'Human: ' + content + '\nAssistant:' }}{% elif message['role'] == 'assistant' %}{{ content + '<|endoftext|>' + '\n' }}{% endif %}{% endfor %}",
- "clean_up_tokenization_spaces": false,
- "eos_token": "<|endoftext|>",
- "errors": "replace",
- "model_max_length": 32768,
- "pad_token": "<|endoftext|>",
- "padding_side": "right",
- "split_special_tokens": false,
- "tokenizer_class": "Qwen2Tokenizer",
- "unk_token": null
- }
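
The most notable field above is `chat_template`: it formats conversations as a plain `Human:` / `Assistant:` exchange rather than Qwen's default ChatML. A small sketch of how that template would be exercised through the standard `transformers` API; the local path is a placeholder for wherever a copy of these tokenizer files lives:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./checkpoint-200")  # placeholder local path

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a function that reverses a string."},
]

# With the template shown above, this should render roughly as:
#   You are a helpful coding assistant.
#   Human: Write a function that reverses a string.
#   Assistant:
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```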
 
checkpoint-200/trainer_state.json DELETED
@@ -1,313 +0,0 @@
- {
- "best_metric": null,
- "best_model_checkpoint": null,
- "epoch": 0.9975062344139651,
- "eval_steps": 500,
- "global_step": 200,
- "is_hyper_param_search": false,
- "is_local_process_zero": true,
- "is_world_process_zero": true,
- "log_history": [
- {
- "epoch": 0.02493765586034913,
- "grad_norm": 0.04108215495944023,
- "learning_rate": 4.99229333433282e-05,
- "loss": 0.4044,
- "step": 5
- },
- {
- "epoch": 0.04987531172069826,
- "grad_norm": 0.025228893384337425,
- "learning_rate": 4.9692208514878444e-05,
- "loss": 0.391,
- "step": 10
- },
- {
- "epoch": 0.07481296758104738,
- "grad_norm": 0.025317281484603882,
- "learning_rate": 4.9309248009941914e-05,
- "loss": 0.3583,
- "step": 15
- },
- {
- "epoch": 0.09975062344139651,
- "grad_norm": 0.023856163024902344,
- "learning_rate": 4.877641290737884e-05,
- "loss": 0.3771,
- "step": 20
- },
- {
- "epoch": 0.12468827930174564,
- "grad_norm": 0.02556118369102478,
- "learning_rate": 4.8096988312782174e-05,
- "loss": 0.3408,
- "step": 25
- },
- {
- "epoch": 0.14962593516209477,
- "grad_norm": 0.026844358071684837,
- "learning_rate": 4.72751631047092e-05,
- "loss": 0.3665,
- "step": 30
- },
- {
- "epoch": 0.1745635910224439,
- "grad_norm": 0.02702774479985237,
- "learning_rate": 4.6316004108852305e-05,
- "loss": 0.3508,
- "step": 35
- },
- {
- "epoch": 0.19950124688279303,
- "grad_norm": 0.02752436138689518,
- "learning_rate": 4.522542485937369e-05,
- "loss": 0.3558,
- "step": 40
- },
- {
- "epoch": 0.22443890274314215,
- "grad_norm": 0.029124287888407707,
- "learning_rate": 4.401014914000078e-05,
- "loss": 0.3579,
- "step": 45
- },
- {
- "epoch": 0.24937655860349128,
- "grad_norm": 0.03660265728831291,
- "learning_rate": 4.267766952966369e-05,
- "loss": 0.3333,
- "step": 50
- },
- {
- "epoch": 0.2743142144638404,
- "grad_norm": 0.027973175048828125,
- "learning_rate": 4.123620120825459e-05,
- "loss": 0.3263,
- "step": 55
- },
- {
- "epoch": 0.29925187032418954,
- "grad_norm": 0.03005453571677208,
- "learning_rate": 3.969463130731183e-05,
- "loss": 0.3476,
- "step": 60
- },
- {
- "epoch": 0.32418952618453867,
- "grad_norm": 0.03571762889623642,
- "learning_rate": 3.8062464117898724e-05,
- "loss": 0.3294,
- "step": 65
- },
- {
- "epoch": 0.3491271820448878,
- "grad_norm": 0.03382609412074089,
- "learning_rate": 3.634976249348867e-05,
- "loss": 0.356,
- "step": 70
- },
- {
- "epoch": 0.3740648379052369,
- "grad_norm": 0.03247227147221565,
- "learning_rate": 3.456708580912725e-05,
- "loss": 0.3304,
- "step": 75
- },
- {
- "epoch": 0.39900249376558605,
- "grad_norm": 0.03641534969210625,
- "learning_rate": 3.272542485937369e-05,
- "loss": 0.3436,
- "step": 80
- },
- {
- "epoch": 0.4239401496259352,
- "grad_norm": 0.03509656339883804,
- "learning_rate": 3.083613409639764e-05,
- "loss": 0.3388,
- "step": 85
- },
- {
- "epoch": 0.4488778054862843,
- "grad_norm": 0.03508240357041359,
- "learning_rate": 2.8910861626005776e-05,
- "loss": 0.3367,
- "step": 90
- },
- {
- "epoch": 0.47381546134663344,
- "grad_norm": 0.04164772108197212,
- "learning_rate": 2.6961477393196126e-05,
- "loss": 0.3561,
- "step": 95
- },
- {
- "epoch": 0.49875311720698257,
- "grad_norm": 0.03994197025895119,
- "learning_rate": 2.5e-05,
- "loss": 0.3182,
- "step": 100
- },
- {
- "epoch": 0.5236907730673317,
- "grad_norm": 0.038153909146785736,
- "learning_rate": 2.303852260680388e-05,
- "loss": 0.3331,
- "step": 105
- },
- {
- "epoch": 0.5486284289276808,
- "grad_norm": 0.03781621903181076,
- "learning_rate": 2.1089138373994223e-05,
- "loss": 0.3383,
- "step": 110
- },
- {
- "epoch": 0.57356608478803,
- "grad_norm": 0.04167185723781586,
- "learning_rate": 1.9163865903602374e-05,
- "loss": 0.3366,
- "step": 115
- },
- {
- "epoch": 0.5985037406483791,
- "grad_norm": 0.042504750192165375,
- "learning_rate": 1.7274575140626318e-05,
- "loss": 0.3324,
- "step": 120
- },
- {
- "epoch": 0.6234413965087282,
- "grad_norm": 0.043039821088314056,
- "learning_rate": 1.5432914190872757e-05,
- "loss": 0.303,
- "step": 125
- },
- {
- "epoch": 0.6483790523690773,
- "grad_norm": 0.04201623052358627,
- "learning_rate": 1.3650237506511331e-05,
- "loss": 0.35,
- "step": 130
- },
- {
- "epoch": 0.6733167082294265,
- "grad_norm": 0.04364593327045441,
- "learning_rate": 1.1937535882101281e-05,
- "loss": 0.331,
- "step": 135
- },
- {
- "epoch": 0.6982543640897756,
- "grad_norm": 0.04168755188584328,
- "learning_rate": 1.0305368692688174e-05,
- "loss": 0.344,
- "step": 140
- },
- {
- "epoch": 0.7231920199501247,
- "grad_norm": 0.04454099014401436,
- "learning_rate": 8.763798791745411e-06,
- "loss": 0.3567,
- "step": 145
- },
- {
- "epoch": 0.7481296758104738,
- "grad_norm": 0.07072897255420685,
- "learning_rate": 7.3223304703363135e-06,
- "loss": 0.3184,
- "step": 150
- },
- {
- "epoch": 0.773067331670823,
- "grad_norm": 0.04532039538025856,
- "learning_rate": 5.989850859999227e-06,
- "loss": 0.3279,
- "step": 155
- },
- {
- "epoch": 0.7980049875311721,
- "grad_norm": 0.04330899938941002,
- "learning_rate": 4.7745751406263165e-06,
- "loss": 0.3372,
- "step": 160
- },
- {
- "epoch": 0.8229426433915212,
- "grad_norm": 0.04454395920038223,
- "learning_rate": 3.6839958911476957e-06,
- "loss": 0.3553,
- "step": 165
- },
- {
- "epoch": 0.8478802992518704,
- "grad_norm": 0.05742543563246727,
- "learning_rate": 2.7248368952908053e-06,
- "loss": 0.3361,
- "step": 170
- },
- {
- "epoch": 0.8728179551122195,
- "grad_norm": 0.04155350476503372,
- "learning_rate": 1.9030116872178316e-06,
- "loss": 0.3335,
- "step": 175
- },
- {
- "epoch": 0.8977556109725686,
- "grad_norm": 0.04169079288840294,
- "learning_rate": 1.2235870926211619e-06,
- "loss": 0.3425,
- "step": 180
- },
- {
- "epoch": 0.9226932668329177,
- "grad_norm": 0.04245175048708916,
- "learning_rate": 6.907519900580861e-07,
- "loss": 0.3247,
- "step": 185
- },
- {
- "epoch": 0.9476309226932669,
- "grad_norm": 0.03987717628479004,
- "learning_rate": 3.077914851215585e-07,
- "loss": 0.3314,
- "step": 190
- },
- {
- "epoch": 0.972568578553616,
- "grad_norm": 0.11724965274333954,
- "learning_rate": 7.706665667180091e-08,
- "loss": 0.3354,
- "step": 195
- },
- {
- "epoch": 0.9975062344139651,
- "grad_norm": 0.04020114243030548,
- "learning_rate": 0.0,
- "loss": 0.3413,
- "step": 200
- }
- ],
- "logging_steps": 5,
- "max_steps": 200,
- "num_input_tokens_seen": 0,
- "num_train_epochs": 1,
- "save_steps": 100,
- "stateful_callbacks": {
- "TrainerControl": {
- "args": {
- "should_epoch_stop": false,
- "should_evaluate": false,
- "should_log": false,
- "should_save": true,
- "should_training_stop": true
- },
- "attributes": {}
- }
- },
- "total_flos": 3.3406622246696387e+18,
- "train_batch_size": 8,
- "trial_name": null,
- "trial_params": null
- }
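
The learning-rate column in `log_history` above traces a standard cosine decay with no warmup: it starts near 5e-5 and reaches 0.0 exactly at step 200. A short sketch that reproduces those logged values under that assumption (the peak LR and step count are read from the log; the schedule itself is inferred):

```python
import math

PEAK_LR = 5e-5    # inferred from the log, e.g. 2.5e-05 at the halfway point
MAX_STEPS = 200   # "max_steps" in the deleted trainer_state.json

def cosine_lr(step: int) -> float:
    """Cosine decay from PEAK_LR to 0 with no warmup."""
    progress = step / MAX_STEPS
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))

# Compare against the logged values: ~4.992e-05 at step 5, 2.5e-05 at step 100,
# ~7.7e-08 at step 195, and 0.0 at step 200.
for step in (5, 100, 195, 200):
    print(step, cosine_lr(step))
```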
 
checkpoint-200/training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:f85ee3253ce2430836b5bfdd00d31d93672e95e995579ef99ce619afab5ba0a5
- size 5432
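
`training_args.bin` holds a pickled `transformers.TrainingArguments` object rather than model weights. If a copy of the deleted file were still on disk, it could be inspected roughly like this; the path is a placeholder, and recent PyTorch versions require opting out of weights-only loading for pickled Python objects, so only do this with files you trust:

```python
import torch

# Placeholder path to a local copy of the deleted file.
args = torch.load("training_args.bin", weights_only=False)

print(type(args))  # expected: transformers.TrainingArguments
print(args.learning_rate, args.per_device_train_batch_size)
```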
 
checkpoint-200/vocab.json DELETED
The diff for this file is too large to render. See raw diff