REILX/Llama-3-8B-Instruct-Chinese-Lora


REILX committed on Jun 7, 2024

Commit fec95ed · verified · 1 Parent(s): 9ed7e6f

Update README.md

Files changed (1)

README.md +42 -3


@@ -1,3 +1,42 @@
----
-license: llama3
----

+---
+license: llama3
+datasets:
+- silk-road/alpaca-data-gpt4-chinese
+- TigerResearch/sft_zh
+- LooksJuicy/ruozhiba
+- leo009/alpaca-cleaned-zh-cn
+- REILX/extracted_tagengo_gpt4
+language:
+- en
+- zh
+---
+### Model:
+- https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
+### Datasets:
+- https://huggingface.co/datasets/TigerResearch/sft_zh
+- https://huggingface.co/datasets/silk-road/alpaca-data-gpt4-chinese
+- https://huggingface.co/datasets/REILX/extracted_tagengo_gpt4
+- https://huggingface.co/datasets/LooksJuicy/ruozhiba
+- https://huggingface.co/datasets/leo009/alpaca-cleaned-zh-cn
+(The datasets above were cleaned with langid to remove non-Chinese samples.)
+### Training tool
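Review note: the README names langid as the cleaning tool but does not show the filtering step. Below is a minimal sketch of how such a filter might look. It assumes each sample is a dict with a hypothetical `text` field and that a sample is kept when the detector labels it `zh`; neither detail is confirmed by the commit. The classifier is passed in as a parameter, so the filter can be tried without langid installed.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

# A classifier maps text to (language_code, score), like langid.classify.
Classifier = Callable[[str], Tuple[str, float]]


def default_classifier() -> Classifier:
    """Return langid.classify, importing langid only when it is needed."""
    import langid  # third-party package: pip install langid

    return langid.classify


def keep_chinese(
    records: Iterable[Dict[str, str]],
    classify: Optional[Classifier] = None,
    field: str = "text",  # hypothetical field name, not taken from the README
) -> Iterator[Dict[str, str]]:
    """Yield only the records whose `field` is detected as Chinese ("zh")."""
    if classify is None:
        classify = default_classifier()
    for record in records:
        text = record.get(field, "")
        if not text.strip():
            continue  # skip empty samples instead of classifying them
        lang, _score = classify(text)
        if lang == "zh":
            yield record
```

With langid installed, `list(keep_chinese(samples))` would return the samples detected as Chinese. Short or mixed-language strings can be misclassified, so a real pipeline may need a length threshold or manual spot checks.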
+https://github.com/hiyouga/LLaMA-Factory
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
+- total_eval_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 3.0
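Review note: the two totals in the list above follow from per-device batch size × number of devices × gradient accumulation steps. The sketch below reproduces them and also approximates a cosine schedule with linear warmup. The warmup-then-cosine shape is an assumption modeled on common trainer implementations, since this commit does not show the scheduler code. The function names are illustrative.

```python
import math


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def cosine_lr(step: int, total_steps: int, peak_lr: float = 5e-5,
              warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Assumed shape only; the exact scheduler used for training is not shown.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values taken from the hyperparameter list in the README.
train_total = total_batch_size(per_device=4, num_devices=8, grad_accum_steps=4)
eval_total = total_batch_size(per_device=8, num_devices=8)  # no accumulation at eval
print(train_total, eval_total)
```

The printed totals should match `total_train_batch_size: 128` and `total_eval_batch_size: 64` from the list.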