yjf9966 committed on
Commit
d13b1f7
·
1 Parent(s): f6b445c

Update README.md

  1. README.md +128 -0
README.md CHANGED
@@ -1,3 +1,131 @@
1
  ---
2
  license: apache-2.0
3
  ---
1
  ---
2
  license: apache-2.0
3
+ datasets:
4
+ - BAAI/COIG-PC
5
+ language:
6
+ - zh
7
+ library_name: transformers
8
+ pipeline_tag: text-generation
9
  ---
10
+
11
+ # Model Card for Chinese-Alpaca-COIG-49954-7B-HF
12
+
13
+ <!-- Provide a quick summary of what the model is/does. -->
14
+
15
+ This is an experimental model intended as a starting point for building new Chinese-language LLMs. It was produced with [Chinese-LLaMA-Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca).
16
+ ## Model Details
17
+
18
+ ### Model Description
19
+
20
+ <!-- Provide a longer summary of what this model is. -->
21
+
22
+
23
+ - **Developed by:** yjf9966
24
+ - **Shared by [optional]:** yjf9966
25
+ - **Model type:** LLaMA with an extended tokenizer (vocabulary size 49954)
26
+ - **Language(s) (NLP):** Chinese
27
+ - **License:** Apache 2.0
28
+ - **Finetuned from model:** [Chinese-LLaMA-Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)
29
+
30
+ ### Model Sources [optional]
31
+
32
+ <!-- Provide the basic links for the model. -->
33
+
34
+ - **Repository:** https://huggingface.co/BlueWhaleX/Chinese-Alpaca-COIG-49954-7B-HF
35
+
36
+ ## Uses
37
+
38
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
+
40
+ You can use the raw model for text generation (next-token prediction), but it is mostly intended to be fine-tuned on a downstream task.
41
+ Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering.
42
+
43
+
44
+ ## Bias, Risks, and Limitations
45
+
46
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
47
+
48
+ Even if the training data used for this model could be characterized as fairly neutral, this model can have biased predictions.
49
+ It also inherits some of the bias of its base model and its training dataset.
50
+
51
+ ### Recommendations
52
+
53
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
54
+
55
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
56
+
57
+ ## How to Get Started with the Model
58
+
59
+ Use the code below to get started with the model.
60
+
61
+ ```
+ import torch
+ from transformers import LlamaTokenizer, LlamaForCausalLM
+
+ def generate_prompt(text):
+     # Wrap the instruction in the Alpaca-style prompt template.
+     return (
+         "Below is an instruction that describes a task. "
+         "Write a response that appropriately completes the request.\n\n"
+         f"### Instruction:\n\n{text}\n\n### Response:\n\n"
+     )
+
+ # Load the tokenizer and the model in half precision on the GPU.
+ tokenizer = LlamaTokenizer.from_pretrained('BlueWhaleX/Chinese-Alpaca-COIG-49954-7B-HF')
+ model = LlamaForCausalLM.from_pretrained('BlueWhaleX/Chinese-Alpaca-COIG-49954-7B-HF').half().cuda()
+ model.eval()
+
+ text = '王国维说:“自周之衰,文王、周公势力之瓦解也,国民之智力成熟于内,政治之纷乱乘之于外,上无统一之制度,下迫于社会之要求,于是诸于九流各创其学说。” 他意在说明 A. 分封制的崩溃 B. 商鞅变法的作用 C. 兼并战争的后果 D. 百家争鸣的原因'
+ prompt = generate_prompt(text)
+ input_ids = tokenizer.encode(prompt, return_tensors='pt').to('cuda')
+
+ with torch.no_grad():
+     output_ids = model.generate(
+         input_ids=input_ids,
+         max_new_tokens=400,
+         do_sample=True,  # needed for temperature/top_k/top_p to take effect
+         temperature=0.2,
+         top_k=40,
+         top_p=0.9,
+         repetition_penalty=1.3,
+     )
+ # Decode the full sequence and keep only the text after the response marker.
+ output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+ response = output.split("### Response:")[1].strip()
+ print("Response: ", response, '\n')
+ ```
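The prompt construction and response extraction used in the snippet above can be tried without loading the model. The following is a minimal sketch, assuming the Alpaca-style template shown there; `build_prompt` and `extract_response` are hypothetical helper names introduced for illustration and are not part of the repository.

```python
# Hypothetical helpers (not from the repository) that mirror the
# Alpaca-style prompt format used in the getting-started snippet.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n{instruction}\n\n### Response:\n\n"
)
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Insert the user instruction into the prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    If the marker is missing, the whole decoded string is returned
    (stripped) instead of raising an IndexError.
    """
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if marker else decoded.strip()


if __name__ == "__main__":
    prompt = build_prompt("用一句话介绍你自己。")
    # Simulate a decoded model output: the prompt followed by generated text.
    fake_output = prompt + "我是一个中文语言模型。"
    print(extract_response(fake_output))
```

Unlike `output.split("### Response:")[1]`, this variant keeps everything after the first marker and does not fail when the marker is absent, which may be preferable when generations are truncated.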
91
+
92
+
93
+ ## Training Details
94
+
95
+ ### Training Data
96
+
97
+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
98
+
99
+ BAAI/COIG-PC
100
+
101
+ ### Training Procedure
102
+
103
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
104
+
105
+ #### Preprocessing [optional]
106
+
107
+ The dataset was split into 80% for training and 20% for testing.
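An 80/20 split like the one described can be sketched as follows. The card does not say how the split was performed, so the random shuffle, the fixed seed, and the function name `split_records` are assumptions made for illustration only.

```python
import random


def split_records(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of `records` and split it into (train, test).

    The seed and the shuffle are assumptions; the model card only
    states the 80%/20% proportions.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # deterministic for a given seed
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_records(range(10))
    print(len(train), len(test))  # 8 2
```

Fixing the seed keeps the held-out 20% stable across runs, so evaluation numbers stay comparable.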
108
+
109
+
110
+ #### Training Hyperparameters
111
+
112
+ - **Training regime:** fp16 mixed precision, lr=1e-4, lora_rank=8, lora_alpha=32 <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
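To give a feel for what `lora_rank=8` and `lora_alpha=32` imply, the sketch below computes the standard LoRA scaling factor (`alpha / rank`) and the number of adapter parameters for a single weight matrix. The `LoraSettings` class is hypothetical, and the 4096×4096 projection is an assumed example shape for a 7B LLaMA layer; the card does not state which modules were adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the hyperparameters listed in the card."""
    rank: int = 8
    alpha: int = 32
    learning_rate: float = 1e-4

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (rank, d_in) and B has shape (d_out, rank).
        return self.rank * (d_in + d_out)


if __name__ == "__main__":
    settings = LoraSettings()
    hidden = 4096  # assumed example: one square projection in a 7B LLaMA
    print(settings.scaling)                         # 4.0
    print(settings.adapter_params(hidden, hidden))  # 65536
    print(hidden * hidden)                          # 16777216 frozen weights
```

Under these assumptions, the adapter adds 65,536 trainable parameters next to a frozen matrix of about 16.8 million, which is roughly 0.4% of that matrix.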
113
+
114
+
115
+ ## Evaluation
116
+
117
+ #### Testing Data
118
+
119
+ <!-- This should link to a Data Card if possible. -->
120
+ 20% of the BAAI/COIG-PC dataset.
121
+
122
+ ## Citation
123
+
124
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
125
+ ```
126
+ @software{Chinese-Alpaca-COIG-49954-7B-HF,
127
+ title={An Enhanced Chinese Language Model based on the Chinese-Alpaca},
128
+ url={https://huggingface.co/BlueWhaleX/Chinese-Alpaca-COIG-49954-7B-HF},
129
+ year={2023}
130
+ }
131
+ ```