---
license: apache-2.0
datasets:
- BAAI/IndustryInstruction
- BAAI/IndustryInstruction_Aerospace
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
tags:
- 航空航天
- 中英文语言模型
---

This model is fine-tuned from llama3.1-8b-instruct on the [BAAI/IndustryInstruction_Aerospace](https://huggingface.co/datasets/BAAI/IndustryInstruction_Aerospace) dataset. For details about the data, see the parent repository: [BAAI/IndustryInstruction](https://huggingface.co/datasets/BAAI/IndustryInstruction).

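If you want to inspect the fine-tuning data, it can be loaded directly from the Hub with the `datasets` library. A minimal sketch, assuming the dataset ships a `train` split (check the dataset card if it differs):

```python
# Minimal sketch: browse the fine-tuning data with the `datasets` library.
# The "train" split name is an assumption; adjust it to the dataset's actual splits.
from datasets import load_dataset

ds = load_dataset("BAAI/IndustryInstruction_Aerospace", split="train")
print(ds)      # schema and row count
print(ds[0])   # first instruction/response record
```
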
## Training params

The model was trained with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) using `template=llama3`.

```
learning_rate=1e-5
lr_scheduler_type=cosine
max_length=2048
warmup_ratio=0.05
batch_size=64
epoch=10
```

The best checkpoint was selected by the evaluation loss.
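Since LLaMA-Factory trains through the `transformers` `Trainer`, each run writes a `trainer_state.json` whose `log_history` records `eval_loss` at every evaluation step, so the selection can be scripted. A minimal sketch under that assumption (the output directory is a hypothetical path, not this repo's actual training run):

```python
# Minimal sketch: pick the checkpoint with the lowest eval_loss from a
# transformers-style trainer_state.json. The output-dir layout is an
# assumption, not taken from this repo's actual training run.
import json
from pathlib import Path

def best_checkpoint(output_dir: str) -> str:
    state = json.loads(Path(output_dir, "trainer_state.json").read_text())
    evals = [e for e in state["log_history"] if "eval_loss" in e]  # eval entries only
    best = min(evals, key=lambda e: e["eval_loss"])
    return str(Path(output_dir, f"checkpoint-{best['step']}"))

print(best_checkpoint("saves/llama3_1-8b-aerospace"))  # hypothetical path
```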

## Evaluation

Due to the absence of an evaluation benchmark for this domain, we could not evaluate the model.

## How to use

```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# ==================================================================
# [Author] : xiaofeng
# [Descriptions] : inference example for Aerospace-llama3_1_8B_instruct
# ==================================================================

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch


# Llama-3 chat template: an optional system turn, then strictly
# alternating user/assistant turns.
llama3_jinja = """{% if messages[0]['role'] == 'system' %}
{% set offset = 1 %}
{% else %}
{% set offset = 0 %}
{% endif %}

{{ bos_token }}
{% for message in messages %}
{% if (message['role'] == 'user') != (loop.index0 % 2 == offset) %}
{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
{% endif %}

{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' }}
{% endfor %}

{% if add_generation_prompt %}
{{ '<|start_header_id|>' + 'assistant' + '<|end_header_id|>\n\n' }}
{% endif %}"""


dtype = torch.bfloat16

model_dir = "MonteXiaofeng/Aerospace-llama3_1_8B_instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="cuda",
    torch_dtype=dtype,
)

tokenizer = AutoTokenizer.from_pretrained(model_dir)
tokenizer.chat_template = llama3_jinja  # update template

message = [
    {"role": "system", "content": "You are a helpful assistant"},
    {
        "role": "user",
        # "What is the role of Shenzhou-16 in the China Space Station,
        # and how does it achieve full coverage of space?"
        "content": "神舟十六号在中国空间站中的角色是什么,它如何实现对太空的全面覆盖?",
    },
]
prompt = tokenizer.apply_chat_template(
    message, tokenize=False, add_generation_prompt=True
)
print(prompt)
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
prompt_length = len(inputs[0])
print(f"prompt_length:{prompt_length}")

generating_args = {
    "do_sample": True,
    "temperature": 1.0,
    "top_p": 0.5,
    "top_k": 15,
    "max_new_tokens": 512,
}

generate_output = model.generate(input_ids=inputs.to(model.device), **generating_args)

# Drop the prompt tokens and decode only the newly generated response.
response_ids = generate_output[:, prompt_length:]
response = tokenizer.batch_decode(
    response_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
)[0]

# Sample response (Chinese), roughly: "As the last crewed spacecraft of this
# batch for the China Space Station, Shenzhou-16 will play a key role in the
# station's construction and operation. It will dock with the Tianzhou-4 and
# Tianzhou-5 cargo spacecraft to complete the station's assembly and achieve
# full coverage of space. Its launch will mark the full completion of the
# China Space Station and will support long-term stays and scientific
# experiments by Chinese astronauts in space."
"""
神舟十六号作为中国空间站的最后一批次载人飞船,它将在空间站的建造和运营中发挥关键作用。神舟十六号将与天舟四号货运飞船和天舟五号货运飞船对接,共同完成空间站的组装,并进行空间站的全面覆盖。神舟十六号的发射将标志着中国空间站的全面建成,并且将为中国航天员在太空中进行长期驻留和科学实验提供支持。
"""
print(f"response:{response}")
```
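
For interactive use, the same call can stream tokens as they are generated via `transformers.TextStreamer`; a small optional variant reusing the objects defined above:

```python
# Optional: stream the response token-by-token instead of decoding at the end.
# Reuses `model`, `tokenizer`, `inputs`, and `generating_args` from the script above.
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    input_ids=inputs.to(model.device),
    streamer=streamer,
    **generating_args,
)
```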