---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- chat
- coding
base_model: Qwen/Qwen2-7B
datasets:
- motexture/cData
---

# Cwen-7B-Instruct

## Introduction

Cwen-7B-Instruct is a fine-tuned version of Qwen2-7B-Instruct, trained on the cData coding dataset to enhance its coding capabilities across various languages, with a primary focus on low-level ones.

## Quickstart

The following code snippet uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "motexture/Cwen-7B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("motexture/Cwen-7B-Instruct")

prompt = "Write a C++ program that demonstrates the concept of separate compilation and linkage using namespaces and header files. The program should consist of multiple source files, each containing a portion of the program's code, and a header file that contains the interface information for the program.\n\nThe program should define a namespace my_namespace that contains a class MyClass with a member function print() that takes an integer as an argument. The program should also define a function main() that uses an object of the MyClass class to print a message.\n\nThe program should be compiled and linked separately, with each source file being compiled individually and then linked together to form the final executable."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated completion remains
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
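The slicing step in the snippet above strips the prompt tokens from each returned sequence, since `generate` returns the prompt followed by the completion. A minimal illustration with made-up token ids (no model required):

```python
# generate() returns, per sequence, the prompt token ids followed by the
# newly generated ids; slicing by the prompt length keeps only the completion.
# The token ids below are hypothetical, chosen only for illustration.
prompt_ids = [[101, 7592, 102]]
full_outputs = [[101, 7592, 102, 2023, 2003]]

completions = [
    out[len(inp):] for inp, out in zip(prompt_ids, full_outputs)
]
print(completions)  # [[2023, 2003]]
```

The same comprehension works unchanged for batched inputs, because `zip` pairs each prompt with its own output sequence.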

## Citation

```
@article{qwen2,
  title={Qwen2 Technical Report},
  year={2024}
}
```