---
license: mit
language:
- vi
---
# VinaLlama2-14B Beta

GGUF version here: [VinaLlama2-14B-GGUF](https://huggingface.co/qnguyen3/14b-gguf)
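
For the GGUF files, here is a minimal sketch using llama-cpp-python (not from the original card; the `.gguf` filename below is a placeholder — check the repo's file list for the actual name):

```python
# Hedged sketch: run a GGUF quantization with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="vinallama2-14b.Q4_K_M.gguf",  # placeholder filename -- adjust to the actual file
    n_ctx=32768,      # matches the model's 32,768-token context window
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Bạn là trợ lí AI hữu ích."},
        {"role": "user", "content": "Một cộng một bằng mấy?"},
    ],
    temperature=0.25,
)
print(out["choices"][0]["message"]["content"])
```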

**Top Features**:

- **Context Length**: 32,768 tokens.
- **Very strong** at reasoning, mathematics, and creative writing.
- Works with **LangChain** agents out of the box (see the sketch below).
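
As one way to wire the model into LangChain, here is a minimal sketch using the `HuggingFacePipeline` wrapper (the generation parameters are illustrative assumptions, not from the card; a full agent setup would build on this `llm` object):

```python
# Hedged sketch: expose the model to LangChain via HuggingFacePipeline.
# pip install langchain-huggingface
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="vilm/VinaLlama2-14B",
    task="text-generation",
    device_map="auto",  # shard across available GPUs
    pipeline_kwargs={
        "max_new_tokens": 512,  # illustrative values, not from the card
        "do_sample": True,
        "temperature": 0.25,
    },
)

print(llm.invoke("Một cộng một bằng mấy?"))
```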

**Known Issues**
- Still struggles with some Vietnamese facts (e.g., Hoang Sa & Truong Sa, historical questions).
- May hallucinate when reasoning.
- Can't do Vi-En/En-Vi translation (yet)!

Quick start:

VRAM requirement: ~20 GB

```bash
pip install transformers accelerate
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda"  # the device to run generation on

# Load the model in its native dtype, sharded across available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    "vilm/VinaLlama2-14B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("vilm/VinaLlama2-14B")

prompt = "Một cộng một bằng mấy?"  # "What is one plus one?"
messages = [
    {"role": "system", "content": "Bạn là trợ lí AI hữu ích."},  # "You are a helpful AI assistant."
    {"role": "user", "content": prompt}
]

# Render the chat messages with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=1024,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,  # required for temperature to take effect
    temperature=0.25,
)
# Strip the prompt tokens, keeping only the newly generated ones.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
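
For interactive use, you can print tokens as they are generated with transformers' built-in `TextStreamer` (a small optional addition, reusing the `model`, `tokenizer`, and `model_inputs` from above):

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they are generated,
# skipping the prompt and any special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    model_inputs.input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.25,
    streamer=streamer,
)
```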