---
license: llama2
base_model: ngoan/Llama-2-7b-vietnamese-20k
datasets:
- nthngdy/oscar-mini
- Tamnemtf/VietNamese_lang
language:
- vi
pipeline_tag: text-generation
tags:
- text-generation
- llama-2
- llama-2-7B
- llama2-vietnamese
- vietnamese
---

## Model Details

- Model Name: llama-2-7b-vi-oscar_mini
- Purpose: This model was trained for learning purposes and for a student scientific research (NCKH) project.
- Availability: The model checkpoint can be accessed on Hugging Face: Tamnemtf/llama-2-7b-vi-oscar_mini
- Base Model: Fine-tuned from ngoan/Llama-2-7b-vietnamese-20k

## How to Use
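The snippets below load the model in 4-bit precision, which requires the `torch`, `transformers`, `accelerate`, and `bitsandbytes` packages (for example, `pip install torch transformers accelerate bitsandbytes`). First, define the quantization settings: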
```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)

# Activate 4-bit precision base model loading
use_4bit = True

# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"

# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"

# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False

# Load the entire model on GPU 0
device_map = {"": 0}
```
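`device_map = {"": 0}` pins every layer to GPU 0; if you have several GPUs or want CPU offloading, `device_map = "auto"` lets `accelerate` place the layers automatically. Next, build the `BitsAndBytesConfig` from these settings: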
```python
# Map the dtype string to the actual torch dtype and build the quantization config
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)
```
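With these settings the weights are stored as 4-bit NF4 (the "NormalFloat" data type introduced in the QLoRA paper) while matrix multiplications run in `float16`; enabling `use_nested_quant` would also quantize the quantization constants themselves for a small extra memory saving. Now load the model and tokenizer: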
```python
model_name = 'Tamnemtf/llama-2-7b-vi-oscar_mini'

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map,
)
# Flags carried over from the fine-tuning setup; for pure inference,
# use_cache=True (the default) is also fine.
model.config.use_cache = False
model.config.pretraining_tp = 1

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training
```
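As a quick sanity check, you can confirm that 4-bit loading worked by printing the model's memory footprint (a minimal sketch, not part of the original card; the exact figure varies):

```python
# A 7B model quantized to 4-bit should occupy roughly 4 GB of GPU memory,
# versus roughly 14 GB in float16.
print(f"Memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GiB")
```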
```python
# Run a text-generation pipeline with the fine-tuned model
prompt = "Canh chua cá lau là món gì ?"  # "What kind of dish is canh chua cá lau?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]['generated_text'])
```
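If you want finer control over decoding than the pipeline offers, you can call `generate` directly. This is a minimal sketch; the sampling parameters below are illustrative defaults, not values tuned for this model:

```python
# Tokenize the prompt with the same [INST] template as above
# (the Llama tokenizer prepends the <s> BOS token itself).
inputs = tokenizer(f"[INST] {prompt} [/INST]", return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,   # Sample instead of greedy decoding
    temperature=0.7,  # Illustrative values, not tuned for this model
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```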
To make the model easy for students to try out, here is an example notebook that runs it on Colab with a T4 GPU:

https://colab.research.google.com/drive/1ME_k-gUKSY2NbB7GQRk3sqz56CKsSV5C?usp=sharing

## Contact

[email protected]