Update README.md
README.md
CHANGED
@@ -16,38 +16,39 @@ tags:
|
|
16 |
|
17 |
## Model Card: LawVinaLlama
|
18 |
|
19 |
-
**Model Description:**
|
20 |
|
21 |
-
|
22 |
|
|
|
23 |
|
24 |
-
**Main Data Sources:**
|
25 |
|
26 |
-
|
27 |
-
* 40,000 QA translated and summarized from international law
|
28 |
-
* 10,000 QA translated and summarized from international law
|
29 |
-
* 50,000 Reasoning QA generated by GPT-4.0/Gemini
|
30 |
|
|
|
|
|
|
|
|
|
31 |
|
32 |
-
**Intended Use Cases:**
|
33 |
|
34 |
-
|
35 |
|
36 |
-
|
37 |
-
|
|
|
|
|
38 |
|
39 |
|
40 |
-
**
|
41 |
|
42 |
-
LawVinaLlama
|
43 |
|
44 |
-
|
45 |
-
|
46 |
|
47 |
|
48 |
-
**
|
49 |
|
50 |
-
Load
|
51 |
|
52 |
```python
|
53 |
from unsloth import FastLanguageModel
|
@@ -100,4 +101,4 @@ generated_ids = model.generate(
|
|
100 |
a = tokenizer.batch_decode(generated_ids)[0]
|
101 |
# print(a.split('### Trả lời:')[1])
|
102 |
print(a)
|
103 |
-
```
|
|
|
16 |
|
17 |
## Model Card: LawVinaLlama
|
18 |
|
|
|
19 |
|
20 |
+
**Model Description:**
|
21 |
|
22 |
+
LawVinaLlama is a large language model (LLM) specialized in **Vietnamese law**, fine-tuned from a Llama-architecture base model. It was trained on real legal documents to improve its ability to **reason about legal questions, retrieve legal information, and summarize legal content**.
|
23 |
|
|
|
24 |
|
25 |
+
**Main Data Sources:**
|
|
|
|
|
|
|
26 |
|
27 |
+
- **150,000 Q&A** crawled and processed from *Thư Viện Pháp Luật* (Vietnamese Legal Library)
|
28 |
+
- **40,000 Q&A** translated and summarized from international law
|
29 |
+
- **10,000 Q&A** translated and summarized from international law (repeats the previous entry's description; possibly a duplicate)
|
30 |
+
- **50,000 Reasoning Q&A** generated by GPT-4.0/Gemini
|
31 |
|
|
|
32 |
|
33 |
+
**Intended Use Cases:**
|
34 |
|
35 |
+
LawVinaLlama is suitable for the following tasks:
|
36 |
+
|
37 |
+
- **Answering legal questions**, including **answering from a provided context**
|
38 |
+
- **Summarizing legal content**
|
39 |
|
40 |
|
41 |
+
**Limitations:**
|
42 |
|
43 |
+
LawVinaLlama has the following limitations:
|
44 |
|
45 |
+
- It may generate **misleading or inaccurate** information.
|
46 |
+
- Its **performance depends on the quality of the input data**.
|
47 |
|
48 |
|
49 |
+
**How to Use:**
|
50 |
|
51 |
+
Load the model:
|
52 |
|
53 |
```python
|
54 |
from unsloth import FastLanguageModel
|
|
|
101 |
a = tokenizer.batch_decode(generated_ids)[0]
|
102 |
# print(a.split('### Trả lời:')[1])
|
103 |
print(a)
|
104 |
+
```
|
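The usage snippet prints the full decoded string and leaves the answer-only line (`a.split('### Trả lời:')[1]`) commented out. A minimal sketch of that post-processing step follows, assuming the decoded text contains the `### Trả lời:` marker shown in that comment. The helper name `extract_answer` and the sample string (including its `### Question:` header) are hypothetical and not part of the README.

```python
# Marker taken from the README's commented-out line; everything else here is illustrative.
ANSWER_MARKER = "### Trả lời:"


def extract_answer(decoded: str, marker: str = ANSWER_MARKER) -> str:
    """Return the text after the first answer marker.

    Falls back to the whole (stripped) string when the marker is absent,
    so a missing marker does not raise an IndexError the way split(...)[1] would.
    """
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


if __name__ == "__main__":
    # Hypothetical decoded output; the real prompt template is not shown in this diff.
    sample = "### Question:\n<question text>\n\n### Trả lời:\n<answer text>"
    print(extract_answer(sample))
```

In the README's snippet this would be called as `extract_answer(a)`. Unlike `split(...)[1]`, `partition` keeps everything after the first marker, including any later occurrences of it.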