---
tags:
- legal
---

# Llama 3.2 UK Legislation 3B

This model is the base version of Meta's Llama 3.2 3B architecture. It has not yet been fine-tuned and is provided as a foundational model for further development, such as domain-specific applications involving UK legislative texts.

It was trained as part of a blog series; see the article [here](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining).

## Model Details

### Model Description

- **Developed by:** GPT-LABS.AI
- **Model type:** Transformer-based language model
- **Language:** English
- **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
- **Base model:** [unsloth/Llama-3.2-3B](https://huggingface.co/unsloth/Llama-3.2-3B)

### Model Sources

- **Repository:** [EryriLabs/llama-3.2-uk-legislation-3b](https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b)
- **Blog Post:** [Making a Domain-Specific UK Legislation LLM: Part 1 - Pretraining](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining)

## Uses

### Intended Use

This base model is designed to serve as a starting point for further fine-tuning and development for tasks such as:

- Domain-specific applications in law or other fields
- Research and experimentation in natural language processing
- General-purpose natural language understanding and generation
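
The first of these is what the accompanying blog series covers: continued pretraining on UK legislative text. A core preprocessing step in that kind of run is packing tokenized documents into fixed-length blocks. The sketch below is illustrative only — the function name and the 2048-token block size are assumptions, not details taken from this model card:

```python
def pack_into_blocks(token_ids, block_size=2048):
    """Split one long token-id sequence into fixed-size blocks for
    causal-LM training, dropping the trailing partial block."""
    n_full = len(token_ids) // block_size
    return [token_ids[i * block_size:(i + 1) * block_size] for i in range(n_full)]
```

In practice the token IDs would come from the model's own tokenizer, and the block size should match the context length used during pretraining.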
### Out-of-Scope Use

This model is **not suitable** for:

- Providing domain-specific expertise or insights without fine-tuning
- Applications requiring high accuracy or nuanced understanding of UK legislation
- Tasks involving sensitive or critical real-world applications without rigorous evaluation

## Bias, Risks, and Limitations

- **Bias:** The model may reflect biases inherent in the pretraining data. Outputs should be critically evaluated for accuracy and fairness.
- **Risks:** As a base model, it may generate responses that are overly general or contextually inappropriate for specific tasks.
- **Limitations:** The model is not fine-tuned for specific domains, including legal or legislative text, and does not include the most recent developments in any field.

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("EryriLabs/llama-3.2-uk-legislation-3b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("EryriLabs/llama-3.2-uk-legislation-3b")

# Sample question
input_text = "What are the main principles of UK legislation?"

# Tokenize and generate response
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(inputs["input_ids"], max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
```
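
One caveat: `model.generate` returns the prompt tokens followed by the continuation, so the decoded `response` starts with the input question. A small helper to keep only the newly generated text (a hypothetical addition, not part of the model card):

```python
def strip_prompt(prompt: str, generated: str) -> str:
    # decode() returns the prompt followed by the model's completion;
    # keep only the completion when the prompt echo is present.
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    return generated
```

For example, `strip_prompt(input_text, response)` would drop the echoed question before displaying the answer.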
## Technical Specifications

- **Model Architecture:** Llama 3.2 3B, a transformer-based model designed for natural language processing tasks.
- **Training Data:** Pretrained on a diverse dataset of general text.
- **Compute Infrastructure:** Training conducted on high-performance GPUs (e.g., NVIDIA A100).

## Citation

If you use this model, please cite:

```bibtex
@misc{llama3.2-uk-legislation-3b,
  author = {GPT-LABS.AI},
  title = {Llama 3.2 UK Legislation 3B},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b}
}
```

## Model Card Authors

- GPT-LABS.AI

## Contact

For questions or feedback, please visit [gpt-labs.ai](https://www.gpt-labs.ai).