---
license: cc-by-4.0
datasets:
- santoshtyss/uk_legislation
language:
- en
base_model:
- unsloth/Llama-3.2-3B
tags:
- legal
---
# Llama 3.2 UK Legislation 3B
<figure>
<img src="UKlegislation.png" alt="Llama 3.2 UK Legislation 3B" width="300">
</figure>
This model is a base (not instruction-tuned) Llama 3.2 3B model that has been further pretrained on UK legislative texts. It has not been fine-tuned for any downstream task. It is provided as a foundation for further development, such as fine-tuning for specialised tasks involving UK legislative documents.
It was trained as part of a blog series; see [Making a Domain-Specific UK Legislation LLM: Part 1 - Pretraining](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining).
## Model Details
### Model Description
- **Developed by:** GPT-LABS.AI
- **Model type:** Transformer-based language model
- **Language:** English
- **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
- **Base model:** [unsloth/Llama-3.2-3B](https://huggingface.co/unsloth/Llama-3.2-3B)
### Model Sources
- **Repository:** [EryriLabs/llama-3.2-uk-legislation-3b](https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b)
- **Blog Post:** [Making a Domain-Specific UK Legislation LLM: Part 1 - Pretraining](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining)
## Uses
### Intended Use
This base model is designed to serve as a starting point for further fine-tuning and development for tasks such as:
- Domain-specific applications in law or other fields
- Research and experimentation in natural language processing
- General-purpose natural language understanding and generation
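As one illustration of preparing data for the fine-tuning mentioned above, the sketch below turns instruction/response pairs into plain training strings. The `format_example` and `build_corpus` helpers and the `### Instruction:` / `### Response:` template are hypothetical choices made for this example, not a format the model was trained on. The end-of-sequence string is assumed to come from the tokenizer (for example `tokenizer.eos_token`).

```python
from typing import Dict, Iterable, List


def format_example(instruction: str, response: str, eos_token: str) -> str:
    """Join one instruction/response pair into a single training string.

    The template is a hypothetical, Alpaca-style layout chosen for illustration.
    """
    return (
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"### Response:\n{response.strip()}{eos_token}"
    )


def build_corpus(records: Iterable[Dict[str, str]], eos_token: str) -> List[str]:
    """Format every record, skipping any with an empty instruction or response."""
    texts = []
    for record in records:
        instruction = record.get("instruction", "").strip()
        response = record.get("response", "").strip()
        if instruction and response:
            texts.append(format_example(instruction, response, eos_token))
    return texts
```

Appending the end-of-sequence token lets a fine-tuned model learn where a response stops. The resulting strings can then be tokenized and passed to whichever training loop is used.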
### Out-of-Scope Use
This model is **not suitable** for:
- Providing domain-specific expertise or insights without fine-tuning
- Applications requiring high accuracy or nuanced understanding of UK legislation
- Tasks involving sensitive or critical real-world applications without rigorous evaluation
## Bias, Risks, and Limitations
- **Bias:** The model may reflect biases inherent in the pretraining data. Outputs should be critically evaluated for accuracy and fairness.
- **Risks:** As a base model, it may generate responses that are overly general or contextually inappropriate for specific tasks.
- **Limitations:** The model has been pretrained on legislative text but has not been fine-tuned for specific tasks, such as legal question answering, and its training data does not include the most recent developments in any field.
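Before relying on outputs, one simple sanity check is perplexity on held-out legislative text: lower values suggest the model predicts that text better. The sketch below is an illustration under stated assumptions, not the evaluation used for this model. `perplexity` is a hypothetical helper that takes per-token natural-log probabilities, however they were obtained.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp(-mean log-probability) for a sequence of natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)
```

With `transformers`, calling a causal language model with `labels` set to the input IDs returns the mean negative log-likelihood as `loss`, so `math.exp(loss.item())` gives the same quantity for a single sequence. Perplexity does not measure legal accuracy or fairness, so it complements rather than replaces task-specific evaluation.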
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("EryriLabs/llama-3.2-uk-legislation-3b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("EryriLabs/llama-3.2-uk-legislation-3b")
# Sample prompt
input_text = "What are the main principles of UK legislation?"
# Tokenize and move the tensors to the device the model was placed on
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# Pass the attention mask along with the input IDs and cap the number of new tokens
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
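Because this is a base model rather than an instruction-tuned one, it continues text rather than answering questions, so a prompt shaped like the opening of a document may be a more natural fit than a question. A minimal sketch, assuming a hypothetical `build_completion_prompt` helper, a made-up Act title, and an illustrative heading layout that is not taken from the training data:

```python
def build_completion_prompt(act_title: str, section_number: int, section_heading: str) -> str:
    """Return text that reads like the start of a section, for the model to continue.

    The layout (title, blank line, numbered heading, "(1)") is an assumption
    made for illustration only.
    """
    return f"{act_title.strip()}\n\n{section_number}. {section_heading.strip()}\n\n(1)"


# "Example Act 2024" is a placeholder title, not a real statute.
prompt = build_completion_prompt("Example Act 2024", 1, "Interpretation")
print(prompt)
```

The resulting string can be passed as `input_text` in the snippet above.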
## Technical Specifications
- **Model Architecture:** Llama 3.2 3B, a transformer-based model designed for natural language processing tasks.
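- **Training Data:** The base model was pretrained on a diverse dataset of general text; this model was further pretrained on the [santoshtyss/uk_legislation](https://huggingface.co/datasets/santoshtyss/uk_legislation) dataset.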
- **Compute Infrastructure:** Training conducted on high-performance GPUs (e.g., NVIDIA A100).
## Citation
If you use this model, please cite:
```
@misc{llama3.2-uk-legislation-3b,
author = {GPT-LABS.AI},
title = {Llama 3.2 UK Legislation 3B},
year = {2024},
publisher = {Hugging Face},
url = {https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b}
}
```
## Model Card Authors
- GPT-LABS.AI
## Contact
For questions or feedback, please visit [gpt-labs.ai](https://www.gpt-labs.ai).