	---
	license: cc-by-4.0
	datasets:
	- santoshtyss/uk_legislation
	language:
	- en
	base_model:
	- unsloth/Llama-3.2-3B
	tags:
	- legal
	---

	# Llama 3.2 UK Legislation 3B


	<figure>
	<img src="UKlegislation.png" alt="Llama 3.2 UK Legislation 3B" width="300">
	</figure>


	This is a base model built on Meta's Llama 3.2 3B architecture. It has been further pretrained on UK legislative texts but has not been fine-tuned for any downstream task. It is provided as a foundation for further development, such as fine-tuning for specialised tasks involving UK legislative documents.

	It was trained as part of a blog series; see the article [Making a Domain-Specific UK Legislation LLM: Part 1 - Pretraining](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining).

	## Model Details

	### Model Description
	- Developed by: GPT-LABS.AI
	- Model type: Transformer-based language model
	- Language: English
	- License: [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
	- Base model: [unsloth/Llama-3.2-3B](https://huggingface.co/unsloth/Llama-3.2-3B)

	### Model Sources
	- Repository: [EryriLabs/llama-3.2-uk-legislation-3b](https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b)
	- Blog Post: [Making a Domain-Specific UK Legislation LLM: Part 1 - Pretraining](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining)

	## Uses

	### Intended Use
	This base model is designed to serve as a starting point for further fine-tuning and development for tasks such as:
	- Domain-specific applications in law or other fields
	- Research and experimentation in natural language processing
	- General-purpose natural language understanding and generation

	### Out-of-Scope Use
	This model is not suitable for:
	- Providing domain-specific expertise or insights without fine-tuning
	- Applications requiring high accuracy or nuanced understanding of UK legislation
	- Tasks involving sensitive or critical real-world applications without rigorous evaluation

	## Bias, Risks, and Limitations

	- Bias: The model may reflect biases inherent in the pretraining data. Outputs should be critically evaluated for accuracy and fairness.
	- Risks: As a base model, it may generate responses that are overly general or contextually inappropriate for specific tasks.
	- Limitations: Although pretrained on UK legislative texts, the model has not been fine-tuned for specific tasks such as legal question answering, and its training data does not cover the most recent developments in legislation or any other field.
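
One common way to check a language model before relying on its outputs is to measure perplexity on held-out text from the target domain. The following is a minimal sketch, assuming per-token natural-log probabilities have already been obtained from the model; the function name `perplexity` and the toy input are illustrative and not part of this repository.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Toy input (assumption): four tokens, each assigned probability 0.5 by the model.
example_logprobs = [math.log(0.5)] * 4
print(perplexity(example_logprobs))
```

Lower perplexity on held-out legislative text than on unrelated text would suggest the domain pretraining had an effect, but it does not by itself establish legal accuracy.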

	## How to Get Started with the Model

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load model and tokenizer
	model = AutoModelForCausalLM.from_pretrained("EryriLabs/llama-3.2-uk-legislation-3b", device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained("EryriLabs/llama-3.2-uk-legislation-3b")

	# Sample question
	input_text = "What are the main principles of UK legislation?"

	# Tokenize, move inputs to the model's device, and generate a continuation
	inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=50)
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)

	print(response)
	```
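
Because this is a base model that has not been fine-tuned, it tends to continue text rather than answer questions directly. One way to work with that, sketched here, is to phrase the prompt as the opening of a legislative passage. The helper `build_completion_prompt` and its header layout are hypothetical conventions chosen for illustration, not a format the model is known to have been trained on.

```python
def build_completion_prompt(act_title: str, section_heading: str, opening: str = "") -> str:
    """Frame a request as the start of a document for a base model to continue.

    The title / blank line / heading layout is an illustrative assumption.
    """
    lines = [act_title.strip(), "", section_heading.strip()]
    if opening.strip():
        lines.append(opening.strip())
    return "\n".join(lines)


# Example values are placeholders for whatever passage you want continued.
prompt = build_completion_prompt(
    "Data Protection Act 2018",
    "1 Overview",
    "(1) This Act makes provision about",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of `input_text` in the snippet above.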

	## Technical Specifications

	- Model Architecture: Llama 3.2 3B, a transformer-based model designed for natural language processing tasks.
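	- Training Data: Pretrained on UK legislative texts from the `santoshtyss/uk_legislation` dataset, starting from the general-purpose `unsloth/Llama-3.2-3B` base model.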
	- Compute Infrastructure: Training conducted on high-performance GPUs (e.g., NVIDIA A100).
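
To gauge what hardware is needed just to load the model, a rough estimate of weight memory can be derived from the parameter count and the numeric precision. This is a back-of-the-envelope sketch: the figure of 3e9 parameters is a nominal value taken from the "3B" in the model name, the function name is illustrative, and the estimate covers weights only (no activations, KV cache, or optimizer state).

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Assumption: nominal 3e9 parameters; the exact count may differ.
for dtype in ("float32", "float16", "int8"):
    print(f"{dtype}: {weight_memory_gib(3e9, dtype):.1f} GiB")
```

Training requires considerably more memory than this estimate, since gradients and optimizer state must also be held.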

	## Citation

	If you use this model, please cite:

	```bibtex
	@misc{llama3.2-uk-legislation-3b,
	author = {GPT-LABS.AI},
	title = {Llama 3.2 UK Legislation 3B},
	year = {2024},
	publisher = {Hugging Face},
	url = {https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b}
	}
	```

	## Model Card Authors

	- GPT-LABS.AI

	## Contact

	For questions or feedback, please visit [gpt-labs.ai](https://www.gpt-labs.ai).