We have successfully developed and trained a custom Large Language Model (LLM) tailored specifically to SpeedLegal's needs. This model, based on the Llama 3.2 1B architecture, has been fine-tuned on the CUAD (Contract Understanding Atticus Dataset) to enhance our ability to analyse and highlight critical sections of legal documents that require lawyer review.
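As a minimal illustration of the intended workflow, the model can be loaded with the Hugging Face transformers library and asked to flag clauses in a contract excerpt. This is a sketch only: it assumes the speedlegal/SL-Llama-3.2-1b repository is accessible to you and that chat-style prompts match the instruction format used during fine-tuning; the system prompt and excerpt below are illustrative.

```python
# Minimal inference sketch (assumes access to the speedlegal/SL-Llama-3.2-1b repo
# and a chat-style prompt format similar to the one used during fine-tuning).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "speedlegal/SL-Llama-3.2-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

contract_excerpt = (
    "This Agreement shall automatically renew for successive one-year terms "
    "unless either party gives 90 days' written notice of non-renewal."
)
messages = [
    {"role": "system", "content": "You are a legal assistant. Highlight clauses that require lawyer review and explain why."},
    {"role": "user", "content": contract_excerpt},
]

# Llama 3.2 Instruct tokenizers ship with a chat template, so apply_chat_template
# builds the prompt tensor directly.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```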
Key Features
1. Specialised Legal Focus: Our model is trained to understand and process complex legal language and contexts.
2. Efficient and Lightweight: Built on a 1-billion-parameter base model, it balances performance with computational efficiency.
3. Custom Training: Fine-tuned on our dataset of legal documents, questions, and expert-annotated answers (see the formatting sketch after this list).
4. Scalable Solution: Designed to handle a wide range of legal document types and queries.
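To make the "documents, questions, and expert-annotated answers" structure concrete, the sketch below shows one plausible way a CUAD-style record could be turned into a chat-format training example. The helper name, field names, and sample text are assumptions for illustration, not the exact schema used in training.

```python
# Illustrative only: a hypothetical helper that converts one annotated CUAD-style
# item (contract context, clause question, expert answer) into a chat-format
# training record. Field names and sample text are placeholders.
def format_cuad_example(context: str, question: str, answer: str) -> dict:
    """Build a chat-style training record from one annotated contract item."""
    return {
        "messages": [
            {"role": "system", "content": "You are a legal assistant. Answer questions about the contract below."},
            {"role": "user", "content": f"Contract:\n{context}\n\nQuestion: {question}"},
            {"role": "assistant", "content": answer},
        ]
    }

example = format_cuad_example(
    context="The term of this Agreement shall commence on the Effective Date...",
    question="Does the contract contain an auto-renewal clause?",
    answer="Yes. Section 2.1 provides for automatic renewal unless written notice is given.",
)
```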
Technical Highlights
Base Model: Llama 3.2 1B, a state-of-the-art language model known for its efficiency and performance.
Training Data: Utilised our self-curated CUAD dataset, enhancing the model's relevance to our specific use cases.
Fine-tuning Technique: Employed Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation) for optimal performance and resource utilisation (a training-setup sketch follows this list).
Performance Monitoring: Integrated with Weights & Biases (wandb) for comprehensive tracking of training metrics and model performance.
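The sketch below shows how this kind of setup is commonly wired together with the peft and transformers libraries, with metrics streamed to Weights & Biases via the Trainer's report_to option. The hyperparameters, target modules, and placeholder training data are assumptions for illustration, not the exact configuration used for SL-Llama-3.2-1b.

```python
# Training-setup sketch: LoRA fine-tuning of Llama 3.2 1B with metrics logged to
# Weights & Biases. Hyperparameters and the placeholder data are illustrative,
# not the exact configuration used for SL-Llama-3.2-1b.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Low-Rank Adaptation: train small adapter matrices instead of all 1B weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters

# Placeholder training data: in practice, tokenised CUAD contracts, questions,
# and expert-annotated answers. Included only so the sketch is self-contained.
train_texts = [
    "Question: Does the contract contain an auto-renewal clause?\nAnswer: Yes, see Section 2.1."
]
train_dataset = [tokenizer(t) for t in train_texts]
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="sl-llama-3.2-1b-lora",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=3,
    learning_rate=2e-4,
    logging_steps=10,
    report_to="wandb",  # streams loss and learning-rate curves to Weights & Biases
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()
```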
Business Impact
Increased Efficiency: Automates the initial review process, allowing lawyers to focus on critical sections identified by the model.
Improved Accuracy: Trained on expert-annotated data, the model can identify subtle legal nuances that general-purpose models might miss.
Scalability: Can process large volumes of legal documents quickly, supporting our growth and handling increased workloads.
Competitive Advantage: Positions SpeedLegal at the forefront of AI-driven legal tech, offering unique value to our clients.
Model tree for speedlegal/SL-Llama-3.2-1b
Base model: meta-llama/Llama-3.2-1B-Instruct