File size: 3,312 Bytes
fa77ae2 2a1b25d dc96e03 2a1b25d 6aadec6 2a1b25d fa77ae2 2a1b25d fa77ae2 5c59bc3 fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 2a1b25d fa77ae2 4b452bc fa77ae2 4b452bc fa77ae2 4b452bc fa77ae2 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 |
---
license: mit
datasets:
- Equall/legalbench_instruct
language:
- en
---
# Model Card for SaulLM-141B-Instruct
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/644a900e3a619fe72b14af0f/tD6ZJIlh5pxR1iqcxtOSs.jpeg" alt="image/jpeg">
</p>
**Note:** This model is a research artifact and should be considered as such.
## Model Details
### Model Description
**SaulLM-141B-Instruct** is a state-of-the-art language model specifically designed for the legal domain. It was developed through a collaboration between Equall and MICS at CentraleSupélec (Université Paris-Saclay) and aims to contribute to the advancement of LLMs specialized for legal work.
- **Developed by:** Equall and MICS of CentraleSupélec (Université Paris-Saclay)
- **Model type:** A 141 billion parameter model pretrained and finetuned for legal tasks, leveraging data from US and European legal databases.
- **Language(s) (NLP):** English
- **License:** MIT-License
- **Finetuned from model:** Base model developed by Equall relying on continuous pretraining of Mixtral’s models.
## Intended Uses & Limitations
### Intended Uses
SaulLM-141B-Instruct is intended to support further research and be adapted for various legal use cases.
### Limitations
The information provided by the model is for informational purposes only and should not be interpreted as legal advice. Also, because SaulLM-141B-Instruct was trained with a focus on US and European legal systems, it may not perform as well on legal systems outside of those jurisdictions.
## Bias, Risks, and Ethical Considerations
### Bias and Risks
Despite efforts to mitigate bias, SaulLM-141B may still exhibit biases inherent in its training data or otherwise provide inaccurate responses. The model is trained on information up to a certain point in time, and the model cannot account for all recent legal developments. Users should be cautious and critically evaluate the model's outputs, especially in sensitive legal cases. The responsibility for making decisions based on the information rests with the user, not the model or its developers. Users are encouraged to seek the assistance of qualified legal professionals where legal advice is needed.
### Ethical Considerations
Users must use SaulLM-141B responsibly, ensuring that the model is not misused in a way that violates the law or infringes on the rights of others. Among other things, the model may not be used to generate harmful content, spread misinformation, or violate privacy or intellectual property rights.
## Technical Details
### Training Data
SaulLM-141B was trained on a rich dataset comprising European and US legal texts, court rulings, and legislative documents.
## Citation
To reference SaulLM-141B in your work, please cite this model card.
```
@misc{colombo2024saullm54bsaullm141bscaling,
title={SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain},
author={Pierre Colombo and Telmo Pires and Malik Boudiaf and Rui Melo and Dominic Culver and Sofia Morgado and Etienne Malaboeuf and Gabriel Hautreux and Johanne Charpentier and Michael Desa},
year={2024},
eprint={2407.19584},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2407.19584},
}
```
---
|