---
language:
- fr
library_name: transformers
tags:
- music
- rap
- lyrics
---

# Kichtral-7B-v0.3: a Mistral-7B Causal LM for French Rap Lyrics

## Overview

__Kichtral-7B-v0.3__ is a causal language model fine-tuned from the __Mistral 7B__ model on __French rap lyrics__. The training dataset consists of cleaned French verses, with repetitions removed, from songs that have at least 10k streams on Spotify. The dataset contains a total of __36M tokens__.

This model aims to __understand and generate__ French rap lyrics, making it a useful tool for __research__ on __French slang__ and __music lyrics generation__.

## Model Details

Kichtral-7B-v0.3 is based on the Mistral 7B v0.3 architecture and was fine-tuned with LoRA using the following hyperparameters:

| Parameter               | Value  |
|-------------------------|--------|
| Epochs                  | 1      |
| LoRA Rank               | 64     |
| LoRA Alpha              | 128    |
| LoRA Dropout            | 0.1    |
| Learning Rate           | 1e-4   |
| Learning Rate Scheduler | Cosine |
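
To make the table more concrete, the sketch below computes two quantities it implies: the LoRA scaling factor, which in the standard LoRA formulation is alpha divided by rank, and a cosine-decayed learning rate. The step count (`total_steps`), the absence of warmup, and the decay down to zero are assumptions made for illustration; the actual training configuration is not documented here.

```python
import math

# Values taken from the hyperparameter table.
LORA_RANK = 64
LORA_ALPHA = 128
PEAK_LR = 1e-4

# In standard LoRA, the low-rank update is scaled by alpha / rank.
lora_scaling = LORA_ALPHA / LORA_RANK


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr to 0 with no warmup (an assumption)."""
    progress = min(step, total_steps) / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; the real number of steps is not stated
print("LoRA scaling:", lora_scaling)
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps))
```

With these values the LoRA update is scaled by 2, and the learning rate falls to half its peak at the midpoint of training.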

### Versions

The model was trained on AWS SageMaker, on a single ml.g5.2xlarge instance, for 15 hours with the following software versions:

| Requirement  | Version |
|--------------|---------|
| Transformers | 4.28    |
| PyTorch      | 2.0     |
| Python       | 3.10    |
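
As a rough sanity check, the figures on this card (36M tokens, 1 epoch, 15 hours) imply an average training throughput. The short calculation below assumes the full 15 hours were spent training and that every token was seen exactly once, neither of which is confirmed here.

```python
# Figures quoted on this model card.
TOTAL_TOKENS = 36_000_000
EPOCHS = 1
TRAINING_HOURS = 15

# Average throughput, assuming the whole 15 hours was training time.
tokens_per_second = TOTAL_TOKENS * EPOCHS / (TRAINING_HOURS * 3600)
print(f"{tokens_per_second:.0f} tokens/s")  # prints "667 tokens/s"
```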

## Installation

Install the required Python libraries (PyTorch is needed as the backend for Transformers):

```bash
pip install transformers torch
```

## Loading the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("rapminerz/Kichtral-7B-v0.3")
model = AutoModelForCausalLM.from_pretrained("rapminerz/Kichtral-7B-v0.3")
```
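
Before loading, it can help to estimate how much memory the weights alone will need. The sketch below assumes a parameter count of about 7.25 billion (an approximation for Mistral 7B, not a figure from this card) and ignores activations and the KV cache. In many Transformers versions, weights load in float32 unless a `torch_dtype` argument such as `torch.float16` is passed to `from_pretrained`, so the difference matters in practice.

```python
# Approximate parameter count for a Mistral-7B-class model (assumption).
PARAM_COUNT = 7.25e9

# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gib(dtype: str, param_count: float = PARAM_COUNT) -> float:
    """Memory for the weights only, in GiB (no activations, no KV cache)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(dtype):.1f} GiB")
```

Under these assumptions the weights take roughly 27 GiB in float32 and about half that in float16 or bfloat16.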

## Using the Model

```python
def generate_lyrics(prompt):
    # Tokenize the prompt into PyTorch tensors
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=300,  # counts prompt tokens plus generated tokens
        do_sample=True,  # needed for top_k, top_p and temperature to take effect
        num_return_sequences=1,
        top_k=10,
        top_p=0.95,
        temperature=1.0,
        repetition_penalty=1.2,
    )
    # Decode the first (and only) returned sequence back to text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


generate_lyrics("Okay ça fait")
"""
Okay ça fait un moment que tu m'appelles
Sans t'écouter, j'ai dû me tailler
Jusqu'à présent, je sais pas qui t'es mais je peux pas t'oublier
Tu m'as laissé des images dans l'crâne
Quand je repense à ce soir-là
"""

generate_lyrics("Je viens de là où")
"""
Je viens de là où ça tire
Je fais la loi je suis pas le roi
Et je sais que tu penses à moi quand t'as besoin d'aide
Quand y a trop d'ennemis autour de toi qui se mêlent
"""
```
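
The `top_k` and `top_p` settings used above restrict which tokens can be sampled at each step. The sketch below illustrates the idea on a toy probability distribution in plain Python. It is a simplified illustration, not the Transformers implementation, and the function name `top_k_top_p_filter` and the example probabilities are made up for this purpose.

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Return {token_index: renormalized_prob} after top-k then top-p filtering."""
    # Rank token indices from most to least likely and keep the top_k best.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Renormalize over the top-k survivors before applying the nucleus cutoff.
    k_total = sum(probs[i] for i in ranked)

    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_total
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


# Toy next-token distribution over a 5-token vocabulary (hypothetical).
toy_probs = [0.5, 0.3, 0.1, 0.05, 0.05]
filtered = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8)
print(filtered)
```

In this toy case only the two most likely tokens survive. A smaller `top_k` or `top_p` makes the output more conservative, while larger values allow more varied lyrics.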

## Purpose and Disclaimer

This model is designed for academic and research purposes only. It is not intended for commercial use. The creators of this model do not endorse or promote any specific views or opinions that may be represented in the dataset.

__Please mention @RapMinerz if you use our models__

## Improvements

This model does not fully capture rhymes. A different method would be needed to steer generation toward, for example, specific rhymes or topics.

## Contact

For any questions or issues, please contact the repository owner, __RapMinerz__, at [email protected].