---
language:
- fr
library_name: transformers
tags:
- music
- rap
- lyrics
---
# Kichtral-7B-v0.3: a Mistral-7B Causal LM for French Rap Lyrics
## Overview
__Kichtral-7B-v0.3__ is a Causal Language Model fine-tuned from the __Mistral 7B__ model on __French rap lyrics__. The training dataset consists of cleaned French verses, with repetitions removed, from songs that have at least 10k streams on Spotify. This dataset contains a total of __36M tokens__.
This model aims to __understand and generate__ French rap lyrics, making it a valuable tool for __research__ in __French slang__ and __music lyrics generation__.
## Model Details
Kichtral-7B-v0.3 is based on the Mistral 7B v0.3 architecture and has been fine-tuned with the following hyperparameters:
| Parameter | Value |
|---------------------|----------|
| Epochs | 1 |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.1 |
| Learning Rate | 1e-4 |
| Learning Rate Scheduler | Cosine |
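To make the table more concrete, here is a minimal sketch of what these values imply. It uses the standard LoRA scaling factor (alpha divided by rank) and a plain cosine decay. The model card does not say which layers were adapted or whether warmup was used, so the 4096 x 4096 projection (the Mistral 7B hidden size) and the no-warmup schedule are assumptions made for illustration only.

```python
import math

# Values taken from the hyperparameter table above.
LORA_RANK = 64
LORA_ALPHA = 128
PEAK_LR = 1e-4


def lora_scaling(rank: int, alpha: int) -> float:
    """Factor applied to the low-rank update in standard LoRA: alpha / rank."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer.

    Matrix A has shape (rank, d_in) and matrix B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def cosine_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Cosine decay from peak_lr at step 0 down to 0 at total_steps.

    Assumption: no warmup phase (the model card does not mention one).
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


print(lora_scaling(LORA_RANK, LORA_ALPHA))      # 2.0
# Hypothetical target: a single 4096 x 4096 projection matrix.
print(lora_param_count(4096, 4096, LORA_RANK))  # 524288
```

With rank 64 and alpha 128, the low-rank update is scaled by 2, and each adapted 4096 x 4096 matrix would gain about half a million trainable parameters.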
### Versions
The model was trained on AWS SageMaker, on a single ml.g5.2xlarge instance, for 15 hours with the following software versions:
| Requirement | Version |
|------------------------|-----------|
| Transformers | 4.28 |
| PyTorch | 2.0 |
| Python | 3.10 |
## Installation
Install the required Python libraries:
```bash
pip install transformers torch
```
## Loading the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("rapminerz/Kichtral-7B-v0.3")
model = AutoModelForCausalLM.from_pretrained("rapminerz/Kichtral-7B-v0.3")
```
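Before loading, it can help to estimate how much memory the weights alone will need. The sketch below is simple arithmetic; the parameter count of roughly 7.2 billion is an approximation for a Mistral-7B-class model and is not a figure from this model card. Activations and the generation cache come on top of this estimate.

```python
# Approximate parameter count for a Mistral-7B-class model.
# Assumption for illustration; not stated in the model card.
APPROX_PARAMS = 7.2e9

# Size in bytes of one parameter for common floating-point dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Memory needed to hold the weights only, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_gib(APPROX_PARAMS, dtype):.1f} GiB")
```

If the full-precision footprint is too large for your hardware, `from_pretrained` accepts a `torch_dtype` argument (for example `torch.float16`) that loads the weights in half precision.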
## Using the Model
```python
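def generate_lyrics(prompt):
    # Tokenize the prompt into PyTorch tensors
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=300,  # total length in tokens, prompt included
        num_return_sequences=1,
        do_sample=True,  # needed for top_k, top_p and temperature to take effect
        top_k=10,
        top_p=0.95,
        temperature=1.0,
        repetition_penalty=1.2
    )
    # Decode the single returned sequence back to text
    return tokenizer.decode(outputs[0], skip_special_tokens=True)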
generate_lyrics("Okay ça fait")
"""
Okay ça fait un moment que tu m'appelles
Sans t'écouter, j'ai dû me tailler
Jusqu'à présent, je sais pas qui t'es mais je peux pas t'oublier
Tu m'as laissé des images dans l'crâne
Quand je repense à ce soir-là
"""
generate_lyrics("Je viens de là où")
"""
Je viens de là où ça tire
Je fais la loi je suis pas le roi
Et je sais que tu penses à moi quand t'as besoin d'aide
Quand y a trop d'ennemis autour de toi qui se mêlent
"""
```
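The string returned by `generate_lyrics` contains the prompt followed by the continuation, one verse per line. As a small sketch, the two helpers below (hypothetical names, not part of the model or of `transformers`) split the output into verses and isolate the text the model added.

```python
def split_verses(generated: str) -> list[str]:
    """Split generated text into non-empty lines with whitespace trimmed."""
    return [line.strip() for line in generated.splitlines() if line.strip()]


def continuation_only(generated: str, prompt: str) -> str:
    """Return only the text that follows the prompt.

    If the output does not start with the prompt, it is returned unchanged.
    """
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    return generated


# Sample text copied from the example output above.
sample = "Je viens de là où ça tire\nJe fais la loi je suis pas le roi\n"
print(split_verses(sample))
print(continuation_only(sample, "Je viens de là où"))
```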
## Purpose and Disclaimer
This model is designed for academic and research purposes only. It is not intended for commercial use. The creators of this model do not endorse or promote any specific views or opinions that may be represented in the dataset.
__Please mention @RapMinerz if you use our models__
## Improvements
This model does not fully capture rhymes. A complementary method would be needed to condition generation on, for example, target rhymes and topics.
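One rough way to quantify the rhyme limitation is to measure how often consecutive generated lines end with the same letters. The sketch below is a naive, spelling-based heuristic and not the authors' method: French rhymes are phonetic, so a proper measure would compare sounds rather than letters. The function names and the two-letter window are assumptions chosen for illustration.

```python
import string


def line_ending(line: str, n: int = 2) -> str:
    """Last n letters of the last word, lowercased, with punctuation stripped."""
    words = line.strip().split()
    if not words:
        return ""
    last_word = words[-1].strip(string.punctuation + "’«»").lower()
    return last_word[-n:]


def rhyme_rate(lines: list[str], n: int = 2) -> float:
    """Fraction of consecutive line pairs that share the same ending."""
    pairs = list(zip(lines, lines[1:]))
    if not pairs:
        return 0.0
    hits = sum(
        1
        for first, second in pairs
        if line_ending(first, n) and line_ending(first, n) == line_ending(second, n)
    )
    return hits / len(pairs)


# Two lines copied from the first example output in this model card.
verses = [
    "Sans t'écouter, j'ai dû me tailler",
    "Jusqu'à présent, je sais pas qui t'es mais je peux pas t'oublier",
]
print(rhyme_rate(verses))
```

Such a score could be used to rank several sampled candidates, keeping in mind that it misses rhymes spelled differently and accepts some endings that do not actually rhyme.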
## Contact
For any questions or issues, please contact the repository owner, __RapMinerz__, at [email protected].