---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
license: mit
language:
- en
metrics:
- perplexity
- bertscore
---
# Model Card: Mistral-7B QLoRA for Story Generation

Fine-tuned from mistralai/Mistral-7B-v0.1 using QLoRA for the story generation task.
## Model Description

We fine-tune the model on the "Hierarchical Neural Story Generation" dataset to generate stories.

The input to the model is structured as follows:
```
### Instruction: Below is a story idea. Write a short story based on this context.
### Input: [story idea here]
### Response:
```
- Developed by: Abdelrahman ’Boda’ Sadallah, Anastasiia Demidova, Daria Kotova
- Model type: Causal LM
- Language(s) (NLP): English
- Finetuned from model: mistralai/Mistral-7B-v0.1
## Uses

The model is the result of our AI project. If you intend to use it, please refer to the repository.
## Recommendations

To improve the generated stories, you can experiment with the sampling parameters: temperature, top_p/top_k, repetition_penalty, etc., as shown in the sketch below.
## Training Details

### Training Data

GitHub repository for the dataset: https://github.com/kevalnagda/StoryGeneration
## Evaluation

### Testing Data, Factors & Metrics

We evaluate on the test split of the same dataset.
### Metrics

We use perplexity and BERTScore; an illustrative computation sketch follows.
### Results

- Perplexity: 8.8647
- BERTScore: 80.76
## Training procedure

The following bitsandbytes quantization config was used during training (see the equivalent `BitsAndBytesConfig` sketch after the list):
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
### Framework versions
- PEFT 0.6.0.dev0