---
library_name: peft
base_model: meta-llama/Llama-2-7b-hf
license: mit
language:
  - en
metrics:
  - bertscore
  - perplexity
---

# Model Card for Llama-2-7b Story Generation (QLoRA)

This model is a QLoRA fine-tune of Llama-2-7b for the story generation task.

## Model Description

We fine-tune the model on the "Hierarchical Neural Story Generation" dataset (WritingPrompts; Fan et al., 2018) to generate short stories.

The input to the model is structured as follows:

```
### Instruction: Below is a story idea. Write a short story based on this context.

### Input: [story idea here]

### Response:
```
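
For programmatic use, a prompt in this format can be assembled as in the sketch below; `format_prompt` is an illustrative helper name, not part of the released code.

```python
def format_prompt(story_idea: str) -> str:
    """Build a prompt in the instruction format documented above."""
    return (
        "### Instruction: Below is a story idea. "
        "Write a short story based on this context.\n\n"
        f"### Input: {story_idea}\n\n"
        "### Response:\n"
    )
```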

- Developed by: Abdelrahman ’Boda’ Sadallah, Anastasiia Demidova, Daria Kotova
- Model type: Causal LM
- Language(s) (NLP): English
- Finetuned from model: meta-llama/Llama-2-7b-hf


## Uses

The model is the result of our AI project. If you intend to use it, please refer to the repo.

## Recommendations

To improve story generation, you can experiment with the sampling parameters: temperature, top_p/top_k, repetition_penalty, etc., as in the sketch below.
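
This is a minimal generation sketch using `transformers` and `peft`. The adapter id is a placeholder (substitute this repo's id), the story idea is invented for illustration, and the sampling values are only starting points.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "<this-adapter-repo>"  # placeholder: substitute the actual adapter id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# Prompt assembled in the format documented above.
prompt = (
    "### Instruction: Below is a story idea. Write a short story based on this context.\n\n"
    "### Input: A lighthouse keeper finds a message in a bottle.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# temperature/top_p trade coherence for diversity;
# repetition_penalty > 1.0 discourages the model from looping.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    repetition_penalty=1.2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```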

## Training Details

### Training Data

GitHub repository for the dataset: https://github.com/kevalnagda/StoryGeneration

## Evaluation

### Testing Data, Factors & Metrics

We evaluate on the test split of the same dataset.

### Metrics

We use perplexity and BERTScore.
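
For reference, both metrics can be computed as in the sketch below. It uses the Hugging Face `evaluate` library for BERTScore; the prediction/reference texts and the mean loss value are placeholders.

```python
import math

import evaluate

# BERTScore between generated and reference stories (placeholder texts).
bertscore = evaluate.load("bertscore")
scores = bertscore.compute(
    predictions=["a generated story ..."],
    references=["the reference story ..."],
    lang="en",
)
print("BERTScore F1:", sum(scores["f1"]) / len(scores["f1"]))

# Perplexity is exp of the mean cross-entropy loss over the test split.
mean_loss = 2.086  # placeholder: mean eval loss from the evaluation loop
print("Perplexity:", math.exp(mean_loss))
```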

### Results

- Perplexity: 8.0546
- BERTScore: 80.11

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
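
For reference, the same settings map onto `transformers`' `BitsAndBytesConfig` as in this sketch:

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # 4-bit base weights, as used for QLoRA training
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
    llm_int8_threshold=6.0,
    llm_int8_has_fp16_weight=False,
)
```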

### Framework versions

- PEFT 0.6.0.dev0