# Next Word Prediction With GPT2

## πŸ“Œ Overview

This repository hosts a quantized version of the GPT2 model fine-tuned for next-word prediction. The model was trained on the `bookcorpus` dataset from Hugging Face and quantized to Float16 (FP16) to optimize inference speed and efficiency while maintaining high performance.

## πŸ— Model Details

- **Model Architecture:** GPT2
- **Task:** Next Word Prediction
- **Dataset:** Hugging Face's `bookcorpus`
- **Quantization:** Float16 (FP16) for optimized inference
- **Fine-tuning Framework:** Hugging Face Transformers

## πŸš€ Usage

### Installation

```bash
pip install transformers torch
```

### Loading the Model

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

# Use GPU when available
device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/gpt2-next-word-prediction"
model = GPT2LMHeadModel.from_pretrained(model_name).to(device)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
```

### Next Word Prediction Example

```python
# Input text
text = "Hi! How are"

# Tokenize input text
input_ids = tokenizer.encode(text, return_tensors="pt").to(device)

# Generate exactly one more token (max_length = input length + 1)
output = model.generate(input_ids, max_length=input_ids.shape[1] + 1, do_sample=False)

# Decode the full sequence, including the newly generated token
generated_text = tokenizer.decode(output[0])

print("Generated Sentence:", generated_text)
```
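
Beyond greedy decoding, it can be useful to inspect several candidate next words at once. The following is a minimal sketch (not part of the original repository) that reads the model's raw logits and prints the top-5 candidates, reusing the `model`, `tokenizer`, and `device` objects loaded above:

```python
import torch

text = "Hi! How are"
input_ids = tokenizer.encode(text, return_tensors="pt").to(device)

with torch.no_grad():
    # Logits have shape (batch, sequence_length, vocab_size)
    logits = model(input_ids).logits

# Score distribution over the vocabulary for the *next* token
next_token_logits = logits[0, -1]
top_k = torch.topk(next_token_logits, k=5)

for token_id, score in zip(top_k.indices, top_k.values):
    print(repr(tokenizer.decode(int(token_id))), f"(logit: {score.item():.2f})")
```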

## ⚑ Quantization Details

Post-training quantization was applied using PyTorch's built-in quantization framework. The model was quantized to Float16 (FP16) to reduce model size and improve inference efficiency with minimal loss of accuracy.
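
The conversion step itself is not shown here; below is a minimal sketch of how a Hugging Face model is commonly cast to FP16 with PyTorch (the checkpoint path is illustrative, and the exact script used for this model may differ):

```python
from transformers import GPT2LMHeadModel

# Load the fine-tuned FP32 checkpoint (path here is illustrative)
model = GPT2LMHeadModel.from_pretrained("path/to/finetuned-gpt2")

# Cast all weights to half precision (torch.float16)
model = model.half()

# Save the FP16 weights; from_pretrained will load them as-is
model.save_pretrained("gpt2-next-word-prediction-fp16")
```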

## Evaluation Metrics

A well-trained language model typically achieves a perplexity between 10 and 50, depending on the dataset and domain. This model's perplexity score is 32.4.
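
Perplexity is the exponential of the model's mean cross-entropy loss on held-out text. A minimal sketch of how such a score can be computed (the evaluation text below is illustrative, not the actual test set):

```python
import torch

# Reuses the `model`, `tokenizer`, and `device` objects loaded above
text = "The quick brown fox jumps over the lazy dog."
input_ids = tokenizer.encode(text, return_tensors="pt").to(device)

with torch.no_grad():
    # With labels supplied, the model returns the mean cross-entropy loss
    loss = model(input_ids, labels=input_ids).loss

perplexity = torch.exp(loss)
print(f"Perplexity: {perplexity.item():.1f}")
```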

## πŸ“‚ Repository Structure

```
.
β”œβ”€β”€ model/               # Contains the quantized model files
β”œβ”€β”€ tokenizer_config/    # Tokenizer configuration and vocabulary files
β”œβ”€β”€ model.safetensors    # Quantized model weights
β”œβ”€β”€ README.md            # Model documentation
```

## ⚠️ Limitations

- The model may struggle with tasks outside its training scope.
- Quantization may lead to a slight degradation in accuracy compared to full-precision models.
- Performance may vary across different writing styles and sentence structures.

## 🀝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.