|
|
|
|
|
## GPT-2 for Next-line Prediction |
|
|
|
This repository hosts a GPT-2 model fine-tuned for **next-line prediction**. It was trained on the **OpenWebText** dataset and quantized to **FP16** to reduce model size and speed up inference without compromising performance.
|
|
|
## Model Details |
|
|
|
- **Model Architecture:** GPT-2 (Causal Language Model) |
|
- **Task:** Next-line Prediction |
|
- **Dataset:** OpenWebText (subset: `stas/openwebtext-10k`) |
|
- **Quantization:** FP16 for reduced model size and faster inference (see the loading sketch after this list)
|
- **Fine-tuning Framework:** Hugging Face Transformers |
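
Since the published weights are stored in FP16, they can be loaded directly in half precision. A minimal sketch; the `torch_dtype` argument is standard `from_pretrained` usage, not specific to this repository:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the FP16 weights in half precision on GPU; fall back to
# float32 on CPU, where FP16 inference is slow or unsupported.
dtype = torch.float16 if torch.cuda.is_available() else torch.float32
model = AutoModelForCausalLM.from_pretrained(
    "AventIQ-AI/gpt2-lmheadmodel-next-line-prediction-model",
    torch_dtype=dtype,
)
```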
|
|
|
## Training Details |
|
|
|
- **Number of Epochs:** 3 |
|
- **Batch Size:** 4 |
|
- **Evaluation Strategy:** per epoch (see the `TrainingArguments` sketch after this list)
|
- **Learning Rate:** 5e-5 |
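
These settings map onto Hugging Face `TrainingArguments` roughly as follows. This is a reconstruction for illustration only; the actual training script is not part of this repository, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the configuration listed above;
# "./gpt2-next-line" is a hypothetical output directory.
training_args = TrainingArguments(
    output_dir="./gpt2-next-line",
    num_train_epochs=3,             # Number of Epochs: 3
    per_device_train_batch_size=4,  # Batch Size: 4
    evaluation_strategy="epoch",    # evaluate once per epoch
                                    # (renamed to `eval_strategy` in newer releases)
    learning_rate=5e-5,
)
```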
|
|
|
## Evaluation Metrics

- **Perplexity:** 14.36
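
Perplexity is the exponential of the average per-token cross-entropy, so the figure can be sanity-checked on any text with a few lines. A sketch using an arbitrary sample sentence (the evaluation corpus and protocol behind the score above are not specified here):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "AventIQ-AI/gpt2-lmheadmodel-next-line-prediction-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Perplexity = exp(mean negative log-likelihood per token).
text = "Artificial intelligence is transforming the way we live and work."
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # With labels supplied, the model returns the mean next-token
    # cross-entropy loss over the sequence.
    loss = model(**enc, labels=enc["input_ids"]).loss
print("perplexity:", torch.exp(loss).item())
```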
|
|
|
## Limitations |
|
|
|
- The model is optimized for English-language next-line prediction.

- While FP16 quantization improves speed, it can introduce minor accuracy degradation.
|
- Performance on out-of-distribution text (e.g., highly technical or domain-specific data) may be limited. |
|
|
|
## Usage Instructions |
|
|
|
### Installation |
|
|
|
```sh |
|
pip install transformers torch |
|
``` |
|
|
|
### Loading the Model in Python |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Use a GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned model and its tokenizer
model_name = "AventIQ-AI/gpt2-lmheadmodel-next-line-prediction-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
|
``` |
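
### Sample Inference

With the model and tokenizer loaded as above, the following snippet tokenizes a prompt, generates a continuation using beam search combined with sampling, and strips leftover HTML entities from the decoded text:

```python
import html

# Define a test prompt
sample_text = "Artificial intelligence is transforming"

# Tokenize the input
inputs = tokenizer(sample_text, return_tensors="pt").to(device)

# Generate a prediction
with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_length=50,               # total length, prompt tokens included
        num_beams=5,
        do_sample=True,              # sample within each beam
        temperature=0.7,
        top_k=50,
        top_p=0.9,
        repetition_penalty=1.5,
        no_repeat_ngram_size=2,
        num_return_sequences=1,
        early_stopping=True,
        length_penalty=1.0,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS
        eos_token_id=tokenizer.eos_token_id,
        return_dict_in_generate=True,
        output_scores=True,
    )

# Decode and clean up HTML entities left over from web-scraped training data
generated_response = tokenizer.decode(output_tokens.sequences[0], skip_special_tokens=True)
cleaned_response = html.unescape(generated_response).replace("#39;", "'").replace("quot;", '"')

print("\nGenerated Response:\n", cleaned_response)
```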
|
|
|
## Repository Structure |
|
|
|
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights (FP16)
└── README.md            # Model documentation
```
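
`from_pretrained` resolves these files automatically. To mirror the repository locally first, `snapshot_download` from the `huggingface_hub` package (installed as a dependency of `transformers`) can be used:

```python
from huggingface_hub import snapshot_download

# Download the full repository (model weights, tokenizer files, README)
# into the local Hugging Face cache and return the local path.
local_dir = snapshot_download(repo_id="AventIQ-AI/gpt2-lmheadmodel-next-line-prediction-model")
print(local_dir)
```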
|
|
|
## Contributing |
|
|
|
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements. |
|
|