
	import html

	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Set device
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# Model and tokenizer
	model_name = "AventIQ-AI/gpt2-lmheadmodel-next-line-prediction-model"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

	# Define test text
	sample_text = "Artificial intelligence is transforming"

	# Tokenize input
	inputs = tokenizer(sample_text, return_tensors="pt").to(device)

	# Generate prediction (beam search combined with sampling)
	with torch.no_grad():
	    output_tokens = model.generate(
	        **inputs,
	        max_length=50,  # total length in tokens, prompt included
	        num_beams=5,
	        repetition_penalty=1.5,
	        temperature=0.7,
	        top_k=50,
	        top_p=0.9,
	        do_sample=True,
	        no_repeat_ngram_size=2,
	        num_return_sequences=1,
	        early_stopping=True,
	        length_penalty=1.0,
	        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse EOS
	        eos_token_id=tokenizer.eos_token_id,
	        return_dict_in_generate=True,
	        output_scores=True,
	    )

	# Decode and clean response
	generated_response = tokenizer.decode(output_tokens.sequences[0], skip_special_tokens=True)
	# Unescape HTML entities, then patch entity fragments that lost their leading "&"
	cleaned_response = html.unescape(generated_response).replace("#39;", "'").replace("quot;", '"')

	print("\nGenerated Response:\n", cleaned_response)
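The `temperature`, `top_k`, and `top_p` arguments in the script above shape which candidate tokens can be sampled at each step. The following is a minimal pure-Python sketch of that idea on a made-up four-token distribution. It is not the Transformers implementation, which differs in details and also interacts with beam search, and the function names `softmax_with_temperature` and `filter_top_k_top_p` are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities; temperature < 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=50, top_p=0.9):
    """Return {token_index: renormalized_prob} after top-k, then top-p, filtering."""
    # Rank token indices from most to least probable and keep at most top_k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for idx in ranked:
        kept.append(idx)
        cumulative += probs[idx]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# ASSUMPTION: made-up scores for four candidate tokens, for illustration only.
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax_with_temperature(toy_logits, temperature=0.7)
candidates = filter_top_k_top_p(probs, top_k=3, top_p=0.9)
print(candidates)
```

With these toy scores, only the two most likely tokens survive the filter: together they already cover more than 90% of the probability mass, so the sampler never sees the long tail.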

	## GPT-2 for Next-line Prediction

	This repository hosts a fine-tuned GPT-2 model optimized for next-line prediction tasks. The model was fine-tuned on a subset of the OpenWebText dataset (`stas/openwebtext-10k`) and quantized to FP16 to reduce model size and speed up inference, at the cost of a possible minor loss in accuracy.

	## Model Details

	- Model Architecture: GPT-2 (Causal Language Model)
	- Task: Next-line Prediction
	- Dataset: OpenWebText (subset: `stas/openwebtext-10k`)
	- Quantization: FP16 for reduced model size and faster inference
	- Fine-tuning Framework: Hugging Face Transformers

	## Training Details

	- Number of Epochs: 3
	- Batch Size: 4
	- Evaluation Strategy: Epoch
	- Learning Rate: 5e-5
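To get a feel for how much training these settings imply, the sketch below counts optimizer updates. It rests on assumptions the card does not confirm: 10,000 training examples (the subset name suggests 10k documents, but the train/eval split is not stated), a single device, and no gradient accumulation. The helper `total_optimizer_steps` and the dictionary keys are illustrative names only.

```python
import math

# Hyperparameters as listed in the model card (keys are illustrative labels).
hyperparams = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "learning_rate": 5e-5,
}


def total_optimizer_steps(num_examples, batch_size, epochs, grad_accum_steps=1, num_devices=1):
    """Count optimizer updates, assuming the last partial batch of each epoch is kept."""
    batches_per_epoch = math.ceil(num_examples / (batch_size * num_devices))
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
    return steps_per_epoch * epochs


# ASSUMPTION: 10,000 training examples; the card does not state the train/eval split.
assumed_num_examples = 10_000
steps = total_optimizer_steps(
    assumed_num_examples,
    hyperparams["per_device_train_batch_size"],
    hyperparams["num_train_epochs"],
)
print(steps)  # 7500 under the assumptions above
```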

	## Evaluation Metrics (Perplexity Score)

	- Perplexity Score: 14.355693817138672 (approximately 14.36)
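The card does not say how the perplexity was computed or on which split. Under the common convention, perplexity is the exponential of the mean per-token negative log-likelihood (in nats), so a score near 14.36 corresponds to an average loss of roughly 2.66. A minimal sketch of that relationship, using made-up losses rather than values from this model:

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token), using natural logs."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# ASSUMPTION: made-up per-token losses, for illustration only (not from this model).
example_nlls = [2.1, 3.0, 2.9]
print(round(perplexity(example_nlls), 2))

# Going the other way: the mean loss implied by the reported perplexity.
reported_perplexity = 14.355693817138672
implied_mean_loss = math.log(reported_perplexity)
print(round(implied_mean_loss, 3))
```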

	## Limitations

	- The model is optimized for English-language next-line prediction tasks.
	- While quantization improves speed, minor accuracy degradation may occur.
	- Performance on out-of-distribution text (e.g., highly technical or domain-specific data) may be limited.
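The accuracy caveat for FP16 comes from its limited precision: half-precision floats carry 10 explicit mantissa bits, so each stored value can be off by a relative error of up to about 2^-11 in the normal range. The sketch below shows this rounding with Python's standard `struct` module. The weight values are made up, and the helper `to_fp16_and_back` is a hypothetical name, not part of this repository.

```python
import struct


def to_fp16_and_back(value):
    """Round-trip a Python float through IEEE 754 half precision (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# ASSUMPTION: made-up weight values, for illustration only.
weights = [0.1, -0.33333, 1.0, 0.00012345]
rel_errors = []
for w in weights:
    stored = to_fp16_and_back(w)
    rel_error = abs(stored - w) / abs(w)
    rel_errors.append(rel_error)
    print(f"{w!r:>12} -> {stored!r:<24} relative error {rel_error:.2e}")
```

Per-weight errors of this size are usually harmless on their own, but they accumulate across layers, which is why small differences from the full-precision model can appear.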

	## Usage Instructions

	### Installation

	```sh
	pip install transformers torch
	```

	### Loading the Model in Python

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	device = "cuda" if torch.cuda.is_available() else "cpu"

	model_name = "AventIQ-AI/gpt2-lmheadmodel-next-line-prediction-model"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
	```
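A causal language model returns the prompt followed by a free-running continuation, so a next-line use case typically needs a small post-processing step. The helper below is a sketch of one way to do that: it treats the "next line" as the first non-empty line after the prompt, which is an assumption about the intended task rather than something the card specifies. The name `extract_next_line` and the sample output are hypothetical.

```python
def extract_next_line(prompt, generated_text):
    """Return the first non-empty line the model produced after the prompt."""
    # Causal LMs echo the prompt, so drop it when it is a prefix of the output.
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text

    # Take the first non-blank line of the continuation.
    for line in continuation.splitlines():
        if line.strip():
            return line.strip()
    return ""


# ASSUMPTION: made-up model output, for illustration only.
prompt = "Artificial intelligence is transforming"
fake_output = prompt + " the way we work.\nIt also raises new questions."
print(extract_next_line(prompt, fake_output))
```

In practice, `generated_text` would be the string returned by `tokenizer.decode(...)` on the output of `model.generate(...)`.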

	## Repository Structure

	```
	.
	├── model/               # Contains the quantized model files
	├── tokenizer_config/    # Tokenizer configuration and vocabulary files
	├── model.safetensors    # Quantized model weights
	└── README.md            # Model documentation
	```

	## Contributing

	Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.