Model Card for gpt2-xl
Model Details
Model Description
This model card presents details for gpt2-xl, a large autoregressive language model optimized for text generation tasks. It uses the GPT-2 architecture developed by OpenAI.
- Model type: Autoregressive Language Model
- Language(s) (NLP): English
Uses
Direct Use
The model can be used for text generation tasks, such as completing sentences or generating coherent paragraphs.
Bias, Risks, and Limitations
The model may exhibit biases present in the training data and could generate inappropriate or sensitive content. Users should exercise caution when deploying the model in production.
Recommendations
Users should be aware of potential biases and limitations of the model, particularly when used in applications that involve sensitive or high-stakes content.
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_txt = "Bananas are a great"
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"]

output = model.generate(input_ids, max_length=200, do_sample=False)
print(tokenizer.decode(output[0]))
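The snippet above uses do_sample=False, i.e. greedy decoding: at each step the model takes the single highest-probability next token. A minimal sketch of the difference between greedy decoding and sampling, using toy logits rather than real gpt2-xl outputs (the logit values below are illustrative assumptions, not model predictions):

import torch

# Toy next-token logits for a 5-token vocabulary (illustrative values).
logits = torch.tensor([2.0, 1.0, 0.5, 0.1, -1.0])

# Greedy decoding (do_sample=False): always pick the highest-scoring token.
greedy_id = torch.argmax(logits).item()

# Sampling (do_sample=True): draw from the softmax distribution,
# optionally sharpened or flattened by a temperature.
temperature = 0.7
probs = torch.softmax(logits / temperature, dim=-1)
sampled_id = torch.multinomial(probs, num_samples=1).item()

print(greedy_id)   # always 0 for these logits
print(sampled_id)  # varies run to run

Greedy decoding is deterministic but can produce repetitive text; sampling trades determinism for variety, which is why generation quality often improves with do_sample=True plus a moderate temperature.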
Training Details
Training Data
The model was trained on a diverse range of internet text, including news articles, books, and websites.
Training Hyperparameters
- Training regime: Autoregressive training with a large-scale language modeling objective
- Compute infrastructure: GPUs (specific details not disclosed)
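The autoregressive objective means each position is trained to predict the next token via cross-entropy. A minimal sketch of that loss computation, assuming toy random logits in place of real gpt2-xl outputs:

import torch
import torch.nn.functional as F

vocab_size, seq_len = 10, 6
torch.manual_seed(0)
logits = torch.randn(1, seq_len, vocab_size)           # (batch, seq, vocab) - toy values
input_ids = torch.randint(0, vocab_size, (1, seq_len))  # toy token ids

# Shift so that the prediction at position t is scored against token t+1.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]

loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab_size),
    shift_labels.reshape(-1),
)
print(loss.item())

The one-position shift is the defining detail: the model never sees the token it is asked to predict, which is what makes the objective autoregressive.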
Evaluation
Testing Data, Factors & Metrics
The model was evaluated on standard language modeling benchmarks, reporting perplexity on held-out data.
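Perplexity is the exponential of the mean per-token cross-entropy on held-out text; lower is better. A small sketch of the computation, with toy random logits standing in for real model outputs (an assumption for illustration):

import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, n_tokens = 50, 8
logits = torch.randn(n_tokens, vocab_size)            # toy per-token logits
labels = torch.randint(0, vocab_size, (n_tokens,))    # toy held-out token ids

nll = F.cross_entropy(logits, labels)  # mean negative log-likelihood per token
perplexity = math.exp(nll.item())
print(perplexity)

Intuitively, a perplexity of k means the model is, on average, as uncertain as if it were choosing uniformly among k tokens at each step.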