---
library_name: transformers
tags:
- text-generation-inference
license: mit
language:
- en
---
# Model Card for gpt2-xl
gpt2-xl is the 1.5B-parameter version of GPT-2, an autoregressive transformer language model released by OpenAI for English text generation.
## Model Details
### Model Description
gpt2-xl is a large autoregressive language model optimized for text generation tasks. It uses the GPT-2 architecture developed by OpenAI and is the largest of the publicly released GPT-2 checkpoints.
- **Model type:** Autoregressive Language Model
- **Language(s) (NLP):** English
## Uses
### Direct Use
The model can be used for text generation tasks, such as completing sentences or generating coherent paragraphs.
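For a quick first try, the high-level `pipeline` API in `transformers` wraps tokenization and generation in one call. This is a minimal sketch; the prompt and the `max_length` value are illustrative choices, not recommended settings:

```python
from transformers import pipeline

# Load gpt2-xl behind the high-level text-generation pipeline
generator = pipeline("text-generation", model="gpt2-xl")

# Prompt and max_length are illustrative, not tuned values
result = generator("Bananas are a great", max_length=50, num_return_sequences=1)
print(result[0]["generated_text"])
```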
## Bias, Risks, and Limitations
The model may exhibit biases present in the training data and could generate inappropriate or sensitive content. Users should exercise caution when deploying the model in production.
### Recommendations
Users should be aware of potential biases and limitations of the model, particularly when used in applications that involve sensitive or high-stakes content.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt into input token IDs
input_txt = "Bananas are a great"
input_ids = tokenizer(input_txt, return_tensors="pt")["input_ids"]

# Greedy decoding: do_sample=False always picks the most likely next token
output = model.generate(input_ids, max_length=200, do_sample=False)
print(tokenizer.decode(output[0]))
```
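Greedy decoding tends to repeat itself on longer continuations. For more varied output, sampling can be enabled via `generate`'s sampling parameters; the values below are a sketch, not tuned settings for this model:

```python
# Nucleus sampling with a top-k cutoff; parameter values are illustrative
output = model.generate(
    input_ids,
    max_length=200,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.9,
)
print(tokenizer.decode(output[0]))
```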
## Training Details
### Training Data
The model was trained on WebText, OpenAI's dataset of internet text collected from outbound Reddit links, spanning a diverse range of sources such as news articles, blogs, and other websites.
#### Training Hyperparameters
- **Training regime:** Autoregressive language modeling (next-token prediction) at large scale
- **Compute infrastructure:** GPUs (specific details not disclosed)
## Evaluation
### Testing Data, Factors & Metrics
The model was evaluated on standard language modeling benchmarks, with perplexity on held-out data as the primary metric.
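As an illustration of how perplexity is computed, here is a minimal sketch that scores a single piece of text with the model. The sample sentence is arbitrary; benchmark evaluation runs over held-out corpora rather than one sentence:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Arbitrary sample text; real evaluation uses a held-out corpus
text = "Bananas are a great source of potassium."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy
    # over next-token predictions
    outputs = model(**inputs, labels=inputs["input_ids"])

# Perplexity is the exponential of the average cross-entropy loss
perplexity = torch.exp(outputs.loss)
print(f"Perplexity: {perplexity.item():.2f}")
```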