<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Utilities for Generation

This page lists all the utility functions used by [`~generation.GenerationMixin.generate`],
[`~generation.GenerationMixin.greedy_search`],
[`~generation.GenerationMixin.contrastive_search`],
[`~generation.GenerationMixin.sample`],
[`~generation.GenerationMixin.beam_search`],
[`~generation.GenerationMixin.beam_sample`],
[`~generation.GenerationMixin.group_beam_search`], and
[`~generation.GenerationMixin.constrained_beam_search`].

Most of those are only useful if you are studying the code of the generate methods in the library.
## Generate Outputs

The output of [`~generation.GenerationMixin.generate`] is an instance of a subclass of
[`~utils.ModelOutput`]. This output is a data structure containing all the information returned
by [`~generation.GenerationMixin.generate`], but that can also be used as a tuple or a dictionary.

Here's an example:
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")
generation_output = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
```
The `generation_output` object is a [`~generation.GreedySearchDecoderOnlyOutput`]. As we can
see in the documentation of that class below, it has the following attributes:

- `sequences`: the generated sequences of tokens
- `scores` (optional): the prediction scores of the language modeling head, for each generation step
- `hidden_states` (optional): the hidden states of the model, for each generation step
- `attentions` (optional): the attention weights of the model, for each generation step

Here we have the `scores` since we passed along `output_scores=True`, but we don't have `hidden_states` and
`attentions` because we didn't pass `output_hidden_states=True` or `output_attentions=True`.

You can access each attribute as you would usually do, and if that attribute has not been returned by the model, you
will get `None`. Here for instance `generation_output.scores` are all the generated prediction scores of the
language modeling head, and `generation_output.attentions` is `None`.
When using our `generation_output` object as a tuple, it only keeps the attributes that don't have `None` values.
Here, for instance, it has two elements, `sequences` then `scores`, so

```python
generation_output[:2]
```

will return the tuple `(generation_output.sequences, generation_output.scores)`.

When using our `generation_output` object as a dictionary, it only keeps the attributes that don't have `None`
values. Here, for instance, it has two keys that are `sequences` and `scores`.
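The tuple/dict duality described above can be sketched with a minimal stand-in class. This is an illustrative sketch, not the real [`~utils.ModelOutput`] implementation (which also handles slicing, dataclass fields, and more):

```python
# Minimal sketch of the ModelOutput behavior described above: attributes whose
# value is None are skipped when the object is used as a tuple or a dict.
# Illustrative stand-in only, not the real transformers implementation.
from collections import OrderedDict


class SketchOutput(OrderedDict):
    def __init__(self, **kwargs):
        # Keep only the fields that were actually returned (non-None).
        super().__init__((k, v) for k, v in kwargs.items() if v is not None)

    def __getattr__(self, name):
        # Attribute access returns None for fields that were not returned.
        return self.get(name, None)

    def to_tuple(self):
        return tuple(self.values())


out = SketchOutput(sequences=[15496, 11], scores=[0.1, 0.2], attentions=None)
print(out.sequences)    # the generated token ids
print(out.attentions)   # None: output_attentions=True was not passed
print(list(out.keys())) # only the non-None fields: ['sequences', 'scores']
```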
We document here all output types.

### GreedySearchOutput

[[autodoc]] generation.GreedySearchDecoderOnlyOutput

[[autodoc]] generation.GreedySearchEncoderDecoderOutput

[[autodoc]] generation.FlaxGreedySearchOutput

### SampleOutput

[[autodoc]] generation.SampleDecoderOnlyOutput

[[autodoc]] generation.SampleEncoderDecoderOutput

[[autodoc]] generation.FlaxSampleOutput

### BeamSearchOutput

[[autodoc]] generation.BeamSearchDecoderOnlyOutput

[[autodoc]] generation.BeamSearchEncoderDecoderOutput

### BeamSampleOutput

[[autodoc]] generation.BeamSampleDecoderOnlyOutput

[[autodoc]] generation.BeamSampleEncoderDecoderOutput
## LogitsProcessor

A [`LogitsProcessor`] can be used to modify the prediction scores of a language model head for
generation.
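The interface can be sketched without any framework dependency: a processor receives the tokens generated so far and the next-token scores, and returns modified scores. The real classes below operate on `torch` tensors; plain lists are used here to keep the sketch self-contained, and `MinLengthSketch` is an illustrative stand-in, not the real [`MinLengthLogitsProcessor`]:

```python
# Framework-free sketch of the LogitsProcessor contract: __call__ takes
# (input_ids, scores) and returns modified scores. Illustrative only.
import math


class MinLengthSketch:
    """Forbid the EOS token until at least `min_length` tokens were generated."""

    def __init__(self, min_length, eos_token_id):
        self.min_length = min_length
        self.eos_token_id = eos_token_id

    def __call__(self, input_ids, scores):
        if len(input_ids) < self.min_length:
            scores = list(scores)
            scores[self.eos_token_id] = -math.inf  # EOS can never be picked
        return scores


processor = MinLengthSketch(min_length=3, eos_token_id=0)
scores = processor(input_ids=[5, 7], scores=[1.0, 0.5, 0.2])
print(scores)  # [-inf, 0.5, 0.2]: sequence still too short, EOS masked out
```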
[[autodoc]] LogitsProcessor
    - __call__

[[autodoc]] LogitsProcessorList
    - __call__

[[autodoc]] LogitsWarper
    - __call__

[[autodoc]] MinLengthLogitsProcessor
    - __call__

[[autodoc]] MinNewTokensLengthLogitsProcessor
    - __call__

[[autodoc]] TemperatureLogitsWarper
    - __call__

[[autodoc]] RepetitionPenaltyLogitsProcessor
    - __call__

[[autodoc]] TopPLogitsWarper
    - __call__

[[autodoc]] TopKLogitsWarper
    - __call__

[[autodoc]] TypicalLogitsWarper
    - __call__

[[autodoc]] NoRepeatNGramLogitsProcessor
    - __call__

[[autodoc]] NoBadWordsLogitsProcessor
    - __call__

[[autodoc]] PrefixConstrainedLogitsProcessor
    - __call__

[[autodoc]] HammingDiversityLogitsProcessor
    - __call__

[[autodoc]] ForcedBOSTokenLogitsProcessor
    - __call__

[[autodoc]] ForcedEOSTokenLogitsProcessor
    - __call__

[[autodoc]] InfNanRemoveLogitsProcessor
    - __call__

[[autodoc]] TFLogitsProcessor
    - __call__

[[autodoc]] TFLogitsProcessorList
    - __call__

[[autodoc]] TFLogitsWarper
    - __call__

[[autodoc]] TFTemperatureLogitsWarper
    - __call__

[[autodoc]] TFTopPLogitsWarper
    - __call__

[[autodoc]] TFTopKLogitsWarper
    - __call__

[[autodoc]] TFMinLengthLogitsProcessor
    - __call__

[[autodoc]] TFNoBadWordsLogitsProcessor
    - __call__

[[autodoc]] TFNoRepeatNGramLogitsProcessor
    - __call__

[[autodoc]] TFRepetitionPenaltyLogitsProcessor
    - __call__

[[autodoc]] TFForcedBOSTokenLogitsProcessor
    - __call__

[[autodoc]] TFForcedEOSTokenLogitsProcessor
    - __call__

[[autodoc]] FlaxLogitsProcessor
    - __call__

[[autodoc]] FlaxLogitsProcessorList
    - __call__

[[autodoc]] FlaxLogitsWarper
    - __call__

[[autodoc]] FlaxTemperatureLogitsWarper
    - __call__

[[autodoc]] FlaxTopPLogitsWarper
    - __call__

[[autodoc]] FlaxTopKLogitsWarper
    - __call__

[[autodoc]] FlaxForcedBOSTokenLogitsProcessor
    - __call__

[[autodoc]] FlaxForcedEOSTokenLogitsProcessor
    - __call__

[[autodoc]] FlaxMinLengthLogitsProcessor
    - __call__
## StoppingCriteria

A [`StoppingCriteria`] can be used to change when to stop generation (other than EOS token).
[[autodoc]] StoppingCriteria
    - __call__

[[autodoc]] StoppingCriteriaList
    - __call__

[[autodoc]] MaxLengthCriteria
    - __call__

[[autodoc]] MaxTimeCriteria
    - __call__
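The contract can be sketched in a few lines: a criterion is called at every step with the sequence generated so far and returns `True` when generation should stop. `MaxLengthSketch` below is an illustrative stand-in for what [`MaxLengthCriteria`] does, using a plain list of token ids instead of tensors:

```python
# Sketch of the StoppingCriteria contract: called once per generation step,
# returns True when generation should stop. Illustrative stand-in only.
class MaxLengthSketch:
    def __init__(self, max_length):
        self.max_length = max_length

    def __call__(self, input_ids, scores=None):
        # Stop as soon as the sequence reaches the maximum length.
        return len(input_ids) >= self.max_length


criterion = MaxLengthSketch(max_length=4)
print(criterion([1, 2, 3]))     # False: keep generating
print(criterion([1, 2, 3, 4]))  # True: maximum length reached
```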
## Constraints

A [`Constraint`] can be used to force the generation to include specific tokens or sequences in the output.

[[autodoc]] Constraint

[[autodoc]] PhrasalConstraint

[[autodoc]] DisjunctiveConstraint

[[autodoc]] ConstraintListState
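The core bookkeeping behind a phrasal constraint can be sketched as tracking progress through a required token sequence. The real [`PhrasalConstraint`] API is richer (state copying, rewinding, batching); `PhrasalSketch` below is only an illustrative stand-in showing the idea:

```python
# Illustrative sketch of what a phrasal constraint tracks: how far along the
# required token sequence the generated output currently is. Not the real API.
class PhrasalSketch:
    def __init__(self, token_ids):
        self.token_ids = token_ids
        self.fulfilled = 0  # how many tokens of the phrase were matched so far

    def advance(self):
        # Next token the generator must produce to make progress, or None.
        if self.fulfilled == len(self.token_ids):
            return None
        return self.token_ids[self.fulfilled]

    def update(self, token_id):
        # Record a generated token; reset progress on a mismatch.
        if token_id == self.token_ids[self.fulfilled]:
            self.fulfilled += 1
        else:
            self.fulfilled = 0
        return self.fulfilled == len(self.token_ids)


c = PhrasalSketch([10, 20, 30])
print(c.advance())   # 10: first required token
c.update(10)
c.update(20)
print(c.advance())   # 30: last token still missing
print(c.update(30))  # True: the whole phrase now appears in the output
```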
## BeamSearch

[[autodoc]] BeamScorer
    - process
    - finalize

[[autodoc]] BeamSearchScorer
    - process
    - finalize

[[autodoc]] ConstrainedBeamSearchScorer
    - process
    - finalize
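The procedure these scorers orchestrate can be illustrated with a toy beam search: at each step every hypothesis is extended with every candidate token, and only the `num_beams` highest-scoring hypotheses survive. The step distributions below are made up for illustration:

```python
# Toy beam search: keep the num_beams best hypotheses by cumulative
# log-probability at each step. Illustrative sketch only.
import math


def beam_search_sketch(step_log_probs, num_beams):
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for log_probs in step_log_probs:  # one {token: log_prob} dict per step
        candidates = [
            (seq + [tok], score + lp)
            for seq, score in beams
            for tok, lp in log_probs.items()
        ]
        # Keep only the num_beams highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    return beams


steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"a": math.log(0.3), "b": math.log(0.7)},
]
best_seq, best_score = beam_search_sketch(steps, num_beams=2)[0]
print(best_seq)  # ['a', 'b']: highest joint probability, 0.6 * 0.7 = 0.42
```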
## Utilities

[[autodoc]] top_k_top_p_filtering

[[autodoc]] tf_top_k_top_p_filtering
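The idea behind top-k / top-p (nucleus) filtering can be sketched on plain probability lists: keep the `k` most likely tokens, or the smallest set whose cumulative probability reaches `top_p`, and suppress the rest. The real functions operate on logit tensors and set removed entries to `-inf` instead of zero; this is only an illustrative sketch:

```python
# Sketch of top-k / top-p filtering on a plain probability list.
# Illustrative stand-in for the idea behind top_k_top_p_filtering.
def top_k_top_p_sketch(probs, top_k=0, top_p=1.0):
    # Work on (index, prob) pairs sorted by descending probability.
    ranked = sorted(enumerate(probs), key=lambda x: x[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]  # keep only the k most likely tokens
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for idx, p in ranked:
            kept.append((idx, p))
            cumulative += p
            if cumulative >= top_p:  # smallest set whose mass reaches top_p
                break
        ranked = kept
    allowed = {idx for idx, _ in ranked}
    # Zero out everything outside the kept set (the real function assigns
    # a filter value such as -inf to the corresponding logits instead).
    return [p if i in allowed else 0.0 for i, p in enumerate(probs)]


print(top_k_top_p_sketch([0.5, 0.25, 0.2, 0.05], top_k=2))   # [0.5, 0.25, 0.0, 0.0]
print(top_k_top_p_sketch([0.5, 0.25, 0.2, 0.05], top_p=0.7)) # [0.5, 0.25, 0.0, 0.0]
```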
## Streamers

[[autodoc]] TextStreamer

[[autodoc]] TextIteratorStreamer
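The contract a streamer fulfills can be sketched as two callbacks: the generation loop calls `put()` with each newly produced chunk of token ids and `end()` once generation finishes. `CollectingStreamer` below is an illustrative stand-in (the real `TextStreamer` decodes with a tokenizer and prints incrementally); the tiny `vocab` dict is a made-up decoder for the example:

```python
# Sketch of the streamer contract: put() receives new token ids as they are
# generated, end() signals completion. Illustrative stand-in only.
class CollectingStreamer:
    def __init__(self, decode):
        self.decode = decode  # token ids -> text (a tokenizer.decode stand-in)
        self.chunks = []

    def put(self, token_ids):
        # Called by the generation loop for every newly produced token(s).
        self.chunks.append(self.decode(token_ids))

    def end(self):
        # Called once generation is finished.
        self.chunks.append("\n")


vocab = {0: "Hello", 1: ",", 2: " world"}
streamer = CollectingStreamer(lambda ids: "".join(vocab[i] for i in ids))
for ids in ([0], [1], [2]):  # what the generation loop would feed step by step
    streamer.put(ids)
streamer.end()
print("".join(streamer.chunks))  # Hello, world
```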