<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Blenderbot

**DISCLAIMER:** If you see something strange, file a [Github Issue](https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title).
## Overview

The Blender chatbot model was proposed in [Recipes for building an open-domain chatbot](https://arxiv.org/pdf/2004.13637.pdf) by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, and Jason Weston on 30 Apr 2020.

The abstract of the paper is the following:
*Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to their partners, and displaying knowledge, empathy and personality appropriately, while maintaining a consistent persona. We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.*
Tips:

- Blenderbot is a model with absolute position embeddings, so it's usually advised to pad the inputs on the right rather than the left; see the padding sketch below.
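
A minimal sketch of right-padding a batch, assuming the `facebook/blenderbot-400M-distill` checkpoint (tokenizers in Transformers default to `padding_side="right"`, so no extra configuration should be needed):

```python
>>> from transformers import BlenderbotTokenizer

>>> tokenizer = BlenderbotTokenizer.from_pretrained("facebook/blenderbot-400M-distill")
>>> # padding=True pads the shorter sequences in the batch; with the default
>>> # padding_side="right", pad tokens are appended after the text, not before it
>>> batch = tokenizer(
...     ["My friends are cool.", "My friends are cool but they eat too many carbs."],
...     padding=True,
...     return_tensors="pt",
... )
>>> batch["attention_mask"][0]  # trailing zeros mark the right-side padding
```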
This model was contributed by [sshleifer](https://huggingface.co/sshleifer). The authors' code can be found [here](https://github.com/facebookresearch/ParlAI).
## Implementation Notes

- Blenderbot uses a standard [Transformer](https://arxiv.org/pdf/1706.03762.pdf)-based seq2seq architecture.
- Available checkpoints can be found in the [model hub](https://huggingface.co/models?search=blenderbot).
- This is the *default* Blenderbot model class. However, some smaller checkpoints, such as
  `facebook/blenderbot_small_90M`, have a different architecture and consequently should be used with
  [BlenderbotSmall](blenderbot-small); see the sketch after this list.
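
For instance, a minimal sketch of loading a small checkpoint with the BlenderbotSmall classes (assuming `facebook/blenderbot_small-90M` as the checkpoint id on the Hub):

```python
>>> from transformers import BlenderbotSmallForConditionalGeneration, BlenderbotSmallTokenizer

>>> mname = "facebook/blenderbot_small-90M"
>>> # The small checkpoints use the BlenderbotSmall* classes, not the default ones
>>> model = BlenderbotSmallForConditionalGeneration.from_pretrained(mname)
>>> tokenizer = BlenderbotSmallTokenizer.from_pretrained(mname)
>>> inputs = tokenizer(["My friends are cool but they eat too many carbs."], return_tensors="pt")
>>> reply_ids = model.generate(**inputs)
```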
## Usage

Here is an example of model usage:

```python
>>> from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

>>> mname = "facebook/blenderbot-400M-distill"
>>> model = BlenderbotForConditionalGeneration.from_pretrained(mname)
>>> tokenizer = BlenderbotTokenizer.from_pretrained(mname)
>>> UTTERANCE = "My friends are cool but they eat too many carbs."
>>> inputs = tokenizer([UTTERANCE], return_tensors="pt")
>>> reply_ids = model.generate(**inputs)
>>> print(tokenizer.batch_decode(reply_ids))
["<s> That's unfortunate. Are they trying to lose weight or are they just trying to be healthier?</s>"]
```
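
As a follow-up to the example above, `batch_decode` accepts `skip_special_tokens=True` to strip the `<s>`/`</s>` markers, and `generate` accepts the standard generation parameters (a minimal sketch; the parameter values are illustrative):

```python
>>> # Decode without the sentence markers to get plain text
>>> reply = tokenizer.batch_decode(reply_ids, skip_special_tokens=True)[0]

>>> # generate() takes the usual generation parameters, e.g. beam search
>>> reply_ids = model.generate(**inputs, num_beams=4, max_length=60)
```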
## Documentation resources

- [Causal language modeling task guide](../tasks/language_modeling)
- [Translation task guide](../tasks/translation)
- [Summarization task guide](../tasks/summarization)
## BlenderbotConfig

[[autodoc]] BlenderbotConfig

## BlenderbotTokenizer

[[autodoc]] BlenderbotTokenizer
    - build_inputs_with_special_tokens

## BlenderbotTokenizerFast

[[autodoc]] BlenderbotTokenizerFast
    - build_inputs_with_special_tokens
## BlenderbotModel

See [`~transformers.BartModel`] for arguments to *forward* and *generate*.

[[autodoc]] BlenderbotModel
    - forward
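
As a quick orientation, a minimal sketch of a `forward` pass (the input sentence and decoder prompt are illustrative; like BART, the model takes encoder `input_ids` plus `decoder_input_ids`):

```python
>>> from transformers import AutoTokenizer, BlenderbotModel

>>> mname = "facebook/blenderbot-400M-distill"
>>> tokenizer = AutoTokenizer.from_pretrained(mname)
>>> model = BlenderbotModel.from_pretrained(mname)

>>> inputs = tokenizer("Studies have shown that owning a dog is good for you", return_tensors="pt")
>>> decoder_inputs = tokenizer("Studies show that", return_tensors="pt")
>>> outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_inputs.input_ids)
>>> last_hidden_states = outputs.last_hidden_state  # decoder hidden states, one per decoder token
```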
## BlenderbotForConditionalGeneration

See [`~transformers.BartForConditionalGeneration`] for arguments to *forward* and *generate*.

[[autodoc]] BlenderbotForConditionalGeneration
    - forward
## BlenderbotForCausalLM

[[autodoc]] BlenderbotForCausalLM
    - forward
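
A minimal sketch of using the decoder standalone as a causal language model (passing `add_cross_attention=False` to drop the encoder cross-attention is an assumption here; checkpoint and prompt are illustrative):

```python
>>> from transformers import AutoTokenizer, BlenderbotForCausalLM

>>> mname = "facebook/blenderbot-400M-distill"
>>> tokenizer = AutoTokenizer.from_pretrained(mname)
>>> # Load only the decoder as a standalone causal LM
>>> model = BlenderbotForCausalLM.from_pretrained(mname, add_cross_attention=False)

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
>>> outputs = model(**inputs)
>>> logits = outputs.logits  # next-token prediction scores for each position
```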
## TFBlenderbotModel

[[autodoc]] TFBlenderbotModel
    - call

## TFBlenderbotForConditionalGeneration

[[autodoc]] TFBlenderbotForConditionalGeneration
    - call

## FlaxBlenderbotModel

[[autodoc]] FlaxBlenderbotModel
    - __call__
    - encode
    - decode

## FlaxBlenderbotForConditionalGeneration

[[autodoc]] FlaxBlenderbotForConditionalGeneration
    - __call__
    - encode
    - decode