<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# CamemBERT

## Overview
The CamemBERT model was proposed in [CamemBERT: a Tasty French Language Model](https://arxiv.org/abs/1911.03894) by
Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la
Clergerie, Djamé Seddah, and Benoît Sagot. It is based on Facebook's RoBERTa model released in 2019 and is trained
on 138GB of French text.

The abstract from the paper is the following:

*Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available
models have either been trained on English data or on the concatenation of data in multiple languages. This makes
practical use of such models --in all languages except English-- very limited. Aiming to address this issue for French,
we release CamemBERT, a French version of the Bi-directional Encoders for Transformers (BERT). We measure the
performance of CamemBERT compared to multilingual models in multiple downstream tasks, namely part-of-speech tagging,
dependency parsing, named-entity recognition, and natural language inference. CamemBERT improves the state of the art
for most of the tasks considered. We release the pretrained model for CamemBERT hoping to foster research and
downstream applications for French NLP.*

Tips:

- This implementation is the same as RoBERTa. Refer to the [documentation of RoBERTa](roberta) for usage examples
  as well as the information relative to the inputs and outputs.
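
As a rough illustration of the inputs and outputs mentioned in the tip above, the sketch below builds a small, randomly initialised `CamembertModel` from a `CamembertConfig` and runs one forward pass. The hyperparameter values and the fake batch of token ids are assumptions chosen only to keep the example fast; they do not correspond to any released CamemBERT checkpoint, and PyTorch is assumed to be installed.

```python
import torch
from transformers import CamembertConfig, CamembertModel

# Deliberately tiny, hypothetical hyperparameters so the model builds quickly;
# these are NOT the sizes of any released CamemBERT checkpoint.
config = CamembertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)

# Randomly initialised weights: nothing is downloaded.
model = CamembertModel(config)
model.eval()

# A fake batch of 2 sequences with 8 token ids each. The ids are arbitrary
# values inside the vocabulary, starting at 5 to stay clear of the default
# special-token ids (including the padding id).
input_ids = torch.randint(low=5, high=config.vocab_size, size=(2, 8))

with torch.no_grad():
    outputs = model(input_ids=input_ids)

# One hidden vector per token: (batch_size, sequence_length, hidden_size)
hidden_shape = tuple(outputs.last_hidden_state.shape)
print(hidden_shape)
```

Because the weights are random, the hidden states carry no linguistic meaning; the point is only the shape of the call. With pretrained weights, the config and model would instead be loaded through `from_pretrained`, and the token ids would come from a CamemBERT tokenizer.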

This model was contributed by [camembert](https://huggingface.co/camembert). The original code can be found [here](https://camembert-model.fr/).

## Documentation resources

- [Text classification task guide](../tasks/sequence_classification)
- [Token classification task guide](../tasks/token_classification)
- [Question answering task guide](../tasks/question_answering)
- [Causal language modeling task guide](../tasks/language_modeling)
- [Masked language modeling task guide](../tasks/masked_language_modeling)
- [Multiple choice task guide](../tasks/multiple_choice)
## CamembertConfig

[[autodoc]] CamembertConfig

## CamembertTokenizer

[[autodoc]] CamembertTokenizer
- build_inputs_with_special_tokens
- get_special_tokens_mask
- create_token_type_ids_from_sequences
- save_vocabulary

## CamembertTokenizerFast

[[autodoc]] CamembertTokenizerFast

## CamembertModel

[[autodoc]] CamembertModel

## CamembertForCausalLM

[[autodoc]] CamembertForCausalLM

## CamembertForMaskedLM

[[autodoc]] CamembertForMaskedLM

## CamembertForSequenceClassification

[[autodoc]] CamembertForSequenceClassification

## CamembertForMultipleChoice

[[autodoc]] CamembertForMultipleChoice

## CamembertForTokenClassification

[[autodoc]] CamembertForTokenClassification

## CamembertForQuestionAnswering

[[autodoc]] CamembertForQuestionAnswering

## TFCamembertModel

[[autodoc]] TFCamembertModel
## TFCamembertForCausalLM

[[autodoc]] TFCamembertForCausalLM
## TFCamembertForMaskedLM

[[autodoc]] TFCamembertForMaskedLM

## TFCamembertForSequenceClassification

[[autodoc]] TFCamembertForSequenceClassification

## TFCamembertForMultipleChoice

[[autodoc]] TFCamembertForMultipleChoice

## TFCamembertForTokenClassification

[[autodoc]] TFCamembertForTokenClassification

## TFCamembertForQuestionAnswering

[[autodoc]] TFCamembertForQuestionAnswering