julianrisch committed
Commit eae7258
1 Parent(s): e30c9cf

Update README.md

Files changed (1)
  1. README.md +2 -48
README.md CHANGED
@@ -5,13 +5,12 @@ datasets:
  license: cc-by-4.0
  ---
 
- # tinyroberta for Extractive QA
+ # roberta-base distilled into tinyroberta
 
  ## Overview
- **Language model:** tinyroberta-squad2
+ **Language model:** roberta-base
  **Language:** English
  **Training data:** The PILE
- **Code:** See [an example extractive QA pipeline built with Haystack](https://haystack.deepset.ai/tutorials/34_extractive_qa_pipeline)
  **Infrastructure**: 4x Tesla v100
 
  ## Hyperparameters
@@ -19,7 +18,6 @@ license: cc-by-4.0
  ```
  batch_size = 96
  n_epochs = 4
- base_LM_model = "deepset/tinyroberta-squad2-step1"
  max_seq_len = 384
  learning_rate = 1e-4
  lr_schedule = LinearWarmup
@@ -32,50 +30,6 @@ This model was distilled using the TinyBERT approach described in [this paper](h
  We have performed intermediate layer distillation with roberta-base as the teacher which resulted in [deepset/tinyroberta-6l-768d](https://huggingface.co/deepset/tinyroberta-6l-768d).
  This model has not been distilled for any specific task. If you are interested in using distillation to improve its performance on a downstream task, you can take advantage of Haystack's [distillation functionality](https://haystack.deepset.ai/guides/model-distillation). You can also check out [deepset/tinyroberta-squad2](https://huggingface.co/deepset/tinyroberta-squad2) for a model that has already been distilled on an extractive QA downstream task.
 
- ## Usage
-
- ### In Haystack
- Haystack is an AI orchestration framework for building customizable, production-ready LLM applications. You can use this model in Haystack for extractive question answering on documents.
- To load and run the model with [Haystack](https://github.com/deepset-ai/haystack/):
- ```python
- # After running pip install haystack-ai "transformers[torch,sentencepiece]"
-
- from haystack import Document
- from haystack.components.readers import ExtractiveReader
-
- docs = [
-     Document(content="Python is a popular programming language"),
-     Document(content="python ist eine beliebte Programmiersprache"),
- ]
-
- reader = ExtractiveReader(model="deepset/tinyroberta-6l-768d")
- reader.warm_up()
-
- question = "What is a popular programming language?"
- result = reader.run(query=question, documents=docs)
- # {'answers': [ExtractedAnswer(query='What is a popular programming language?', score=0.5740374326705933, data='python', document=Document(id=..., content: '...'), context=None, document_offset=ExtractedAnswer.Span(start=0, end=6),...)]}
- ```
- For a complete example with an extractive question answering pipeline that scales over many documents, check out the [corresponding Haystack tutorial](https://haystack.deepset.ai/tutorials/34_extractive_qa_pipeline).
-
- ### In Transformers
- ```python
- from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
-
- model_name = "deepset/tinyroberta-6l-768d"
-
- # a) Get predictions
- nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
- QA_input = {
-     'question': 'Why is model conversion important?',
-     'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
- }
- res = nlp(QA_input)
-
- # b) Load model & tokenizer
- model = AutoModelForQuestionAnswering.from_pretrained(model_name)
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- ```
-
  ## About us
  <div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
  <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
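The README retained by this commit points to Haystack's model distillation guide for improving the checkpoint on a downstream task. As a rough illustration only, the sketch below shows what prediction-layer (logit) distillation onto this model could look like with plain transformers and PyTorch rather than the Haystack API; the teacher checkpoint (deepset/roberta-base-squad2), the temperature, and the single hand-written training pair are assumptions made for the sketch, not details taken from the model card.

```python
# Illustrative sketch only: task-specific (prediction-layer) distillation of
# deepset/tinyroberta-6l-768d with plain transformers/PyTorch. The teacher,
# temperature, and training data below are assumptions, not from the card.
import torch
import torch.nn.functional as F
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

teacher_name = "deepset/roberta-base-squad2"  # assumed task-specific teacher
student_name = "deepset/tinyroberta-6l-768d"  # this model as the student

# Both checkpoints derive from roberta-base, so one tokenizer serves both.
tokenizer = AutoTokenizer.from_pretrained(student_name)
teacher = AutoModelForQuestionAnswering.from_pretrained(teacher_name).eval()
# The student has no QA head yet, so transformers initializes one randomly here.
student = AutoModelForQuestionAnswering.from_pretrained(student_name)

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
temperature = 2.0  # softens the teacher distribution; a common choice, not from the card


def distillation_step(question: str, context: str) -> float:
    """Run one step that pulls the student's start/end logits toward the teacher's."""
    batch = tokenizer(question, context, truncation=True, max_length=384,
                      return_tensors="pt")
    with torch.no_grad():
        teacher_out = teacher(**batch)
    student_out = student(**batch)

    # KL divergence between temperature-softened start and end distributions.
    loss = 0.0
    for s_logits, t_logits in ((student_out.start_logits, teacher_out.start_logits),
                               (student_out.end_logits, teacher_out.end_logits)):
        loss = loss + F.kl_div(
            F.log_softmax(s_logits / temperature, dim=-1),
            F.softmax(t_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Example usage on a single (question, context) pair; a real run would loop
# over a full extractive QA dataset such as SQuAD.
print(distillation_step(
    "Why is model conversion important?",
    "The option to convert models between FARM and transformers gives freedom to the user.",
))
```

In practice this logit-matching step would be combined with the intermediate layer distillation already performed for this checkpoint, following the TinyBERT procedure the README cites.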