tau/sled · Transformers · English

maorivgi committed · Commit 04580dd · 1 Parent(s): 1753672

updated readme

Files changed (1): README.md +30 -5
README.md CHANGED
@@ -6,7 +6,7 @@ language: en
 # BART-SLED (SLiding-Encoder and Decoder, base-sized model)
 
 SLED models use pretrained, short-range encoder-decoder models, and apply them over
- long-text inputs by splitting the input into multiple overlapping chunks, encoding each independtly and perform fusion-in-decoder
+ long-text inputs by splitting the input into multiple overlapping chunks, encoding each independently and performing fusion-in-decoder.
 
 ## Model description
 
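The changed line above is SLED's core idea: split the long input into overlapping chunks, encode each chunk independently with the short-range encoder, and let the decoder attend over all chunk encodings (fusion-in-decoder). A minimal sketch of the chunking step; `chunk_size` and `overlap` are made-up values, not the model's actual configuration:

```python
# Illustrative sketch only; the real chunking lives inside the sled package.
def split_into_overlapping_chunks(input_ids, chunk_size=256, overlap=64):
    """Split a token-id sequence into overlapping windows."""
    stride = chunk_size - overlap
    return [input_ids[start:start + chunk_size]
            for start in range(0, max(len(input_ids) - overlap, 1), stride)]

# Each chunk would be encoded independently; the decoder then performs
# fusion-in-decoder over the concatenated chunk encodings.
chunks = split_into_overlapping_chunks(list(range(1000)))
print(len(chunks), [len(c) for c in chunks])
```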
@@ -19,7 +19,19 @@ well for comprehension tasks (e.g. text classification, question answering). Whe
 You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset.
 
 ### How to use
- To use the model, you first have to get a local copy of the SLED model from the [official repository](https://github.com/Mivg/SLED/blob/main/README.md).
+ To use the model, you first need to install `sled-py` in your environment (or clone the code from the [official repository](https://github.com/Mivg/SLED/blob/main/README.md)):
+ ```
+ pip install sled-py
+ ```
+ For more installation instructions, see [here](https://github.com/Mivg/SLED#Installation).
+
+
+ Once installed, SLED is fully compatible with HuggingFace's AutoClasses (AutoTokenizer, AutoConfig, AutoModel
+ and AutoModelForCausalLM) and can be loaded using the from_pretrained methods:
+ ```python
+ import sled  # *** required so that SledModels will be registered for the AutoClasses ***
+ model = AutoModel.from_pretrained('tau/bart-base-sled')
+ ```
 
 Here is how to use this model in PyTorch:
 
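The hunk above mentions using the raw model for text infilling. A minimal sketch of that use, assuming BART's standard `<mask>` convention and that `SledModelForConditionalGeneration` is importable from the `sled` package (the import path is an assumption, not an official example):

```python
import sled  # required so that SLED models are registered for the AutoClasses
from transformers import AutoTokenizer
from sled import SledModelForConditionalGeneration  # assumed import path

tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
model = SledModelForConditionalGeneration.from_pretrained('tau/bart-base-sled')

# BART-style infilling: the model rewrites the input, filling the masked span.
inputs = tokenizer("Dogs are <mask> for you.", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```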
@@ -32,12 +44,25 @@ outputs = model(**inputs)
 last_hidden_states = outputs.last_hidden_state
 ```
 You can also replace SledModel with SledModelForConditionalGeneration for Seq2Seq generation:
-
+ ```python
+ model = SledModelForConditionalGeneration.from_pretrained('tau/bart-base-sled')
+ ```
 In case you wish to apply SLED to a task containing a prefix (e.g. a question) that should be given as context to
 every chunk, you can also pass the `prefix_length` tensor input (a LongTensor of length equal to the batch size):
+ ```python
+ import torch
+ import sled  # *** required so that SledModels will be registered for the AutoClasses ***
+ tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
+ model = AutoModel.from_pretrained('tau/bart-base-sled')
+ document_input_ids = tokenizer("Dogs are great for you.", return_tensors="pt").input_ids
+ prefix_input_ids = tokenizer("Are dogs good for you?", return_tensors="pt").input_ids
+ input_ids = torch.cat((prefix_input_ids, document_input_ids), dim=-1)
+ attention_mask = torch.ones_like(input_ids)
+ prefix_length = torch.LongTensor([[prefix_input_ids.size(1)]])
 
- Sled is fully compatible with the AutoClasses (AutoTokenizer, AutoConfig, AutoModel
- and AutoModelForCausalLM) and can be loaded using the from_pretrained methods
+ outputs = model(input_ids=input_ids, attention_mask=attention_mask, prefix_length=prefix_length)
+ last_hidden_states = outputs.last_hidden_state
+ ```
 
 ### BibTeX entry and citation info
 
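The hunk above adds the `from_pretrained` call for `SledModelForConditionalGeneration` but stops short of generating. A minimal end-to-end sketch, assuming the class follows the standard HuggingFace `generate` API (the import path and the beam/length settings are assumptions):

```python
import sled  # required so that SLED models are registered for the AutoClasses
from transformers import AutoTokenizer
from sled import SledModelForConditionalGeneration  # assumed import path

tokenizer = AutoTokenizer.from_pretrained('tau/bart-base-sled')
model = SledModelForConditionalGeneration.from_pretrained('tau/bart-base-sled')

input_ids = tokenizer("Dogs are great for you.", return_tensors="pt").input_ids
# Standard HuggingFace generation; the beam and length settings are illustrative.
summary_ids = model.generate(input_ids, num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```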
 