Update README.md
README.md
CHANGED
@@ -124,7 +124,8 @@ model = AutoModelForSeq2SeqLM.from_pretrained('stanfordnlp/mrt5-small', trust_re
 input_ids = torch.tensor([list("Life is like a box of chocolates.".encode("utf-8"))]) + 3 # add 3 for special tokens
 labels = torch.tensor([list("La vie est comme une boîte de chocolat.".encode("utf-8"))]) + 3 # add 3 for special tokens
 
-
+# Forward pass with hard deletion
+loss = model(input_ids, labels=labels, hard_delete=True).loss
 ```
 
 For batched inference and training, you can use ByT5's tokenizer class:
@@ -138,7 +139,8 @@ tokenizer = AutoTokenizer.from_pretrained('google/byt5-small')
 model_inputs = tokenizer(["Life is like a box of chocolates.", "Today is Monday."], padding="longest", return_tensors="pt")
 labels = tokenizer(["La vie est comme une boîte de chocolat.", "Aujourd'hui c'est lundi."], padding="longest", return_tensors="pt").input_ids
 
-
+# Forward pass with hard deletion
+loss = model(**model_inputs, labels=labels, hard_delete=True).loss
 ```
 
 ## Training Details
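For reference, a minimal end-to-end sketch of the usage snippet as it reads after this change, stitching the two hunks together. The import lines are added for self-containment, and the kwarg truncated to `trust_re` in the first hunk header is assumed to be `trust_remote_code=True`; everything else is taken verbatim from the hunks above.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumption: the hunk header truncates the kwarg as `trust_re`; trust_remote_code=True is assumed here.
model = AutoModelForSeq2SeqLM.from_pretrained('stanfordnlp/mrt5-small', trust_remote_code=True)

# Single example: raw UTF-8 bytes, offset by 3 for the special tokens
input_ids = torch.tensor([list("Life is like a box of chocolates.".encode("utf-8"))]) + 3
labels = torch.tensor([list("La vie est comme une boîte de chocolat.".encode("utf-8"))]) + 3

# Forward pass with hard deletion (the flag added in this diff)
loss = model(input_ids, labels=labels, hard_delete=True).loss

# Batched: ByT5's byte-level tokenizer handles the byte conversion and padding
tokenizer = AutoTokenizer.from_pretrained('google/byt5-small')
model_inputs = tokenizer(["Life is like a box of chocolates.", "Today is Monday."],
                         padding="longest", return_tensors="pt")
batch_labels = tokenizer(["La vie est comme une boîte de chocolat.", "Aujourd'hui c'est lundi."],
                         padding="longest", return_tensors="pt").input_ids
batch_loss = model(**model_inputs, labels=batch_labels, hard_delete=True).loss
```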