Update README.md
README.md
@@ -19,9 +19,10 @@ In total, our dataset contains around 5.4 million Indian legal documents (all in

The raw text corpus size is around 27 GB.

### Training Objective
This model is initialized with the [LEGAL-BERT-SC model](https://huggingface.co/nlpaueb/legal-bert-base-uncased) from the paper [LEGAL-BERT: The Muppets straight out of Law School](https://aclanthology.org/2020.findings-emnlp.261/).
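As a rough, hypothetical sketch of what that initialization looks like in 🤗 Transformers (the use of `BertForPreTraining`, i.e. the MLM + NSP heads, is an assumption suggested by the import in the Usage snippet below, not something stated in this excerpt):

```python
from transformers import BertForPreTraining

# Assumed starting point: the public LEGAL-BERT-SC weights.
# BertForPreTraining attaches both the masked-LM and next-sentence-prediction heads,
# so further pre-training would continue from these weights.
model = BertForPreTraining.from_pretrained("nlpaueb/legal-bert-base-uncased")
```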
### Usage
Using the tokenizer (same as LegalBERT):

```python
from transformers import AutoTokenizer, AutoModel, BertForPreTraining
tokenizer = AutoTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")
```
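A hedged continuation of the snippet above, showing how the tokenizer's output can be fed to a model. The example sentence is illustrative, and the base LEGAL-BERT-SC checkpoint is used only because this excerpt does not show this model's own Hub id; substitute that id to load this model instead:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")
# Placeholder checkpoint: swap in this model's own repository id on the Hub.
model = AutoModel.from_pretrained("nlpaueb/legal-bert-base-uncased")

# Illustrative input sentence; encode it and run a forward pass.
text = "The appellant filed an appeal before the Supreme Court of India."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, sequence_length, 768])
```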