law-ai committed
Commit
f0171f5
1 Parent(s): c8d6ef6

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -19,9 +19,10 @@ In total, our dataset contains around 5.4 million Indian legal documents (all in
  The raw text corpus size is around 27 GB.
  
  ### Training Objective
- This model is initialized with the [LEGAL-BERT-SC model](https://huggingface.co/nlpaueb/legal-bert-base-uncased) from the paper [LEGAL-BERT: The Muppets straight out of Law School](https://aclanthology.org/2020.findings-emnlp.261/), and trained for an additional 300K steps on our data on the MLM and NSP objective.
+ This model is initialized with the [LEGAL-BERT-SC model](https://huggingface.co/nlpaueb/legal-bert-base-uncased) from the paper [LEGAL-BERT: The Muppets straight out of Law School](https://aclanthology.org/2020.findings-emnlp.261/), an…
  
  ### Usage
+ Using the tokenizer (same as LegalBERT)
  ```python
  from transformers import AutoTokenizer, AutoModel, BertForPreTraining
  tokenizer = AutoTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")