metadata

license: mit

Vietnamese Legal Text BERT

Introduction
Using Vietnamese Legal Text BERT

Using Vietnamese Legal Text BERT `hmthanh/VietnamLegalText-SBERT`

Pre-trained PhoBERT models are state-of-the-art language models for Vietnamese ("Pho", i.e. "Phở", is a popular food in Vietnam).

Using Vietnamese Legal Text BERT `transformers`

Installation

Install transformers with pip: `pip install transformers`
Install tokenizers with pip: `pip install tokenizers`

Pre-trained models

| Model | #params | Arch. | Max length | Pre-training data |
|---|---|---|---|---|
| `hmthanh/VietnamLegalText-SBERT` | 135M | base | 256 | 20GB of texts |
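
The max length of 256 in the table means longer inputs have to be shortened before they reach the model. Below is a minimal sketch in plain Python. It assumes the ids come from `tokenizer.encode`, that the limit is counted in token ids, and that the first and last ids are the special start and end tokens. The helper name `clip_to_max_length` and the dummy ids are hypothetical, used only for illustration.

```python
MAX_LENGTH = 256  # "Max length" value from the table above


def clip_to_max_length(ids, max_length=MAX_LENGTH):
    """Return ids unchanged if short enough, else keep the head and the final id."""
    if max_length < 2:
        raise ValueError("max_length must leave room for the start and end ids")
    if len(ids) <= max_length:
        return list(ids)
    # Keep the first max_length - 1 ids, then re-append the closing special id.
    return list(ids[: max_length - 1]) + [ids[-1]]


# Dummy ids standing in for tokenizer.encode(...) output (assumed: 0 = start, 2 = end).
long_ids = [0] + list(range(10, 310)) + [2]
clipped = clip_to_max_length(long_ids)
print(len(clipped), clipped[0], clipped[-1])
```

In practice the same effect is usually obtained by passing `truncation=True` and `max_length=256` to the tokenizer call; the helper only makes the rule explicit.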

Example usage

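import torch
from transformers import AutoModel, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hugging Face Hub
phobert = AutoModel.from_pretrained("hmthanh/VietnamLegalText-SBERT")
tokenizer = AutoTokenizer.from_pretrained("hmthanh/VietnamLegalText-SBERT")

# Input text must already be word-segmented (syllables of one word joined by "_")
sentence = 'Chúng_tôi là những nghiên_cứu_viên .'

# Encode to token ids and wrap in a list to add the batch dimension
input_ids = torch.tensor([tokenizer.encode(sentence)])

with torch.no_grad():  # inference only, no gradients needed
    features = phobert(input_ids)  # features[0] holds the last hidden states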
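
The example above yields one vector per token. To get a single vector per sentence, a common choice for SBERT-style models is mean pooling over the token vectors. This model card does not state which pooling the model expects, so treat the sketch below as an assumption rather than the author's method. The function name `mean_pool` and the dummy tensors are hypothetical and stand in for `features[0]` and an attention mask.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sentence, ignoring padded positions."""
    # (batch, seq) -> (batch, seq, 1) so the mask broadcasts over the hidden size
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)  # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)        # avoid division by zero
    return summed / counts


# Dummy data: one sentence, three positions, hidden size 2; the last position is padding.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
sentence_vector = mean_pool(hidden, mask)
print(sentence_vector)
```

With a real model, `hidden` would be `features[0]`. For a single unpadded sentence the mask is all ones, for example `torch.ones_like(input_ids)`.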
