|
--- |
|
license: afl-3.0 |
|
datasets: |
|
- WillHeld/hinglish_top |
|
language: |
|
- en |
|
metrics: |
|
- accuracy |
|
library_name: transformers |
|
pipeline_tag: fill-mask |
|
--- |
|
|
|
### SRDberta |
|
|
|
This is a BERT model trained for masked language modeling on English data from the Hinglish-TOP dataset.
|
|
|
### Dataset |
|
The [Hinglish-TOP dataset](https://huggingface.co/datasets/WillHeld/hinglish_top) contains the following columns (a loading sketch follows the list):
|
- en_query |
|
- cs_query |
|
- en_parse |
|
- cs_parse |
|
- domain |
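
A minimal sketch of loading and inspecting the dataset with the 🤗 `datasets` library; the split name is not stated in this card, so the first split reported by `load_dataset` is used here.

```python
from datasets import load_dataset

# Load the Hinglish-TOP dataset from the Hub
ds = load_dataset("WillHeld/hinglish_top")
print(ds)  # shows the available splits and their row counts

# Peek at one example from the first available split
first_split = list(ds.keys())[0]
example = ds[first_split][0]
print(example["en_query"])  # English query
print(example["cs_query"])  # code-switched (Hinglish) query
```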
|
|
|
### Training |
|
|Epoch|Loss|
|:--:|:--:|
|1|0.0485|
|2|0.00837|
|3|0.00812|
|4|0.0029|
|5|0.014|
|6|0.00748|
|7|0.0041|
|8|0.00543|
|9|0.00304|
|10|0.000574|
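
The training script itself is not part of this card; the sketch below shows one way to reproduce a 10-epoch masked-language-modeling run with 🤗 Transformers. The base checkpoint, split name, text column, sequence length, and batch size are assumptions, not the author's recorded settings.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "bert-base-multilingual-cased"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Tokenize the English queries (column and split choice are assumptions)
ds = load_dataset("WillHeld/hinglish_top")
def tokenize(batch):
    return tokenizer(batch["en_query"], truncation=True, max_length=64)
tokenized = ds.map(tokenize, batched=True, remove_columns=ds["train"].column_names)

# Randomly mask 15% of tokens for the MLM objective
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="srdberta-mlm",
    num_train_epochs=10,
    per_device_train_batch_size=32,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```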
|
|
|
### Inference |
|
```python |
|
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Load the tokenizer and model from the Hub
tokenizer = AutoTokenizer.from_pretrained("SRDdev/SRDBerta")
model = AutoModelForMaskedLM.from_pretrained("SRDdev/SRDBerta")

# Build a fill-mask pipeline from the objects loaded above
fill = pipeline('fill-mask', model=model, tokenizer=tokenizer)
|
``` |
|
```python |
|
# Insert the tokenizer's mask token into a Hinglish prompt and inspect the predictions
mask = fill.tokenizer.mask_token
print(fill(f'Aap {mask} ho?'))
|
``` |
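
The pipeline returns a list of candidate fills, each a dict with `sequence`, `token`, `token_str`, and `score` fields, sorted by score.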
|
|
|
### Citation |
|
Author: [@SRDdev](https://huggingface.co/SRDdev)
|
``` |
|
Name     : Shreyas Dixit
Framework: PyTorch
Year     : Jan 2023
Pipeline : fill-mask
GitHub   : https://github.com/SRDdev
LinkedIn : https://www.linkedin.com/in/srddev/
|
``` |