YAML Metadata Error: "datasets[0]" with value "MIT Movie" is not valid. If possible, use a dataset id from https://hf.co/datasets.
YAML Metadata Error: "language[0]" must only contain lowercase characters
YAML Metadata Error: "language[0]" with value "English" is not valid. It must be an ISO 639-1, 639-2 or 639-3 code (two/three letters), or a special value like "code", "multilingual". If you want to use BCP-47 identifiers, you can specify them in language_bcp47.

roberta-base + Movies NER Task

Objective: This is Roberta Base trained for the NER task using MIT Movie Dataset

model_name = "thatdramebaazguy/roberta-base-MITmovie"
pipeline(model=model_name, tokenizer=model_name, revision="v1.0", task="ner")

Overview

Language model: roberta-base
Language: English
Downstream-task: NER
Training data: MIT Movie
Eval data: MIT Movie
Infrastructure: 2x Tesla v100
Code: See example

Hyperparameters

Num examples = 6253  
Num Epochs = 5 
Instantaneous batch size per device = 64
Total train batch size (w. parallel, distributed & accumulation) = 128  

Performance

Eval on MIT Movie

  • epoch = 5.0
  • eval_accuracy = 0.9476
  • eval_f1 = 0.8853
  • eval_loss = 0.2208
  • eval_mem_cpu_alloc_delta = 17MB
  • eval_mem_cpu_peaked_delta = 2MB
  • eval_mem_gpu_alloc_delta = 0MB
  • eval_mem_gpu_peaked_delta = 38MB
  • eval_precision = 0.8833
  • eval_recall = 0.8874
  • eval_runtime = 0:00:03.62
  • eval_samples = 1955

Github Repo:


Downloads last month
14
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.