Model Details

Domain-specific BERT model for text mining in the energy and materials fields

Model Description

  • Developed by: Tong Xie, Yuwei Wan, Juntao Fang, Prof. Bram Hoex
  • Supported by: University of New South Wales, National Computational Infrastructure Australia
  • Model type: Transformer
  • Language(s) (NLP): EN
  • License: MIT

Model Sources

  • Repository: Github
  • Paper: [Under Preparation]

Uses

Direct Use

Text Mining in Energy & Material Fields

Downstream Use

EnergyBERT is not limited to text classification. It can be fine-tuned for other downstream NLP tasks in the energy and materials domain.

How to Get Started with the Model

Use the code below to get started with the model.

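from transformers import pipeline
# 'EnergyBERT' is the model name as given in this card; the full Hub repository ID may differ.
unmasker = pipeline('fill-mask', model='EnergyBERT')
# Use the tokenizer's own mask token (BERT tokenizers typically use [MASK], not <mask>).
mask = unmasker.tokenizer.mask_token
unmasker(f"Hello I'm a {mask} model.")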

Training Details

Training Data

A corpus of 1.2M full-text published papers, covering 2000 to 2021.

Training Procedure

BERT is trained on two unsupervised tasks during its pre-training period: masked language modeling and next sentence prediction. A masked language model involves masking some of the input tokens at random and training the model to predict the masked tokens based on the context surrounding the input tokens. Next sentence prediction involves training the model to predict whether two sentences follow each other logically.
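The two pre-training objectives described above can be illustrated with a minimal pure-Python sketch. It is not the EnergyBERT training code. The 15% masking rate and the 80/10/10 replacement split are assumptions taken from the original BERT recipe, not values stated in this card, and the helper names `mask_tokens` and `make_nsp_pairs` are hypothetical. Negative next-sentence examples are drawn from the same sentence list for simplicity, whereas the original BERT recipe samples them from other documents.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed BERT-style mask token


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Masked language modeling: pick tokens at random and hide them.

    Returns (masked_tokens, labels). labels[i] holds the original token
    at positions the model must predict, and None everywhere else.
    """
    if rng is None:
        rng = random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model is trained to recover this token
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK_TOKEN)         # 80%: replace with [MASK]
            elif roll < 0.9:
                masked.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                masked.append(tok)                # 10%: keep the token unchanged
        else:
            labels.append(None)
            masked.append(tok)
    return masked, labels


def make_nsp_pairs(sentences, rng=None):
    """Next sentence prediction: build (sentence_a, sentence_b, is_next) examples.

    About half of the pairs use the true next sentence (is_next=True);
    the rest use a different sentence from the list (is_next=False).
    """
    if rng is None:
        rng = random.Random()
    pairs = []
    for i in range(len(sentences) - 1):
        # Candidates for a negative example: anything except the current
        # sentence and its true successor.
        negatives = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
        if rng.random() < 0.5 or not negatives:
            pairs.append((sentences[i], sentences[i + 1], True))
        else:
            pairs.append((sentences[i], rng.choice(negatives), False))
    return pairs


if __name__ == "__main__":
    tokens = "perovskite solar cells show high power conversion efficiency".split()
    print(mask_tokens(tokens, vocab=tokens, rng=random.Random(0)))
    sents = ["Silicon dominates the market.", "Perovskites are emerging.", "Stability remains a challenge."]
    print(make_nsp_pairs(sents, rng=random.Random(0)))
```

In actual pre-training these operations run on subword token IDs rather than whitespace-split words, and the labels feed a cross-entropy loss over the masked positions only.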

Training Hyperparameters

  • Training regime: