---
license: apache-2.0
tags:
  - generated_from_trainer
  - finance
  - intent-classification
datasets:
  - banking77
model-index:
  - name: banking-intent-distilbert-classifier
    results: []
language:
  - en
metrics:
  - accuracy
pipeline_tag: text-classification
---

# banking-intent-distilbert-classifier

This model is a fine-tuned version of distilbert-base-uncased on the banking77 dataset. It achieves the following results on the evaluation set:

- eval_loss: 0.2885
- eval_accuracy: 0.9244
- eval_runtime: 1.9357
- eval_samples_per_second: 1591.148
- eval_steps_per_second: 99.705
- epoch: 10.0
- step: 3130

Note: This is just a simple example of fine-tuning a DistilBERT model for a multi-class classification task, mainly to see how much it costs to train such a model on Google Cloud (using a T4 GPU). Training cost me about 1.07 SGD and took less than 20 minutes to complete. Although my intention was only to try it out on Google Cloud, the model has been properly trained and is ready to use. Hopefully, it is what you're looking for.

## Inference example

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TextClassificationPipeline,
)

tokenizer = AutoTokenizer.from_pretrained("lxyuan/banking-intent-distilbert-classifier")
model = AutoModelForSequenceClassification.from_pretrained("lxyuan/banking-intent-distilbert-classifier")

banking_intent_classifier = TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    device=0,  # first GPU; use device=-1 to run on CPU
)

banking_intent_classifier("How to report lost card?")
>>> [{'label': 'lost_or_stolen_card', 'score': 0.9518502950668335}]
```
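
Alternatively, the same checkpoint can be loaded through the high-level `pipeline` helper. The snippet below is a minimal sketch using the standard text-classification pipeline; the `top_k` call argument is available in recent transformers releases.

```python
from transformers import pipeline

# pipeline() resolves the tokenizer and model from the Hub ID in one call.
classifier = pipeline(
    "text-classification",
    model="lxyuan/banking-intent-distilbert-classifier",
)

print(classifier("How to report lost card?"))
# top_k returns the highest-scoring intents instead of only the best one.
print(classifier("How to report lost card?", top_k=3))
```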

## Training and evaluation data

The BANKING77 dataset consists of online banking queries labeled with their corresponding intents, offering a comprehensive collection of 77 finely categorized intents within the banking domain. With a total of 13,083 customer service queries, it specifically emphasizes precise intent detection within a single domain.
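For reference, the dataset can be inspected with the `datasets` library; the snippet below is a minimal sketch assuming the `banking77` dataset ID listed in the metadata above.

```python
from datasets import load_dataset

# Pull the BANKING77 splits from the Hugging Face Hub (13,083 queries in total).
dataset = load_dataset("banking77")

# Each example is a customer query paired with one of 77 intent labels.
print(dataset)
print(dataset["train"].features["label"].names[:5])
print(dataset["train"][0])
```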

## Training procedure

To reproduce the result, please refer to this notebook.

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
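
The training script itself is not included in this card. As a rough guide, the sketch below maps the listed hyperparameters onto a `TrainingArguments` object (transformers 4.29 API); the `output_dir` and `evaluation_strategy` values are assumptions, not taken from the original run.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="banking-intent-distilbert",  # placeholder, not the original path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,   # effective train batch size of 32
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # mixed precision ("Native AMP")
    evaluation_strategy="epoch",     # assumption: evaluation schedule not stated in the card
)
```

The listed optimizer settings (Adam with betas=(0.9,0.999) and epsilon=1e-08) match the transformers defaults, so no extra arguments are needed for them.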

### Framework versions

- Transformers 4.29.2
- Pytorch 1.9.0+cu111
- Datasets 2.12.0
- Tokenizers 0.13.3