pooya-mohammadi committed
Commit: 42b22ab
Parent(s): 9489708

Hezar: Upload training files
Files changed:
- model.pt +3 -0
- model_config.yaml +23 -0
- preprocessor/tokenizer.json +0 -0
- preprocessor/tokenizer_config.yaml +29 -0
- train/dataset_config.yaml +7 -0
- train/train_config.yaml +16 -0
model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ace0f4102cd2c0b04cd376b0f4506d83d9822ce08669e074917854bbb580a46
+size 473270933
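The three lines above are a Git LFS pointer, not the checkpoint itself: `oid` is the SHA-256 of the real ~473 MB `model.pt` that LFS stores out of band. A minimal verification sketch in plain Python (no assumptions beyond the pointer contents; the local path is a placeholder) for checking a downloaded checkpoint against the pointer:

```python
import hashlib

# Values copied from the LFS pointer above; WEIGHTS_PATH is a hypothetical local path.
POINTER_OID = "5ace0f4102cd2c0b04cd376b0f4506d83d9822ce08669e074917854bbb580a46"
WEIGHTS_PATH = "model.pt"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so a ~473 MB checkpoint fits in constant memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

assert sha256_of(WEIGHTS_PATH) == POINTER_OID, "model.pt does not match the LFS pointer oid"
```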
model_config.yaml
ADDED
@@ -0,0 +1,23 @@
+name: bert_text_classification
+config_type: model
+task: TEXT_CLASSIFICATION
+num_labels: 3
+id2label:
+  0: negative
+  1: positive
+  2: neutral
+vocab_size: 42000
+hidden_size: 768
+num_hidden_layers: 12
+num_attention_heads: 12
+intermediate_size: 3072
+hidden_act: gelu
+hidden_dropout_prob: 0.1
+attention_probs_dropout_prob: 0.1
+max_position_embeddings: 512
+type_vocab_size: 2
+initializer_range: 0.02
+layer_norm_eps: 1.0e-12
+pad_token_id: 0
+position_embedding_type: absolute
+use_cache: true
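This config describes a standard 12-layer, 768-hidden BERT with a 3-class sentiment head (negative/positive/neutral). A minimal usage sketch with Hezar's high-level `Model` API; the Hub repo id below is a hypothetical stand-in for whichever repo this commit belongs to:

```python
from hezar.models import Model  # pip install hezar

# Hypothetical repo id -- replace with the actual Hub path of this repo.
model = Model.load("hezarai/bert-fa-sentiment-dksf")

# predict() runs preprocessing + the model and maps logits through id2label above.
outputs = model.predict(["کیفیت محصول عالی بود"])  # "The product quality was great"
print(outputs)  # expected label: one of negative / positive / neutral
```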
preprocessor/tokenizer.json
ADDED
The diff for this file is too large to render.
See raw diff
preprocessor/tokenizer_config.yaml
ADDED
@@ -0,0 +1,29 @@
+name: wordpiece_tokenizer
+config_type: preprocessor
+pretrained_path: hezarai/bert-base-fa
+max_length: 512
+truncation_strategy: longest_first
+truncation_direction: right
+stride: 0
+padding_strategy: longest
+padding_direction: right
+pad_to_multiple_of: 0
+pad_token_id: 0
+pad_token: '[PAD]'
+pad_token_type_id: 0
+unk_token: '[UNK]'
+special_tokens:
+- '[UNK]'
+- '[SEP]'
+- '[CLS]'
+- '[PAD]'
+- '[MASK]'
+wordpieces_prefix: '##'
+train_config:
+  name: wordpiece_tokenizer
+  config_type: preprocessor
+  vocab_size: 30000
+  min_frequency: 2
+  limit_alphabet: 1000
+  initial_alphabet: []
+  show_progress: true
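The top-level keys configure inference-time tokenization (WordPiece, max length 512, longest-first truncation, right padding), while the nested `train_config` block records how the vocabulary was originally trained. A sketch of loading the tokenizer standalone, assuming Hezar's `Preprocessor` API and the same hypothetical repo id as above:

```python
from hezar.preprocessors import Preprocessor

# Reads preprocessor/tokenizer_config.yaml + tokenizer.json from the repo (hypothetical id).
tokenizer = Preprocessor.load("hezarai/bert-fa-sentiment-dksf")

encoded = tokenizer(["یک جمله‌ی نمونه"])  # "a sample sentence"
print(encoded.keys())  # token ids, attention mask, etc., per the config above
```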
train/dataset_config.yaml
ADDED
@@ -0,0 +1,7 @@
+name: text_classification
+config_type: dataset
+task: text_classification
+path: hezarai/sentiment_digikala_snappfood
+tokenizer_path: hezarai/bert-base-fa
+label_field: label
+text_field: text
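This config binds the trainer to the `hezarai/sentiment_digikala_snappfood` dataset on the Hub, tokenized with `hezarai/bert-base-fa`. A loading sketch, assuming Hezar's `Dataset.load` accepts the fields mirrored from this YAML:

```python
from hezar.data import Dataset

# Assumed API: keyword arguments mirror train/dataset_config.yaml.
train_data = Dataset.load(
    "hezarai/sentiment_digikala_snappfood",
    split="train",
    tokenizer_path="hezarai/bert-base-fa",
)
print(len(train_data))  # number of examples
print(train_data[0])    # a tokenized (text, label) pair
```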
train/train_config.yaml
ADDED
@@ -0,0 +1,16 @@
+name: text_classification
+config_type: train
+device: cuda
+init_weights_from: hezarai/bert-base-fa
+num_dataloader_workers: 0
+seed: 42
+optimizer:
+  lr: 2.0e-05
+batch_size: 8
+use_amp: false
+metrics:
+  f1:
+    task: multiclass
+num_epochs: 5
+save_freq: 1
+checkpoints_dir: checkpoints/
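The run itself: 5 epochs at batch size 8 with lr 2e-05, no AMP, a fixed seed, multiclass F1 as the metric, checkpoints saved every epoch under `checkpoints/`, and weights initialized from `hezarai/bert-base-fa`. A reproduction sketch assuming Hezar's `Trainer`/`TrainerConfig` API; the argument names mirror the YAML above, but the exact constructor signature may differ across Hezar versions:

```python
from hezar.data import Dataset
from hezar.models import BertTextClassification, BertTextClassificationConfig
from hezar.preprocessors import Preprocessor
from hezar.trainer import Trainer, TrainerConfig  # assumed import path

base = "hezarai/bert-base-fa"
train_dataset = Dataset.load("hezarai/sentiment_digikala_snappfood", split="train", tokenizer_path=base)
eval_dataset = Dataset.load("hezarai/sentiment_digikala_snappfood", split="test", tokenizer_path=base)

# Fresh classification head; backbone weights come from init_weights_from below.
model = BertTextClassification(BertTextClassificationConfig(id2label=train_dataset.config.id2label))

config = TrainerConfig(  # field names taken from train_config.yaml; signature is assumed
    task="text_classification",
    device="cuda",
    init_weights_from=base,
    seed=42,
    batch_size=8,
    num_epochs=5,
    save_freq=1,
    checkpoints_dir="checkpoints/",
    metrics=["f1"],
)

trainer = Trainer(
    config=config,
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=train_dataset.data_collator,
    preprocessor=Preprocessor.load(base),
)
trainer.train()
```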