Svetlana0303/Regression_xlnet_NOaug_CustomLoss

+---
+license: mit
+tags:
+- generated_from_keras_callback
+model-index:
+- name: Regression_xlnet_NOaug_CustomLoss
+  results: []
+---
+<!-- This model card has been generated automatically according to the information Keras had access to. You should
+probably proofread and complete it, then remove this comment. -->
+# Regression_xlnet_NOaug_CustomLoss
+This model is a fine-tuned version of [xlnet-base-cased](https://huggingface.co/xlnet-base-cased) on an unknown dataset.
+It achieves the following results on the training and validation sets at the last recorded epoch:
+- Train Loss: 0.1862
+- Train MAE: 0.5631
+- Train MSE: 0.4095
+- Train R2-score: 0.8268
+- Validation Loss: 0.1355
+- Validation MAE: 0.5683
+- Validation MSE: 0.3643
+- Validation R2-score: 0.8811
+- Epoch: 14
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 1e-04, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
+- training_precision: float32
+### Training results
+| Train Loss | Train MAE | Train MSE | Train R2-score | Validation Loss | Validation MAE | Validation MSE | Validation R2-score | Epoch |
+|:----------:|:---------:|:---------:|:--------------:|:---------------:|:--------------:|:--------------:|:-------------------:|:-----:|
+| 0.1966     | 0.5177    | 0.3647    | 0.3590         | 0.1412          | 0.6460         | 0.4895         | 0.8850              | 0     |
+| 0.1804     | 0.5606    | 0.4181    | 0.8105         | 0.1540          | 0.6614         | 0.5259         | 0.8820              | 1     |
+| 0.2037     | 0.5676    | 0.4319    | 0.6885         | 0.1399          | 0.6439         | 0.4849         | 0.8849              | 2     |
+| 0.1833     | 0.5499    | 0.3954    | 0.8256         | 0.1804          | 0.6845         | 0.5879         | 0.8760              | 3     |
+| 0.1627     | 0.5412    | 0.3866    | 0.8022         | 0.1661          | 0.6729         | 0.5558         | 0.8793              | 4     |
+| 0.1822     | 0.5677    | 0.4178    | 0.7449         | 0.1327          | 0.6311         | 0.4580         | 0.8861              | 5     |
+| 0.2117     | 0.5798    | 0.4520    | 0.5186         | 0.1282          | 0.6187         | 0.4345         | 0.8866              | 6     |
+| 0.1843     | 0.5544    | 0.3998    | 0.5283         | 0.1272          | 0.6142         | 0.4265         | 0.8866              | 7     |
+| 0.2074     | 0.5906    | 0.4639    | 0.6729         | 0.1269          | 0.6127         | 0.4239         | 0.8865              | 8     |
+| 0.1756     | 0.5666    | 0.4032    | 0.8054         | 0.1272          | 0.5909         | 0.3908         | 0.8850              | 9     |
+| 0.1706     | 0.5452    | 0.3948    | 0.7999         | 0.1282          | 0.5862         | 0.3845         | 0.8844              | 10    |
+| 0.1727     | 0.5499    | 0.3928    | 0.8471         | 0.1453          | 0.6513         | 0.5021         | 0.8840              | 11    |
+| 0.1688     | 0.5467    | 0.3884    | 0.3339         | 0.1777          | 0.6823         | 0.5817         | 0.8766              | 12    |
+| 0.1625     | 0.5476    | 0.3918    | 0.5804         | 0.1483          | 0.6541         | 0.5098         | 0.8833              | 13    |
+| 0.1862     | 0.5631    | 0.4095    | 0.8268         | 0.1355          | 0.5683         | 0.3643         | 0.8811              | 14    |
+### Framework versions
+- Transformers 4.28.1
+- TensorFlow 2.12.0
+- Datasets 2.12.0
+- Tokenizers 0.13.3
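
The training results table in the model card reports the final epoch in its summary, but the best epoch depends on which column is used for selection. The following is a minimal sketch that finds the epoch with the lowest value per validation column. The numbers are copied from the table; the names `VALIDATION_HISTORY` and `best_epoch` are illustrative and not part of the repository.

```python
# (epoch, validation loss, validation MAE, validation MSE), copied from the
# "Training results" table in the model card.
VALIDATION_HISTORY = [
    (0, 0.1412, 0.6460, 0.4895),
    (1, 0.1540, 0.6614, 0.5259),
    (2, 0.1399, 0.6439, 0.4849),
    (3, 0.1804, 0.6845, 0.5879),
    (4, 0.1661, 0.6729, 0.5558),
    (5, 0.1327, 0.6311, 0.4580),
    (6, 0.1282, 0.6187, 0.4345),
    (7, 0.1272, 0.6142, 0.4265),
    (8, 0.1269, 0.6127, 0.4239),
    (9, 0.1272, 0.5909, 0.3908),
    (10, 0.1282, 0.5862, 0.3845),
    (11, 0.1453, 0.6513, 0.5021),
    (12, 0.1777, 0.6823, 0.5817),
    (13, 0.1483, 0.6541, 0.5098),
    (14, 0.1355, 0.5683, 0.3643),
]


def best_epoch(history, column):
    """Return the row with the smallest value in the given column index."""
    return min(history, key=lambda row: row[column])


by_loss = best_epoch(VALIDATION_HISTORY, 1)  # custom training loss
by_mae = best_epoch(VALIDATION_HISTORY, 2)   # mean absolute error
by_mse = best_epoch(VALIDATION_HISTORY, 3)   # mean squared error

print("lowest validation loss:", by_loss)
print("lowest validation MAE: ", by_mae)
print("lowest validation MSE: ", by_mse)
```

On these numbers the custom validation loss is lowest at epoch 8, while validation MAE and MSE are both lowest at epoch 14. Which checkpoint counts as best therefore depends on the metric that matters for the downstream use.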

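The optimizer entry in the model card lists Adam with `learning_rate` 1e-04, `beta_1` 0.9, `beta_2` 0.999 and `epsilon` 1e-07. To show what those four values control, here is a sketch of one textbook Adam update on a single scalar weight. It is an assumption that this matches the Keras implementation term for term (Keras may fold the bias correction into the step size and place epsilon slightly differently); the function name `adam_step` is illustrative.

```python
import math

# Hyperparameters copied from the optimizer entry in the model card.
LEARNING_RATE = 1e-4
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-7


def adam_step(weight, grad, m, v, step):
    """Apply one textbook Adam update to a scalar weight.

    `step` counts from 1. Returns the updated (weight, m, v).
    """
    m = BETA_1 * m + (1.0 - BETA_1) * grad         # running mean of gradients
    v = BETA_2 * v + (1.0 - BETA_2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1.0 - BETA_1 ** step)             # bias-corrected first moment
    v_hat = v / (1.0 - BETA_2 ** step)             # bias-corrected second moment
    weight = weight - LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPSILON)
    return weight, m, v


# Illustrative starting point: weight 1.0, gradient 0.5, zeroed moments.
w, m, v = adam_step(1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(w)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the weight moves by roughly the learning rate (about 1e-4 here) regardless of the gradient's magnitude.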
config.json ADDED

	@@ -0,0 +1,49 @@

+{
+  "_name_or_path": "xlnet-base-cased",
+  "architectures": [
+    "XLNetForSequenceClassification"
+  ],
+  "attn_type": "bi",
+  "bi_data": false,
+  "bos_token_id": 1,
+  "clamp_len": -1,
+  "d_head": 64,
+  "d_inner": 3072,
+  "d_model": 768,
+  "dropout": 0.1,
+  "end_n_top": 5,
+  "eos_token_id": 2,
+  "ff_activation": "gelu",
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_range": 0.02,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-12,
+  "mem_len": null,
+  "model_type": "xlnet",
+  "n_head": 12,
+  "n_layer": 12,
+  "pad_token_id": 5,
+  "problem_type": "regression",
+  "reuse_len": null,
+  "same_length": false,
+  "start_n_top": 5,
+  "summary_activation": "tanh",
+  "summary_last_dropout": 0.1,
+  "summary_type": "last",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 250
+    }
+  },
+  "transformers_version": "4.28.1",
+  "untie_r": true,
+  "use_mems_eval": true,
+  "use_mems_train": false,
+  "vocab_size": 32000
+}
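
A few fields in this config determine that the classification head acts as a scalar regressor: `problem_type` is `"regression"` and the label maps contain a single entry. The sketch below checks those fields on an excerpt of the config using only the standard library. The helper `check_regression_config` and the choice of checks are illustrative assumptions, not something Transformers runs in this form.

```python
import json

# Excerpt of config.json from this commit (only the fields being checked).
CONFIG_EXCERPT = """
{
  "architectures": ["XLNetForSequenceClassification"],
  "d_head": 64,
  "d_model": 768,
  "n_head": 12,
  "id2label": {"0": "LABEL_0"},
  "label2id": {"LABEL_0": 0},
  "problem_type": "regression"
}
"""


def check_regression_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if config.get("problem_type") != "regression":
        problems.append("problem_type is not 'regression'")
    if len(config.get("id2label", {})) != 1:
        problems.append("expected exactly one label for scalar regression")
    # The two label maps should be inverses of each other.
    inverse = {name: int(idx) for idx, name in config.get("id2label", {}).items()}
    if inverse != config.get("label2id", {}):
        problems.append("id2label and label2id do not agree")
    # Attention heads should tile the model dimension (12 * 64 = 768 here).
    if config["n_head"] * config["d_head"] != config["d_model"]:
        problems.append("n_head * d_head does not equal d_model")
    return problems


config = json.loads(CONFIG_EXCERPT)
print(check_regression_config(config))
```

The same function can be pointed at the full file with `json.load(open("config.json"))`, assuming the file has been downloaded locally.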

tf_model.h5 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f09449d8fb3325723c5c63f44d5e5a38d6c689738062ffa1d8d3b3cb5b5f3a1b
+size 469429256
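
The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. As a minimal sketch, the pointer can be parsed and used to verify a downloaded copy of `tf_model.h5`. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative, and the local file path is whatever the reader chooses.

```python
import hashlib

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f09449d8fb3325723c5c63f44d5e5a38d6c689738062ffa1d8d3b3cb5b5f3a1b
size 469429256
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file against the pointer's size and SHA-256 digest."""
    hasher = hashlib.sha256()
    total = 0
    with open(path, "rb") as handle:
        # Read in chunks so a large weights file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
size_mib = pointer["size"] / 2**20
print(f"{pointer['algorithm']} digest, {size_mib:.2f} MiB")
```

The recorded size is about 447.68 MiB. With `training_precision: float32` (4 bytes per value), that bounds the checkpoint at roughly 117 million stored values, before accounting for HDF5 overhead.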