prakharz committed on
Commit
afac8eb
·
1 Parent(s): 49525af

Updated Readme

Files changed (1)
  1. README.md +9 -37
README.md CHANGED
@@ -7,20 +7,16 @@ model-index:
7
  results: []
8
  ---
9
 
10
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
11
- should probably proofread and complete it, then remove this comment. -->
12
 
13
- # alltasks_m1-t1
14
 
15
- This model is a fine-tuned version of [yuchenlin/BART0pp](https://huggingface.co/yuchenlin/BART0pp) on an unknown dataset.
16
- It achieves the following results on the evaluation set:
17
- - Loss: 1.8914
18
- - Train Runtime: 12625.9615
19
- - Train Samples Per Second: 57.001
20
- - Train Steps Per Second: 0.792
21
- - Train Loss: 1.6667
22
- - Train Samples: 239899
23
- - Gen Len: 9.9497
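
The throughput figures removed above are mutually consistent, which a few lines of arithmetic can confirm. This is a minimal sketch, assuming that samples per second is computed as training samples times epochs divided by runtime, and that `num_epochs: 3.0` from the hyperparameter list applies to this run. The variable names are illustrative, and the derived step count and effective batch size are inferences, not values stated in the model card.

```python
# Values copied from the removed metrics list and the hyperparameter list.
train_runtime_s = 12625.9615
train_samples = 239899
num_epochs = 3.0
reported_samples_per_s = 57.001
reported_steps_per_s = 0.792

# Assumption: throughput counts every sample once per epoch.
samples_seen = train_samples * num_epochs
samples_per_s = samples_seen / train_runtime_s

# Inferred quantities (not stated in the card).
total_steps = reported_steps_per_s * train_runtime_s
effective_batch = reported_samples_per_s / reported_steps_per_s

print(f"recomputed samples/s: {samples_per_s:.3f}")
print(f"inferred total steps: {total_steps:.0f}")
print(f"inferred effective batch size: {effective_batch:.1f}")
```

The inferred step count of roughly 10,000 also agrees with the removed results table, where step 9500 corresponds to epoch 2.85.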
24
 
25
  ## Model description
26
 
@@ -32,7 +28,7 @@ More information needed
32
 
33
  ## Training and evaluation data
34
 
35
- More information needed
36
 
37
  ## Training procedure
38
 
@@ -51,30 +47,6 @@ The following hyperparameters were used during training:
51
  - lr_scheduler_type: linear
52
  - num_epochs: 3.0
53
 
54
- ### Training results
55
-
56
- | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Accuracy | F1 | Recall | Precision | Gen Len |
57
- |:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:--------:|:-------:|:-------:|:---------:|:-------:|
58
- | 1.9907 | 0.15 | 500 | 2.3435 | 50.3191 | 6.4838 | 49.7719 | 50.0456 | 55.5972 | 55.5972 | 55.5972 | 55.5972 | 8.8197 |
59
- | 1.9578 | 0.3 | 1000 | 2.0301 | 54.8237 | 7.033 | 54.3422 | 54.4676 | 61.3115 | 61.3115 | 61.3115 | 61.3115 | 8.0583 |
60
- | 1.8599 | 0.45 | 1500 | 1.9683 | 58.0535 | 6.4621 | 57.5215 | 57.7813 | 66.2295 | 66.2295 | 66.2295 | 66.2295 | 8.1403 |
61
- | 1.861 | 0.6 | 2000 | 1.9899 | 60.2053 | 6.6431 | 59.6317 | 59.8907 | 69.0867 | 69.0867 | 69.0867 | 69.0867 | 8.4773 |
62
- | 1.7464 | 0.75 | 2500 | 1.9600 | 61.3403 | 6.6424 | 60.8196 | 61.0684 | 70.726 | 70.726 | 70.726 | 70.726 | 8.4747 |
63
- | 1.8516 | 0.9 | 3000 | 1.9506 | 59.7834 | 6.4538 | 59.2387 | 59.5396 | 68.8993 | 68.8993 | 68.8993 | 68.8993 | 8.5043 |
64
- | 1.6371 | 1.05 | 3500 | 1.9415 | 60.9397 | 6.6405 | 60.3836 | 60.6176 | 70.1639 | 70.1639 | 70.1639 | 70.1639 | 8.1427 |
65
- | 1.643 | 1.2 | 4000 | 1.9433 | 62.7362 | 6.8939 | 62.1572 | 62.4167 | 72.4122 | 72.4122 | 72.4122 | 72.4122 | 7.9857 |
66
- | 1.6193 | 1.35 | 4500 | 1.9296 | 61.3662 | 6.7287 | 60.8375 | 61.1083 | 70.8197 | 70.8197 | 70.8197 | 70.8197 | 8.4563 |
67
- | 1.6593 | 1.5 | 5000 | 1.9060 | 63.089 | 6.7619 | 62.5142 | 62.8447 | 73.1616 | 73.1616 | 73.1616 | 73.1616 | 8.42 |
68
- | 1.6716 | 1.65 | 5500 | 1.9133 | 63.2106 | 6.7486 | 62.5549 | 62.9047 | 73.2553 | 73.2553 | 73.2553 | 73.2553 | 8.362 |
69
- | 1.5638 | 1.8 | 6000 | 1.8967 | 63.5146 | 6.9202 | 62.9517 | 63.1969 | 73.4895 | 73.4895 | 73.4895 | 73.4895 | 8.28 |
70
- | 1.5614 | 1.95 | 6500 | 1.8835 | 63.3545 | 6.9092 | 62.7955 | 63.0354 | 73.2084 | 73.2084 | 73.2084 | 73.2084 | 8.2333 |
71
- | 1.4675 | 2.1 | 7000 | 1.9220 | 63.465 | 6.7168 | 62.9135 | 63.2247 | 73.63 | 73.63 | 73.63 | 73.63 | 8.1323 |
72
- | 1.4402 | 2.25 | 7500 | 1.9425 | 64.0073 | 7.0859 | 63.4022 | 63.7246 | 73.8642 | 73.8642 | 73.8642 | 73.8642 | 8.1393 |
73
- | 1.4655 | 2.4 | 8000 | 1.9142 | 64.366 | 6.8629 | 63.7608 | 64.0938 | 74.5667 | 74.5667 | 74.5667 | 74.5667 | 8.1717 |
74
- | 1.4741 | 2.55 | 8500 | 1.9238 | 64.022 | 6.8364 | 63.4035 | 63.7259 | 74.192 | 74.192 | 74.192 | 74.192 | 8.1777 |
75
- | 1.4335 | 2.7 | 9000 | 1.9001 | 64.8286 | 6.9507 | 64.159 | 64.5065 | 75.0351 | 75.0351 | 75.0351 | 75.0351 | 8.1387 |
76
- | 1.5305 | 2.85 | 9500 | 1.8914 | 64.895 | 6.9613 | 64.2636 | 64.5959 | 75.1288 | 75.1288 | 75.1288 | 75.1288 | 8.2063 |
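
In every row of the removed table, the Accuracy, F1, Recall, and Precision columns hold the same value. One plausible explanation, not confirmed by the source, is that the scores were micro-averaged over single-label predictions, in which case all four metrics coincide by construction. The sketch below illustrates that identity on made-up labels; `micro_scores` is a hypothetical helper, not code from the training script.

```python
def micro_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) with micro-averaging."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    # Pool true/false positives and false negatives over all labels.
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Toy labels, invented for illustration only.
y_true = ["a", "b", "c", "a", "b"]
y_pred = ["a", "c", "c", "b", "b"]
accuracy, precision, recall, f1 = micro_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Each wrong prediction adds exactly one false positive (for the predicted label) and one false negative (for the true label), so the pooled precision and recall both reduce to the fraction of correct predictions.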
77
-
78
 
79
  ### Framework versions
80
 
 
7
  results: []
8
  ---
9
 
10
+ # InstructDial
 
11
 
12
+ Instruction tuning is an emergent paradigm in NLP wherein natural language instructions are leveraged with language models to induce zero-shot performance on unseen tasks. Instructions have been shown to enable good performance on unseen tasks and datasets in both large and small language models. Dialogue is an especially interesting area to explore instruction tuning because dialogue systems perform multiple kinds of tasks related to language (e.g., natural language understanding and generation, domain-specific interaction), yet instruction tuning has not been systematically explored for dialogue-related tasks. We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. Next, we explore cross-task generalization ability on models tuned on InstructDial across diverse dialogue tasks. Our analysis reveals that InstructDial enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting. To ensure that models adhere to instructions, we introduce novel meta-tasks. We establish benchmark zero-shot and few-shot performance of models trained using the proposed framework on multiple dialogue tasks.
13
+
14
+ [Paper](https://arxiv.org/abs/2205.12673)
15
+
16
+
17
+ # Dial_BART0
18
+ A BART-large model trained on InstructDial tasks. It is a fine-tuned version of [yuchenlin/BART0pp](https://huggingface.co/yuchenlin/BART0pp) on the InstructDial datasets.
19
 
20
 
21
  ## Model description
22
 
 
28
 
29
  ## Training and evaluation data
30
 
31
+ All tasks in the InstructDial framework (including all dialogue evaluation tasks).
32
 
33
  ## Training procedure
34
 
 
47
  - lr_scheduler_type: linear
48
  - num_epochs: 3.0
49
 
50
 
51
  ### Framework versions
52