---
license: apache-2.0
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: petals-team/falcon-rw-1b
model-index:
- name: GenAI-task-2-ModelD-DS
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# GenAI-task-2-ModelD-DS

This model is a fine-tuned version of [petals-team/falcon-rw-1b](https://huggingface.co/petals-team/falcon-rw-1b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8551
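
Assuming the reported loss is the mean token-level cross-entropy in nats (the Transformers default for causal-LM training), the corresponding evaluation perplexity can be derived directly:

```python
import math

eval_loss = 0.8551  # final validation loss reported for this model

# For a mean cross-entropy loss in nats, perplexity = e^loss
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")  # ≈ 2.35
```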

## Model description

Per the tags and framework versions above, this repository contains a PEFT adapter for [petals-team/falcon-rw-1b](https://huggingface.co/petals-team/falcon-rw-1b), trained with TRL's supervised fine-tuning (SFT) trainer. The specific task and training data are not documented.

## Intended uses & limitations

More information needed
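
No usage guidance was provided with this card. As a sketch, a PEFT adapter like this one can typically be loaded on top of its base model with `AutoPeftModelForCausalLM`; the repository id below is assumed from the model name and may need adjusting:

```python
# Sketch only: loading this PEFT adapter for inference.
# The adapter repo id is assumed from the model name above.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("GenAI-task-2-ModelD-DS")
tokenizer = AutoTokenizer.from_pretrained("petals-team/falcon-rw-1b")

inputs = tokenizer("Hello, world:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```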

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 2
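
The reported `total_train_batch_size` follows from the per-device batch size and gradient accumulation; a quick sanity check, assuming single-device training (no distributed setup is indicated in the card):

```python
train_batch_size = 2            # per-device micro-batch size from the list above
gradient_accumulation_steps = 2
num_devices = 1                 # assumption: single GPU

# Effective examples consumed per optimizer update
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 4, matching the value reported above
```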

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.538         | 0.0316 | 20   | 1.4898          |
| 2.1944        | 0.0631 | 40   | 1.4676          |
| 2.2479        | 0.0947 | 60   | 1.4396          |
| 1.7654        | 0.1263 | 80   | 1.3892          |
| 2.1763        | 0.1579 | 100  | 1.3834          |
| 1.3644        | 0.1894 | 120  | 1.3282          |
| 1.6781        | 0.2210 | 140  | 1.3058          |
| 1.7429        | 0.2526 | 160  | 1.2880          |
| 1.37          | 0.2841 | 180  | 1.2483          |
| 1.8196        | 0.3157 | 200  | 1.2511          |
| 1.223         | 0.3473 | 220  | 1.2120          |
| 1.5357        | 0.3788 | 240  | 1.2171          |
| 1.6471        | 0.4104 | 260  | 1.1906          |
| 1.271         | 0.4420 | 280  | 1.1818          |
| 1.7222        | 0.4736 | 300  | 1.1788          |
| 1.2022        | 0.5051 | 320  | 1.1170          |
| 1.4455        | 0.5367 | 340  | 1.1633          |
| 1.7014        | 0.5683 | 360  | 1.1011          |
| 1.1309        | 0.5998 | 380  | 1.0815          |
| 1.6978        | 0.6314 | 400  | 1.0966          |
| 1.0796        | 0.6630 | 420  | 1.0325          |
| 1.4504        | 0.6946 | 440  | 1.0429          |
| 1.4698        | 0.7261 | 460  | 1.0216          |
| 1.0858        | 0.7577 | 480  | 1.0031          |
| 1.4275        | 0.7893 | 500  | 1.0115          |
| 0.9607        | 0.8208 | 520  | 0.9771          |
| 1.2579        | 0.8524 | 540  | 0.9792          |
| 1.3363        | 0.8840 | 560  | 0.9608          |
| 1.0551        | 0.9155 | 580  | 0.9471          |
| 1.531         | 0.9471 | 600  | 0.9530          |
| 0.9776        | 0.9787 | 620  | 0.9321          |
| 1.374         | 1.0103 | 640  | 0.9257          |
| 0.9688        | 1.0418 | 660  | 0.9217          |
| 1.464         | 1.0734 | 680  | 0.9278          |
| 1.0608        | 1.1050 | 700  | 0.9040          |
| 1.0711        | 1.1365 | 720  | 0.9017          |
| 1.2806        | 1.1681 | 740  | 0.8954          |
| 0.9129        | 1.1997 | 760  | 0.8877          |
| 1.2161        | 1.2313 | 780  | 0.8907          |
| 1.0221        | 1.2628 | 800  | 0.8794          |
| 1.1306        | 1.2944 | 820  | 0.8782          |
| 1.3235        | 1.3260 | 840  | 0.8768          |
| 0.9663        | 1.3575 | 860  | 0.8711          |
| 1.3124        | 1.3891 | 880  | 0.8716          |
| 1.0169        | 1.4207 | 900  | 0.8663          |
| 1.1686        | 1.4522 | 920  | 0.8658          |
| 1.2976        | 1.4838 | 940  | 0.8656          |
| 0.8896        | 1.5154 | 960  | 0.8620          |
| 1.3252        | 1.5470 | 980  | 0.8623          |
| 1.0821        | 1.5785 | 1000 | 0.8601          |
| 1.1595        | 1.6101 | 1020 | 0.8594          |
| 1.4023        | 1.6417 | 1040 | 0.8591          |
| 0.8901        | 1.6732 | 1060 | 0.8574          |
| 1.2387        | 1.7048 | 1080 | 0.8575          |
| 0.9921        | 1.7364 | 1100 | 0.8564          |
| 1.0593        | 1.7680 | 1120 | 0.8558          |
| 1.3434        | 1.7995 | 1140 | 0.8558          |
| 0.8345        | 1.8311 | 1160 | 0.8554          |
| 1.3537        | 1.8627 | 1180 | 0.8554          |
| 1.0417        | 1.8942 | 1200 | 0.8552          |
| 1.0643        | 1.9258 | 1220 | 0.8551          |
| 1.2218        | 1.9574 | 1240 | 0.8551          |
| 1.1633        | 1.9890 | 1260 | 0.8551          |


### Framework versions

- PEFT 0.10.0
- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1