12/17/2022 22:46:51 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: True
12/17/2022 22:46:51 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=1000,
evaluation_strategy=steps,
fp16=True,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_max_length=225,
generation_num_beams=None,
gradient_accumulation_steps=1,
gradient_checkpointing=True,
greater_is_better=False,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=True,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=3.5e-05,
length_column_name=length,
load_best_model_at_end=True,
local_rank=-1,
log_level=passive,
log_level_replica=passive,
log_on_each_node=True,
logging_dir=./runs/Dec17_22-46-51_129-213-88-66,
logging_first_step=True,
logging_nan_inf_filter=True,
logging_steps=50,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=6000,
metric_for_best_model=wer,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_hf,
optim_args=None,
output_dir=./,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=32,
per_device_train_batch_size=64,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=./,
save_on_each_node=False,
save_steps=1000,
save_strategy=steps,
save_total_limit=None,
seed=43,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
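
For reference, a minimal sketch of how the argument dump above could be reproduced with the standard transformers API. Only the non-default values visible in the log are listed; everything else keeps its library default, and this is an illustrative reconstruction rather than the script's actual code.

    from transformers import Seq2SeqTrainingArguments

    # Sketch only: values copied from the Seq2SeqTrainingArguments dump above.
    training_args = Seq2SeqTrainingArguments(
        output_dir="./",
        do_train=True,
        do_eval=True,
        per_device_train_batch_size=64,
        per_device_eval_batch_size=32,
        learning_rate=3.5e-5,
        max_steps=6000,
        warmup_steps=0,
        gradient_checkpointing=True,
        fp16=True,
        evaluation_strategy="steps",
        eval_steps=1000,
        save_strategy="steps",
        save_steps=1000,
        logging_steps=50,
        logging_first_step=True,
        load_best_model_at_end=True,
        metric_for_best_model="wer",
        greater_is_better=False,
        predict_with_generate=True,
        generation_max_length=225,
        ignore_data_skip=True,
        seed=43,
        report_to=["tensorboard"],
    )
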
12/17/2022 22:46:51 - INFO - __main__ - Data parameters: DataTrainingArguments(dataset_name='mozilla-foundation/common_voice_11_0', dataset_config_name='be', max_train_samples=None, max_eval_samples=None, audio_column_name='audio', text_column_name='sentence', max_duration_in_seconds=30.0, min_duration_in_seconds=0.0, train_split_name='train', eval_split_name='validation', do_lower_case=False, do_remove_punctuation=False, do_normalize_eval=True, language='be', task='transcribe', shuffle_buffer_size=500, streaming_train=True, streaming_eval=False)
12/17/2022 22:46:51 - INFO - __main__ - Model parameters: ModelArguments(model_name_or_path='ales/whisper-small-belarusian', config_name=None, tokenizer_name=None, feature_extractor_name=None, cache_dir=None, use_fast_tokenizer=True, model_revision='main', use_auth_token=True, freeze_feature_encoder=False, freeze_encoder=False, forced_decoder_ids=None, suppress_tokens=None, model_index_name='Whisper Small Belarusian')
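
The DataTrainingArguments and ModelArguments groups above are dataclasses defined in the training script itself, not in transformers. A hedged sketch of how such groups are typically declared and parsed with HfArgumentParser; the field names are taken from the log lines, but the dataclass layout is an assumption.

    from dataclasses import dataclass, field
    from transformers import HfArgumentParser, Seq2SeqTrainingArguments

    @dataclass
    class ModelArguments:
        # Fields inferred from the ModelArguments line in the log (not exhaustive).
        model_name_or_path: str = field(default="ales/whisper-small-belarusian")
        use_auth_token: bool = field(default=True)
        freeze_feature_encoder: bool = field(default=False)

    @dataclass
    class DataTrainingArguments:
        # Fields inferred from the DataTrainingArguments line in the log (not exhaustive).
        dataset_name: str = field(default="mozilla-foundation/common_voice_11_0")
        dataset_config_name: str = field(default="be")
        language: str = field(default="be")
        task: str = field(default="transcribe")
        max_duration_in_seconds: float = field(default=30.0)
        shuffle_buffer_size: int = field(default=500)
        streaming_train: bool = field(default=True)
        streaming_eval: bool = field(default=False)

    # HfArgumentParser splits the command-line flags across the three groups.
    parser = HfArgumentParser((ModelArguments, DataTrainingArguments, Seq2SeqTrainingArguments))
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
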
12/17/2022 22:46:51 - INFO - __main__ - output_dir already exists. will try to load last checkpoint.
12/17/2022 22:46:51 - INFO - __main__ - last_checkpoint is None. will try to read from training_args.resume_from_checkpoint
12/17/2022 22:46:51 - INFO - __main__ - last_checkpoint is None. resume_from_checkpoint is either None or not existing dir. will try to read from the model saved in the root of output_dir.
12/17/2022 22:46:51 - INFO - __main__ - dir is not empty, but contains only: ['src', 'train_20221217-224651.log', 'train_run_2.log']. it is OK - will start training
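
The four checkpoint-resolution messages above describe a fallback chain: look for a checkpoint-* directory in output_dir, then for an explicit resume_from_checkpoint, then for a model saved directly in output_dir, and finally start from the base model if the directory only contains unrelated files (here: the sources and two log files). A rough sketch of that logic, assuming the standard get_last_checkpoint helper; variable names are illustrative.

    import os
    from transformers.trainer_utils import get_last_checkpoint

    def resolve_checkpoint(training_args):
        """Return a checkpoint path to resume from, or None to start fresh (sketch)."""
        last_checkpoint = None
        if os.path.isdir(training_args.output_dir) and not training_args.overwrite_output_dir:
            # 1. Newest checkpoint-* subdirectory inside output_dir, if any.
            last_checkpoint = get_last_checkpoint(training_args.output_dir)
        if last_checkpoint is None and training_args.resume_from_checkpoint:
            # 2. Fall back to an explicitly supplied resume_from_checkpoint directory.
            if os.path.isdir(training_args.resume_from_checkpoint):
                last_checkpoint = training_args.resume_from_checkpoint
        # 3. If neither exists, a model saved directly in output_dir would be used by
        #    from_pretrained; otherwise training starts from the base model, as logged.
        return last_checkpoint
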
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset Infos from /home/ubuntu/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - INFO - datasets.builder - Overwrite dataset info from restored data version.
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset info from /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset Infos from /home/ubuntu/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - INFO - datasets.builder - Overwrite dataset info from restored data version.
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset info from /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - WARNING - datasets.builder - Found cached dataset common_voice_11_0 (/home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f)
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset info from /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
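
Per the data parameters above (streaming_train=True, streaming_eval=False), the train split is read as a streaming IterableDataset while the validation split is loaded from the local cache, which is presumably why only one "Found cached dataset" message appears. A hedged sketch of that loading step:

    from datasets import load_dataset, Audio

    # Streaming train split: no local copy, examples are yielded on the fly.
    train_dataset = load_dataset(
        "mozilla-foundation/common_voice_11_0", "be",
        split="train", streaming=True, use_auth_token=True,
    )
    # Cached (non-streaming) validation split, as logged by datasets.builder above.
    eval_dataset = load_dataset(
        "mozilla-foundation/common_voice_11_0", "be",
        split="validation", streaming=False, use_auth_token=True,
    )
    # Whisper expects 16 kHz audio, so the audio column is resampled on access.
    train_dataset = train_dataset.cast_column("audio", Audio(sampling_rate=16_000))
    eval_dataset = eval_dataset.cast_column("audio", Audio(sampling_rate=16_000))
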
12/17/2022 22:47:06 - INFO - __main__ - vectorizing dataset
12/17/2022 22:47:06 - INFO - __main__ - will preprocess data using None processes.
12/17/2022 22:47:08 - INFO - datasets.arrow_dataset - Caching processed dataset at /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f/cache-796bd4b577aed289.arrow
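
The "vectorizing dataset" step maps each example to log-Mel input features and tokenized labels; "will preprocess data using None processes" means the number of preprocessing workers was left unset, so datasets runs the map in the main process and caches the result in the .arrow file shown above. A minimal sketch of such a preparation function; names are illustrative and the processor is assumed to be the WhisperProcessor for ales/whisper-small-belarusian.

    from transformers import WhisperProcessor

    # Assumed processor; bundles the feature extractor and the tokenizer.
    processor = WhisperProcessor.from_pretrained(
        "ales/whisper-small-belarusian", language="be", task="transcribe"
    )

    def prepare_dataset(batch):
        """Turn one Common Voice example into Whisper model inputs (sketch)."""
        audio = batch["audio"]
        # Log-Mel spectrogram features for the encoder.
        batch["input_features"] = processor.feature_extractor(
            audio["array"], sampling_rate=audio["sampling_rate"]
        ).input_features[0]
        # Token ids of the reference transcription for the decoder.
        batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
        return batch

    # For the cached eval split this produces the cache-*.arrow file seen in the log;
    # for the streaming train split the same map is applied lazily.
    eval_dataset = eval_dataset.map(prepare_dataset, remove_columns=eval_dataset.column_names)
    train_dataset = train_dataset.map(prepare_dataset)
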
12/17/2022 23:35:22 - INFO - __main__ - will launch training and pass resume_from_checkpoint=None
12/17/2022 23:35:22 - INFO - __main__ - ShuffleCallback. shuffling train dataset. seed: 43. dataset epoch: 0
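
The final line comes from a custom ShuffleCallback in the training script: because the train split is a streaming IterableDataset, it has to be reshuffled explicitly at each epoch boundary (the buffer size comes from shuffle_buffer_size=500 above, the seed from seed=43). The exact implementation is not visible in the log; a plausible sketch based on the standard TrainerCallback hooks:

    from datasets import IterableDataset
    from transformers import TrainerCallback

    class ShuffleCallback(TrainerCallback):
        """Reshuffle a streaming train dataset at the start of every epoch (sketch)."""

        def on_epoch_begin(self, args, state, control, train_dataloader=None, **kwargs):
            dataset = getattr(train_dataloader, "dataset", None)
            if isinstance(dataset, IterableDataset):
                # set_epoch() changes the effective shuffling seed (seed + epoch),
                # matching the "seed: 43. dataset epoch: 0" message in the log.
                dataset.set_epoch(int(state.epoch or 0))
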