12/17/2022 22:46:51 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1distributed training: False, 16-bits training: True
12/17/2022 22:46:51 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=1000,
evaluation_strategy=steps,
fp16=True,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_max_length=225,
generation_num_beams=None,
gradient_accumulation_steps=1,
gradient_checkpointing=True,
greater_is_better=False,
group_by_length=False,
half_precision_backend=auto,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=<HUB_TOKEN>,
ignore_data_skip=True,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=3.5e-05,
length_column_name=length,
load_best_model_at_end=True,
local_rank=-1,
log_level=passive,
log_level_replica=passive,
log_on_each_node=True,
logging_dir=./runs/Dec17_22-46-51_129-213-88-66,
logging_first_step=True,
logging_nan_inf_filter=True,
logging_steps=50,
logging_strategy=steps,
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=6000,
metric_for_best_model=wer,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_hf,
optim_args=None,
output_dir=./,
overwrite_output_dir=False,
past_index=-1,
per_device_eval_batch_size=32,
per_device_train_batch_size=64,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=./,
save_on_each_node=False,
save_steps=1000,
save_strategy=steps,
save_total_limit=None,
seed=43,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xpu_backend=None,
)
12/17/2022 22:46:51 - INFO - __main__ - Data parameters: DataTrainingArguments(dataset_name='mozilla-foundation/common_voice_11_0', dataset_config_name='be', max_train_samples=None, max_eval_samples=None, audio_column_name='audio', text_column_name='sentence', max_duration_in_seconds=30.0, min_duration_in_seconds=0.0, train_split_name='train', eval_split_name='validation', do_lower_case=False, do_remove_punctuation=False, do_normalize_eval=True, language='be', task='transcribe', shuffle_buffer_size=500, streaming_train=True, streaming_eval=False)
12/17/2022 22:46:51 - INFO - __main__ - Model parameters: ModelArguments(model_name_or_path='ales/whisper-small-belarusian', config_name=None, tokenizer_name=None, feature_extractor_name=None, cache_dir=None, use_fast_tokenizer=True, model_revision='main', use_auth_token=True, freeze_feature_encoder=False, freeze_encoder=False, forced_decoder_ids=None, suppress_tokens=None, model_index_name='Whisper Small Belarusian')
12/17/2022 22:46:51 - INFO - __main__ - output_dir already exists. will try to load last checkpoint.
12/17/2022 22:46:51 - INFO - __main__ - last_checkpoint is None. will try to read from training_args.resume_from_checkpoint
12/17/2022 22:46:51 - INFO - __main__ - last_checkpoint is None. resume_from_checkpoint is either None or not existing dir. will try to read from the model saved in the root of output_dir.
12/17/2022 22:46:51 - INFO - __main__ - dir is not empty, but contains only: ['src', 'train_20221217-224651.log', 'train_run_2.log']. it is OK - will start training
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset Infos from /home/ubuntu/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - INFO - datasets.builder - Overwrite dataset info from restored data version.
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset info from /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset Infos from /home/ubuntu/.cache/huggingface/modules/datasets_modules/datasets/mozilla-foundation--common_voice_11_0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - INFO - datasets.builder - Overwrite dataset info from restored data version.
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset info from /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:46:51 - WARNING - datasets.builder - Found cached dataset common_voice_11_0 (/home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f)
12/17/2022 22:46:51 - INFO - datasets.info - Loading Dataset info from /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f
12/17/2022 22:47:06 - INFO - __main__ - vectorizing dataset
12/17/2022 22:47:06 - INFO - __main__ - will preprocess data using None processes.
12/17/2022 22:47:08 - INFO - datasets.arrow_dataset - Caching processed dataset at /home/ubuntu/.cache/huggingface/datasets/mozilla-foundation___common_voice_11_0/be/11.0.0/f8e47235d9b4e68fa24ed71d63266a02018ccf7194b2a8c9c598a5f3ab304d9f/cache-796bd4b577aed289.arrow
12/17/2022 23:35:22 - INFO - __main__ - will launch training and pass resume_from_checkpoint=None
12/17/2022 23:35:22 - INFO - __main__ - ShuffleCallback. shuffling train dataset. seed: 43. dataset epoch: 0