Padding of labels bug?

#44
by haukurpj - opened

I'm currently reimplementing the audio training sample code using PyTorch Lightning, and while debugging an issue I noticed this line in the collator:

        labels = pad_sequence(labels_list, padding_side='left', padding_value=0)

When batching, should the labels not be padded with _IGNORE_INDEX?
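For comparison, here is a minimal sketch of a collator that pads the labels with -100 instead of 0. The `_IGNORE_INDEX` name, the `collate_labels` helper, and the manual left-padding are assumptions for illustration, not the sample code's actual implementation:

```python
import torch

# Assumed constant; matches the default ignore_index of torch's cross-entropy loss.
_IGNORE_INDEX = -100

def collate_labels(labels_list, padding_value=_IGNORE_INDEX):
    """Left-pad a list of 1-D label tensors into one (batch, max_len) tensor."""
    max_len = max(t.size(0) for t in labels_list)
    padded = torch.full((len(labels_list), max_len), padding_value, dtype=torch.long)
    for i, t in enumerate(labels_list):
        padded[i, max_len - t.size(0):] = t  # left padding, as in the quoted line
    return padded

# Example:
# collate_labels([torch.tensor([5, 6, 7]), torch.tensor([8, 9])])
# -> tensor([[   5,    6,    7],
#            [-100,    8,    9]])
```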

I think the attention mask will handle it.

I think it does matter when calculating the loss, but the HF trainer is probably handling this case, i.e. converting the padding value to -100 before the loss calculation.
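For reference, torch's cross-entropy loss ignores target positions equal to -100 by default, so label positions padded with 0 would be scored as real tokens unless something converts them first. A small standalone check (not taken from the sample code, purely illustrative):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(1, 4, 10)                       # (batch, seq_len, vocab)
labels_ignore = torch.tensor([[3, 7, -100, -100]])   # padded with -100
labels_zero = torch.tensor([[3, 7, 0, 0]])           # padded with 0

# cross_entropy skips targets equal to ignore_index (-100 by default),
# so the two losses differ: the 0-padded positions contribute to the loss.
loss_ignore = F.cross_entropy(logits.transpose(1, 2), labels_ignore)
loss_zero = F.cross_entropy(logits.transpose(1, 2), labels_zero)
print(loss_ignore.item(), loss_zero.item())
```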
