Finetuning
Can you please explain in more detail how to fine-tune it on a new classification task, and share the code to use?
It is likely that the model already works for your task without additional fine-tuning, since it supports zero-shot classification. You can try the sample code provided on the model card.
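If you want to try zero-shot first, the standard transformers pipeline handles the NLI reformulation for you. A minimal sketch (the model id is a placeholder for the one on the model card, and `build_hypotheses` is my own illustration of what the pipeline does internally via `hypothesis_template`):

```python
def build_hypotheses(candidate_labels, template="This text relates to {}."):
    """Illustration of what the zero-shot pipeline does internally:
    it turns every candidate label into one NLI hypothesis."""
    return [template.format(label) for label in candidate_labels]

def zero_shot(text, candidate_labels, model_id="<model-id-from-the-model-card>"):
    """Requires `transformers`; imported lazily so the helper above runs without it."""
    from transformers import pipeline
    classifier = pipeline("zero-shot-classification", model=model_id)
    return classifier(text, candidate_labels=candidate_labels,
                      hypothesis_template="This text relates to {}.")

print(build_hypotheses(["fruit", "vegetables"]))
# ['This text relates to fruit.', 'This text relates to vegetables.']
```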
Hi @HassanStar , you have two options for fine-tuning: (1) standard fine-tuning, as with any BERT model. This deletes the universal NLI task head and creates a new classification head for your task. (2) Continued fine-tuning of the model including the universal NLI head (recommended if you have roughly <= 1000 data points; with more than ~2000, normal fine-tuning is probably better). You can find example code for this in notebook nr. 4 here: https://github.com/MoritzLaurer/summer-school-transformers-2023
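For option (1), a hedged sketch of how the head replacement is typically done with transformers (`make_label_maps` and `load_for_standard_finetuning` are my own helper names; `ignore_mismatched_sizes=True` tells `from_pretrained` to discard the pretrained 3-way NLI head and initialise a fresh head with your number of classes):

```python
def make_label_maps(labels):
    """Build the id2label / label2id mappings the model config expects."""
    id2label = {i: label for i, label in enumerate(labels)}
    label2id = {label: i for i, label in id2label.items()}
    return id2label, label2id

def load_for_standard_finetuning(model_id, labels):
    """Option (1): drop the universal NLI head, create a new task head."""
    from transformers import AutoModelForSequenceClassification  # lazy import
    id2label, label2id = make_label_maps(labels)
    return AutoModelForSequenceClassification.from_pretrained(
        model_id,
        num_labels=len(labels),          # size of the new classification head
        id2label=id2label,
        label2id=label2id,
        ignore_mismatched_sizes=True,    # replaces the pretrained 3-way NLI head
    )
```

With option (2), the NLI head is kept and your task is instead reformatted into text/hypothesis pairs, as in the linked notebook.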
And as @akhtet said, you can also use it without fine-tuning, with the example code from the model card.
"you can continuously fine-tune the model including the universal NLI head (this is recommended if you have roughly <= 1000 datapoints. If you have more data than 2000, normal fine-tuning is probably better). You can find example code for this in notebook nr. 4 here: https://github.com/MoritzLaurer/summer-school-transformers-2023".
I was wondering how multiple classes would change the data preparation.
In the example notebook, we have a "positive military" hypothesis, a "negative military" hypothesis and a "not about military" hypothesis, each labelled as true or not true. What happens when we have more than these 3 classes? For example:
classifying a food-related prompt into 4 classes: "relating to fruit", "relating to vegetables", "relating to chocolate", "relating to meat".
Would it look something like this:
text: "I really like strawberries", hypothesis: "this text relates to fruit", label_nli_explicit: True
text: "I really like strawberries", hypothesis: "this text relates to vegetables", label_nli_explicit: False
text: "I really like strawberries", hypothesis: "this text relates to chocolate", label_nli_explicit: False
text: "I really like strawberries", hypothesis: "this text relates to meat", label_nli_explicit: False
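Rows like the ones above can be generated programmatically. A sketch, assuming my own helper name and hypothesis template:

```python
def to_nli_pairs(text, true_label, all_labels, template="This text relates to {}."):
    """One row per candidate class: True for the gold class, False for the rest."""
    return [
        {
            "text": text,
            "hypothesis": template.format(label),
            "label_nli_explicit": label == true_label,
        }
        for label in all_labels
    ]

rows = to_nli_pairs("I really like strawberries", "fruit",
                    ["fruit", "vegetables", "chocolate", "meat"])
```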
@Gurdikyan1 I think your example is mostly correct, but for NLI heads there should be 3 label classes (Entailment, Contradiction and Neutral) instead of True/False.
When you prepare the fine-tuning dataset, your examples should include all 3 different types.
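One way to turn the binary rows into 3-class NLI labels, as a sketch (the mapping and helper name are my own; whether a False example counts as neutral or contradiction is a judgment call that depends on your hypotheses):

```python
NLI_LABEL2ID = {"entailment": 0, "neutral": 1, "contradiction": 2}

def to_three_class(row, false_label="neutral"):
    """Map a True/False row to a 3-class NLI label.
    True -> entailment; False -> neutral or contradiction (caller decides)."""
    name = "entailment" if row["label_nli_explicit"] else false_label
    return {**row, "label": NLI_LABEL2ID[name]}
```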
@Gurdikyan1 yes, this format would work. In practice, with multiple classes, I've always only provided the True class and then randomly chosen ONE False class. Otherwise the dataset becomes huge with many classes, training takes much longer, and the model learns to over-predict "False".
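The one-True-plus-one-random-False sampling can be sketched like this (the helper name and template are mine):

```python
import random

def sample_true_plus_one_false(text, true_label, all_labels,
                               template="This text relates to {}.", rng=None):
    """Keep the True pair, and sample exactly one False pair instead of all of them."""
    rng = rng or random.Random()
    negative = rng.choice([label for label in all_labels if label != true_label])
    return [
        {"text": text, "hypothesis": template.format(true_label),
         "label_nli_explicit": True},
        {"text": text, "hypothesis": template.format(negative),
         "label_nli_explicit": False},
    ]
```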
@akhtet , classical NLI has 3 labels, but for 0-shot or few-shot classification you actually only need two (True/False). You can merge the contradiction and neutral classes into the False class.
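Merging at prediction time can then be a simple sum over the softmax probabilities; a sketch assuming probabilities keyed by label name (names are my own):

```python
def nli_to_binary(probs):
    """Collapse 3-class NLI probabilities into True/False:
    entailment -> True; neutral + contradiction -> False."""
    return {
        "true": probs["entailment"],
        "false": probs["neutral"] + probs["contradiction"],
    }
```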