An NLI model that detects the stance (positive, neutral or negative) expressed towards a given group mentioned in a sentence. The model was trained with a two-sentence context window: the target sentence plus the sentence before it. It was fine-tuned on a corpus of political manifestos from the UK, Ireland, Germany and Austria. The hypotheses tested by the model are whether the mention of a given group is positive, negative or neutral; the model is flexible and can take any named group.
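As a rough illustration of the input format described above, the sketch below builds a premise from a target sentence plus the sentence before it, and pairs it with one hypothesis per stance label. The hypothesis wording, the example sentences and the function name `build_nli_inputs` are assumptions made for illustration; the exact templates used in training are not stated here.

```python
from typing import Dict, List, Optional

# Hypothetical hypothesis templates: the wording used in training is not
# documented here, so treat these strings as placeholders.
HYPOTHESIS_TEMPLATES: Dict[str, str] = {
    "positive": "The text is positive towards {group}.",
    "negative": "The text is negative towards {group}.",
    "neutral": "The text is neutral towards {group}.",
}


def build_nli_inputs(
    sentence: str,
    group: str,
    previous_sentence: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Return one premise/hypothesis pair per stance label."""
    # Context window: the preceding sentence (if any), then the target sentence.
    parts = []
    if previous_sentence and previous_sentence.strip():
        parts.append(previous_sentence.strip())
    parts.append(sentence.strip())
    premise = " ".join(parts)

    return [
        {
            "label": label,
            "premise": premise,
            "hypothesis": template.format(group=group),
        }
        for label, template in HYPOTHESIS_TEMPLATES.items()
    ]


# Made-up example sentences, for illustration only.
pairs = build_nli_inputs(
    sentence="We will raise pensions for all retirees.",
    group="pensioners",
    previous_sentence="Older people have been neglected for too long.",
)
for pair in pairs:
    print(pair["label"], "|", pair["premise"], "|", pair["hypothesis"])
```

In a typical NLI setup, each pair would be scored by the model and the label whose hypothesis receives the highest entailment score would be taken as the predicted stance; whether this model is used exactly that way is an assumption.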
Finetuned by Will Horne, Alona Dolinsky and Lena Huber.
Accuracy: 86%; F1 Macro: 0.78; Balanced Accuracy: 76%
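The three metrics above weight the classes differently, which is why accuracy can sit well above balanced accuracy when one stance dominates. The sketch below computes all three from scratch on made-up toy labels (not the model's evaluation data). The function name `stance_metrics` is hypothetical, and only labels that occur in `y_true` are averaged over, which is a simplifying assumption.

```python
from typing import Dict, Sequence


def stance_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Compute accuracy, macro F1 and balanced accuracy for string labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    recalls, f1_scores = [], []
    # Simplification: average only over labels that appear in y_true.
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)

        recall = tp / (tp + fn)  # tp + fn > 0 because label occurs in y_true
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0

        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": accuracy,
        # Macro F1: unweighted mean of per-class F1 scores.
        "f1_macro": sum(f1_scores) / len(f1_scores),
        # Balanced accuracy: unweighted mean of per-class recall.
        "balanced_accuracy": sum(recalls) / len(recalls),
    }


# Toy, imbalanced example: the single "positive" case is misclassified.
y_true = ["neutral"] * 8 + ["positive", "negative"]
y_pred = ["neutral"] * 8 + ["neutral", "negative"]
metrics = stance_metrics(y_true, y_pred)
print(metrics)
```

On this toy data, a single error in a rare class leaves accuracy high but pulls macro F1 and balanced accuracy down sharply, the same direction of gap as in the reported figures.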
Hyperparameters were selected from 15 Optuna trials, each trained for 7 epochs, varying the following parameters:
learning_rate = trial.suggest_float("learning_rate", 1e-5, 4e-5, log=True)
weight_decay = trial.suggest_float("weight_decay", 0.01, 0.3)
warmup_ratio = trial.suggest_float("warmup_ratio", 0.0, 0.1)
The best-performing trial had the following hyperparameters: {'learning_rate': 1.8023467140185343e-05, 'weight_decay': 0.08186573469975565, 'warmup_ratio': 0.04885059164630323}.
The training arguments were as follows:
training_args = TrainingArguments(
    num_train_epochs=7,
    learning_rate=learning_rate,
    warmup_ratio=warmup_ratio,
    weight_decay=weight_decay,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1_macro",
    output_dir='./results/temp',  # Temporary directory for each trial
    logging_dir='./logs/temp',
    seed=42
)
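The search described above can be sketched as an Optuna study wrapped around a training function. This is a minimal outline, not the authors' script: `suggest_hyperparameters`, `run_search` and the `train_and_evaluate` callable are hypothetical names, and maximising the returned score is an assumption based on `metric_for_best_model="f1_macro"`.

```python
from typing import Any, Callable, Dict


def suggest_hyperparameters(trial: Any) -> Dict[str, float]:
    """Sample one configuration from the search space listed above."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 4e-5, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.01, 0.3),
        "warmup_ratio": trial.suggest_float("warmup_ratio", 0.0, 0.1),
    }


def run_search(
    train_and_evaluate: Callable[[Dict[str, float]], float],
    n_trials: int = 15,
) -> Dict[str, float]:
    """Run n_trials Optuna trials and return the best parameters.

    train_and_evaluate is a hypothetical callable: it should train a model
    with the given hyperparameters (for example via a Trainer built from
    the TrainingArguments above) and return the validation macro F1.
    """
    # Imported lazily so suggest_hyperparameters works without Optuna installed.
    import optuna

    def objective(trial: "optuna.Trial") -> float:
        return train_and_evaluate(suggest_hyperparameters(trial))

    # Assumption: higher macro F1 is better, so the study maximises.
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params
```

Calling `run_search(my_training_function)` with a real training function would return a dictionary shaped like the best-trial parameters reported above; with `metric_for_best_model="f1_macro"`, the Trainer also needs a `compute_metrics` function that returns an `f1_macro` key.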