This model is RuRoBERTa-large finetuned for linguistic acceptability classification on the RuCoLA benchmark.

The hyperparameters used for finetuning are as follows:

  • 5 training epochs (with early stopping based on validation MCC)
  • Peak learning rate: 1e-5, linear warmup for 10% of total training time
  • Weight decay: 1e-4
  • Batch size: 32
  • Random seed: 5
  • Optimizer: torch.optim.AdamW
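The optimizer and learning-rate schedule above can be sketched in PyTorch. The model below is a placeholder (the card does not state steps per epoch, so `total_steps` is an assumed value), but the optimizer choice, peak learning rate, weight decay, and 10% linear warmup follow the list:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(10, 2)  # placeholder for RuRoBERTa-large

peak_lr = 1e-5
optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr, weight_decay=1e-4)

total_steps = 1000                # assumed: 5 epochs * steps_per_epoch
warmup_steps = total_steps // 10  # linear warmup for 10% of total training time

def lr_lambda(step):
    # Linear warmup from 0 to the peak LR, then linear decay back to 0.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = LambdaLR(optimizer, lr_lambda)
```

During training, `scheduler.step()` is called once per optimizer step; early stopping on validation MCC would be handled in the training loop itself.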