datasets:
  - IteraTeR_full_sent

IteraTeR RoBERTa model

This model was obtained by fine-tuning roberta-large on the IteraTeR-human-sent dataset.

Paper: Understanding Iterative Revision from Human-Written Text
Authors: Wanyu Du, Vipul Raheja, Dhruv Kumar, Zae Myung Kim, Melissa Lopez, Dongyeop Kang

Edit Intention Prediction Task

Given an original sentence and its revised version, the model predicts the edit intention behind the revision.
More specifically, it outputs a probability for each of the following edit intentions:

| Edit Intention | Definition | Example |
| --- | --- | --- |
| clarity | Make the text more formal, concise, readable and understandable. | Original: The changes made the paper better than before.<br>Revised: The changes improved the paper. |
| fluency | Fix grammatical errors in the text. | Original: She went to the markt.<br>Revised: She went to the market. |
| coherence | Make the text more cohesive, logically linked and consistent as a whole. | Original: She works hard. She is successful.<br>Revised: She works hard; therefore, she is successful. |
| style | Convey the writer’s writing preferences, including emotions, tone, voice, etc. | Original: Everything was rotten.<br>Revised: Everything was awfully rotten. |
| meaning-changed | Update or add new information to the text. | Original: This method improves the model accuracy from 64% to 78%.<br>Revised: This method improves the model accuracy from 64% to 83%. |
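
To make "a probability for each edit intention" concrete, the sketch below applies a softmax to five scores, one per label. The logit values are invented for illustration and are not real model output; the label order is assumed to follow the id2label mapping in the Usage section.

```python
import math

# Label order assumed to match the id2label mapping in the Usage section.
LABELS = ["clarity", "fluency", "coherence", "style", "meaning-changed"]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits, invented for illustration (not real model output).
example_logits = [0.2, 3.1, -0.5, -1.0, 0.4]
probs = dict(zip(LABELS, softmax(example_logits)))

# The predicted edit intention is the label with the highest probability.
predicted = max(probs, key=probs.get)
```

Because the softmax preserves the ordering of the scores, the label with the largest logit is also the most probable one.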

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("wanyu/IteraTeR-ROBERTA")
model = AutoModelForSequenceClassification.from_pretrained("wanyu/IteraTeR-ROBERTA")

id2label = {0: "clarity", 1: "fluency", 2: "coherence", 3: "style", 4: "meaning-changed"}

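# Example revision pair: the original sentence and its revision
before_text = 'I likes coffee.'
after_text = 'I like coffee.'
# Encode the two sentences as a single sentence-pair input
model_input = tokenizer(before_text, after_text, return_tensors='pt')
# Inference only, so gradients are not needed
with torch.no_grad():
    model_output = model(**model_input)
# Convert logits to probabilities over the five edit intentions
softmax_scores = torch.softmax(model_output.logits, dim=-1)
# .item() converts the 0-dim tensor to a plain int usable as a dict key
pred_id = torch.argmax(softmax_scores).item()
pred_label = id2label[pred_id]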
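
The snippet above keeps only the top label. When the full distribution matters, for instance to spot revisions where two intentions score similarly, one option is to rank all five labels. The helper below is a hypothetical sketch and not part of the model or the transformers API; it expects a plain list of probabilities such as `softmax_scores[0].tolist()`, and the example values are invented rather than real model output.

```python
def rank_intentions(scores, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    pairs = [(id2label[i], p) for i, p in enumerate(scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

id2label = {0: "clarity", 1: "fluency", 2: "coherence", 3: "style", 4: "meaning-changed"}

# Invented probabilities for illustration; in practice, pass
# softmax_scores[0].tolist() from the usage code.
example_scores = [0.10, 0.70, 0.05, 0.05, 0.10]

ranking = rank_intentions(example_scores, id2label)
top_label, top_prob = ranking[0]
```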