wanyu/IteraTeR-ROBERTA-Intention-Classifier


	---
	datasets:
	- IteraTeR_full_sent
	---

	# IteraTeR RoBERTa model
	This model was obtained by fine-tuning [roberta-large](https://huggingface.co/roberta-large) on the [IteraTeR-human-sent](https://huggingface.co/datasets/wanyu/IteraTeR_human_sent) dataset.

	Paper: [Understanding Iterative Revision from Human-Written Text](https://arxiv.org/abs/2203.03802) <br>
	Authors: Wanyu Du, Vipul Raheja, Dhruv Kumar, Zae Myung Kim, Melissa Lopez, Dongyeop Kang

	## Edit Intention Prediction Task
	Given an original sentence and its revised version, the model predicts the edit intention behind the revision.<br>
	More specifically, the model outputs a probability for each of the following edit intentions:
	<table style="width:90%">
	<tr>
	<th>Edit Intention</th>
	<th>Definition</th>
	<th>Example</th>
	</tr>
	<tr>
	<td>clarity</td>
	<td>Make the text more formal, concise, readable and understandable.</td>
	<td>
	Original: The changes made the paper better than before. <br>
	Revised: The changes improved the paper.
	</td>
	</tr>
	<tr>
	<td>fluency</td>
	<td>Fix grammatical errors in the text.</td>
	<td>
	Original: She went to the markt. <br>
	Revised: She went to the market.
	</td>
	</tr>
	<tr>
	<td>coherence</td>
	<td>Make the text more cohesive, logically linked and consistent as a whole.</td>
	<td>
	Original: She works hard. She is successful. <br>
	Revised: She works hard; therefore, she is successful.
	</td>
	</tr>
	<tr>
	<td>style</td>
	<td>Convey the writer’s writing preferences, including emotions, tone, voice, etc.</td>
	<td>
	Original: Everything was rotten. <br>
	Revised: Everything was awfully rotten.
	</td>
	</tr>
	<tr>
	<td>meaning-changed</td>
	<td>Update or add new information to the text.</td>
	<td>
	Original: This method improves the model accuracy from 64% to 78%. <br>
	Revised: This method improves the model accuracy from 64% to 83%.
	</td>
	</tr>
	</table>
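
To make the output format concrete, the sketch below shows how five raw scores (logits) become one probability per edit intention through a softmax. The logit values are invented for illustration and are not real model output, and the label order is assumed to follow the `id2label` mapping in the Usage section.

```python
import math

# Label order assumed to match the id2label mapping in the Usage section.
LABELS = ["clarity", "fluency", "coherence", "style", "meaning-changed"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits, invented for illustration (not real model output).
example_logits = [0.2, 3.1, -0.5, -1.0, 0.4]
probs = dict(zip(LABELS, softmax(example_logits)))

# The predicted edit intention is the label with the highest probability.
predicted = max(probs, key=probs.get)

for label, p in probs.items():
    print(f"{label:16s}{p:.3f}")
print("predicted:", predicted)
```

Because the probabilities sum to 1, a low top score suggests that the revision does not fit any single intention cleanly.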



	## Usage
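	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
	tokenizer = AutoTokenizer.from_pretrained("wanyu/IteraTeR-ROBERTA-Intention-Classifier")
	model = AutoModelForSequenceClassification.from_pretrained("wanyu/IteraTeR-ROBERTA-Intention-Classifier")

	id2label = {0: "clarity", 1: "fluency", 2: "coherence", 3: "style", 4: "meaning-changed"}

	before_text = 'I likes coffee.'
	after_text = 'I like coffee.'
	# Encode the original and revised sentences together as one sequence pair.
	model_input = tokenizer(before_text, after_text, return_tensors='pt')
	with torch.no_grad():  # no gradients are needed for prediction
	    model_output = model(**model_input)
	softmax_scores = torch.softmax(model_output.logits, dim=-1)
	pred_id = torch.argmax(softmax_scores, dim=-1)
	# .item() turns the one-element tensor into a plain int usable as a dict key.
	pred_label = id2label[pred_id.item()]
	```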
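
To classify several revision pairs at once, the same steps can be wrapped in a small helper. This is a sketch and not part of the original model card: `predict_intentions` and `ID2LABEL` are hypothetical names, and the helper assumes it receives the `tokenizer` and `model` objects loaded above. It uses only the standard tokenizer arguments `padding` and `truncation` and the tensor methods `softmax` and `tolist`.

```python
# Same mapping as in the usage example above.
ID2LABEL = {0: "clarity", 1: "fluency", 2: "coherence", 3: "style", 4: "meaning-changed"}


def predict_intentions(pairs, tokenizer, model, id2label=ID2LABEL):
    """Return one (label, probability) tuple per (before, after) pair.

    Hypothetical helper: `tokenizer` and `model` are expected to be the
    objects loaded with AutoTokenizer / AutoModelForSequenceClassification.
    """
    befores = [before for before, _ in pairs]
    afters = [after for _, after in pairs]
    # Pad to the longest pair so the whole batch forms a single tensor.
    inputs = tokenizer(befores, afters, padding=True, truncation=True,
                       return_tensors="pt")
    logits = model(**inputs).logits
    # Softmax over the label dimension, then convert to plain Python lists.
    probs = logits.softmax(dim=-1).tolist()
    results = []
    for row in probs:
        best = max(range(len(row)), key=row.__getitem__)
        results.append((id2label[best], row[best]))
    return results
```

Called as `predict_intentions([("I likes coffee.", "I like coffee.")], tokenizer, model)`, it returns a list with one `(label, probability)` tuple per pair. For pure inference, the call can be wrapped in `torch.no_grad()` to skip gradient tracking.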