Update README.md
README.md
Upon inputting an essay, the model outputs six scores corresponding to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. Each score ranges from 1 to 5, with higher scores indicating greater proficiency demonstrated in the essay. Together, these dimensions assess the quality of the input essay from multiple perspectives. The model is a valuable tool for EFL teachers and researchers, and it is also useful for English L2 learners and their parents as a way to self-evaluate composition skills.

To test the model, run the following code or paste your essay into the API interface.

Use the following code if you want output values ranging from 1 to 5:
```
# import packages
# ... (model loading and scoring elided in this diff; see the full example below)

for trait, score in zip(trait_names, predicted_scores):
    print(f"{trait}: {score:.4f}")

## "output" (values ranging from 1 to 5):
# cohesion: 3.5399
# syntax: 3.6380
# vocabulary: 3.9250
# ...
# grammar: 3.9194
# conventions: 3.6819
```

However, use the following code if you want output values between 1 and 10:

```
# Import packages
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("Kevintu/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("Kevintu/Engessay_grading_ML")

# Example new text input
new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

# Encode the text (truncated to the first 64 tokens)
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

# Run the model in evaluation mode
model.eval()
with torch.no_grad():
    outputs = model(**encoded_input)

# Get the raw predictions
predictions = outputs.logits.squeeze()

# Convert predictions to a numpy array
predicted_scores = predictions.numpy()
trait_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Linearly rescale the scores from the 1-5 range to the 1-10 range (1 -> 1, 5 -> 10)
scaled_scores = 2.25 * predicted_scores - 1.25

# Print the scaled trait scores
for trait, score in zip(trait_names, scaled_scores):
    print(f"{trait}: {score:.4f}")

## "output" (values between 1 and 10):
# cohesion: 6.7147
# syntax: 6.9354
# vocabulary: 7.5814
# phraseology: 7.3856
# grammar: 7.5687
# conventions: 7.0344
```
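
The scaling step above is a plain affine map `y = a * x + b`, with `a` and `b` chosen so that a score of 1 stays 1 and a score of 5 becomes 10. As a quick sanity check (using the 1-to-5 scores from the first run above; no model download required):

```python
# Derive the affine map y = a * x + b from the two anchor points (1 -> 1, 5 -> 10)
a = (10 - 1) / (5 - 1)   # slope: 2.25
b = 1 - a * 1            # intercept: -1.25

# Apply it to the first three 1-to-5 scores reported above
scores_1_to_5 = [3.5399, 3.6380, 3.9250]   # cohesion, syntax, vocabulary
scores_1_to_10 = [a * s + b for s in scores_1_to_5]
for trait, score in zip(["cohesion", "syntax", "vocabulary"], scores_1_to_10):
    print(f"{trait}: {score:.4f}")
```

The results match the 1-to-10 outputs listed above to within rounding.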