PeppoCola/IssueReportClassifier-NLBSE22

Text Classification


PeppoCola committed on Mar 21, 2023

Commit b12a845 · 1 Parent(s): 1a17051

Update README.md

Files changed (1)

README.md +6 -0


@@ -32,6 +32,12 @@ The model is trained on a dataset of labeled issue reports and is designed to pr
 | enhancement | 299,287 (41.4%) | 33,290 (41.3%) |
 | question   | 62,373 (8.6%) | 7,076 (8.8%) |
+## Data preprocessing
+The training data was preprocessed with [ekphrasis](https://github.com/cbaziotis/ekphrasis), together with additional regular expressions that remove code, images, and URLs.
+See our [GitHub repository](https://github.com/collab-uniba/Issue-Report-Classification-Using-RoBERTa) for more details on the preprocessing.
 ## Metrics
 The model is evaluated using the following metrics:
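
The added "Data preprocessing" section mentions regular expressions that remove code, images, and URLs before training. The following is a minimal sketch of what such a cleanup step could look like, using only Python's standard `re` module. The patterns and the `strip_noise` name are assumptions made for illustration, not the expressions used in the linked repository, and the ekphrasis step itself is left out.

```python
import re

# Hypothetical patterns for illustration; the repository's actual
# expressions may differ.
FENCED_CODE = re.compile(r"```.*?```", re.DOTALL)    # Markdown fenced code blocks
INLINE_CODE = re.compile(r"`[^`\n]+`")               # Markdown inline code spans
MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")       # Markdown image syntax
HTML_IMAGE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)  # HTML <img> tags
URL = re.compile(r"https?://\S+")                    # bare http(s) URLs
WHITESPACE = re.compile(r"\s+")


def strip_noise(text: str) -> str:
    """Remove code, images, and URLs from an issue report body."""
    # Images are removed before URLs so that the URL inside an image
    # link does not leave a dangling "![alt](" fragment behind.
    for pattern in (FENCED_CODE, INLINE_CODE, MD_IMAGE, HTML_IMAGE, URL):
        text = pattern.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return WHITESPACE.sub(" ", text).strip()
```

A cleaned string produced this way could then be passed on to a tokenizer or to ekphrasis for further normalization; whether the original pipeline applies the regular expressions before or after ekphrasis is not stated in the README.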