agentlans committed on
Commit 26423be · verified · 1 Parent(s): 6a56b21

Update README.md

Files changed (1)
  1. README.md +2 -12
README.md CHANGED
@@ -13,7 +13,7 @@ model-index:
 
  ## Model Overview
 
- This model is a fine-tuned version of [agentlans/deberta-v3-xsmall-zyda-2](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2) designed for text quality assessment. It has been trained on the Text Quality Meta-Analysis Dataset and achieves the following results on the evaluation set:
+ This model is a fine-tuned version of [agentlans/deberta-v3-xsmall-zyda-2](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2) designed for text quality assessment. It achieves the following results on the evaluation set:
 
  - Loss: 0.3165
  - MSE: 0.3165
@@ -24,16 +24,6 @@ The model was trained on the [Text Quality Meta-Analysis Dataset](https://huggin
 
  In this context, "quality" refers to legible English sentences that are not spam and contain useful information. It does not necessarily indicate grammatical or factual correctness.
 
- ### Quality Score Derivation
-
- The composite quality score was derived through the following steps:
-
- 1. Principal Component Analysis (PCA) was performed on the normalized "fineweb" and "nvidia" scores.
- 2. The first principal component was extracted as an initial measure of quality.
- 3. This quality measure was then adjusted for sentence length using robust linear regression (rlm function from the MASS package).
- 4. The adjusted quality scores were scaled to z-scores to produce the final quality metric.
- 5. The scores were then quantile normalized to a normal distribution.
-
  ## Model Description
 
  The model is based on the DeBERTa-v3-xsmall architecture and has been fine-tuned for sequence classification tasks, specifically for assessing the quality of text inputs.
@@ -44,7 +34,7 @@ This model is intended for evaluating the quality of text inputs. It can be used
 
  ### Usage Example
 
- ```
+ ```python
  import torch
  from transformers import AutoModelForSequenceClassification, AutoTokenizer
 