Update README.md
README.md
CHANGED
@@ -4,57 +4,60 @@ language:
|
|
4 |
- en
|
5 |
metrics:
|
6 |
- accuracy
|
7 |
-
base_model:
|
8 |
-
- FacebookAI/roberta-large
|
9 |
pipeline_tag: text-classification
|
10 |
tags:
|
11 |
- framing
|
12 |
---
|
13 |
|
14 |
-
Sentence Frame Classifier
|
|
|
15 |
A RoBERTa-based model for detecting media frames at the sentence level. This model can classify sentences into 9 different frame categories and works across both news articles and reader comments.
|
16 |
-
Model Description
|
17 |
-
This model was trained to identify media frames in text at the sentence level. It's based on the Media Frame Corpus (Card et al., 2015) and extends to online discussion contexts, making it suitable for analyzing both professional journalism and user-generated content.
|
18 |
-
Key Features:
|
19 |
|
20 |
-
|
21 |
-
|
22 |
-
|
23 |
-
|
24 |
|
25 |
-
Frame Categories
|
26 |
The model classifies sentences into these 9 frame categories:
|
27 |
|
28 |
-
Economic
|
29 |
-
Morality
|
30 |
-
Fairness and Equality
|
31 |
-
Legality and Crime
|
32 |
-
Political and Policies
|
33 |
-
Security and Defense
|
34 |
-
Health and Safety
|
35 |
-
Cultural Identity
|
36 |
-
Public Opinion
|
37 |
|
38 |
-
|
39 |
|
40 |
-
|
41 |
-
Accuracy: 0.77
|
42 |
-
Cross-topic generalization: Robust performance across different topics
|
43 |
-
Validation: Human-validated on 600 sentences
|
44 |
|
45 |
-
|
46 |
-
|
47 |
|
48 |
-
# Load the classifier
|
49 |
classifier = pipeline("text-classification", model="your-username/sentence-frame-classifier")
|
50 |
|
51 |
-
# Classify a sentence
|
52 |
text = "The new policy will cost taxpayers millions of dollars while providing few benefits."
|
53 |
result = classifier(text)
|
54 |
print(result)
|
55 |
-
#
|
56 |
|
57 |
-
# Multiple examples
|
58 |
examples = [
|
59 |
"This violates our constitutional rights and freedoms.",
|
60 |
"The public strongly supports this initiative according to recent polls.",
|
@@ -66,23 +69,3 @@ for text in examples:
|
|
66 |
print(f"Text: {text}")
|
67 |
print(f"Frame: {result[0]['label']} (confidence: {result[0]['score']:.2f})")
|
68 |
print()
|
69 |
-
Training Data
|
70 |
-
The model was trained on:
|
71 |
-
|
72 |
-
Media Frame Corpus (MFC): Professionally annotated news articles
|
73 |
-
Online Forum Data: Sentence-level annotations from online discussions
|
74 |
-
Total: 63,626 sentences across multiple topics
|
75 |
-
|
76 |
-
Citation
|
77 |
-
If you use this model in your research, please cite:
|
78 |
-
|
79 |
-
|
80 |
-
License
|
81 |
-
This model is released under the MIT License. You are free to use, modify, and distribute this model for any purpose, provided you include appropriate attribution.
|
82 |
-
Model Details
|
83 |
-
|
84 |
-
Model Type: Text Classification
|
85 |
-
Base Model: RoBERTa-large
|
86 |
-
Parameters: ~355M
|
87 |
-
Training Framework: Transformers
|
88 |
-
Inference Framework: Transformers Pipeline
|
|
|
4 |
- en
|
5 |
metrics:
|
6 |
- accuracy
|
7 |
+
base_model: facebook/roberta-large
|
|
|
8 |
pipeline_tag: text-classification
|
9 |
tags:
|
10 |
- framing
|
11 |
---
|
12 |
|
13 |
+
# Sentence Frame Classifier
|
14 |
+
|
15 |
A RoBERTa-based model for detecting media frames at the sentence level. This model can classify sentences into 9 different frame categories and works across both news articles and reader comments.
|
16 |
|
17 |
+
## Model Description
|
18 |
+
|
19 |
+
This model was trained to identify media frames in text at the sentence level. It's based on the Media Frame Corpus (Card et al., 2015) and extends to online discussion contexts (Hartmann et al., 2019), making it suitable for analyzing both professional journalism and user-generated content.
|
20 |
+
|
21 |
+
**Key Features:**
|
22 |
+
|
23 |
+
- Sentence-level frame classification
|
24 |
+
- Cross-domain capability (news articles + comments)
|
25 |
+
- 9 frame categories based on established political communication theory
|
26 |
+
- Robust performance across different topics
|
27 |
+
|
28 |
+
## Frame Categories
|
29 |
|
|
|
30 |
The model classifies sentences into these 9 frame categories:
|
31 |
|
32 |
+
- Economic: Economic costs, benefits, or implications
|
33 |
+
- Morality: Moral or ethical considerations
|
34 |
+
- Fairness and Equality: Issues of fairness, equality, or discrimination
|
35 |
+
- Legality and Crime: Legal aspects, constitutionality, crime, and punishment
|
36 |
+
- Political and Policies: Political processes, policy prescriptions, and evaluations
|
37 |
+
- Security and Defense: Security threats, defense, or public safety
|
38 |
+
- Health and Safety: Health risks, safety concerns, or medical implications
|
39 |
+
- Cultural Identity: Cultural values, traditions, or identity issues
|
40 |
+
- Public Opinion: Public sentiment, polls, or popular support
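
Sentence-level labels are often aggregated per document for downstream analysis. The sketch below tallies frames from pipeline-style results (a list of `{'label', 'score'}` dicts). It assumes the model's label strings match the category names listed above, which this README does not confirm; `FRAMES`, `tally_frames`, and the `min_score` threshold are hypothetical names chosen for illustration.

```python
from collections import Counter

# Assumption: label strings equal the category names in the list above.
FRAMES = [
    "Economic", "Morality", "Fairness and Equality", "Legality and Crime",
    "Political and Policies", "Security and Defense", "Health and Safety",
    "Cultural Identity", "Public Opinion",
]


def tally_frames(results, min_score=0.5):
    """Count predicted frames, skipping unknown labels and low-confidence hits.

    `results` holds one {'label': str, 'score': float} dict per sentence.
    """
    counts = Counter()
    for r in results:
        if r["label"] in FRAMES and r["score"] >= min_score:
            counts[r["label"]] += 1
    return counts


# Hand-written stand-ins for classifier output (not real model predictions).
sample = [
    {"label": "Economic", "score": 0.91},
    {"label": "Economic", "score": 0.42},
    {"label": "Public Opinion", "score": 0.77},
]
print(tally_frames(sample))
```

The 0.5 cutoff is arbitrary; a threshold suited to real data would need to be chosen on held-out examples.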
|
41 |
+
|
42 |
+
## Performance
|
43 |
|
44 |
+
- Macro F1: 0.66
|
45 |
+
- Accuracy: 0.77
|
46 |
+
- Cross-topic generalization: Robust performance across different topics
|
47 |
+
- Validation: Human-validated on 600 sentences
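
Macro F1 averages the per-class F1 scores with equal weight, so it can sit below accuracy when less frequent frames are predicted less reliably. The following is a minimal sketch of both metrics on made-up labels; it is not the evaluation code or data behind the numbers above, and the helper names `accuracy` and `macro_f1` are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy, imbalanced data: 8 "Economic" sentences, 2 "Morality" sentences.
y_true = ["Economic"] * 8 + ["Morality"] * 2
y_pred = ["Economic"] * 8 + ["Economic", "Morality"]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.2f}")
```

In this toy case one missed minority-class sentence lowers macro F1 more than accuracy, because the minority class counts as much as the majority class in the average.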
|
48 |
|
49 |
+
## Usage
|
50 |
|
51 |
+
```python
|
52 |
+
from transformers import pipeline
|
53 |
|
|
|
54 |
classifier = pipeline("text-classification", model="your-username/sentence-frame-classifier")
|
55 |
|
|
|
56 |
text = "The new policy will cost taxpayers millions of dollars while providing few benefits."
|
57 |
result = classifier(text)
|
58 |
print(result)
|
59 |
+
# Example output (exact score will vary): [{'label': 'Economic', 'score': 0.89}]
|
60 |
|
|
|
61 |
examples = [
|
62 |
"This violates our constitutional rights and freedoms.",
|
63 |
"The public strongly supports this initiative according to recent polls.",
|
|
|
69 |
print(f"Text: {text}")
|
70 |
print(f"Frame: {result[0]['label']} (confidence: {result[0]['score']:.2f})")
|
71 |
print()