quim-motger committed • Commit ea0973f • 1 Parent(s): 0919cd6

Update README

README.md CHANGED
@@ -1,3 +1,71 @@
---
license: gpl-3.0
---

# reviewBERT-large

This model is a fine-tuned version of [`bert-large-uncased`](https://huggingface.co/google-bert/bert-large-uncased) on a large dataset
of mobile app reviews. The model is designed to understand and process text from mobile app reviews, providing enhanced performance
for tasks such as feature extraction, sentiment analysis and review summarization from app reviews.

## Model Details

- **Model Architecture**: BERT (Bidirectional Encoder Representations from Transformers)
- **Base Model**: `bert-large-uncased`
- **Pre-training Extension**: Mobile app reviews dataset
- **Language**: English

## Dataset

The extended pre-training was performed using a diverse dataset of mobile app reviews collected from various app stores.
The dataset includes reviews of different lengths, sentiments, and topics, providing a robust foundation for understanding
the nuances of mobile app user feedback.

## Training Procedure

The model was fine-tuned using the following hyperparameters:

- **Batch Size**: 16
- **Learning Rate**: 2e-5
- **Epochs**: 2
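
To give a feel for what these hyperparameters imply, the sketch below computes the number of optimizer steps and the learning rate at the midpoint of training. It rests on assumptions this card does not state: the dataset size `NUM_EXAMPLES` is hypothetical, one optimizer step per batch is assumed (no gradient accumulation), and the linear decay schedule without warmup is only a common default, not necessarily the one used here.

```python
import math

BATCH_SIZE = 16          # from the model card
LEARNING_RATE = 2e-5     # from the model card
EPOCHS = 2               # from the model card
NUM_EXAMPLES = 100_000   # hypothetical: the card does not state the dataset size


def total_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps, assuming one step per batch and no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_decay_lr(step: int, total: int, peak_lr: float) -> float:
    """Learning rate at `step` under an assumed linear decay from peak_lr to 0."""
    return peak_lr * max(0.0, (total - step) / total)


steps = total_steps(NUM_EXAMPLES, BATCH_SIZE, EPOCHS)
print(steps)                                               # 12500
print(linear_decay_lr(steps // 2, steps, LEARNING_RATE))   # 1e-05
```

With a different dataset size, only `NUM_EXAMPLES` needs to change. The step count scales linearly with the number of reviews.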

## Usage

### Load the model

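```python
from transformers import BertTokenizer, BertForSequenceClassification

# Load the tokenizer and the model weights from the Hugging Face Hub.
tokenizer = BertTokenizer.from_pretrained('quim-motger/reviewBERT-large')
# Note: if the checkpoint has no fine-tuned classification head, transformers
# initializes one randomly; fine-tune it on labeled data before relying on predictions.
model = BertForSequenceClassification.from_pretrained('quim-motger/reviewBERT-large')
```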
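
For the feature-extraction use case mentioned above, a single vector per review is often obtained by averaging the encoder's token vectors while skipping padding positions. The following is a minimal sketch of that masked mean pooling on toy data. The helper `mean_pool` and the example values are hypothetical, and this card does not state which pooling strategy, if any, the authors used.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding positions."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy 2-dimensional "hidden states" for four positions; the last one is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
pooled = mean_pool(hidden, mask)
print(pooled)  # [3.0, 4.0]
```

In practice the token vectors would come from the encoder's last hidden state and the mask from the tokenizer output; the toy lists above only stand in for those tensors.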

### Example: Sentiment Analysis

```python
from transformers import pipeline

# Reuses the `model` and `tokenizer` objects created in "Load the model".
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

review = "This app is fantastic! I love the user-friendly interface and features."
result = nlp(review)

print(result)
# Illustrative output (labels and scores depend on the fine-tuned classification head):
# [{'label': 'POSITIVE', 'score': 0.98}]
```
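
The `score` in the output above is a probability. As a rough sketch of where such a number comes from, assume a single-label classifier whose raw outputs (logits) are passed through a softmax. The helper `logits_to_prediction`, the logits, and the label mapping below are hypothetical and are not taken from this model.

```python
import math
from typing import Dict, List


def logits_to_prediction(logits: List[float], id2label: Dict[int, str]) -> Dict[str, object]:
    """Apply a numerically stable softmax and return the top label with its probability."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


# Hypothetical logits and label mapping, for illustration only.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
prediction = logits_to_prediction([-1.9, 2.0], id2label)
print(prediction["label"], round(prediction["score"], 2))  # POSITIVE 0.98
```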

### Example: Review Summarization

```python
from transformers import pipeline

# Note: the 'summarization' pipeline is designed for sequence-to-sequence models;
# an encoder-only BERT checkpoint typically needs an added decoder to generate text.
summarizer = pipeline('summarization', model=model, tokenizer=tokenizer)

# Adjacent string literals in parentheses are joined into one string.
long_review = (
    "I have been using this app for a while and it has significantly improved my productivity. "
    "The range of features is excellent, and the user interface is intuitive. However, there are occasional "
    "bugs that need fixing."
)
summary = summarizer(long_review, max_length=50, min_length=25, do_sample=False)

print(summary)
# Illustrative output: [{'summary_text': 'The app has significantly improved my productivity with its excellent features and intuitive user interface. However, occasional bugs need fixing.'}]
```