---
license: gpl-3.0
language:
- en
tags:
- feature extraction
- mobile apps
- reviews
- token classification
- named entity recognition
pipeline_tag: token-classification
widget:
- text: "The share note file feature is completely useless."
  example_title: "Example 1"
- text: "Great app I've tested a lot of free habit tracking apps and this is by far my favorite."
  example_title: "Example 2"
- text: "The only negative feedback I can give about this app is the difficulty level to set a sleep timer on it."
  example_title: "Example 3"
- text: "Does what you want with a small pocket size checklist reminder app"
  example_title: "Example 4"
- text: "Very bad because call recording notification send other person"
  example_title: "Example 5"
- text: "I originally downloaded the app for pomodoro timing, but I stayed for the project management features, with syncing."
  example_title: "Example 6"
- text: "It works accurate and I bought a portable one lap gps tracker it have a great battery Life"
  example_title: "Example 7"
- text: "I'm my phone the notifications of group message are not at a time please check what was the reason behind it because due to this default I loose some opportunity"
  example_title: "Example 8"
- text: "There is no setting for recurring alarms"
  example_title: "Example 9"
---
# T-FREX RoBERTa base model
T-FREX is a transformer-based feature extraction method for mobile app reviews. It fine-tunes large language models (LLMs) for a named entity recognition task. We collect a dataset of ground-truth features from users of a real crowdsourced software recommendation platform, and we use this dataset to fine-tune multiple LLMs under different data configurations. We assess the performance of T-FREX with respect to this ground truth, and we complement our analysis by comparing T-FREX with a baseline method from the field. Finally, we assess the quality of new features predicted by T-FREX through an external human evaluation. Results show that, on average, T-FREX outperforms the traditional syntactic-based method, especially when discovering new features in a domain for which the model has been fine-tuned.
Source code for data generation, fine-tuning and model inference is available in the original [GitHub repository](https://github.com/gessi-chatbots/t-frex/).
## Model description
This version of T-FREX has been fine-tuned for [token classification](https://huggingface.co/docs/transformers/tasks/token_classification#train) from [RoBERTa base model](https://huggingface.co/roberta-base).
## Model variations
T-FREX includes a set of released, fine-tuned models which are compared in the original study (to be published).
- [**t-frex-bert-base-uncased**](https://huggingface.co/quim-motger/t-frex-bert-base-uncased)
- [**t-frex-bert-large-uncased**](https://huggingface.co/quim-motger/t-frex-bert-large-uncased)
- [**t-frex-roberta-base**](https://huggingface.co/quim-motger/t-frex-roberta-base)
- [**t-frex-roberta-large**](https://huggingface.co/quim-motger/t-frex-roberta-large)
- [**t-frex-xlnet-base-cased**](https://huggingface.co/quim-motger/t-frex-xlnet-base-cased)
- [**t-frex-xlnet-large-cased**](https://huggingface.co/quim-motger/t-frex-xlnet-large-cased)
## How to use
You can use this model by following the instructions for [model inference for token classification](https://huggingface.co/docs/transformers/tasks/token_classification#inference).
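
As a rough illustration of those instructions, the sketch below runs a review through the `transformers` token-classification pipeline and keeps the predicted spans. It assumes the `quim-motger/t-frex-roberta-base` checkpoint from the list above; the helper names `feature_spans` and `extract_features` and the `min_score` threshold are hypothetical and not part of T-FREX. The code does not rely on any particular label names, since the label scheme is not documented here.

```python
from typing import Any, Dict, List

# Checkpoint name taken from the "Model variations" list; swap in another variant if needed.
MODEL_ID = "quim-motger/t-frex-roberta-base"


def feature_spans(entities: List[Dict[str, Any]], min_score: float = 0.5) -> List[str]:
    """Return the text of each aggregated span whose score reaches min_score.

    `entities` is expected to look like the output of a token-classification
    pipeline run with an aggregation strategy: dicts with "word" and "score".
    The 0.5 default is an arbitrary illustrative threshold, not a tuned value.
    """
    return [e["word"].strip() for e in entities if e["score"] >= min_score]


def extract_features(review: str, min_score: float = 0.5) -> List[str]:
    """Tag one review and return the predicted feature strings."""
    # Imported lazily so feature_spans can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge sub-word tokens into whole spans
    )
    return feature_spans(tagger(review), min_score)


# Example call (downloads the model on first use):
# print(extract_features("The share note file feature is completely useless."))
```

Building the pipeline inside `extract_features` keeps the sketch short; for more than a handful of reviews it is probably better to create the pipeline once and reuse it.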