license: gpl-3.0
language:
- en
tags:
- feature extraction
- mobile apps
- reviews
- token classification
- named entity recognition
pipeline_tag: token-classification
widget:
- text: The share note file feature is completely useless.
example_title: Example 1
- text: >-
Great app I've tested a lot of free habit tracking apps and this is by far
my favorite.
example_title: Example 2
- text: >-
The only negative feedback I can give about this app is the difficulty
level to set a sleep timer on it.
example_title: Example 3
- text: Does what you want with a small pocket size checklist reminder app
example_title: Example 4
- text: Very bad because call recording notification send other person
example_title: Example 5
- text: >-
I originally downloaded the app for pomodoro timing, but I stayed for the
project management features, with syncing.
example_title: Example 6
- text: >-
It works accurate and I bought a portable one lap gps tracker it have a
great battery Life
example_title: Example 7
- text: >-
I'm my phone the notifications of group message are not at a time please
check what was the reason behind it because due to this default I loose
some opportunity
example_title: Example 8
- text: There is no setting for recurring alarms
example_title: Example 9
T-FREX RoBERTa base model
T-FREX is a transformer-based feature extraction method for mobile app reviews, based on fine-tuning Large Language Models (LLMs) for a named entity recognition task. We collect a dataset of ground truth features from users in a real crowdsourced software recommendation platform, and we use this dataset to fine-tune multiple LLMs under different data configurations. We assess the performance of T-FREX with respect to this ground truth, and we complement our analysis by comparing T-FREX with a baseline method from the field. Finally, we assess the quality of new features predicted by T-FREX through an external human evaluation. Results show that, on average, T-FREX outperforms the traditional syntactic-based method, especially when discovering new features from a domain for which the model has been fine-tuned.
Source code for data generation, fine-tuning and model inference is available in the original GitHub repository.
Model description
This version of T-FREX has been fine-tuned for token classification from the RoBERTa base model.
Model variations
T-FREX includes a set of released, fine-tuned models which are compared in the original study (to be published).
- t-frex-bert-base-uncased
- t-frex-bert-large-uncased
- t-frex-roberta-base
- t-frex-roberta-large
- t-frex-xlnet-base-cased
- t-frex-xlnet-large-cased
How to use
You can use this model by following the instructions for model inference for token classification.
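As a minimal sketch of what inference might look like, the snippet below wraps the Hugging Face `transformers` token-classification pipeline and post-processes its output into feature strings. The helper names `load_extractor` and `extract_features`, the `min_score` threshold and its default value are illustrative assumptions and are not part of T-FREX. The full Hub identifier of the checkpoint is not stated in this card, so it is left as a parameter.

```python
from typing import Callable, Dict, List


def load_extractor(model_id: str) -> Callable[[str], List[Dict]]:
    """Build a token-classification pipeline for a T-FREX checkpoint.

    model_id is the full Hub identifier of the checkpoint (assumption:
    it is supplied by the caller, since this card does not state it).
    """
    # Imported lazily so extract_features can be used without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" groups consecutive tagged tokens
    # into one entity span with character offsets.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_features(
    text: str, predictions: List[Dict], min_score: float = 0.5
) -> List[str]:
    """Return the tagged spans of text, dropping low-confidence ones.

    predictions is the list of dicts returned by the pipeline; each one
    carries "score", "start" and "end" keys. min_score is a hypothetical
    threshold chosen for illustration only.
    """
    features = []
    for pred in predictions:
        if pred["score"] >= min_score:
            # Slice the original text so the span keeps its exact casing.
            features.append(text[pred["start"]:pred["end"]])
    return features
```

To try it, call `extractor = load_extractor(model_id)` with the Hub identifier of this model, then `extract_features(review, extractor(review))` on a review such as "There is no setting for recurring alarms". Slicing the original text by character offsets, instead of using the pipeline's `word` field, avoids tokenizer artifacts such as leading spaces.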