Goal

This model can be used to add emoji to an input text.

To accomplish this, we framed the task as token classification: for each word/token, the model predicts, as an entity label, the emoji that should follow it.

The accompanying demo, which includes all the pre- and postprocessing needed, can be found here.

For the moment, this only works for Dutch texts.
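As a rough illustration of the postprocessing idea, the sketch below turns per-token predictions back into text by appending the predicted emoji after each labelled token. It is not the demo's actual code: the function name `insert_emoji` is hypothetical, the labels are assumed to follow the `B-<emoji>` / `O` scheme shown in the Dataset section, and tokens are simply joined with spaces rather than properly detokenised.

```python
def insert_emoji(tokens, labels):
    """Append the predicted emoji after every token whose label is not 'O'.

    Assumes one label per token, either 'O' or 'B-<emoji>'.
    """
    pieces = []
    for token, label in zip(tokens, labels):
        pieces.append(token)
        if label != "O":
            # Strip the 'B-' prefix to recover the emoji itself.
            pieces.append(label.split("-", 1)[1])
    # Naive join: punctuation keeps a leading space.
    return " ".join(pieces)


# Hypothetical predictions for a short Dutch sentence.
tokens = ["Wat", "een", "mooie", "auto", "!"]
labels = ["O", "O", "O", "B-🚗", "O"]
print(insert_emoji(tokens, labels))
```

A real pipeline would also need to merge subword pieces back into words and restore the original spacing, which this sketch leaves out.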

Dataset

For this model, we scraped about 1000 unique tweets per emoji we support: ['😨', '😥', '😍', '😠', '🤯', '😄', '🍾', '🚗', '☕', '💰']

These tweets could look like this:

Wow 😍😍, what a cool car 🚗🚗!
Omg, I hate mondays 😠... I need a drink 🍾

After some processing, we can recast this in a more familiar NER format:

Word Label
Wow B-😍
, O
what O
a O
cool O
car O
! B-🚗

This format can then be used to train a token-classification model.
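The conversion from a raw tweet to word/label pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing actually used for the dataset: the regex tokenizer, the `tweet_to_ner` name, the rule of attaching an emoji to the token directly before it, and the collapsing of repeated emoji into a single label are all choices made here, and the original pipeline may differ in tokenisation and in where a label is placed.

```python
import re

SUPPORTED_EMOJI = ["😨", "😥", "😍", "😠", "🤯", "😄", "🍾", "🚗", "☕", "💰"]

# Each supported emoji is a single code point, so they fit in a character class.
EMOJI_CLASS = "".join(SUPPORTED_EMOJI)

# Match, in order: one supported emoji, a word, or a run of other punctuation.
TOKEN_RE = re.compile(rf"[{EMOJI_CLASS}]|\w+|[^\w\s{EMOJI_CLASS}]+")


def tweet_to_ner(tweet):
    """Return (word, label) pairs; emoji become labels on the preceding token."""
    rows = []
    for piece in TOKEN_RE.findall(tweet):
        if piece in SUPPORTED_EMOJI:
            # Label the previous token, unless it already carries a label
            # (repeated emoji) or there is no previous token at all.
            if rows and rows[-1][1] == "O":
                rows[-1] = (rows[-1][0], "B-" + piece)
        else:
            rows.append((piece, "O"))
    return rows


for word, label in tweet_to_ner("Omg, I hate mondays 😠... I need a drink 🍾"):
    print(word, label)
```

With this rule, "mondays" receives the label B-😠 and "drink" receives B-🍾, while every other token is labelled O.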

Unfortunately, Terms of Service prohibit us from sharing the original dataset.

Training

The model was trained for 4 epochs.
