import gradio as gr

import model_wrapper

model = model_wrapper.PredictionModel()


def pretty_print_opinion(opinion_dict):
    """Format one predicted opinion as 'Key: value' lines, one per field."""
    res = []
    maxlen = max([len(key) for key in opinion_dict.keys()]) + 2
    # NOTE: this override disables the padding computed above, so keys are not aligned.
    maxlen = 0
    for key, value in opinion_dict.items():
        if key == 'Polarity':
            res.append(f'{(key + ":").ljust(maxlen)} {value}')
        else:
            # value is [tokens, character spans]; show the tokens joined by spaces.
            res.append(f'{(key + ":").ljust(maxlen)} \'{" ".join(value[0])}\'')
    return '\n'.join(res) + '\n'


def predict(text):
    print(f'Input message "{text}"')
    try:
        predictions = model.predict([text])
        prediction = predictions[0]
        results = []
        if not prediction['opinions']:
            return 'No opinions detected'
        for opinion in prediction['opinions']:
            results.append(pretty_print_opinion(opinion))
        print(f'Successfully predicted SA for input message "{text}": {results}')
        return '\n'.join(results)
    except Exception as e:
        print(f'Error for input message "{text}": {e}')
        raise e


markdown_text = '''
<br>
<br>
This space provides a Gradio demo and an easy-to-run wrapper around a model for structured sentiment analysis of Norwegian text, pre-trained on the [NoReC dataset](https://huggingface.co/datasets/norec).

This space contains an implementation of the method described in "Direct parsing to sentiment graphs" (Samuel _et al._, ACL 2022). The main repository, which also contains the scripts for training the model, can be found on the project [GitHub](https://github.com/jerbarnes/direct_parsing_to_sent_graph).

The sentiment graph model is based on an underlying masked language model – [NorBERT 2](https://huggingface.co/ltg/norbert2).

The method proposes three different ways to encode the sentiment graph: "node-centric", "labeled-edge", and "opinion-tuple".

The current model:

- uses the "labeled-edge" graph encoding
- does not use character-level embeddings
- sets all other hyperparameters to their [default values](https://github.com/jerbarnes/direct_parsing_to_sent_graph/blob/main/perin/config/edge_norec.yaml)

It achieves the following results on the held-out set of the NoReC dataset:

| Unlabeled sentiment tuple F1 | Target F1  | Relative polarity precision |
|:----------------------------:|:----------:|:---------------------------:|
| 0.434                        | 0.541      | 0.926                       |

In "Word Substitution with Masked Language Models as Data Augmentation for Sentiment Analysis", we analyzed data augmentation strategies for improving the performance of the model. Using masked language modeling (MLM), we augmented the sentences with MLM-substituted words inside, outside, or both inside and outside the actual sentiment tuples. The results below, reported as percentages, show that augmentation may improve model performance. This space, however, runs the original model trained without augmentation.

|                | Augmentation rate | Unlabeled sentiment tuple F1 | Target F1 | Relative polarity precision |
|----------------|-------------------|------------------------------|-----------|-----------------------------|
| Baseline       | 0%                | 43.39                        | 54.13     | 92.59                       |
| Outside        | 59%               | **45.08**                    | 56.18     | 92.95                       |
| Inside         | 9%                | 43.38                        | 55.62     | 92.49                       |
| Inside+Outside | 27%               | 44.12                        | **56.44** | **93.19**                   |

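
To make the comparison with the baseline explicit, the short sketch below computes the absolute change of each augmentation setting relative to the baseline. The numbers are copied from the table above; the variable names (`baseline`, `augmented`, `gains`) are illustrative only and are not part of this space's code.

```python
# Scores copied from the table above, in the order:
# (unlabeled sentiment tuple F1, target F1, relative polarity precision).
baseline = (43.39, 54.13, 92.59)
augmented = {
    'Outside': (45.08, 56.18, 92.95),
    'Inside': (43.38, 55.62, 92.49),
    'Inside+Outside': (44.12, 56.44, 93.19),
}

# Absolute change in percentage points relative to the baseline.
gains = {
    name: tuple(round(score - base, 2) for score, base in zip(scores, baseline))
    for name, scores in augmented.items()
}

for name, deltas in gains.items():
    print(name, deltas)
```

With these numbers, "Outside" and "Inside+Outside" improve all three metrics, while "Inside" improves only the target F1.
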

The model can be used to predict sentiment tuples as follows:

```python
>>> import model_wrapper
>>> model = model_wrapper.PredictionModel()
>>> model.predict(['vi liker svart kaffe'])
[{'sent_id': '0',
  'text': 'vi liker svart kaffe',
  'opinions': [{'Source': [['vi'], ['0:2']],
                'Target': [['svart', 'kaffe'], ['9:14', '15:20']],
                'Polar_expression': [['liker'], ['3:8']],
                'Polarity': 'Positive'}]}]
```
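
Each element of an opinion other than `Polarity` pairs a list of tokens with a list of `start:end` strings, which in the example above match Python-style character offsets into `text`. The sketch below recovers the tokens from those offsets. The helper `span_texts` is hypothetical and not part of `model_wrapper`, and the prediction is hard-coded from the example above so that the snippet runs without loading the model.

```python
# Prediction copied from the example output above (no model needed to run this).
prediction = {
    'sent_id': '0',
    'text': 'vi liker svart kaffe',
    'opinions': [{'Source': [['vi'], ['0:2']],
                  'Target': [['svart', 'kaffe'], ['9:14', '15:20']],
                  'Polar_expression': [['liker'], ['3:8']],
                  'Polarity': 'Positive'}],
}


def span_texts(text, spans):
    """Hypothetical helper: slice `text` with 'start:end' character offsets."""
    pieces = []
    for span in spans:
        start, end = (int(part) for part in span.split(':'))
        pieces.append(text[start:end])
    return pieces


for opinion in prediction['opinions']:
    for role in ('Source', 'Target', 'Polar_expression'):
        tokens, spans = opinion[role]
        # Assumption: offsets are end-exclusive, as in the example above.
        assert span_texts(prediction['text'], spans) == tokens
        print(role, tokens, spans)
    print('Polarity', opinion['Polarity'])
```

Whether every prediction follows the same offset convention is an assumption drawn from this single example, so treat the check as a sanity test rather than a guarantee.
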
'''

with gr.Blocks() as demo:
    with gr.Row() as row:
        text_input = gr.Textbox(label="input")
        text_output = gr.Textbox(label="output")
    with gr.Row() as row:
        text_button = gr.Button("submit")
    text_button.click(fn=predict, inputs=text_input, outputs=text_output)
    gr.Markdown(markdown_text)

demo.launch()