Report for soleimanian/financial-roberta-large-sentiment
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 3 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).
👉Robustness issues (2)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.075 | Transform to uppercase | 75/1000 tested samples (7.5%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 7.5% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
580 | Okmetic Board of Directors has also decided on a new share ownership program directed to the company 's top management . | OKMETIC BOARD OF DIRECTORS HAS ALSO DECIDED ON A NEW SHARE OWNERSHIP PROGRAM DIRECTED TO THE COMPANY 'S TOP MANAGEMENT . | neutral (p = 0.70) | positive (p = 0.79) |
823 | In the end of 2006 , the number of outlets will rise to 60-70 . | IN THE END OF 2006 , THE NUMBER OF OUTLETS WILL RISE TO 60-70 . | positive (p = 1.00) | negative (p = 1.00) |
1444 | The group reiterated its forecast that handset manufacturers will sell around 915 mln units this year globally . | THE GROUP REITERATED ITS FORECAST THAT HANDSET MANUFACTURERS WILL SELL AROUND 915 MLN UNITS THIS YEAR GLOBALLY . | neutral (p = 1.00) | positive (p = 1.00) |
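
The fail rate reported above is the share of tested samples whose predicted label changes after the perturbation (75/1000 = 0.075). A minimal Python sketch of that calculation follows. It is not the scanner's own code: `fail_rate` and `toy_predict` are hypothetical names, and the toy classifier is only a stand-in so the example runs without downloading the model.

```python
from typing import Callable, List, Sequence


def fail_rate(
    predict: Callable[[List[str]], Sequence[str]],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of samples whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must be non-empty")
    original = predict(list(texts))
    perturbed = predict([transform(t) for t in texts])
    changed = sum(o != p for o, p in zip(original, perturbed))
    return changed / len(texts)


def toy_predict(texts: List[str]) -> List[str]:
    # Hypothetical stand-in classifier: a case-sensitive keyword match,
    # chosen because it breaks under uppercasing just like the cases above.
    return ["positive" if "rise" in t else "neutral" for t in texts]


samples = [
    "outlets will rise to 60-70",
    "the board decided on a new program",
]
rate = fail_rate(toy_predict, samples, str.upper)
print(rate)  # 0.5: only the first sample flips from positive to neutral
```

To check the real model, `toy_predict` could be replaced by any function that maps a list of strings to a list of predicted labels.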
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.068 | Add typos | 68/1000 tested samples (6.8%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 6.8% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
355 | In Q2 of 2009 , profit before taxes amounted to EUR 13.6 mn , down from EUR 26.8 mn in Q2 of 2008 . | In Q2 of 2009 , profit before taxes amounted t EUR 13.6 mn , doeb from EUR 2Y6.8 mn in Q2 of 2008 . | negative (p = 1.00) | positive (p = 1.00) |
173 | Operating profit was EUR 11.4 mn , up from EUR 7.5 mn . | Operafing profit wa sEUR 11.4 mn ,u p trom EUR 7.%5 mn . | positive (p = 1.00) | neutral (p = 0.98) |
2126 | Coca-Cola was the market leader of manufacturers with a market share of 36.9 % , down 2.2 % from the corresponding period in 2004-2005 . | Coca-Cola was ghe market leader of nanufacturers with a market share of 36.9 % , don 2.2 % from the corresponding period in 2004-2005 . | negative (p = 1.00) | positive (p = 1.00) |
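
For readers who want to try a similar perturbation locally, here is a rough Python sketch of a typo generator. It is an assumption-laden illustration, not the transformation the scan actually applied: `add_typos`, its `rate` and `seed` parameters, and the three edit operations (delete, swap with the next character, replace with a random letter) are all hypothetical choices.

```python
import random
import string


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Perturb each letter with probability `rate` (illustrative only)."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(("delete", "swap", "replace"))
            if op == "delete":
                i += 1  # drop the letter
                continue
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], c])  # transpose with next character
                i += 2
                continue
            # "replace", or a swap requested on the last character
            out.append(rng.choice(string.ascii_lowercase))
            i += 1
            continue
        out.append(c)
        i += 1
    return "".join(out)


sentence = "Operating profit was EUR 11.4 mn , up from EUR 7.5 mn ."
noisy = add_typos(sentence, rate=0.1, seed=42)
print(noisy)
```

Comparing the model's label on `sentence` and on `noisy` over many samples gives the same kind of fail rate as in the table above. Letters are the only characters deleted or replaced here, so digits and punctuation survive, although a swap can shift them by one position.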
👉Performance issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_word_length(text) < 3.860 AND avg_word_length(text) >= 3.699 | Balanced Accuracy = 0.892 | — | 5.29% lower than global |
🔍✨Examples
For records in the dataset where `avg_word_length(text)` < 3.860 AND `avg_word_length(text)` >= 3.699, the Balanced Accuracy is 5.29% lower than the global Balanced Accuracy.

Index | text | avg_word_length(text) | label | Predicted label |
---|---|---|---|---|
567 | It will provide heating in the form of hot water for the sawmill 's needs . | 3.75 | neutral | positive (p = 0.64) |
1121 | Upon completion of the sale Proha would get some USD12 .7 m for its stake in Artemis . | 3.83333 | neutral | positive (p = 0.99) |
1140 | 3 January 2011 - Scandinavian lenders Sampo Bank ( HEL : SAMAS ) , Pohjola Bank ( HEL : POH1S ) and Svenska Handelsbanken ( STO : SHB A ) have provided a EUR160m ( USD213m ) line of credit to Lemminkainen Oyj ( HEL : LEM1S ) , the Finnish construction firm said on Friday . | 3.80702 | neutral | positive (p = 0.99) |
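
The slice and metric above can be sketched in a few lines of Python. The definition of `avg_word_length` used here (mean character length of whitespace-separated tokens) is an assumption, though it does reproduce the 3.75 and 3.83333 values shown for rows 567 and 1121. `in_slice` and `balanced_accuracy` are hypothetical helper names, with balanced accuracy taken as the mean of per-class recall.

```python
from collections import defaultdict
from typing import Sequence


def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (assumed definition)."""
    tokens = text.split()
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


def in_slice(text: str, low: float = 3.699, high: float = 3.860) -> bool:
    """True when the text falls in the slice flagged by the report."""
    return low <= avg_word_length(text) < high


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


example = "It will provide heating in the form of hot water for the sawmill 's needs ."
example_length = avg_word_length(example)
print(example_length)     # 3.75, matching row 567
print(in_slice(example))  # True
```

Filtering the dataset with `in_slice` and comparing `balanced_accuracy` on that subset with the value on the full dataset would give a deviation of the kind reported in the table.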
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Check out the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!