Report for mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 7 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_50agree, split train).
👉Robustness issues (3)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.322 | Transform to uppercase | 322/1000 tested samples (32.2%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 32.2% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
996 | These moderate but significant changes resulted in a significant 24-32 % reduction in the estimated CVD risk . | THESE MODERATE BUT SIGNIFICANT CHANGES RESULTED IN A SIGNIFICANT 24-32 % REDUCTION IN THE ESTIMATED CVD RISK . | positive (p = 1.00) | neutral (p = 1.00) |
300 | The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % . | THE STOCK ROSE FOR A SECOND DAY ON WEDNESDAY BRINGING ITS TWO-DAY RISE TO GBX12 .0 OR 2.0 % . | positive (p = 1.00) | neutral (p = 1.00) |
4737 | In food trade , sales amounted to EUR320 .1 m , a decline of 1.1 % . | IN FOOD TRADE , SALES AMOUNTED TO EUR320 .1 M , A DECLINE OF 1.1 % . | negative (p = 1.00) | neutral (p = 1.00) |
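
The fail rate reported above is the share of tested samples whose predicted label changes once the text is transformed. A minimal sketch of that metric follows. It assumes a `predict` callable that maps a string to a label; `toy_predict` and the sample sentences are hypothetical stand-ins, not the scanned model or the scanner's own code.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    # Hypothetical case-sensitive classifier, used only to exercise fail_rate.
    if "rose" in text:
        return "positive"
    if "decline" in text:
        return "negative"
    return "neutral"


# Made-up sample sentences for illustration.
samples = [
    "The stock rose for a second day .",
    "Sales amounted to EUR 320 m , a decline of 1.1 % .",
    "THE COMPANY IS BASED IN HELSINKI .",
    "The agreement was signed today .",
]

rate = fail_rate(toy_predict, samples, str.upper)
print(f"Fail rate = {rate:.3f}")  # Fail rate = 0.500
```

To try this on the real model, `toy_predict` could be swapped for a wrapper around the model's inference call, and `str.upper` for `str.title` to mirror the title-case check.
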
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.095 | Transform to title case | 95/1000 tested samples (9.5%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 9.5% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
4737 | In food trade , sales amounted to EUR320 .1 m , a decline of 1.1 % . | In Food Trade , Sales Amounted To Eur320 .1 M , A Decline Of 1.1 % . | negative (p = 1.00) | neutral (p = 0.99) |
4512 | Bioheapleaching makes extraction of metals from low grade ore economically viable . | Bioheapleaching Makes Extraction Of Metals From Low Grade Ore Economically Viable . | positive (p = 1.00) | neutral (p = 0.92) |
2222 | The earnings in the comparative period included a capital gain of EUR 8mn from the sale of OMX shares . | The Earnings In The Comparative Period Included A Capital Gain Of Eur 8Mn From The Sale Of Omx Shares . | positive (p = 0.98) | neutral (p = 0.97) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.095 | Add typos | 95/1000 tested samples (9.5%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 9.5% of the cases. We expected the predictions not to be affected by this transformation.

Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
4060 | Finnish textiles and clothing group Marimekko Oyj posted a net profit of 7.99 mln euro $ 10.4 mln for 2006 , compared to 8.4 mln euro $ 10.9 mln for 2005 . | Finnish textiles and clothing roup Jarkmekko yj posted a net profiyt pof 7.99 mln euro $ 10.4 mln for 2006 , compared to 8.4 mn eurp $ 10.9 mln for 2005 . | negative (p = 0.99) | positive (p = 0.99) |
634 | The Vaisala Group is a successful international technology company that develops , manufactures and markets electronic measurement systems and products . | The Vaisala Group is a successful international tchnology company that develops , manufacfures and markets electronic measurement systems nad prducts . | neutral (p = 0.71) | positive (p = 0.98) |
2295 | Profit after taxes was EUR 0.1 mn , compared to EUR -0.4 mn the previous year . | Profit after taxes was EUR 0.1 mn , conpared to EUR -0.4 mn yhe peevious yewd . | positive (p = 1.00) | neutral (p = 1.00) |
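
For readers who want to try a similar check locally, here is a rough sketch of a typo perturbation. It is an assumption-laden simplification: `add_typos` is a hypothetical helper that only swaps two adjacent letters inside purely alphabetic words, whereas the examples above also show dropped and substituted characters. It is not the scanner's implementation.

```python
import random


def add_typos(text: str, rate: float = 0.2, seed: int = 0) -> str:
    """Swap two adjacent letters in some alphabetic words (hypothetical helper).

    Tokens containing digits or punctuation are left alone, so figures
    such as "0.1" or "-0.4" are not altered.
    """
    rng = random.Random(seed)  # seeded for reproducible perturbations
    perturbed = []
    for word in text.split(" "):
        if len(word) > 3 and word.isalpha() and rng.random() < rate:
            i = rng.randrange(len(word) - 1)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        perturbed.append(word)
    return " ".join(perturbed)


original = (
    "Profit after taxes was EUR 0.1 mn , compared to EUR -0.4 mn "
    "the previous year ."
)
print(add_typos(original, rate=1.0))
```

Leaving numeric tokens untouched is a deliberate choice in this sketch: changing a figure could change the true sentiment, which would make a prediction flip harder to interpret.
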
👉Performance issues (4)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | avg_digits(text) < 0.006 | Balanced Accuracy = 0.741 | — | -10.63% than global |
🔍✨Examples
For records in the dataset where `avg_digits(text)` < 0.006, the Balanced Accuracy is 10.63% lower than the global Balanced Accuracy.

Index | text | avg_digits(text) | label | Predicted label |
---|---|---|---|---|
21 | ( Filippova ) A trilateral agreement on investment in the construction of a technology park in St Petersburg was to have been signed in the course of the forum , Days of the Russian Economy , that opened in Helsinki today . | 0 | positive | neutral (p = 0.80) |
47 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 0 | positive | neutral (p = 1.00) |
60 | The company supports its global customers in developing new technologies and offers a fast route from product development to applications and volume production . | 0 | neutral | positive (p = 1.00) |
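
The slice comparison above can be sketched as follows. This assumes that `avg_digits` is the share of characters that are digits and that Balanced Accuracy is the mean of per-class recall. The records are made-up toy data, not rows from financial_phrasebank.

```python
from collections import defaultdict


def avg_digits(text: str) -> float:
    """Share of characters that are digits (assumed definition)."""
    return sum(c.isdigit() for c in text) / len(text) if text else 0.0


def balanced_accuracy(y_true, y_pred) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy (text, label, predicted label) records, invented for illustration.
records = [
    ("The companies signed a cooperation agreement .", "positive", "neutral"),
    ("The firm is headquartered in Helsinki .", "neutral", "neutral"),
    ("Operating profit rose to EUR 13.1 mn from EUR 8.7 mn .", "positive", "positive"),
    ("Net sales fell by 12 % to EUR 45.2 mn .", "negative", "negative"),
]

in_slice = [r for r in records if avg_digits(r[0]) < 0.006]

global_ba = balanced_accuracy([r[1] for r in records], [r[2] for r in records])
slice_ba = balanced_accuracy([r[1] for r in in_slice], [r[2] for r in in_slice])

print(f"global = {global_ba:.3f}, slice = {slice_ba:.3f}")
# global = 0.833, slice = 0.500
```
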
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_whitespace(text) < 0.161 AND avg_whitespace(text) >= 0.132 | Balanced Accuracy = 0.758 | — | -8.64% than global |
🔍✨Examples
For records in the dataset where `avg_whitespace(text)` < 0.161 AND `avg_whitespace(text)` >= 0.132, the Balanced Accuracy is 8.64% lower than the global Balanced Accuracy.

Index | text | avg_whitespace(text) | label | Predicted label |
---|---|---|---|---|
47 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 0.153488 | positive | neutral (p = 1.00) |
60 | The company supports its global customers in developing new technologies and offers a fast route from product development to applications and volume production . | 0.142857 | neutral | positive (p = 1.00) |
74 | Finnish real estate investor Sponda Plc said on Wednesday 12 March that it has signed agreements with Danske Bank A-S , Helsinki Branch for a 7-year EUR150m credit facility and with Ilmarinen Mutual Pension Insurance Company for a 7-year EUR50m credit facility . | 0.160305 | neutral | positive (p = 1.00) |
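
The `avg_whitespace` values in this table are consistent with a simple ratio of whitespace characters to total characters. The sketch below assumes that definition and checks it against row 60. Matching the reported value suggests, but does not prove, that the scanner computes the feature this way.

```python
def avg_whitespace(text: str) -> float:
    """Whitespace characters divided by total characters (assumed definition)."""
    return sum(c.isspace() for c in text) / len(text) if text else 0.0


# Text of row 60 as printed in the table above.
row_60 = (
    "The company supports its global customers in developing new "
    "technologies and offers a fast route from product development "
    "to applications and volume production ."
)

value = avg_whitespace(row_60)
print(f"{value:.6f}")  # 0.142857 (23 spaces out of 161 characters)

# Membership test for the slice reported above.
in_slice = 0.132 <= value < 0.161
print(in_slice)  # True
```
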
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_word_length(text) >= 4.770 AND avg_word_length(text) < 5.586 | Balanced Accuracy = 0.768 | — | -7.43% than global |
🔍✨Examples
For records in the dataset where `avg_word_length(text)` >= 4.770 AND `avg_word_length(text)` < 5.586, the Balanced Accuracy is 7.43% lower than the global Balanced Accuracy.

Index | text | avg_word_length(text) | label | Predicted label |
---|---|---|---|---|
47 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 5.35294 | positive | neutral (p = 1.00) |
74 | Finnish real estate investor Sponda Plc said on Wednesday 12 March that it has signed agreements with Danske Bank A-S , Helsinki Branch for a 7-year EUR150m credit facility and with Ilmarinen Mutual Pension Insurance Company for a 7-year EUR50m credit facility . | 5.11628 | neutral | positive (p = 1.00) |
91 | `` They would invest not only in the physical infrastructure , but would also provide know-how for managing and developing science and technology parks , '' said Sunrise Valley director Andrius Bagdonas . | 5.21212 | positive | neutral (p = 1.00) |
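
The `avg_word_length` column appears to be the mean length of whitespace-separated tokens, with punctuation tokens counted as words. The sketch below assumes that definition and reproduces the value shown for row 47. The scanner's actual implementation may differ in edge cases.

```python
def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (assumed definition)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


# Text of row 47 as printed in the table above.
row_47 = (
    "The agreement was signed with Biohit Healthcare Ltd , the UK-based "
    "subsidiary of Biohit Oyj , a Finnish public company which develops , "
    "manufactures and markets liquid handling products and diagnostic test "
    "systems ."
)

value = avg_word_length(row_47)
print(f"{value:.5f}")  # 5.35294 (182 characters over 34 tokens)

# Membership test for the slice reported above.
in_slice = 4.770 <= value < 5.586
print(in_slice)  # True
```
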
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_whitespace(text) < 0.172 AND avg_whitespace(text) >= 0.162 | Balanced Accuracy = 0.788 | — | -5.06% than global |
🔍✨Examples
For records in the dataset where `avg_whitespace(text)` < 0.172 AND `avg_whitespace(text)` >= 0.162, the Balanced Accuracy is 5.06% lower than the global Balanced Accuracy.

Index | text | avg_whitespace(text) | label | Predicted label |
---|---|---|---|---|
135 | `` After this purchase , Cramo will become the second largest rental services provider in the Latvian market . | 0.163636 | positive | neutral (p = 1.00) |
270 | Previously the company has estimated its operating profit to reach the level of 2005 only . | 0.164835 | positive | neutral (p = 1.00) |
298 | The increase in capital stock has been registered in the Finnish Trade Register on 20 November 2006 . | 0.168317 | positive | neutral (p = 1.00) |
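
The report gives each slice's Balanced Accuracy and its deviation, but not the global value itself. If the deviation is read as a relative difference, (slice - global) / global, the global figure can be backed out from each row. The sketch below does that arithmetic under that assumption. The implied global value is an inference, not a number taken from the scan.

```python
# (slice Balanced Accuracy, reported deviation) for the four performance issues.
reported = {
    "avg_digits < 0.006": (0.741, -0.1063),
    "0.132 <= avg_whitespace < 0.161": (0.758, -0.0864),
    "4.770 <= avg_word_length < 5.586": (0.768, -0.0743),
    "0.162 <= avg_whitespace < 0.172": (0.788, -0.0506),
}

implied_global = {}
for name, (slice_ba, deviation) in reported.items():
    # deviation = (slice - global) / global  =>  global = slice / (1 + deviation)
    implied_global[name] = slice_ba / (1 + deviation)
    print(f"{name}: implied global = {implied_global[name]:.3f}")
```

All four rows imply a global Balanced Accuracy of roughly 0.83, which is consistent with the relative-difference reading.
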
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Check out the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!