Report for ProsusAI/finbert
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 3 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_allagree, split train).
👉Robustness issues (1)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | medium 🟡 | — | Fail rate = 0.053 | Add typos | 53/1000 tested samples (5.3%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 5.3% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
263 | Finnish Talvivaara Mining Co HEL : TLV1V said Thursday it had picked BofA Merrill Lynch and JPMorgan NYSE : JPM as joint bookrunners of its planned issue of convertible notes worth up to EUR250m USD332m . | Finnish RTakvivaara Muning Co HEL : TLV1V said Thursday it hadp iced NofA Merrill Lynch and JPMoran NYSE : JPM as joint oboirunners of its planned isdsue of convertible nloes worth up to EUR2500n USD332m . | positive (p = 0.72) | neutral (p = 0.72) |
214 | Earnings per share for January-June 2010 were EUR0 .30 , an increase of 20 % year-on-year EUR0 .25 . | aErnings per share for January-Jyune 2010 were EUR0 .30 , an increaswe of 20 % ear-on-year EUR0 .25 .. | positive (p = 0.96) | negative (p = 0.69) |
1907 | Scanfil issued a profit warning on 10 April 2006 . | Scqnfil issued a profjit sarning on 10 April 2006 . | negative (p = 0.95) | neutral (p = 0.94) |
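
To make the fail-rate metric concrete, here is a minimal sketch of how such a number can be computed. Everything in it is an assumption for illustration: `add_typos` is a simplified swap-only stand-in (the examples above suggest the real transformation also inserts, deletes, and substitutes characters), and `toy_predict` is a keyword rule, not FinBERT.

```python
import random


def add_typos(text, rate=0.1, seed=0):
    """Hypothetical stand-in for the scan's "Add typos" transformation.

    Swaps adjacent letters with probability `rate`.
    """
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so a letter moves at most once
        else:
            i += 1
    return "".join(chars)


def fail_rate(predict, texts, perturb):
    """Fraction of texts whose predicted label changes after perturbation."""
    changed = sum(1 for t in texts if predict(t) != predict(perturb(t)))
    return changed / len(texts)


# Toy keyword classifier, an assumption for illustration only (not FinBERT).
def toy_predict(text):
    return "negative" if "warning" in text else "neutral"


texts = ["Scanfil issued a profit warning on 10 April 2006 .", "No news today ."]
# Perturbation copied from row 1907 of the table above ("warning" -> "sarning").
row_1907_typo = lambda t: t.replace("warning", "sarning")
print(fail_rate(toy_predict, texts, row_1907_typo))  # 0.5: one of two predictions flips
print(53 / 1000)  # 0.053, the fail rate reported by the scan
```

To reproduce the reported figure, you would swap `toy_predict` for the real model's prediction function and run it over the 1000 tested samples.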
👉Performance issues (2)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_whitespace(text) < 0.160 AND avg_whitespace(text) >= 0.156 | Precision = 0.917 | — | 5.67% lower than global |
🔍✨Examples
For records in the dataset where `avg_whitespace(text)` < 0.160 AND `avg_whitespace(text)` >= 0.156, the Precision is 5.67% lower than the global Precision.
Index | text | avg_whitespace(text) | label | Predicted label |
---|---|---|---|---|
533 | According to Finnair Technical Services , the measure is above all due to the employment situation . | 0.16 | neutral | positive (p = 0.50) |
841 | Previously , EB delivered a custom solution for LG Electronics and now is making it commercially available for other mobile terminal vendors as well as to wireless operators . | 0.16 | positive | neutral (p = 0.76) |
1062 | The contract covers turnkey deliveries to all five airports operated by the authority -- John F Kennedy , LaGuardia , Newark , Teterboro and Stewart International . | 0.158537 | neutral | positive (p = 0.70) |
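
The report does not define `avg_whitespace`. The sketch below assumes it is the share of whitespace characters in the text, which reproduces the values shown for rows 533 and 1062. The helper `in_slice` is a hypothetical name for the half-open interval check that the slice rule describes.

```python
def avg_whitespace(text):
    """Share of characters in `text` that are whitespace (assumed definition)."""
    return sum(ch.isspace() for ch in text) / len(text)


def in_slice(value, low, high):
    """Half-open interval test, matching the `>= low AND < high` slice rule."""
    return low <= value < high


row_533 = ("According to Finnair Technical Services , the measure is "
           "above all due to the employment situation .")
row_1062 = ("The contract covers turnkey deliveries to all five airports "
            "operated by the authority -- John F Kennedy , LaGuardia , Newark , "
            "Teterboro and Stewart International .")

print(round(avg_whitespace(row_533), 6))   # 0.16
print(round(avg_whitespace(row_1062), 6))  # 0.158537
print(in_slice(avg_whitespace(row_1062), 0.156, 0.160))  # True
```

Note that row 533 scores exactly 0.16, which the strict `< 0.160` bound would exclude. Since the scan still lists it in the slice, the displayed bounds are probably rounded.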
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | text_length(text) >= 99.500 AND text_length(text) < 106.500 | Precision = 0.921 | — | 5.22% lower than global |
🔍✨Examples
For records in the dataset where `text_length(text)` >= 99.500 AND `text_length(text)` < 106.500, the Precision is 5.22% lower than the global Precision.
Index | text | text_length(text) | label | Predicted label |
---|---|---|---|---|
533 | According to Finnair Technical Services , the measure is above all due to the employment situation . | 100 | neutral | positive (p = 0.50) |
1413 | The contract also includes installation work in a new multistorey carpark for close on 1,000 vehicles . | 103 | neutral | positive (p = 0.55) |
1496 | The repo rate will gradually reach 2 % at the end of 2010 , according to Nordea 's Economic Outlook . | 101 | neutral | positive (p = 0.51) |
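
The report gives each slice's Precision and its deviation but not the global Precision itself. As a rough sanity check, the sketch below assumes `text_length` is the character count (which matches row 533) and that the deviation is relative, i.e. (slice - global) / global. Both are assumptions, and `implied_global` is a hypothetical helper name.

```python
def text_length(text):
    """Number of characters in `text` (assumed definition)."""
    return len(text)


def implied_global(slice_metric, relative_deviation):
    """Solve slice = global * (1 + deviation) for the global metric."""
    return slice_metric / (1 + relative_deviation)


row_533 = ("According to Finnair Technical Services , the measure is "
           "above all due to the employment situation .")

print(text_length(row_533))                      # 100
print(99.5 <= text_length(row_533) < 106.5)      # True
print(round(implied_global(0.917, -0.0567), 3))  # 0.972
print(round(implied_global(0.921, -0.0522), 3))  # 0.972
```

Under these assumptions both performance findings point to the same global Precision of about 0.972. Reading the deviation as absolute percentage points would give about 0.973 instead, so the two interpretations cannot be told apart from this report alone.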
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Check out the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!