Report for ahmedrachid/FinancialBERT-Sentiment-Analysis

#28
by inoki-giskard - opened

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 3 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_66agree, split train).

👉Robustness issues (3)
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.431 | Transform to uppercase | 431/1000 tested samples (43.1%) changed prediction after perturbation |

🔍✨ Examples

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 43.1% of the cases. We expected the predictions not to be affected by this transformation.

| Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 853 | Strong growth has continued also in China . | STRONG GROWTH HAS CONTINUED ALSO IN CHINA . | positive (p = 1.00) | neutral (p = 1.00) |
| 1573 | `` P&O Ferries now has a very efficient and powerful vessel for its Dover to Calais route , '' head of the shipbuilder 's Rauma yard , Timo Suistio , said . | `` P&O FERRIES NOW HAS A VERY EFFICIENT AND POWERFUL VESSEL FOR ITS DOVER TO CALAIS ROUTE , '' HEAD OF THE SHIPBUILDER 'S RAUMA YARD , TIMO SUISTIO , SAID . | positive (p = 1.00) | neutral (p = 1.00) |
| 256 | Revenue grew 12 percent to ( x20ac ) 3.6 billion ( US$ 4.5 billion ) . | REVENUE GREW 12 PERCENT TO ( X20AC ) 3.6 BILLION ( US$ 4.5 BILLION ) . | positive (p = 1.00) | neutral (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.422 | Transform to title case | 422/1000 tested samples (42.2%) changed prediction after perturbation |

🔍✨ Examples

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 42.2% of the cases. We expected the predictions not to be affected by this transformation.

| Index | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 852 | Sales VAT inclusive expanded by 19 percent , to 351 million euros . | Sales Vat Inclusive Expanded By 19 Percent , To 351 Million Euros . | positive (p = 1.00) | neutral (p = 1.00) |
| 1573 | `` P&O Ferries now has a very efficient and powerful vessel for its Dover to Calais route , '' head of the shipbuilder 's Rauma yard , Timo Suistio , said . | `` P&O Ferries Now Has A Very Efficient And Powerful Vessel For Its Dover To Calais Route , '' Head Of The Shipbuilder 'S Rauma Yard , Timo Suistio , Said . | positive (p = 1.00) | neutral (p = 1.00) |
| 3622 | Finnish Suominen Flexible Packaging is cutting 48 jobs in its unit in Tampere and two in Nastola , in Finland . | Finnish Suominen Flexible Packaging Is Cutting 48 Jobs In Its Unit In Tampere And Two In Nastola , In Finland . | negative (p = 1.00) | neutral (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.111 | Add typos | 111/1000 tested samples (11.1%) changed prediction after perturbation |

🔍✨ Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 11.1% of the cases. We expected the predictions not to be affected by this transformation.

| Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 1743 | Diluted loss per share stood at EUR 0.15 versus EUR 0.26 . | Dilurted poss per share stood at EUR 0.15 versus EUR 0.26 . | positive (p = 0.97) | negative (p = 0.99) |
| 1296 | In the next few years , the ICT sector 's share of electricity consumption will be raised by the increase in the popularity of smartphones . | In te nrxt few years , the ICT zecrtor 's share of electricity consumption will ge raised by the incerase in the popularity of smartphones . | neutral (p = 0.98) | positive (p = 1.00) |
| 3376 | The orders are for 26 machine-room-less KONE MonoSpace elevators , which would be installed during 2006 . | The orders wre for 26 machine-rlom-lress KONE MonoSpace elevators which woulc be installed during 2006 . | neutral (p = 1.00) | positive (p = 0.60) |

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
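
To review a finding by hand, the fail rate reported above can be recomputed as the share of samples whose predicted label changes after a perturbation. The sketch below is only an illustration of that metric, not the scan's actual implementation: `fail_rate`, `add_typos`, and `toy_predict` are hypothetical names, the typo perturbation is a simple adjacent-letter swap chosen for this example, and `toy_predict` is a case-sensitive stand-in for the real model so the snippet runs without downloading anything.

```python
import random
from typing import Callable, Sequence


def fail_rate(predict: Callable[[str], str],
              texts: Sequence[str],
              transform: Callable[[str], str]) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must be non-empty")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Hypothetical typo perturbation: randomly swap adjacent letters."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair
        else:
            i += 1
    return "".join(chars)


def toy_predict(text: str) -> str:
    """Case-sensitive keyword classifier; a stand-in, NOT the real model."""
    if "growth" in text or "grew" in text:
        return "positive"
    if "loss" in text or "cutting" in text:
        return "negative"
    return "neutral"


# A few sentences taken from the example tables in this report.
samples = [
    "Strong growth has continued also in China .",
    "Revenue grew 12 percent to ( x20ac ) 3.6 billion ( US$ 4.5 billion ) .",
    "Finnish Suominen Flexible Packaging is cutting 48 jobs in its unit "
    "in Tampere and two in Nastola , in Finland .",
    "The orders are for 26 machine-room-less KONE MonoSpace elevators , "
    "which would be installed during 2006 .",
]

uppercase_rate = fail_rate(toy_predict, samples, str.upper)
title_rate = fail_rate(toy_predict, samples, str.title)
typo_rate = fail_rate(toy_predict, samples, add_typos)

print(f"uppercase fail rate:  {uppercase_rate:.3f}")  # 0.750 with the toy classifier
print(f"title case fail rate: {title_rate:.3f}")      # 0.750 with the toy classifier
print(f"typo fail rate:       {typo_rate:.3f}")
```

To check the real model, `toy_predict` would be swapped for a function that returns the model's top label for a string, and `samples` for the evaluated split. The numbers printed here describe the toy classifier only and are not comparable to the rates in the report.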

💡 What's Next?

  • Check out the Giskard Space and improve your model.
  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!
