---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
tags:
- ESG
---

# SASB ESG Category Classifier (Stage 2)

The SASB ESG category classifier is a fine-tuned [Universal Sentence Encoder (USE)](https://www.kaggle.com/models/google/universal-sentence-encoder) model that classifies ESG sentences into one of 26 Sustainability Accounting Standards Board (SASB) standards categories. In our experiments, the USE model substantially outperformed BERT. For a full description of our training data, please refer to https://www.kaggle.com/datasets/edwardjunprung/sasb-aligned-esg-sentences.

Our classifier consists of a two-stage pipeline:

1. **[Stage 1](https://huggingface.co/ejunprung/SASB-ESG-Sentence-Classifier)** - Classify sentences as ESG or not.
2. **[Stage 2](https://huggingface.co/ejunprung/SASB-ESG-Classification-26Categories)** - Bucket the ESG sentences into one of [26 SASB categories](https://sasb.org/standards/materiality-finder/).

## Goal

The objective is to categorize sentences within ESG documents in order to evaluate corporate ESG alignment. For example, after analyzing all sentences in Activision's annual ESG report, the SASB ESG model assigned more than 40% of them to the Diversity & Inclusion and Human Rights SASB categories. This suggests that Activision places significant emphasis on these initiatives, which positions it as a potential candidate for investment funds with social impact mandates.
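The report-level analysis described in the Goal section can be sketched as a simple tally of Stage 2 predictions. This is a minimal illustration, not the authors' pipeline: the function name `category_shares` and the sample predictions are hypothetical, and the sketch assumes the sentences have already been classified and their labels converted to category names.

```python
from collections import Counter


def category_shares(predicted_categories):
    """Return each category's share of the classified sentences, largest first."""
    counts = Counter(predicted_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders categories by descending count.
    return {category: n / total for category, n in counts.most_common()}


# Hypothetical Stage 2 predictions for the ESG sentences of one report.
predictions = [
    "Human Rights & Community Relations",
    "GHG Emissions",
    "Human Rights & Community Relations",
    "Data Security",
    "Energy Management",
]

shares = category_shares(predictions)
for category, share in shares.items():
    print(f"{category}: {share:.0%}")
```

Categories with a large share point to the topics a report emphasizes, which is the signal used in the Activision example.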
## Model Output

The SASB ESG category classifier predicts one of 26 classes, which map to SASB categories as follows:

| SASB Category                                      | Model Output |
|----------------------------------------------------|:------------:|
| GHG Emissions                                      | 8            |
| Air Quality                                        | 5            |
| Energy Management                                  | 7            |
| Water & Wastewater Management                      | 10           |
| Waste & Hazardous Materials Management             | 9            |
| Ecological Impacts                                 | 6            |
| Human Rights & Community Relations                 | 23           |
| Customer Privacy                                   | 20           |
| Data Security                                      | 22           |
| Access & Affordability                             | 19           |
| Product Quality & Safety                           | 24           |
| Customer Welfare                                   | 21           |
| Selling Practices & Product Labeling               | 25           |
| Labor Practices                                    | 13           |
| Employee Health & Safety                           | 12           |
| Employee Engagement Diversity & Inclusion          | 11           |
| Product Design & Lifecycle Management              | 3            |
| Business Model Resilience                          | 0            |
| Supply Chain Management                            | 4            |
| Materials Sourcing & Efficiency                    | 1            |
| Physical Impacts of Climate Change                 | 2            |
| Business Ethics                                    | 14           |
| Competitive Behavior                               | 15           |
| Management of the Legal & Regulatory Environment   | 17           |
| Critical Incident Risk Management                  | 16           |
| Systemic Risk Management                           | 18           |

## Results

Below, we compare our two-stage approach with a baseline heuristic method. The baseline categorizes ESG sentences based solely on the presence of specific keywords. For instance, any sentence containing the phrase "human rights" is automatically labeled under that category.

| Model          | Parent ESG Category | Child ESG Category |
|----------------|:-------------------:|:------------------:|
| Heuristic      | 31%                 | 34%                |
| SASB ESG Model | 71%                 | 61%                |

Please be aware of the substantial class imbalance in our training data. For example, categories like "Competitive Behavior" are considerably less prevalent than others such as "GHG Emissions."

**Parent Category** = Environment, Social Capital, Human Capital, Business Model & Innovation, Leadership & Governance
**Child Category** = GHG Emissions, Air Quality, etc. See https://sasb.org/standards/materiality-finder for the full list.

## Misc

- Developed by: [Victor Chen](https://www.linkedin.com/in/victorzitianchen), [Jude Zhu](https://www.linkedin.com/in/judewzhu), [Michael Liston](https://www.linkedin.com/in/michael-c-liston/), [Edward Junprung](https://www.linkedin.com/in/ejunprung/)
- Parent Model: [Universal Sentence Encoder](https://www.kaggle.com/models/google/universal-sentence-encoder)
- Blog Post: https://www.gopeaks.org/esg-mapper
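
## Decoding Model Output (Sketch)

For convenience, the Model Output table above can be written as a lookup from class index to SASB category. The sketch below also rolls each child category up to a parent category. The index-to-category entries are transcribed from the table; the parent grouping is an assumption based on SASB's published dimension structure and is not stated in this card. The names `ID_TO_CATEGORY`, `PARENT_RANGES`, and `decode` are hypothetical.

```python
# Class index -> SASB child category, transcribed from the Model Output table.
ID_TO_CATEGORY = {
    0: "Business Model Resilience",
    1: "Materials Sourcing & Efficiency",
    2: "Physical Impacts of Climate Change",
    3: "Product Design & Lifecycle Management",
    4: "Supply Chain Management",
    5: "Air Quality",
    6: "Ecological Impacts",
    7: "Energy Management",
    8: "GHG Emissions",
    9: "Waste & Hazardous Materials Management",
    10: "Water & Wastewater Management",
    11: "Employee Engagement Diversity & Inclusion",
    12: "Employee Health & Safety",
    13: "Labor Practices",
    14: "Business Ethics",
    15: "Competitive Behavior",
    16: "Critical Incident Risk Management",
    17: "Management of the Legal & Regulatory Environment",
    18: "Systemic Risk Management",
    19: "Access & Affordability",
    20: "Customer Privacy",
    21: "Customer Welfare",
    22: "Data Security",
    23: "Human Rights & Community Relations",
    24: "Product Quality & Safety",
    25: "Selling Practices & Product Labeling",
}

# Assumption: parent grouping follows SASB's five dimensions, which happen to
# align with contiguous index ranges in the table.
PARENT_RANGES = [
    (range(0, 5), "Business Model & Innovation"),
    (range(5, 11), "Environment"),
    (range(11, 14), "Human Capital"),
    (range(14, 19), "Leadership & Governance"),
    (range(19, 26), "Social Capital"),
]


def decode(index):
    """Map a predicted class index to a (parent category, child category) pair."""
    category = ID_TO_CATEGORY[index]  # raises KeyError for an unknown index
    parent = next(name for ids, name in PARENT_RANGES if index in ids)
    return parent, category


print(decode(8))
print(decode(23))
```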