metadata
license: apache-2.0
language:
  - en
pipeline_tag: text-classification
tags:
  - ESG

SASB ESG Sentence Classifier (Stage 1)

The SASB ESG sentence classifier is a BERT-based model fine-tuned to separate ESG from non-ESG sentences. It was trained using data extracted from documents conforming to the Sustainability Accounting Standards Board (SASB) standards. For a full description of our training data, please refer to https://www.kaggle.com/datasets/edwardjunprung/sasb-aligned-esg-sentences.

Our classifier is a two-stage pipeline:

  1. Stage 1 - Classify sentences as ESG or not.
  2. Stage 2 - Subsequently, bucket ESG sentences into one of 26 SASB categories.
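The two stages above can be sketched as a small routing function. This is only an illustration of the control flow, not the authors' implementation: `classify_sentences`, `toy_stage1`, and `toy_stage2` are hypothetical names, and the toy functions stand in for the real Stage 1 and Stage 2 models.

```python
from typing import Callable, List, Optional


def classify_sentences(
    sentences: List[str],
    is_esg: Callable[[str], int],
    sasb_category: Callable[[str], str],
) -> List[Optional[str]]:
    """Return a SASB category for each ESG sentence and None for the rest."""
    results: List[Optional[str]] = []
    for sentence in sentences:
        # Stage 1: binary gate, 1 = ESG, 0 = Not ESG.
        if is_esg(sentence) == 1:
            # Stage 2: only ESG sentences are assigned a category.
            results.append(sasb_category(sentence))
        else:
            results.append(None)
    return results


# Toy stand-ins (assumptions), used only to make the sketch runnable.
def toy_stage1(sentence: str) -> int:
    return 1 if "emissions" in sentence.lower() else 0


def toy_stage2(sentence: str) -> str:
    return "GHG Emissions"


labels = classify_sentences(
    ["We cut emissions by 10%.", "Revenue grew in Q3."],
    toy_stage1,
    toy_stage2,
)
print(labels)  # ['GHG Emissions', None]
```

In practice, the two callables would wrap the fine-tuned Stage 1 and Stage 2 models; passing them in as arguments keeps the routing logic independent of how each model is loaded.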

Goal

The objective is to categorize sentences within ESG documents in order to evaluate corporate ESG alignment. For example, after analyzing all sentences in Activision's annual ESG report, the SASB ESG model determined that more than 40% of sentences fall under the Diversity & Inclusion and Human Rights SASB categories. This suggests that Activision places significant emphasis on these initiatives, which makes it a potential candidate for investment funds with social impact mandates.
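The kind of per-category share described above can be computed from per-sentence labels with a few lines of Python. This is a minimal sketch under two assumptions: the denominator is every sentence in the document (ESG or not), and non-ESG sentences are represented as `None`. The function name `category_shares` and the sample labels are made up for illustration and are not taken from any real report.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def category_shares(labels: Iterable[Optional[str]]) -> Dict[str, float]:
    """Fraction of all sentences assigned to each category (None = Not ESG)."""
    labels = list(labels)
    if not labels:
        return {}
    # Count only sentences that received a category in Stage 2.
    counts = Counter(label for label in labels if label is not None)
    # Divide by the total number of sentences, including non-ESG ones.
    return {category: n / len(labels) for category, n in counts.items()}


# Illustrative labels for a five-sentence document.
example = ["Human Rights", "Human Rights", None, "GHG Emissions", None]
shares = category_shares(example)
print(shares)  # {'Human Rights': 0.4, 'GHG Emissions': 0.2}
```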

Model Output

The SASB ESG sentence classifier outputs either 0 (Not ESG) or 1 (ESG).

Results

Below, we present a comparison between our two-stage approach and a baseline heuristic method. The baseline method categorizes ESG sentences based solely on the presence of specific keywords. For instance, any sentence containing the phrase "human rights" would be automatically labeled under that category.

Model            Parent ESG Category   Child ESG Category
Heuristic        31%                   34%
SASB ESG Model   71%                   61%

Parent Category = Environment, Social Capital, Human Capital, Business Model & Innovation, Leadership & Governance
Child Category = GHG Emissions, Air Quality, etc. Please visit https://sasb.org/standards/materiality-finder to see the full list.
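For reference, a keyword baseline of the kind described above might look like the following sketch. The `KEYWORDS` mapping and the `keyword_label` function are hypothetical: the actual keyword lists used for the baseline are not given here, and the tie-breaking rule (first matching category wins) is an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical keyword lists; the real baseline's keywords are not published here.
KEYWORDS: Dict[str, List[str]] = {
    "Human Rights": ["human rights"],
    "GHG Emissions": ["greenhouse gas", "ghg"],
    "Air Quality": ["air quality"],
}


def keyword_label(sentence: str) -> Optional[str]:
    """Return the first category whose keyword appears in the sentence."""
    text = sentence.lower()
    for category, phrases in KEYWORDS.items():
        # Plain substring match, case-insensitive.
        if any(phrase in text for phrase in phrases):
            return category
    # No keyword matched: treat the sentence as uncategorized.
    return None


print(keyword_label("We respect human rights across our supply chain."))  # Human Rights
print(keyword_label("Revenue grew in Q3."))  # None
```

A matcher like this cannot label a sentence that discusses a topic without using one of the listed phrases, which is one plausible reason a learned classifier scores higher in the table above.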

Misc