How to Use Data and AI to Measure ESG Performance

By Data Science Salon

ESG (Environmental, Social and Governance) investing is an umbrella term for responsible investing (RI), a way to evaluate companies beyond purely financial factors. In other words, ESG measures the impact companies have on the environment and society, which helps investors and businesses make ethical decisions about where to invest and with whom to do business.

While many companies consider ESG factors in their strategies, managers and investors find it hard to measure the impact of those factors and act on it. A proper analysis requires large volumes of data, and that data needs to be interpreted carefully.

This is where Artificial Intelligence (AI) comes into play. AI can help analyze ESG data and measure the performance of companies by identifying hidden risks, potential opportunities, and overall trends of the environment in which they are operating.

This article gives an overview of what ESG data is, why it is important for companies, and how Machine Learning (ML) and Natural Language Processing (NLP) can help create accurate ESG rankings.

What is ESG data?

ESG data provides useful information on environmental, social and governance factors and is being used by companies to assess how sustainable they are, and by investors to decide which companies to invest in.

Today, ESG data combined with AI has become a realistic way to measure company resiliency and take proactive measures. ESG data can also help meet investors' growing appetite for sustainable investments, as these metrics are often seen as a way to “get the story behind the numbers” and provide a more complete measure of company performance.

Why is ESG important?

ESG is important because it not only helps limit negative impact on the environment but also provides an opportunity to better balance the interests of stakeholders. Companies are increasingly building ESG into their business practices, and to understand why, it pays to study the relationship between ESG and financial performance.

Rochelle March, Head of ESG Product at Dun & Bradstreet, describes how ESG performance plays out in organizations in her DSSVirtual presentation, “ESG as the signal in the noise: Using NLP and verified data assets to create a holistic measure of company resiliency.” According to her, companies can gain the following benefits when they consider ESG factors:

Increase in efficiency

Businesses with good trading relationships and engagement, as reflected in good ESG rankings across their supply chain, have consistently outperformed firms with poor ESG rankings over 1-, 3- and 5-year time horizons. Over five years, their sales growth is up to five times as high.

Better supplier engagement 

Businesses with good trading relationships and engagement, as reflected in good ESG rankings across their supply chain, have consistently outperformed firms with poor ESG rankings over 1-, 3- and 5-year horizons. The same applies to businesses with good corporate behaviour in the governance theme.

Lower delinquency rates

Payment delinquency is a widely used measure of commercial credit risk. The delinquency rate of businesses with poor ESG rankings is almost double that of businesses with very good ESG rankings.

Additionally, ESG is used to analyze how resilient companies are today and are likely to be in the future. For example, the number of UN PRI (Principles for Responsible Investment) signatories and their AUM (Assets Under Management) have grown over the years, with many of the signatories incorporating ESG factors into their investing. Companies that incorporate ESG metrics into their evaluation tend to be more resilient and adaptable, even in the face of significant changes such as the COVID-19 recession, climate impact or social impact.

Presentation by Rochelle March, Head of ESG Product at Dun & Bradstreet, at DSSVirtual.

How ESG data is used to assess companies 

The main idea behind ESG is that companies should be accountable to their shareholders and to society, which includes environmental protection, social responsibility and fair treatment of employees. Companies are evaluated on the following factors:

Environmental factors - refer to how the company impacts the environment. This can be measured in terms of greenhouse gas emissions or use of natural resources, e.g. energy, water, pollution management, land use and biodiversity.

Social factors - refer to how the company impacts society through its labour and human rights policies and the working conditions of the people who make its products.

Governance factors - refer to how the company is governed, including its tax and other financial policies and its involvement in the community. This also includes business ethics and transparency.

Investors want to know that they can invest in a company that is continuously looking at improving its ESG practices.

How ML and NLP are used to create ESG ranking datasets and models

ESG Model Data and Methodology

Source: Presentation by Rochelle March at DSSVirtual.

Machine Learning (ML) and Natural Language Processing (NLP) are used to create more accurate ESG rankings for companies. The above infographic displays a standard ESG methodology, which includes the following steps:

1. The process begins with assembling a dataset of ESG metrics. The data can be gathered through web scraping and drawn from several sources, including public data such as government and NGO data, company websites and reports, and third-party data that is mostly based on sustainability reports.

2. The collected data passes through a Quality Assurance (QA) stage and is then processed using ML and NLP. These ML and NLP methods include:

  • Word embeddings: Identifying relationships between words. A word embedding is a numerical representation of a word that captures its meaning, its semantic relationships to other words and the contexts in which it is used.
  • Topic and theme tagging: This involves assigning relevant topics and themes to the data for further processing.
  • Sentiment analysis: Tagging text with a polarity (negative, positive or neutral) so that sentiment can be aggregated at topic and theme level.
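
As a rough illustration of how topic tagging and sentiment aggregation fit together, here is a minimal Python sketch. It is not the method used in the presentation: the keyword lists, topic names and function names are all hypothetical, and simple keyword matching stands in for the trained NLP models a real pipeline would use.

```python
from collections import defaultdict

# Hypothetical keyword lexicons (assumptions for illustration only);
# a production system would rely on trained models instead.
TOPIC_KEYWORDS = {
    "energy_management": {"energy", "emissions", "renewable"},
    "data_privacy": {"privacy", "breach", "data"},
    "labour_practices": {"workers", "wages", "safety"},
}
POSITIVE_WORDS = {"reduced", "improved", "awarded", "invested"}
NEGATIVE_WORDS = {"breach", "fined", "violation", "spill"}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,;:!?").lower() for word in text.split()]


def tag_topics(tokens):
    """Return every topic whose keyword set overlaps the document's words."""
    words = set(tokens)
    return sorted(topic for topic, kws in TOPIC_KEYWORDS.items() if words & kws)


def polarity(tokens):
    """Return +1 (positive), -1 (negative) or 0 (neutral) for a document."""
    score = sum(w in POSITIVE_WORDS for w in tokens) - sum(
        w in NEGATIVE_WORDS for w in tokens
    )
    return (score > 0) - (score < 0)


def aggregate_sentiment(documents):
    """Average document polarity per topic across all documents."""
    by_topic = defaultdict(list)
    for doc in documents:
        tokens = tokenize(doc)
        doc_polarity = polarity(tokens)
        for topic in tag_topics(tokens):
            by_topic[topic].append(doc_polarity)
    return {topic: sum(vals) / len(vals) for topic, vals in by_topic.items()}


if __name__ == "__main__":
    docs = [
        "The company reduced emissions and invested in renewable energy.",
        "Regulators fined the firm after a customer data breach.",
    ]
    print(aggregate_sentiment(docs))
```

Each document receives one polarity, and that polarity is credited to every topic the document touches, which gives an average sentiment per topic.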

3. The processed data is normalized to a scale of 1 to 5, where 5 represents the highest risk (the worst-performing companies) and 1 the lowest risk (the best-performing companies).
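
The normalization step can be sketched in a few lines of Python. The source does not say which normalization is used, so this example assumes simple min-max scaling of raw risk scores; the function name and the sample company names are hypothetical.

```python
def normalize_to_scale(raw_risk, low=1.0, high=5.0):
    """Linearly rescale raw risk scores so the lowest maps to `low`
    (best performer) and the highest maps to `high` (worst performer).

    Min-max scaling is an assumption; the article does not specify the method.
    """
    smallest = min(raw_risk.values())
    largest = max(raw_risk.values())
    if largest == smallest:
        # All companies have the same raw score: place them mid-scale.
        midpoint = (low + high) / 2
        return {name: midpoint for name in raw_risk}
    span = largest - smallest
    return {
        name: low + (score - smallest) * (high - low) / span
        for name, score in raw_risk.items()
    }


if __name__ == "__main__":
    raw = {"Company A": 0.0, "Company B": 50.0, "Company C": 100.0}
    print(normalize_to_scale(raw))
```

A rank-based alternative, such as assigning quintiles, would be less sensitive to outliers. Which one fits depends on how the raw scores are distributed.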

Topics are also weighted, because this makes ESG scores correlate better with financial performance. Each topic is weighted by how important it is to a given sector. For instance, suppose a credit card company wants to be ranked on ESG factors. Topics like energy management and consumption will have no significance in that ranking, whereas data privacy is very important to the business. In the same vein, data privacy topics carry little weight for a utility company, where energy management is far more significant.
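
The sector-weighting idea can be expressed as a weighted average of topic scores. In the sketch below, the sector names, topic names and weight values are invented for illustration and do not come from the presentation; only the principle follows the example above, in which energy management carries no weight for a credit card company.

```python
# Hypothetical sector weights (assumed values, for illustration only).
SECTOR_WEIGHTS = {
    "credit_cards": {
        "data_privacy": 0.8,
        "energy_management": 0.0,
        "business_ethics": 0.2,
    },
    "utilities": {
        "data_privacy": 0.1,
        "energy_management": 0.7,
        "business_ethics": 0.2,
    },
}


def weighted_esg_score(topic_scores, sector):
    """Combine per-topic scores (1 = best, 5 = worst) into one score,
    weighting each topic by its importance to the sector."""
    weights = SECTOR_WEIGHTS[sector]
    total_weight = sum(weights.get(topic, 0.0) for topic in topic_scores)
    if total_weight == 0:
        raise ValueError("no weighted topics available for this sector")
    weighted_sum = sum(
        score * weights.get(topic, 0.0) for topic, score in topic_scores.items()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    scores = {"data_privacy": 4.0, "energy_management": 1.0, "business_ethics": 2.0}
    for sector in SECTOR_WEIGHTS:
        print(sector, round(weighted_esg_score(scores, sector), 2))
```

With the same topic scores, the company looks considerably riskier when scored as a credit card business than as a utility, because its weak data privacy score dominates the former weighting.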

4. The output of these models (the ESG ranking) is then included in an overall company ranking, which defines its resiliency.


The ESG industry continues to grow. The rapid increase in the number of companies that are incorporating ESG factors into their decision making, combined with the increased competition for capital from responsible investors, has led to an increased demand for ESG services.

With the help of AI, ESG analysis can now be improved through analytics and intelligent algorithms. AI can analyze data and make recommendations promptly, enabling investors to make more informed decisions on everything from selecting projects to divesting from certain industries.

Check out the Data Science Salon YouTube channel to watch more presentations by leading data scientists talking about AI and Machine Learning applications in the real world.
