Case study: ExxonMobil

In this section, we present a case study of ExxonMobil as a representative company of the energy industry, applying our end-to-end process to ESG news related to the company.

First, we gathered the data needed for the analysis by developing crawlers that accumulate data into our datasets from alternative sources and company filings. As sources of alternative data, we use the Google RSS news feed, GDELT, and the Yahoo Finance API. We collect news related to targeted ESG topics, extracting articles based on the co-occurrence of the company we analyze and the topics we predefine (e.g., “Exxon climate change”). For each article, we store the title, content, date, and domain, together with the search query as metadata. The company documents are gathered through the portal of the U.S. Securities and Exchange Commission (the “SEC”). For the filings, we store the whole document.
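The co-occurrence step can be illustrated with a minimal sketch. The record fields mirror the ones listed above; the names `NewsRecord`, `build_queries`, and `mentions_both`, as well as the alias and topic lists, are assumptions made for illustration and are not taken from our crawlers.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class NewsRecord:
    """One stored article: title, content, date, domain, and the search query."""
    title: str
    content: str
    date: str    # publication date as an ISO string, e.g. "2022-05-01"
    domain: str  # site the article came from
    query: str   # search query that surfaced the article (metadata)


def build_queries(company_aliases, topics):
    """Pair every company alias with every ESG topic, e.g. 'Exxon climate change'."""
    return [f"{alias} {topic}" for alias, topic in product(company_aliases, topics)]


def mentions_both(text, company_aliases, topic):
    """True if the text names the company (any alias) and the topic, ignoring case."""
    lowered = text.lower()
    has_company = any(alias.lower() in lowered for alias in company_aliases)
    return has_company and topic.lower() in lowered


# Hypothetical aliases and topics, used only to show the shape of the output.
queries = build_queries(["Exxon", "ExxonMobil"], ["climate change", "human rights"])
print(queries)
```

Each query string would then be sent to a news source, and `mentions_both` offers a cheap check that a returned article really mentions the company together with the topic before it is stored.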

Once all the data is collected, we filter and preprocess the news. In our batch streaming, we often encounter news that is irrelevant to our topics or company. Therefore, we filter the news by the frequency of appearances of our customized taxonomy words. Based on a predefined minimum word occurrence threshold, we classify the company news as ESG related or not. After filtering, we calculate the sentiment of the news using a predefined Hugging Face pipeline, “sentiment-analysis.” The pipeline creates its own embeddings and, leveraging the transformer architecture, produces a sentiment label with a confidence score, which we express as a number between -1 and 1 (-1 is negative, 1 is positive). For our purposes, we define a threshold for the certainty of the sentiment: we classify news with a value above 0.7 as positive and news with a value below -0.7 as negative.
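The filtering and thresholding logic can be sketched as follows. This is an illustration under stated assumptions, not our production code: the taxonomy terms and the `min_hits` value are hypothetical, and the sketch assumes the sentiment model returns a label ("POSITIVE" or "NEGATIVE") with a confidence between 0 and 1, which is then folded into a signed value. The model itself is not called here.

```python
import re


def count_taxonomy_hits(text, taxonomy):
    """Count occurrences of taxonomy terms in the text, ignoring case."""
    lowered = text.lower()
    total = 0
    for term in taxonomy:
        # Leading word boundary only: 'oil' will not match inside 'turmoil',
        # while plural forms such as 'emissions' still count for 'emission'.
        pattern = r"\b" + re.escape(term.lower())
        total += len(re.findall(pattern, lowered))
    return total


def is_esg_related(text, taxonomy, min_hits=2):
    """Keep an article only if it reaches the minimum number of taxonomy hits."""
    return count_taxonomy_hits(text, taxonomy) >= min_hits


def signed_score(label, score):
    """Fold a (label, confidence) pair into a single value in [-1, 1]."""
    return score if label.upper() == "POSITIVE" else -score


def classify(value, threshold=0.7):
    """Apply the 0.7 certainty threshold; values in between stay unlabeled."""
    if value > threshold:
        return "positive"
    if value < -threshold:
        return "negative"
    return "uncertain"


taxonomy = ["emission", "oil spill"]  # hypothetical taxonomy terms
text = "Exxon emissions rise as oil spill fines grow amid market turmoil"
print(is_esg_related(text, taxonomy), classify(signed_score("NEGATIVE", 0.95)))
```

Articles whose signed value falls between -0.7 and 0.7 are left as "uncertain" in this sketch, so that low-confidence predictions do not color the charts.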

For a better insight into the results, we created several charts for the data collected between 05/01/2022 and 06/15/2022. Using the frequency of appearances of our taxonomy words, we mapped the values on the chart and colored them with their sentiment. The line indicates the number of times the term was found in the news content. The line’s color indicates the sentiment where red is negative and green is positive. We repeated this for the three ESG categories: environmental, social, and corporate governance. 

Figure 1: Sentiment on words in the news in the social category from 05/01/2022 to 06/15/2022

In Figure 1, we observe that ExxonMobil has more negative than positive news about community behavior, which suggests that its social index may fall. On the other hand, the news is positive about health care for its workers and about its involvement in human rights movements.

In the bar plot in Figure 2, we observe the number of appearances of the term “ESG” in negative and positive news. Each red bar shows the number of occurrences of the word “ESG” in negative news related to ExxonMobil during the week that starts on the date shown on the x-axis. The green bars show the same count for positive news.

Figure 2: Sentiment on the news where ESG was mentioned
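The weekly counts behind such a bar plot could be computed along the following lines. The sketch assumes that weeks start on Monday and that each article arrives as a (date, sentiment label, text) tuple. Both are illustrative choices, and the sample articles are invented.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day):
    """Monday of the week that contains `day` (assumed week convention)."""
    return day - timedelta(days=day.weekday())


def weekly_term_counts(articles, term="ESG"):
    """Sum occurrences of `term` per (week start, sentiment label)."""
    counts = Counter()
    for published, sentiment, text in articles:
        hits = text.count(term)  # case-sensitive match on the acronym
        if hits:
            counts[(week_start(published), sentiment)] += hits
    return counts


# Invented sample articles, for illustration only.
articles = [
    (date(2022, 5, 2), "negative", "ESG funds drop Exxon; ESG critics respond"),
    (date(2022, 5, 4), "positive", "Exxon publishes ESG report"),
    (date(2022, 5, 10), "negative", "Investors question ESG claims"),
]
counts = weekly_term_counts(articles)
for (start, sentiment), n in sorted(counts.items()):
    print(start, sentiment, n)
```

Each (week start, sentiment) pair corresponds to one bar: the week start gives the x-axis position, and the sentiment label selects the red or green color.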

Our next analysis involves relation extraction, which helps us extract meaningful connections between entities from unprocessed text and use these relations to create a knowledge base. Using this method, we generate a knowledge graph from the text of DEF 14A, a proxy statement filed by ExxonMobil for 2022. The model detected that, in its proxy statement (DEF 14A), ExxonMobil states that it is a founder of an organization called the Alliance to End Plastic Waste, as shown in the graph below.


To compare this graph with the news, we create a knowledge graph after crawling the web for news with the query “Exxon Alliance to end Plastic Waste,” shown in the graph below. As the graph shows, we detect potential greenwashing: the parent organization, the American Chemistry Council, is connected to Lobbying through the relation “instance of.”


To verify our results, we continued our research by creating one more knowledge graph from the news results of the search query “American Chemistry Council Lobbying,” shown in the graph below. The news article behind the relation between the American Chemistry Council and Lobbying has the headline: “Oil-backed trade group is lobbying the Trump administration to push plastics across Africa.” This suggests that the group is lobbying in order to keep expanding the use of plastic products, which is both an environmental concern and a corporate governance problem.
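The comparison of the filing graph with the news graphs can be viewed as a path search over (subject, relation, object) triples. The sketch below is a simplified illustration: the triples paraphrase the relations described above, the relation labels other than “instance of” are assumed, and the relation extraction model that would produce such triples is not shown.

```python
from collections import deque


def find_path(triples, start, target):
    """Breadth-first search; returns the chain of triples linking start to target, or None."""
    neighbors = {}
    for triple in triples:
        subj, _, obj = triple
        # Edges are treated as undirected so a chain can be followed either way.
        neighbors.setdefault(subj, []).append((obj, triple))
        neighbors.setdefault(obj, []).append((subj, triple))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for nxt, triple in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None


# Triples paraphrased from the text; the first two relation labels are assumptions.
filing_triples = [("ExxonMobil", "founder of", "Alliance to End Plastic Waste")]
news_triples = [
    ("Alliance to End Plastic Waste", "parent organization", "American Chemistry Council"),
    ("American Chemistry Council", "instance of", "Lobbying"),
]

path = find_path(filing_triples + news_triples, "ExxonMobil", "Lobbying")
for subj, rel, obj in path:
    print(f"{subj} --{rel}--> {obj}")
```

A chain that exists only once the news triples are merged with the filing triples marks a connection the company did not report itself, which is the kind of discrepancy we then inspect manually.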


Technology-based solutions for ESG investment outcomes

When considering ESG investments, portfolio managers look into companies or funds with high ESG ratings. Several companies, including MSCI, S&P, Sustainalytics, CDP, ISS, Bloomberg, and others, provide ESG ratings. However, the consensus level regarding the ratings is relatively low, which makes ESG-focused investments challenging. For example, let’s look at the correlations of ESG Ratings across different providers. We observe a correlation of 7% between ISS and CDP ratings, 16% between MSCI and CDP, 58% between Sustainalytics and Bloomberg, and the highest correlation of approximately 74% between Bloomberg and S&P. This dispersion among ESG ratings from different data providers is confusing for investors. The discrepancies observed in the Figure below occur because many of the ESG measures are subjective, disclosure can be inconsistent, there may be biases in the data, and there could be changes in corporate behavior.
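To make the notion of agreement between providers concrete, the sketch below computes pairwise Pearson correlations between rating vectors. The provider names and scores are invented placeholders, not actual ratings from any of the providers named above, and Pearson correlation is assumed here as the agreement measure.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(ratings):
    """Correlation for every pair of providers, keyed by (provider, provider)."""
    return {
        (a, b): pearson(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Invented scores for the same five companies from three hypothetical providers.
ratings = {
    "provider_a": [70, 55, 80, 40, 65],
    "provider_b": [60, 58, 75, 50, 62],
    "provider_c": [30, 70, 45, 60, 52],
}
correlations = pairwise_correlations(ratings)
for pair, r in correlations.items():
    print(pair, round(r, 2))
```

A value near 1 means two providers rank companies similarly, while values near 0 correspond to the low consensus described above.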

Existing solutions from technology-based providers offer insights into how companies can improve their ESG index or score and get a better overall view of their sustainability index. The insights are generated by building a specific model for the company based on accessing, organizing, and analyzing relevant company files. Another interdisciplinary, analytical solution for increasing ESG transparency is to provide datasets of raw ESG indicators. This solution involves collecting company data from various sources, such as company documents, news, social media, and alternative data sources. The data can then be used either in a raw format or in a more insightful way based on collaborative work between data scientists and investors with high-level knowledge of market insights and company ESG involvement. This model offers excellent synergetic and profitable cooperation for influencing corporate behavior and contributing to enhanced sustainability and financial stability.

Correlations of ESG Ratings provided by MSCI, S&P, Sustainalytics, CDP, ISS, and Bloomberg, where we observe a high level of dispersion.

Machine learning and AI-based solutions can contribute to the meaningful representation of each company’s most critical ESG factors, emphasizing corporate dedication to ESG issues by creating dashboards showing what the company reports and what the company does about ESG. The dashboard would extract and visualize the overall picture of a company’s ESG involvement. The insight is generated by collecting every data point for the company from internal and external sources and giving more profound insight into the company’s sustainability index. The AI-based solution could model dynamic changes and go beyond ESG reporting by providing a future estimated outlook on corporate ESG risks. In addition, the AI-based solution could offer positive impact indices and explainable results from machine learning analysis and extract insights about how social and environmental trends will impact the company. A similar study could offer insights into how negative scenarios (e.g., climate or social disasters) will affect the company’s future and determine its resilience to adverse events.

Responsible Investment

ESG is an investment approach that explicitly incorporates environmental, social, and governance factors in investment decisions, keeping an investment portfolio’s long-term return at the forefront. Environmental factors concern the natural world, including the use of and interaction with renewable and nonrenewable resources. Essential considerations include biodiversity, deforestation, water security, pollution, and climate change. Social factors include human capital management, local communities, labor standards, human rights, health and safety, and customer responsibility.

U.N. Sustainable Development Goals, defined in 2015, at the 70th anniversary of the foundation of the United Nations, which came into existence in 1945

The governance factors involve issues tied to the interest of the broader stakeholder community, risk management, corporate governance, anti-corruption, and tax transparency. ESG investment is one of several investing approaches collectively named responsible investment. To invest responsibly means to intend to impact the environment or society positively. Socially responsible investing brings sustainability into investment decision-making, for example through green investment, which allocates capital to assets that mitigate climate change or biodiversity loss. Social investment intends to address social challenges faced by the bottom of the pyramid (BOP), the poorest two-thirds of the economic human pyramid, which includes more than four billion people living in poverty. The responsible investment framework is built around the U.N. Sustainable Development Goals (SDGs), shown in the Figure, addressing global challenges such as poverty, inequality, environmental issues, peace, and justice.

Future ESG opportunities

Research shows that, over the last fifteen years, institutional investors have directed a significantly increasing volume of funds toward ESG-related investments and funds. In 2005, institutional investors held $1.5 trillion in ESG-based funds; by 2020, ESG-based assets had roughly quadrupled to $6.2 trillion. This trend is a solid indicator of the need to develop reliable tools to distinguish between the companies and funds that are genuinely ESG-related and the companies involved in “Greenwashing.” We believe that the trend shown in the Figure below will rise even faster if investors can determine with a high level of confidence whether a company or a fund is taking steps to improve its ESG profile.

Institutional investor funds directed towards ESG-related investments and funds (Source: Forbes, Morgan Stanley Capital International, US SIF, The National Association of College and University Business Officers)

The challenges with ESG data also create opportunities for interdisciplinary approaches to developing and offering tools for disentangling the information from the noise in reporting ESG metrics and unraveling the truth about corporate dedication and actions towards becoming more socially responsible. The newly developed methodologies could measure the “environmental legitimacy” of corporations and offer evidence of “Greenwashing,” a term describing companies that are “talking the talk, but not walking the walk.”

To provide a good understanding of Greenwashing, we need to collect and analyze a large amount of data, including market research and specific company data. We consider different data sources, combining specific corporate news articles, company filings, and alternative data sources. The main challenge of data gathering is that high-quality data sources are often proprietary and inaccessible. Hence, we focus on publicly available data from reputable data sources. One of the most valuable data sources in the U.S. is the SEC’s Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, which contains the filings of publicly traded companies. Using these documents, we can identify what the company reports and how it views itself regarding any ESG issue.

Once we have one side of the story, we need data from the other; let’s call this side the media side. We intend to gather corporate news articles from various media sources, such as GDELT, Google RSS services, and Yahoo Finance news. We also consider social media portals such as Reddit and Twitter as supplementary data. Both platforms offer an API that gives access to their content under certain limits. We also use APIs such as Yahoo Finance, which provide financial information about a company, such as stock prices, trading volumes, and ESG risk scores.

Before offering any machine learning, artificial intelligence, or other technological solution in finance, a crucial part of our research is to understand the regulatory environment and the repercussions of technology-based solutions within the securities and investment regulatory framework. Another important aspect is finding definitions and rules for determining ESG-related metrics as benchmarks for company ESG rankings. The European Union and the United States have already issued guidelines about greenhouse gas emissions, carbon neutralization, diversity, etc. These documents offer recommendations about various aspects related to ESG, explain considerations for improving corporate ESG ratings, and outline sanctions for a company’s negative impacts on specific aspects of ESG.

One of the challenges in tackling the ESG issue and determining whether a company’s trajectory is improving or deteriorating is to define a taxonomy of words that represents the main environmental, social, and corporate governance topics. We establish a list of words and phrases that can be used when searching news articles, social media posts, and the regulatory mandated corporate disclosure documents.
Finally, we connect the search terms and phrases with selected corporations to extract information such as the frequency of co-occurrence between the company names and a set of ESG terms. We then map out the sentiments for the extracted text to understand the relationship between ESG-related topics and companies. This gives us a time series of frequencies and sentiments in which we can observe the company’s good or bad ESG decisions. We could then overlay the frequency and sentiment graphs with stock price return information to relate the ESG (qualitative) and price (quantitative) indicators of company performance.
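One possible way to line up the sentiment series with price returns is sketched below. It assumes daily data keyed by ISO date strings, simple day-over-day returns, and a net sentiment defined as positive minus negative mention counts. All three are illustrative choices, and the prices and counts are invented and are not ExxonMobil data.

```python
def simple_returns(prices):
    """Day-over-day simple returns from a date-ordered list of (date, close) pairs."""
    return {
        d1: (p1 - p0) / p0
        for (_, p0), (d1, p1) in zip(prices, prices[1:])
    }


def net_sentiment(mentions):
    """Per date: positive mention counts minus negative mention counts."""
    net = {}
    for day, sentiment, count in mentions:
        sign = 1 if sentiment == "positive" else -1 if sentiment == "negative" else 0
        net[day] = net.get(day, 0) + sign * count
    return net


def align(net, returns):
    """Rows of (date, net sentiment, return) for dates present in both series."""
    return [(d, net[d], returns[d]) for d in sorted(net.keys() & returns.keys())]


# Invented closing prices and mention counts, for illustration only.
prices = [("2022-05-02", 100.0), ("2022-05-03", 102.0), ("2022-05-04", 99.96)]
mentions = [
    ("2022-05-03", "positive", 3),
    ("2022-05-03", "negative", 1),
    ("2022-05-04", "negative", 4),
    ("2022-05-05", "positive", 2),
]
rows = align(net_sentiment(mentions), simple_returns(prices))
for day, sentiment, ret in rows:
    print(day, sentiment, round(ret, 4))
```

Dates with news but no price (for example, non-trading days) drop out of the aligned rows, so how to carry such news forward is a design decision left open here.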