GRESB Scoring Insights 1: Why GRESB Scores?

Chris Pyke
Chief Innovation Officer

Scoring is central to the GRESB platform. It is, therefore, worth stepping back and asking why GRESB scores. Scoring—one piece of the overall assessment process—has three primary purposes: simplification, differentiation, and flexibility.

Simplification

GRESB is rooted in the idea that investors and managers benefit from a systematic, rules-based assessment of sustainability practices and performance. Application of consistent rules and reporting allows for benchmarking and engagement over time. Scoring is one piece of this systematic assessment.

Scoring digests, interprets, and simplifies raw data. The GRESB Assessment collects information from approximately 45 indicators and metrics, which, in turn, can be divided into hundreds of discrete data points. Each indicator or metric provides a different type of information and, in most cases, represents different units and ranges of real-world values. For example, the 2024 Climate Resilience indicator (RM5) references 24 different transition and physical risk scenarios—each backed by substantial technical literature. Meanwhile, the 2024 Waste Management indicator (WS1) recognizes two complementary waste generation metrics and three forms of waste disposal. Both are important dimensions of sustainability performance. Both need to be evaluated and interpreted in different ways.

Scoring provides a systematic way to aggregate and simplify this kind of multidimensional data into something stakeholders can understand and compare. In the case of GRESB, this means reducing hundreds of variables—like climate resilience and waste management—into a synthetic 0-to-100 score and a set of sub-scores for aspects and indicators. Information is always lost in this process. Statisticians call this data reduction. There is no such thing as perfect scoring, although there are better and worse ways to do it. Ultimately, the simplification of complex data is necessary to make it interpretable and actionable. The alternative is a Tower of Babel: a random, uninterpreted stack of measures and metrics.

There is both an art and a science to scoring. The art of scoring comes first. This may be surprising since, on the surface, scoring looks like math. However, scoring starts with a specific goal and a system of values. In the case of GRESB, the goal is to use management indicators and performance metrics to evaluate strengths and weaknesses of property companies relative to peers. In other words, we seek to differentiate entities—companies and funds—based on material, non-financial criteria relevant to investment risks and returns.

This purpose gives us a frame to evaluate indicators and metrics, including aggregating them in weighted combinations. The weights are based on the relative importance established through the Standards governance process. More specifically, weights reflect the combined opinions and values held by the Foundation Board and Real Estate Standards Committee. This is not a scientific or precise exercise, and the current weights are not necessarily the opinion of any given member of the Foundation. Rather, they are a collective expression of priority accumulated over time. This expression of values is necessary because there is no definitive way to establish the relative importance of competing issues. For example, GRESB assesses greenhouse gas emissions, stakeholder engagement, and biodiversity protection. The importance of these issues depends on a range of factors. GRESB weights represent one of many possible realizations. They have legitimacy because of the process used to establish and communicate them, specifically that of governance by the GRESB Foundation and Real Estate Standards Committee.

Once values have been expressed, we get to the science of scoring. The science involves selecting the most concise set of practical metrics to represent each area of performance. Furthermore, it involves making a myriad of increasingly granular decisions regarding how metrics are defined, measured, analyzed, and interpreted. For example, we need to decide whether to measure occupant satisfaction on a 0-to-5 or a 1-to-10 scale, or whether to measure energy use daily, monthly, or annually. These decisions matter, and there is usually no solution that is optimal for every entity and circumstance. Fortunately, we can explore the consequences of these decisions and look for strategies to best represent different dimensions of performance across the universe of organizations that GRESB serves. We can then apply mathematical techniques to combine these values across categories to provide a synthetic view of practice and performance.
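
To make this concrete, here is a minimal sketch of the kind of data reduction described above, written in Python. The indicators, scales, ranges, and weights are hypothetical and illustrative only; they are not GRESB's actual methodology. The point is simply that metrics measured in different units can be normalized to a common scale and combined in a weighted sum to produce a single 0-to-100 number.

    # Illustrative sketch only: hypothetical indicators, scales, and weights,
    # not GRESB's actual scoring methodology.

    def normalize(value, lo, hi):
        """Map a raw metric onto a common 0-to-1 scale, clamping out-of-range values."""
        return max(0.0, min(1.0, (value - lo) / (hi - lo)))

    # Raw inputs measured on different scales and in different units.
    raw = {
        "occupant_satisfaction": 4.2,   # survey result on a 0-to-5 scale
        "energy_intensity": 180.0,      # kWh per m2 per year (lower is better)
        "waste_diversion": 0.55,        # fraction of waste diverted from landfill
    }

    # Normalize each metric to 0-1, inverting metrics where lower is better.
    scores = {
        "occupant_satisfaction": normalize(raw["occupant_satisfaction"], 0, 5),
        "energy_intensity": 1 - normalize(raw["energy_intensity"], 50, 400),
        "waste_diversion": normalize(raw["waste_diversion"], 0, 1),
    }

    # Weights express relative importance and sum to 1.
    weights = {
        "occupant_satisfaction": 0.2,
        "energy_intensity": 0.5,
        "waste_diversion": 0.3,
    }

    total = 100 * sum(weights[k] * scores[k] for k in weights)
    print(f"Synthetic score: {total:.1f} / 100")

Every choice in this sketch, from the normalization ranges to the weights, is a value judgment of the kind described above.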

At the end of the day, scoring brings together art and science to provide a concise description of performance. It hides—but does not eliminate—the underlying complexity in a way that makes information understandable and actionable. This means that, first and foremost, we score to simplify and communicate.

Differentiation

Scores are always a simplification. We all have plenty of first-hand experience of this. We know that earning a B+ in high school Spanish reflects a collection of tests, quizzes, homework assignments, and more. Yes, we could communicate all our individual test scores, maybe even individual right and wrong answers. But it is much more practical to roll up that information into an overall grade or score. In this case, our teacher (hopefully) weights these elements so that the final grade reflects a student’s skill with the Spanish language. In other words, we want the final score to aggregate information and differentiate students based on the desired outcome.

In the case of GRESB, we are trying to differentiate entities based on their sustainability management and performance. The goal is to help investors set expectations for risk-adjusted returns and engage constructively over time. This is somewhat harder to observe than Spanish speaking skills. Management practices and operational performance cannot always be described in a linear progression, such as “good, better, best” or “A, B, C grades.” Critically, sometimes the evaluation is situational. For example, practices for managing offices are distinct from those for industrial real estate. In some cases, we want to differentiate absolute excellence, whereas in other instances we want to track relative improvement. Consequently, the GRESB Assessments attempt to integrate and balance the most common of these circumstances with the goal of meaningfully differentiating companies over time.
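
As a rough illustration of the difference between absolute and relative evaluation, the sketch below scores the same underlying result twice: once against a fixed benchmark of excellence and once as a position within a peer group. The threshold and peer values are hypothetical and are not GRESB scoring logic.

    # Illustrative sketch only: hypothetical threshold and peer values, not GRESB logic.

    def absolute_score(value, threshold):
        """Score against a fixed benchmark of excellence (absolute view)."""
        return min(100.0, 100.0 * value / threshold)

    def relative_score(value, peer_values):
        """Score as a percentile position within a peer group (relative view)."""
        at_or_below = sum(1 for p in peer_values if p <= value)
        return 100.0 * at_or_below / len(peer_values)

    peers = [35, 48, 52, 60, 71, 77, 83]  # hypothetical peer results

    print(f"Absolute view: {absolute_score(60, threshold=90):.0f} / 100")  # short of the benchmark
    print(f"Relative view: {relative_score(60, peers):.0f} / 100")         # above the peer median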

In practical terms, the value of a GRESB Score is reflected, in part, by the degree to which a high score (100) is meaningfully different from a low score (0). Market participants must believe that these values represent material differences in management and performance. GRESB addresses this in two ways.

First, GRESB Standards are designed so that a topline score of 90 is materially different from a 50 or a 10. Earning 90 points on GRESB’s 100-point scale requires a reasonably comprehensive set of management practices, above-average data coverage, and at least some demonstrable progress toward improving performance over time. Conversely, 10 points indicates a much less comprehensive approach, with many opportunities for improvement.

Second, GRESB also works with researchers to compare scores to financial returns and impact. Over the last decade, independent researchers have shown that scores are positively correlated with returns for real estate investment trusts, European private real estate companies, and, new this year, Asian private real estate firms. The details of the associations vary for each study, but the direction remains consistent. Higher scores are associated with higher risk-adjusted returns. The design of the Standard and the results from these studies also provide an important insight into GRESB Scores. In short, Scores matter in broad terms. The top quintile of GRESB Scores is meaningfully different from the median and very different from the lowest quintile. It is also true that smaller score differences, e.g., 1, 2, or even 5 points, do not meaningfully differentiate participants.

The bottom line is that scores do differentiate companies. However, there are limitations and caveats to when and how this information can be used. As noted above, scores are necessary, but necessarily imperfect.

Flexibility

The last essential element of scoring is flexibility. This might be a little less obvious. GRESB provides a hierarchical structure aggregating multiple layers of information into a set of scores and ratings. At the bottom of this hierarchy, we have discrete, granular management indicators and performance metrics. Some of the indicators and metrics are durable, for example, measures of energy intensity. Others evolve over time as practice and expertise change, as with the evolution of expectations for climate risk assessment or the evaluation of embodied carbon.

Scores provide flexibility to adapt and respond to these changes. Scoring provides a relatively stable framework, including a range of known values with consistent interpretation. This framework can be maintained even if it becomes necessary to change any individual indicator or metric.

For example, GRESB has indicators to assess renewable energy generation and procurement. Practices in this area evolve quickly with changes in definitions, regulatory requirements, and technology. GRESB expects to continue assessing the concept of renewable energy, even if underlying strategies evolve over time. GRESB’s scoring structure allows us to replace or integrate new criteria, while maintaining the fundamental scoring logic and framework. At the end of the day, the categorical and total GRESB Scores look the same: a number between 0 and 100. However, the structure provides the flexibility for the basis of this calculation to evolve over time.
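
One way to picture this is a scoring framework whose aggregation logic stays fixed while individual criteria can be swapped out. In the hypothetical sketch below (illustrative criteria, data, and weights, not the GRESB Standard itself), a revised renewable energy criterion replaces the original one; the inputs change, but the 0-to-100 framework does not.

    # Illustrative sketch only: hypothetical criteria, data, and weights,
    # not the GRESB Standard itself.

    def aggregate(indicator_scores, weights):
        """Combine 0-to-1 indicator scores into a 0-to-100 total with fixed weights."""
        return 100 * sum(weights[name] * indicator_scores[name] for name in weights)

    weights = {"renewable_energy": 0.4, "waste_management": 0.6}

    # Original criterion: share of on-site renewable generation.
    def renewable_v1(data):
        return data["onsite_renewable_kwh"] / data["total_kwh"]

    # Revised criterion: on-site generation plus certified off-site procurement.
    def renewable_v2(data):
        covered = data["onsite_renewable_kwh"] + data["procured_renewable_kwh"]
        return min(1.0, covered / data["total_kwh"])

    data = {"onsite_renewable_kwh": 200, "procured_renewable_kwh": 300,
            "total_kwh": 1000, "waste_score": 0.7}

    # Swap criteria without touching the aggregation framework or the score range.
    for criterion in (renewable_v1, renewable_v2):
        scores = {"renewable_energy": criterion(data),
                  "waste_management": data["waste_score"]}
        print(f"{criterion.__name__}: {aggregate(scores, weights):.1f} / 100")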

This flexibility also extends to the weights used to interpret and aggregate the underlying metrics. GRESB provides a set of default or standard weights, expressing the value of each indicator and metric. These standard weights ensure consistent interpretation and comparability. However, consistency is not always the goal. Sometimes, stakeholders want to interpret the same information in different ways. Scoring helps meet this need. Scoring elements—indicators, metrics, and weights—can be combined in different ways to represent different priorities, all while using the same information and maintaining similar outputs. This flexibility allows us to envision custom scores that simultaneously reflect different values and priorities. This kind of customization is an important feature of scoring, and it is something we plan to explore in the future.
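
A minimal sketch of this idea, again with hypothetical indicator scores and weight sets: the same underlying indicator scores are recombined under a standard and a custom weighting, producing two different views of the same data.

    # Illustrative sketch only: hypothetical indicator scores and weight sets.

    scores = {"ghg_emissions": 0.8, "stakeholder_engagement": 0.6, "biodiversity": 0.4}

    standard_weights = {"ghg_emissions": 0.5, "stakeholder_engagement": 0.3, "biodiversity": 0.2}
    custom_weights   = {"ghg_emissions": 0.2, "stakeholder_engagement": 0.2, "biodiversity": 0.6}

    def combine(weights):
        """Recombine the same indicator scores under a given set of weights."""
        return 100 * sum(weights[k] * scores[k] for k in weights)

    print(f"Standard view: {combine(standard_weights):.0f} / 100")  # emphasizes emissions
    print(f"Custom view:   {combine(custom_weights):.0f} / 100")    # emphasizes biodiversity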

So, why does GRESB score?

The bottom-line answers are clear:

  • Scoring simplifies a large amount of diverse information.
  • Scoring differentiates entities based on meaningful, objective indicators and metrics.
  • Scoring provides flexibility to evolve and change over time.

Today, GRESB’s scoring system is an integral part of a systematic real asset assessment. It helps investors and managers digest and compare diverse types of information. It cannot be perfect. As people say of models: all scores are wrong, but some are useful. Scoring is always a mixture of art and science, values and math. Fortunately, this mixture can evolve over time to meet the changing needs and expectations of our community.
