# Chapter 10 Aggregation

Aggregation is the operation of combining multiple indicators into one value. Many composite indicators have a hierarchical structure, so in practice this often involves multiple aggregations, for example aggregating groups of indicators into aggregate values, then aggregating those values into higher-level aggregates, and so on, until the final index value.

Aggregating should almost always be done on normalised data, unless the indicators are already on very similar scales. Otherwise the relative influence of indicators will be very uneven.

Of course you don’t *have* to aggregate indicators at all, and you might be content with a scoreboard, or perhaps aggregating into several aggregate values rather than a single index. However, consider that aggregation should not substitute the underlying indicator data, but complement it.

Overall, aggregating indicators is a form of information compression - you are trying to combine many indicator values into one, and inevitably information will be lost^{3}. As long as this is kept in mind, and indicator data is presented and made available along side aggregate values, then aggregate (index) values can complement indicators and be used as a useful tool for summarising the underlying data, and identifying overall trends and patterns.

## 10.1 Weighting

Many aggregation methods involve some kind of weighting, i.e. coefficients that define the relative weight of the indicators/aggregates in the aggregation. In order to aggregate, weights need to first be specified, but to effectively adjust weights it is necessary to aggregate.

This chicken and egg conundrum is best solved by aggregating initially with a trial set of weights, perhaps equal weights, then seeing the effects of the weighting, and making any weight adjustments necessary. For this reason, weighting is dealt with in the following chapter on Weighting.

## 10.2 Approaches

### 10.2.1 Means

The most straightforward and widely-used approach to aggregation is the **weighted arithmetic mean**. Denoting the indicators as \(x_i \in \{x_1, x_2, ... , x_d \}\), a weighted arithmetic mean is calculated as:

\[ y = \frac{1}{\sum_{i=1}^d w_i} \sum_{i=1}^d x_iw_i \]

where the \(w_i\) are the weights corresponding to each \(x_i\). Here, if the weights are chosen to sum to 1, it will simplify to the weighted sum of the indicators. In any case, the weighted mean is scaled by the sum of the weights, so weights operate relative to each other.

Clearly, if the index has more than two levels, then there will be multiple aggregations. For example, there may be three groups of indicators which give three separate aggregate scores. These aggregate scores would then be fed back into the weighted arithmetic mean above to calculate the overall index.

The arithmetic mean has “perfect compensability”, which means that a high score in one indicator will perfectly compensate a low score in another. In a simple example with two indicators scaled between 0 and 10 and equal weighting, a unit with scores (0, 10) would be given the same score as a unit with scores (5, 5) – both have a score of 5.

An alternative is the **weighted geometric mean**, which uses the product of the indicators rather than the sum.

\[ y = \left( \prod_{i=1}^d x_i^{w_i} \right)^{1 / \sum_{i=1}^d w_i} \]

This is simply the product of each indicator to the power of its weight, all raised the the power of the inverse of the sum of the weights.

The geometric mean is less compensatory than the arithmetic mean – low values in one indicator only partially substitute high values in others. For this reason, the geometric mean may sometimes be preferred when indicators represent “essentials”. An example might be quality of life: a longer life expectancy perhaps should not compensate severe restrictions on personal freedoms.

A third type of mean, in fact the third of the so-called Pythagorean means is the **weighted harmonic mean**. This uses the mean of the reciprocals of the indicators:

\[ y = \frac{\sum_{i=1}^d w_i}{\sum_{i=1}^d w_i/x_i} \]

The harmonic mean is the the least compensatory of the the three means, even less so than the geometric mean. It is often used for taking the mean of rates and ratios.

### 10.2.2 Other methods

The *weighted median* is also a simple alternative candidate. It is defined by ordering indicator values, then picking the value which has half of the assigned weight above it, and half below it. For *ordered* indicators \(x_1, x_2, ..., x_d\) and corresponding weights \(w_1, w_2, ..., w_d\) the weighted median is the indicator value \(x_m\) that satisfies:

\[ \sum_{i=1}^{m-1} w_i \leq \frac{1}{2}, \: \: \text{and} \sum_{i=m+1}^{d} w_i \leq \frac{1}{2} \]

The median is known to be robust to outliers, and this may be of interest if the distribution of scores across indicators is skewed.

Another somewhat different approach to aggregation is to use the Copeland method. This approach is based pairwise comparisons between units and proceeds as follows. First, an *outranking matrix* is constructed, which is a square matrix with \(N\) columns and \(N\) rows, where \(N\) is the number of units.

The element in the \(p\)th row and \(q\)th column of the matrix is calculated by summing all the indicator weights where unit \(p\) has a higher value in those indicators than unit \(q\). Similarly, the cell in the \(q\)th row and \(p\)th column (which is the cell opposite on the other side of the diagonal), is calculated as the sum of the weights unit where \(q\) has a higher value than unit \(p\). If the indicator weights sum to one over all indicators, then these two scores will also sum to 1 by definition. The outranking matrix effectively summarises to what extent each unit scores better or worse than all other units, for all unit pairs.

The Copeland score for each unit is calculated by taking the sum of the row values in the outranking matrix. This can be seen as an average measure of to what extent that unit performs above other units.

Clearly, this can be applied at any level of aggregation and used hierarchically like the other aggregation methods presented here.

In some cases, one unit may score higher than the other in all indicators. This is called a *dominance pair*, and corresponds to any pair scores equal to one (equivalent to any pair scores equal to zero).

The percentage of dominance pairs is an indication of robustness. Under dominance, there is no way methodological choices (weighting, normalisation, etc.) can affect the relative standing of the pair in the ranking. One will always be ranked higher than the other. The greater the number of dominance (or robust) pairs in a classification, the less sensitive country ranks will be to methodological assumptions. COINr allows to calculate the percentage of dominance pairs with an inbuilt function.

## 10.3 Aggregation in COINr

Many will be amazed to learn that the function to aggregate in COINr is called `aggregate()`

. First, let’s build the ASEM data set up to the point of normalisation, then aggregate it using the default approaches.

```
library(COINr6)
# Build ASEM data up to the normalisation step
# Assemble data and metadata into a COIN
<- assemble(IndData = COINr6::ASEMIndData, IndMeta = COINr6::ASEMIndMeta, AggMeta = COINr6::ASEMAggMeta)
ASEM ## -----------------
## Denominators detected - stored in .$Input$Denominators
## -----------------
## -----------------
## Indicator codes cross-checked and OK.
## -----------------
## Number of indicators = 49
## Number of units = 51
## Number of aggregation levels = 3 above indicator level.
## -----------------
## Aggregation level 1 with 8 aggregate groups: Physical, ConEcFin, Political, Instit, P2P, Environ, Social, SusEcFin
## Cross-check between metadata and framework = OK.
## Aggregation level 2 with 2 aggregate groups: Conn, Sust
## Cross-check between metadata and framework = OK.
## Aggregation level 3 with 1 aggregate groups: Index
## Cross-check between metadata and framework = OK.
## -----------------
# Denominate as specified in metadata
<- denominate(ASEM, dset = "Raw")
ASEM # Normalise using default method: min-max scaled between 0 and 100.
<- normalise(ASEM, dset = "Denominated")
ASEM
# Now aggregate using default method: arithmetic mean, using weights found in the COIN
<- aggregate(ASEM, dset = "Normalised")
ASEM
# show last few columns of aggregated data set
$Data$Aggregated[(ncol(ASEM$Data$Aggregated)-5): ncol(ASEM$Data$Aggregated)]
ASEM## # A tibble: 51 x 6
## Environ Social SusEcFin Conn Sust Index
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 71.0 67.0 65.7 50.4 67.9 59.1
## 2 55.9 86.3 53.5 54.8 65.3 60.0
## 3 56.4 54.0 62.5 42.4 57.6 50.0
## 4 76.0 54.8 55.5 45.9 62.1 54.0
## 5 61.9 59.5 31.1 45.5 50.8 48.2
## 6 53.7 61.1 72.0 48.9 62.3 55.6
## 7 72.5 79.8 62.7 51.0 71.7 61.4
## 8 35.7 68.3 70.1 49.5 58.0 53.8
## 9 52.9 82.1 63.8 48.2 66.3 57.2
## 10 66.5 60.8 54.2 47.0 60.5 53.7
## # ... with 41 more rows
```

By default, the aggregation function performs the following steps:

- Uses the weights that were attached to
`IndMeta`

and`AggMeta`

in`assemble()`

- Aggregates hierarchically (with default method of weighted arithmetic mean), following the index structure specified in
`IndMeta`

and using the data specified in`dset`

- Creates a new data set
`.$Data$Aggregated`

, which consists of the data in`dset`

, plus extra columns with scores for each aggregation group, at each aggregation level.

Like other COINr functions `aggregate()`

has arguments `dset`

which specifies which data set to aggregate, and `out2`

which specifies whether to output an updated COIN, or a data frame. Unlike other COINr functions however, `aggregate()`

only works on COINs, because it requires a number of inputs, including data, weights, and the index structure.

The data set outputted by `aggregate()`

simply adds the aggregate columns onto the end of the data set that was input. This means that aggregate columns will always appear at the end. To more easily see the results of the index, COINr’s `getResults()`

function rearranges the aggregated data set into a better format.

```
# generate simple results table, attach to COIN
<- getResults(ASEM, tab_type = "Summary", out2 = "COIN")
ASEM # view the table
$Results$SummaryScores |>
ASEMhead(10)
## NULL
```

This function, which is explained more in Visualising results, sorts the aggregated data and extracts the most relevant columns. It has options to control sorting and which columns to display.

The types of aggregation supported by `aggregate()`

are specified by `agtype`

from the following options:

`agtype = arith_mean`

uses the arithmetic mean, and is the default.`agtype = geom_mean`

uses the geometric mean. This only works if all data values are positive, otherwise it will throw an error.`agtype = harm_mean`

uses the harmonic mean. This only works if all data values are non-zero, otherwise it will throw an error.`agtype = median`

uses the weighted median`agtype = copeland`

uses the Copeland method. This may take a few seconds to process depending on the number of units, because it involves pairwise comparisons across units.`agtype = mixed`

a different aggregation method for each level. In this case, aggregation methods are specified as any of the previous options using the`agtype_bylevel`

argument.`agtype = custom`

allows you to pass a custom aggregation function.

For the last option here, if `agtype = custom`

then you need to also specify the custom function via the `agfunc`

argument. As an example:

```
# define an aggregation function which is the weighted minimum
<- function(x,w) min(x*w, na.rm = TRUE)
weightedMin
# pass to aggregate() and aggregate
<- aggregate(ASEM, dset = "Normalised", agtype = "custom",
ASEM agfunc = weightedMin, out2 = "df")
```

Any custom function should have the form `functionName <- function(x,w)`

, i.e. it accepts x and w as vectors of indicator values and weights respectively, and returns a scalar aggregated value. Ensure that `NA`

s are handled (e.g. set `na.rm = T`

) if your data has missing values, otherwise `NA`

s will be passed through to higher levels.

A number of sophisticated aggregation approaches and linked weighting methods are available in the compind package. These can nicely complement the features in COINr and may be of interest to those looking for more technical approaches to aggregation, such as those based on benefit of the doubt methods.

A further argument, `agtype_bylevel`

, allows specification of different normalisation types at different aggregation levels. For example, `agtype_bylevel = c("arith_mean", "geom_mean", "median")`

would use the arithmetic mean at the indicator level, the geometric mean at the pillar level, and the median at the sub-index level (for the ASEM data structure).

Alternative weights can also be used for aggregation by specifying the `agweights`

argument. This can either be:

`NULL`

, in which case it will use the weights that were attached to IndMeta and AggMeta in GII_assemble (if they exist), or- A character string which corresponds to a named list of weights stored in
`.$Parameters$Weights`

. You can either add these manually or through`rew8r`

(see chapter on Weighting). E.g. entering`agweights = "Original"`

will use the original weights read in on assembly. This is equivalent to`agweights = NULL`

. - A data frame of weights to use in the aggregation. See Weighting for the format of weight data frames.

A final option of `aggregate()`

is the possibility to set missing data limits for aggregation. By default, `aggregate()`

will produce an aggregate score as long as at least one value is available within the aggregation group. For example, for an aggregation group we might have five indicators. A unit might have the following:

```
## Ind1 Ind2 Ind3 Ind4 Ind5
## 1 NA 3 NA NA NA
```

The arithmetic mean of this would still be 3. But the unit has 80% missing data for this aggregation group, so calculating an aggregate score may seem a bit disingenuous. The `avail_limit`

argument helps here: it gives the possibility to set a minimum data availability threshold for each level of the index, such that if data availability is below the threshold (for a particular unit and aggregation group), it returns `NA`

(for that unit and group) instead of an aggregated score. This is probably good practice in any composite indicator where data availability is an issue: some units/countries may have fairly high data availability overall, but have low data availability for particular groups.

Now that the data has been aggregated, a natural next step is to explore the results. This is dealt with in the chapter on [Results visualisation].

This recent paper may be of interest: https://doi.org/10.1016/j.envsoft.2021.105208↩︎