Each of the six aggregate WGI measures is constructed by averaging data from the underlying sources that correspond to the concept of governance being measured. This is done in three steps:
STEP 1: Assigning data from individual sources to the six aggregate indicators. Individual questions from the underlying data sources are assigned to each of the six aggregate indicators. For example, a firm survey question on the regulatory environment would be assigned to Regulatory Quality, and a measure of press freedom would be assigned to Voice and Accountability. The individual variables used in the WGI, and how they are assigned to the six aggregate indicators, can be found by clicking on the names of the six aggregate indicators listed here. Note that not all of the data sources cover all countries, so the aggregate governance scores are based on different sets of underlying data for different countries.
STEP 2: Rescaling of the individual source data to run from 0 to 1. The questions from the individual data sources are first rescaled to range from 0 to 1, with higher values corresponding to better outcomes. If, for example, a survey question asks for responses on a scale from a minimum of 1 to a maximum of 4, we rescale a score of 2 as (2-min)/(max-min)=(2-1)/3=0.33. When an individual data source provides more than one question relating to a particular dimension of governance, we average together the rescaled scores.
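The rescaling in Step 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the WGI production code: the function names `rescale` and `source_score` and the `higher_is_better` flag are hypothetical, and the flag assumes that questions where a higher raw value means a worse outcome are simply flipped.

```python
def rescale(score, lo, hi, higher_is_better=True):
    """Min-max rescale a raw score to the 0-1 range.

    lo and hi are the minimum and maximum of the question's response scale.
    higher_is_better is an assumed flag for flipping reversed scales.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = (score - lo) / (hi - lo)
    return scaled if higher_is_better else 1.0 - scaled


def source_score(responses):
    """Average the rescaled scores of several questions from one source.

    responses is a list of (score, lo, hi) tuples, one per question.
    """
    scaled = [rescale(score, lo, hi) for score, lo, hi in responses]
    return sum(scaled) / len(scaled)


# The example from the text: a score of 2 on a 1-to-4 scale.
print(round(rescale(2, 1, 4), 2))  # 0.33
```

A source with two relevant questions, say a 2 on a 1-to-4 scale and a 5 on a 0-to-10 scale, would be summarized as `source_score([(2, 1, 4), (5, 0, 10)])`.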
The 0-1 rescaled data from the individual sources are available interactively through the WGI website here, and in the data files for each individual source. A combined dataset containing the data from all sources in a single file is available in Excel and Stata formats. Although nominally in the same 0-1 units, these rescaled data are not necessarily comparable across sources. For example, one data source might use a 0-10 scale but in practice have most scores clustered between 6 and 10, while another data source might also use a 0-10 scale but have responses spread out over the entire range. While the max-min rescaling above does not correct for this source of non-comparability, the procedure used to construct the aggregate indicators does (see below).
STEP 3: Using an Unobserved Components Model to construct a weighted average of the individual indicators for each source. A statistical tool known as an Unobserved Components Model (UCM) is used to make the 0-1 rescaled data comparable across sources, and then to construct a weighted average of the data from each source for each country. The UCM assumes that the observed data from each source are a linear function of the unobserved level of governance, plus an error term. This linear function is different for different data sources, and so corrects for the remaining non-comparability of units of the rescaled data noted above. The resulting estimates of governance are a weighted average of the data from each source, with weights reflecting the pattern of correlation among data sources. Summary information on the weights applied to the component indicators can be downloaded in Excel and Stata formats. The parameter estimates for the UCM can also be downloaded in Excel and Stata formats.
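To make the weighted average concrete, the sketch below assumes one common way of writing such a model: the rescaled score from source k is y_k = alpha_k + beta_k * (g + e_k), where g is unobserved governance with a standard normal distribution and e_k is an error term with standard deviation sigma_k. Under that assumption, the estimate of g is a precision-weighted average of the source scores after each has been mapped back to common units. The function name `ucm_estimate` and all parameter values are hypothetical, and the exact specification should be checked against the methodology paper and the published parameter estimates.

```python
import math


def ucm_estimate(observations, params):
    """Combine rescaled source scores for one country into one estimate.

    observations: dict mapping source name -> rescaled 0-1 score.
    params: dict mapping source name -> (alpha, beta, sigma), the assumed
            intercept, slope, and error standard deviation of that source.
    Returns (estimate, standard_error) in standard normal units.
    """
    # Sources with smaller error variance (higher precision) get more weight.
    precisions = {k: 1.0 / params[k][2] ** 2 for k in observations}
    # The leading 1.0 is the precision of the standard normal prior on g.
    denom = 1.0 + sum(precisions.values())

    estimate = 0.0
    for source, y in observations.items():
        alpha, beta, _sigma = params[source]
        # Undo the source-specific linear function to reach common units.
        common_units = (y - alpha) / beta
        estimate += (precisions[source] / denom) * common_units

    standard_error = math.sqrt(1.0 / denom)
    return estimate, standard_error


# Hypothetical parameters for two illustrative sources.
params = {"source_a": (0.5, 0.2, 0.5), "source_b": (0.6, 0.1, 1.0)}
scores = {"source_a": 0.7, "source_b": 0.7}
print(ucm_estimate(scores, params))
```

Because the sum only runs over the sources present in `observations`, a country covered by fewer sources gets a larger standard error, consistent with scores being based on different sets of underlying data for different countries.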
The UCM assigns greater weight to data sources that tend to be more strongly correlated with each other. While this weighting improves the statistical precision of the aggregate indicators, it typically has little effect on the ranking of countries on the aggregate indicators. The composite measures of governance generated by the UCM are in units of a standard normal distribution, with mean zero and standard deviation of one, and run from approximately -2.5 to 2.5, with higher values corresponding to better governance. We also report the data in percentile rank terms, ranging from 0 (lowest rank) to 100 (highest rank).
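As an illustration of the percentile rank scale, the sketch below converts a set of governance estimates into ranks running from 0 for the lowest score to 100 for the highest. The function name `percentile_ranks`, the country labels, and the scores are hypothetical, and the linear spacing and the handling of ties are simplifying assumptions that may differ from the published WGI calculation.

```python
def percentile_ranks(estimates):
    """Map governance estimates to percentile ranks from 0 to 100.

    estimates: dict mapping country -> governance estimate.
    Ties are broken by sort order here, which is a simplification.
    """
    n = len(estimates)
    if n < 2:
        raise ValueError("need at least two countries to rank")
    # Order countries from the lowest estimate to the highest.
    ordered = sorted(estimates, key=estimates.get)
    # Position 0 maps to rank 0, position n-1 maps to rank 100.
    return {country: 100.0 * i / (n - 1) for i, country in enumerate(ordered)}


# Hypothetical estimates in standard normal units.
scores = {"Country A": -1.2, "Country B": 0.3, "Country C": 1.8}
print(percentile_ranks(scores))
# {'Country A': 0.0, 'Country B': 50.0, 'Country C': 100.0}
```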
Details on the aggregation procedure can be found in the WGI methodology paper. A full replication package is available here.