Monday, May 10, 2010

More on data

For those who missed it: the World Bank has released a large chunk of its licensed data under its Open Data Initiative. More than 2000 cross-national variables were made available in one big bang about three weeks ago. Datasets can now be downloaded in XLS format (some files are larger than 50 MB), quite a treat for a chiffrephile. (The release came less than a week before the death of economic historian Angus Maddison.)

While all this is very exciting, it is striking that the World Bank has not released all of its CPIA assessments (only those for the most recent years). These Country Policy and Institutional Assessment datasets are heavily disputed, as they are based on subjective evaluations by "experts and policy makers" and hence may often suffer from systematic biases (Lawrence King from Cambridge, for example, assessed the EBRD privatization indices and showed that they were significantly biased in favor of establishing a positive link between mass privatization and growth in Eastern Europe). The CPIA, in particular, has been extensively (ab)used in Paul Collier's cross-country regressions (which have also been critiqued by Easterly). Collier had access to the CPIA as a former World Bank researcher, but access to particularly problematic data should be granted to everyone in order to ensure transparency and replicability, two main characteristics of a rigorous scientific method.

Get the World Bank data here.
