World Bank data infrastructure: shortening the path from data to insights

Data is not valuable in a vacuum. Data is only valuable once information, insight, or in other words knowledge, is extracted from it and used to make decisions, shape policies, and change behaviors.

Data scientists, analysts, and researchers spend a significant amount of time and effort extracting knowledge from data and communicating it. Because extracting knowledge from data can be expensive, it is important to find ways to reduce its cost. A robust and well-designed data infrastructure can contribute to this cost reduction by reducing the friction at each step of a data analytics project: storing, searching, accessing, understanding, cleaning, transforming, analyzing, and visualizing data. Lowering that cost can go a long way toward increasing data use and knowledge production.

Making data assets easier to access, understand, and use has been, and continues to be, a main focus of the World Bank Data Group. The World Development Indicators (WDI) is a good example of the Data Group’s work in this area. The World Bank’s Development Data Hub (DDH), which currently houses the WDI and over 10,000 other datasets, was designed and implemented for the same purpose, to ensure that World Bank data assets are:

  • Easy to store – By providing a safe place to keep development data that has been collected or procured
  • Easy to find – By providing a search engine indexing all World Bank development data
  • Easy to access – By providing direct download and API access services
  • Easy to understand – By publishing rich metadata along with each dataset
  • Easy to use – By leveraging existing standards for data and metadata
  • Easy to combine – By nudging data producers to use existing standards when possible (for instance, using ISO3 country codes in addition to country names to make mapping easier)
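
To make the "easy to combine" point concrete, here is a minimal sketch of why a shared ISO3 code helps when merging datasets from different producers. The two datasets, the `indicator_a` and `indicator_b` fields, and the `join_on` helper are hypothetical and exist only for illustration; the values are made-up placeholders, not real statistics, and nothing here reflects how the DDH itself is implemented.

```python
# Two hypothetical datasets from different producers.
# All indicator values are made-up placeholders, not real statistics.
dataset_a = [
    {"country": "Côte d'Ivoire", "iso3": "CIV", "indicator_a": 1.0},
    {"country": "Viet Nam", "iso3": "VNM", "indicator_a": 2.0},
]
dataset_b = [
    {"country": "Cote d'Ivoire", "iso3": "CIV", "indicator_b": 10.0},
    {"country": "Vietnam", "iso3": "VNM", "indicator_b": 20.0},
]


def join_on(key, left, right):
    """Inner-join two lists of dicts on the given key (hypothetical helper)."""
    # Index the right-hand rows by key for constant-time lookup.
    right_index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_index.get(row[key])
        if match is not None:
            # Fields from the left row win when both rows share a field name.
            joined.append({**match, **row})
    return joined


# Country names are spelled differently by each producer, so nothing matches.
by_name = join_on("country", dataset_a, dataset_b)

# ISO3 codes are identical across producers, so every row matches.
by_iso3 = join_on("iso3", dataset_a, dataset_b)

print(len(by_name), len(by_iso3))
```

Joining on the country name silently drops both rows because of spelling differences, while joining on the ISO3 code keeps them, which is the kind of friction a shared standard removes.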

Because knowledge dissemination should also be cheap, the products derived from World Bank data assets should be easy to share, publish, and disseminate. Existing tools such as the World Bank Tableau server or the RStudio Connect server enable World Bank data professionals to share the results of their analysis with their clients, colleagues, or the entire world in just a few clicks.

Finally, this process of knowledge extraction should be transparent and easily reproducible. The World Bank GitHub account allows World Bank data professionals to easily share their code and collaborate with other researchers within and outside the Bank.

In the same way a robust transport infrastructure shortens the path from point A to point B, the World Bank Data Group aims to continuously shorten the path from data to insight by building a stronger data infrastructure. 

 
