Funded by the Institute for Data Valorization (IVADO), Canada.

The goal of COVID-19 Data Hub is to provide the research community with a unified dataset by collecting worldwide fine-grained case data, merged with exogenous variables helpful for a better understanding of COVID-19.

Download the data

All the data are provided at the download centre.

Unified dataset

The dataset includes an extensive list of epidemiological variables, several policy measures by Oxford’s government response tracker, and a set of external keys to match the data with Google and Apple mobility reports, with the Hydromet dataset, and with spatial databases such as Eurostat for Europe or GADM worldwide.
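As a rough illustration of how an external key can be used, the sketch below left-joins a few case records to a mobility table. Everything in it is hypothetical: the column names (`key_mobility`, `place_key`, `workplaces_change`), the identifiers, and the values are invented for the example and are not taken from the actual files. Check the real column names in the downloaded data before adapting it.

```python
# Hypothetical case records; "key_mobility" stands in for an external key column.
hub_rows = [
    {"id": "AAA", "date": "2020-03-01", "confirmed": 10, "key_mobility": "K1"},
    {"id": "BBB", "date": "2020-03-01", "confirmed": 3, "key_mobility": None},
]

# Hypothetical mobility records identified by the same key.
mobility_rows = [
    {"place_key": "K1", "date": "2020-03-01", "workplaces_change": -12},
]


def left_join(left, right, left_key, right_key, on_date="date"):
    """Attach the columns of `right` to each row of `left` that shares key and date."""
    # Index the right-hand table by (key, date) for constant-time lookups.
    index = {(r[right_key], r[on_date]): r for r in right}
    merged = []
    for row in left:
        match = index.get((row[left_key], row[on_date]), {})
        # Drop the join columns of the right-hand table to avoid duplicates.
        extra = {k: v for k, v in match.items() if k not in (right_key, on_date)}
        merged.append({**row, **extra})
    return merged


merged = left_join(hub_rows, mobility_rows, "key_mobility", "place_key")
```

Rows without a key, or without a matching mobility record, are kept unchanged, so no case data is lost in the merge.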

Software packages

We release R and Python packages to simplify the interaction with the Data Hub. In general, it is possible to import the data in any software by reading the CSV files provided at the download centre.
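A minimal sketch of reading such a CSV file without any dedicated package, using only the Python standard library. The sample content and the column names (`id`, `date`, `confirmed`, `deaths`) are assumptions made for illustration; the files at the download centre may be laid out differently.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded file; values are invented.
SAMPLE_CSV = """id,date,confirmed,deaths
AAA,2020-03-01,10,0
AAA,2020-03-02,15,1
BBB,2020-03-01,3,
"""


def read_rows(text_stream):
    """Parse CSV text into a list of dicts, turning empty cells into None."""
    reader = csv.DictReader(text_stream)
    return [
        {key: (value if value != "" else None) for key, value in row.items()}
        for row in reader
    ]


def latest_by_id(rows, value_column):
    """Return the most recent non-missing value of `value_column` for each id."""
    latest = {}
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for row in sorted(rows, key=lambda r: r["date"]):
        if row[value_column] is not None:
            latest[row["id"]] = int(row[value_column])
    return latest


rows = read_rows(io.StringIO(SAMPLE_CSV))
latest_confirmed = latest_by_id(rows, "confirmed")
```

To read a file saved from the download centre, pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object, where `path` is wherever the file was stored locally.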

Data transparency

The data acquisition pipeline is open source. All the code used to generate the data files can be found at our GitHub repository. In principle, one can use the function covid19 from the repository to generate the same data we provide at the download centre. However, this takes between 1 and 2 hours, so downloading the pre-computed files is typically more convenient. We also provide the full list of data sources from which the data are pulled.

Research reproducibility

Because most governments update their data retroactively, we provide vintage data to simplify the reproducibility of academic research. These are immutable snapshots of the data taken each day. We gratefully acknowledge financial support from the R Consortium for maintaining the vintage data.

Academic publications

See the publications that use COVID-19 Data Hub.

Latest news

  • 29/03/2022: The implementation details and the latest version of the data are described in “A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution”, Scientific Data (Nature).


If you find any issues with the data, please report a bug at our GitHub repository. Suggestions about where to find data that we do not currently provide are also very welcome! Help our project grow: star the repo!


Terms of use

By using COVID-19 Data Hub, you agree to our terms of use.


The project was initiated via the R package COVID19, developed by Emanuele Guidotti (University of Neuchâtel), and leveraged by David Ardia (HEC Montréal) through funding from IVADO. It has been enhanced by an awesome open source community and is maintained by Emanuele Guidotti.

Logo courtesy of Gary Sandoz and Talk-to-Me.

Supported by

R Consortium, IVADO, HEC Montréal, Hack Zurich, Università degli Studi di Milano