Funded by the Institute for Data Valorization (IVADO) in 2020. Supported by the R Consortium from 2021 to 2024. Funded by the University of Lugano (USI) in 2025.
The goal of COVID-19 Data Hub is to provide the research community with a unified dataset by collecting worldwide fine-grained case data, merged with exogenous variables helpful for a better understanding of COVID-19.
JOB OFFER (published November 5, 2024): Hiring a research assistant with a Master’s or higher degree to work on the COVID-19 Data Hub, starting ASAP! The position is funded by the University of Lugano, Switzerland. Possibility to work partially or fully remotely. Read more here.
All the data are provided at the download centre.
The dataset includes an extensive list of epidemiological variables, several policy measures by Oxford’s government response tracker, and a set of external keys to match the data with Google and Apple mobility reports, with the Hydromet dataset, and with spatial databases such as Eurostat for Europe or GADM worldwide.
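As a rough illustration of how an external key can be used, the sketch below joins a few Data Hub rows to a mobility table. It is a minimal example under stated assumptions, not the project's own code: the rows are synthetic, the column `key_google_mobility` is assumed to hold the identifier that the mobility report exposes as `place_id`, and `workplaces_change` is a made-up metric name.

```python
import csv
import io

# Synthetic stand-in for a Data Hub extract (values are invented for illustration).
HUB_CSV = """id,date,confirmed,key_google_mobility
aaa,2020-04-01,100,PLACE_A
bbb,2020-04-01,50,PLACE_B
ccc,2020-04-01,10,
"""

# Synthetic stand-in for a mobility report; `place_id` and `workplaces_change`
# are assumed column names, not taken from the original text.
MOBILITY_CSV = """place_id,date,workplaces_change
PLACE_A,2020-04-01,-40
PLACE_B,2020-04-01,-25
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_key(hub_rows, mobility_rows,
                hub_key="key_google_mobility", ext_key="place_id"):
    """Left-join hub rows to mobility rows on (external key, date)."""
    index = {(m[ext_key], m["date"]): m for m in mobility_rows}
    joined = []
    for row in hub_rows:
        match = index.get((row[hub_key], row["date"]))
        merged = dict(row)
        # Rows without a key (or without a match) keep None for the metric.
        merged["workplaces_change"] = match["workplaces_change"] if match else None
        joined.append(merged)
    return joined


joined = join_on_key(read_rows(HUB_CSV), read_rows(MOBILITY_CSV))
for r in joined:
    print(r["id"], r["confirmed"], r["workplaces_change"])
```

The join is a left join, so areas without a mobility key stay in the result with a missing metric instead of being dropped.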
The R and Python packages simplify the interaction with the Data Hub. More generally, the data can be imported into any software by reading the CSV files provided at the download centre.
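For readers who prefer to work with the CSV files directly, the following is a minimal sketch using only the Python standard library. It assumes the file has `id`, `date`, and `confirmed` columns and that `confirmed` is a cumulative count; the sample rows are synthetic, and the helper names `load_rows` and `daily_increments` are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_rows(fileobj):
    """Parse a CSV file object into a list of dicts, one per row."""
    return list(csv.DictReader(fileobj))


def daily_increments(rows, column="confirmed"):
    """Convert a cumulative column into increments per `id`.

    Rows with an empty value are skipped, so an increment may span more
    than one day when observations are missing. Dates are assumed to be
    ISO formatted, so sorting them as strings is chronological.
    """
    by_id = defaultdict(list)
    for row in rows:
        if row[column] != "":
            by_id[row["id"]].append((row["date"], float(row[column])))
    result = {}
    for key, series in by_id.items():
        series.sort()
        result[key] = [
            (date, value - previous)
            for (date, value), (_, previous) in zip(series[1:], series[:-1])
        ]
    return result


# Synthetic sample standing in for a downloaded file.
SAMPLE = """id,date,confirmed
x1,2020-03-01,10
x1,2020-03-02,15
x1,2020-03-03,
x1,2020-03-04,30
"""

increments = daily_increments(load_rows(io.StringIO(SAMPLE)))
print(increments["x1"])
```

With a real file, `load_rows` can be given any text file object, for example one opened with `open(path, newline="")` on a CSV saved from the download centre.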
The data acquisition pipeline is open source. All the code used to generate the data files can be found at our GitHub repository. In principle, one can use the function covid19 from the repository to generate the same data available at the download centre. However, this takes between 1 and 2 hours, so downloading the pre-computed files is typically more convenient. The full list of data sources from which the data are pulled is available here.
As most governments are updating the data retroactively, we provide vintage data to simplify reproducibility of academic research. These are immutable snapshots of the data taken each day. We gratefully acknowledge financial support by the R Consortium in maintaining the vintage data.
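To see why snapshots matter, the sketch below compares two hypothetical vintages and lists the values that were revised between them. It is only an illustration: the snapshots are small synthetic dictionaries keyed by `(id, date)`, and `revised_values` is a hypothetical helper, not part of the Data Hub packages.

```python
def revised_values(old_snapshot, new_snapshot):
    """Return {(id, date): (old, new)} for entries in both snapshots that differ."""
    return {
        key: (old_snapshot[key], new_snapshot[key])
        for key in old_snapshot.keys() & new_snapshot.keys()
        if old_snapshot[key] != new_snapshot[key]
    }


# Synthetic cumulative counts keyed by (id, date), as seen on two different days.
snapshot_day_1 = {
    ("x1", "2020-05-30"): 100,
    ("x1", "2020-05-31"): 120,
}
snapshot_day_2 = {
    ("x1", "2020-05-30"): 100,
    ("x1", "2020-05-31"): 125,  # revised retroactively
    ("x1", "2020-06-01"): 140,  # new observation, not a revision
}

changes = revised_values(snapshot_day_1, snapshot_day_2)
print(changes)
```

An analysis run against the latest data would silently pick up the revised figure, whereas one pinned to a fixed snapshot keeps returning the value that was available on that day.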
The first version of the project is described in “COVID-19 Data Hub”, Journal of Open Source Software, 2020. The implementation details and the latest version of the data are described in “A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution”, Scientific Data, Nature, 2022. You can browse the publications that use COVID-19 Data Hub here and here. Please cite our paper(s) when using COVID-19 Data Hub.
If you find any issues with the data, please report them at our GitHub repository.
By using COVID-19 Data Hub, you agree to our terms of use.