UNHCR popstats data
Summary: this article describes the per-country datasets on HDX derived from the UNHCR Population Statistics API, via the HXL Proxy.
Source data overview
On its population statistics site, UNHCR publishes six global datasets (mostly at the annual and country level):
- Persons of concern (movements of people from country to country)
- Time series (Persons of concern reformatted as wide data)
- Demographics (sex-and-age-disaggregated data about Persons of concern)
- Asylum seekers (status of asylum seekers by country of origin, location, and year)
- Asylum seekers (monthly) (raw number of asylum seekers by country of origin, location, and month)
- Resettlement (total number of refugees resettled, with or without UNHCR assistance, by country of origin, location, and year)
Each of these is downloadable as a (large) HXL-tagged dataset, including all available countries and years. Example: http://popstats.unhcr.org/en/persons_of_concern.hxl (103,000+ rows). UNHCR has approved this method of publishing their data.
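To illustrate what "HXL-tagged" means in practice, the sketch below reads a small CSV in the general HXL shape: a header row, then a row of hashtags, then data. The sample rows, column names, and hashtags are invented placeholders, not values copied from UNHCR, and read_hxl is a hypothetical helper, not part of any HXL library.

```python
import csv
import io

# Invented sample in the general shape of an HXL-tagged CSV.
# Headers, hashtags, and values are placeholders, not real UNHCR data.
SAMPLE = """Year,Country of residence,Origin,Total
#date+year,#country+residence,#country+origin,#affected+total
2015,Country A,Country B,100
2015,Country C,Country B,250
"""


def read_hxl(text):
    """Return (hashtags, data_rows) for HXL-tagged CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    for index, row in enumerate(rows):
        cells = [cell.strip() for cell in row]
        non_empty = [cell for cell in cells if cell]
        # Treat the first row whose non-empty cells all start with '#'
        # as the hashtag row; everything after it is data.
        if non_empty and all(cell.startswith("#") for cell in non_empty):
            return cells, rows[index + 1:]
    raise ValueError("no HXL hashtag row found")


hashtags, data = read_hxl(SAMPLE)
print(hashtags)
print(len(data), "data rows")
```

For real work, an HXL-aware library or the HXL Proxy is likely a better choice than hand-rolled parsing; this only shows the layout.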
HDX datasets
HDX shares these datasets as live data from the UNHCR popstats site, filtered through the HXL Proxy to produce two views for each country:
- Data for people originating from the country, e.g. UNHCR's populations of concern originating from Colombia
- Data for people residing in the country, e.g. UNHCR's populations of concern residing in Colombia
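The two views amount to filtering the same rows on two different columns. A minimal sketch, assuming the country of origin and country of residence are tagged #country+origin and #country+residence (assumed hashtags; check the actual UNHCR files), with invented placeholder rows and a hypothetical helper named country_views:

```python
def country_views(hashtags, rows, country,
                  origin_tag="#country+origin",
                  residence_tag="#country+residence"):
    """Split rows into (originating_from, residing_in) for one country.

    The default hashtags are assumptions made for this sketch.
    """
    origin_col = hashtags.index(origin_tag)
    residence_col = hashtags.index(residence_tag)
    originating = [row for row in rows if row[origin_col] == country]
    residing = [row for row in rows if row[residence_col] == country]
    return originating, residing


# Placeholder data, not real UNHCR figures.
hashtags = ["#date+year", "#country+residence", "#country+origin", "#affected+total"]
rows = [
    ["2015", "Country A", "Country B", "100"],
    ["2015", "Country B", "Country C", "40"],
    ["2015", "Country C", "Country A", "7"],
]

originating, residing = country_views(hashtags, rows, "Country B")
print(originating)
print(residing)
```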
The HXL Proxy filters the UNHCR datasets on demand (once per download request, with results cached for an hour), and the download links on HDX are direct calls to the Proxy, such as the following:
(You can also view the data recipe on HDX.) Since the data comes directly from UNHCR on each request, there is rarely any need to update the dataset definitions in HDX.
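As a rough illustration of what a direct call to the Proxy might look like, the sketch below assembles a URL with one row-selection filter. The endpoint and the parameter names (url, filter01, select-query01-01) are written from memory of the HXL Proxy's recipe syntax and should be treated as assumptions, as should the #country+origin hashtag; the data recipe on HDX is the authoritative version.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed Proxy endpoint; verify against the download links on HDX.
PROXY_ENDPOINT = "https://proxy.hxlstandard.org/data.csv"
SOURCE_URL = "http://popstats.unhcr.org/en/persons_of_concern.hxl"


def proxy_url(source_url, tag_pattern, value):
    """Build a Proxy URL that keeps rows where tag_pattern equals value.

    Parameter names are assumptions about the Proxy's query syntax.
    """
    params = {
        "url": source_url,
        "filter01": "select",
        "select-query01-01": f"{tag_pattern}={value}",
    }
    # urlencode percent-encodes '#', '+', and '=' inside the values.
    return PROXY_ENDPOINT + "?" + urlencode(params)


link = proxy_url(SOURCE_URL, "#country+origin", "Colombia")
print(link)

# Round-trip check: decode the query string back into its parameters.
decoded = parse_qs(urlsplit(link).query)
print(decoded["select-query01-01"])
```

Because the filter lives entirely in the query string, the same source URL can serve every country by changing only the selected value.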
Generating the datasets
The Python 3 script used to generate or update the datasets is in HDX's GitHub repository:
Before running the script, you will need to copy the config.py.TEMPLATE file to config.py and fill in the appropriate values.
Note that the script also depends on a Google Sheet containing a table that maps UNHCR country names to HDX (ISO3) country codes. The sheet is publicly readable and available at
https://docs.google.com/spreadsheets/d/1tHbzC8F79wQhpLos7Zw2qLQJI-UzccddDt0ds7R88F8/edit?usp=sharing
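A minimal sketch of how such a mapping table could be turned into a lookup dictionary. It assumes the sheet is fetched through Google Sheets' CSV export URL pattern and that the columns are headed "UNHCR name" and "ISO3"; both column headings are hypothetical, and the sample rows are illustrative rather than copied from the sheet. The actual script may load the sheet differently.

```python
import csv
import io

SHEET_ID = "1tHbzC8F79wQhpLos7Zw2qLQJI-UzccddDt0ds7R88F8"
# Common Google Sheets CSV export pattern (not fetched in this sketch).
EXPORT_URL = f"https://docs.google.com/spreadsheets/d/{SHEET_ID}/export?format=csv"

# Illustrative stand-in for the downloaded CSV; headings are assumptions.
SAMPLE = """UNHCR name,ISO3
Colombia,COL
Venezuela (Bolivarian Republic of),VEN
"""


def load_mapping(csv_text, name_col="UNHCR name", code_col="ISO3"):
    """Map UNHCR country names to upper-case ISO3 codes, skipping blank names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row[name_col].strip(): row[code_col].strip().upper()
        for row in reader
        if row[name_col].strip()
    }


mapping = load_mapping(SAMPLE)
print(mapping["Colombia"])
```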