The Humanitarian Data Exchange (HDX) is adding a new mandatory metadata field called Expected Update Frequency. It replaces a previous optional field, Update Frequency, and its purpose is to tell us how often datasets shared through the site are likely to be updated.

...

We are introducing a new set of features to HDX based on the concept of "Data Freshness". We are interested in assessing how current the data within each dataset is, because we want the portal to become more useful for consumers of the data: one important metric we can give them is how up to date the datasets are. Imagine a large walk-in freezer in a restaurant. Delivery staff fill it with new products, akin to how contributors add new datasets to HDX. Cooks look inside for the items they need and mix them in various tasty ways. Analogously, users find datasets in HDX and combine the data for analysis. Foodstuffs can be safely stored in the freezer for different periods of time. If no one checks, the caterers may use stale ingredients, so there needs to be a method to keep track of the contents and, if anything is too old, to order replacements. Given the choice, chefs would like to use the freshest produce, and similarly we want users to have access to the most up-to-date data in HDX, particularly since it holds over 4,000 datasets. We want to help data providers oversee their data and keep it current, particularly where update processes are manual, and make it easy for people to find data that is actively maintained by contributors.

How do we determine Data Freshness?

...

Fields in HDX related to Data Freshness

| Field | Description | Purpose |
| --- | --- | --- |
| data_update_frequency | Dataset expected update frequency | Shows how often the data is expected to be updated, or at least checked to see if it needs updating |
| revision_last_updated | Resource last modified date | Indicates the last time the resource was updated, irrespective of whether it was a major or minor change |
| dataset_date | Dataset date | The date referred to by the data in the dataset. It changes when data for a new date comes to HDX, so it may not need to change for minor updates |
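
To illustrate how the first two fields could be combined, here is a minimal sketch that compares the time since a resource was last modified with the expected update frequency. It assumes the frequency is expressed as a number of days. The function name, the status labels and the "twice the expected interval" threshold are hypothetical choices made for this example, not the rules HDX applies.

```python
from datetime import datetime, timedelta


def freshness_status(last_updated, update_frequency_days, now):
    """Classify a dataset by comparing its age with its expected update interval.

    Assumption: update_frequency_days is a positive number of days.
    Zero or negative values are treated here as "no regular schedule".
    """
    if update_frequency_days <= 0:
        return "not applicable"
    age = now - last_updated
    expected = timedelta(days=update_frequency_days)
    if age <= expected:
        return "fresh"      # updated within the expected interval
    if age <= 2 * expected:
        return "due"        # hypothetical grace period of one extra interval
    return "overdue"


now = datetime(2017, 3, 1)
last_updated = datetime(2017, 2, 20)  # nine days before "now"
print(freshness_status(last_updated, 30, now))
print(freshness_status(last_updated, 7, now))
print(freshness_status(last_updated, 3, now))
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check repeatable when it is run over many datasets.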

What challenges do we face?

...

We are drawing on research being done on data freshness at Vienna University. Specifically, the researchers are looking at estimating the next change time for a resource based on previous update history and applying a Markov chain approach. The research is still ongoing but we hope to learn from their results to enhance HDX.
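
As a rough illustration of the general idea, and not of the researchers' actual method, the sketch below fits a simple two-state, first-order Markov chain to a daily history in which 1 means the resource changed that day and 0 means it did not. It then estimates the expected number of days until the next change. The daily encoding and the function names are assumptions made for this example.

```python
import math


def transition_probabilities(history):
    """Estimate P(tomorrow's state | today's state) from a 0/1 daily history."""
    counts = {0: [0, 0], 1: [0, 0]}
    for today, tomorrow in zip(history, history[1:]):
        counts[today][tomorrow] += 1
    probs = {}
    for state, row in counts.items():
        total = sum(row)
        # A state that was never left gets zero probabilities (no evidence).
        probs[state] = [c / total for c in row] if total else [0.0, 0.0]
    return probs


def expected_days_to_next_change(history):
    """Expected days until the next change, starting from the last observed day."""
    if len(history) < 2:
        raise ValueError("need at least two daily observations")
    probs = transition_probabilities(history)
    p01 = probs[0][1]  # unchanged today -> changed tomorrow
    # From the unchanged state the waiting time is geometric with mean 1 / p01.
    wait_from_unchanged = 1 / p01 if p01 > 0 else math.inf
    if history[-1] == 0:
        return wait_from_unchanged
    p11 = probs[1][1]  # changed today -> changed again tomorrow
    if p11 == 1:
        return 1.0
    # One day passes; if no change occurs we are in the unchanged state.
    return 1 + (1 - p11) * wait_from_unchanged


# A resource that changes every third day.
history = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
print(expected_days_to_next_change(history))
```

A model this small ignores longer-range patterns such as monthly cycles, so it is only a starting point for thinking about how update history might inform an expected update date.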

...

Let us know what you think of this approach. Send feedback to hdx@un.org. Watch this space! There will be more coming on the subject of Data Freshness.