Introduction

...

Four tables are presented below. The first covers quality issues common to both community-generated and curated datasets, the second covers quality issues unique to community-generated datasets, the third covers quality issues unique to curated datasets, and the fourth lists the potential data quality issues that are checked as part of the HDX Quality Assurance Framework (QAF).

A mindmap graphic visualizes these issues:

...

Table 1. Common Dataset Quality Assurance Procedures 


| ID | Potential Quality Issue | When to check | How to check | Corrective action |
|---|---|---|---|---|
| 101 | Unrelated item in related items tab | When a related item is added | Using a script, identify datasets with a related item and evaluate the related item for relevance | Delete related item; engage responsible user |
| 102 | Dataset metadata missing or incomplete | When dataset is made public; when dataset is revised; during routine data QA process | Manual or scripted evaluation of the dataset | Engage org admin |
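The scripted evaluation mentioned for issue 102 could look something like the following minimal sketch. It assumes each dataset is available as a dictionary of metadata fields; the field names in `REQUIRED_FIELDS` and the function name `missing_metadata` are hypothetical and would need to be replaced with whatever the QAF actually requires.

```python
# Hypothetical list of required metadata fields (an assumption, not taken
# from the QAF); adjust to the fields the platform actually mandates.
REQUIRED_FIELDS = ("title", "notes", "dataset_source", "license_id", "methodology")


def missing_metadata(dataset: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or blank in a dataset record."""
    missing = []
    for field in required:
        value = dataset.get(field)
        # Treat a missing key, None, and a whitespace-only string as incomplete.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


example = {"title": "Flood extent", "notes": "  ", "license_id": "cc-by"}
print(missing_metadata(example))  # ['notes', 'dataset_source', 'methodology']
```

A non-empty result would then lead to the corrective action listed for issue 102 (engage the org admin).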



| ID | Potential Quality Issue | When to check | How to check | Corrective action |
|---|---|---|---|---|
| 201 | Dataset has no resources (files or links) | When dataset is made public; during routine data QA process | Manual or scripted count of the number of resources in the dataset | Make dataset private; engage org admin |
| 202 | Dataset has a broken resource link | When dataset is made public; during routine data QA process | Manual or scripted check of resource link | Engage org admin; check for 201 |
| 203 | Dataset contains no relevant humanitarian data | When dataset is made public | Manual evaluation of the data | Make dataset private; engage org admin |
| 204 | Dataset contains test data | When dataset is made public | Manual evaluation of the data | Make dataset private; engage org admin |
| 205 | Dataset contains sensitive data (PII, DII, CII) | When dataset is made public | Manual evaluation of the data | Make dataset private or remove dataset from platform; engage org admin; refer for management review |
| 206 | Dataset contains inappropriate or otherwise objectionable content | When dataset is made public | Manual evaluation of the data | Make dataset private; revoke user editing privileges; refer for management review; engage org admin |
| 207 | Dataset contains the COD tag | When dataset is made public | Manual or scripted check for the COD tag in the data | Remove the tag; engage the data provider |
| 208 | Dataset contains individual survey data | When dataset is made public | Manual evaluation of the data, with a special check for PII, DII or CII; look for data dictionary | Make dataset private; engage the data provider |
| 209 | Dataset contains a PDF resource that is not considered metadata | When dataset is made public | Manual or scripted check for PDF document | Remove PDF, or make dataset private if dataset is no longer viable; engage data provider |
| 210 | Dataset source is nonexistent or unclear | When dataset is made public | Manual evaluation of the dataset | Make dataset private; engage data provider |
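Several of the checks above are described as scriptable (201, 207 and 209). The following is a minimal sketch of how they might be combined, assuming each dataset is a dictionary with a `resources` list (each entry having a `format`) and a `tags` list (each entry having a `name`), as in CKAN-style package records. The function name `check_dataset` and the literal tag value `cod` are assumptions made for illustration. Check 202 is omitted because it needs a network request to each resource link.

```python
def check_dataset(dataset: dict) -> list:
    """Return the IDs of the scripted checks (201, 207, 209) that flag this dataset."""
    issues = []
    resources = dataset.get("resources") or []

    # 201: dataset has no resources (files or links).
    if not resources:
        issues.append(201)

    # 207: dataset carries the COD tag (tag name "cod" is an assumption).
    tag_names = {
        str(tag.get("name", "")).strip().lower()
        for tag in dataset.get("tags") or []
    }
    if "cod" in tag_names:
        issues.append(207)

    # 209: dataset has a PDF resource. A script can only flag the PDF;
    # deciding whether it counts as metadata still needs manual review.
    if any(str(r.get("format", "")).strip().lower() == "pdf" for r in resources):
        issues.append(209)

    return issues


sample = {"resources": [{"format": "PDF"}], "tags": [{"name": "COD"}]}
print(check_dataset(sample))  # [207, 209]
print(check_dataset({}))      # [201]
```

Each returned ID maps back to the corrective actions in the table, so the output can serve as a worklist for the routine data QA process.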







...