Category

Description

(1) Comprehensive:

GEO_COOR (regex)

HXL_TAGS (regex)

PROTECTION_GROUP

RELIGIOUS_GROUP

SEXUALITY

SPOKEN_LANGUAGE

Static. No updates needed unless errors or omissions are found. 

Example: SPOKEN_LANGUAGE will not need to be updated unless rare or endangered languages are found to be missing.


(2) Comprehensive in context:

DISABILITY_GROUP

EDUCATION_LEVEL

MARITAL_STATUS

Detection accuracy depends on key terms appearing in the correct context.

Example: “single” is not exclusively a marital status, just as “primary” is not always an education level.
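
As a rough illustration of this context dependence, the sketch below accepts a key term only when a related word appears within a few tokens of it. The term lists, context words, window size, and the function name match_in_context are all hypothetical and are not taken from our actual configuration. In practice a built-in mechanism such as hotword rules (if DLP here means Google Cloud DLP) would likely do this job.

```python
import re

# Hypothetical key terms per infoType (illustrative only).
KEY_TERMS = {
    "MARITAL_STATUS": {"single", "divorced", "widowed"},
    "EDUCATION_LEVEL": {"primary", "secondary", "tertiary"},
}

# Hypothetical context words that make a key term credible.
CONTEXT_WORDS = {
    "MARITAL_STATUS": {"marital", "married", "spouse", "civil"},
    "EDUCATION_LEVEL": {"education", "school", "schooling", "grade"},
}


def match_in_context(text, window=3):
    """Return (info_type, term) pairs for key terms that have a
    context word within `window` tokens on either side."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        for info_type, terms in KEY_TERMS.items():
            if token not in terms:
                continue
            nearby = set(tokens[max(0, i - window): i + window + 1])
            if CONTEXT_WORDS[info_type] & nearby:
                hits.append((info_type, token))
    return hits
```

With these placeholder lists, "marital status: single" yields a MARITAL_STATUS hit, while "single room occupancy" yields none.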

(3) Not comprehensive: 

OCCUPATION

HH_ATTRIBUTES

HDX_HEADERS

Difficult to capture all possibilities upfront; may need updates as more datasets are scanned. 

Example: “child_headed”, “families headed by children”, and “hohh child” all express the same household attribute; different data contributors may have their own versions. 
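
One way to picture how such a list grows is a lookup from known variants to a single canonical attribute, extended whenever a scan turns up a new spelling. This is only a sketch: the mapping, the normalization rule, and the helper names (normalize, build_lookup, canonical_attribute, add_variant) are assumptions made for illustration, not part of our DLP setup.

```python
import re

# Hypothetical mapping: canonical household attribute -> known variants.
HH_ATTRIBUTE_VARIANTS = {
    "child_headed": [
        "child_headed",
        "families headed by children",
        "hohh child",
    ],
}


def normalize(text):
    """Lowercase and collapse punctuation, underscores, and spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_lookup(variants):
    """Flatten the variants mapping into normalized variant -> canonical."""
    return {
        normalize(variant): canonical
        for canonical, variant_list in variants.items()
        for variant in variant_list
    }


def canonical_attribute(header, lookup):
    """Return the canonical attribute for a header, or None if unknown."""
    return lookup.get(normalize(header))


def add_variant(variants, canonical, new_variant):
    """Record a newly observed variant; rebuild the lookup afterwards."""
    variants.setdefault(canonical, []).append(new_variant)
```

A header that returns None is a candidate for review. If it expresses an existing attribute, add_variant records it and the lookup is rebuilt.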

Over time, we will refine our use of DLP based on its performance. This process will involve adding, updating, or removing custom infoTypes across these three categories to improve the detection of different forms of sensitive data. 
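
The add, update, and remove cycle could be sketched as operations on a list of custom infoType definitions. The example assumes DLP here refers to Google Cloud DLP and uses plain dictionaries shaped like the custom_info_types field of its inspect configuration; the field names should be checked against the API reference. The helper names, word lists, and coordinate pattern are placeholders, not our actual definitions.

```python
def dictionary_info_type(name, words):
    """Custom infoType backed by a word list (assumed DLP-style shape)."""
    return {
        "info_type": {"name": name},
        "dictionary": {"word_list": {"words": list(words)}},
    }


def regex_info_type(name, pattern):
    """Custom infoType backed by a regular expression (assumed shape)."""
    return {"info_type": {"name": name}, "regex": {"pattern": pattern}}


def upsert(custom_info_types, entry):
    """Add an infoType, replacing any existing entry with the same name."""
    name = entry["info_type"]["name"]
    kept = [e for e in custom_info_types if e["info_type"]["name"] != name]
    return kept + [entry]


def remove(custom_info_types, name):
    """Drop the infoType with the given name, if present."""
    return [e for e in custom_info_types if e["info_type"]["name"] != name]


# Placeholder values for illustration only.
custom = []
custom = upsert(custom, dictionary_info_type("OCCUPATION", ["farmer", "teacher"]))
custom = upsert(
    custom,
    regex_info_type("GEO_COOR", r"-?\d{1,3}\.\d+\s*,\s*-?\d{1,3}\.\d+"),
)
# Update: a later scan surfaces a missing occupation term.
custom = upsert(
    custom, dictionary_info_type("OCCUPATION", ["farmer", "teacher", "trader"])
)
inspect_config = {"custom_info_types": custom}
```

Keeping the definitions as data in one place makes each refinement a small, reviewable change.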