File Naming Convention

 

A standardized naming convention provides an organized framework for datasets and ensures interoperability between users and platforms. (For information products, see Naming Convention for Information Products below.)

Naming Convention for Datasets  


The naming convention for datasets (which are used to generate information products but are not usually information products in and of themselves) is different from the naming convention for information products.  This section describes the naming convention for datasets, including geodata and other datasets. 

There are six elements to the naming convention, each separated by an underscore (_).  Optional elements are denoted by brackets: []. They are as follows:

ISO3_Code+DataType_SubCode_[Scale]_Source_[Additional Description]

where:

  • ISO3: The first part of the naming convention consists of the ISO3 code.  For example: wrl, afg, alb, etc.  Additional codes can be created for transnational datasets and are not limited to 3 characters.  Example: hoa for horn of Africa. [OCHA taxonomy reference source]

  • Code + Data type: The feature code as defined in the Dataset Naming Code Table (See below) followed by the first letter of the data type where:


    • a = polygon

    • l = arc

    • p = point

    • t = text

    • r = raster (can be omitted for data where Code = image)

  • Sub-Code (if applicable): The sub-code as defined in the Dataset Naming Code Table (see below).  For example, for political boundaries sub-codes include: adm1, adm2, adm3, etc.

  • Scale (optional parameter, omitted for tabular data): The denominator for the scale of the dataset in the following form:


    • Example 1 – 1:1,000,000 = 1m

    • Example 2 – 1:250,000 = 250k

    • Example 3 – scale not known or of mixed scales (should be documented in metadata) = unk

    • Example 4 – scale not applicable for this dataset (such as utm zone boundaries or tabular data) = na (or omitted)

    • Example 5 – for raster data, this parameter is the nominal pixel size in kilometers, meters or cm = 30m, 130cm

  • Source: The acronym or short version of the source of the data. 


    • Example 1 – United Nations Cartographic Section = uncs

    • Example 2 – Government of Guinea = govgin

  • Additional Description (optional parameter):  This is a place holder for additional metadata that may make sense for a given type of dataset, such as:


    • a grid designator that may be used with datasets such as scanned toposheets or image datasets where the data is split into different files

    • a date stamp for data where the specific date of publication is critical (such as humanitarian profile or other frequently published datasets)

    • other metadata as needed

    • IMPORTANT NOTE: if the datasets are referenced by filename in other files (as is common with MXD files), adding the date to the file name will often break the referring file when the date (and therefore the filename) changes.
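
The element rules above can be sketched in code. The following Python is a minimal illustration, not an official OCHA tool: the names `format_scale` and `build_dataset_name` are hypothetical, the feature code `ppl` is inferred from the `bdi_pplp_1m_gov` example later on this page, and writing out denominators that are not multiples of 1,000 in full is an assumption. Raster pixel sizes (such as 30m or 130cm) would be passed in as ready-made strings.

```python
# Illustrative sketch only; function and parameter names are hypothetical.

# Letter appended to the feature code for each data type, as listed above.
DATA_TYPE_LETTERS = {"polygon": "a", "arc": "l", "point": "p", "text": "t", "raster": "r"}


def format_scale(denominator=None):
    """Turn a scale denominator into its short form, e.g. 1000000 -> '1m'.

    None stands for an unknown or mixed scale and yields 'unk'.
    """
    if denominator is None:
        return "unk"
    if denominator % 1_000_000 == 0:
        return f"{denominator // 1_000_000}m"
    if denominator % 1_000 == 0:
        return f"{denominator // 1_000}k"
    # Assumption: any other denominator is written out in full.
    return str(denominator)


def build_dataset_name(iso3, code, data_type, source,
                       sub_code=None, scale=None, extra=None):
    """Join the elements with underscores, skipping omitted optional ones."""
    parts = [iso3, code + DATA_TYPE_LETTERS[data_type]]
    if sub_code:
        parts.append(sub_code)
    if scale:
        parts.append(scale)
    parts.append(source)
    if extra:
        parts.append(extra)
    return "_".join(parts).lower()


print(format_scale(250_000))  # 250k
print(build_dataset_name("bdi", "ppl", "point", "gov",
                         scale=format_scale(1_000_000)))  # bdi_pplp_1m_gov
```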

Special Case 1: Two Datasets having the same naming convention

In the case where two datasets have the same name and there is insufficient time to clean the data and merge them into one dataset (see Data Cleaning in the Geodata Preparation Manual), numbers are used to differentiate between the two datasets, and the differences are specified in the metadata title and abstract until the data can be combined into one.  The numbers run in ascending order from the dataset at the lowest detail to the dataset at the highest detail.  Consider the following:

Suppose there are two sets of population data for a particular country: one has the population of major cities and the other has the population of small towns. The data for major cities are labeled with a “1” and the data for small towns with a “2”.

  • Dataset 1: Major cities in Burundi from Government of Burundi at 1:1M scale

  • Dataset 2: Cities in Burundi from Government of Burundi at 1:1M scale

Dataset Names (interim solution):

  • Dataset 1: bdi_pplp1_1m_gov

  • Dataset 2: bdi_pplp2_1m_gov

Feature Class Name (long term solution):

  • Combine the two feature classes into one using guidance from Verifying Geometry. The resulting label would be: bdi_pplp_1m_gov.
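
The interim numbering can be illustrated with a short Python sketch. The function name `numbered_names` is hypothetical, and the sketch assumes the Code+DataType element is the second underscore-separated part of the name (so it does not apply to names that carry an area element as described in Special Case 2).

```python
def numbered_names(base_name, count):
    """Return interim names for `count` datasets that share `base_name`.

    The number is appended to the Code+DataType element (assumed to be
    the second part of the name); 1 goes to the least detailed dataset.
    """
    parts = base_name.split("_")
    names = []
    for n in range(1, count + 1):
        numbered = parts.copy()
        numbered[1] = f"{parts[1]}{n}"
        names.append("_".join(numbered))
    return names


print(numbered_names("bdi_pplp_1m_gov", 2))
# ['bdi_pplp1_1m_gov', 'bdi_pplp2_1m_gov']
```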

Special Case 2: Data do not span an entire country or region

In the case where the dataset only covers part of a country, administrative names are used to differentiate between administrations and city names are used to differentiate between urban areas.  See example below:

Datasets not covering an entire country:

  • Dataset 1: IDP Camps in Aceh, Indonesia

  • Dataset 2: IDP Camps in the Afgooye Corridor, Somalia

Resulting Dataset Names:

  • Dataset 1: idn_aceh_cmpp_idp_1m_unhcr

  • Dataset 2: som_afgooye_cmpp_idp_1m_unhcr
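
Judging from the two examples above, the area name sits directly after the ISO3 code. A minimal Python sketch of that placement follows; the function name `add_area` is hypothetical, and restricting the area to a single alphanumeric word is an assumption made so that the underscore separators stay unambiguous.

```python
def add_area(name, area):
    """Insert a short area name directly after the ISO3 code."""
    if not area.isalnum():
        # Assumption: one alphanumeric word, so underscores stay unambiguous.
        raise ValueError("area must be a single alphanumeric word")
    iso3, rest = name.split("_", 1)
    return f"{iso3}_{area.lower()}_{rest}"


print(add_area("idn_cmpp_idp_1m_unhcr", "Aceh"))
# idn_aceh_cmpp_idp_1m_unhcr
```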

File Naming Within Geodatabases


The naming of datasets (feature classes) within a Geodatabase is identical to the scheme defined above.  A geodatabase feature class, shapefile, and KML representation of the same dataset would have the same name (exclusive of the file extension). However, for geodatabases, the file naming convention must also define the names of the geodatabase and feature datasets which contain the feature classes. An example geodatabase can be found in the folder structure.

Geodatabase name: The geodatabase is named with the ISO3 code (the three-letter International Organization for Standardization country code) of the country or region of interest. For example: wrl, afg, alb, etc. [OCHA taxonomy reference source]

Feature dataset name: Feature datasets are objects that are used to group together related feature classes. There are two parts to the feature dataset name naming convention, each separated by an underscore (_). They are as follows:

  • ISO3 Code – As with the geodatabase name, the first part of the naming convention consists of the ISO3 code.  For example: wrl, afg, alb, etc.

  • Topic – The topic corresponds to the folder in the data structure where the data would reside in flat file formats (shapefile, kml, xls, etc.). These topics can be found in the dataset naming codes table and in the folder structure.

Feature class name: Feature classes are named using the dataset naming standard described above, as if they were shapefiles.
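
The three naming levels can be summarized in a short Python sketch. The function name `geodatabase_names` is hypothetical, and the topic `population` is only a placeholder, since the actual topic names come from the dataset naming codes table and the folder structure.

```python
def geodatabase_names(iso3, topic, feature_class):
    """Return the name used at each level of the geodatabase."""
    return {
        "geodatabase": iso3,
        "feature_dataset": f"{iso3}_{topic}",
        "feature_class": feature_class,  # same name as the shapefile would have
    }


# "population" is a placeholder topic, not a confirmed entry in the codes table.
names = geodatabase_names("bdi", "population", "bdi_pplp_1m_gov")
print(names["feature_dataset"])  # bdi_population
```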

Dataset Naming Codes Table


This table provides some of the codes and sub-codes to be used for naming datasets as described in the File Naming Convention. The list is not exhaustive; it aims to cover datasets commonly held by OCHA.  Additional codes will almost certainly be needed by a country office to handle datasets particular to the local situation.  If you would like advice on generating codes for other datasets, or if you have identified codes that you think will be useful for other offices, please contact ISS in Geneva.
* Sub-codes denoted by an asterisk (*) should ideally be part of a single, more general dataset, where the features are differentiated in the attribute table and not through a separate feature dataset or feature class.

Naming Convention for Information Products


OCHA Field Map names are made of four parts separated by an underscore: 

  1. The catalog number, if in use (a good practice for catalog numbers is to have a three letter code for the country office and a sequential number) 

  2. A short map name (e.g. somalia_3w) 

  3. The paper size (A4, A3, A0, etc.)

  4. The date of publication in YYYYMMDD format. 

Examples: 

  • SUM001_aceh_reference_map_a4_20050128

  • LBN001_Lebanon_reference_map_20081029

  • template_sample_a4_20080917
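
The four parts can be assembled as in the following Python sketch. The function name `build_map_name` is hypothetical. Lowercasing the paper size follows the examples above, and treating the paper size as optional is an assumption based on the second example, which omits it.

```python
from datetime import date


def build_map_name(short_name, published, paper_size=None, catalog_number=None):
    """Join catalog number, short name, paper size and date with underscores."""
    parts = []
    if catalog_number:
        parts.append(catalog_number)
    parts.append(short_name)
    if paper_size:
        parts.append(paper_size.lower())  # examples use a4 rather than A4
    parts.append(published.strftime("%Y%m%d"))  # YYYYMMDD
    return "_".join(parts)


print(build_map_name("aceh_reference_map", date(2005, 1, 28), "A4", "SUM001"))
# SUM001_aceh_reference_map_a4_20050128
```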
