Metadata schema for Queensland Government data assets guideline

Document type: Guideline
Version: v1.0.0
Status: Current; Non-mandated
Owner: Data and Information Services, QGCDG
Effective: September 2023–current
Security classification: OFFICIAL-Public
Category: Information

Introduction

Purpose

This guideline provides information and advice for Queensland Government departments to consider when capturing metadata. It draws on the Metadata Management Principles and aims to help agencies develop consistent metadata practices by giving operational guidance on those principles. This guideline is not currently a mandatory component of policy and is provided for information only.

Background

Metadata is the data that provides information about digital assets. The adoption of a consistent approach to metadata for describing Queensland Government data assets will enhance discoverability, interoperability, usability and sharing of government data assets.

The Intergovernmental Agreement on Data Sharing was signed by First Ministers at National Cabinet on 9 July 2021 and commits all jurisdictions to share public sector data as a default position, where it can be done securely, safely, lawfully and ethically. The role that metadata plays in improving the discoverability of data has been recognised with the inclusion of a System Reform initiative focused on metadata in the inaugural Work Program under the IGA.

Additionally, in 2022 the Queensland Government Enterprise Architecture (QGEA) team conducted a survey of Queensland Government’s Digital Leadership Group (DLG) to identify the greatest challenges and needs to enable better data, digital and ICT service delivery. ‘Integrating data from multiple sources’ was identified as a major challenge. The DLG also identified ‘data sharing and interoperability’ as a top priority and the need for more guidance regarding standards to improve data discovery, sharing and interoperability across agencies and platforms.

In recognition of the importance of data discovery, integration, interoperability and sharing, a Metadata Schema has been developed to describe data assets being collected and stored in government agencies. This will make it faster and easier for government employees to locate the data they need. Using the same metadata attributes to describe government data assets across agencies will facilitate future advances such as single-point access to federated search, discovery and consumption of harmonised metadata extracted from data catalogues hosted by different agencies, and will ultimately contribute to a national metadata repository (where authorised).

This data discovery Metadata Schema, described below, has been informed by previous, related efforts, including:

  • the Data Catalog Vocabulary (DCAT) [1]
  • ISO/IEC 11179-7:2019 Information technology — Metadata registries [2]
  • the ONDC Core Metadata Attributes developed through the Data Champions Network [3]
  • the DDMM Data and Analytics Working Group System Reform initiative, Improving data discoverability using metadata [link to ADN when available].

General guidance

Metadata profile

The metadata schema includes six core attributes which are critical for data management and discovery. When implemented, they also facilitate the process for consumers to submit data access requests to the custodian. They are:

  • Identifier
  • Title
  • Description
  • Creator
  • Point of Contact
  • Keyword.

Further detail on these attributes can be found below. At a minimum, these core attributes will be adopted by agencies to enable discovery of their data assets. Where possible, controlled vocabularies will also be adopted to maximise consistency and enable more effective searching. These vocabularies will use existing published standards where available and appropriate. Defining a minimum set of core attributes is a pragmatic approach that aims to balance data maturity, metadata collection effort and the associated benefits. The set may evolve over time to include additional core attributes as deemed beneficial.

As data custodianship matures, more agencies would be expected to also provide metadata for the five recommended attributes. They are:

  • Location
  • Temporal coverage from
  • Temporal coverage to
  • Access URL
  • Access rights.

Further detail on these attributes can be found below.

A further set of 16 additional attributes supports richer metadata capture. When combined with the core attributes listed above, they promote advanced search and discoverability and help identify relationships between data assets. The optional attributes are:

  • Date modified
  • Update frequency
  • License
  • Publisher
  • Date published
  • Security classification
  • File size
  • Format
  • Data attributes
  • Spatial resolution
  • Temporal resolution
  • Version
  • Purpose
  • Legal authority
  • Provenance
  • Data quality.

Further detail on these attributes can be found below.

In the tables below, the first column gives the human-readable name of the attribute. The second column describes the attribute and its relevant usage. The third column specifies how to format the metadata and includes values to use. The final column gives the cardinality of the attribute, which indicates how many values can be entered for that attribute.
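The cardinality values used in the tables below (1, 1-n, 0-1 and 0-n) can be read as a minimum and a maximum number of values. The following is a minimal sketch of that reading, not part of the guideline itself; the function names `parse_cardinality` and `satisfies` are hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple


def parse_cardinality(spec: str) -> Tuple[int, Optional[int]]:
    """Return (minimum, maximum) for a spec such as "1", "0-1", "1-n" or "0-n".

    A maximum of None means there is no upper limit.
    """
    low, _, high = spec.strip().partition("-")
    minimum = int(low)
    if not high:
        # A bare number such as "1" means exactly that many values.
        return minimum, minimum
    maximum = None if high.lower() == "n" else int(high)
    return minimum, maximum


def satisfies(spec: str, values: list) -> bool:
    """Check whether the number of supplied values fits the cardinality."""
    minimum, maximum = parse_cardinality(spec)
    count = len(values)
    return count >= minimum and (maximum is None or count <= maximum)
```

Under this reading, a Keyword list with two entries fits "1-n", while an empty list does not fit "1".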

Attributes

Core

| Name | Description and guidance | Format | Cardinality |
| --- | --- | --- | --- |
| Identifier | Unique and persistent identifier to the metadata record. The identifier should be unique from other data assets. It is a vital element to ensure that a data asset can be located without confusion. | URL | 1 |
| Title | The name or title by which the data asset is known. The title should be unique, and can follow any naming convention used by your agency. | Free text | 1 |
| Description | A descriptive statement of the data asset. The description element is also searchable, therefore, carefully consider what keywords to use so that your audience can locate the data asset. | Free text | 1 |
| Creator | The agency or organisation who created the data asset. The agency that controls the data and has the right to deal with the data. | Controlled vocabulary | 1 |
| Point of Contact | The relevant contact from which information about the data asset can be obtained. The email address of an individual or a group email inbox. | AS 4590—2006 | 1 |
| Keyword | A keyword or tag describing the subject of the data asset. A key field that supports discovery. When selecting keywords, consider how your users will search for the data asset. | Controlled vocabulary | 1-n |
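As an illustration of the core attributes, the sketch below shows one possible in-memory record and a check that all six core attributes have been supplied. The snake_case key names, the example values and the `example.org` addresses are assumptions made for this example; the guideline does not prescribe a serialisation or field naming.

```python
# Core attribute names, written as snake_case keys (an assumption for this sketch).
CORE_ATTRIBUTES = (
    "identifier",
    "title",
    "description",
    "creator",
    "point_of_contact",
    "keyword",
)


def missing_core_attributes(record: dict) -> list:
    """Return the core attributes that are absent or empty in a record."""
    missing = []
    for name in CORE_ATTRIBUTES:
        value = record.get(name)
        if isinstance(value, str):
            value = value.strip()
        # None, empty strings and empty lists all count as "not supplied".
        if not value:
            missing.append(name)
    return missing


# Hypothetical record; none of these values refer to a real data asset.
example_record = {
    "identifier": "https://data.example.org/asset/0001",
    "title": "Example road condition survey",
    "description": "Illustrative record only.",
    "creator": "Example Department",
    "point_of_contact": "data-team@example.org",
    "keyword": ["roads", "transport"],
}
```

Calling `missing_core_attributes(example_record)` returns an empty list; removing the keywords or blanking the title would cause those attribute names to be reported.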

Recommended

| Name | Description | Format | Cardinality |
| --- | --- | --- | --- |
| Location | The geographic area the data asset applies to. | Controlled vocabulary for place names | 0-n |
| Temporal Coverage From | The start date of the period for which this data asset is applicable. Temporal coverage can be recorded as a date range or as a single datetime only. | ISO 8601 - date element interchange format e.g., YYYYMMDD | 0-1 |
| Temporal Coverage To | The end date of the period for which this data asset is applicable. The data asset may not have an end date. In this case, do not enter data into this field. | ISO 8601 - date element interchange format e.g., YYYYMMDD | 0-1 |
| Access URL | The file path and/or URL that gives access to a distribution of the resource. Can include an external resource (user authenticated) or internal system URL. In both cases, use a persistent link to connect to the data asset. | URL | 0-1 |
| Access Rights | A statement outlining access restrictions based on privacy, security, or other policies. | Free text e.g., “Access is restricted to govt employees only, subject to a signed data sharing agreement” | 0-1 |
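The temporal coverage attributes use the YYYYMMDD date form, and the end date may be left empty. A minimal sketch of how an agency might check these two fields is shown below; the function names are hypothetical, and the rule that the start date should not fall after the end date is an assumption of this example rather than a stated requirement.

```python
from datetime import date, datetime
from typing import Optional


def parse_basic_date(value: str) -> date:
    """Parse a date written as YYYYMMDD, rejecting other layouts."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return datetime.strptime(value, "%Y%m%d").date()


def coverage_is_consistent(
    coverage_from: Optional[str], coverage_to: Optional[str]
) -> bool:
    """Return True when the supplied dates parse and are in order.

    Either field may be None or empty, matching the 0-1 cardinality.
    """
    start = parse_basic_date(coverage_from) if coverage_from else None
    end = parse_basic_date(coverage_to) if coverage_to else None
    if start is None or end is None:
        return True
    return start <= end
```

A record with only a start date is treated as valid here, which reflects the guidance that a data asset may not have an end date.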

Additional

| Name | Description | Format | Cardinality |
| --- | --- | --- | --- |
| Date Modified | The most recent date the data asset was changed, updated or modified. | ISO 8601 - date element interchange format e.g., YYYYMMDD | 0-1 |
| Update Frequency | The frequency at which new or updated versions of this data asset are made available. | Controlled vocabulary | 0-1 |
| License | A legal document under which this data asset is made available. | Controlled vocabulary | 0-1 |
| Publisher | The name of the Government Agency responsible for making the data asset available. | Controlled vocabulary | 0-1 |
| Date Published | The date on which the data asset was formally released or made available. | ISO 8601 - date element interchange format e.g., YYYYMMDD | 0-1 |
| Security Classification | Security class of the data asset. The security classification applied to the data asset as specified by the Information security classification framework (QGISCF). | Controlled vocabulary | 0-1 |
| File Size | The size of the data asset in bytes. | For example, 3.2 MB | 0-1 |
| Format | The file format of the data distribution. | Controlled vocabulary | 0-n |
| Data Attributes | List of attributes or fields contained in the data asset; link to a data dictionary file; or link to a landing page. | Free text or URL | 0-n |
| Spatial Resolution | The spatial granularity of the data asset records. | Controlled vocabulary | 0-1 |
| Temporal Resolution | The temporal granularity of the data asset records. | Controlled vocabulary | 0-1 |
| Version | A version number or other version designation of the data asset. | For example: 2.01 | 0-1 |
| Purpose | A descriptive summary of the intentions behind the creation of the data asset. | Free text | 0-1 |
| Legal Authority | The legal mandate under which the data asset was collected, created, received, used or disclosed. This information can be sourced through the agency’s legal department. | Free text | 0-1 |
| Provenance | A statement about the lineage of a data asset. This is a secondary description field that can address why, when and how the data asset was generated to provide more context and trust. | Free text | 0-1 |
| Data Quality | A statement about the quality of the data asset. | Free text | 0-1 |

Implementation and evaluation

This metadata guideline is intended to be used by Queensland Government agencies to:

  • assist them in developing metadata for describing data assets that they host within their departments
  • map existing data asset metadata records to a common schema
  • develop common metadata repositories to support discovery across internal data catalogues
  • develop interoperable APIs to enable internal data catalogues to be programmatically searched and relevant data assets to be discovered and retrieved.
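Mapping existing metadata records to the common schema could be approached with a simple field map. The sketch below is illustrative only: the legacy field names in `FIELD_MAP`, the snake_case target names and the function `map_to_schema` are all hypothetical, and a real mapping would depend on each agency's existing catalogue.

```python
# Hypothetical legacy field names mapped to schema attribute names.
FIELD_MAP = {
    "dataset_id": "identifier",
    "name": "title",
    "summary": "description",
    "owner_agency": "creator",
    "contact_email": "point_of_contact",
    "tags": "keyword",
}


def map_to_schema(legacy_record: dict, field_map: dict = FIELD_MAP) -> tuple:
    """Split a legacy record into (mapped, unmapped) dictionaries."""
    mapped = {}
    unmapped = {}
    for key, value in legacy_record.items():
        target = field_map.get(key)
        if target is None:
            # Keep unknown fields so they can be reviewed rather than lost.
            unmapped[key] = value
        else:
            mapped[target] = value
    # Keyword allows several values, so split a comma-separated string into a list.
    if isinstance(mapped.get("keyword"), str):
        mapped["keyword"] = [
            part.strip() for part in mapped["keyword"].split(",") if part.strip()
        ]
    return mapped, unmapped
```

Returning the unmapped fields separately makes it easier to see which legacy fields have no counterpart in the schema and may need a manual decision.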

Contact datadiscovery@chde.qld.gov.au if you have any questions regarding the implementation of the above metadata schemas.

References