Guidance

Record information about data sets you share with others

Using metadata to make it easier to catalogue, validate, reuse and share your data.

When you create a spreadsheet, CSV file or other tabular data, you should create a record with information about your data and store it with your data. This information is called metadata. By doing this, you will:

  • make your data searchable
  • find it easier to catalogue and validate your data
  • make sure your data is accessible and reusable - often your data is reused even when you do not expect it to be

Refer to the guide on publishing your tabular data if you're making your data open. All CSV files should comply with the Tabular data standard.

Who should use this guidance

Use this guidance if you are creating any data in tabular form that you intend to share. Data, in this instance, refers to data sets collected, used and maintained for analytics or for providing government services. It does not refer to finished documents.

You should use this guidance if your government organisation does not currently have metadata guidance for you to use. This guidance will become part of a collection to assist those already working with metadata.

Do not follow this guidance if you are creating, maintaining or managing metadata for geospatial data (data that is referenced to a location on the surface of the Earth). You should use metadata for spatial data sets, including those covered by the . You can also refer to the open standards profiles on 'Exchange of location point' and 'Identifying property and street information' for more details.

Using metadata in government

By following this guidance, you will be using a consistent metadata vocabulary, which will improve interoperability across government. The metadata vocabulary in this guidance uses the Open Standards of schema.org and Dublin Core, which are both recommended for government use.

If you are intending to publish your data, you should also read 'Publishing tabular data'.

Where to record and store your metadata

When recording metadata, it's useful to store it close to, or with, the data it describes.

You can do this by storing metadata:

  • within the data spreadsheet, on a separate tab
  • in a separate file, such as a readme file, keeping a record of the link between the data and its metadata
  • in a Metadata Catalogue if your government organisation has one
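As one illustration of the separate-file option, the sketch below writes metadata to a JSON file saved next to the data file. The '.metadata.json' suffix, the 'describes' key and the file name 'commuting.csv' are assumptions made for this example and are not part of the guidance.

```python
import json
import tempfile
from pathlib import Path


def write_sidecar(data_path: Path, metadata: dict) -> Path:
    """Write metadata next to the data file and record which file it describes."""
    # Hypothetical naming convention: <data file name>.metadata.json
    sidecar = data_path.with_name(data_path.name + ".metadata.json")
    # 'describes' is a hypothetical key that links the metadata back to the data.
    record = dict(metadata, describes=data_path.name)
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


# Usage: write to a temporary folder and read the record back.
with tempfile.TemporaryDirectory() as folder:
    path = write_sidecar(
        Path(folder) / "commuting.csv",
        {"name": "GDS London Office Employees office commuting tendencies"},
    )
    saved = json.loads(path.read_text(encoding="utf-8"))

print(path.name)
print(saved["describes"])
```

Keeping the data file name inside the metadata record is one way to keep the link between the two files if they are later moved together.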

When publishing your data, you will need to consider where you store your metadata depending on the types of data you are publishing and how findable you want your metadata to be. Read our guidance on 바카라 사이트˜Publishing tabular data바카라 사이트™ to understand more about how you publish metadata.

Making metadata machine readable and accessible

To make metadata machine readable and accessible, you must format your metadata in a specific way. For example, use camelCase, which is the practice of writing phrases so that there are no spaces between words and each word after the first begins with a capital letter, as in 'dateCreated'.
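The camelCase convention can be sketched in a few lines of Python. The function name 'to_camel_case' is an assumption for this example, and the sketch treats spaces, hyphens and underscores as word separators.

```python
import re


def to_camel_case(phrase: str) -> str:
    """Convert a phrase separated by spaces, hyphens or underscores to camelCase."""
    words = [w for w in re.split(r"[\s_\-]+", phrase.strip()) if w]
    if not words:
        return ""
    first, rest = words[0].lower(), words[1:]
    # Capitalise the first letter of every word after the first.
    return first + "".join(w[:1].upper() + w[1:].lower() for w in rest)


print(to_camel_case("date created"))       # dateCreated
print(to_camel_case("Temporal coverage"))  # temporalCoverage
```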

When recording your metadata, make sure you use plain English and follow the writing for GOV.UK guide. For example, do not use jargon, and make sure you define technical terms and expand acronyms. Try to avoid using symbols that users might misinterpret.

When you do not have the information you need to record, you can still add the metadata, but add "unknown" when relevant.

Metadata you should record

You should record information that will help others:

  • understand where your data came from and when it was collected - use 'creator' and 'dateCreated' to record who created the data set and the date they created it

  • find the data you've saved on a shared network, and identify whether it's the data set they need - use 'name', 'description' and 'identifier' to describe your data

  • validate the data you've collected - use 'expires' and 'supersededBy' so users know which version of your data to use, 'temporalCoverage' to indicate the time period to which your data applies, and 'conformsTo' to tell users whether your file conforms to a specific standard or schema

  • use the data you've collected appropriately - use 'hasDigitalDocumentPermission' to make sure users do not share sensitive data in ways they should not, and 'license' to help users understand their rights to use the data you've collected

  • understand the structure and format of your CSV tabular data - use the and read our guidance on 'Publishing tabular data' to get started
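To show how these properties might sit together, here is a minimal sketch of a metadata record built in Python. The property names and example values come from this guidance. The choice of which properties to include, the function name 'build_record' and the use of JSON are assumptions for this example.

```python
import json

# Property names come from this guidance; which ones to include in every
# record is an assumption made for this sketch.
RECOMMENDED_PROPERTIES = [
    "name", "description", "identifier", "creator", "dateCreated",
    "temporalCoverage", "conformsTo", "license", "hasDigitalDocumentPermission",
]


def build_record(known: dict) -> dict:
    """Return a record with every listed property, using 'unknown' for gaps."""
    return {prop: known.get(prop, "unknown") for prop in RECOMMENDED_PROPERTIES}


record = build_record({
    "name": "GDS London Office Employees office commuting tendencies",
    "identifier": "362857580",
    "creator": "Data Standards Authority team data-standards-authority@digital.cabinet-office.gov.uk",
    "dateCreated": "2002-10-02",
})
print(json.dumps(record, indent=2))
```

Any property left out of the input is recorded as "unknown", so the record still lists every property.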

Try to avoid recording any metadata that includes personal data. If you include personal data, you will need to comply with the principles, rights and obligations contained in GDPR. You can read the for more information.

Recording dates in your metadata

You must record any dates using the ISO 8601 standard, which is an Open Standard selected for use by the government.

This means listing the date and time elements in descending order of size (years, months, days, hours, minutes, seconds, milliseconds and microseconds). You should provide the right level of accuracy for your data set. If you publish your data set once a year, it might be enough to provide a date down to the day, such as 2020-07-14. If you publish multiple times a day, it is better to include information down to the second, such as 2020-07-14T12:57:03Z.
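Both levels of precision can be produced with Python's standard 'datetime' module. This is a minimal sketch: the function names are assumptions for this example, and the timestamp function assumes you want the time expressed in UTC (the 'Z' suffix).

```python
from datetime import date, datetime, timezone


def to_iso_date(day: date) -> str:
    """Format a date to day-level precision, for example 2020-07-14."""
    return day.isoformat()


def to_iso_utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime to the second, in UTC, with a 'Z' suffix."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_iso_date(date(2020, 7, 14)))
# 2020-07-14
print(to_iso_utc_timestamp(datetime(2020, 7, 14, 12, 57, 3, tzinfo=timezone.utc)))
# 2020-07-14T12:57:03Z
```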

Record the provenance of your data

Using 'creator' or 'contributor'

You should record who created a data set so users can communicate with the creator and understand if the data is relevant to them. For example, a data analyst may want to find out how reliable a data set is before undertaking any analysis.

Record a name for future reference, and an email address if possible. This name and email address should refer to:

  • the name of a team or organisation
  • a role within a team
  • an individual name in some cases - if you can do this while remaining GDPR compliant

For example, creator:"Data Standards Authority team data-standards-authority@digital.cabinet-office.gov.uk"

You can use 'contributor' instead if multiple organisations or teams are contributing to the data set. You can also use 'creator' and 'contributor' together for full clarity around where data has come from.

Using 'dateCreated'

You should record the date when you create a data set to help users of the data set know whether it is valid and relevant to them. You must record the date using the Open Standard ISO 8601.

For example, dateCreated:"2002-10-02"

You must capture the exact time a data set is collected when you're collecting more than one version of a data set a day.

Help users find, use and identify your data set

Using 'name'

You must include the name of your data set so users can find and identify the right data set.

You should try to make sure the name captures information that will help users determine whether the data set meets their needs. For example, capture the topic and specific information about place and geography.

For example, name:"GDS London Office Employees office commuting tendencies"

Using 'description'

You can add a description to your data set, in addition to the name, so that users of your data can find out if it's relevant to them.

The description of your data should only describe the type of data collected and should not include warnings about how to use the data - any warnings should be explained with the term 'accessRights'.

For example, description:"The amount GDS employees commute to the office and their busiest times to travel. This data also shows the tendencies of GDS employees to work from home"

Using 'identifier'

You should uniquely identify your data set so that users of your data know exactly which source they're using.

You should identify your data set by:

  • using the identification system your organisation is using (in cases where organisations have a system in place)

  • using a meaningless identifier you've created - this should be a random number rather than a sequential or semi-sequential one, to avoid implying meaning

Using a meaningless identifier avoids the misunderstanding that comes with applying meaning to identifiers. For example, meaning can change over time. Meaningless identifiers can remain genuinely constant.

For example, identifier:"362857580"

You can ensure this meaningless identifier stays unique by keeping a catalogue of all data sets with their identifiers.
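One way to create a random identifier and keep it unique is sketched below. The nine-digit length matches the example above but is otherwise an assumption, as are the function name 'new_identifier' and the use of an in-memory set to stand in for your catalogue.

```python
import secrets


def new_identifier(catalogue: set, digits: int = 9) -> str:
    """Return a random numeric identifier not already in the catalogue, and record it."""
    while True:
        # Pick a random number and pad it with leading zeros to a fixed length.
        candidate = str(secrets.randbelow(10 ** digits)).zfill(digits)
        if candidate not in catalogue:
            catalogue.add(candidate)
            return candidate


# The catalogue holds every identifier already in use.
catalogue = {"362857580"}
identifier = new_identifier(catalogue)
print(identifier)
```

Checking each candidate against the catalogue before using it means a repeated number is never issued twice.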

Using 'encodingFormat'

You should record the file format in which you store your data so users know how to use and import it.

Operating systems commonly use file extensions to decide which program to open a file with. Common file extensions include XLS for Excel spreadsheets and CSV.

For example, encodingFormat:"xls"

If you think you may publish your data set, you can also record the media type. Media types are used by browsers to decide how to present some data. For more information, read 'Publishing tabular data'.

A media type is also known as a Multipurpose Internet Mail Extensions (MIME) type. Mozilla .

For example, encodingFormat:"jpeg"

Help others validate your data

Using 'supersededBy'

When the data you're collecting replaces an older version, you should record this change to make sure users use the most up-to-date version.

You must only use 'supersededBy' when the data you're collecting has:

  • the same period of time and location as the older version of the spreadsheet or file

  • different content to the older version of the spreadsheet or file

The new version of the data will need its own unique URL or identifier.

For example, supersededBy:"/government/organisations/government-digital-service/about/v1"

You may also choose 'isRelatedTo' as a more generic term that can account for any kind of relationship between resources.

Using 'supersedes'

You can use 'supersedes' as an additional property, as this will allow users to understand the history or timeline of a document.

For example, supersedes:"/government/organisations/government-digital-service/about/v1"

Using 'expires'

If you're no longer using a particular data set, or it has been superseded by other data, you should record it as expired. You can do this by adding the date from which your data set is no longer valid. You should give any replacement data a new title and identifier.

Remember to revisit your data set and update its metadata when you become aware that it should no longer be used.

For example, expires:"2003-12-04"

Do not use this to record the period of time that applies to your data. In these cases, you should use 'temporalCoverage' instead.
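A user of your metadata could check the 'expires' date before using a data set. The sketch below assumes the metadata is held as a Python dictionary and that a data set counts as expired from the 'expires' date onwards. Both are assumptions for this example, as is the function name 'is_expired'.

```python
from datetime import date


def is_expired(metadata: dict, today: date) -> bool:
    """Return True if the record has an 'expires' date on or before today."""
    expires = metadata.get("expires")
    if not expires or expires == "unknown":
        # No expiry recorded, so treat the data set as still in use.
        return False
    return date.fromisoformat(expires) <= today


print(is_expired({"expires": "2003-12-04"}, date(2020, 8, 7)))   # True
print(is_expired({"expires": "2003-12-04"}, date(2003, 12, 3)))  # False
```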

Using 'temporalCoverage'

If you're collecting data over a range of dates, you should record this so users know the period that the content applies to. You should add this using the Open Standard ISO 8601.

For example, temporalCoverage:"2002-10-02/2013-01-01"

If your data does not have a specified end date, you can use ".." in place of the end date. This follows .

For example, temporalCoverage:"2020-10-02/.."
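The two forms of 'temporalCoverage' shown above can be built from dates as in this minimal sketch. The function name 'temporal_coverage' and the check that the end date is not before the start date are assumptions for this example.

```python
from datetime import date
from typing import Optional


def temporal_coverage(start: date, end: Optional[date] = None) -> str:
    """Build a 'start/end' interval; an open-ended range uses '..' as the end."""
    if end is not None and end < start:
        raise ValueError("end date must not be before start date")
    end_text = end.isoformat() if end is not None else ".."
    return f"{start.isoformat()}/{end_text}"


print(temporal_coverage(date(2002, 10, 2), date(2013, 1, 1)))  # 2002-10-02/2013-01-01
print(temporal_coverage(date(2020, 10, 2)))                    # 2020-10-02/..
```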

Using 'conformsTo'

You should tell users whether your file conforms to a specific standard or schema so they can easily validate it. This could be the CSV on the Web schema or RFC4180 standard.

For example, conformsTo:""

Your department, agency or local authority might also use a particular schema for specific types of data collection, and you may want to record this. For example, the data standards for publishing brownfield land registers.

Make sure your data is used appropriately

Using 'license'

For protected data such as personal, sensitive or commercial data, you should record information that will help users of the data understand its terms and conditions.

You may want to include the relevant data sharing agreement, legal regulation or certification. This could be a memorandum of understanding (MOU) or Data Protection Impact Assessment.

The open standards vocabularies schema.org and Dublin Core both spell the noun 'licence' using the American spelling 'license'. You should use 'license' for consistency.

For example, license:"Memorandum of Understanding between the Charity Commission for England and Wales and the Office for Students"

When publishing open data, you should label the data you've collected with its licence for use. In many cases within government, this will be the . You should also link to the licence file to explain what the licence means and how others can use your code and content.

For example, license:

Using 'hasDigitalDocumentPermission'

You should record the sensitivity of your data so it's not shared or published in ways it should not be.

You should provide information about who should be able to access the data you've collected, and any restrictions including:

  • whether it's open or restricted or protected

  • the handling caveat for the data

  • the security classification of data

For example, hasDigitalDocumentPermission:"restricted access"

Updates to this page

Published 7 August 2020
