Record information about data sets you share with others
Using metadata to make it easier to catalogue, validate, reuse and share your data.
When you create a spreadsheet, CSV file or other data in , you should create a record with information about your data and store it with your data. This information is called metadata. By doing this, you will:
- make your data searchable
- find it easier to catalogue and validate your data
- make sure your data is accessible and reusable - often your data is reused even when you do not expect it to be
Refer to the guide on publishing your tabular data if you're making your data open. All CSV files should comply with the Tabular data standard.
Who should use this guidance
Use this guidance if you are creating any data in tabular form that you intend to share. Data, in this instance, refers to data sets collected, used and maintained for analytics or for providing government services. It does not refer to finished documents.
You should use this guidance if your government organisation does not currently have metadata guidance for you to use. This guidance will become part of a collection to assist those already working with metadata.
Do not follow this guidance if you are creating, maintaining or managing metadata for geospatial data (data that references a location on the surface of the Earth). You should use metadata for spatial data sets, including those covered by the . You can also refer to the open standards profiles on 'Exchange of location point' and 'Identifying property and street information' for more details.
Using metadata in government
By following this guidance, you will be using a consistent metadata vocabulary, which will improve interoperability across government. The metadata vocabulary in this guidance uses the open standards schema.org and Dublin Core, which are both recommended for government use.
If you intend to publish your data, you should also read 'Publishing tabular data'.
Where to record and store your metadata
When recording metadata, it's useful to store it close to, or with, the data it describes.
You can do this by storing metadata:
- within the data spreadsheet, in a separate tab
- in a separate file, such as a readme file, keeping a record of the link between the data and its metadata
- in a Metadata Catalogue if your government organisation has one
When publishing your data, you will need to consider where you store your metadata depending on the types of data you are publishing and how findable you want your metadata to be. Read our guidance on 'Publishing tabular data' to understand more about how you publish metadata.
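As a minimal sketch of the separate-file option, the Python below writes a metadata record as JSON next to the data file. The JSON format, the `.metadata.json` naming convention and the `describes` key are assumptions made for illustration; this guidance does not prescribe them.

```python
import json
from pathlib import Path


def write_sidecar_metadata(data_path: Path, metadata: dict) -> Path:
    """Write metadata as JSON next to the data file and return the new path."""
    # Hypothetical naming convention: "commuting.csv" -> "commuting.metadata.json"
    sidecar = data_path.with_name(data_path.stem + ".metadata.json")
    # Hypothetical "describes" key: records the link between data and metadata
    record = {"describes": data_path.name, **metadata}
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Keeping the file name of the data inside the record means the pairing survives even if the two files are later moved apart.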
Making metadata machine readable and accessible
To make metadata machine readable and accessible, you must format your metadata in a specific way. For example, use camelCase, which is the practice of writing phrases so that there are no spaces between words and each word after the first begins with a capital letter.
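The camelCase convention can be illustrated with a short Python sketch. The helper name `to_camel_case` is hypothetical and the function is only one possible way to apply the convention.

```python
import re


def to_camel_case(phrase: str) -> str:
    """Convert a phrase such as 'date created' to camelCase ('dateCreated')."""
    # Split on anything that is not a letter or digit, dropping empty pieces
    words = [w for w in re.split(r"[^A-Za-z0-9]+", phrase) if w]
    if not words:
        return ""
    # First word stays lower case, every later word starts with a capital
    first, rest = words[0].lower(), words[1:]
    return first + "".join(w[:1].upper() + w[1:].lower() for w in rest)
```

For instance, `to_camel_case("temporal coverage")` gives `temporalCoverage`, which matches the term used later in this guidance.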
When recording your metadata, make sure you use plain English and follow the writing for GOV.UK guide. For example, do not use jargon, and make sure you define technical terms and expand acronyms. Try to avoid using symbols that users might misinterpret.
When you do not have the information you need to record, you can still add the metadata, but add 'unknown' when relevant.
Metadata you should record
You should record information that will help others:
- be informed on where and when your data was collected - use 'creator' and 'dateCreated' to record who created the data set and the date they created it
- find the data you've saved on a shared network, and identify whether it's the data set they need - use 'name', 'description' and 'identifier' to describe your data
- validate the data you've collected - use 'expires' and 'supersededBy' so users know which version of your data to use, 'temporalCoverage' to indicate the time period to which your data applies, and 'conformsTo' to tell users whether your file applies to a specific standard or schema
- use the data you've collected appropriately - use 'hasDigitalDocumentPermission' to make sure users do not share sensitive data in ways it shouldn't be and 'license' to help users understand their rights to using the data you've collected
- understand the structure and format of your CSV tabular data - use the and read our guidance on 'Publishing tabular data' to get started
Try to avoid recording any metadata that includes personal data. If you include personal data, you will need to comply with the principles, rights and obligations contained in GDPR. You can read the for more information.
Recording dates in your metadata
You must record any dates using the ISO 8601 standard, which is an Open Standard selected for use by the government.
This means listing the date and time elements in descending order of size (years, months, days, hours, minutes, seconds, milliseconds and microseconds). You should provide the right level of accuracy for your data set. For example, if you publish your data set once a year, it might be enough to provide a date down to the day, for example, 2020-07-14. If you publish multiple times a day, it is better to include information down to the second, for example, 2020-07-14T12:57:03Z.
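As an illustration, Python's standard `datetime` module can produce both levels of precision mentioned above. This is a sketch only, and the variable names are chosen for the example.

```python
from datetime import date, datetime, timezone

# Day precision, suitable for a data set published once a year
published_on = date(2020, 7, 14)
day_precision = published_on.isoformat()

# Second precision in UTC, suitable for several publications a day
published_at = datetime(2020, 7, 14, 12, 57, 3, tzinfo=timezone.utc)
second_precision = published_at.strftime("%Y-%m-%dT%H:%M:%SZ")

print(day_precision)     # 2020-07-14
print(second_precision)  # 2020-07-14T12:57:03Z
```

The trailing `Z` marks the time as UTC, so the format string above should only be used with values that are already in UTC.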
Record the provenance of your data
Using 'creator' or 'contributor'
You should record who created a data set so users can communicate with the creator and understand if the data is relevant to them. For example, a data analyst may want to find out how reliable a data set is before undertaking any analysis.
Record a name for future reference, and an email address if possible. This name and email address should refer to:
- the name of a team or organisation
- a role within a team
- an individual name in some cases - if you can do this while remaining GDPR compliant
For example, creator:"Data Standards Authority team data-standards-authority@digital.cabinet-office.gov.uk"
You can use 'contributor' instead if multiple organisations or teams are contributing to the data set. You can also use 'creator' and 'contributor' together for full clarity around where data has come from.
Using 'dateCreated'
You should record the date when you create a data set to help users of the data set know whether it is valid and relevant to them. You must record the date using the Open Standard ISO 8601.
For example, dateCreated:"2002-10-02"
You must capture the exact time a data set is collected when you're collecting more than one version of a data set a day.
Help users find, use and identify your data set
Using 'name'
You must include the name of your data set so users can find and identify the right data set.
You should try to make sure the name captures information that will help users determine whether the data set meets their needs, for example by capturing the topic and specific information about place and geography.
For example, name:"GDS London Office Employees office commuting tendencies"
Using 'description'
You can add a description to your data set, in addition to the title, so that users of your data can find out if it's relevant to them.
The description of your data should only describe the type of data collected and should not include warnings about how to use the data - any warnings should be explained with the term 'accessRights'.
For example, description:"The amount GDS employees commute to the office and their busiest times to travel. This data also shows the tendencies of GDS employees to work from home"
Using 'identifier'
You should uniquely identify your data set so that users of your data know exactly which source they're using.
You should identify your data set by:
- using the identification system your organisation is using (in cases where organisations have a system in place)
- using a meaningless identifier you've created - this should be random numbers rather than sequential or semi-sequential numbers to avoid meaning being implied
Using a meaningless identifier avoids the misunderstanding that comes with applying meaning to identifiers. For example, meaning can change over time, whereas a meaningless identifier can stay genuinely constant.
For example, identifier:"362857580"
You can ensure this meaningless identifier stays unique by keeping a catalogue of all data sets with their identifiers.
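One possible way to create a random identifier and check it against a catalogue is sketched below in Python. The function name `new_identifier`, the nine-digit length and the use of an in-memory set as the catalogue are all assumptions for illustration; a real catalogue would normally be a shared, persistent record.

```python
import secrets


def new_identifier(catalogue: set[str], digits: int = 9) -> str:
    """Return a random numeric identifier that is not yet in the catalogue."""
    while True:
        # Random digits, so no order or meaning can be read into the value
        candidate = "".join(secrets.choice("0123456789") for _ in range(digits))
        if candidate not in catalogue:
            # Register the identifier so it cannot be issued twice
            catalogue.add(candidate)
            return candidate
```

For example, starting from `catalogue = {"362857580"}`, a call to `new_identifier(catalogue)` returns a different nine-digit string and adds it to the set.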
Using 'encodingFormat'
You should record the file format in which you store your data so users know how to use and import it.
Operating systems commonly use file extensions to decide which program to open a file with. Common file extensions include XLS for Excel spreadsheets and CSV.
For example, encodingFormat:"xls"
If you think you may publish your data set, you can also record the media type. Media types are used by browsers to decide how to present some data. For more information, read 'Publishing tabular data'.
A media type is also known as a Multipurpose Internet Mail Extensions (MIME) type. Mozilla .
For example, encodingFormat:"image/jpeg"
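The difference between a file extension and a media type can be sketched in Python as follows. The lookup table covers only the two formats named in this section, and the helper name `encoding_formats` is hypothetical.

```python
from pathlib import PurePath

# Registered media types for the two formats named in this section
MEDIA_TYPES = {
    "csv": "text/csv",
    "xls": "application/vnd.ms-excel",
}


def encoding_formats(filename: str) -> tuple[str, str]:
    """Return the (file extension, media type) pair for a file name."""
    extension = PurePath(filename).suffix.lstrip(".").lower()
    # Fall back to "unknown" when the media type is not in the table
    return extension, MEDIA_TYPES.get(extension, "unknown")
```

For example, `encoding_formats("commuting.csv")` returns `("csv", "text/csv")`.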
Help others validate your data
Using 'supersededBy'
When the data you're collecting replaces an older version, you should record this change to make sure users use the most up-to-date version.
You must only use 'supersededBy' when the data you're collecting has:
- the same period of time and location as the older version of the spreadsheet or file
- different content to the older version of the spreadsheet or file
The new version of the data will need its own unique URL or identifier. For example,
supersededBy:"/government/organisations/government-digital-service/about/v1"
You may also choose 'isRelatedTo' as a more generic term that can account for any kind of relationship between resources.
Using 'supersedes'
You can use 'supersedes' as an additional property, as this will allow users to understand the history or timeline of a document.
For example, supersedes:"/government/organisations/government-digital-service/about/v1"
Using 'expires'
If you're no longer using a particular data set, or it has been superseded by other data, you should record it as expired. You can do this by adding the date from which your data set is no longer valid. You should give any replacement data a new title and identifier.
Remember to revisit your data set and update its metadata when you become aware that the data set should no longer be used.
For example, expires:"2003-12-04"
Do not use this to record the period of time that applies to your data. In these cases, you should use 'temporalCoverage' instead.
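A consumer of the metadata could check the 'expires' value before using a data set. The Python below is a sketch under two assumptions: the record is held as a dictionary, and a data set counts as expired from the recorded date onwards.

```python
from datetime import date


def is_expired(record: dict, today: date) -> bool:
    """Return True if the record has an 'expires' date on or before today."""
    expires = record.get("expires")
    if not expires or expires == "unknown":
        # No usable expiry date recorded, so treat the data set as current
        return False
    # Keep only the date part so longer ISO 8601 date-times also work
    return today >= date.fromisoformat(expires[:10])
```

For example, `is_expired({"expires": "2003-12-04"}, date(2020, 7, 14))` returns `True`.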
Using 'temporalCoverage'
If you're collecting data over a range of dates, you should record this so users know the period that the content applies to. You should add this using the Open Standard ISO 8601.
For example, temporalCoverage:"2002-10-02/2013-01-01"
If your data does not have a specified end date, you can use '..' in place of the end date. This follows .
For example, temporalCoverage:"2020-10-02/.."
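The interval notation above can be written and read back with a few lines of Python. This is an illustrative sketch; the function names are hypothetical and only day-precision dates are handled.

```python
from datetime import date
from typing import Optional


def format_temporal_coverage(start: date, end: Optional[date] = None) -> str:
    """Build a 'start/end' interval, using '..' when there is no end date."""
    return f"{start.isoformat()}/{end.isoformat() if end else '..'}"


def parse_temporal_coverage(value: str) -> tuple[date, Optional[date]]:
    """Split a 'start/end' interval into dates; an open end becomes None."""
    start_text, end_text = value.split("/")
    end = None if end_text == ".." else date.fromisoformat(end_text)
    return date.fromisoformat(start_text), end
```

For example, `format_temporal_coverage(date(2020, 10, 2))` returns `2020-10-02/..`, and parsing that string gives the start date with `None` as the end.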
Using 'conformsTo'
You should tell users whether your file conforms to a specific standard or schema so they can easily validate it. This could be the CSV on the Web schema or the RFC 4180 standard.
For example, conformsTo:""
Your department, agency or local authority might also use a particular schema for specific types of data collection, and you may want to record this. One example is the data standards for publishing brownfield land registers.
Make sure your data is used appropriately
Using 'license'
For protected data such as personal, sensitive or commercial data, you should record information that will help users of the data understand its terms and conditions.
You may want to include the relevant data sharing agreement, legal regulation or certification. This could be a memorandum of understanding (MOU) or Data Protection Impact Assessment.
The open standards vocabularies schema.org and Dublin Core both spell the noun 'licence' using the American spelling 'license'. You should use 'license' for consistency.
For example, license:"Memorandum of Understanding between the Charity Commission for England and Wales and the Office for Students"
When publishing open data, you should label the data you've collected with its licence for use. In many cases within government, this will be the . You should also link to the licence file to explain what the licence means and how others can use your code and content.
For example, license:
Using 'hasDigitalDocumentPermission'
You should record the sensitivity of your data so it's not shared or published in ways it should not be.
You should provide information about who should be able to access the data you've collected, and any restrictions including:
- whether it's open or restricted/protected
- the handling caveat for the data
- the security classification of data
For example, hasDigitalDocumentPermission:"restricted access"
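To bring the terms in this guidance together, the Python sketch below builds one metadata record and fills any term that has not been supplied with "unknown". The choice of which terms to expect, the dictionary representation and the function name `complete_record` are assumptions for illustration, not part of the guidance.

```python
# Terms from this guidance that the sketch expects on every record.
# Situational terms such as 'expires' and 'supersededBy' are left optional.
EXPECTED_TERMS = [
    "creator", "dateCreated", "name", "description", "identifier",
    "encodingFormat", "license", "hasDigitalDocumentPermission",
]


def complete_record(partial: dict) -> dict:
    """Return a record with every expected term, using 'unknown' for gaps."""
    if not partial.get("name"):
        # The guidance says a data set must have a name
        raise ValueError("a data set must have a name")
    record = {term: "unknown" for term in EXPECTED_TERMS}
    record.update(partial)  # supplied values, including optional terms, win
    return record


record = complete_record({
    "name": "GDS London Office Employees office commuting tendencies",
    "dateCreated": "2002-10-02",
    "identifier": "362857580",
    "hasDigitalDocumentPermission": "restricted access",
})
```

In this example `record["creator"]` is `"unknown"`, which flags to later users that the creator still needs to be found and recorded.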