This document sets out seven key principles of open data, covering both its technical and legal aspects. The principles are inspired by the eight principles originally developed in 2007 by the Open Government Working Group in Sebastopol, California. The objective of this document is to help open data publishers ensure that the data they publish is truly useful to members of the public.
Principle 1: Data Must Be Complete
Open data must be published in complete form. This means that the dataset must make sense on its own and must not contain incomplete fields or missing information. Open data must also be accompanied by metadata: information that puts a dataset into context by setting out descriptive details such as the data owner, method of collection, frequency of update, geographic coverage, and temporal coverage.
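As an illustration of how a publisher might check a dataset against this principle before release, the following is a minimal sketch in Python. It assumes the dataset is a CSV file with a header row and that the metadata is held as a simple dictionary. The metadata key names and the function names are hypothetical; they are chosen here to mirror the descriptive details listed above and are not taken from any standard.

```python
import csv
import io

# Metadata fields named in the text above. The key names are hypothetical.
REQUIRED_METADATA = (
    "data_owner",
    "collection_method",
    "update_frequency",
    "geographic_coverage",
    "temporal_coverage",
)


def missing_metadata(metadata):
    """Return the required metadata keys that are absent or blank."""
    return [
        key
        for key in REQUIRED_METADATA
        if not str(metadata.get(key, "")).strip()
    ]


def incomplete_cells(csv_text):
    """Return (row_number, column_name) pairs for every empty or missing cell.

    Row numbers count data rows from 1; the header row is not counted.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            if column is None:
                # Surplus cells beyond the header are not checked in this sketch.
                continue
            if value is None or not value.strip():
                problems.append((row_number, column))
    return problems


# Invented sample data, for illustration only.
sample = "station,date,rainfall_mm\nA1,2024-01-01,3.2\nA2,2024-01-01,\n"
print(incomplete_cells(sample))
print(missing_metadata({"data_owner": "Example Agency", "update_frequency": "monthly"}))
```

A check of this kind only detects blank cells and absent metadata entries. Whether a dataset "makes sense" in the wider sense described above still requires human review.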
Principle 2: Data Release Must Be Timely
Open data must have some relevance to the period during which it is published. It should also be current and up to date. Publication of historical data is encouraged where that dataset is relevant today, for example to uncover trends in the data. Data publishers should strive to share data in real time to the maximum extent possible.
Principle 3: The Data Source Must Be Primary and Reliable
Open data must be published by its primary source. This does not necessarily mean that open data may only be published by the primary collector, but it does mean that the data must be published by an entity that has overall responsibility for the dataset. In other words, a data user should be able to trace the data back to its original source, which should give users confidence that the data is reliable or authoritative. Data published as open data should also be permanently discoverable, not made available for a temporary period only.
Principle 4: Data Must Be Raw
Open data must be made available in its rawest form. This means that data should be published in the form in which it was collected. Data that has been grouped, manipulated, or previously used for a different purpose does not conform to this principle.
Principle 5: Data Must Be Technically Reusable
Open data must be published in digital form and in a machine-readable format that allows for automated processing (for example: CSV, XLSX, JSON, or XML). Open data should also be organised in a predefined structure that enables the data to be stored, analysed, and processed easily (for example, tabular format). Finally, for open data to be technically reusable, it must also be published in a technological format which ‘places no restrictions, monetary or otherwise, upon its use’ or which, at the very least, ‘can be processed with at least one free/libre/open-source software tool’ (for example: XLS and XLSX). In other words, the data must be published in an open format.
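To illustrate what machine-readable and structured mean in practice, the sketch below parses a small tabular dataset, computes a total from it, and re-serialises it as JSON, using only the Python standard library. The sample data, column names, and function names are invented for illustration.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def records_to_json(records):
    """Serialise the records as a JSON array."""
    return json.dumps(records, indent=2)


# Invented sample data, for illustration only.
sample = "region,year,population\nNorth,2020,1200\nSouth,2020,950\n"

records = csv_to_records(sample)

# Because the data is structured, automated processing needs no manual
# retyping or screen-scraping. CSV values arrive as strings, so they are
# converted to integers before summing.
total_population = sum(int(record["population"]) for record in records)

print(total_population)
print(records_to_json(records))
```

The same processing would be far harder if the figures were published only as a scanned image or a formatted PDF, which is why this principle calls for structured, machine-readable formats.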
Principle 6: Data Must Be Legally Reusable
Open data must be legally reusable. This means that users should be allowed to use open data freely and beyond its original intended purposes, for both commercial and non-commercial purposes. This is typically achieved by adopting the Open Government Licence or another widely accepted open licence, such as one of the Creative Commons licences.
Principle 7: Data Must Be Accessible and Discoverable
Data providers must ensure that any open data is easily discoverable, in that it can be easily accessed online. Access to open data must also be free of charge, and the data must be available for download in bulk. Discriminatory restrictions, such as making data available only to nationals or residents, are not permissible.