Open Data

Open Data refers to research datasets that are freely available and readily accessible for use by others without restriction. Open data can be freely used, modified, and shared by anyone for any purpose, fundamentally enabling verification, replication, and reuse by other researchers, industry partners, and the broader scientific community.

This practice transforms how scientific knowledge is built and validated, moving away from siloed data ownership toward collaborative, transparent research ecosystems that accelerate discovery and enhance research quality.

Open Data embodies the philosophy that research findings should benefit society as a whole, not just the immediate research team or institution that generated them.

The Three Pillars of Open Data

1. Availability and Access

Data must be made available as a complete dataset at no more than reasonable reproduction cost, preferably through online download. Critically, data must be provided in convenient formats that can be easily modified and that enable meaningful use by others. This means avoiding proprietary formats that require expensive software, as well as technical barriers that limit accessibility.

2. Reuse and Redistribution

Open data must be provided under licensing terms that explicitly permit reuse and redistribution, including the ability to combine datasets with other sources. This enables researchers to build upon existing work, conduct meta-analyses, and develop new insights through data integration.

3. Universal Participation

Everyone must be able to use, reuse, and redistribute the data without discrimination against fields of endeavour, individuals, or groups. This rules out restrictions such as the NC (non-commercial use only) attribute in Creative Commons licenses, or limits to specific purposes like education, because these create artificial barriers to knowledge advancement.
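As an illustration of the licensing point above, the following minimal sketch flags a license identifier that carries the NC attribute. The function names and the identifier format (for example "CC-BY-NC-4.0") are assumptions made for this example. It checks only the NC attribute named in the text and is not a complete test of whether a license is open.

```python
import re
from typing import List

# Hypothetical helpers for illustration only. They look for the NC
# (non-commercial) attribute mentioned in the text and nothing else.
RESTRICTIVE_ATTRIBUTES = {"NC"}


def restrictive_attributes(license_id: str) -> List[str]:
    """Return the restrictive attributes found in a license identifier."""
    # Split on spaces, hyphens and underscores: "CC BY-NC 4.0" -> CC, BY, NC, 4.0
    tokens = re.split(r"[\s\-_]+", license_id.upper())
    return [token for token in tokens if token in RESTRICTIVE_ATTRIBUTES]


def allows_universal_participation(license_id: str) -> bool:
    """True when no restrictive attribute was found in the identifier."""
    return not restrictive_attributes(license_id)


for candidate in ("CC-BY-4.0", "CC-BY-NC-4.0", "CC0-1.0"):
    print(candidate, allows_universal_participation(candidate))
```

A check like this could run when a dataset is submitted, so that a non-commercial restriction is noticed before publication rather than after.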

The FAIR Principles Framework

Open Data implementation is guided by the FAIR principles, which describe how to make scholarly materials Findable, Accessible, Interoperable and Reusable. These principles provide a structured approach to data sharing:

Findable

Datasets are assigned persistent identifiers (DOIs, URNs)
Rich metadata descriptions enable discovery through search engines and catalogues
Clear indexing in data repositories and institutional systems
Descriptive metadata includes the dataset's identifier, so that the dataset can be located from its description
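The Findable items above can be sketched as a small check on a metadata record. The field names, the helper name, and the example DOI are all assumptions chosen for illustration; real repositories define their own metadata schemas. The pattern only checks DOI syntax. It does not confirm that the DOI is registered, and it does not cover URNs.

```python
import re

# Syntax check only: "10.", a registrant code of 4 to 9 digits, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Assumed field names; substitute the schema your repository actually uses.
REQUIRED_FIELDS = ("identifier", "title", "description", "keywords")


def findability_problems(record: dict) -> list:
    """List the reasons a metadata record might be hard to discover."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)
    ]
    identifier = record.get("identifier", "")
    if identifier and not DOI_PATTERN.match(identifier):
        problems.append("identifier is not a syntactically valid DOI")
    return problems


example_record = {
    "identifier": "10.1234/example-dataset",  # placeholder, not a registered DOI
    "title": "Example dataset title",
    "description": "Placeholder description used for this sketch.",
    "keywords": ["example", "placeholder"],
}

print(findability_problems(example_record))
```

An empty list means the record passed these particular checks, which is a lower bar than being easy to find in practice.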

Accessible

Materials are stored in a known, stable location (e.g. a data repository)
Data can be retrieved through standardised protocols
Authentication and authorisation procedures are clearly defined
Long-term preservation ensures continued access
Appropriate access controls for sensitive data while maintaining openness where possible
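One example of retrieval through a standardised protocol is plain HTTPS against the doi.org resolver, which can return machine-readable metadata when asked through the Accept header. The sketch below only builds such a request and does not send it. The helper name and the DOI are placeholders, and whether a given DOI supports this media type depends on its registration agency.

```python
from urllib.parse import quote
from urllib.request import Request

# CSL JSON is one of the media types used for DOI content negotiation.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> Request:
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    url = "https://doi.org/" + quote(doi)  # quote() keeps "/" unescaped
    return Request(url, headers={"Accept": CSL_JSON})


# Placeholder DOI for illustration; it is not a registered dataset.
request = build_metadata_request("10.1234/example-dataset")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the actual retrieval. That step is left out here so the sketch runs without network access.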

Interoperable

Data formats are chosen with attention to how they might change in the future
Use of standard file formats that can be read across different software platforms
Consistent variable naming and coding schemes
Clear documentation of data structure and relationships
Compatibility with existing data integration tools and workflows
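To make the Interoperable items concrete, here is a minimal sketch that writes a table as UTF-8 CSV together with a JSON data dictionary describing each column. The function name, the file names, and the example columns are assumptions for illustration. CSV and JSON are used here as examples of widely readable formats, not as the only acceptable choices.

```python
import csv
import json
from pathlib import Path


def export_with_dictionary(rows, dictionary, out_dir):
    """Write rows to data.csv and the column descriptions to data_dictionary.json.

    `dictionary` maps each column name to a human-readable description and
    also fixes the column order. Undocumented columns are rejected.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    columns = list(dictionary)

    undocumented = {key for row in rows for key in row} - set(columns)
    if undocumented:
        raise ValueError(f"columns missing from data dictionary: {sorted(undocumented)}")

    data_path = out_dir / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)

    dict_path = out_dir / "data_dictionary.json"
    dict_path.write_text(json.dumps(dictionary, indent=2), encoding="utf-8")
    return data_path, dict_path
```

A call might look like export_with_dictionary(rows, {"site_id": "Sampling site code", "temp_c": "Water temperature in degrees Celsius"}, "release"), where the column names and descriptions are again placeholders. Rejecting undocumented columns keeps the variable names and their documentation from drifting apart.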

Reusable

Comprehensive documentation and metadata enable proper interpretation
Clear licensing that specifies how data can be used
Quality assurance processes ensure data integrity
Sufficient contextual information for meaningful reanalysis
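One simple way to support the data integrity item above is to publish a checksum alongside each file, so that anyone reusing the data can confirm their copy matches the released version. The sketch below uses SHA-256 from the Python standard library. The function names are assumptions for this example, and a checksum is only one part of a quality assurance process.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare a file's digest with the checksum published by the data owner."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use constant, which matters for large research datasets.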
