Open Data
Open Data refers to research datasets that are freely available and readily accessible to others without restriction. Anyone can use, modify, and share open data for any purpose, which enables verification, replication, and reuse by other researchers, industry partners, and the broader scientific community.
This practice transforms how scientific knowledge is built and validated, moving away from siloed data ownership toward collaborative, transparent research ecosystems that accelerate discovery and enhance research quality.
Open Data embodies the philosophy that research findings should benefit society as a whole, not just the immediate research team or institution that generated them.
The Three Pillars of Open Data
1. Availability and Access
Data must be made available as a complete dataset at no more than reasonable reproduction costs, preferably through online download. Critically, data must be provided in a convenient, modifiable format that enables meaningful use by others. This means avoiding proprietary formats that require expensive software, as well as other technical barriers that limit accessibility.
2. Reuse and Redistribution
Open data must be provided under licensing terms that explicitly permit reuse and redistribution, including the ability to combine datasets with other sources. This enables researchers to build upon existing work, conduct meta-analyses, and develop new insights through data integration.
3. Universal Participation
Everyone must be able to use, reuse, and redistribute the data without discrimination against fields of endeavour, individuals, or groups. This means avoiding restrictions such as the NC (non-commercial use only) attribute in Creative Commons licenses, or limitations to specific purposes like education, as these create artificial barriers to knowledge advancement.
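The licensing criteria above can be sketched as a small check. This is a minimal illustration, not an authoritative license classifier: the function name `is_open_license` and the short license codes are assumptions made for the example. The text names only NC; ND (no derivatives) is added here as an assumption, because it conflicts with the requirement that open data can be modified.

```python
# License elements that conflict with the open data criteria described above.
# "NC" is named in the text; "ND" is an added assumption (it forbids modification).
RESTRICTIVE_ELEMENTS = {"NC", "ND"}


def is_open_license(license_code: str) -> bool:
    """Return True if a Creative Commons style code has no restrictive element."""
    parts = license_code.upper().replace("_", "-").split("-")
    return not RESTRICTIVE_ELEMENTS.intersection(parts)


# Illustrative codes only; real license identifiers may be written differently.
for code in ["CC0", "CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-NC-ND"]:
    print(code, is_open_license(code))
```

A check like this only inspects the license code. It does not replace reading the actual license terms attached to a dataset.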
The FAIR Principles Framework
Open Data implementation is guided by the FAIR principles, which describe making scholarly materials Findable, Accessible, Interoperable and Reusable (FAIR). These principles provide a structured approach to data sharing:
Findable
- Datasets are assigned persistent identifiers (DOIs, URNs)
- Rich metadata descriptions enable discovery through search engines and catalogues
- Clear indexing in data repositories and institutional systems
- Descriptive metadata explicitly includes the identifier of the dataset it describes, so the data can be located from the metadata
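The points above can be illustrated with a small metadata record. This is a sketch under stated assumptions: the field names, the DOI, and the list of required fields are hypothetical and do not follow any particular repository's schema.

```python
import json

# Hypothetical metadata record; the DOI and all field names are illustrative only.
record = {
    "identifier": "doi:10.1234/example-dataset",
    "title": "Example survey responses",
    "description": "Anonymised responses from an illustrative survey.",
    "keywords": ["survey", "example"],
    "license": "CC-BY-4.0",
}

# Assumed minimum for discovery: an identifier plus descriptive fields.
REQUIRED_FIELDS = ("identifier", "title", "description", "keywords")


def missing_fields(metadata: dict) -> list:
    """List required fields that are absent or empty in a metadata record."""
    return [name for name in REQUIRED_FIELDS if not metadata.get(name)]


print(missing_fields(record))
print(json.dumps(record, indent=2))
```

Keeping the identifier inside the record itself means the metadata still points at the data when it is copied into a catalogue or search index.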
Accessible
- Materials are stored in a known, stable location (e.g. in data repositories)
- Data can be retrieved through standardised protocols
- Authentication and authorisation procedures are clearly defined
- Long-term preservation ensures continued access
- Appropriate access controls for sensitive data while maintaining openness where possible
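As one example of retrieval through a standardised protocol, a DOI can be resolved over HTTPS via `https://doi.org/`. The sketch below only builds the request and does not send it. The DOI is hypothetical, and whether a metadata response is returned for the `Accept` header shown depends on the DOI's registration agency, so treat that part as an assumption.

```python
from urllib.request import Request


def doi_request(doi: str) -> Request:
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    return Request(
        f"https://doi.org/{doi}",
        # Asks for citation metadata as JSON; support varies by registration agency.
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


request = doi_request("10.1234/example-dataset")  # hypothetical DOI
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval. For sensitive data, the repository's authentication step would need to come first.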
Interoperable
- Data formats are chosen with attention to how they might change or become obsolete in the future
- Use of standard file formats that can be read across different software platforms
- Consistent variable naming and coding schemes
- Clear documentation of data structure and relationships
- Compatibility with existing data integration tools and workflows
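A minimal sketch of these points: the same records are written as CSV, a plain-text format most software can read, alongside a data dictionary that documents each variable. The variable names, values, and coding scheme are hypothetical.

```python
import csv
import io
import json

# Hypothetical observations; variable names follow one lowercase snake_case scheme.
rows = [
    {"participant_id": 1, "age_years": 34, "smoker": 0},
    {"participant_id": 2, "age_years": 51, "smoker": 1},
]

# A data dictionary documents each variable's meaning and coding.
data_dictionary = {
    "participant_id": "Unique participant number",
    "age_years": "Age at enrolment, in whole years",
    "smoker": "Smoking status: 0 = no, 1 = yes",
}


def to_csv(records: list) -> str:
    """Serialise records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
# Any variable that appears in the data but not in the dictionary is undocumented.
undocumented = [name for name in rows[0] if name not in data_dictionary]

print(csv_text)
print(json.dumps(data_dictionary, indent=2))
print("Undocumented variables:", undocumented)
```

Publishing the dictionary next to the data file lets others interpret coded values such as `smoker` without access to the original software.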
Reusable
- Comprehensive documentation and metadata enable proper interpretation
- Clear licensing that specifies how data can be used
- Quality assurance processes ensure data integrity
- Sufficient contextual information for meaningful reanalysis
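One common way to support data integrity checks is to publish a checksum with each file so that reusers can confirm their copy is unaltered. The sketch below uses SHA-256 from Python's standard library. The file contents and the helper names `sha256_of` and `is_intact` are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_checksum: str) -> bool:
    """Check downloaded bytes against the checksum published with the dataset."""
    return sha256_of(data) == expected_checksum


# Hypothetical file contents standing in for a published dataset.
published = b"participant_id,age_years\n1,34\n"
checksum = sha256_of(published)

print(is_intact(published, checksum))
print(is_intact(published + b"2,51\n", checksum))
```

A checksum confirms only that the bytes are unchanged. Documentation, licensing, and context are still needed for the data to be reusable.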