Resources
Open & FAIR Data
Whether you collect ocean and coastal data as part of a non-governmental organization, as a government employee, as an academic researcher, or as an industry practitioner, embedding open data management practices into your work plan is important for advancing multi-sector open collaborations and transparent decision-making.
Open data is:
- machine-readable
- freely shared
- used and built on without restrictions, or with limited restrictions such as attribution
(source: open.canada.ca)
The Value of Open Data for Marine Environments
The ability to analyze reliable open data is key to monitoring and resolving ecological, social, and economic issues. Well managed open data leads to improved planning and decision-making in marine environments by encouraging knowledge exchange between multiple sectors and the public.
A better understanding of ocean currents, such as their speed and direction, can lead to improved weather forecasting and emergency preparedness and response. Current data can also inform offshore activities like aquaculture and shipping, and help predict the spread of plastics in our ocean. A better understanding of sea height can support earlier adaptation for coastal communities threatened by climate change, while an understanding of nutrient cycling can help inform fisheries and marine protected area planning.
The Government of Canada is committed to creating a culture of “open by default”.
This culture shift will be achieved through the Directive on Open Government, which aims to:
• maximize the value of government data and information
• support transparency
• support accountability
• promote citizen participation through information sharing
The National Action Plan on Open Government will increase the availability and usability of geospatial data (data with geographic coordinates) through the Federal Geospatial Platform (FGP) and the Open Maps section of the Open Government Portal to propel collaboration and engagement between the science community, the business sector, and the public.
Why open data?
Open data plays a role in supporting:
• Transparency
• Participation
• Empowerment
• Innovation
• Efficiency of services
• Effectiveness of services
• Impact measurement of policies
• New knowledge from combined data sources and patterns in large data volumes
(source: opendatahandbook.org)
There are many benefits to open data, but the benefits are lost if data are not managed effectively.
FAIR data management is an important aspect of managing your data for wider use and greater impact.
What is FAIR Data?
Advancements in digital science are built on timely sharing and access to digital data. COINAtlantic promotes open data management practices and the implementation of the FAIR data principles to maximize the value of your coastal & ocean data.
FAIR data is:
- Findable
- Accessible
- Interoperable
- Reusable
How does FAIR benefit you?
- Enhances the findability and reuse of your research data and metadata
- Provides a tool for knowledge discovery and innovation
- Lets you share, reuse, and be credited for your data
- Satisfies funding agencies that require long-term data stewardship
- Helps you more accurately determine the validity of the data for your work
- Contributes to the rigorous management and stewardship of these valuable digital resources
(source: Wilkinson, Dumontier and Mons 2016)
Is your data FAIR?
Use these definitions and suggested questions to guide your evaluation.
Findable: Data has full metadata associated with it, and can be found in a searchable repository. Data has a globally unique and persistent identifier (PID) that improves the provenance of your data.
- Do the data produced and/or used in the project have metadata associated with them, and/or a PID such as a Digital Object Identifier (DOI)? (e.g. DataCite)
- Are you including keywords or controlled vocabularies to optimize possibilities for re-use, e.g. the Climate and Forecast (CF) conventions, Darwin Core terms, or BODC vocabularies?
- What metadata will be created? (the 'who, what, where, when, why, how' about your data)
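The questions above can be made concrete with a small sketch. Below is a hypothetical, DataCite-style metadata record for an ocean dataset, with a placeholder identifier and made-up field values; the field names loosely follow the DataCite schema, and the check at the end is an informal illustration of "findable", not an official validator.

```python
# A minimal, DataCite-style metadata record for a hypothetical ocean dataset.
# The DOI, names, and coordinates are illustrative placeholders, not real values.
metadata = {
    "identifier": {"identifierType": "DOI", "identifier": "10.xxxx/example"},  # hypothetical DOI
    "title": "Hourly sea surface temperature, Halifax Harbour (example)",
    "creators": [{"name": "Example Researcher", "affiliation": "Example Institute"}],
    "publicationYear": 2021,
    # Keywords drawn from controlled vocabularies (e.g. CF standard names) aid discovery.
    "subjects": ["sea_surface_temperature", "coastal monitoring"],
    "geoLocation": {"pointLatitude": 44.65, "pointLongitude": -63.57},
}

def is_findable(record):
    """Rough check: the record has an identifier, a title, and at least one keyword."""
    return bool(record.get("identifier")) and bool(record.get("title")) and bool(record.get("subjects"))

print(is_findable(metadata))  # True
```

In practice a repository such as one registered with DataCite would mint the DOI and validate the metadata for you; the point here is that findability starts with complete, structured metadata.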
Accessible: Accessible data is downloadable in a variety of formats, or the metadata remains accessible even when the data are no longer available.
- Which data produced and/or used in the project will be made openly available as the default?
- How will the data be made accessible (e.g. by depositing them in an open data repository)?
- Where will the data and associated metadata, documentation, and code be deposited? Preference should be given to certified repositories that support open access where possible.
Interoperable: The (meta)data use free, recognized standards, such as controlled vocabularies, that act as a verification and quality-control mechanism.
- Are the data produced in the project interoperable, that is, do they allow data exchange and re-use between researchers, institutions, organisations, countries, etc. (e.g. by adhering to standard formats)?
- What data and metadata vocabularies, standards, or methodologies will you follow to make your data interoperable?
- Will you map all data fields in your dataset to controlled vocabularies (CF, BODC, Darwin Core terms)?
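Mapping fields to a controlled vocabulary is often just a systematic renaming step. The sketch below maps hypothetical local column names in a species-occurrence record onto real Darwin Core terms (decimalLatitude, decimalLongitude, eventDate, scientificName); the local names and values are assumptions for illustration.

```python
# Sketch: renaming local field names to Darwin Core terms so a species-occurrence
# record can be exchanged with other systems. Local names are hypothetical;
# the Darwin Core terms on the right-hand side are real.
local_to_dwc = {
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "date": "eventDate",
    "species": "scientificName",
}

record = {"lat": 44.65, "lon": -63.57, "date": "2021-06-01", "species": "Homarus americanus"}

# Rename each known field; leave any unmapped fields untouched.
dwc_record = {local_to_dwc.get(k, k): v for k, v in record.items()}
print(dwc_record["scientificName"])  # Homarus americanus
```

Once every field carries a shared term, other researchers' tools can consume the dataset without guessing what "lat" or "date" meant locally.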
Reusable: (Meta)data are thoroughly described with accurate and relevant fields, and contain clear provenance. The (meta)data are published with a clear and accessible data usage license that outlines terms for using data.
- How will the data be licensed to permit the widest re-use possible? (e.g. CC-BY 4.0)
- Are the data produced and/or used in the project usable after the end of the project? (Are there any commercial use restrictions?)
- Are data quality assurance processes described? (cleaning, processing)
(source: ec.europa.eu)
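Reuse terms are easiest to honour when they travel with the data. The sketch below records a machine-readable usage license and a short provenance note inside the dataset's metadata; the CC-BY 4.0 name and URL refer to the real Creative Commons license, while the provenance text is a made-up example.

```python
# Sketch: embedding a machine-readable license and provenance note in metadata,
# so reuse terms stay attached to the data. The CC-BY 4.0 URL is the real
# license deed; the provenance string is illustrative.
reuse_metadata = {
    "license": {
        "name": "CC-BY 4.0",
        "url": "https://creativecommons.org/licenses/by/4.0/",
    },
    "provenance": "Collected 2020-2021 by the Example Coastal Survey; QC per project SOP",
}

print(reuse_metadata["license"]["name"])  # CC-BY 4.0
```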
Data cleaning is the process of detecting incorrect, inaccurate, incomplete, or inconsistent data and reformatting it according to the requirements of your quality assurance/quality control procedures, improving reuse of the data by flagging, fixing, or removing any anomalies.
Data cleaning is important for increasing the findability and analytical value of data.
Aspects of data cleaning and what to check for:
- Validity
- Accuracy
- Completeness
- Uniformity
Cleaned and standardized data allows others to more easily use your data in their own work.
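The checks listed above can be sketched as code. Below, a small made-up temperature series is cleaned by enforcing uniform station codes, flagging missing values (completeness), and flagging out-of-range readings (validity); the field names and the plausible-temperature thresholds are assumptions for illustration, not a standard QA/QC procedure.

```python
# Sketch of data-cleaning checks on a small, made-up coastal temperature series.
# Field names and the -2..35 degC plausibility range are illustrative assumptions.
rows = [
    {"station": "A1", "temp_c": 12.3},
    {"station": "A1", "temp_c": None},   # incomplete: missing reading
    {"station": "a1", "temp_c": 12.5},   # non-uniform station code
    {"station": "A1", "temp_c": 99.9},   # invalid for coastal water
]

def clean(rows, lo=-2.0, hi=35.0):
    """Fix what can be fixed and flag the rest, rather than silently dropping it."""
    cleaned, flags = [], []
    for i, r in enumerate(rows):
        r = dict(r)
        r["station"] = r["station"].upper()           # uniformity: one code style
        if r["temp_c"] is None:
            flags.append((i, "missing temp_c"))       # completeness
            continue
        if not (lo <= r["temp_c"] <= hi):
            flags.append((i, "out-of-range temp_c"))  # validity
            continue
        cleaned.append(r)
    return cleaned, flags

cleaned, flags = clean(rows)
print(len(cleaned), len(flags))  # 2 2
```

Keeping the flags alongside the cleaned rows preserves provenance: a reuser can see exactly which records were set aside and why.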
All research projects generate some form of data. Implementing data management activities into your research project will help you organize, store, and retrieve data for use during and after your project is completed. Funders are increasingly looking for research data management (RDM) practices within their funding agreements.
The three federal research funding agencies, the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Social Sciences and Humanities Research Council of Canada (SSHRC), have developed a draft Tri-Agency Research Data Management Policy to support Canadian research excellence by promoting good digital data management and data stewardship practices.
The Portage Network is dedicated to the shared stewardship of research data in Canada through:
- Developing a national research data culture
- Fostering a community of practice for research data
- Building national research data services and infrastructure
Data management activities include:
- Planning
- Documenting
- Processing
- Sharing
- Collecting
- Formatting
- Storing
If you are an academic researcher, contact your local Academic Librarian to learn about the RDM services offered at your institution.