Frequently Asked Questions

All the answers to your questions on one page.

Data Management Plans

Where do I create a Data Management Plan (DMP)?

What is the purpose of a DMP?

Data Management Plans (DMPs) describe how data will be collected, stored, analyzed, preserved and shared throughout the active phase of the research project and beyond, identify planned outputs to publish such as articles and datasets, outline the roles and responsibilities of project collaborators, and consolidate all the completed outputs from a project to demonstrate impact. For more information see Data Management Plans.

What are the advantages of registering a DMP with a Digital Object Identifier?

Finalizing a DMP and setting its visibility to either "Organizational" or "Public" allows users to generate a DMP ID, which is a Digital Object Identifier (DOI), making the plan referenceable by others. Doing so opens a "Follow-Up" tab where completed research outputs (e.g. articles, papers, presentations) can be manually added. It's a great way to keep track of all produced project outputs!

GitHub

What is GitHub?

GitHub is a cloud-based service that helps people store and manage code. While originally built for software development, it has been adopted by data scientists because it offers excellent version control for both plain-text data files (e.g. .csv) and code, allowing users to revert to a previous version if needed. It's great for collaboration, as people can access the repository from anywhere and work on a project at the same time. A repository - a folder in which all files and their version histories are stored - can be set up within the Hakai Institute GitHub to store a data package (see Hosting data).

Should I use GitHub?

GitHub is a great platform to use if you're collaborating with others on a project, particularly if you're using code to clean up or analyze data. When working with tabular, text-based data packages, we strongly recommend these be hosted in a Hakai Institute GitHub repository. It is very useful if you're expecting to release annual versions for long-term monitoring data (see Versioning).

The scientific journal I wish to publish to does not view GitHub as a trusted repository for long-term archival. What should I do?

Zenodo offers a service that automatically archives versions of your repository and assigns a DOI whenever you create a GitHub release. Make sure to include the DOI in your metadata record, which in this case should be created specifically for that version. For more information on GitHub releases and the Zenodo integration see GitHub.

Data Services

What is OBIS?

OBIS is a global open-access data and information clearing-house on marine biodiversity for science, conservation and sustainable development. Learn more at https://obis.org/about/.

For details see our OBIS page here.

What is ERDDAP?

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source data server initially developed by NOAA NMFS SWFSC Environmental Research Division (ERD).

For more detail see our ERDDAP page here.
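As a rough illustration of what "downloading a subset" looks like in practice, the sketch below builds a tabledap request URL following ERDDAP's general pattern of server, dataset ID, file type, requested variables, and constraints. The server address, dataset ID, and variable names are hypothetical placeholders, not actual Hakai resources; substitute the values shown on the dataset's ERDDAP page.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, variables, constraints=(), file_type="csv"):
    """Build an ERDDAP tabledap URL requesting a subset of a tabular dataset.

    server      -- base ERDDAP address, e.g. "https://example.org/erddap" (hypothetical)
    dataset_id  -- the dataset's ERDDAP ID (hypothetical in the usage below)
    variables   -- names of the columns to return
    constraints -- filters such as "time>=2020-01-01T00:00:00Z"
    file_type   -- response format extension, e.g. "csv" or "json"
    """
    base = f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}"
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode characters such as ">" and ":" so the URL is valid.
        query += "&" + quote(constraint, safe="=")
    return f"{base}?{query}"


# Hypothetical server, dataset ID and variable names, for illustration only.
url = tabledap_url(
    "https://example.org/erddap",
    "hypothetical_dataset",
    ["time", "latitude", "longitude"],
    ["time>=2020-01-01T00:00:00Z"],
)
print(url)
```

The resulting URL can be opened in a browser or passed to any tool that reads CSV over HTTP. Changing `file_type` to another format the server supports returns the same subset in that format.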

Persistent Identifiers (PIDs)

What are persistent identifiers and why should I apply them?

Digital persistent identifiers are long-lasting references to a digital resource that are machine-readable and uniquely point to a digital entity.

These unique identifiers help to link together researchers, affiliations and research outputs, and can make your work easier to discover and cite and its usage easier to track. Applying persistent identifiers to your work helps adhere to the FAIR Data Principles. Common digital persistent identifiers used within the scientific community are:

Digital Object Identifier (DOI)
Open Researcher and Contributor ID (ORCID)
Research Organization Registry (ROR)

Can a DOI point to multiple versions of a record?

Yes, a DOI can point to the metadata record in the Hakai Catalogue that links to multiple versions of the data (see Versioning).

Sometimes scientific journals require that a DOI point to a single specific version, or the data provider considers the new version to be a major change to previous versions. In that case, you will have to create a unique metadata record for the Hakai Catalogue for that version, and indicate the relation of that record to other versions through Related Works.

Is it possible to update an existing DOI?

We encourage updating existing time-series data records with additional data where possible, updating the version element.

After you have made the required changes, you can update the DOI through the Metadata Entry Tool by selecting 'update DOI'. The DOI will stay the same. Downstream users that resolve the DOI will be brought to the record landing page in the Hakai Catalogue displaying the latest version. The older versions should still be accessible through this record. If there are significant changes in the data or metadata between versions, this might warrant a separate metadata record with unique DOI (see Versioning).
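Because the DOI itself does not change, any link built from it keeps working after an update. A minimal sketch, assuming the standard https://doi.org resolver and using a made-up DOI, of how a DOI written in any of its common forms can be turned into a resolvable link:

```python
def doi_to_url(doi):
    """Return the https://doi.org resolver URL for a DOI in any common form."""
    doi = doi.strip()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        # Strip a leading resolver address or "doi:" label if one is present.
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


# "10.1234/example-dataset" is a made-up DOI used only for illustration.
print(doi_to_url("doi:10.1234/example-dataset"))
```

Following the link sends the user to wherever the DOI currently points, which for records updated as described above is the landing page showing the latest version.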

General

My data is not yet ready to be shared publicly, do I still need to create a metadata record?

Yes, even if your data is contained within a private repository it is still worthwhile to create a metadata-only record in the interim to let others know of the existence of the project or data. Your metadata record should then include a Data Availability Statement so that others know when the data becomes available or how they might request it.

I already sent my data to a (domain-specific) repository, do I still need to create a metadata record for the Hakai Catalogue?

Yes, if you have already published the full granularity of your dataset to a (domain-specific) repository you will still need to create a metadata record for the Hakai Catalogue.

If the repository provided you with a DOI then you can re-use that DOI in the metadata record destined for the Hakai Catalogue. If you published a subset of your data to a domain-specific repository, you’ll want to create a record for the Hakai Catalogue linking to the full data package, as well as have a DOI minted. A relationship to the subset hosted somewhere else should be indicated.

I want to keep my data private / by access request only, do I still need to attach a license to the data?

Yes, the license indicates the terms and conditions for downstream use and dissemination.

I am working with sensitive data, can my data still adhere to the FAIR Data Principles if they cannot be shared publicly?

Yes, though sensitive data may not be publicly accessible and reusable, they can still be FAIR if the terms for access are clear (e.g. through a Data Access Statement), the data is findable, and structured in a reusable and interoperable way.

How can we adhere both to our Hakai Open Science Policy (where data is mandated to be shared publicly) as well as with the CARE Principles (managing restrictive data)?

The CARE Principles and its sub-principles enhance and extend the FAIR Data Principles for scientific data management by centering equity and ethics as core guiding principles alongside those set out by FAIR. Where the FAIR Principles are more data-centric, the CARE Principles focus on people and purpose-orientated standards, placing greater emphasis on the context in which data is collected and managed. Suggested ways to implement the CARE (sub-)principles within the data management workflow are outlined in Carroll et al. 2021 and O'Brien et al. 2024, preprint.

Still have some questions?

Use the search bar above, email data@hakai.org, start an issue here, or start a discussion here.