hakaiinstitute / hakai-datasets

Hakai Datasets that are going into https://catalogue.hakai.org/erddap/


hakai-datasets's Introduction

Hakai Datasets

This repository contains the different components needed to produce and maintain the Hakai-related datasets on the Hakai ERDDAP servers. Server update status is reported by the CI workflow badges: Test datasets.xml and linter, Update Development ERDDAP server, and Update Production ERDDAP server.

All datasets defined within the datasets.d folder, in the ERDDAP XML format, are made available on the production server.

Configuration

The different components of the ERDDAP system and dataset configuration are defined through environment variables set within the Docker container. To start a new configuration, copy the sample.env file to .env and fill in the different items. The environment variables fall into three main categories (an example is sketched after the list below):

  • ERDDAP_.* variables are used to overwrite any components available within the ERDDAP setup.xml.
  • ERDDAP_DATASET_(.*) variables are used to define top-level elements of the datasets.xml; see the ERDDAP Docs for the full list of parameters. This component uses the EXPERIMENTAL /datasets.d.sh feature of docker-erddap.
  • ERDDAP_SECRET_(.*) variables are used to replace matching expressions within the datasets.xml with a given value. This can be used to keep certain information secret. This component uses the EXPERIMENTAL /init.d/* feature of docker-erddap and is handled by init.d/replace-datasets-secrets.sh.
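
A minimal sketch of what a filled-in .env could look like; the keys and values below (ERDDAP_baseUrl, ERDDAP_DATASET_cacheMinutes, ERDDAP_SECRET_DB_PASSWORD, and the addresses) are illustrative assumptions only and must match your own setup.xml fields, datasets.xml top-level elements, and the placeholder expressions used in your dataset definitions:

    # Override fields from ERDDAP's setup.xml (hypothetical values)
    ERDDAP_baseUrl=https://catalogue.hakai.org
    ERDDAP_emailEverythingTo=data@example.org

    # Define top-level datasets.xml elements (see the ERDDAP docs for the full list)
    ERDDAP_DATASET_cacheMinutes=60

    # Secret substituted into datasets.xml by init.d/replace-datasets-secrets.sh
    ERDDAP_SECRET_DB_PASSWORD=changeme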

Testing environment

  • Install Docker and docker-compose
  • Put your data files (e.g. .nc files) into the datasets folder
  • Edit the config files in the config directory. After editing them, run sh update-erddap.sh to create datasets.xml
  • Run docker-compose up. On Unix systems you will need to run it with sudo
  • Check that it works by going to http://localhost:8090/erddap (the commands are summarized below)
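
A condensed sketch of the steps above, assuming the commands are run from the repository root:

    # Regenerate datasets.xml after editing the files in the config directory
    sh update-erddap.sh

    # Start the local ERDDAP stack (prefix with sudo on Unix systems)
    docker-compose up

    # Then check http://localhost:8090/erddap in a browser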

Deployments

For both servers, configuration is handled within the .env file, which overrides fields present within the datasets.xml.

Hakai Database integration

All views and tables generated from the different SQL queries available in the view directory are refreshed nightly on the hecate.hakai.org server from the master branch (a hypothetical cron sketch is shown below).
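
A hypothetical sketch of how such a nightly refresh could be scheduled with cron; the schedule, repository path, SQL file location, and connection string below are placeholders, not the actual hecate.hakai.org configuration:

    # Hypothetical crontab entry: refresh the views from the master branch every night
    0 2 * * * cd /opt/hakai-datasets && git pull origin master && for f in views/*.sql; do psql "$HAKAI_DB" -f "$f"; done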

CI Integration

All commits to this repository are tested by different linters on every PR or commit to the development and master branches:

  • python: black, isort
  • sql: sqlfluff
  • markdown
  • xml

We are using the super-linter library to run the different automated integration tests.

If the linters and tests succeed, the changes are automatically reflected on the associated deployment instance by triggering the update-erddap.sh command over SSH and running the update workflow (a rough sketch of that trigger is shown below).
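
A minimal sketch of what that SSH trigger could look like; the user, host, and repository path are hypothetical placeholders, not the actual deployment setup:

    # Hypothetical example: rebuild datasets.xml on the deployment host after a successful CI run
    ssh deploy@erddap-host.example.org "cd /opt/hakai-datasets && git pull && sh update-erddap.sh"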

hakai-datasets's People

Contributors

fostermh, jessybarrette, n-a-t-e, raytula


hakai-datasets's Issues

Dataset Submission: Hakai Dataset Submission template

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch
  • ERDDAP Global Attributes are matching the Metadata Record associated fields (see Metadata Form ERDDAP Snippet)

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

ERDDAP

Metadata Record

  • Confirm Metadata Record is pointing to the Production ERDDAP dataset
  • Publish Metadata record
  • Confirm Metadata record is available appropriately on the Hakai Institute CKAN
  • Confirm Metadata record is available appropriately on the CIOOS Pacific
  • Confirm Metadata record is available appropriately on the CIOOS National

DOI

  • Generate DOI associated with Hakai CKAN dataset page

  • COMPLETED

Dataset Submission: HakaiChlorophyllSampleProvisional

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

https://goose.hakai.org/erddap/tabledap/HakaiChlorophyllSampleProvisional.html
https://cioos-siooc.github.io/metadata-entry-form/#/en/hakai/7U7b8oPpeTN6gjvXlUCTGJr5pga2/-McQFPAf457LB4-SWmyL

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiKCBuoyResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

Here's the finalized dataset to present to the data administrator:

  • dev ERDDAP (there's an issue with the mooring_name variable, to review)

  • dev CKAN

  • CIOOS Metadata

  • Dataset Development Branch Revision (Reviewer Label)

    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Add non public datasets to hakai ERDDAP

General idea

As Hakai increasingly uses ERDDAP as its primary platform to share data and make it accessible, we are hoping to provide access to some of our data to different internal Hakai groups through non-public ERDDAP datasets (see #92 as a first example).

Those datasets will be generated to help the different Hakai groups access their data, assess the data quality and problems, and eventually either make the specific dataset available or create another data product that will be made available.

Solutions

ERDDAP provides the ability to keep some datasets behind an authentication wall (see here for documentation). A number of methods can be used to authenticate. Among those the most interesting are:

  1. custom -> username/password
  2. google
  3. OAuth2

Conditions

The method used needs to be:

  • Secure
  • Easy to handle through the API via the different packages that will need to be developed to retrieve the protected data (MATLAB, Python, ...)

Steps

  • Decide which method should be used
  • Add OA as a role
  • Confirm that the authentication method still makes it possible to retrieve the data through a MATLAB script (a rough sketch of scripted access is shown below).
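
One way scripted access could work, assuming ERDDAP's cookie-based login sessions: authenticate once (for example through the browser), export the session cookie to a file, and reuse it from a script. The dataset ID below is a hypothetical placeholder, and the exact login flow must be checked against the ERDDAP authentication documentation:

    # Reuse an exported ERDDAP session cookie to download a protected dataset
    # (cookies.txt holds the session cookie obtained after logging in)
    curl -b cookies.txt "https://goose.hakai.org/erddap/tabledap/SomeProtectedDataset.csv"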

Update: Hakai Flag Convention Standard

Hakai Flag Convention Standard

This is a broader issue to clearly identify the flagging convention used and accepted within both the sensor network and EIMS. Once agreed upon, this work will be integrated within the Jupyter Notebook data QC tools in development.

I have been told by @jdelbel that not all the terms listed below are accepted within EIMS. @fostermh can you confirm that?

Hakai Flag Convention

Code | Name | Comments/description | Sensor Network | EIMS | QARTOD Mapping
AV | Accepted value | Has been reviewed and looks good | | | GOOD (1)
SVC | Suspicious value - caution | Value appears to be suspect, use with caution | | | SUSPECT (3)
SVD | Suspicious value - reject | Value is clearly suspect, recommend discarding | | | FAIL (4)
EV | Estimated value | Value has been estimated | | |
NA | Not available | No value available | | | MISSING (9)
MV | Missing value | No measured value available because of equipment failure, etc. | | | UNKNOWN (2)
LB | Low battery | Sensor battery dropped below a threshold | | |
CD | Calibration due | Sensor needs to be sent back to the manufacturer for calibration | | |
CE | Calibration expired | Value was collected with a sensor that is past due for calibration | | |
IC | Invalid chronology | One or more non-sequential date/time values | | |
PV | Persistent value | Repeated value for an extended period | | |
AR | Above range | Value above a specified upper limit | | |
BR | Below range | Value below a specified lower limit | | |
SE | Slope exceedance | Value much greater than the previous value, resulting in an unrealistic slope | | |
SI | Spatial inconsistency | Value greatly differed from values collected from nearby sensors | | |
II | Internal inconsistency | Value was inconsistent with another related measurement | | |
BDL | Below detection limit | Value was below the established detection limit of the sensor | | |
ADL | Above detection limit | Value was above the established detection limit of the sensor | | |
Source: https://docs.google.com/spreadsheets/d/1NZcwn7zPZ-98za4HpxQH705uw3tsEgKUwQhkPl0UiS8/edit?usp=sharing

Dataset Submission: HakaiQuadraBoLResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset_ID: HakaiCTDProvisional

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Raw Data Submission
  • CIOOS Metadata Form

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation
    • 🟒 Directly Compatible
    • 🟑 Minor Revisions
    • 🟠 Major Revisions
    • πŸ”΄ Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

Dataset Development Branch Revision

  • 🟒 Approved
  • 🟑 Minor Revisions
  • 🟠 Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Requests CF and BODC terms for chlorophyll related terms

Hakai Chlorophyll Sample Protocol

Hakai uses the following stacked filters to retrieve size-fractionated measurements of the chlorophyll samples:

  1. 20um
  2. mid-layer
    1. 3um
    2. 2um (pre-2015)
  3. GF/F

From those samples, the chlorophyll-a and phaeopigment concentrations are retrieved by filtration, methanol acetone extraction, and fluorometry.

Missing terms from vocabularies

Following this protocol, the two vocabularies are missing the following terms:

CF Standard names

Status | Standard name | Canonical units
Completed | mass_concentration_of_miscellaneous_phytoplankton_expressed_as_phaeopigments_in_sea_water | kg m-3
Completed | mass_concentration_of_nanophytoplankton_expressed_as_phaeopigments_in_sea_water | kg m-3
Completed | mass_concentration_of_phytoplankton_expressed_as_phaeopigments_in_sea_water | kg m-3
Completed | mass_concentration_of_picophytoplankton_expressed_as_phaeopigments_in_sea_water | kg m-3
In Process | mass_concentration_of_phaeopigments_in_sea_water | kg m-3

NVS P01 terms:

  • Chlorophyll-a
    • Concentration of chlorophyll-a {chl-a CAS 479-61-8} per unit volume of the water body [particulate 2-20um phase] by filtration, acetone extraction and fluorometry
    • Concentration of chlorophyll-a {chl-a CAS 479-61-8} per unit volume of the water body [particulate 3-20um phase] by filtration, acetone extraction and fluorometry
    • Concentration of chlorophyll-a {chl-a CAS 479-61-8} per unit volume of the water body [particulate GF/F-2um phase] by filtration, acetone extraction and fluorometry
    • Concentration of chlorophyll-a {chl-a CAS 479-61-8} per unit volume of the water body [particulate GF/F-3um phase] by filtration, acetone extraction and fluorometry
  • Phaeopigments
    • Concentration of phaeopigments {pheopigments} per unit volume of the water body [particulate 2-20um phase] by filtration, acetone extraction and fluorometry
    • Concentration of phaeopigments {pheopigments} per unit volume of the water body [particulate 3-20um phase] by filtration, acetone extraction and fluorometry
    • Concentration of phaeopigments {pheopigments} per unit volume of the water body [particulate GF/F-2um phase] by filtration, acetone extraction and fluorometry
    • Concentration of phaeopigments {pheopigments} per unit volume of the water body [particulate GF/F-3um phase] by filtration, acetone extraction and fluorometry
    • Concentration of phaeopigments {pheopigments} per unit volume of the water body [particulate >20um phase] by filtration, acetone extraction and fluorometry

Update: Link Hakai CTD profiles provisional dataset to database

HakaiWaterPropertiesInstrumentProfileProvisional

The dataset running in production is temporarily served from NetCDF files produced occasionally. The last step is to connect the ERDDAP dataset directly to the database. To do this:

  • Add Hakai Provisional view on database
  • Add flags to the Hakai data
  • Get ok

Generate DOI for some Hakai Research Datasets

DOI Generation for some Hakai Research Datasets

Issue description

The following Hakai datasets will need to have a DOI generated. Some of them are not quite yet available within the Hakai CKAN system and would also need to be made available there:

Dataset | Hakai Metadata Record | CKAN | DOI
Hakai Nutrient Research | Metadata | CKAN |
Hakai Chlorophyll Research | Metadata | CKAN | https://doi.org/10.21966/wsvt-ew96
Hakai Water Properties Profiles Research | Metadata | CKAN | https://doi.org/10.21966/6cz5-6d70
Dosser et al. (2021) Hakai Nutrients | Metadata | CKAN | https://doi.org/10.21966/j3j5-wt70

** This table will be updated as the different components are available.

Dataset Submission: HakaiNutrientSampleResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

ERDDAP Dataset
CIOOS Metadata form

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiCTDResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Review and revised provisional BoL dataset descriptions and metadata

While working through details related to the Quadra BoL research datasets, I realized that the initial dataset titles, abstracts, etc. we chose for the provisional BoL datasets are pretty technical and not necessarily user friendly.

Perhaps it would be better to reuse aspects of the NCEI dataset titles, abstracts, and other metadata attributes to update and improve the provisional BoL metadata records. At a minimum, it should be very clear that the datasets include provisional data that should not be used for research purposes. We can likely keep the abstract/metadata pretty simple though, and mainly focus on the provisional state and intended use of the data.

Update: HakaiQuadraBoL5min.xml

Dataset ID HakaiQuadraBoL5min.xml

Things to do:

  • Create a CIOOS metadata form to replace the temporary CKAN record currently available
  • Add more documentation to the ERDDAP dataset, reflecting the information available within the Research dataset.
    • Global Attributes
    • Standard Names
    • Match Research vs Provisional Variable names

Dataset Submission: HakaiJSP-draft

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

OBIS Dataset Creation (Data Integrator)

  • Dataset Standardization (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Record created through the Integrated Publishing Toolkit (IPT) (visibility: private)
  • IPT: Metadata (eml.xml) created
  • IPT: Data Integration (matching to Darwin Core standard)
  • OBIS Dataset Documentation

Dataset Review (Data Administrator)

  • Dataset Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Publish (Meta)Data to OBIS (visibility: public)
  • Update CIOOS record: Add OBIS Record Link
  • COMPLETED

This is it!

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Submission Label)
    • 🟒 Submission Approved
    • 🟑 Submission Minor Revisions
    • 🟠 Submission Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiChlorophyllSampleResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

CIOOS Metadata Form

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Issue: IYS-chlorophyll dataset on goose servers

A test dataset IYS-chlorophyll was created a long time ago on the development branch of the hakai datasets. I'm not too sure what its status is. I believe we should just remove it; just want to confirm with you @timvdstap or @Br-Johnson.

This dataset fails to load within ERDDAP: https://goose.hakai.org/erddap/status.html

See IYS-chlorophyll.xml at:
https://github.com/HakaiInstitute/hakai-datasets/blob/development/datasets/IYS-chlorophyll.xml

Update: Hakai Rivers Inlet Mooring dataset

TO DO

Include the latest data from the Rivers Inlet mooring dataset.

  • Change the data workflow to a full GitHub repo method like other datasets
  • Update the processing method to use the ocean_data_parser package
  • QC data
  • Make data available on dev ERDDAP
  • Confirm data with Jen
  • Make data available on production
  • Confirm the Metadata record is correct

Algae Explorer Chla dataset

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch
  • ERDDAP Global Attributes are matching the Metadata Record associated fields (see Metadata Form ERDDAP Snippet)

Which datasets

Select which NetCDF files generated by the Algae Explorer need to be made available on ERDDAP.

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

ERDDAP

Metadata Record

  • Confirm Metadata Record is pointing to the Production ERDDAP dataset
  • Publish Metadata record
  • Confirm Metadata record is available appropriately on the Hakai Institute CKAN
  • Confirm Metadata record is available appropriately on the CIOOS Pacific
  • Confirm Metadata record is available appropriately on the CIOOS National

DOI

  • Generate DOI associated with Hakai CKAN dataset page

  • COMPLETED

Hakai Sentinel Temperature Data

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

ERDDAP Dataset Creation (Data Integrator)

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

ERDDAP

Metadata Record

  • Confirm Metadata Record is pointing to the Production ERDDAP dataset
  • Publish Metadata record
  • Confirm Metadata record is available appropriately on the Hakai Institute CKAN
  • Confirm Metadata record is available appropriately on the CIOOS Pacific
  • Confirm Metadata record is available appropriately on the CIOOS National

DOI

  • Generate DOI associated with Hakai CKAN dataset page

  • COMPLETED

Dataset Submission: HakaiColumbiaFerryResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

Finalized Datasets for presentation to the data administrator:

  • dev ERDDAP

  • dev CKAN

  • CIOOS Metadata

  • Dataset Development Branch Revision (Reviewer Label)

    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Example 2

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Submission Label)
    • 🟒 Submission Approved
    • 🟑 Submission Minor Revisions
    • 🟠 Submission Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Issue: All real-time datasets from the sensor network

Dataset ID

Issue description

Hakai Sensor Network data presents, for each individual record, different statistics associated with a recorded time interval. The CF convention suggests using the cell_methods attribute to clearly define the time range of the cell associated with a variable and record, and the statistical method used.

I believe there is also a way to describe where within that cell the time variable corresponds to (e.g. start, middle, average, end).

Suggestion for correction

As described within the Pruth mooring dataset, the time variable should have:

time: point (interval: 5.0 minutes comment: time corresponds to the end of the interval)

While the other variables are described following the convention:

time: mean (interval: 5.0 minutes)
time: median (interval: 5.0 minutes)

This would need to be added to every sensor network dataset, potentially with more details too (to be reviewed).

Additional context

More detail can be found here

Dataset Submission: HakaiADCPTimeSeriesProvisional

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

CIOOS Metadata Form
ERDDAP development

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset_ID: Hakai*[Provisional/Research/Climate] Test

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Raw Data Submission
  • CIOOS Metadata Form

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation
    • 🟒 Directly Compatible
    • 🟑 Minor Revisions
    • 🟠 Major Revisions
    • πŸ”΄ Incompatible/Missing Information
  • Near Real-time Data Integration
    • ❗ Required
  • QARTOD Integration
    • ❗ Required
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

Dataset Development Branch Revision

  • 🟒 Approved
  • 🟑 Minor Revisions
  • 🟠 Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiQU5MooringProvisional

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Metadata Form
  • CKAN (dev)
  • Goose ERDDAP
  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Get QCed data from database directely to ERDDAP

HakaiWaterPropertiesInstrumentProfileResearch

The dataset running in production is temporarily served from NetCDF files produced occasionally. The last step is to connect the ERDDAP dataset directly to the database. To do this:

  • Add the ctd.ctd_post_qc_data view on the production database
  • Add Hakai Specific view on the production database
  • Get ok

Dataset Submission: HakaiADCPTransectResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Update: HakaiKCBuoy1hour

Dataset ID: HakaiKCBuoy1hour Update

Things to do:

  • Create a CIOOS metadata form to replace the temporary CKAN record currently available
  • Add more documentation to the ERDDAP dataset, reflecting the information available within the Research dataset.
    • Global Attributes
    • Standard Names
    • Match Research vs Provisional Variable names

Dataset Submission: DFO Pacific Ocean Salmon Program

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset. This dataset is generated by the Department of Fisheries and Oceans Canada and is to be used as a proof of concept to demonstrate the integration of OBIS and CIOOS.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

OBIS Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Record created through the Integrated Publishing Toolkit (IPT) (visibility: private)
  • IPT: Metadata (eml.xml) created
  • IPT: Data Integration (matching to DwC standard)
  • OBIS Dataset Documentation

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Publish (Meta)data to OBIS (visibility: public)
  • Update CIOOS record: Add OBIS Record Link
  • COMPLETED

Dataset Submission: HakaiCTDProvisional

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • ERDDAP dev

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Outside Hakai Water Properties Vertical Profiles Datasets

CTD Profiles Handled by Hakai from other organizations

Issue description

Hakai ingests, processes, and makes available Water Properties Vertical Profiles collected by other organizations. All those datasets are available within the Hakai database system through the same workflow as Hakai's own data.

However, that data should not be presented within the Hakai dataset and should be treated separately for each individual organization as two datasets: research and provisional.

Those datasets will follow a very similar workflow to the Hakai datasets #7 and #8.

Where

Where should those datasets be hosted?

  1. CIOOS Pacific ERDDAP/CKAN
  2. Hakai ERDDAP/CKAN

Organizations

  • Nature Trusts
  • University of Washington
  • Parks Canada
  • Skeena River Fisheries

Provisional Datasets Tracking

The development of each dataset should be made available here.

Research Datasets

The development of each dataset should be made available here.

Dataset Submission: Seaspan Royal SuperCO2 password protected

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch
  • Add password protection to ERDDAP and make this dataset accessible to the oa role. This may get revised later; for now we will rely on the Goose Development Protected ERDDAP.

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

ERDDAP

Dataset Submission: Nearshore Tidbits datasets

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

ERDDAP dataset is now available here
https://goose.hakai.org/erddap/tabledap/HakaiNearShoreStandAloneRaw.html

Dataset related issues

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: Synthesized Nutrient Dataset associated with research paper by Hayley Dosser

Hakai Dataset Submission

The intention of this issue is to make a nutrient dataset created and used by Hayley as part of a research paper Findable and Accessible via the Hakai metadata catalogue. Similar to other synthesized/paper-specific datasets, this dataset includes data from multiple sources that have been aggregated and processed in ways specific to a particular research project/paper.

Related examples include:

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

Online Dataset Creation (Data Integrator)

  • Make data files and extra information available in a new GitHub repository or open Google Drive Folder

Dataset Review (Data Administrator)

  • Review data and metadata and approve for publishing

Dataset Completion (Data Integrator)

  • COMPLETED

Dataset Submission: HakaiMooringTimeSeriesResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

CKAN: https://cioos-siooc.github.io/metadata-entry-form/#/en/hakai/tV5qE0aUgaOjSVmgPgiZ6MyHuSy1/-MsbCMOYj2L_7dgICnzw
ERDDAP: https://goose.hakai.org/erddap/tabledap/HakaiMooredTimeSeriesResearch.html

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiPruthDockProvisional

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiCalvertCP1TideGauge

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

Metadata Form

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Update: HakaiQuadraLimpet5min add RBR temperature and pressure

HakaiQuadraLimpet5min

The Limpet underwater platform just received a new permanent sensor measuring temperature and pressure. We will need to add this data to the ERDDAP dataset. Here are the steps:

  • Add both variables to the database view
  • Add both variables to the ERDDAP dataset.xml

Issue: Dataset ID here

Dataset ID

Issue description

A clear and concise description of what the bug is.

Suggestion for correction

Describe a possible correction to apply.

Additional context

Add any other context about the problem here.

Dataset Submission: HakaiBottleSampleResearch

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Dataset Submission: HakaiPruthMooringProvisional

Hakai Dataset Submission

Below are listed all the different steps related to the initial submission of a dataset.

A more detailed written and visual description of every step is available respectively
here and here.

Submission steps

Initial Submission (Data Administrator)

  • Original Data Submission
  • CIOOS Metadata Form completed

ERDDAP Dataset Creation (Data Integrator)

  • Dataset Transformation (Format label)
    • 🟒 Format Compatible
    • 🟑 Format Minor Revisions
    • 🟠 Format Major Revisions
    • πŸ”΄ Format Incompatible/Missing Information
  • Near Real-time Data Integration
  • QARTOD Integration
  • ERDDAP Integration
  • ERDDAP Dataset Documentation
  • ERDDAP Test Locally
  • Add Dataset to Development Branch

Dataset Review (Data Administrator)

  • Metadata Record
  • CKAN (dev)
  • Dataset Development Branch Revision (Reviewer Label)
    • 🟒 Reviewer Approved
    • 🟑 Reviewer Minor Revisions
    • 🟠 Reviewer Major Revisions

Dataset Completion (Data Integrator)

  • Merge Development Dataset to Production Branch
  • COMPLETED

Issue: Hakai Limpet Non connected handling

Limpet dataset with data not connected to the server

Issue description

For quite some time now, the Limpet CTD data has not been feeding directly into the sensor network and is only retrieved every few months. The data is then shared through emails across the different groups.

Suggestion for correction

As suggested by @shawn-hateley, it would be good to create a Limpet-specific repository which could be used as the primary location to host the data and the different information associated with the platform. This data could then be harvested by the sensor network directly and through ERDDAP.

Additional context

Update: HakaiKCBuoy1hour add fluoresence and oxygen data

HakaiKCBuoy1hour

The KC buoy platform has received and is about to receive new sensors that are captured in the Hakai database but still need to be made available on the different Hakai data platforms. All the new sensors feed data through the SBE16 unit mounted below the buoy.

Sensors added

  • Wetlabs FLS Sensor: Chlorophyll-a Fluorescence data (mounted last February 2020)
  • SBE 63: Dissolved oxygen sensor (Spring 2021)

To Do

We would need to add those respective data feeds to the following data platforms:

  • Fluorescence
  • Dissolved Oxygen Sensor

HakaiPruthDockProvisional Tide data is failing

HakaiPruthDockProvisional

Issue description

HakaiPruthDockProvisional fails to provide the shifted tide data, which is offset to the average tide height.

This is affecting all the variables that are using the row.columnFloat() operator in ERDDAP
<sourceName>=row.columnFloat(PruthDock:TideHeightPLS_Avg)-2.742</sourceName>

The error given is

Error {
    code=500;
    message="Internal Server Error: ERROR from data source: org.apache.commons.jexl3.JexlException$Parsing: gov.noaa.pfel.erddap.dataset.EDDTable.convertScriptColumnsToDataColumns:3648@1:26 parsing error in ':'";
}

This shift was applied in order to match an existing CF standard_name

Suggestion for correction

Ignore the shift in the data and provide the data as is.
