iodepo / oceanbestpractices Goto Github PK


Repository to store the OpenSource version of the code made by E84 for OceanBestPractices.org

Home Page: https://oceanbestpractices.org

License: GNU Affero General Public License v3.0

JavaScript 70.79% Ruby 0.40% HTML 1.24% CSS 7.34% SCSS 3.02% TypeScript 17.04% Shell 0.15% Dockerfile 0.02%

oceanbestpractices's People

Contributors

Stargazers

Watchers

Forkers

pbuttigieg element84 icantech socioprophet brianandres2 pieterprovoost

oceanbestpractices's Issues

Need SPARQL query to build relationship graph from a selected term.

The user is able to select an existing term and the application should display the relationships (defined by IODEPO) for that term. We need SPARQL queries that can build that relationship for a given term. I assume we'll need queries specific to each ontology (or at least groups of ontologies). I can offer the current query as an example but it's so complex (and wrong) that I'm not sure it'll be of much help.

This is to address OBP-256: As the owner, I want to see expected semantic neighborhoods with full and accurate relationships.

Formalise cross-referencing of endorsed BPs with EOV Spec sheets

For those EOV Spec sheets that are at least in v1, plan a reciprocal mapping between the spec sheets and endorsed BPs. This would need the spec sheets to be more than documents - more like dynamic web-pages with section-level IRIs and stable markup.

Change "Search Options" button label to "Filter Options".

Label should be changed on both the home screen and search results.

To provide Arctic view on repository make links

https://arcticpractices.org :: link to https://repository.oceanbestpractices.org/handle/11329/1291
https://arcticpractices.net :: link to https://repository.oceanbestpractices.org/handle/11329/1291
https://arcticpracticespilot.org :: link to https://repository.oceanbestpractices.org/handle/11329/1291
https://arcticpracticespilot.net :: link to https://repository.oceanbestpractices.org/handle/11329/1291

Develop request/issue forwarding mechanism to terminology services

Ease the request for revision - forward to issue trackers for non-GitHub users. Communicate that if the content is not fit for use that it can be changed and the resources we use are responsive to requests for change/update.

Landing page descriptions of terminologies

Visible button on landing page providing explanation of what terminologies (ontologies, thesauri, etc) are.

Redirect NET to ORG

Redirect NET to ORG, without losing .net to enter stuff!!!!

SPARQL query(ies) to extract labels for tagger index after ingesting a new ontology.

We're introducing dynamic ingest of ontologies and therefore are updating the tagging routine to respond when a new ontology is added to Neptune. The tagging routine needs to extract the labels from the new ontology in order to index them into the tagging index in Open Search. We need a SPARQL query that we can use to perform this as soon as a new ontology is loaded into Neptune. If we need to support different queries depending on the type of ontology ingested we need to know how to identify which query to use.
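As a starting point for discussion, here is a minimal sketch of a label-extraction query builder. It assumes that each ingested ontology is loaded into its own named graph and that labels are expressed with rdfs:label or skos:prefLabel; neither assumption is confirmed in this issue. The function name buildLabelQuery and the example graph IRI are hypothetical.

```typescript
// Hypothetical helper: builds a SPARQL query listing every labelled term in one named graph.
// Assumption: each ontology lives in its own named graph identified by graphIri.
function buildLabelQuery(graphIri: string): string {
  // Reject characters that are not allowed inside a SPARQL IRI reference.
  if (/[<>"{}|^`\\\s]/.test(graphIri)) {
    throw new Error('Invalid graph IRI: ' + graphIri);
  }
  return [
    'PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>',
    'PREFIX skos: <http://www.w3.org/2004/02/skos/core#>',
    'SELECT DISTINCT ?term ?label',
    'FROM <' + graphIri + '>',
    'WHERE {',
    // Property path alternation: match either labelling property.
    '  ?term rdfs:label|skos:prefLabel ?label .',
    '  FILTER (isLiteral(?label))',
    '}',
  ].join('\n');
}
```

If some ontologies turn out to need other label properties, the alternation in the property path is the one place that would change, which may also answer the question of how to pick a query per ontology type.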

Please add to .org the Unesco Privacy Policy

Please add the Unesco Privacy Policy to .org. Unesco Privacy Policy: https://en.unesco.org/this-site/our-online-privacy-policy and http://www.unesco.org/new/en/terms-of-use/terms-of-use/privacy/

Use gazetteer service to identify documents having Arctic relevance

Eight circumpolar countries participating in the Arctic SDI have combined their official gazetteer services. The API supports Latin, Cyrillic and Syllabic character sets and is available at: https://arctic-sdi.org/services/general-introduction/
Evaluate options for using the service to identify existing and incoming documents that have Arctic relevance.

Add Working group about page

Add a web page or a link/redirect to an external webpage describing the OBPS Working Group and Steering Committee.

A decision on where the page will be hosted is yet to be made.

Need SPARQL query for fetching synonyms and like words.

When the user searches by keyword they have the option to ask for synonyms and like words to be included in the search query. We need a SPARQL query that supports finding those words based on the keyword entered in the search field.

For example, the current query that performs this looks like:

 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> \
 PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> \
 PREFIX owl: <http://www.w3.org/2002/07/owl#> \
 PREFIX skos:<http://www.w3.org/2004/02/skos/core#> \
 SELECT DISTINCT ?annotatedTarget ?annotatedPropertyLabel ?sameAsLabel \
 WHERE { \
  { \
    ?nodeID owl:annotatedSource ?xs . \
    ?nodeID owl:annotatedProperty ?annotatedProperty . \
    ?nodeID owl:annotatedTarget ?annotatedTarget . \
    ?nodeID ?aaProperty ?aaPropertyTarget . \
    OPTIONAL {?annotatedProperty rdfs:label ?annotatedPropertyLabel} . \
    OPTIONAL {?aaProperty rdfs:label ?aaPropertyLabel} . \
    FILTER ( isLiteral( ?annotatedTarget ) ) . \
    FILTER ( ?aaProperty NOT IN ( owl:annotatedSource, rdf:type, owl:annotatedProperty, owl:annotatedTarget ) ) \
    { \
      SELECT DISTINCT ?xs WHERE { \
        ?xs rdfs:label ?xl . \
        FILTER (?xl = '${term}'^^xsd:string) \
      } \
    }\
  } \
  UNION \
  { \
    SELECT ?sameAsLabel \
    WHERE { \
      ?concept skos:prefLabel ?prefLabel . \
      FILTER (str(?prefLabel) = '${term}') \
      ?concept owl:sameAs ?sameAsConcept . \
      ?sameAsConcept skos:prefLabel ?sameAsLabel . \
    } \
  } \
}

This only works because we know exactly which ontologies we've ingested, and even then the query barely works. It is a challenging query because the user isn't forced to select from a list of terms before submitting their search query. So we need a query that first finds the word they typed in and then (I assume) uses the information from that lookup to find synonyms and like words.
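One possible shape for this, sketched under assumptions: match the typed keyword against labels case-insensitively, then follow synonym properties from the matched concept. The synonym properties used here (skos:altLabel, oboInOwl:hasExactSynonym, oboInOwl:hasRelatedSynonym) are a guess at what the ingested ontologies provide, and the function names are hypothetical. Because the keyword is free text, the sketch escapes it before placing it in the query.

```typescript
// Escape a free-text value so it can sit inside a double-quoted SPARQL string literal.
function escapeSparqlLiteral(value: string): string {
  return value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

// Hypothetical helper: find concepts whose label equals the keyword (ignoring case),
// then return their synonyms. The synonym properties are assumptions, not confirmed.
function buildSynonymQuery(keyword: string): string {
  const literal = escapeSparqlLiteral(keyword.trim());
  return [
    'PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>',
    'PREFIX skos: <http://www.w3.org/2004/02/skos/core#>',
    'PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>',
    'SELECT DISTINCT ?synonym',
    'WHERE {',
    '  ?concept rdfs:label|skos:prefLabel ?label .',
    '  FILTER (LCASE(STR(?label)) = LCASE("' + literal + '"))',
    '  ?concept skos:altLabel|oboInOwl:hasExactSynonym|oboInOwl:hasRelatedSynonym ?synonym .',
    '}',
  ].join('\n');
}
```

Comparing STR(?label) rather than the raw literal sidesteps differences in language tags and datatypes between ontologies, which is one reason the current query needs separate branches.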

Add UN vocabs to tagging stack

Key vocabs found here

Improve deployments by migrating to Serverless Framework.

Serverless Framework (https://www.serverless.com/) has become almost the standard now for managing serverless deployments and OBP is a perfect candidate for it. There are a number of benefits to migrating the CloudFormation templates to this framework including deployment times and complexity.

Each "component" of the app can become a service. Resources such as Elasticsearch can still be defined with CloudFormation. We can start with a single serverless file for now and if it becomes too cumbersome replace it with multiple services managed by a single deployment script.

obp-scheduler-function-staging is throwing an exception

From the email from Paul to Arno (21/02/2020):

I started to look at why Elasticsearch was empty and found that the obp-scheduler-function-staging is throwing an exception (I found this by looking at the CloudWatch Logs for this function):

{
  "errorType": "Runtime.ImportModuleError",
  "errorMessage": "Error: Cannot find module 'xml2js'\nRequire stack:\n- /var/task/scheduler.js\n- /var/runtime/UserFunction.js\n- /var/runtime/index.js",
  "stack": [
    "Runtime.ImportModuleError: Error: Cannot find module 'xml2js'",
    "Require stack:",
    "- /var/task/scheduler.js",
    "- /var/runtime/UserFunction.js",
    "- /var/runtime/index.js",
    " at _loadUserApp (/var/runtime/UserFunction.js:100:13)",
    " at Object.module.exports.load (/var/runtime/UserFunction.js:140:17)",
    " at Object.<anonymous> (/var/runtime/index.js:43:30)",
    " at Module._compile (internal/modules/cjs/loader.js:955:30)",
    " at Object.Module._extensions..js (internal/modules/cjs/loader.js:991:10)",
    " at Module.load (internal/modules/cjs/loader.js:811:32)",
    " at Function.Module._load (internal/modules/cjs/loader.js:723:14)",
    " at Function.Module.runMain (internal/modules/cjs/loader.js:1043:10)",
    " at internal/main/run_main_module.js:17:11"
  ]
}

It looks like it can't find the XML parsing library. It's possible this has something to do with updating the nodejs runtime.

bulk indexer output

The output of the bulk indexer does not really reflect whether the indexing was successful or not. It would be handy if that could change so there is an immediate indication that action has to be taken to solve problems with the indexing.
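A minimal sketch of how the indexer could report this, assuming it has access to the JSON body returned by the Elasticsearch bulk API. That response carries a top-level errors flag and one entry per action in items, each with a status and an optional error. The type and function names below are hypothetical and not taken from the existing indexer.

```typescript
// Shapes below cover only the bulk response fields this sketch reads.
interface BulkItemResult {
  _id?: string;
  status: number;
  error?: { type: string; reason: string };
}

interface BulkResponse {
  errors: boolean;
  items: Array<{ [action: string]: BulkItemResult }>;
}

interface BulkSummary {
  succeeded: number;
  failed: Array<{ id: string; reason: string }>;
}

// Hypothetical helper: count successes and collect per-document failures.
function summarizeBulkResponse(response: BulkResponse): BulkSummary {
  const summary: BulkSummary = { succeeded: 0, failed: [] };
  for (const item of response.items) {
    // Each item has a single key naming the action (index, create, update, delete).
    for (const action of Object.keys(item)) {
      const result = item[action];
      if (result.error) {
        summary.failed.push({
          id: result._id || '(unknown)',
          reason: result.error.type + ': ' + result.error.reason,
        });
      } else {
        summary.succeeded += 1;
      }
    }
  }
  return summary;
}
```

The indexer could then print the counts and exit with a non-zero status whenever failed is not empty, so that a failed run is visible immediately.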

Document clustering

At the AtlantOS symposium at UNESCO, there was a desire to adapt the UX to use the tagging to cluster submissions by similarity, and to display smaller, more focused, and informal best practices around those that are peer reviewed, come from a major consortium (and went through internal review), and/or are endorsed by some authority (e.g. a GOOS Panel).

The search will prioritise those that have higher QC, with an option to display all.

This is a means to prevent the highest quality docs from drowning in small/unreviewed (but still valuable) submissions.

Update Contact Us link.

Should go to: https://repository.oceanbestpractices.org/feedback

Auto-tweet new submissions from RSS feed

Grab new entries in the RSS feed and push to our Twitter channel.
Intelligent hash-tagging would be a nice-to-have feature, if selected metadata fields can be parsed.

Add Ocean Modeling best practice template.

Templates drop-down should have 4 in total:

Data Management https://repository.oceanbestpractices.org/handle/11329/1245
Ocean Applications https://repository.oceanbestpractices.org/handle/11329/1246
Ocean Modelling https://repository.oceanbestpractices.org/handle/11329/1649
Sensors https://repository.oceanbestpractices.org/handle/11329/1243

Missing "Ocean Modeling".

Display OBPS record google analytic metric on the UI

GA provides download metrics for each OBPS record, but it can only be accessed on the DSpace repository. We need to display this download metric on the individual record in the UI results display. This metric is a real selling point for OBPS

Update copy on first visit screen.

REWORD:

The Ocean Best Practices System (OBPS) is a secure, permanent global repository of ocean research, operations, data/information management and applications methodologies (also known as “BestPractices”) ** The OBPS invites the ocean community to submit their own methodologies to share globally with their colleagues.

Please note, unless it is annotated as Endorsed by an Expert Panel, inclusion of a methodology in OBPS does not indicate a recommendation by OBPS.

(Please Make this smaller font size)** A Best Practice is defined as “a methodology that has repeatedly produced superior results relative to other methodologies with the same objective”. To be fully elevated to a best practice, a promising method will have been adopted and employed by multiple organizations.

Embed visualisation of knowledge neighbourhood to improve UX

Couple the knowledge neighbourhood exploration to visuals/UX that is more intuitive and user friendly. Need user consultation on what works.

OBPS thumbnails not generating on DSPACE repository

OBPS thumbnails not generating on DSPACE repository for the last two days

Need WoRMS taxonomy in Neptune compliant format for ingest.

For OBP-243 we've been asked to import the WoRMS taxonomy (http://www.marinespecies.org/). We'll need that in a Neptune compliant format and should be ready to test the new ingest process with it.

Neptune formats: https://docs.aws.amazon.com/neptune/latest/userguide/sparql-media-type-support.html

Fix the indexer function to support the correct Elasticsearch index region.

The indexer.js file sets an environment variable:

const region = process.env.REGION || 'us-east-1';

which is used to build the requests for the Elasticsearch index. However, the CloudFormation template that deploys the function does not support a REGION parameter, nor does it set the environment variable for you. The CloudFormation template should define an environment variable in the function definition so that you can specify the region of the Elasticsearch index in case it is something other than us-east-1.
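On the function side, a small sketch of a more forgiving lookup, offered as a suggestion rather than the agreed fix. It keeps REGION as the explicit override and falls back to AWS_REGION, which the Lambda runtime sets to the region the function runs in. This only helps when the index lives in the same region as the function, so the template change described above is still needed for other setups. The name resolveIndexRegion is hypothetical.

```typescript
// Hypothetical helper: pick the Elasticsearch index region from environment variables.
// Order: explicit REGION override, then the Lambda-provided AWS_REGION, then the old default.
function resolveIndexRegion(env: { [name: string]: string | undefined }): string {
  return env['REGION'] || env['AWS_REGION'] || 'us-east-1';
}

// In indexer.js this would be called as: const region = resolveIndexRegion(process.env);
```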

Create an OWL/RDF compliant WoRMS taxonomy import

Well-adopted and used in the marine community. Develop similar to the NCBITax import that lives in OBO, perhaps even include WoRMS in OBO.

Allow section-level endorsement of documents by reviewers/panels

At times, Panels / reviewers will endorse components of methodological documents in the OBPS. The technology / metadata should work towards allowing this granularity, promoting the synthesis of submissions with complementary and endorsed parts.

[Discussed at the PEGASuS / GOOS BioEco Panel meeting 2019-12-05]

Explore Apache Tika...

...for NLP and NER across media types

Generate auto-creation of EOV communities / collections based on endorsement tags

Once a document (or section xref #24) has been endorsed by an EOV lead/Panel, it should be auto-added to an EOV-specific community on the OBPS:

Bulk Indexer (re-index on UI) please implement a cron

Bulk Indexer (re-index on UI): please implement a cron job to re-index all documents, metadata, etc. every 48 hours.

Improve deployment documentation for the Search API component and percolator script.

The Deployment.md documentation is missing instructions for deploying the Search API component of the OBP stack. It's also missing mention of (and instructions for running) the script required to build the percolator index after ingesting a new ontology.

Link submissions to the networks / programmes they come from

An expansion of the metadata field list, ideally against a controlled list of networks/programmes (new or unregistered programmes would have to register, xref to ODIS work).

This field would be distinct from publisher - and can have multiple entries identifying the networks/programmes it came from

Provide thematic search UXs

Some users may want to search the corpus from a specific axis (e.g. devices, environments, etc) - a tailored search interface for each of these major types (and associated UX) is desired. Providing these entry points would make the UX smoother.

Add "Endorsements" metadata field

xref #24 #27 #29

This field will have links to documents submitted by endorsing bodies (e.g. GOOS Panels) that will also be uploaded in the OBPS. For example:

Key	Value
Endorsements	[DOI of endorsement document]

Replace lib/open-search-client with @elastic/elasticsearch

@elastic/elasticsearch

We'll need to handle AWS auth with something like http-aws-es or aws-elasticsearch-js.

Replace Ingest Queue with SNS utils and batch messaging.

Consider creating lib/sns-utils.ts with:

import { chunk } from 'lodash';
import pMap from 'p-map';
import { sns } from './aws-clients';

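interface PublishBatchInput {
  topicArn: string;
  messages: string[];
  // Maximum number of batch requests in flight at once (p-map default: unlimited).
  concurrency?: number;
}

// Publish messages to an SNS topic in batches.
export const publishBatch = async (
  params: PublishBatchInput
): Promise<void> => {
  const { concurrency, messages, topicArn: TopicArn } = params;

  await pMap(
    // SNS PublishBatch accepts at most 10 entries per request.
    chunk(messages, 10),
    (messageChunk) => {
      // Entry Ids only need to be unique within a single batch request.
      const PublishBatchRequestEntries = messageChunk.map((Message, index) => ({
        Id: index.toString(),
        Message,
      }));

      return sns().publishBatch({
        TopicArn,
        PublishBatchRequestEntries,
      }).promise();
    },
    { concurrency }
  );
};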

With that utility, this could be refactored to something like:

  const dSpaceItems = await pMap(
    feed.channel[0].item,
    (feedItem) => dspaceClient.find(
      dspaceEndpoint,
      'dc.identifier.uri',
      feedItem.link[0]
    ),
    { concurrency: 5 }
  );

  const uuids = dSpaceItems.flat().map((i) => i.uuid);

  await publishBatch({
    topicArn: ingestTopicArn,
    messages: uuids,
    concurrency: 5,
  });

That feels a bit easier to read.

Originally posted by @marchuffnagle in #106 (comment)

documentation on the index updater

We need information/documentation on the script that can be run to update the index after changes have been made to entries in the repository.
Paul made and tested that script on 23/08/2019.

included metadata fields in the ORG

Ensure all the new metadata fields are included in the ORG upload, i.e. metadata content from these fields is searchable.
EH has included this with EOV field so possibly could be a joint EOV/ECV field?
Adding the ECVs to the OBPS tech would be more straightforward as most of them can go into ENVO - ASK PLB FOR CLARIFICATION

PLB: Best to use the ontology IRIs for these. Arno, contact me when you get here.

Controlled vocabularies not retrieved during submission process

During the submission of a new document, some metadata fields that have links to controlled vocabularies have issues. The links do not resolve, giving a "Not Found" error.

vocabulary:https://www.oceanbestpractices.net/JSON/controlled-vocabulary?vocabularyIdentifier=paradis&metadataFieldName=dc_subject_parameterDiscipline

For example:

Subject : Parameter Discipline:

Click the ‘Subject Categories’ link below to select appropriate parameter discipline keywords or phrases.

Consider two-step Essential Ocean Variable protocol endorsement process

The first step would be a general pre-endorsement. The second would necessitate more metadata on the supporting, derived, and other associated variables in the EOV spec sheet

Need stop word lists for all supported ontologies.

OBP-250: As an admin, I want to be able to upload a text file with arbitrary terms for each vocabulary, to help reduce less meaningful search results.

We need the list of terms for at least the following ontologies:
[ ] CHEBI
[ ] ENVO
[ ] SDGIO
[ ] L05
[ ] L06
[ ] L22
[ ] WoRMS

It'd probably be easiest if those files were in CSV format, but whatever is easiest for now is fine; we can decide on the format when we implement this feature.
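To make the format question concrete, here is a minimal sketch of how an uploaded list might be read and applied. It assumes a plain-text or simple CSV file with one or more terms per line, separated by commas, without quoted fields, and it assumes matching should ignore case. Both function names are hypothetical.

```typescript
// Hypothetical helper: turn an uploaded stop word file into a de-duplicated, lower-cased list.
// Assumption: terms are separated by newlines and/or commas, with no quoted CSV fields.
function parseStopWords(fileContents: string): string[] {
  const words: string[] = [];
  for (const line of fileContents.split(/\r?\n/)) {
    for (const cell of line.split(',')) {
      const word = cell.trim().toLowerCase();
      if (word !== '' && words.indexOf(word) === -1) {
        words.push(word);
      }
    }
  }
  return words;
}

// Hypothetical helper: drop ontology labels that appear in the stop word list.
function removeStopWords(labels: string[], stopWords: string[]): string[] {
  return labels.filter(
    (label) => stopWords.indexOf(label.trim().toLowerCase()) === -1
  );
}
```

With this shape, a file with one term per line and a single-row CSV both work, so the format decision could be deferred as suggested.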

Track citations of each document in literature

Crawl the peer reviewed literature and publications from other professional sources to evaluate the uptake / use / citation of an OBPS DOI (or a DOI mapped to that DOI).

Present this on the interface and dashboards

Update citations help text.

Change to:

To export this citation select individual record check boxes or click 'Select All' - a Download Citation box will display, click it and the records will be downloaded to your designated Download folder OR just copy and paste.

Search for "sea ice"
Click on "View tags" for the first hit (Sea-Ice Information Services in the World. 3rd Edition, 2006 [SUPERSEDED by http://hdl.handle.net/11329/283])
Clicking on "first year ice", "sea ice", or many other ENVO terms yields: "You must select a tag to view relationships"

This of course shouldn't be. Compare this to clicking on "A-factor" from CHEBI.

OBPS website and UI have lost display of number of BP in repository_20200604

OBPS website and UI have lost display of number of BP in repository Thu 4 Jun 2020
