Comments (13)

ssarrafan commented on June 20, 2024

Continuation of work discussed (at least on GH) from the May sprint at https://github.com/microbiomedata/nmdc-metadata/issues/308

from nmdc-schema.

ssarrafan commented on June 20, 2024

Third level of checks handled in https://github.com/microbiomedata/nmdc-metadata/issues/362

ssarrafan commented on June 20, 2024

Removing other assignees. @turbomam let me know if this assignment isn't you.

turbomam commented on June 20, 2024

RE Is the ID prefix valid? (e.g. KEGG.KO vs KEGG.ORTHOLOG)

I see prefix definitions, especially in nmdc-schema/src/schema/annotation.yaml

MAM@MAM-M74 schema % pwd
/Users/MAM/Documents/gitrepos/nmdc-schema/src/schema
MAM@MAM-M74 schema % grep -i kegg *
annotation.yaml:      - KEGG.PATHWAY
annotation.yaml:      - KEGG.REACTION
annotation.yaml:      - KEGG.ORTHOLOGY  ## KO number
core.yaml:      - KEGG.COMPOUND

Anywhere else I should be looking? @wdduncan @cmungall
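
A minimal sketch of the prefix check in Python, assuming the allow-list is just the four KEGG prefixes from the grep output above. The names `ALLOWED_PREFIXES`, `curie_prefix`, and `has_valid_prefix` are hypothetical and not part of nmdc-schema.

```python
# Assumption: the allow-list is copied by hand from the grep output above.
# A real check would read id_prefixes from the schema YAML instead.
ALLOWED_PREFIXES = {
    "KEGG.PATHWAY",
    "KEGG.REACTION",
    "KEGG.ORTHOLOGY",
    "KEGG.COMPOUND",
}


def curie_prefix(curie: str) -> str:
    """Return the part of a CURIE before the first colon."""
    prefix, sep, _local = curie.partition(":")
    if not sep or not prefix:
        raise ValueError(f"not a CURIE: {curie!r}")
    return prefix


def has_valid_prefix(curie: str, allowed=ALLOWED_PREFIXES) -> bool:
    """True if the CURIE's prefix is one of the declared prefixes."""
    return curie_prefix(curie) in allowed
```

With this allow-list, `KEGG.ORTHOLOGY:K00001` passes, while `KEGG.KO:K00001` and `KEGG.ORTHOLOG:K00001` are rejected because neither prefix is declared.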

turbomam commented on June 20, 2024

RE Is the local part of the ID syntactically conformant? (e.g. KEGG:K\d+)

I don't see patterns for the local parts, at least not in nmdc-schema/src/schema/annotation.yaml

  pathway:
    aliases:
      - biological process
      - metabolic pathway
      - signaling pathway
    is_a: functional annotation term
    description: >-
      A pathway is a sequence of steps/reactions carried out by an organism or community of organisms
    slot_usage:
      has_part:
        range: reaction
        multivalued: true
        description: >-
          A pathway can be broken down to a series of reaction step
    id_prefixes:
      - KEGG.PATHWAY
      - COG
    exact_mappings:
      - biolink:Pathway

turbomam commented on June 20, 2024

very rough example for nmdc-schema/src/schema/annotation.yaml from @cmungall

functional annotation term:
    aliases:
      - function
      - functional annotation
    is_a: ontology class
    slot_usage:
      id:
        pattern: "^(KEGG.ORTHOLOG:K\\d+|EC:\\d+\\.ETC)$"
    description: >-
      Abstract grouping class for any term/descriptor that can be applied to a functional unit of a genome (protein, ncRNA, complex).
    abstract: true
    todos:
      - decide if this should be used for product naming
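
A minimal sketch of how an alternation pattern of that shape could be built and tested in Python. The local-part patterns in `LOCAL_PATTERNS` are assumptions for illustration (only `K\d+` and the `PFAM:PF00001` style appear in this thread), and `build_id_pattern` and `is_conformant` are hypothetical helpers, not schema code.

```python
import re

# Assumption: illustrative local-part patterns keyed by prefix.
# These are not authoritative; real patterns would come from a registry.
LOCAL_PATTERNS = {
    "KEGG.ORTHOLOGY": r"K\d+",
    "PFAM": r"PF\d+",
}


def build_id_pattern(local_patterns):
    """Combine per-prefix patterns into one ^(A|B|...)$ regex."""
    alternatives = [
        # re.escape keeps the dot in a prefix like KEGG.ORTHOLOGY literal.
        f"{re.escape(prefix)}:{local}"
        for prefix, local in local_patterns.items()
    ]
    return re.compile("^(" + "|".join(alternatives) + ")$")


ID_PATTERN = build_id_pattern(LOCAL_PATTERNS)


def is_conformant(curie: str) -> bool:
    """True if the whole CURIE matches one of the prefix:local patterns."""
    return ID_PATTERN.match(curie) is not None
```

Escaping the prefix matters here: an unescaped `.` would also accept strings such as `KEGGXORTHOLOGY:K00001`.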

turbomam commented on June 20, 2024

Was microbiomedata/nmdc-metadata issue 360

I will be adding local part patterns to the yaml files in this repo.

@ssarrafan @wdduncan @cmungall

turbomam commented on June 20, 2024

See notes from @cmungall at PR #70, especially

I suggested the parens to indicate that we need other IDs, e.g (FOO|BAR|...)

turbomam commented on June 20, 2024

@wdduncan and @cmungall: I can't find patterns for COG or RetroRules at the Bioregistry, or sample usages in the MongoDB.

example working query:

> db.raw.functional_annotation_set.find({"has_function": {"$regex": "^pfam", $options: 'i'}})
{ "_id" : ObjectId("6011a09275ead576bdc24c02"), "subject" : "nmdc:Ga0482148_260452_3_287", "has_function" : "PFAM:PF00001", "was_generated_by" : "nmdc:8a43ec3baf8aafe09d96eb7fbf58c916" }
{ "_id" : ObjectId("6011a0d2666867f660864500"), "subject" : "nmdc:Ga0482235_197390_1_279", "has_function" : "PFAM:PF00001", "was_generated_by" : "nmdc:e763e255fa74e2629d7d86e10f838d4b" }
{ "_id" : ObjectId("6011a1113350938c11bd6527"), "subject" : "nmdc:Ga0482263_74753_2_277", "has_function" : "PFAM:PF00001", "was_generated_by" : "nmdc:686818cb31dc45d3d4482847ec007584" }

But neither of these returns any matches:

  • db.raw.functional_annotation_set.find({"has_function": {"$regex": "^cog", $options: 'i'}})
  • db.raw.functional_annotation_set.find({"has_function": {"$regex": "^retrorules:", $options: 'i'}})
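
The matching logic of these queries (a case-insensitive regex anchored at the start of `has_function`) can be sketched in plain Python against in-memory documents. This is only an illustration: it does not talk to MongoDB, `find_by_prefix` is a hypothetical helper, and `docs` holds just the three records shown above.

```python
import re

# The three documents returned by the working PFAM query above
# (only the fields needed for matching are kept).
docs = [
    {"subject": "nmdc:Ga0482148_260452_3_287", "has_function": "PFAM:PF00001"},
    {"subject": "nmdc:Ga0482235_197390_1_279", "has_function": "PFAM:PF00001"},
    {"subject": "nmdc:Ga0482263_74753_2_277", "has_function": "PFAM:PF00001"},
]


def find_by_prefix(documents, prefix):
    """Return documents whose has_function starts with prefix, ignoring case."""
    pattern = re.compile("^" + re.escape(prefix), re.IGNORECASE)
    return [d for d in documents if pattern.search(d.get("has_function", ""))]


pfam_hits = find_by_prefix(docs, "pfam")
cog_hits = find_by_prefix(docs, "cog")
retrorules_hits = find_by_prefix(docs, "retrorules:")
```

Including the trailing colon in the search string, as the RetroRules query does, avoids matching a longer prefix that merely begins with the same letters.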

turbomam commented on June 20, 2024

@ssarrafan do you have a sense of who raised this concern? Can I close it?

turbomam commented on June 20, 2024

It doesn't seem like it's really specific to checks on the contents of a JSON file by the JSON schema serialization of the schema.

turbomam commented on June 20, 2024

Is the concern especially about validating KEGG-related CURIEs?

ssarrafan commented on June 20, 2024

> @ssarrafan do you have a sense of who raised this concern? Can I close it?

This is from 2021 so I don't remember which meeting this came from. I would say it can probably be closed.
