Comments (20)
Thanks @cthoyt and @pfabry for lexmatch!
It would be nice to post it on the LSDAO new ontology request issue. Thanks @pfabry !
from obofoundry.github.io.
While many of these matches make sense, some are totally off, such as
CAROLIO:0003210 locoregional therapy
ncit:C25388 Local-Regional (0.54)
ncit:C94796 Locally Recurrent Malignant Neoplasm (0.54)
Thus, this should not be an automated review for the dashboard, but rather presented to the ontology submitter for their review, so that they can be encouraged to import classes rather than recreate them, where appropriate.
from obofoundry.github.io.
@cthoyt
Could you please run the script for other new ontologies? Namely: aFPO, GALLONT, LSDAO. In addition, NCIT could be removed from the matches.
Thanks!
from obofoundry.github.io.
Could you please run the script for other new ontologies?
@cthoyt so this is not on your plate, can you share the script so we can set up a GitHub Action to do this?
from obofoundry.github.io.
Our thinking so far is this:
- It is better to do it somewhat approximately than not to do it. A lot of matches reveal a pattern.
- The burden is on the submitters. They see the match, they check it and say: "this is not the same thing".
Re OMIT, I cannot say anything. We could push our "foundry status" back and define it as "passing the dashboard" - and only match against these. Just spitballing.
from obofoundry.github.io.
This is a bit of a complicated criticism.
(1) We do not want to let the 4th ontology defining Alzheimer's and glucose into the OBO Foundry Ontology Library
(2) We have no good way to separate GUOBO modules (components of the Grand Unified OBO Ontology) from Application/Project ontologies.
We will not be able to define "COB-Branch owning" ontologies in any reasonable timeframe. However, we could, possibly, use "bottom-up" COB mapping curation here to say: For new ontologies, only matches against ontologies mapped in COB are relevant. This is a bit shady, to be clear (not unreasonable, just a bit shady), as we refer from one system (OBO Library membership) to another (COB mappings), but I would be ok with that as well.
But IMO the need to achieve (1) outweighs all other concerns you raised. We can get a touch of "qualitative" in there by adding an SOP that the ontology reviewer can apply judgement if some of these matches are blocking or not.
What is the alternative?
from obofoundry.github.io.
Well, I don't really think this is quite the same. We are looking for a way to ensure that ontologies wanting to join the OBO Foundry do not significantly overlap with (key?) ontologies in OBO Foundry. And label matching is the only way I can think of right now to at least get started with this. What is the principle we should derive from a recommendation approach? I need to know at least how we should act now, short term. For example, is your preference that for the 5 ontologies currently under review, we do not run lexical matching? If so, what is your suggestion exactly?
from obofoundry.github.io.
Here is an example of the problem
This would not have been found by text mining. But clearly there needs to be coordination between these two ontologies. The correct way to do this is by looking at the scope of the new ontology and the scope of existing ontologies. Currently this takes some very minimal knowledge of what is in OBO (I am not sure why we aren't doing this). This part could easily be semi-automated, e.g. by LLMs. But frankly everyone reviewing ontologies should be aware of the scope of different ontologies in OBO, especially widely used ones like PO.
from obofoundry.github.io.
@matentzn I have a new repo where I am building up and storing various lexical indexes, it now has a pre-built one for OBO to make this much more user friendly (and not have to parse the resources yourself)
https://github.com/biopragmatics/biolexica/tree/main/lexica/obo
from obofoundry.github.io.
Thank you @cthoyt - I am not too sure though what Chris's position is here :D As soon as there is some agreement somewhere, I will assign someone to work on this!
from obofoundry.github.io.
There doesn't necessarily have to be any agreement anywhere. OBO reviews are open and I can always re-run my script for each new ontology and post the results to the issue thread (I already sort of automated it). Any requester who disregards reasonable suggestions from this process has other bigger problems.
from obofoundry.github.io.
@cthoyt I made a case at the last call that the OBO New Ontology Request Manager should be running your script as part of the official pipeline, so that you don't have to distribute your attention too much. If, while I am trying to make this overlap checking more official, you could keep making these "overlap" posts and add a sentence:
This is (also) for the OBO Ontology Reviewer to assess overlap between the proposed ontology and existing ontologies.
To make clear that the reviewer should actually consider this, I would be grateful!
Thanks a ton!
from obofoundry.github.io.
Hello @matentzn @cmungall , all...
As usual, I just quickly went through the issue...
I am assuming you're familiar with the functionality in BioPortal (and OntoPortal) that automatically computes the "lexical matches" (using LOOM) with all the other ontologies in the portal. I often don't promote this much as a "mapping" feature (because we all know lexical matches are very limited), but I often argue that OntoPortal is the only place where, when one drops in an ontology, they get an automatic lexical overlap with all the other ontologies within the next hour.
I mean could it help you address your need here?
Would that make sense to build on this feature to improve it?
from obofoundry.github.io.
@jonquet thanks for chiming in. The real problem lies in the fact that the ontologies we need to check are not loaded in any indexed infrastructure (including BioPortal). @cthoyt's idea is basically to have one massive lexical index covering all of OBO dumped, and have a script quickly compare an incoming ontology with that index. I am not sure if BioPortal should be covering this specific use case, as it is primarily concerned with ontologies outside of BioPortal.
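To make the "one index, one quick comparison" idea concrete, here is a minimal sketch in plain Python. It is not the actual script or the biolexica index discussed in this thread: the function names, the identifiers, and the toy labels are all made up for illustration, and it only does exact matching on normalized labels rather than the scored matching shown in the results above.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Lowercase a label and replace punctuation with spaces, collapsing whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in label.lower())
    return " ".join(cleaned.split())


def build_index(terms):
    """Map each normalized label to the set of CURIEs that carry it."""
    index = defaultdict(set)
    for curie, label in terms:
        index[normalize(label)].add(curie)
    return index


def match(incoming, index):
    """Return {incoming CURIE: sorted existing CURIEs sharing its normalized label}."""
    hits = {}
    for curie, label in incoming:
        found = index.get(normalize(label))
        if found:
            hits[curie] = sorted(found)
    return hits


# Toy data: these identifiers and labels are hypothetical, not real OBO terms.
existing = [("EX:0000001", "Local-Regional"), ("EX:0000002", "glucose")]
incoming = [
    ("NEW:0000001", "local regional"),
    ("NEW:0000002", "Glucose"),
    ("NEW:0000003", "locoregional therapy"),
]

index = build_index(existing)  # built once, reused for every incoming ontology
result = match(incoming, index)
print(result)
```

The index is built once over the existing ontologies, so checking a newly submitted ontology is just one dictionary lookup per label. Exact matching like this misses near-synonyms, which is part of why the results still need a human reviewer.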
from obofoundry.github.io.
@pfabry cc @OBOFoundry/obo-foundry-operations-committee
While I think there is value in doing lexical matching to assess overlap, I agree with Chris that it should be used a bit more wisely. The lexmatch should provide some evidence of non-reuse. But this is not just about using IRIs of existing ontologies. This should result in questions such as:
- GALLONT seems to define characteristics. Are these aligned with PATO? Why are they not added in PATO?
- Should we really tell people if there is overlap with BTO, NCIT? These are all ontologies that contain everything under the sun.
I would suggest if Paul continues creating these, that we:
- Create an exclude list of ontologies including BTO, NCIT, perhaps OMIT
- Not including matching results from these ontologies
- Add a note to the comment by the NOR reviewer that "These matches are only for indication and do not constitute a formal part of the review. The ontology reviewer may refer to this information to illustrate patterns where re-use could be improved".
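The exclude-list suggestion could look roughly like the sketch below. It assumes the matches are available as `(source, target, score)` tuples, which is an assumption about the output format and not necessarily what the script produces. The first two rows are the NCIT matches quoted earlier in this thread; the third row and the function names are made up for illustration.

```python
# Prefixes of ontologies whose matches should be hidden (from the proposal above).
EXCLUDED_PREFIXES = {"bto", "ncit", "omit"}


def prefix_of(curie: str) -> str:
    """Return the lowercased prefix of a CURIE such as 'ncit:C25388'."""
    return curie.split(":", 1)[0].lower()


def drop_excluded(matches, excluded=EXCLUDED_PREFIXES):
    """Keep only matches whose target ontology is not on the exclude list."""
    return [m for m in matches if prefix_of(m[1]) not in excluded]


matches = [
    ("CAROLIO:0003210", "ncit:C25388", 0.54),
    ("CAROLIO:0003210", "ncit:C94796", 0.54),
    ("EX:0000001", "demo:0000002", 0.90),  # hypothetical row, not a real match
]

kept = drop_excluded(matches)
print(kept)
```

Keeping the list as a plain set of prefixes makes it easy to revisit later, for example if the group decides whether OMIT belongs on it.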
from obofoundry.github.io.
I would suggest if Paul continues creating these, that we:
1. Create an exclude list of ontologies including BTO, NCIT, perhaps OMIT
2. Not including matching results from these ontologies
3. Add a note to the comment by the NOR reviewer that "These matches are only for indication and do not constitute a formal part of the review. The ontology reviewer may refer to this information to illustrate patterns where re-use could be improved".
I agree that the lexical match could be a valuable informative tool but should be used with caution as it is "only" a lexical match. I agree with the 3 propositions, but I think the lexical match could be done even earlier, at the pre-registration checklist.
One of the checks is the following:
For every term in my ontology, I checked whether another OBO Foundry ontology has one with the same meaning. If so, I re-used that term directly (not by cross-reference, by directly using the IRI).
Of course, label != meaning, but the lexical match could provide a general overview for the submitter.
Thank you VERY much for the script. However, while I have been able to use it for GALLONT, I can't make it work for LSDAO and I really don't know why. I created an issue about this.
from obofoundry.github.io.
Just a heads up and a question. Thanks to @cthoyt I have been able to run the lexmatch for the LSDAO ontology. @zhengj2007, as the reviewer of this ontology, how do you want to proceed? Do you want me to send you the file, post it directly in the issue, or not send it at all?
from obofoundry.github.io.