
adc-disciplines's People

Contributors

amoeba, mbjones

Forkers

amoeba

adc-disciplines's Issues

Feedback on discipline class hierarchy and ontology

@mbjones asked me to look over the latest copy of the disciplines ontology (ADCAT). I looked at whether the class hierarchy makes sense and I also looked at the ontology itself.

Class hierarchy notes

  1. I see a class "General Genetics". Why not just "Genetics" here?
  2. Should we toss in an "Evolutionary Biology" class under "Biology"?

My feedback here is pretty superficial since I'm not super familiar with the breadth and depth of submissions the ADC is managing nowadays.

Ontology notes

  1. Do we want more annotation properties on these classes right now? I know it'd take time to work up definitions for everything, for example.
  2. I'm seeing some funny stuff when I open things in Protégé. Namely, the Ontology Header section is empty. When I looked at the TTL file, I saw that it uses the odo prefix, which is defined as @prefix odo: <.> .

I went about fixing that and Protégé was much happier. Let me know if that makes sense and I'll toss this patch up:

diff --git ADCAT.ttl ADCAT.ttl
index 84b5c31..bd298a3 100644
--- ADCAT.ttl
+++ ADCAT.ttl
@@ -1,12 +1,13 @@
 @base <https://purl.dataone.org/odo/ADCAT_> .
 @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
-@prefix odo: <.> .
+@prefix odo: <https://purl.dataone.org/odo/> .
 @prefix dc: <http://purl.org/dc/elements/1.1/> .
 @prefix obo: <http://purl.obolibrary.org/obo/> .
 @prefix owl: <http://www.w3.org/2002/07/owl#> .
 @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
 @prefix terms: <http://purl.org/dc/terms/> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
+@prefix adcat: <https://purl.dataone.org/odo/ADCAT_> .

 odo:ADCAT_
     terms:created "2021-11-10"^^xsd:date ;
@@ -16,324 +17,323 @@ odo:ADCAT_
     terms:title "Arctic Data Center Annotation Terms Ontology (ADCAT)" ;
     owl:versionIRI <ADCAT/0.2.0> ;
     owl:versionInfo "Version 0.2.0" ;
-    <rdf:type> owl:Ontology .
+    rdf:type owl:Ontology .

-odo:ADCAT_00000
+adcat:00000
     a owl:Class ;
     rdfs:label "Academic Discipline" .

-odo:ADCAT_00001
+adcat:00001
     a owl:Class ;
     rdfs:label "Humanities and Social Sciences" ;
-    rdfs:subClassOf odo:ADCAT_00000 .
+    rdfs:subClassOf adcat:00000 .

[patch above is truncated since the rest of it repeats]
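
One way to sanity-check a patch like this is to confirm that the renamed terms still expand to the same IRIs as before. The sketch below is a rough, standard-library-only approximation of Turtle prefix expansion, not a real parser. The `expand` helper is hypothetical, and it assumes the relative IRI <.> is resolved against @base following RFC 3986, which is what `urljoin` implements.

```python
from urllib.parse import urljoin

# @base taken from the first line of ADCAT.ttl in the patch above.
BASE = "https://purl.dataone.org/odo/ADCAT_"


def expand(pname, prefixes):
    """Expand a prefixed name such as 'odo:ADCAT_00001' to a full IRI."""
    prefix, _, local = pname.partition(":")
    return prefixes[prefix] + local


# Prefix table before the patch: <.> is a relative IRI, so resolve it
# against @base the way a Turtle parser is expected to.
before = {"odo": urljoin(BASE, ".")}

# Prefix table after the patch: both prefixes are written out in full.
after = {
    "odo": "https://purl.dataone.org/odo/",
    "adcat": "https://purl.dataone.org/odo/ADCAT_",
}

old_iri = expand("odo:ADCAT_00001", before)
new_iri = expand("adcat:00001", after)

print(old_iri)
print(new_iri)
```

If the two printed IRIs match, the patch changes only how the terms are written, not what they identify. In that case the empty header in Protégé may have more to do with `<rdf:type>`, which Turtle reads as a full IRI rather than as the prefixed name rdf:type.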

Make ADCAD URIs resolve somewhere

ADCAD URIs (https://purl.dataone.org/odo/ADCAD_) should resolve somewhere useful. For our various ontologies, we've been using a mix of:

  1. PyLODE pages, as you can see on https://ontologies.dataone.org
  2. BioPortal redirects, as in https://bioportal.bioontology.org/ontologies/ADCAD/?p=classes&conceptid=https%3A%2F%2Fpurl.dataone.org%2Fodo%2FADCAD_00075

Since ADCAD is primarily a class hierarchy, I think the collapsible tree display BioPortal offers is better than what we get out of the box with PyLODE. That said, PyLODE does support a top-level image, which we could set to the image in the readme: https://github.com/NCEAS/adc-disciplines/blob/main/adc-disciplines.png.

I'm more or less ambivalent. Does anyone else have a preference?

align with wikidata identifiers

Each class closely corresponds to a Wikidata entry. We need to look them up, add them to the adc-disciplines.csv file, and regenerate the ontology.
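
As a rough sketch of what the regeneration step could emit, the snippet below turns CSV rows into one alignment triple per class. Everything about the CSV layout here is an assumption: the column names `id`, `label`, and `wikidata_id` are hypothetical, the Wikidata ID is a placeholder rather than a real lookup, and skos:closeMatch is just one plausible choice of mapping property.

```python
import csv
import io

# Hypothetical excerpt of adc-disciplines.csv. The column names and the
# Wikidata ID (Q0000000) are placeholders for illustration only.
SAMPLE_CSV = """id,label,wikidata_id
2,Humanities,
13,Biochemistry,Q0000000
"""


def wikidata_triples(csv_text):
    """Yield one Turtle statement for each row that has a Wikidata ID."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        qid = row["wikidata_id"].strip()
        if not qid:
            continue  # no match recorded for this class yet
        term = f"odo:ADCAT_{int(row['id']):05d}"
        yield f"{term} skos:closeMatch <http://www.wikidata.org/entity/{qid}> ."


triples = list(wikidata_triples(SAMPLE_CSV))
for line in triples:
    print(line)
```

The generated file would also need an @prefix declaration for skos (http://www.w3.org/2004/02/skos/core#). Rows without a Wikidata ID are skipped, so the alignment can be filled in incrementally.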

deprecate ethnology and add Indigenous Studies

In the Arctic Data Center social science training, two items came up related to the disciplines list:

Ethnology is now considered to be synonymous with anthropology. Perhaps historically ethnology was considered a field of its own, but today it is at best just a part of anthropology. It also has some history tied to racism (e.g., cranial size measurements), so it is perhaps best to deprecate it and give it a 1:1 relationship with anthropology.

Indigenous Studies was suggested as a new sub-discipline of Social Science. It was pointed out that it would be best not to place it under anthropology, given the historical practices of anthropologists studying Indigenous cultures. Although it might seem odd to add Indigenous Studies and not other group-specific studies (e.g., women's studies), a participant pointed out that given our position as the Arctic Data Center and the push to be inclusive of Indigenous voices, adding this discipline would be a good way to be inclusive and give voice to those researchers.

@mbjones let me know if I missed anything in that conversation

identifiers with spurious unicode characters

The R script that generates ADCAT.ttl does so with a function that creates the class names for each of the terms. Those are drawn from the integer identifiers in the CSV file, which are then padded to create the class names, like so:

odo:ADCAT_00013
    a owl:Class ;
    rdfs:label "Biochemistry" ;
    rdfs:subClassOf odo:ADCAT_00011 .

The padding works fine for identifiers of 10 or greater, but for identifiers 0 to 9 it adds a spurious Unicode character, \u0020 (a space), and mangles the prefixed URI format, as follows:

<ADCAT_000\u00202>
    a owl:Class ;
    rdfs:label "Humanities" ;
    rdfs:subClassOf odo:ADCAT_00001 .

Interestingly the parent reference in the subclass triple seems to be created fine. Not sure what's up here. @amoeba would love to get your review of this if you have a minute.
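
The R script itself isn't shown here, so the following is only a guess at the failure mode, sketched in Python. It assumes the identifier is first padded to a common width of two with spaces and then prefixed with "000", which would reproduce the output above. Both function names are hypothetical.

```python
def class_name_space_padded(identifier):
    # Suspected failure mode (an assumption): right-justify the number to
    # width 2 with spaces, then prepend a fixed "000".
    return "ADCAT_000" + str(identifier).rjust(2)


def class_name_zero_padded(identifier):
    # Zero-pad the integer to five digits in a single formatting step.
    return f"ADCAT_{identifier:05d}"


broken = class_name_space_padded(2)
fixed = class_name_zero_padded(2)

print(repr(broken))
print(repr(fixed))
```

Two-digit identifiers come out the same either way, which would match the observation that only 0 to 9 are affected. If this is indeed the cause, an equivalent one-step zero-pad in R would be something like sprintf("%05d", id).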

align with re3data terms

The ADCAT terms were derived from the re3data subject classification, and I have the re3data identifier for each class in the CSV file that is used to generate the ontology. We need to add a new triple aligning each of the ADCAT classes with its peer in re3data.

add missing disciplines

When using this taxonomy for our advisory board recruitment, it became clear that some key disciplines are missing, including:

  • Cyberinfrastructure
  • Data science

Let's re-evaluate this list and then put it into action in our dataset classifications. cc @jeanetteclark
