sap218 / ocimido

An ontology for ocular immune-mediated inflammatory diseases

Home Page: https://sap218.github.io/ocimido/

ontology owl ocular inflammation inflammatory disease diseases disorder biomedical-ontologies layman-terms

ocimido's Introduction

OcIMIDo

Ocular Immune-Mediated Inflammatory Diseases Ontology

Important information

  • Issues - see for ontology suggestions, bug reporting, and future development information.
  • License - see before use (you can use/edit OcIMIDo as long as you give appropriate credit).
  • Changelog - see major changes of OcIMIDo throughout development.

Creators

  • Samantha Pendleton Institute of Cancer and Genomic Sciences, University of Birmingham, UK
  • Tasanee Braithwaite The Medical Eye Unit, St Thomas’ Hospital NHS Foundation Trust, London, UK

Citing

@article{PENDLETON2021104542,
title = {Development and application of the ocular immune-mediated inflammatory diseases ontology enhanced with synonyms from online patient support forum conversation},
journal = {Computers in Biology and Medicine},
volume = {135},
pages = {104542},
year = {2021},
issn = {0010-4825},
doi = {https://doi.org/10.1016/j.compbiomed.2021.104542},
url = {https://www.sciencedirect.com/science/article/pii/S001048252100336X},
author = {Samantha C. Pendleton and Luke T. Slater and Andreas Karwath and Rose M. Gilbert and Nicola Davis and Konrad Pesudovs and Xiaoxuan Liu and Alastair K. Denniston and Georgios V. Gkoutos and Tasanee Braithwaite},
}

ocimido's Issues

Text Splitting

  • split the data by thread, using all comments (not by user)
  • sort each thread into a bin by length: short/medium/long
  • extract 20% from each bin as the test data set: set it aside and do not touch it
  • the training set is the remaining 80% of each bin
  • extract synonyms for the ontology
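
The steps above could be sketched roughly as follows. This is only an illustration: the function names, the bin thresholds (`short_max`, `medium_max`) and the choice to measure thread length as number of comments are assumptions, not something decided in this issue.

```python
import random
from collections import defaultdict


def length_bin(comments, short_max=5, medium_max=20):
    """Assign a thread to a bin by its number of comments.

    The thresholds are placeholder assumptions; pick them from the
    real distribution of thread lengths.
    """
    n = len(comments)
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"


def stratified_split(threads, test_fraction=0.2, seed=0):
    """Split thread ids into train/test, sampling the same fraction per bin.

    `threads` maps a thread id to the list of its comments.
    """
    rng = random.Random(seed)  # fixed seed so the held-out set is reproducible
    bins = defaultdict(list)
    for thread_id, comments in threads.items():
        bins[length_bin(comments)].append(thread_id)

    train, test = [], []
    for name in ("short", "medium", "long"):
        ids = sorted(bins[name])  # sort first so shuffling is deterministic
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test
```

Splitting on thread ids rather than on individual comments keeps every comment of a thread on the same side of the split, so the test set stays untouched by anything seen during synonym extraction.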

tf-idf of terms, add synonyms from forum training set

After #1, calculate word frequencies with respect to the length of threads using tf-idf. Doing this iteratively (removing terms directly expressed in the ontology each time) will inform you on synonyms and new terms to add, until you have coverage of all relevant terms in the ontology.

It looks like tf-idf isn't in nltk directly, but you can use nltk to calculate the measure: https://nlpforhackers.io/tf-idf/

It looks like scikit-learn does have the functionality directly though: https://www.bogotobogo.com/python/NLTK/tf_idf_with_scikit-learn_NLTK.php
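
To make the iterative idea concrete, here is a small standard-library sketch of one round of the loop. It is not the implementation used for the paper: the tokenizer, the function names and the use of length-normalised term frequency with `log(N / df)` as the idf are all assumptions, and in practice scikit-learn's `TfidfVectorizer` would do the scoring.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Very rough tokenizer (assumption): lowercase alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def top_tfidf_terms(threads, known_terms, top_n=10):
    """Return the highest-scoring terms not already covered by the ontology.

    `threads` is a list of strings (one per thread); `known_terms` is a set
    of lowercase tokens that are already labels or synonyms in the ontology.
    """
    # Drop terms the ontology already expresses before scoring.
    docs = [[t for t in tokenize(text) if t not in known_terms]
            for text in threads]
    n_docs = len(docs)

    # Document frequency: in how many threads does each term appear?
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = Counter()
    for doc in docs:
        if not doc:
            continue
        for term, count in Counter(doc).items():
            tf = count / len(doc)  # normalise by thread length
            idf = math.log(n_docs / df[term])
            scores[term] = max(scores[term], tf * idf)
    return [term for term, _ in scores.most_common(top_n)]
```

Each round, the returned candidates would be reviewed by hand, the accepted ones added to the ontology and to `known_terms`, and the function run again on the training threads until nothing relevant is left.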

test coverage of ontology annotation on ov forum

After #3, when all threads in the training set are covered by the terms in the ontology, we should evaluate how this extrapolates to the test set by annotating it in turn.

we will have to discuss the best way to do this exactly, perhaps seeing if we have a similar number of labels per thread on average, or evaluate whether there's a similar distribution of labels across the graph (which might also produce a nice graphic for the paper)
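
One possible way to compare the two sets, sketched under heavy assumptions: annotation is reduced here to whole-word matching of ontology labels and synonyms in the thread text, and the function names and summary keys are invented for illustration. A real annotator would likely be more sophisticated.

```python
import re
from statistics import mean


def annotate(text, labels):
    """Return the set of labels that occur as whole words/phrases in the text."""
    lowered = text.lower()
    return {
        label for label in labels
        if re.search(r"\b" + re.escape(label.lower()) + r"\b", lowered)
    }


def coverage_summary(threads, labels):
    """Summarise annotation coverage for a non-empty list of thread texts."""
    counts = [len(annotate(text, labels)) for text in threads]
    return {
        "mean_labels_per_thread": mean(counts),
        "fraction_annotated": sum(c > 0 for c in counts) / len(counts),
    }
```

Running `coverage_summary` once on the training threads and once on the held-out test threads gives two comparable summaries; a large gap between them would suggest the synonyms gathered from the training set do not generalise well.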

Change ontology base

currently the fully qualified IRIs are e.g. https://github.com/sap218/ocular-immune-mediated-inflammatory-disease-ontology/blob/master/ontology/ocimido.owl#OCIMIDO_00001

This is a bit long, but more importantly we should change it to a domain we have some control over (the above becomes problematic if GitHub ever decide to change their URL layout, you want to change your username, you transfer the repo to a GitHub org, GitHub goes out of business, etc.) - I would recommend:

http://www.bham.ac.uk/OCIMIDO/

We should be able to open a ticket with IT services to get that URL redirected to here. The same goes for the ontologyIRI, as well as just the base defined in the ontology
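
As a rough sketch of what the change amounts to, the helper below rewrites IRIs from the current GitHub-based prefix to the proposed base. It assumes entity IRIs are formed by appending the local id (e.g. `OCIMIDO_00001`) directly to the new base; that convention, and the function names, are assumptions for illustration only.

```python
OLD_BASE = (
    "https://github.com/sap218/"
    "ocular-immune-mediated-inflammatory-disease-ontology/"
    "blob/master/ontology/ocimido.owl#"
)
NEW_BASE = "http://www.bham.ac.uk/OCIMIDO/"  # base proposed in this issue


def rebase_iri(iri, old_base=OLD_BASE, new_base=NEW_BASE):
    """Swap the old prefix for the new one; leave unrelated IRIs untouched."""
    if iri.startswith(old_base):
        return new_base + iri[len(old_base):]
    return iri


def rebase_text(owl_text, old_base=OLD_BASE, new_base=NEW_BASE):
    """Apply the same prefix swap to every occurrence in a serialised ontology."""
    return owl_text.replace(old_base, new_base)
```

A plain text replacement like this only covers entity IRIs written with the full prefix; the ontology IRI and any base declaration would still need checking by hand or in an ontology editor.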
