obo_parser's Introduction

obo_parser

A simple Ruby gem for parsing OBO 1.2 (possibly also 1.4) formatted ontology files. Useful for reporting, comparing, and mapping data to other databases. There is presently no functionality for logical inference across the ontology.

Installation

gem install obo_parser

Use

General

require 'obo_parser'
o = parse_obo_file(File.read('my_ontology.obo'))  # => An OboParser instance  
first_term = o.terms.first                        # => An OboParser#Term instance 

first_term.id.value                                 # => 'HAO:1234'

d = first_term.def                                  # => An OboParser#Tag instance
d.tag                                               # => 'def'
d.value                                             # => 'Some definition'
d.xrefs                                             # => ['xref:123', 'xref:456'] 
d.comment                                           # => 'Some comment'

t = first_term.name                                 # => An OboParser#Tag instance    
t.tag                                               # => 'name'
t.value                                             # => 'Some Term name' 

other = first_term.other_tags                       # => [OboParser#Tag, ... ] An array of tags that are not specially referenced in an OboParser::Stanza
other.first                                         # => An OboParser#Tag instance

first_typedef = o.typdefs.first                     # => An OboParser#Typdef instance
first_typedef.id.value                              # => 'Some typedef id'
first_typedef.name.value                            # => 'Some typedef name'

o.terms.first.tags_named('synonym')               # => [OboParser#Tag, ... ]
o.terms.first.tags_named('synonym').first.tag     # => 'synonym'
o.terms.first.tags_named('synonym').first.value   # => 'Some label'

o.terms.first.relationships                       # => [['relationship', 'FOO:123'], ['other_relationship', 'FOO:456'] ...] An array of [relation, related term id], includes 'is_a', 'disjoint_from' and Typedefs
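
Because relationships come back as plain [relation, related term id] pairs, they can be regrouped with core Ruby. The following is a minimal sketch on hypothetical sample data shaped like that return value; the ids and relation names are made up for illustration and no gem call is involved.

```ruby
# Hypothetical sample shaped like the array returned by Term#relationships:
# each entry is [relation, related term id].
relationships = [
  ['is_a', 'FOO:001'],
  ['part_of', 'FOO:123'],
  ['is_a', 'FOO:002'],
  ['disjoint_from', 'FOO:456']
]

# Collect the related term ids under each relation name.
by_relation = relationships
  .group_by(&:first)
  .transform_values { |pairs| pairs.map(&:last) }

# Ids of the direct is_a parents (empty array if there are none).
parents = by_relation.fetch('is_a', [])
puts parents.join(', ')
```

The same grouping should apply to real output, assuming it has the shape described above.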

Convenience methods

o.term_hash                                       # => { term (String) => id (String), ... for each [Term] in the file. } !! Assumes term names are unique. They might not be, in which case you get key collisions.
o.id_hash                                         # => { id (String) => term (String), ... for each [Term] in the file. } 

See also /test/test_obo_parser.rb
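
Since term_hash assumes unique names, it can help to check for repeated names before relying on it. This is a rough sketch that works on hypothetical [id, name] pairs (in practice they might be collected from term.id.value and term.name.value); it is not a method provided by the gem.

```ruby
# Hypothetical [id, name] pairs standing in for values read from o.terms.
pairs = [
  ['HAO:0001', 'head'],
  ['HAO:0002', 'leg'],
  ['HAO:0003', 'head']
]

# Build name => [ids] so that a repeated name keeps every id
# instead of overwriting the earlier one.
ids_by_name = pairs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(id, name), index|
  index[name] << id
end

# Names that map to more than one id would collide in a name => id hash.
collisions = ids_by_name.select { |_name, ids| ids.size > 1 }
collisions.each { |name, ids| puts "#{name}: #{ids.join(', ')}" }
```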

Utilities

A small set of methods (e.g. comparing OBO ontologies) utilizing the gem are included in /lib/utilities.rb. For example: 1) shared labels across sets of ontologies can be found and returned; 2) ontologies can be dumped into a simple Cytoscape node/edge format; 3) given a set of correspondences between two ontologies, various reports can be made.
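
To illustrate the idea behind the first utility, here is a small sketch of finding labels shared across ontologies. It is not the code in /lib/utilities.rb: the method name shared_labels is hypothetical, and the inputs are made-up name => id hashes shaped like the output of term_hash.

```ruby
# Hypothetical name => id hashes, shaped like o.term_hash for two ontologies.
foo = { 'head' => 'FOO:0001', 'leg' => 'FOO:0002', 'wing' => 'FOO:0004' }
bar = { 'head' => 'BAR:0100', 'wing' => 'BAR:0200', 'palp' => 'BAR:0300' }

# Labels present in every hash: intersect the key arrays.
# (shared_labels is an illustrative name, not a gem method.)
def shared_labels(*term_hashes)
  term_hashes.map(&:keys).reduce(:&)
end

shared = shared_labels(foo, bar)

# Print one tab-delimited row per shared label with the id from each ontology.
shared.each { |label| puts [label, foo[label], bar[label]].join("\t") }
```

Exact string matching is the simplest possible comparison; differences in case or whitespace between ontologies would need normalizing first.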

Viz

OboParser::Utilities::Viz.mock_coordinate_space(o, size: 100) # => STDOUT tab delimited table with x, y, z, identifier, label 

Contributing

Fork, test, code, test, pull request.

Copyright

Copyright (c) 2010-2017 Matt Yoder. See LICENSE for details.

License

MIT

obo_parser's Issues

Does not parse GO ontology 1.2

Used this gem successfully on the TAIR (arabidopsis.org) GO slim for Arabidopsis thaliana. Worked really well!

Didn't work so well when I tried to use it on the full Gene Ontology:

> parse_obo_file(File.read("GO.obo"))
RuntimeError: cytochrome c
from /home/jwoods/obo_parser/lib/tokens.rb:69:in `initialize'
from /home/jwoods/obo_parser/lib/lexer.rb:52:in `new'
from /home/jwoods/obo_parser/lib/lexer.rb:52:in `match'
from /home/jwoods/obo_parser/lib/lexer.rb:38:in `block in read_next_token'
from /home/jwoods/obo_parser/lib/lexer.rb:37:in `each'
from /home/jwoods/obo_parser/lib/lexer.rb:37:in `read_next_token'
from /home/jwoods/obo_parser/lib/lexer.rb:10:in `peek'
from /home/jwoods/obo_parser/lib/parser.rb:32:in `parse_term'
from /home/jwoods/obo_parser/lib/parser.rb:16:in `parse_file'
from /home/jwoods/obo_parser/lib/obo_parser.rb:169:in `parse_obo_file'
from (irb):3
from /home/jwoods/.rvm/gems/ruby-1.9.2-p180/bundler/gems/rails-f064664de72a/railties/lib/rails/commands/console.rb:45:in `start'
from /home/jwoods/.rvm/gems/ruby-1.9.2-p180/bundler/gems/rails-f064664de72a/railties/lib/rails/commands/console.rb:8:in `start'
from /home/jwoods/.rvm/gems/ruby-1.9.2-p180/bundler/gems/rails-f064664de72a/railties/lib/rails/commands.rb:44:in `<top (required)>'
from script/rails:6:in `require'
from script/rails:6:in `<main>'

I do notice that this has a lot of escaped characters in it, which the code says it can't handle. Is there a workaround?

Many thanks for this! And BTW, I'm working on a hacked-together fork that has some graph code.
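
For readers hitting the escaped-character problem described in this issue, the sketch below shows one way to match a quoted value that contains backslash escapes and then unescape it. It is an illustration only, not the gem's lexer and not a confirmed fix for this error; the constant names and the sample line are hypothetical (the line is not copied from GO.obo).

```ruby
# Matches a double-quoted string whose body may contain backslash escapes
# (for example \" or \,) and captures the body without the outer quotes.
QUOTED = /"((?:[^"\\]|\\.)*)"/

# OBO 1.2 escapes with a special meaning; any other escaped character
# stands for itself.
ESCAPES = { 'n' => "\n", 't' => "\t", 'W' => ' ' }.freeze

# Hypothetical def line for illustration.
line = 'def: "Binds \"cytochrome c\", a heme protein." [FOO:bar]'

# Extract the quoted body, escapes still in place.
body = line[QUOTED, 1]

# Replace each backslash escape with the character it represents.
unescaped = body.gsub(/\\(.)/) { ESCAPES.fetch(Regexp.last_match(1), Regexp.last_match(1)) }
puts unescaped
```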

Feature Request: OBO Enumerator

I just tried parsing the Gene Ontology, which is pretty huge, and my MacBook almost had a heart attack. What do you think about adding a method that enables lazy parsing (i.e. only parse one Term at a time, whenever somebody asks for it), either as a separate method or perhaps with an option such as lazy=true? I've been using this super-simple Python OBO parser until now, but now that I'm trying to package my software into a Ruby gem, of course everything should be just Ruby. And I don't like the idea of having multiple obo_parser gem equivalents floating around, or of everybody starting their own thing from scratch.
I understand this would make many of the sanity/crossref checks impossible but I only care about very specific parts of the terms anyway and would rather have it parse quickly than safely in this case.
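
A rough sketch of what such lazy parsing could look like is given below. It is not part of obo_parser: the LazyObo module and its each_term method are hypothetical names, the sample text is made up, and the code skips all validation, escape handling, and trailing-comment stripping in exchange for reading one [Term] stanza at a time.

```ruby
require 'stringio'

# Hypothetical helper, not part of the obo_parser gem.
module LazyObo
  # Yields each [Term] stanza as a hash of tag => [values], reading the
  # input line by line. Returns an Enumerator when no block is given.
  def self.each_term(io)
    return enum_for(:each_term, io) unless block_given?

    current = nil
    io.each_line do |raw|
      line = raw.strip
      if line.start_with?('[')
        # A new stanza header ([Term], [Typedef], ...) closes the previous one.
        yield current if current
        current = line == '[Term]' ? Hash.new { |h, k| h[k] = [] } : nil
      elsif current && !line.empty? && !line.start_with?('!')
        tag, value = line.split(': ', 2)
        current[tag] << value if value
      end
    end
    yield current if current
  end
end

# Made-up sample input for illustration.
OBO_SAMPLE = <<~OBO
  format-version: 1.2

  [Term]
  id: FOO:0001
  name: first term

  [Typedef]
  id: part_of
  name: part of

  [Term]
  id: FOO:0002
  name: second term
  is_a: FOO:0001 ! first term
OBO

# Only the lines up to the end of the first [Term] stanza are consumed here.
first = LazyObo.each_term(StringIO.new(OBO_SAMPLE)).first
puts first['name'].first
```

With a real file, File.open('my_ontology.obo') { |f| LazyObo.each_term(f).lazy.select { ... }.first(10) } would stop reading as soon as ten matching terms are found.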

Error parsing XREFS

I'm using your gem to parse the Human Phenotype Ontology, specifically this version (https://raw.githubusercontent.com/obophenotype/human-phenotype-ontology/master/hp.obo).

When I try to load it using your example code I get a "Runtime Exception: Facebase is seemingly infinite".

Looking through your code, I see that it happens at line 69 of your "Tokens.rb" file. It seems to be a conceptual error, because that code parses XREFS, whereas the line in hp.obo that triggers the error is a "synonym" of term "HP:0000175".

Tell me if you need any other information to replicate the error.
