PEG Syntax and pest parser for the OBO flat file format 1.4

License: MIT License

Rust 100.00%
obo obofoundry ontology parser pest-grammar pest-parser syntax peg

fastobo-syntax

PEG Syntax and pest lexer for the OBO flat file format 1.4.

Overview

This library is a strict implementation of the OBO flat file format 1.4 syntax using the pest parser generator. It was split out of fastobo to reduce compilation time, since deriving the OboLexer from grammar.pest is slow.

The lexer itself is reexported in fastobo::parser, so there is probably no need to depend on this crate directly.

Strictness

The syntax is a strict implementation of the 1.4 format, with the following exceptions:

  • property_value clauses can have a value that is not quote-enclosed. This is a workaround to support ontology files whose OBO version was generated with obo2owl or the owlapi, which do not quote-enclose property values (owlcs/owlapi#833).
  • ISO-8601 datetimes can only be parsed from the canonical format (YYYY-MM-DDTHH:MM:SS) with an optional timezone. Week dates and calendar dates are not supported.
  • Dates in creation_date clauses can be either full ISO-8601 datetimes (as recommended by the format 1.4 specifications) or simply ISO-8601 dates, which is suggested by the format 1.4 guide (albeit non-normative).
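
To illustrate the two date-related exceptions above, here is a minimal Python sketch of which strings would count as a datetime, a plain date, or neither. The regular expressions and the name classify_creation_date are hypothetical and written for this illustration only; they are not the rules from grammar.pest, and the timezone form (either Z or a +HH:MM / -HH:MM offset) is an assumption.

```python
import re

# Canonical datetime: YYYY-MM-DDTHH:MM:SS with an optional timezone.
# Assumption: the timezone is either "Z" or a "+HH:MM" / "-HH:MM" offset.
ISO_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})?"
)
# Plain date: YYYY-MM-DD.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def classify_creation_date(value: str) -> str:
    """Return "datetime", "date", or "unsupported" for a creation_date value."""
    if ISO_DATETIME.fullmatch(value):
        return "datetime"
    if ISO_DATE.fullmatch(value):
        return "date"
    return "unsupported"
```

Under these assumptions, a week date such as 2019-W17-1 falls into the "unsupported" branch, while 2019-04-29 is treated as a plain date. The sketch only checks the shape of the string, not whether the month, day, or time fields are in range.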

See also

  • fastobo: Abstract Syntax Tree and data structures for the OBO format version 1.4.
  • fastobo-py: Idiomatic Python bindings to the fastobo crate.

Feedback

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker of the project if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

About

This project was developed by Martin Larralde as part of a Master's Degree internship in the BBOP team of the Lawrence Berkeley National Laboratory, under the supervision of Chris Mungall. Please cite this project as:

Larralde M. Developing Python and Rust libraries to improve the ontology ecosystem [version 1; not peer reviewed]. F1000Research 2019, 8(ISCB Comm J):1500 (poster) (https://doi.org/10.7490/f1000research.1117405.1)

fastobo-syntax's Issues

Support independent OBO comments

Currently, fastobo will fail to parse a proper OBO document if it contains independent OBO comments (comments spanning an entire line).
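
Until this is supported, one possible workaround is to drop whole-line comments before handing the text to the parser. The sketch below assumes that such a comment is a line whose first non-blank character is "!"; the helper name strip_line_comments is hypothetical and not part of fastobo.

```python
def strip_line_comments(text: str) -> str:
    """Drop lines whose first non-blank character is "!" (whole-line comments)."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not line.lstrip().startswith("!")
    ]
    return "".join(kept)
```

Trailing comments that follow a clause on the same line are left untouched. Removing lines shifts line numbers, so positions reported in later parser errors will refer to the filtered text rather than the original file.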

Support Windows line endings

Currently, Windows line endings (\r\n in place of \n) crash the parser, although some OBO files are still distributed with them.
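
As a stopgap, the line endings can be normalized before parsing. The following is a minimal sketch of that preprocessing step, not a fix in the grammar itself; the helper name normalize_newlines is hypothetical.

```python
def normalize_newlines(text: str) -> str:
    """Replace Windows (\\r\\n) line endings with Unix (\\n) ones."""
    return text.replace("\r\n", "\n")
```

When the file is read in Python with open() in text mode and the default newline=None, this translation already happens on read, so the helper is mainly useful for text obtained some other way, for example decoded from bytes.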

Error while converting obo file to owl

I am trying to convert an obo file (ncbigene.obo) into owl format. This obo file was obtained via pyobo. Code for pyobo is irrelevant here but for the sake of reproducibility:

    from pyobo.sources.ncbigene import get_obo
    obo_file = get_obo()
    obo_file.write_obo(output_path, use_tqdm=True)

When I run the following:

    obo_doc = fastobo.load(input)
    if "default-namespace" not in obo_doc.header:
        obo_doc.header.insert(99, fastobo.header.DefaultNamespaceClause("NCBIGene"))
    if output_format == "owl":
        fastobo.dump_owl(obo_doc, output_path, format="ofn")

I get:

Traceback (most recent call last):
  ...
.../cli.py", line 100, in convert
    obo_doc = fastobo.load(input)
  File "<stdin>", line 203348962
    def: "\\Probable U2 small nuclear ribonucleoprotein B\\\\\\" []
         ^
SyntaxError: expected QuotedString

Would it be possible for fastobo to handle strings like these?
