nitish6174 / openchemvault

Framework to build chemistry logfile data repository and access it through web

Python 43.53% HTML 33.10% CSS 6.20% JavaScript 17.17%
flask-framework computational-chemistry cclib openbabel

openchemvault's Introduction

Howdy! I'm Nitish

Learning to fiddle with

python golang nodejs typescript grpc react numpy haskell
mongodb mysql elasticsearch neo4j mongodb react git docker kubernetes


openchemvault's People

Contributors

nitish6174

openchemvault's Issues

Script crashing while generating SVG

When generating the SVG for H10C84N4S4, the following out-of-range error occurred:

terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr: __pos (which is 18446744073709551615) > this->size() (which is 0)
Aborted (core dumped)

Even though the makeopenbabel() call is wrapped in a try-except block, the setup script aborts instead of treating this as an exception and continuing with the other files.

This issue was noted with the file

Gaussian/Gaussian03/DCV4T+C60.log

which is provided in the cclib-data repository.

This problem was observed while setting up the database for a large number of files (the entire cclib-data repository).

However, when running the setup script with only the above-mentioned file, the following warning was raised:

*** Open Babel Error  in expand_cycle
  maximum time exceeded...
==============================

but the script eventually finished and produced the SVG.
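
The "terminate called ... Aborted (core dumped)" output suggests an uncaught C++ exception inside native code, which ends the whole process before a Python try-except gets control. One possible workaround, sketched below, is to run each file's SVG generation in a child process so that a crash only kills the child. This is an assumption about how the setup script could be restructured, not the project's actual fix; run_isolated is a hypothetical helper, and the code strings stand in for the real per-file call.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run a Python snippet in a child interpreter.

    Returns True only if the child exits with status 0. A native crash
    (abort, segfault) or a timeout is reported as False instead of
    taking the parent process down with it.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False
    return result.returncode == 0


# os.abort() simulates the hard abort seen in the report above.
crashed_ok = run_isolated("import os; os.abort()")
# Stand-in for a file whose SVG is generated without problems.
clean_ok = run_isolated("print('svg written')")
```

The timeout also bounds slow cases such as the "maximum time exceeded" warning, at the cost of starting one extra interpreter per file.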

TODO plan

Database setup

  • Function to loop through files in directory and parse
  • Determine molecular formula of parsed file
  • Determine InChI corresponding to parsed file
  • Generate SVG for parsed file
  • Setting up molecule document schema
  • Setting up parsed file document schema
  • Finding extreme values of attributes (for filtering purpose in API)
  • Determine IUPAC and common name using InChI/other info
  • Find and store data/properties about molecule using PubChemPy
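
The first two items of the list above can be sketched roughly as follows. The names parse_directory and formula_from_atomnos are hypothetical, the parser is passed in as a callable (in the real setup this would presumably be cclib's ccread), and the symbol table is deliberately partial. Ordering elements by atomic number and always printing the count are assumptions chosen to match the formula quoted in the SVG issue.

```python
import os
from collections import Counter

# Partial table for illustration only; a real one would cover every element.
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O", 16: "S"}


def formula_from_atomnos(atomnos):
    """Build a formula string from atomic numbers, ordered by atomic number."""
    counts = Counter(atomnos)
    return "".join(f"{SYMBOLS[z]}{counts[z]}" for z in sorted(counts))


def parse_directory(root, parse):
    """Walk root and call parse(path) on every file.

    Files for which parse raises or returns None are collected in
    `failed` so that one bad log file does not stop the whole run.
    """
    parsed, failed = {}, []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                data = parse(path)
            except Exception:
                failed.append(path)
                continue
            if data is None:
                failed.append(path)
            else:
                parsed[path] = data
    return parsed, failed
```

With cclib installed, a call such as parse_directory("data", ccread) would be the intended use, followed by formula_from_atomnos(data.atomnos) on each result.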

API development

  • Fetch all molecular formulas (/api/browse/molecules)
  • Fetch all parsed files (/api/browse/files)
  • Fetch documents corresponding to a molecular formula (/api/browse/<formula>)
  • Get details of a particular parsed file (specified with MongoDB document id) (/api/file/<doc_id>)
  • Search API which supports filtering by various attributes available in parsed files (/api/search/<search_params>)
  • Support for searching by molecule's InChI/InChIKey
  • Support for searching by molecule's name
  • Upload a log file (/api/upload)
  • Add a log file to data repository (/api/addfile)
  • Add authentication for APIs

Application web front-end

  • Page to upload log file and view results instantly
  • Page to add log file(s) to the database
  • Browse repository by molecular formula
  • Search page
  • Page to show details of a particular parsed file
  • Button to download particular file data in JSON format (just like given by /api/file/<doc_id>)

Testing

  • Scalability of database
  • Size limit of a parsed file that can be inserted as a MongoDB document
  • Invalid parameters/attributes to handle and detect corrupt log files
  • API testing
  • Non-API URLs
  • Front-end testing (using Selenium)

Miscellaneous

  • Dockerize the app
  • Add option to map port for flask app and MongoDB
  • Deploy an instance on a public server for user testing
  • Add method to find best view when generating SVG
  • Develop regex-like molecular formula matcher
  • Develop support for structural search

Partially completed (from above list)

  • Generate SVG for parsed file : Investigate the crash issue (#4)
  • Search : Add filters corresponding to more attributes
  • Page to show details of a particular parsed file : Identify more attributes suitable for display in tabular form rather than separately

"Upload" versus "Add file" is ambiguous

It should be more obvious from the menu bar text alone that "Upload" parses a file without adding it to the DB, while "Add file" both parses the file and adds it to the DB.

Maybe change "Upload" to "Parse", and say in the body text that this parses without adding.

Also, the "Submit Query" button should have more meaningful text, something specific to each page's main action, like "Parse file" and "Parse and add file".

Docker compose failing at mongodb step on Arch Linux

From sudo docker-compose build:

Step 5/18 : RUN service mongodb start
 ---> Running in a7e2b3f72c20
 * Starting database mongodb
   ...fail!
ERROR: Service 'app' failed to build: The command '/bin/sh -c service mongodb start' returned a non-zero code: 1

Is this because I already have mongodb installed and running? I can test on an Ubuntu 12.04 box.

Docker setup incomplete

The following need to be fixed in the Docker setup:

  • Openbabel installation
  • MongoDB instance creation

Prior to the fix, the following error occurred when import openbabel was executed in Python code running in the Docker container:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    import openbabel as ob
  File "/usr/local/lib/python3.5/dist-packages/openbabel.py", line 32, in <module>
    _openbabel = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/openbabel.py", line 28, in swig_import_helper
    _mod = imp.load_module('_openbabel', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libopenbabel.so.5: cannot open shared object file: No such file or directory
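
The traceback shows that the Python wrapper module was found but the dynamic loader could not locate libopenbabel.so.5. A rough check for whether the shared library is visible to the system's lookup can be done with the standard library, as in the sketch below. shared_library_visible is a hypothetical helper, and a negative result only hints at a search-path problem (for example a library installed under a directory the loader does not search); it does not identify the cause.

```python
import ctypes.util


def shared_library_visible(name):
    """Return True if the platform's library lookup can locate lib<name>."""
    return ctypes.util.find_library(name) is not None


# False inside the container would be consistent with the ImportError above.
openbabel_visible = shared_library_visible("openbabel")
```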

Metadata not being parsed

When seeding the data repository, metadata should be part of the parsed data returned by the ccread() function for all files.

However, when running the database setup in a Docker instance, the metadata attribute is missing from all the documents. This issue was not observed outside Docker.

As a result, since the molecule browse menu shows only the metadata info, the listed items are empty in the Docker instance, whereas they contain some content when running without Docker, as shown:

Outside docker

[screenshot: outsidedocker-au2]

In docker

[screenshot: indocker-au2]

It needs to be investigated why metadata is not being parsed when using Docker.

Case (in)sensitivity and/or regex-based searching

Uploading files from DALTON and then searching the Package field for 'dalton' returned no hits. It would be nice to have an option for case-insensitive or regex-based search on all the fields.
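
Since the repository is stored in MongoDB, one way to get case-insensitive matching is the $regex query operator with the "i" option. The sketch below only builds the filter document. The helper name case_insensitive_filter and the field name "package" are assumptions, as the actual schema and search code are not shown here.

```python
import re


def case_insensitive_filter(field, term, whole_field=True):
    """Build a MongoDB filter that matches term regardless of case.

    The term is escaped so user input is treated literally; with
    whole_field=True the pattern is anchored to match the entire value.
    """
    pattern = re.escape(term)
    if whole_field:
        pattern = "^" + pattern + "$"
    return {field: {"$regex": pattern, "$options": "i"}}


# "package" is a hypothetical field name.
query = case_insensitive_filter("package", "dalton")
```

Such a filter could then be passed to a collection's find() call. Unanchored case-insensitive patterns generally cannot use an index efficiently, so storing a lowercased copy of the field is a common alternative for large collections.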
