
Comments (4)

jfreidin commented on June 3, 2024

Hi Reece, thanks for all your help this weekend!
As soon as I got show-status working, I realized I had a typo in HGVS_SEQREPO_DIR.
hgvs is working fine now as well so I'm closing this issue.

I successfully upgraded to seqrepo 0.4.4, but received the following warning:
hgvs 1.2.4 has requirement biopython==1.69, but you'll have biopython 1.70 which is incompatible.
I think seqrepo was initially installed as a requirement of hgvs, not directly by me.

(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ seqrepo -r seqrepo show-status -i 2018-11-26
/compbio/development/sandbox/jfreidin/miniconda2/envs/pipecycle/lib/python2.7/site-packages/bioutils/_versionwarning.py:12: UserWarning: Support for Python < 3.6 is now deprecated and will be dropped on 2019-03-31. See https://github.com/biocommons/org/wiki/Migrating-to-Python-3.6
  "Support for Python < 3.6 is now deprecated and"
seqrepo 0.4.4
instance directory: seqrepo/2018-11-26, 11.3 GB
backends: fastadir (schema 1), seqaliasdb (schema 1)
sequences: 841931 sequences, 102777417560 residues, 268 files
aliases: 9697847 aliases, 9344237 current, 48 namespaces, 841931 sequences
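A mistyped HGVS_SEQREPO_DIR like the one mentioned above can be caught before running hgvs. The following is a minimal sketch, not part of seqrepo or hgvs: `check_seqrepo_dir` is a hypothetical helper, and it assumes the variable points at an instance directory (for example seqrepo/2018-11-26) containing the two database files named elsewhere in this thread.

```python
import os


def check_seqrepo_dir(env=None, var_name="HGVS_SEQREPO_DIR"):
    """Return a list of problems with the directory named by var_name.

    Hypothetical helper. Assumes the variable should name a seqrepo
    instance directory holding aliases.sqlite3 and sequences/db.sqlite3.
    """
    env = os.environ if env is None else env
    path = env.get(var_name)
    if not path:
        return ["%s is not set" % var_name]
    if not os.path.isdir(path):
        return ["%s=%s is not a directory" % (var_name, path)]
    expected = ["aliases.sqlite3", os.path.join("sequences", "db.sqlite3")]
    # Report every expected database file that is absent.
    return [
        "missing %s" % name
        for name in expected
        if not os.path.isfile(os.path.join(path, name))
    ]


if __name__ == "__main__":
    for problem in check_seqrepo_dir():
        print(problem)
```

An empty list means the directory exists and both files are present; it says nothing about whether their contents are intact.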

from biocommons.seqrepo.

reece commented on June 3, 2024

Please do the following:

  • Send the version number from seqrepo --version. 0.4.2 is current.

  • Send the output of ls -l seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3. You should get something like:

-r--r--r-- 1 reece reece 2.4G Nov 26 08:35 seqrepo/2018-11-26/aliases.sqlite3
-r--r--r-- 1 reece reece 183M Nov 26 08:35 seqrepo/2018-11-26/sequences/db.sqlite3

For good measure, here are checksums:

$ md5sum seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3 
177b5ffdbb0c81adb5089f9972762bfb  seqrepo/2018-11-26/aliases.sqlite3
648d946b7793f318e61fc01c9b588b3c  seqrepo/2018-11-26/sequences/db.sqlite3
  • Snapshots are intentionally made read-only. The permissions should be a-w on the snapshot directory and all contained files. For instance, I have:
dr-xr-xr-x 3 reece reece 4.1k Nov 18  2017 2017-11-18/
dr-xr-xr-x 3 reece reece 4.1k Aug 21 15:33 2018-08-21/
dr-xr-xr-x 3 reece reece 4.1k Oct  3 12:24 2018-10-03/
dr-xr-xr-x 3 reece reece 4.1k Nov 26 08:34 2018-11-26/

The error you show is consistent with a missing or corrupted database file. I would have expected a transfer failure to cause rsync to fail, which in turn would have prevented the temporary instance directory from being renamed to 2018-11-26. That is, the fact that the directory no longer has a temporary name suggests that rsync completed successfully. (Also, rsync sets the permissions after the transfer, because it needs write permission while it is writing the files.)
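The checksum and permission checks above can also be scripted. This is a rough sketch using only the Python standard library, not anything seqrepo provides: the function names are made up for illustration, and the expected digests are the ones quoted above for the 2018-11-26 snapshot.

```python
import hashlib
import os
import stat


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_read_only(path):
    """True if no write bit (user, group, or other) is set, i.e. a-w."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH) == 0


def check_snapshot(instance_dir, expected_md5):
    """Map each relative path to (checksum_matches, read_only)."""
    report = {}
    for rel_path, expected in expected_md5.items():
        full_path = os.path.join(instance_dir, rel_path)
        report[rel_path] = (
            md5_of_file(full_path) == expected,
            is_read_only(full_path),
        )
    return report


# Checksums quoted in this comment for the 2018-11-26 snapshot.
EXPECTED_MD5 = {
    "aliases.sqlite3": "177b5ffdbb0c81adb5089f9972762bfb",
    "sequences/db.sqlite3": "648d946b7793f318e61fc01c9b588b3c",
}
```

Calling `check_snapshot("seqrepo/2018-11-26", EXPECTED_MD5)` from the directory that contains `seqrepo/` returns one `(checksum_matches, read_only)` pair per file; both values should be `True` for an intact snapshot.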


jfreidin commented on June 3, 2024

Hi Reece, rsync appears to complete successfully.
I deleted everything and tried again--same result.
I thought that after rsync completes there is supposed to be decompression and verification.
That doesn't seem to happen.
I don't know if this is relevant, but in this research environment, the filesystem is NFS to S3.
I've previously installed seqrepo on OSX without any issues.
Best,
Jonathan

(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ seqrepo --version
/compbio/development/sandbox/jfreidin/miniconda2/envs/pipecycle/lib/python2.7/site-packages/bioutils/_versionwarning.py:12: UserWarning: Support for Python < 3.6 is now deprecated and will be dropped on 2019-03-31. See https://github.com/biocommons/org/wiki/Migrating-to-Python-3.6
  "Support for Python < 3.6 is now deprecated and"
0.4.2

(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ ls -l seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3
-r--r--r--. 1 jfreidin compbio 2391415808 Nov 26 16:35 seqrepo/2018-11-26/aliases.sqlite3
-r--r--r--. 1 jfreidin compbio  182138880 Nov 26 16:35 seqrepo/2018-11-26/sequences/db.sqlite3

(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ md5sum seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3
177b5ffdbb0c81adb5089f9972762bfb  seqrepo/2018-11-26/aliases.sqlite3
648d946b7793f318e61fc01c9b588b3c  seqrepo/2018-11-26/sequences/db.sqlite3


reece commented on June 3, 2024

Hi Jonathan-

I typo'd previously. 0.4.4 is current. Please update. That turns out to be helpful here.

I now see the problem. show-status should be invoked like this: seqrepo -r seqrepo show-status -i 2018-11-26.

The root directory (-r) is where all of the instances are stored. The instance is the name (== directory name) of a subdirectory of the root. The reason for the root / instance decomposition is that seqrepo management needs a notion of the root directory, but the directory that actually holds the data is the instance directory. I had anticipated that this distinction would be confusing (since I occasionally screwed it up!), so ~5 weeks ago I added a check with a better explanation. Since 0.4.3, you'll get:

$ seqrepo -r /tmp/seqrepo/2018-11-26 show-status   # wrong usage!
...
OSError: Unable to open SeqRepo directory /tmp/seqrepo/2018-11-26/latest

$ seqrepo -r /tmp/seqrepo show-status -i 2018-11-26  # correct usage
seqrepo 0.4.4
instance directory: /tmp/seqrepo/2018-11-26, 2.6 GB
backends: fastadir (schema 1), seqaliasdb (schema 1) 
...

Invoking show-status correctly should solve your problem. However, you probably won't get the error message at the moment because of the other bug you reported (#44).

Please let me know whether show-status now works for you.
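To make the root / instance split concrete, here is a small sketch that turns a full instance path into the `-r` and `-i` arguments used in the correct invocation above. Both functions are hypothetical helpers written for illustration, not part of seqrepo, and they assume POSIX-style paths of the form `<root>/<instance>`.

```python
import os


def split_instance_path(instance_path):
    """Split '<root>/<instance>' into (root_dir, instance_name)."""
    # normpath drops any trailing slash so the last component is the instance.
    normalized = os.path.normpath(instance_path)
    root_dir, instance_name = os.path.split(normalized)
    if not root_dir or not instance_name:
        raise ValueError("expected <root>/<instance>, got %r" % instance_path)
    return root_dir, instance_name


def show_status_argv(instance_path):
    """Build the show-status command line for a given instance path."""
    root_dir, instance_name = split_instance_path(instance_path)
    return ["seqrepo", "-r", root_dir, "show-status", "-i", instance_name]
```

For example, `show_status_argv("/tmp/seqrepo/2018-11-26")` yields the argument list for `seqrepo -r /tmp/seqrepo show-status -i 2018-11-26`, which could then be passed to `subprocess.run`.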

