Comments (4)
Hi Reece, thanks for all your help this weekend!
As soon as I got show-status working, I realized I had a typo in HGVS_SEQREPO_DIR.
hgvs is working fine now as well so I'm closing this issue.
I successfully upgraded to seqrepo 0.4.4, but received the following warning:
hgvs 1.2.4 has requirement biopython==1.69, but you'll have biopython 1.70 which is incompatible.
I think seqrepo was initially installed as a requirement of hgvs, not directly by me.
(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ seqrepo -r seqrepo show-status -i 2018-11-26
/compbio/development/sandbox/jfreidin/miniconda2/envs/pipecycle/lib/python2.7/site-packages/bioutils/_versionwarning.py:12: UserWarning: Support for Python < 3.6 is now deprecated and will be dropped on 2019-03-31. See https://github.com/biocommons/org/wiki/Migrating-to-Python-3.6
"Support for Python < 3.6 is now deprecated and"
seqrepo 0.4.4
instance directory: seqrepo/2018-11-26, 11.3 GB
backends: fastadir (schema 1), seqaliasdb (schema 1)
sequences: 841931 sequences, 102777417560 residues, 268 files
aliases: 9697847 aliases, 9344237 current, 48 namespaces, 841931 sequences
from biocommons.seqrepo.
Please do the following:
- Send the version number from seqrepo --version. 0.4.2 is current.
- Send the output of ls -l seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3. You should get something like:
-r--r--r-- 1 reece reece 2.4G Nov 26 08:35 seqrepo/2018-11-26/aliases.sqlite3
-r--r--r-- 1 reece reece 183M Nov 26 08:35 seqrepo/2018-11-26/sequences/db.sqlite3
For good measure, here are checksums:
$ md5sum seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3
177b5ffdbb0c81adb5089f9972762bfb seqrepo/2018-11-26/aliases.sqlite3
648d946b7793f318e61fc01c9b588b3c seqrepo/2018-11-26/sequences/db.sqlite3
- Snapshots are intentionally made read-only. The permissions should be a-w on the snapshot directory and all contained files. For instance, I have:
dr-xr-xr-x 3 reece reece 4.1k Nov 18 2017 2017-11-18/
dr-xr-xr-x 3 reece reece 4.1k Aug 21 15:33 2018-08-21/
dr-xr-xr-x 3 reece reece 4.1k Oct 3 12:24 2018-10-03/
dr-xr-xr-x 3 reece reece 4.1k Nov 26 08:34 2018-11-26/
The error you show is consistent with a missing or corrupted database file. I would have expected a transfer failure to cause rsync to fail, which would then have prevented the temporary instance directory from being renamed to 2018-11-26. That is, the fact that the name is not a temp directory suggests that rsync completed successfully. (Also, perms are set by rsync after transfer because it needs write permission in order to actually write.)
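The two checks requested above (checksums and a-w permissions) can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names md5sum, is_read_only, and check_snapshot are hypothetical and are not part of seqrepo. The digests in the commented usage are the ones quoted in this thread for the 2018-11-26 snapshot.

```python
import hashlib
import os
import stat


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_read_only(path):
    """True if no write bit (user, group, or other) is set, i.e. a-w."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o222 == 0


def check_snapshot(instance_dir, expected_md5):
    """Check files under instance_dir against expected digests.

    expected_md5 maps a path relative to instance_dir to its hex digest.
    Returns a list of problem descriptions; an empty list means every
    file exists, matches its digest, and is read-only.
    """
    problems = []
    for rel_path, expected in expected_md5.items():
        path = os.path.join(instance_dir, rel_path)
        if not os.path.isfile(path):
            problems.append("missing: " + path)
            continue
        if md5sum(path) != expected:
            problems.append("checksum mismatch: " + path)
        if not is_read_only(path):
            problems.append("writable (expected a-w): " + path)
    return problems


# Example usage with the digests quoted in this thread:
# expected = {
#     "aliases.sqlite3": "177b5ffdbb0c81adb5089f9972762bfb",
#     "sequences/db.sqlite3": "648d946b7793f318e61fc01c9b588b3c",
# }
# for problem in check_snapshot("seqrepo/2018-11-26", expected):
#     print(problem)
```

An empty result only says the two database files are intact and read-only; it does not check the rest of the snapshot.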
Hi Reece, rsync appears to complete successfully.
I deleted everything and tried again--same result.
I thought that after rsync completes there is supposed to be decompression and verification, but that doesn't seem to happen.
I don't know if this is relevant, but in this research environment the filesystem is NFS to S3.
I've previously installed seqrepo on OSX without any issues.
Best,
Jonathan
(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ seqrepo --version
/compbio/development/sandbox/jfreidin/miniconda2/envs/pipecycle/lib/python2.7/site-packages/bioutils/_versionwarning.py:12: UserWarning: Support for Python < 3.6 is now deprecated and will be dropped on 2019-03-31. See https://github.com/biocommons/org/wiki/Migrating-to-Python-3.6
"Support for Python < 3.6 is now deprecated and"
0.4.2
(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ ls -l seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3
-r--r--r--. 1 jfreidin compbio 2391415808 Nov 26 16:35 seqrepo/2018-11-26/aliases.sqlite3
-r--r--r--. 1 jfreidin compbio 182138880 Nov 26 16:35 seqrepo/2018-11-26/sequences/db.sqlite3
(pipecycle) [jfreidin@USAE1CBIOINTP04 jfreidin]$ md5sum seqrepo/2018-11-26/aliases.sqlite3 seqrepo/2018-11-26/sequences/db.sqlite3
177b5ffdbb0c81adb5089f9972762bfb seqrepo/2018-11-26/aliases.sqlite3
648d946b7793f318e61fc01c9b588b3c seqrepo/2018-11-26/sequences/db.sqlite3
Hi Jonathan-
I typo'd previously. 0.4.4 is current. Please update. That turns out to be helpful here.
I now see the problem. show-status should be invoked like this: seqrepo -r seqrepo show-status -i 2018-11-26.
The root directory (-r) is where all of the instances are stored. The instance (-i) is the name (== directory name) of a subdirectory of the root. The reason for the root / instance decomposition is that seqrepo management needs a notion of the root directory, whereas the data is actually read from the instance directory. I had anticipated that this distinction would be confusing (since I occasionally screwed it up!), so ~5 weeks ago I added a check with a better explanation. Since 0.4.3, you'll get:
$ seqrepo -r /tmp/seqrepo/2018-11-26 show-status # wrong usage!
...
OSError: Unable to open SeqRepo directory /tmp/seqrepo/2018-11-26/latest
$ seqrepo -r /tmp/seqrepo show-status -i 2018-11-26 # correct usage
seqrepo 0.4.4
instance directory: /tmp/seqrepo/2018-11-26, 2.6 GB
backends: fastadir (schema 1), seqaliasdb (schema 1)
...
Invoking show-status correctly should solve your problem. However, you probably won't get the error message at the moment because of the other bug you reported (#44).
Please let me know whether show-status now works for you.
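To make the root / instance split concrete, here is a small illustrative sketch that takes a path and proposes the corresponding show-status invocation. It is not part of seqrepo: the function names are hypothetical, and it assumes instances are named by date (YYYY-MM-DD), as the snapshots in this thread are. Only the -r and -i options shown above are used.

```python
import posixpath
import re

# Assumption: instance directories are named by date, e.g. 2018-11-26.
_INSTANCE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def split_root_and_instance(path):
    """Split a path into (root, instance).

    If the last path component looks like a dated instance name, return
    its parent as the root and the component as the instance. Otherwise
    assume the path is already a root and return (path, None).
    """
    path = path.rstrip("/") or "/"
    parent, last = posixpath.split(path)
    if _INSTANCE_RE.match(last):
        return parent or ".", last
    return path, None


def show_status_command(path):
    """Build the show-status command line for a root or instance path."""
    root, instance = split_root_and_instance(path)
    if instance is None:
        # No instance given; seqrepo then looks for "latest" under the root.
        return "seqrepo -r {} show-status".format(root)
    return "seqrepo -r {} show-status -i {}".format(root, instance)


# show_status_command("/tmp/seqrepo/2018-11-26")
# builds: seqrepo -r /tmp/seqrepo show-status -i 2018-11-26
```

The point of the sketch is only that the last path component belongs after -i, not in the -r argument.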