kr-colab / popvae
genotype dimensionality reduction with a VAE
License: Other
I followed the installation instructions exactly (including the new conda activate popvae step) and then ran popvae on the test data with:
popvae.py --infile data/pabu/pabu_test_genotypes.vcf --out out/pabu_test --seed 42
This produced the following output and error message:
Using TensorFlow backend.
WARNING:tensorflow:From /home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/popvae-0.1-py3.7.egg/EGG-INFO/scripts/popvae.py:126: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
loading genotypes
[read_vcf] 7386 rows in 0.07s; chunk in 0.07s (113138 rows/s); 1685:69
[read_vcf] all done (113048 rows/s)
counting alleles
dropping non-biallelic sites
dropping singletons
filling missing data with rbinom(2,derived_allele_frequency)
100%|█████████████████████████████████████████████████████████████| 95/95 [00:00<00:00, 8151.78it/s]
running train/test splits
['validation samples:ARK_CAH156' 'validation samples:GUA_DHB4507'
'validation samples:KNS_SAR7995' 'validation samples:LOU_CAH140'
'validation samples:OAX_DHB5582' 'validation samples:SCL_PB65556'
'validation samples:SIN_CSW7732' 'validation samples:TEX_CAH087'
'validation samples:TEX_CAH149' 'validation samples:YUC_BRB875']
running on 3887 SNPs
WARNING:tensorflow:From /home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/tensorflow-1.15.2-py3.7-linux-x86_64.egg/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/tensorflow-1.15.2-py3.7-linux-x86_64.egg/tensorflow_core/python/ops/nn_impl.py:183: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/training_utils.py:819: UserWarning: Output decoder missing from loss dictionary. We assume this was done on purpose. The fit and evaluate APIs will not be expecting any data to be passed to decoder.
'be expecting any data to be passed to {0}.'.format(name))
2020-10-21 10:25:19.612072: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-10-21 10:25:19.649267: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2195055000 Hz
2020-10-21 10:25:19.652350: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5570827b5d90 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-21 10:25:19.652387: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
Train on 85 samples, validate on 10 samples
Epoch 1/500
85/85 [==============================] - 1s 8ms/step - loss: 2664.8421 - val_loss: 2437.6572
Epoch 00001: val_loss improved from inf to 2437.65723, saving model to out/pabu_test_weights.hdf5
Traceback (most recent call last):
File "/home/eric/miniconda3/envs/popvae/bin/popvae.py", line 4, in <module>
__import__('pkg_resources').run_script('popvae==0.1', 'popvae.py')
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/pkg_resources/__init__.py", line 650, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1446, in run_script
exec(code, namespace, namespace)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/popvae-0.1-py3.7.egg/EGG-INFO/scripts/popvae.py", line 415, in <module>
batch_size=batch_size)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/training.py", line 1239, in fit
validation_freq=validation_freq)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/training_arrays.py", line 216, in fit_loop
callbacks.on_epoch_end(epoch, epoch_logs)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/callbacks/callbacks.py", line 152, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/callbacks/callbacks.py", line 719, in on_epoch_end
self.model.save(filepath, overwrite=True)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/network.py", line 1152, in save
save_model(self, filepath, overwrite, include_optimizer)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/saving.py", line 449, in save_wrapper
save_function(obj, filepath, overwrite, *args, **kwargs)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/saving.py", line 541, in save_model
_serialize_model(model, h5dict, include_optimizer)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/engine/saving.py", line 161, in _serialize_model
layer_group[name] = val
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/Keras-2.3.1-py3.7.egg/keras/utils/io_utils.py", line 233, in __setitem__
dataset = self.data.create_dataset(attr, val.shape, dtype=val.dtype)
File "/home/eric/miniconda3/envs/popvae/lib/python3.7/site-packages/h5py-3.0.0rc1-py3.7-linux-x86_64.egg/h5py/_hl/group.py", line 143, in create_dataset
if '/' in name:
TypeError: a bytes-like object is required, not 'str'
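A note on the last frame of the traceback: the check `if '/' in name:` raises exactly this TypeError when `name` is a bytes object rather than a str. The sketch below reproduces that behaviour in plain Python. It assumes (the log does not confirm this) that Keras 2.3.1 hands h5py 3.0.0rc1 a bytes dataset name. The name `dense_1/kernel:0` and the helpers `name_check` and `h5py_major` are made up for illustration. If that is the cause, trying an h5py release older than 3.0 would be one thing to test, but that is an assumption and not something confirmed in this thread.

```python
import re


def name_check(name):
    """Mimic the membership test shown at h5py/_hl/group.py line 143."""
    return '/' in name


def h5py_major(version):
    """Return the leading integer of a version string such as '3.0.0rc1'."""
    match = re.match(r"(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1))


# A str name passes the check without error.
print(name_check("dense_1/kernel:0"))

# A bytes name raises the same TypeError as in the traceback.
try:
    name_check(b"dense_1/kernel:0")
except TypeError as err:
    print("TypeError:", err)

# The version in the log parses to major version 3.
print(h5py_major("3.0.0rc1") >= 3)
```

The version helper is only there to make it easy to flag an environment that picked up an h5py 3.x build, as the egg path in the log suggests happened here.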
Dear all, thank you very much for this tool. Unfortunately, I am running into many problems installing it in a conda env with python=3.7.7. The latest one concerns the Python version: it says that python=3.8 is required.
I was wondering whether you pushed a new version and whether other requirements are needed.
Thank you very much in advance.
Alessandro
Hi @cjbattey
FYI,
When I first installed popvae, I had issues with astor/code_gen.py because async is a keyword in Python 3.7.
I tried to fix it by changing async to is_async in code_gen.py.
Popvae seems to work now (with the example dataset in the repo).
Can you take a look and confirm what I noticed, or was I using the wrong version of astor (I did everything according to the instructions)?
Thanks,
Best regards,
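For anyone hitting the same thing, here is a minimal sketch of why the rename helps. Since Python 3.7, `async` is a reserved keyword, so it can no longer be used as a parameter or variable name. The function name `visit` below is hypothetical and is not taken from astor's code_gen.py.

```python
import keyword


def compiles(source):
    """Return True if the source string compiles, False on a SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


# On Python 3.7 and later, 'async' is reported as a keyword.
print(keyword.iskeyword("async"))

# Using it as a parameter name no longer compiles...
print(compiles("def visit(node, async=False): pass"))

# ...while the renamed parameter compiles fine.
print(compiles("def visit(node, is_async=False): pass"))
```

On Python 3.7 or later this prints True, False, True. Whether a newer astor release already carries an equivalent rename is not something this thread establishes.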
Hi there, I was looking through your codebase and paper, and after your mention of the data being too large to load into memory, I was wondering whether you had heard of or tried Keras DataGenerators. They stream data on the fly, and you can customize them quite a bit.
I have some custom generators built for other genetics deep learning networks I'm working on. Would you mind if I submitted a pull request in the next few days, after testing whether a generator works? I might also functionalize the script while I'm at it so it is more modular for the actual generation process.
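To make the idea concrete, here is a rough sketch of the batching pattern, written in plain Python so it runs without Keras. It is not popvae code. The class `GenotypeBatches` and the `load_rows` callback are hypothetical names. In a real version the class would presumably subclass `keras.utils.Sequence` and return NumPy arrays, and whether each batch also needs a target depends on how the model's loss is defined.

```python
import math


class GenotypeBatches:
    """Serve fixed-size batches of rows, loading each batch only on request."""

    def __init__(self, n_samples, batch_size, load_rows):
        if n_samples <= 0 or batch_size <= 0:
            raise ValueError("n_samples and batch_size must be positive")
        self.n_samples = n_samples
        self.batch_size = batch_size
        # load_rows(start, stop) is a hypothetical callback that reads
        # rows start..stop-1 from disk (or any other lazy source).
        self.load_rows = load_rows

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(self.n_samples / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        stop = min(start + self.batch_size, self.n_samples)
        return self.load_rows(start, stop)


# Toy stand-in for a genotype matrix: 10 samples by 4 sites, values 0/1/2.
genotypes = [[(i + j) % 3 for j in range(4)] for i in range(10)]


def load_rows(start, stop):
    # In practice this would read from a file instead of slicing a list.
    return genotypes[start:stop]


batches = GenotypeBatches(len(genotypes), batch_size=4, load_rows=load_rows)
for batch in batches:
    print(len(batch))
```

Only one batch is held in memory at a time, which is the point of the approach when the full genotype matrix does not fit.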