
Model zoo for genomics

Home Page: http://kipoi.org

License: MIT License



Kipoi: Model zoo for genomics


This repository implements a Python package and a command-line interface (CLI) to access and use models from Kipoi-compatible model zoos.

Links

Installation

Kipoi requires conda to manage model dependencies. Make sure you have either anaconda (download page) or miniconda (download page) installed. If you are using OSX, see Installing python on OSX. Maintained Python versions: >=3.6, <=3.10.

Install Kipoi using pip:

pip install kipoi

Known issue: h5py

On systems using Python 3.6 or 3.7, pretrained Kipoi models of type kipoi.model.KerasModel and kipoi.model.TensorflowModel that were saved with h5py < 3 are incompatible with h5py >= 3. Please downgrade h5py after installing kipoi:

pip install h5py==2.10.0

This is not a problem on systems using Python >=3.8, <=3.10. More information is available here.

On systems using Python >=3.8, <=3.10, it is necessary to install hdf5 and pkgconfig prior to installing kipoi:

conda install --yes -c conda-forge hdf5 pkgconfig

Quick start

Explore available models on https://kipoi.org/groups/. Use-case oriented tutorials are available at https://github.com/kipoi/examples.

Installing all required model dependencies

Use kipoi env create <model> to create a new conda environment for the model. You can use the following commands to create shared environments suitable for multiple models:

kipoi env create shared/envs/kipoi-py3-keras2-tf1
kipoi env create shared/envs/kipoi-py3-keras2-tf2
kipoi env create shared/envs/kipoi-py3-keras1.2

Before using a model in any way, activate the right conda environment:

source activate $(kipoi env get <model>)

Using pre-made containers

Alternatively, you can use the Singularity or Docker containers with all dependencies pre-installed. Singularity containers can be used seamlessly from the CLI by adding the --singularity flag to kipoi predict commands; for an example, see the singularity tab under this. You can also use the Docker containers directly; for more information, see the docker tab on any model page on kipoi.org, such as this.

We currently offer two types of Docker images. The full-sized version (under the banner Get the full sized docker image) comes with conda pre-installed along with model (group) specific dependencies; use it if you plan to experiment with conda functionality. The slim version (under the banner Get the docker image) also comes with all dependencies installed, but does not ship a working conda package manager; use it if you plan to run Kipoi-related tasks only.

A note about installing singularity

Singularity has been renamed to Apptainer. However, it is also possible to use SingularityCE from Sylabs. Current versions of kipoi containers are compatible with the latest version of Apptainer (1.0.2) and SingularityCE 3.9. Install Apptainer from here or SingularityCE from here.

Python

Before using a model from Python in any way, activate the right conda environment:

source activate $(kipoi env get <model>)

import kipoi

kipoi.list_models() # list available models

model = kipoi.get_model("Basset") # load the model

model = kipoi.get_model(  # load the model from a past commit
    "https://github.com/kipoi/models/tree/<commit>/<model>",
    source='github-permalink'
)

# main attributes
model.model # wrapped model (say keras.models.Model)
model.default_dataloader # dataloader
model.info # description, authors, paper link, ...

# main methods
model.predict_on_batch(x) # implemented by all the models regardless of the framework
model.pipeline.predict(dict(fasta_file="hg19.fa",
                            intervals_file="intervals.bed"))
# runs: raw files -[dataloader]-> numpy arrays -[model]-> predictions 
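The raw files → dataloader → model flow above can be sketched with toy stand-ins (dataloader and predict_on_batch here are illustrative placeholders, not the actual kipoi API):

```python
import numpy as np

def dataloader(batch_size=2, n_batches=3):
    """Toy stand-in for a Kipoi dataloader: yields dict batches of arrays."""
    for _ in range(n_batches):
        yield {"inputs": np.random.rand(batch_size, 4)}

def predict_on_batch(x):
    """Toy stand-in for model.predict_on_batch: one score per example."""
    return x.sum(axis=1)

# The pipeline simply chains the two, batch by batch.
predictions = np.concatenate(
    [predict_on_batch(batch["inputs"]) for batch in dataloader()]
)
print(predictions.shape)  # (6,) - 3 batches of 2 examples
```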

For more information see: notebooks/python-api.ipynb and docs/using/python

Command-line

$ kipoi
usage: kipoi <command> [-h] ...

    # Kipoi model-zoo command line tool. Available sub-commands:
    # - using models:
    ls               List all the available models
    list_plugins     List all the available plugins
    info             Print dataloader keyword argument info
    get-example      Download example files
    predict          Run the model prediction
    pull             Download the directory associated with the model
    preproc          Run the dataloader and save the results to an hdf5 array
    env              Tools for managing Kipoi conda environments

    # - contributing models:
    init             Initialize a new Kipoi model
    test             Runs a set of unit-tests for the model
    test-source      Runs a set of unit-tests for many/all models in a source
    
    # - plugin commands:
    interpret        Model interpretation using feature importance scores like ISM, grad*input or DeepLIFT

# Run model predictions and save the results
# sequentially into an HDF5 file
kipoi predict <Model> --dataloader_args='{
  "intervals_file": "intervals.bed",
  "fasta_file": "hg38.fa"}' \
  --singularity \
  -o '<Model>.preds.h5'

Explore the CLI usage by running kipoi <command> -h. Also, see docs/using/cli/ for more information.
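The JSON string passed to --dataloader_args maps directly onto the dataloader's keyword arguments; in Python terms, the CLI does roughly the following (a minimal sketch, not the actual implementation):

```python
import json

# Same JSON object as in the kipoi predict example above.
raw = '{"intervals_file": "intervals.bed", "fasta_file": "hg38.fa"}'
kwargs = json.loads(raw)      # keys become the dataloader's kwargs
print(kwargs["fasta_file"])   # hg38.fa
```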

Configure Kipoi in .kipoi/config.yaml

You can add your own (private) model sources. See docs/using/03_Model_sources/.

Contributing models

See docs/contributing getting started and docs/tutorials/contributing/models for more information.

Plugins

Kipoi supports plug-ins, which are published as additional Python packages. The currently available plug-in is:

Model interpretation plugin for Kipoi. It allows using feature importance scores such as in-silico mutagenesis (ISM), saliency maps, or DeepLIFT with a wide range of Kipoi models. example notebook

pip install kipoi_interpret
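As a toy illustration of the ISM idea (this is not the kipoi_interpret API): score every single-base substitution and record the change relative to the reference sequence, with a made-up scoring function standing in for a model.

```python
import numpy as np

def toy_score(seq):
    """Illustrative stand-in for a model: counts 'GC' dinucleotides."""
    return sum(seq[i:i + 2] == "GC" for i in range(len(seq) - 1))

def ism(seq, score=toy_score, alphabet="ACGT"):
    """In-silico mutagenesis: score delta for every single-base substitution."""
    ref = score(seq)
    deltas = np.zeros((len(seq), len(alphabet)))
    for i in range(len(seq)):
        for j, base in enumerate(alphabet):
            mutant = seq[:i] + base + seq[i + 1:]
            deltas[i, j] = score(mutant) - ref
    return deltas

deltas = ism("ATGCA")
print(deltas.shape)  # (5, 4): positions x alphabet
print(deltas[2, 0])  # -1.0: mutating the G destroys the single GC site
```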

Variant effect prediction with a subset of Kipoi models

Variant effect prediction annotates a VCF file with model predictions for the reference and alternative alleles. The output is written to a new TSV file. For more information see https://github.com/kipoi/kipoi-veff2.

Documentation

Tutorials

Citing Kipoi

If you use Kipoi for your research, please cite the publication of the model you are using (see model's cite_as entry) and the paper describing Kipoi: https://doi.org/10.1038/s41587-019-0140-0.

@article{kipoi,
  title={The Kipoi repository accelerates community exchange and reuse of predictive models for genomics},
  author={Avsec, Ziga and Kreuzhuber, Roman and Israeli, Johnny and Xu, Nancy and Cheng, Jun and Shrikumar, Avanti and Banerjee, Abhimanyu and Kim, Daniel S and Beier, Thorsten and Urban, Lara and others},
  journal={Nature biotechnology},
  pages={1},
  year={2019},
  publisher={Nature Publishing Group}
}

Development

If you want to help with the development of Kipoi, you are more than welcome to join in!

For the local development setup, you should install all required dependencies using one of the provided dev-requirements(-py<36|37>).yml files.

For systems using python 3.6/3.7:

conda env create -f dev-requirements-py36.yml --experimental-solver=libmamba
# or
conda env create -f dev-requirements-py37.yml --experimental-solver=libmamba
conda activate kipoi-dev
pip install -e .
git lfs install

For systems using python >=3.8<=3.10:

conda create --name kipoi-dev python=3.8  # or 3.9, 3.10
conda activate kipoi-dev
conda env update --name kipoi-dev --file dev-requirements.yml --experimental-solver=libmamba 
pip install -e .
conda install -c bioconda cyvcf2 pybigwig
git lfs install    

A note about cyvcf2 and pybigwig

For Python >= 3.10, cyvcf2 and pybigwig are not yet available in conda. Install them from source (here and here) instead. We recommend against installing them with pip, as it may lead to unexpected inconsistencies.

You can test the package by running py.test.

If you wish to run tests in parallel, run py.test -n 6.

License

Kipoi is MIT-style licensed, as found in the LICENSE file.

Contributors

agitter, avsecz, banwang27, bernardo-de-almeida, bytewife, cbravo93, derthorsten, haimasree, hoeze, hy395, itaskiran, jacklanchantin, jeffmylife, jisraeli, karollus, katrinleinweber, krrome, laraurban, mheinzinger, mlweilert, muhammedhasan, okurman, s6juncheng, stefanches7, suragnair, twrightsman, vagarwal87, vervacity, xnancy


Issues

Missing maxentpy.maxent dependency for MaxEnt/3prime model

kipoi$get_model("MaxEntScan/3prime")

Error in py_call_impl(callable, dots$args, dots$keywords): ImportError: No module named maxentpy.maxent

Detailed traceback: 
  File "/users/annashch/kipoi/kipoi/model.py", line 77, in get_model
    Mod = load_model_custom(**md.args)
  File "/users/annashch/kipoi/kipoi/model.py", line 129, in load_model_custom
    return getattr(load_module(file), object)
  File "/users/annashch/kipoi/kipoi/utils.py", line 37, in load_module
    module = imp.load_source(module_name, path)
  File "model.py", line 12, in <module>
  File "/users/annashch/.kipoi/models/MaxEntScan/3prime/../template/model_template.py", line 3, in <module>
    from maxentpy.maxent import score5, score3, load_matrix5, load_matrix3

Traceback:

1. kipoi$get_model("MaxEntScan/3prime")
2. py_call_impl(callable, dots$args, dots$keywords)

(I don't have permissions to assign issues to people, but I believe this model was contributed by @s6juncheng )

TODO - add labels for issues

Issue types

  • Model request
    • review of the paper was done
  • Issue with the model/dataloader (functionality)
  • Issue with model metadata/description

Model request processing stages

  • Model definition request
    • provides: Paper and Github link
    • asks for: integration request
  • Integration request (Paper review performed)
    • provides: brief information about the model with pointers
    • asks for: pull request
  • Pull request
    • provides: pull request
    • asks for: review and merge

Add the TF name to DeepBind models

  • change the naming schema to: DeepBind/<Species>/<Type>/<ID>_<Experiment>_<Protein>
  • add the metadata information to the description
# The index has the following fields:
#
#   ID: A unique id of the form 01234.567 that identifies a model 
#       01234 is unique for each combination of (Protein, Species), and 567 is unique 
#       for each combination of (Experiment, Experiment Details, Model ID),
#       In other words, 01234 identifies the protein 'version', whereas 567 identifies 
#       the model 'version' for that protein version.
#   Protein: The protein name, e.g. RBFOX1.
#   Type: Indicates whether the protein is considered an RBP, a TF, etc.
#   Species: The species name, e.g. Homo sapiens.
#   Family: The protein family, e.g. Homeodomain.
#   Class: The protein structural class, e.g. Zinc-coordinating.
#   Experiment: The technology used to measure the training data, e.g. ChIP-seq, HT-SELEX.
#   Experiment Details: A list of strings that describe how the experiment was configured,
#                       e.g. ["Cell Line=K562","Lab=Stanford"]
#   Model ID: A number/string that can be used to identify this model as distinct from 
#             other models for the same (Protein, Species, Experiment, Experiment Details),
#             e.g. if a new, better model was trained and therefore needs a new version ID.
#   Comments: 
#
#
ID	Protein	Type	Species	Family	Class	Experiment	Experiment Details	Model	Cite	Labels	Path	Comment
D00001.001	Cebpb	TF	Mus musculus	bZIP		PBM	['DREAM5ID=C_1', 'Array=HK']	deepbind 0.1	PMID 23354101		E:/Results/deepbind/dream5/final/C_1	
D00002.001	Egr2	TF	Mus musculus	C2H2 ZF		PBM	['DREAM5ID=C_2', 'Array=HK']	deepbind 0.1	PMID 23354101		E:/Results/deepbind/dream5/final/C_2	
D00003.001	Esr1	TF	Mus musculus	Nuclear receptor		PBM	['DREAM5ID=C_3', 'Array=HK']	deepbind 0.1	PMID 23354101		E:/Results/deepbind/dream5/final/C_3	
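The renaming proposed above can be sketched as a small function that maps one tab-separated index row onto the new schema DeepBind/<Species>/<Type>/<ID>_<Experiment>_<Protein> (replacing spaces in species names with underscores is an assumption made here for path safety, not something specified in the issue):

```python
def deepbind_model_name(row):
    """Build the proposed model path from one tab-separated index row.
    Column order follows the index header above:
    ID, Protein, Type, Species, Family, Class, Experiment, ..."""
    fields = row.split("\t")
    model_id, protein, kind, species, experiment = (
        fields[0], fields[1], fields[2], fields[3], fields[6])
    species = species.replace(" ", "_")  # assumption: path-safe species names
    return f"DeepBind/{species}/{kind}/{model_id}_{experiment}_{protein}"

row = "D00001.001\tCebpb\tTF\tMus musculus\tbZIP\t\tPBM"
print(deepbind_model_name(row))
# DeepBind/Mus_musculus/TF/D00001.001_PBM_Cebpb
```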

For models that require fixed width offer resize functionality

For models like DeepSEA, the input has to be exactly 1000 nt. It would be nice to have a flag resize=True which would resize the bed intervals accordingly.

  • in case resize=True is not used and the ranges don't match the width, display a verbose error message
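One way the proposed resize=True flag could behave is midpoint-centered resizing with clipping at the chromosome start (a minimal sketch of the idea; it does not handle clipping at the chromosome end):

```python
def resize_interval(chrom, start, end, width):
    """Resize a bed interval to a fixed width around its midpoint,
    clipping at the chromosome start (position 0)."""
    center = (start + end) // 2
    new_start = max(0, center - width // 2)
    return chrom, new_start, new_start + width

print(resize_interval("chr1", 5000, 5200, 1000))  # ('chr1', 4600, 5600)
print(resize_interval("chr1", 100, 300, 1000))    # ('chr1', 0, 1000), clipped
```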

DeepBind models description

The DeepBind models in the zoo have rather uninterpretable tags. They are listed under opaque IDs like D00001.001, but there is no easy way to search for, or figure out, which TF or feature a model represents. Is there a mapping file between these IDs and what the models actually are? It isn't obvious where to find one.

Also, some relevant metadata in the 'Tags' column of the main models page is missing for the DeepBind model group, and there is no entry in the training procedure field for any of the DeepBind models.

Join the team (post here to be added to the kipoi team)

Everybody is welcome to contribute to Kipoi. Simply reply to this issue and you will be added to the Kipoi team. That way, you contribute models via git branches and don't have to fork the repo.

Edit: After you post here, we will email you an invitation through github to join the Kipoi team. Click the link in that invitation in order to be added.

Regarding the output of the example for Basenji

So I've successfully run Basenji on the example_files/ but still have some comments/questions before I do this on my own data.

First, the output file Basenji.example_pred.tsv has some complications.
It cannot easily be "headed" or looped through (screenshot omitted). After transposing the data (screenshot omitted), it is much easier to work with.
Additionally, the values for every interval supplied from intervals.bed are equal. It seems redundant to have these chromosomal regions as examples if they predict the same thing. Is there any reason for this?

Second, notice that the data shown above is ambiguous: "pred/n/m" doesn't tell me much. I now know that 'n' ranges over [0, 960] and 'm' over [0, 4229];
m * n = 4059840, or the number of output values of the model.
I see that m corresponds to sequencing reads from the various experiments that the model predicts. However, the mapping from the number (0 through 4229) in the file to the identifier in Supplementary Table 1 is not clear, since there is no numeral ID; I've presumed 0 corresponds to the first data point, though.

Most importantly, what I do not understand is the meaning of 'n'. The corresponding article has no reference to the number 960. Given that the model's goal is to predict experimental data for regions of the genome, I presume that 'n' corresponds to binned locations of the input sequence.
However, the bin size used is supposed to be 128bp (2^7) and the input sequence length is 131,072bp (2^17).
131,072bp / 128bp = 1,024 bins per input sequence
This implies that 'n' (the numbers 0 to 960) is not the bins because it would mean that
131,072bp / 960 bins = ~136.5 bp per bin
which I don't think is allowed.
So what is the meaning of the output in Basenji.example_pred.tsv ?

Any guidance would be helpful. I am aware that this question might better be directed at the Basenji github, but I thought I'd try here first.
Thanks
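The arithmetic in the question can be made explicit. 1024 is indeed the naive bin count; one possible explanation for 960 outputs (an assumption here, not confirmed in this thread) is that the model crops bins at the sequence edges, where predictions suffer from boundary effects:

```python
seq_len = 131_072   # input length, 2**17
bin_size = 128      # bin width, 2**7
total_bins = seq_len // bin_size          # 1024 bins per input sequence

# Assumption (not confirmed in this thread): the model drops edge bins
# before emitting predictions, keeping only the central ones.
output_bins = 960
cropped = total_bins - output_bins        # bins dropped in total
print(total_bins, cropped, cropped // 2)  # 1024 64 32 (32 per side)
```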

Scope of models in Kipoi

It would be good to define a scope and a minimal set of standards for the contributed models

Minimal set of standards

  • The model has to be described in detail somewhere (arxiv/bioarxiv/journal paper or maybe even good blogpost). It should be clear on what data it has been trained on and what the performance on the held-out validation set is.

  • all the kipoi tests should pass

Scope

  • models from the field of (regulatory) genomics?

DeepBind not functional

As in title, perhaps there is something wrong with the weights file? Any idea?

kipoi test -i DeepBind/D00001.001 --source kipoi
...
INFO [kipoi.remote] git-lfs pull -I DeepBind/D00001.001/**
INFO [kipoi.remote] git-lfs pull -I DeepBind/template/**
INFO [kipoi.remote] model DeepBind/D00001.001 loaded
INFO [kipoi.remote] git-lfs pull -I DeepBind/D00001.001/./**
INFO [kipoi.remote] git-lfs pull -I DeepBind/template/**
INFO [kipoi.remote] dataloader DeepBind/D00001.001/. loaded
INFO [kipoi.data] successfully loaded the dataloader from .../.kipoi/models/DeepBind/D00001.001/dataloader.py::SeqDataset
.../conda-envs/kipoi/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Traceback (most recent call last):
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 230, in func_load
    code = marshal.loads(raw_code)
ValueError: bad marshal data (unknown type code)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../conda-envs/kipoi/bin/kipoi", line 11, in <module>
    load_entry_point('kipoi', 'console_scripts', 'kipoi')()
  File "...kipoi/__main__.py", line 74, in main
    command_fn(args.command, sys.argv[2:])
  File "...kipoi/cli/main.py", line 53, in cli_test
    mh = kipoi.get_model(args.model, args.source)
  File "...kipoi/model.py", line 103, in get_model
    mod = AVAILABLE_MODELS[md.type](**md.args)
  File "...kipoi/model.py", line 223, in __init__
    self.model = load_model(weights, custom_objects=self.custom_objects)
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/models.py", line 243, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/models.py", line 317, in model_from_config
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 144, in deserialize_keras_object
    list(custom_objects.items())))
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/engine/topology.py", line 2510, in from_config
    process_layer(layer_data)
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/engine/topology.py", line 2496, in process_layer
    custom_objects=custom_objects)
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 144, in deserialize_keras_object
    list(custom_objects.items())))
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/layers/core.py", line 711, in from_config
    function = func_load(config['function'], globs=globs)
  File ".../conda-envs/kipoi/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 234, in func_load
    code = marshal.loads(raw_code)
ValueError: bad marshal data (unknown type code)

Fix occasional nightly tests fails due to leftover keras config file

Currently, models are tested in random order. Whichever Keras model is tested first leaves its Keras config file behind. Hence, if that config file selects Theano, all subsequent Keras models will try to run on Theano.

Possible solutions:

  • 'cleanup' the keras config file after every test (in case it wasn't present there before)
  • sort the models to be tested alphabetically (hacky)
  • set backend: tensorflow for all keras models (need to merge and release kipoi/kipoi#210 before that)
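The first option above (clean up the Keras config file after every test unless it was present before) can be sketched as a context manager; a temp file stands in for ~/.keras/keras.json so the demo touches no real config:

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def cleanup_config(path):
    """Remove `path` on exit unless it already existed before entering."""
    existed = os.path.exists(path)
    try:
        yield
    finally:
        if not existed and os.path.exists(path):
            os.remove(path)

# Stand-in for ~/.keras/keras.json.
cfg = os.path.join(tempfile.mkdtemp(), "keras.json")
with cleanup_config(cfg):
    with open(cfg, "w") as f:
        f.write('{"backend": "theano"}')  # a test run writes the config...
print(os.path.exists(cfg))  # False: the leftover file was removed
```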

Models file structure

  • shall the folder structure be: __model? (not concerned with the implementation)
  • how to handle different versions?
    • /model_name/version/model

ValueError in Basenji model

Hi all. First, thank you for developing kipoi! Great idea, practical and highly needed.

As a test I tried to use the Basenji model. After creating the environment I used the test command, which resulted in a TensorFlow error. It seems that the inputs are not defined correctly:

(kipoi-Basenji)[user@secretServer]$ kipoi test Basenji --source=kipoi

INFO [kipoi.remote] Update /fast/users/schubacm_c/.kipoi/models/
Already up-to-date.
INFO [kipoi.remote] git-lfs pull -I Basenji/**
INFO [kipoi.remote] model Basenji loaded
INFO [kipoi.remote] git-lfs pull -I Basenji/./**
INFO [kipoi.remote] dataloader Basenji/. loaded
INFO [kipoi.data] successfully loaded the dataloader from /fast/users/schubacm_c/.kipoi/models/Basenji/dataloader.py::SeqDataset
/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
INFO [kipoi.pipeline] dataloader.output_schema is compatible with model.schema
INFO [kipoi.pipeline] Initialized data generator. Running batches...
INFO [kipoi.pipeline] Returned data schema correct
Traceback (most recent call last):
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/bin/kipoi", line 11, in <module>
    sys.exit(main())
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/kipoi/__main__.py", line 74, in main
    command_fn(args.command, sys.argv[2:])
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/kipoi/cli/main.py", line 62, in cli_test
    mh.pipeline.predict_example(batch_size=args.batch_size)
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/kipoi/pipeline.py", line 82, in predict_example
    pred_list.append(self.model.predict_on_batch(batch['inputs']))
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/kipoi/model.py", line 529, in predict_on_batch
    feed_dict=merge_dicts(feed_dict, self.const_feed_dict))
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/fast/users/schubacm_c/miniconda3/envs/kipoi-Basenji/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1113, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (10, 131072, 4) for Tensor 'inputs:0', which has shape '(2, 131072, 4)'

I guess that the inputs are defined here

shape: (131072, 4)

and here

shape: (131072, 4)

In dataloader.py you already wrote that Basenji strictly requires a batch size of 2:

if len(self.bt) % 2 == 1:
    raise ValueError("Basenji strictly requires batch_size=2," +
                     " hence the bed file should have an even length")

But this seems to be ignored by the kipoi test command.

Cheers,
Max

A ValueError when attempting to run Basenji

When I attempt to run the Basenji model as in the example, this shows up:

pred = model.pipeline.predict_example()
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/anaconda3/lib/python3.6/site-packages/kipoi/pipeline.py", line 84, in predict_example
    pred_list.append(self.model.predict_on_batch(batch['inputs']))
  File "/anaconda3/lib/python3.6/site-packages/kipoi/model.py", line 1310, in predict_on_batch
    feed_dict=merge_dicts(feed_dict, self.const_feed_dict))
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1111, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (10, 131072, 4) for Tensor 'inputs:0', which has shape '(2, 131072, 4)'

This occurred again when attempting to run the following command:

(kipoi-Basenji) nsheng-mbp:Basenji jeffrey$ kipoi predict Basenji \
  --dataloader_args='{"intervals_file": "example_files/intervals.bed", "fasta_file": "example_files/hg38_chr22.fa"}' \
  -o '/tmp/Basenji.example_pred.tsv'


Not sure how to fix this one. Any help is appreciated.

Add Basenji

Repo: https://github.com/calico/basenji

Model

type: tensorflow
Get model weights:

wget https://storage.googleapis.com/131k/model.tf.index
wget https://storage.googleapis.com/131k/model.tf.meta
wget https://storage.googleapis.com/131k/model.tf.data-00000-of-00001

Dataloader

  • Simple fasta + bed dataloader as for DeepSEA

Dimensions

  • TODO

TODO's

  • try to run the model
  • write out the dimensions
  • add tensorflow to Kipoi
  • PR

Add DeepTarget

Model

type: Keras
Weights (obtained per email)

License: GPL3

Dataloader

  • simple seq + one-hot loader

TODO

  • test
  • submit a PR

pytorch-cpu package not found on Mac

Hello, I am trying to run the following command:

kipoi env create DeepSEA/variantEffects

And am getting the following error:

Solving environment: ...working... failed

ResolvePackageNotFound:

  • pytorch-cpu[version='>=0.2.0']

Traceback (most recent call last):
  File "/Users/ps14/anaconda3/bin/kipoi", line 11, in <module>
    sys.exit(main())
  File "/Users/ps14/anaconda3/lib/python3.6/site-packages/kipoi/__main__.py", line 74, in main
    command_fn(args.command, sys.argv[2:])
  File "/Users/ps14/anaconda3/lib/python3.6/site-packages/kipoi/cli/env.py", line 302, in cli_main
    command_fn(args.command, raw_args[1:])
  File "/Users/ps14/anaconda3/lib/python3.6/site-packages/kipoi/cli/env.py", line 226, in cli_create
    kipoi.conda.create_env_from_file(env_file)
  File "/Users/ps14/anaconda3/lib/python3.6/site-packages/kipoi/conda.py", line 78, in create_env_from_file
    return _call_conda(cmd_list, use_stdout=True)
  File "/Users/ps14/anaconda3/lib/python3.6/site-packages/kipoi/conda.py", line 162, in _call_conda
    return _call_command("conda", extra_args, use_stdout)
  File "/Users/ps14/anaconda3/lib/python3.6/site-packages/kipoi/conda.py", line 149, in _call_command
    raise subprocess.CalledProcessError(return_code, cmd_list)
subprocess.CalledProcessError: Command '['conda', 'env', 'create', '--file', '/tmp/kipoi/envfiles/94afb1e6/kipoi-DeepSEA__variantEffects.yaml']' returned non-zero exit status 1.

I have tried using conda to install pytorch-cpu, but it does not seem to be available from any channels.

I can install it just fine on linux, but it does not seem to be available on mac. Thank you for your help!!

Requirements for HAL model missing

According to install_model_requirements("HAL", "kipoi", and_dataloaders=True) everything is installed, but when trying to load the dataloader the error No module named 'gtf_utils' is raised. The requirements.txt only contains #TODO.

Uploaded models testing

  1. diff what has been updated (in the PR)
    • how to do this? Use travis cache?
  2. Run tests on the modified directory
    • setup test runtime limit
  3. Every now and then, run tests for all the models
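Step 1 above (diffing what was updated) can be sketched as mapping a list of changed file paths, such as git diff --name-only produces, onto the model directories to re-test (assuming one top-level directory per model group):

```python
from pathlib import PurePosixPath

def changed_models(changed_files):
    """Map changed file paths to the set of top-level model directories."""
    models = set()
    for f in changed_files:
        parts = PurePosixPath(f).parts
        if len(parts) > 1:        # ignore top-level files like README.md
            models.add(parts[0])
    return sorted(models)

print(changed_models([
    "DeepBind/template/model_template.py",
    "Basenji/dataloader.py",
    "README.md",
]))  # ['Basenji', 'DeepBind']
```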

Divergent430 cleanup

  • dataloader.py
    def __len__(self):
        return 1000

https://github.com/kipoi/models/blob/master/Divergent430/Multitask/dataloader.py#L31

Solution:

class SeqDataset(Dataset):
    """
    Args:
        intervals_file: bed3+1 file containing intervals+labels
        fasta_file: file path; Genome sequence
    """

    def __init__(self, intervals_file, fasta_file):
        self.bt = BedToolLinecache(intervals_file)
        self.fasta_extractor = FastaExtractor(fasta_file)

    def __len__(self):
        return len(self.bt)
  • dataloader.yaml

The output shape for Divergent430/Multitask is only 1.

https://github.com/kipoi/models/blob/master/Divergent430/Multitask/dataloader.yaml#L32

  • target_labels.txt

It has 164 rows instead of 430.

Should shape = 430 in https://github.com/kipoi/models/blob/master/Divergent430/Multitask/model.yaml#L29?

  • unnecessary files/directories
    • test_basset_model.py
    • test_files/

Tests not triggered on the PR

Problem: in the following PR: #37, the build didn't get triggered

  • try:
    • make a PR from your account and see if tests will get triggered
    • if not, use a different branch
