sameerkhurana10 / dsol_rv0.2
deep protein solubility prediction
License: MIT License
Hi! I am trying to use your DeepSol neural network to predict the solubility of the designs.
Unfortunately, I am stuck at the test run described in README.md. My command and the error are as follows:
(dsol) cltam@cltam-System-Product-Name:~/Desktop/deepsol_trial$ R --vanilla < /home/cltam/Desktop/DSOL_rv0.2-v0.3/sameerkhurana10-DSOL_rv0.2-20562ad/scripts/PaRSnIP.R seq1-8_DPBB_trimmed.fasta /home/cltam/SCRATCH-1D_1.1/bin/run_SCRATCH-1D_predictors.sh newtest 32
...
[SCRATCH-1D_predictions.pl] 1 protein sequence(s) found
[SCRATCH-1D_predictions.pl] generating sequence profiles...
chmod: cannot access '/home/cltam/SCRATCH-1D_1.1/tmp/20190602-010133-829841579409/dataset.pro': No such file or directory
[SCRATCH-1D_predictions.pl] failed generating sequence profiles...
[1] 1
Error in file(file, "r") : cannot open the connection
Calls: PaRSnIP ... PaRSnIP.calc.features.test -> as.vector -> read.fasta -> scan -> file
In addition: Warning message:
In file(file, "r") :
cannot open file '/tmp/RtmpQsM2Kr/tmp3a395e10ade.ss': No such file or directory
Execution halted
I googled similar error messages, but that didn't help.
Could you suggest how to deal with this error?
Thanks!
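The log shows SCRATCH-1D failing before PaRSnIP.R tries to read its output, so the R error is probably a downstream symptom. Below is a minimal sketch of a pre-flight check. It assumes the layout suggested by the paths above (a `bin/run_SCRATCH-1D_predictors.sh` launcher and a `tmp/` directory under the SCRATCH install root). The function name `check_scratch_install` is hypothetical and is not part of DeepSol or SCRATCH.

```python
import os


def check_scratch_install(scratch_root):
    """Return a list of problems found under scratch_root.

    Assumed layout (taken from the paths in the log, not from SCRATCH docs):
    bin/run_SCRATCH-1D_predictors.sh and a tmp/ working directory.
    An empty list only means these basic checks passed; it does not
    prove that SCRATCH itself runs correctly.
    """
    problems = []
    launcher = os.path.join(scratch_root, "bin", "run_SCRATCH-1D_predictors.sh")
    tmp_dir = os.path.join(scratch_root, "tmp")

    if not os.path.isfile(launcher):
        problems.append("launcher not found: " + launcher)
    elif not os.access(launcher, os.X_OK):
        problems.append("launcher is not executable: " + launcher)

    if not os.path.isdir(tmp_dir):
        problems.append("tmp directory not found: " + tmp_dir)
    elif not os.access(tmp_dir, os.W_OK | os.X_OK):
        problems.append("tmp directory is not writable: " + tmp_dir)

    return problems


if __name__ == "__main__":
    # Path copied from the command above; adjust to your own installation.
    for problem in check_scratch_install("/home/cltam/SCRATCH-1D_1.1"):
        print(problem)
```

If this reports nothing, the next thing worth checking is whether SCRATCH's own bundled test runs on its own, since a failure there points at the SCRATCH installation rather than at DeepSol.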
Predicted_Class P0 P1
1 0.496959 0.503788
0 0.641231 0.359205
1 0.483336 0.51587
Hi, thanks for this software. Could you help me interpret this result (shown above)? I guess P0 represents the probability of label 0. If so, how should I read the first column (1-0-1)? Does the order of 1-0-1 correspond to the order of the amino acid sequences in the input FASTA file?
Also, do you have any suggestions for the case where the input protein has many chains and each chain has the same amino acid sequence as the others?
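One thing that can be checked from the output alone: in the three rows shown, Predicted_Class equals the index of the larger of P0 and P1, which fits reading P0 and P1 as scores for class 0 and class 1. The sketch below parses the table and verifies that. It does not settle whether row order follows the FASTA order or which class means "soluble"; this thread confirms neither. The helper name `parse_predictions` is hypothetical.

```python
def parse_predictions(text):
    """Parse whitespace-separated rows of: Predicted_Class P0 P1."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        label, p0, p1 = line.split()
        rows.append((int(label), float(p0), float(p1)))
    return rows


# The output table exactly as posted above.
OUTPUT = """\
Predicted_Class P0 P1
1 0.496959 0.503788
0 0.641231 0.359205
1 0.483336 0.51587
"""

for index, (label, p0, p1) in enumerate(parse_predictions(OUTPUT), start=1):
    # In every row shown, the class is whichever of P0/P1 is larger.
    assert label == (1 if p1 > p0 else 0)
    print(f"row {index}: class {label}, P0={p0}, P1={p1}")
```

Note that P0 + P1 is close to, but not exactly, 1 in each row, so the two columns may not be strictly normalized probabilities.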
Hi,
Thanks a lot for the great work. I would like to reuse the data you provide to compare various techniques on the same task. While preparing the dataset I found a small inconsistency, and it would be great if you could briefly clarify it for me.
In your paper, you speak of 2001 test examples, which corresponds to the number of examples in test_src_bio, but test_src and test_tgt both contain only 1999 examples. It seems two negative examples are missing, as only 999 are available. Not a big deal, but depending on which examples are missing, the alignment between the biological data and the sequences provided will be skewed.
Thanks a lot for your clarification.
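A mismatch like this can be reproduced with a short count. The sketch below assumes each of the three files holds one example per line with no header row, which the thread does not confirm. The file names come from the report above; the actual paths and extensions in the repository may differ, and the helper names are hypothetical.

```python
from collections import Counter


def count_examples(path):
    """Count non-empty lines, assuming one example per line."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def label_counts(path):
    """Count each distinct label, assuming one label per line."""
    with open(path, encoding="utf-8") as handle:
        return Counter(line.strip() for line in handle if line.strip())


def report(src, tgt, bio):
    """Return per-file example counts and whether all three agree."""
    counts = {name: count_examples(name) for name in (src, tgt, bio)}
    return counts, len(set(counts.values())) == 1


if __name__ == "__main__":
    import os

    names = ("test_src", "test_tgt", "test_src_bio")
    if all(os.path.exists(name) for name in names):
        print(report(*names))
        print(label_counts("test_tgt"))
```

Counting alone shows that rows are missing but not which ones, so the sequence files and the biological-feature file still cannot be aligned row by row without more information from the authors.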
Can we use or enable GPUs to perform the 1st step?
Hi, I saw the link from your study, tried to use your code, and followed the instructions in the README file, but the output was not what I expected.
First, I tested the best model with the precompiled file, and the output of all 3 models is very low, around 52~55%. Second, I found that the deposited version in the link and the latest commit on master differ, especially in the test file. I'm not sure whether the poor results I got are caused by the different precompiled file. I also tried to recompile the sequence file, but SCRATCH took an unacceptably long time to run.
I'm curious how the authors completed the SCRATCH feature generation for all sequences (it takes so long on my platform that I couldn't wait for the program to finish). Also, could you please double-check the result of your precompiled file on your original platform with your latest commit?
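The thread does not say how the authors generated SCRATCH features for the full dataset. One general way to make a long run more manageable is to split the FASTA input into ordered batches, so that each batch can be processed separately and an interrupted run resumed. The sketch below assumes plain FASTA input; the function names and the output naming scheme are hypothetical.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def _write_batch(batch, out_prefix, index):
    """Write one batch to <out_prefix>_<index>.fasta and return the file name."""
    name = f"{out_prefix}_{index:04d}.fasta"
    with open(name, "w", encoding="utf-8") as handle:
        for header, sequence in batch:
            handle.write(f">{header}\n{sequence}\n")
    return name


def split_fasta(path, out_prefix, batch_size):
    """Split records into consecutive batches; return file names in input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    outputs, batch = [], []
    for record in read_fasta(path):
        batch.append(record)
        if len(batch) == batch_size:
            outputs.append(_write_batch(batch, out_prefix, len(outputs)))
            batch = []
    if batch:  # final, possibly smaller batch
        outputs.append(_write_batch(batch, out_prefix, len(outputs)))
    return outputs
```

Because the batches keep the input order, per-batch results can be concatenated in file-name order to line up with the original sequence file.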
Thanks a lot.
Dear all,
I followed all the installation instructions and successfully installed the program (and dependencies), but when I run the command:
R --vanilla < scripts/PaRSnIP.R data/Seq_solo.fasta <path-to-your-scratch-installation>/bin/run_SCRATCH-1D_predictors.sh new_test 32
(with the right directories specified)
this error occurs:
###################################
# #
# SCRATCH-1D release 1.1 (2015) #
# #
###################################
[SCRATCH-1D_predictions.pl] 1 protein sequence(s) found
[SCRATCH-1D_predictions.pl] generating sequence profiles...
chmod: cannot access '/my/path/DSOL_rv0.2/SCRATCH-1D_1.1/tmp/20240126-161014-720423000963/dataset.pro': No such file or directory
[SCRATCH-1D_predictions.pl] failed generating sequence profiles...
[1] 1
Error in file(file, "r") : cannot open the connection
Calls: PaRSnIP ... PaRSnIP.calc.features.test -> as.vector -> read.fasta -> scan -> file
In addition: Warning message:
In file(file, "r") :
cannot open file '/tmp/RtmpQlBOft/tmp560bd37c91444.ss': No such file or directory
Execution halted
Then I granted permissions on that folder and ran the command again, but the same error occurs in another tmp dir (a new dir is created in the "tmp" dir every time). In addition, this error also occurs when simply running the test from the README file provided in the SCRATCH-1D_1.1 folder.
Thank you for your attention
Matteo
Dear all,
I followed the instructions and got this error on my Mac running macOS Monterey 12.5:
conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- r-broom==0.4.2=r342h90c284a_0
- traitlets==4.3.2=py36h674d592_0
- webencodings==0.5.1=py36h800622e_1
- r-dbi==0.7=r342h835dbd3_0
- r-pryr==0.1.2=r342hf3bf84b_0
- r-survival==2.41_3=r342hac888f2_0
- r-uuid==0.1_2=r342h1393287_4
- numpy==1.12.1=py36_0
- gxx_linux-64==7.2.0=25
- r-htmltools==0.3.6=r342h4080c21_0
- r-repr==0.12.0=r342hde08b23_0
- libtiff==4.0.9=h28f6b97_0
- r-rcpproll==0.2.2=r342hdd0cc16_0
- libgcc-ng==7.2.0=h7cc24e2_2
- r-randomforest==4.6_12=r342h41e62a9_4
- r-foreign==0.8_69=r342hbd6b6bb_0
- r-ggplot2==2.2.1=r342h8b6cf70_0
- ptyprocess==0.5.2=py36h69acd42_0
- r-dplyr==0.7.4=r342hee63e26_0
- python==3.6.3=hc9025b9_1
- r-caret==6.0_77=r342h201f268_0
- r-labeling==0.3=r342h7c30189_4
- cudnn==5.1=0
- r-openssl==0.9.7=r342h1e3c16d_0
- r-r6==2.2.2=r342hcc750b5_0
- pcre==8.41=hc27e229_1
- pixman==0.34.0=hceecf20_3
- r-lubridate==1.6.0=r342h47739ec_0
- libpng==1.6.34=hb9fc6fc_0
- r-stringr==1.2.0=r342h701d218_0
- r-pkgconfig==2.0.1=r342h5d9b92e_0
- r-class==7.3_14=r342h4797347_4
- r-tidyr==0.7.1=r342h1bc61e7_0
- r-magrittr==1.5=r342h895a831_4
- libedit==3.1=heed3624_0
- r-forcats==0.2.0=r342h5fcb364_0
- r-irkernel==0.8.9=r342hfe3cb8f_0
- r-sfsmisc==1.1_1=r342hadb04b4_0
- curl==7.55.1=hcb0b314_2
- r-kernsmooth==2.23_15=r342hdc3efa4_4
- r-rlang==0.1.2=r342hc878738_0
- harfbuzz==1.7.4=hc5b324e_0
- libxml2==2.9.7=h26e45fe_0
- r-rvest==0.3.2=r342h8e81af5_0
- pandocfilters==1.4.2=py36ha6701b7_1
- r-tibble==1.3.4=r342h041fa31_0
- pango==1.41.0=hd475d92_0
- pickleshare==0.7.4=py36h63277f8_0
- r-scales==0.5.0=r342h876bdd3_0
- r-formatr==1.5=r342h0e06a18_0
- zeromq==4.2.2=hbedb6e5_2
- libsodium==1.0.15=hf101ebd_0
- pyzmq==16.0.3=py36he2533c7_0
- r-codetools==0.2_15=r342hc9ffb9b_0
- r-rmarkdown==1.6=r342hacd9e3e_2
- r-withr==2.0.0=r342hfa1897f_0
- r-lava==1.5.1=r342h598eca9_0
- r-tidyverse==1.1.1=r342hf9e2102_0
- r-hexbin==1.27.1=r342h72fd8d9_4
- r-crayon==1.3.4=r342h0ed458a_0
- r-httr==1.3.1=r342h7aba7de_0
- r-bh==1.65.0_1=r342h2d7c2ce_0
- r-mnormt==1.5_5=r342he1a489d_0
- r-maps==3.2.0=r342h79c810f_0
- gcc_linux-64==7.2.0=25
- bzip2==1.0.6=h6d464ef_2
- r-digest==0.6.12=r342hee14287_0
- r-shiny==1.0.5=r342hc0a01ce_0
- r-deoptimr==1.0_8=r342h996e1ad_0
- r-mass==7.3_47=r342hd8605c9_0
- r-mime==0.5=r342h01856a9_0
- r-timedate==3012.100=r342h48d2e88_4
- r-stringi==1.1.6=r342hf484d3e_0
- r-glmnet==2.0_13=r342h0aac7f1_0
- r-modelr==0.1.1=r342heb933ea_0
- r-recipes==0.1.0=r342h0924b81_0
- tk==8.6.7=hc745277_3
- ipython_genutils==0.2.0=py36hb52b0d5_0
- parso==0.1.1=py36h35f843b_0
- r-xml2==1.1.1=r342h3ff9a91_0
- sqlite==3.21.0=h1bed415_0
- r-data.table==1.10.4_1=r342h3908b10_0
- r-haven==1.1.0=r342h8909e7d_0
- r-catools==1.17.1=r342hf01772b_4
- r-plogr==0.1_1=r342h6d7f4c2_0
- r-httpuv==1.3.5=r342ha5ddd88_0
- r-ttr==0.23_2=r342hf82c04a_0
- r-viridislite==0.2.0=r342h3706391_0
- r-yaml==2.1.14=r342h72d8650_0
- gsl==2.2.1=h0c605f7_3
- r-jsonlite==1.5=r342hf92f79e_0
- r-xts==0.10_0=r342h17852d6_4
- r-knitr==1.17=r342h6500ef9_0
- jsonschema==2.6.0=py36h006f8b5_0
- r-cluster==2.0.6=r342hcb72a25_0
- freetype==2.8=hab7d2ae_1
- jpeg==9b=h024ee3a_2
- r-boot==1.3_20=r342ha5ac741_0
- r-dimred==0.1.0=r342hb158d3e_0
- r-reshape2==1.4.2=r342h2e254a0_0
- r-glue==1.1.1=r342h3154e12_0
- r-rematch==1.0.1=r342he3f91f1_0
- r-essentials==1.7.0=r342hf65ed6a_0
- r-quantmod==0.4_11=r342h9c9c021_0
- r-htmlwidgets==0.9=r342h7fcc9b6_0
- r-numderiv==2016.8_1=r342h12eb246_0
- binutils_impl_linux-64==2.28.1=h04c84fa_2
- r-evaluate==0.10.1=r342hb679cc2_0
- r-highr==0.6=r342h2351100_0
- pandoc==1.19.2.1=hea2e7c5_1
- r-drr==0.0.2=r342h0fe108c_0
- gxx_impl_linux-64==7.2.0=hd3faf3d_2
- hdf5==1.8.17=10
- r-hms==0.3=r342ha729a9b_0
- r-kernlab==0.9_25=r342hd770e69_0
- libstdcxx-ng==7.2.0=h7a57d05_2
- python-dateutil==2.6.1=py36h88d3b88_1
- r-psych==1.7.8=r342h1e2dc86_0
- pygments==2.2.0=py36h0d3125c_0
- graphite2==1.3.10=hf63cedd_1
- r-bindr==0.1=r342hdee8079_0
- r-markdown==0.8=r342h4cc6e3e_0
- r-selectr==0.3_1=r342hd6c1ff9_0
- r-tidyselect==0.2.2=r342h04f720c_0
- r-irdisplay==0.4.4=r342hc31a1b2_0
- icu==58.2=h9c2bf20_1
- r-sourcetools==0.1.6=r342h519fec0_0
- gcc_impl_linux-64==7.2.0=hc5ce805_2
- r-foreach==1.4.3=r342h49221b0_4
- r-lattice==0.20_35=r342h0f762c2_0
- r-rpart==4.1_11=r342hd35cc14_0
- ipython==6.2.1=py36h88c514a_1
- simplegeneric==0.8.1=py36h2cb9092_0
- r-cellranger==1.1.0=r342h51baf57_0
- gmp==6.1.2=h6c8ec71_1
- r-xtable==1.8_2=r342hb75f6e7_0
- r-mgcv==1.8_22=r342h176cc83_0
- r-bindrcpp==0.2=r342h4ea5b31_0
- r-zoo==1.8_0=r342h9faeba2_0
- r-rcolorbrewer==1.1_2=r342h0dda8fb_0
- r-backports==1.1.1=r342h9a9f1f2_0
- r-nlme==3.1_131=r342h7f704e8_0
- testpath==0.3.1=py36h8cadb63_0
- r-cvst==0.2_1=r342h830f301_0
- nbconvert==5.3.1=py36hb41ffb7_0
- r-gistr==0.4.0=r342h5cd5773_0
- krb5==1.14.2=hd3fe544_3
- binutils_linux-64==7.2.0=25
- r-gower==0.1.2=r342h54e18f5_0
- r-spatial==7.3_11=r342hff2a1d0_4
- libssh2==1.8.0=h8c220ad_2
- prompt_toolkit==1.0.15=py36h17d85b1_0
- entrypoints==0.2.3=py36h1aec115_2
- r-nnet==7.3_12=r342ha98c111_0
- r-rbokeh==0.5.0=r342h257354a_0
- xz==5.2.3=h55aa19d_2
- glib==2.53.6=h5d9569c_2
- r-bitops==1.0_6=r342hd891396_4
- jinja2==2.10=py36ha16c418_0
- r-colorspace==1.3_2=r342h81b277d_0
- ncurses==6.0=h9df7e31_2
- nbformat==4.4.0=py36h31c9010_0
- r-dichromat==2.0_0=r342h341c752_4
- r-iterators==1.0.8=r342h4caec00_4
- fontconfig==2.12.4=h88586e7_1
- r-readxl==1.0.0=r342h1e5739b_0
- libxcb==1.12=hcd93eb1_4
- pygpu==0.7.6=py36h3010b51_0
- r-base64enc==0.1_3=r342h7c929ec_4
- r-ipred==0.9_6=r342h7d58d5b_0
- r-robustbase==0.92_7=r342h6021a74_0
- jupyter_core==4.4.0=py36h7c827e3_0
- wcwidth==0.1.7=py36hdf4376a_0
- r-matrix==1.2_11=r342h3a55fe1_0
- r-munsell==0.4.3=r342h79883fb_0
- r-prodlim==1.6.1=r342hd95c883_0
- r-curl==3.0=r342hba591e3_0
- r-assertthat==0.2.0=r342h06193eb_0
- r-pbdzmq==0.2_6=r342h934a24f_0
- r-readr==1.1.1=r342hb25467c_0
- libffi==3.2.1=hd88cf55_4
- libgfortran-ng==7.2.0=h9f7466a_2
- readline==7.0=ha6073c6_4
- cairo==1.14.12=h77bcde2_0
- r-ddalpha==1.3.1=r342hd2d3f94_0
- r-modelmetrics==1.1.0=r342h5a23eb1_0
- r-purrr==0.2.3=r342h7916a1c_0
- r-base==3.4.2=haf99962_0
- r-gtable==0.2.0=r342h8e3b2c8_0
- r-lazyeval==0.2.0=r342h346dbdc_0
- r-rcpp==0.12.13=r342h9f83869_0
- r-rprojroot==1.2=r342hd69aa9e_0
- zlib==1.2.11=ha838bed_2
- r-plyr==1.8.4=r342h876901b_0
and on install.packages('Interpol'):
Error: package ‘Interpol’ is not available (for R version 3.4.2)
please see https://cran.r-project.org/web/packages/Interpol/index.html
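Each entry in the list above has the form `name==version=build`, and several package names contain `linux-64`, which suggests the environment was exported on Linux. Exact build strings from one platform are a likely reason the solver cannot resolve them on macOS. The sketch below drops build strings and flags Linux-specific names. Relaxing the pins this way is an assumption about a possible workaround, not a fix confirmed by the authors, and versions this old may still be unavailable for macOS. The function names are hypothetical.

```python
import re

# name, then "=" or "==", then version, then an optional "=build" suffix
SPEC = re.compile(r"^([A-Za-z0-9_.\-]+)={1,2}([^=]+)(?:=(.+))?$")


def relax_spec(spec):
    """Turn 'name==version=build' (or 'name=version=build') into 'name=version'."""
    spec = spec.strip()
    if spec.startswith("- "):  # tolerate a leading YAML list marker
        spec = spec[2:].strip()
    match = SPEC.match(spec)
    if match is None:
        return spec  # leave anything unrecognized untouched
    name, version, _build = match.groups()
    return f"{name}={version}"


def is_linux_specific(spec):
    """Flag packages whose name itself contains 'linux-64'."""
    name = relax_spec(spec).split("=", 1)[0]
    return "linux-64" in name


if __name__ == "__main__":
    # Entries copied from the error output above.
    for entry in ["- r-broom==0.4.2=r342h90c284a_0", "- gcc_linux-64==7.2.0=25"]:
        print(relax_spec(entry), is_linux_specific(entry))
```

The name check is deliberately narrow; other entries in the list may also be Linux-only without saying so in their names.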
Hi there,
I'm trying to follow the README.md manual, but I don't understand what happens when I run step 3 (the test sample in the README.md manual):
./run.sh --model deepsol1 --stage 2 --mode decode --device cpu data/newtest.data
The output was:
TypeError: The two branches should have identical types, but they are TensorType(float64, 3D) and TensorType(int32, 3D) respectively. This error could be raised if for example you provided a one element list on the `then` branch but a tensor on the `else` branch.
Can you tell me how to fix it?
Thank you.