Comments (5)
Dear Joachim,
Thank you very much for your kind help.
I am sorry that I misunderstood Henry's suggestion.
I had only removed the header from the first line and didn't realize that a header line would be present for every file in the concatenated file.
So, I used the following command as you suggested and started my RF model run with the new file.
grep -v "pvalue" all.DVF.predictions.txt > all.DVF.predictions.NEW.txt
This time the RF model run finished successfully.
Parsing deepvirfinder
Parsing voghmm
Parsing micompletehmm
Loading Model and annotation table
Writing: 260460 bins to file
Thank you very much @enryH and @sklucas for your suggestions.
from phamb.
I solved this problem by concatenating my dvf files using the code below, keeping only the first header:
awk '(NR == 1) || (FNR > 1)' {input.dvf} > {output.dvf}
input.dvf is a list of all the input files and output.dvf is the concatenated file with only a single header.
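
For anyone who prefers to do this step in Python, here is a rough sketch of the same idea as the awk one-liner above: keep the first line of the first file and skip the first line of every later file. The function name `concat_keep_first_header` is made up for illustration and is not part of phamb.

```python
from typing import Iterable


def concat_keep_first_header(inputs: Iterable[str], output: str) -> int:
    """Concatenate text files, keeping only the first file's header line.

    Hypothetical helper (not part of phamb). Returns the number of lines written.
    """
    written = 0
    with open(output, "w") as out:
        for file_index, path in enumerate(inputs):
            with open(path) as handle:
                for line_index, line in enumerate(handle):
                    # Skip the header (first line) of every file except the first.
                    if line_index == 0 and file_index > 0:
                        continue
                    # Make sure a file without a trailing newline does not
                    # glue its last row onto the next file's first row.
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
                    written += 1
    return written
```

Usage would look like `concat_keep_first_header(["a.dvf.txt", "b.dvf.txt"], "all.dvf.txt")`, where the file names are placeholders.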
from phamb.
Hi Bhim
Did you remove all header lines in the all.DVF.predictions.txt file? If you concatenated a bunch of DeepVirFinder files, you will still have header lines scattered throughout the file.
Can you try something like this to make sure there are no headers left:
grep -v "pvalue" all.DVF.predictions.txt > all.DVF.predictions.NEW.txt
Then use the all.DVF.predictions.NEW.txt as input instead.
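
If grep is not available, the same filtering can be sketched in Python. This is only an illustration of what the grep command above does (drop every line that contains the text "pvalue"); the function name `drop_lines_containing` is hypothetical and not part of phamb.

```python
def drop_lines_containing(src: str, dst: str, marker: str = "pvalue") -> int:
    """Copy src to dst, skipping every line that contains `marker`.

    Hypothetical helper mirroring `grep -v "pvalue"`.
    Returns the number of lines that were skipped.
    """
    skipped = 0
    with open(src) as infile, open(dst, "w") as outfile:
        for line in infile:
            if marker in line:
                # Header rows contain the column name "pvalue"; leave them out.
                skipped += 1
                continue
            outfile.write(line)
    return skipped
```

The return value tells you how many header lines were found, which is a quick way to confirm that the concatenated file really contained more than one.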
Let me know if it works.
Best,
Joachim
from phamb.
Can you try to delete the first header line? As I read the error, the program fails when it tries to convert the string "score" to a float value.
float("score") # fails
float(0.4933076500892639) # should work
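
To find rows that would trigger this error, one could test whether the score field converts to a float. The sketch below assumes four whitespace-separated columns with the score in the third one, as in the DeepVirFinder rows posted in this thread; `score_is_numeric` is a made-up name, not a phamb function.

```python
def score_is_numeric(line: str) -> bool:
    """Return True if the third whitespace-separated field parses as a float.

    Assumes the column order name, length, score, pvalue (an assumption
    based on the rows shown in this thread).
    """
    fields = line.split()
    if len(fields) < 3:
        return False  # too few columns to contain a score
    try:
        float(fields[2])
    except ValueError:
        # A header row ends up here because float("score") raises ValueError.
        return False
    return True
```

Running this over every line of the concatenated file and printing the lines where it returns False would show any leftover headers.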
Best, Henry
from phamb.
Dear Henry,
Thank you very much for your quick reply.
I tried your solution but still got the same error.
Traceback (most recent call last):
  File "/lustre7/home/bhimbiswa/MAGs/Virus/Phamb_new/mag_annotation/scripts/run_RF.py", line 227, in <module>
    viral_annotation = run_RF_modules.Viral_annotation(annotation_files=viral_annotation_files,genomes=reference)
  File "/lustre7/home/bhimbiswa/MAGs/Virus/Phamb_new/mag_annotation/scripts/run_RF_modules.py", line 358, in __init__
    self._parse_viralannotation_file(filetype.lower(),file)
  File "/lustre7/home/bhimbiswa/MAGs/Virus/Phamb_new/mag_annotation/scripts/run_RF_modules.py", line 386, in _parse_viralannotation_file
    annotation_tuple = parse_function(line)
  File "/lustre7/home/bhimbiswa/MAGs/Virus/Phamb_new/mag_annotation/scripts/run_RF_modules.py", line 513, in _parse_dvf_row
    score =round(float(score),2)
ValueError: could not convert string to float: 'score'.
As you suggested, I removed the header line of "all.DVF.predictions.txt".
S10CNODE_1_length_374305_cov_118.066653 374305 0.4933076500892639 0.06760329330009819
S10CNODE_2_length_331174_cov_150.761282 331174 0.5215792059898376 0.05410151824155903
S10CNODE_3_length_327615_cov_134.196242 327615 0.6207031011581421 0.03997658433416421
S10CNODE_4_length_275508_cov_107.113522 275508 0.3987869620323181 0.09687287559483344
S10CNODE_5_length_273839_cov_39.234849 273839 0.37943029403686523 0.10166931037087393
S10CNODE_6_length_265257_cov_21.606357 265257 0.7501952648162842 0.029231815091774305
S10CNODE_7_length_254430_cov_27.129502 254430 0.6598391532897949 0.036350932849913135
S10CNODE_8_length_239244_cov_15.625518 239244 0.5251834392547607 0.05332729058085958
S10CNODE_9_length_235224_cov_151.910707 235224 0.4213518500328064 0.09149104917289826
Regards,
Bhim
from phamb.