phyckle's Introduction

phyckle

In the src folder are several scripts to help conduct phylogenomic analyses.

conducting alternative relationship analyses

These analyses require RAxML.

There are several ways to conduct these analyses. One way to compare alternatives is the following:

  • Create a file listing the alternative bipartitions. The example in example/test.bp puts the focal bipartition on one line and lists each alternative below it, preceded by a -.
  • Run edge_investigate_conflicts_given.py, for example phyckle/src/edge_investigate_conflicts_given.py test.bp test.out seqs/ temp/, where test.bp is the set of bipartitions, test.out is the outfile, seqs/ contains only the sequence files (nothing else), and temp/ is the output directory.
  • Then run postprocess_edge_investigate.py, for example python phyckle/src/postprocess_edge_investigate.py test.out temp/, using the same test.out and temp/ as in the previous step.
  • This produces a file called cons_0.csv in which each line has the fields gene,cons_0,cons_0_conf_0,cons_0_conf_1,bestone,best,secondbest,diffbestsecondbest: the gene, followed by the likelihood for each edge (cons_0 is the focal bipartition, and cons_0_conf_0 and cons_0_conf_1 are the two alternatives in this example). best is the highest likelihood, and the remaining fields list the second-best likelihood and the difference between the two.

There are several other analyses that can be conducted but this is the basic set of analyses.
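
As a rough illustration of how cons_0.csv might be summarized, the sketch below tallies how many genes favor each constraint, counting only genes where the best and second-best likelihoods are clearly separated. It rests on assumptions not confirmed by this README: that the file starts with a header row carrying the field names listed above, that bestone holds the name of the best-scoring column, and that diffbestsecondbest is non-negative. The function name summarize_best, the min_diff cutoff of 2.0, and the sample rows are all made up for the example.

```python
import csv
import io
from collections import Counter


def summarize_best(csv_text, min_diff=2.0):
    """Count genes per best constraint, skipping poorly separated genes."""
    # Assumption: the first row is a header with the field names from the README.
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # Keep a gene only if best and second best differ by at least min_diff.
        if float(row["diffbestsecondbest"]) >= min_diff:
            counts[row["bestone"]] += 1
    return counts


# Made-up rows for illustration only; real values come from your own run.
example = """gene,cons_0,cons_0_conf_0,cons_0_conf_1,bestone,best,secondbest,diffbestsecondbest
g1.fa,-1000.0,-1005.0,-1010.0,cons_0,-1000.0,-1005.0,5.0
g2.fa,-2003.0,-2000.0,-2001.0,cons_0_conf_0,-2000.0,-2001.0,1.0
g3.fa,-1500.0,-1490.0,-1499.0,cons_0_conf_0,-1490.0,-1499.0,9.0
"""

for constraint, n_genes in sorted(summarize_best(example).items()):
    print(constraint, n_genes)
```

To use it on a real run, read the file first, for example summarize_best(open("cons_0.csv").read()), and adjust the field names if your output differs.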

conducting combinability analyses

These analyses require iqtree and bp.

Two scripts currently conduct these analyses: test_clusters.py and test_clusters_multi_measure.py. test_clusters.py uses the weighted RF distances (RFW) to construct a graph and then proceeds through that graph. test_clusters_multi_measure.py adds further considerations, such as penalizing poor overlap of taxa (using rfp from bp).
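
To give a feel for what a graph built from pairwise tree distances can look like, here is a generic sketch. It is not the procedure implemented in test_clusters.py. It links two genes whenever their distance is at or below a cutoff and reports the connected components as candidate groups. The function name cluster_by_threshold, the cutoff, and the distances are hypothetical, and the distances are passed in as a dictionary because the layout of the bp output is not described here.

```python
from collections import defaultdict


def cluster_by_threshold(names, dist, cutoff):
    """Group genes linked, directly or indirectly, by distances <= cutoff."""
    # Build an undirected graph that keeps only the short edges.
    graph = defaultdict(set)
    for (a, b), d in dist.items():
        if d <= cutoff:
            graph[a].add(b)
            graph[b].add(a)

    # Collect connected components with a depth-first traversal.
    seen, clusters = set(), []
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


# Made-up distances for four hypothetical genes.
names = ["g1", "g2", "g3", "g4"]
dist = {
    ("g1", "g2"): 0.1, ("g1", "g3"): 0.9, ("g1", "g4"): 0.95,
    ("g2", "g3"): 0.8, ("g2", "g4"): 0.9, ("g3", "g4"): 0.2,
}
print(cluster_by_threshold(names, dist, cutoff=0.3))
```

A fixed cutoff is the simplest possible rule and is used here only to keep the example short.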

  • Conduct iqtree analyses on each gene. You can do this with run_iqtree.py.
  • Create a file with the sequence filenames in order (ls *.fa > treemap) and place the corresponding trees in a single file in the same order (cat *.treefile > mltrees).
  • Optionally delete the extra files (rm *.bionj *.mldist *.log *.ckp.gz) and move the iqtree files and treefiles to the gene directory.
  • Calculate the weighted RF for the trees in the tree file (bp -rfw -t mltrees > rfw).
  • Conduct the clustering analysis: python phyckle/src/test_clusters.py -d seqs/ -m treemap -w rfw -e spp -a
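
The treemap and mltrees files have to list genes and trees in the same order. The sketch below is one way to build both files in a single pass so the order cannot drift apart. It assumes each tree sits next to its alignment and is named by appending .treefile to the alignment filename (for example test1.fa.treefile), which matches the shell commands in this README but may not match your setup. The function name write_treemap_and_mltrees is made up for the example.

```python
from pathlib import Path


def write_treemap_and_mltrees(seq_dir, treemap_path, mltrees_path, suffix=".fa"):
    """Write alignment names and their trees to two files in matching order."""
    seq_dir = Path(seq_dir)
    alignments = sorted(p for p in seq_dir.iterdir() if p.name.endswith(suffix))
    names, trees = [], []
    for aln in alignments:
        # Assumption: the tree for test1.fa is stored as test1.fa.treefile.
        treefile = aln.with_name(aln.name + ".treefile")
        if not treefile.exists():
            raise FileNotFoundError(f"no tree found for {aln.name}")
        names.append(aln.name)
        trees.append(treefile.read_text().strip())
    # One filename per line and one tree per line, in the same order.
    Path(treemap_path).write_text("\n".join(names) + "\n")
    Path(mltrees_path).write_text("\n".join(trees) + "\n")
    return names
```

Calling write_treemap_and_mltrees("seqs/", "treemap", "mltrees") would then stand in for the ls and cat commands. A missing tree raises an error rather than silently shifting the order.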

Example with pxbdsim and pxseqgen

You can use phyx to generate data for an example analysis. Here are the commands:

pxbdsim -e 10 | pxtscale -r 1 > test.tre
pxseqgen -t test.tre -l 500 -o test1.fa
pxseqgen -t test.tre -l 500 -o test2.fa
pxseqgen -t test.tre -l 500 -o test3.fa
pxseqgen -t test.tre -l 500 -o test4.fa
pxseqgen -t test.tre -l 500 -o test5.fa
mkdir seqs/
mv *.fa seqs/
python phyckle/src/run_iqtree.py seqs/ test
rm *.bionj *.mldist *.log *.ckp.gz
mv *.treefile *.iqtree seqs/
cd seqs/
ls test*.fa > ../treemap
cat test*.treefile > ../mltrees
cd ../
bp -rfw -t mltrees > rfw
python phyckle/src/test_clusters.py -d seqs/ -m treemap -w rfw -e spp -a
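
The five pxseqgen calls above differ only in the output filename, so they can be generated in a loop when more genes are wanted. The sketch below builds the same commands as argument lists, using only the flags that appear above (-t, -l, -o). The function name seqgen_commands and its parameters are made up for the example, and it only prints the commands rather than running them.

```python
import shlex


def seqgen_commands(tree="test.tre", n_genes=5, length=500, prefix="test"):
    """Return one pxseqgen command (as an argument list) per simulated gene."""
    return [
        ["pxseqgen", "-t", tree, "-l", str(length), "-o", f"{prefix}{i}.fa"]
        for i in range(1, n_genes + 1)
    ]


# Print the commands so they can be checked before anything is executed.
for cmd in seqgen_commands():
    print(shlex.join(cmd))
```

If phyx is installed, each list can be passed to subprocess.run(cmd, check=True) to execute it.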

some comments

The runs above switch between iqtree and RAxML. This is primarily because iqtree has several branch length options when concatenating and RAxML is more efficient when conducting constrained analyses.

phyckle's People

Contributors

blackrim

phyckle's Issues

ERROR: No module named 'tree_reader'

python edge_investigate_conflicts_given.py -h
Traceback (most recent call last):
  File "/ds3200_1/users_root/shenzongfang/phyckle/comb/test1/edge_investigate_conflicts_given.py", line 4, in <module>
    import tree_reader
ModuleNotFoundError: No module named 'tree_reader'

This error appears when I run the script on Linux, and I cannot find this module in Python.

error: SEQUENCE: 6282_alignment.phy constraint: 0 DONTRUN

Hello,

Just read through the new SSB paper and I'm trying to implement this set of scripts on my data set. I keep getting the error of

constraint: 0 -conflicts- 0 1 2 3 4 running constraints SEQUENCE: exon_6295_alignment.phy constraint: 0 DONTRUN

It is unclear to me what this means. I looked through the code but couldn't make sense of it.

Thanks for any advice and all of the great tools!!!

Cody Coyotee
