josephwb / decisivator

License: GNU General Public License v3.0

Languages: C++ 66.56%, Turing 23.72%, C 9.21%, Makefile 0.51%

decisivator's Issues

Support for non-Nexus tree input

Currently, only Nexus-formatted user trees seem to be accepted. Add support for Newick.

Support compound CHARSETs

Currently, the situation is as follows:

'data_file' is a simple 'vanilla' Nexus file containing sequences and defined CHARSETs.
  - PLEASE NOTE! Only simple CHARSETs are currently supported.
    - e.g. contiguous (X-Y) or interval (e.g. codon: X-Y\3) data are fine.
    - CHARSET referencing is NOT allowed at present (but will be!)

Need to allow more complex partitions, e.g.:

CHARSET p10 = 66259-66387, 101052-104953, 104954-105897, 105898-106034, 106035-106848, 106849-111795;

This will require a change to how partition-specific sites are stored.
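As a rough illustration of one way compound definitions could be handled, the sketch below expands the right-hand side of a CHARSET into an explicit set of 1-based site indices. The function name `expand_charset` is hypothetical and not part of Decisivator, and CHARSET referencing (one CHARSET naming another) is deliberately left out. Storing an explicit site set per partition is only one possible representation.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Decisivator): expand the right-hand side of
// a CHARSET definition, e.g. "1-10, 20-30\3", into a sorted set of 1-based sites.
// Each comma-separated piece may be a single site (X), a range (X-Y), or a
// range with a stride (X-Y\N).
std::set<long> expand_charset(const std::string& spec) {
    std::set<long> sites;
    std::stringstream pieces(spec);
    std::string piece;
    while (std::getline(pieces, piece, ',')) {
        // Drop whitespace and any trailing ';' terminator.
        std::string t;
        for (char c : piece) {
            if (!std::isspace(static_cast<unsigned char>(c)) && c != ';') t += c;
        }
        if (t.empty()) continue;

        long stride = 1;
        const std::size_t slash = t.find('\\');
        if (slash != std::string::npos) {
            stride = std::stol(t.substr(slash + 1));
            t = t.substr(0, slash);
        }

        const std::size_t dash = t.find('-');
        const long start = std::stol(t.substr(0, dash));
        const long stop = (dash == std::string::npos) ? start
                                                      : std::stol(t.substr(dash + 1));
        if (start < 1 || stop < start || stride < 1) {
            throw std::invalid_argument("malformed CHARSET piece: " + t);
        }
        for (long s = start; s <= stop; s += stride) sites.insert(s);
    }
    return sites;
}
```

With this representation, a contiguous or codon-position CHARSET is just a special case of the compound form, so the existing simple definitions would not need separate handling.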

Support for config file

Menu navigation can be tedious. Make the program scriptable (i.e. read settings from a config file and exit on completion).

Return character-specific coverage

Currently, coverage is reported on a taxon-gene basis, which implicitly weights all genes equally even when they differ in length. Return character-specific coverage as well.

Considering all possible trees is silly. When simulating 'random' trees, make sure they are compatible with the constraint tree; this should give much more informative results. Definitely important, but difficult given the current (random) tree generation.

Make Nexus formatting requirements less draconian

Requirements in the data block (i.e. no spaces allowed) are tedious, and no informative error is returned when they are violated. Just make the parsing more flexible.

Support for independent characters

Allow user to specify that all characters are independent (e.g. morphological) instead of providing a huge number of CHARSET declarations.

Use getopt for parsing command-line arguments

The whole command-line argument parsing is ugly. Replace it with getopt-style handling.
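A minimal sketch of what getopt-style handling could look like, using POSIX `getopt` from `<unistd.h>`. The option letters (`-d`, `-t`, `-n`), the `Options` struct, and the default of 1000 random trees are all assumptions made for illustration; they are not Decisivator's existing flags.

```cpp
#include <stdexcept>
#include <string>
#include <unistd.h>  // POSIX getopt, optarg, optind, opterr

// Hypothetical option set; the flag letters are placeholders.
struct Options {
    std::string data_file;        // -d <file>: Nexus alignment with CHARSETs
    std::string tree_file;        // -t <file>: optional user tree(s)
    int num_random_trees = 1000;  // -n <int>:  number of random trees (assumed default)
};

Options parse_args(int argc, char* argv[]) {
    Options opts;
    optind = 1;  // restart scanning at the first argument
    opterr = 0;  // report errors ourselves instead of letting getopt print
    int c;
    while ((c = getopt(argc, argv, "d:t:n:")) != -1) {
        switch (c) {
            case 'd': opts.data_file = optarg; break;
            case 't': opts.tree_file = optarg; break;
            case 'n': opts.num_random_trees = std::stoi(optarg); break;
            default:
                throw std::invalid_argument(
                    "usage: prog -d data_file [-t tree_file] [-n num_random_trees]");
        }
    }
    if (opts.data_file.empty()) {
        throw std::invalid_argument("a data file (-d) is required");
    }
    return opts;
}
```

Returning a plain struct keeps parsing separate from the rest of the program, which would also make it straightforward to fill the same struct from a config file later.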

Support for non-Nexus alignment input

Allow input of PHYLIP-formatted data (with a separate partition file), à la RAxML. Alternatively, provide a small accessory program that can do the conversion (both ways).

Check whether partitions contain constant data

Decisiveness involves only whether overlapping data are present. Ideally this would involve some measure of the informativeness of the data involved. However, an easy first step is to condition on whether the data present show any substitutions; if not, the data should not be considered to contribute to informativeness.
This makes the search a fair bit more complicated, because, even with complete sampling, some quartets within a partition will be uninformative, while others are not. So really this will need to involve a per-partition quartet-specific assessment.
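The "easy first step" could be sketched as a check for whether a set of aligned sequences (for instance, the four taxa of one quartet within one partition) contains any variable site. The function name `has_variable_site` is hypothetical, and treating '-', '?' and 'N' as missing data is an assumption about the input encoding.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if at least one alignment column shows two
// or more distinct observed states among the given sequences.
// Assumption: '-', '?' and 'N' denote missing data and are ignored.
bool has_variable_site(const std::vector<std::string>& seqs) {
    if (seqs.empty()) return false;
    const std::size_t n_sites = seqs[0].size();
    for (std::size_t i = 0; i < n_sites; ++i) {
        char first = 0;  // first observed state in this column, if any
        for (const auto& s : seqs) {
            if (i >= s.size()) continue;  // tolerate ragged input
            const char c = static_cast<char>(
                std::toupper(static_cast<unsigned char>(s[i])));
            if (c == '-' || c == '?' || c == 'N') continue;
            if (first == 0) {
                first = c;
            } else if (c != first) {
                return true;  // two different states: the column varies
            }
        }
    }
    return false;
}
```

Calling this once per quartet and partition would give the per-partition, quartet-specific assessment described above. Note that a variable site is a weaker condition than a parsimony-informative one, so this is only a lower bar for counting data as contributing.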

Use autoconf and configure?

Maybe not, since the code base is so small. But it would take care of the Mac vs. Linux compiling issue (that is, gcc must be used on Mac, which requires working around the clang-centric OS defaults). Definitely not high priority.

Rewrite in OO

This should really be done...

Question about complete decisiveness output

Hello,
I have been using Decisivator with a data set and am somewhat confused about the output and summary status when I use the complete decisiveness metric. When I run the T option, I get the following output:
Searching for presence of all possible taxon triplets...
Triplet: 131071 of 134044
Counted 134044 total triplets, 134044 of which were observed.
Woo-hoo! All possible taxon triplets observed. Matrix is (probably) decisive for all possible trees!

Searching for presence of all possible taxon quartets...
Quartet: 3047423 of 3049501
Counted 3049501 total quartets, 3049501 of which were observed.
Woo-hoo! All possible taxon quartets observed. Matrix IS decisive for all possible trees!

So it seems that the matrix is decisive. But when I print the summary info, I get this:

1 processor available for analysis.
Input file: 'Input.nexus'.
A total of 2139 partitions read for 94 taxa.

No reference taxa found (i.e. have data for all partitions).
1 user tree in memory.
Matrix coverage is currently at: 0.902365
Taxon-character matrix is NOT currently decisive for all possible trees.
Partial tree-wise decisiveness for taxon-character matrix is currently: 1 (determined from 1000 random trees).
Partial branch-wise decisiveness has not yet been determined for this taxon-character matrix.

This says the matrix is NOT decisive. I didn't make any changes to it, so I am not sure why these things seem to contradict one another.

Thanks!
