josephwb / decisivator
License: GNU General Public License v3.0
Currently seems to accept only Nexus user trees. Add Newick support.
Currently things are as:
'data_file' is a simple 'vanilla' Nexus file containing sequences and defined CHARSETs.
- PLEASE NOTE! Only simple CHARSETs are currently supported.
- e.g. contiguous (X-Y) or interval (e.g. codon: X-Y\3) data are fine.
- CHARSET referencing is NOT allowed at present (but will be!)
Need to allow more complex partitions, e.g.:
CHARSET p10 = 66259-66387, 101052-104953, 104954-105897, 105898-106034, 106035-106848, 106849-111795;
Will require a change to how partition-specific sites are stored.
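A minimal sketch of how such a multi-block CHARSET could be expanded into an explicit list of sites, covering contiguous (X-Y), interval (X-Y\3), and comma-separated blocks. It is written in Python for illustration only and is not Decisivator's code; the function name `parse_charset` is hypothetical, and CHARSET referencing is deliberately not handled.

```python
import re

def parse_charset(declaration):
    """Parse one Nexus CHARSET line into (name, sorted list of 1-based sites).

    Hypothetical helper: supports X, X-Y and X-Y\\N blocks separated by
    commas and/or whitespace. References to other CHARSETs are rejected.
    """
    match = re.match(r"\s*CHARSET\s+(\S+)\s*=\s*(.*?)\s*;?\s*$",
                     declaration, re.IGNORECASE)
    if match is None:
        raise ValueError(f"Not a CHARSET declaration: {declaration!r}")
    name, body = match.groups()
    sites = set()
    for block in re.split(r"[,\s]+", body):
        if not block:
            continue
        m = re.fullmatch(r"(\d+)(?:-(\d+))?(?:\\(\d+))?", block)
        if m is None:
            raise ValueError(f"Unsupported CHARSET block: {block!r}")
        start = int(m.group(1))
        stop = int(m.group(2)) if m.group(2) else start
        step = int(m.group(3)) if m.group(3) else 1
        if stop < start or step < 1:
            raise ValueError(f"Invalid range in block: {block!r}")
        # Nexus ranges are inclusive at both ends.
        sites.update(range(start, stop + 1, step))
    return name, sorted(sites)

name, sites = parse_charset("CHARSET p10 = 66259-66387, 101052-104953;")
print(name, len(sites))
```

Storing each partition as an explicit site list (or a list of ranges) rather than a single start/stop pair is one way to accommodate the change in storage mentioned above.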
Menu navigation can be tedious. Make scriptable (i.e. use config file, exit on completion).
Currently coverage is reported on a taxon-gene basis (and so implicitly weights all genes equally, even when they differ in length). Return character-specific coverage as well.
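To illustrate the difference between the two measures, here is a small Python sketch. It assumes a taxon either has data for a whole partition or has none, so "character-specific" here just means weighting each taxon-gene cell by partition length. A true per-character count would inspect the individual sites. The function and argument names are hypothetical.

```python
def coverage(presence, lengths):
    """Return (taxon-gene coverage, length-weighted coverage).

    presence[t][p] is True if taxon t has data for partition p.
    lengths[p] is the number of characters in partition p.
    """
    n_taxa = len(presence)
    n_parts = len(lengths)
    # Every taxon-gene cell counts equally.
    filled_cells = sum(1 for row in presence for p in range(n_parts) if row[p])
    taxon_gene = filled_cells / (n_taxa * n_parts)
    # Every cell counts in proportion to the length of its partition.
    filled_chars = sum(lengths[p] for row in presence
                       for p in range(n_parts) if row[p])
    character = filled_chars / (n_taxa * sum(lengths))
    return taxon_gene, character

# Two taxa, a short gene (100 sites) and a long gene (900 sites);
# the second taxon lacks the long gene.
print(coverage([[True, True], [True, False]], [100, 900]))
```

In this toy case three of four cells are filled, but only 1100 of 2000 characters are, so the two measures diverge when the missing gene is the long one.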
Considering all possible trees is silly. When simulating 'random' trees, make sure they are compatible with the constraint tree. Should give much more informative results. Definitely important, but difficult given current tree generation (random).
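One possible approach, sketched in Python under simplifying assumptions: represent the constraint tree as nested tuples (rooted, with all taxa present) and build a random binary tree by randomly resolving its polytomies, so every clade in the constraint is preserved. This is not how Decisivator generates trees, the sampling is not uniform over compatible trees, and all names are hypothetical.

```python
import random

def resolve(node, rng):
    """Randomly resolve a multifurcating nested-tuple tree into a binary one."""
    if not isinstance(node, tuple):
        return node  # leaf label
    children = [resolve(child, rng) for child in node]
    # Join two randomly chosen subtrees at a time until only two remain.
    # Joins happen only among children of the same constraint node, so
    # every constraint clade survives.
    while len(children) > 2:
        i, j = sorted(rng.sample(range(len(children)), 2))
        right = children.pop(j)
        left = children.pop(i)
        children.append((left, right))
    return tuple(children)

def leaf_set(node):
    """Return the set of leaf labels below a node."""
    if not isinstance(node, tuple):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))

def clades(node):
    """Return the leaf sets of all internal nodes."""
    if not isinstance(node, tuple):
        return set()
    found = {leaf_set(node)}
    for child in node:
        found |= clades(child)
    return found

constraint = (("A", "B"), "C", "D", "E")  # clade (A,B) plus a polytomy
tree = resolve(constraint, random.Random(1))
print(tree)
print(clades(constraint) <= clades(tree))
```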
Formatting requirements for the data block (i.e. no spaces allowed) are tedious, and no informative error is returned when they are violated. Just make parsing more flexible.
Allow user to specify that all characters are independent (e.g. morphological) instead of providing a huge number of CHARSET declarations.
The whole commandline argument parsing is ugly. Replace it with getopt-style handling.
Allow input of PHYLIP-formatted data (with a separate partition file), e.g. à la RAxML. Or possibly provide a small accessory program that can do the conversion (both ways).
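The partition half of such a converter could look roughly like the Python sketch below, which maps a Nexus CHARSET line to a RAxML-style partition line (`DNA, name = ranges`) and back. The data type is assumed to be DNA unless stated otherwise, the function names and the gene name in the example are hypothetical, and converting the sequence data itself is not covered.

```python
import re

def charset_to_raxml(declaration, datatype="DNA"):
    """Convert 'CHARSET name = ranges;' to a RAxML-style partition line."""
    match = re.match(r"\s*CHARSET\s+(\S+)\s*=\s*(.*?)\s*;\s*$",
                     declaration, re.IGNORECASE)
    if match is None:
        raise ValueError(f"Not a CHARSET declaration: {declaration!r}")
    name, body = match.groups()
    # Nexus allows whitespace-separated blocks; emit comma-separated ones.
    blocks = [b for b in re.split(r"[,\s]+", body) if b]
    return f"{datatype}, {name} = {', '.join(blocks)}"

def raxml_to_charset(line):
    """Convert a 'TYPE, name = ranges' line to a Nexus CHARSET declaration."""
    head, _, body = line.partition("=")
    _datatype, _, name = head.partition(",")
    if not name.strip() or not body.strip():
        raise ValueError(f"Not a partition line: {line!r}")
    return f"CHARSET {name.strip()} = {body.strip()};"

print(charset_to_raxml("CHARSET rbcL = 1-1428;"))
print(raxml_to_charset("DNA, rbcL = 1-1428"))
```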
Decisiveness involves only whether overlapping data are present. Ideally this would involve some measure of the informativeness of the data involved. However, an easy first step is to condition on whether the data present show any substitutions; if not, the data should not be considered to contribute to informativeness.
This makes the search a fair bit more complicated, because, even with complete sampling, some quartets within a partition will be uninformative, while others are not. So really this will need to involve a per-partition quartet-specific assessment.
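As a rough illustration of the per-partition, quartet-specific check described above, the Python sketch below asks whether a given quartet shows any variation within a partition. It assumes that a site counts only when all four taxa have a non-missing state, and that `?` and `-` are the only missing-data symbols. Both are assumptions, as are the names used.

```python
MISSING = {"?", "-"}  # assumed missing-data symbols

def quartet_shows_substitution(alignment, quartet, sites):
    """Return True if any site in the partition varies among the quartet.

    alignment maps taxon name -> aligned sequence string.
    sites is an iterable of 0-based column indices for one partition.
    """
    for site in sites:
        column = [alignment[taxon][site].upper() for taxon in quartet]
        if any(state in MISSING for state in column):
            continue  # at least one taxon has no data at this site
        if len(set(column)) > 1:
            return True  # at least two distinct states observed
    return False

alignment = {"A": "ACGT", "B": "ACGT", "C": "ACGA", "D": "AC?T"}
quartet = ("A", "B", "C", "D")
print(quartet_shows_substitution(alignment, quartet, range(4)))
print(quartet_shows_substitution(alignment, quartet, range(3)))
```

Variation is a weaker condition than parsimony informativeness, which matches the "easy first step" framing: a quartet-partition pair with no variation at all would simply not be counted.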
Maybe not, since the code base is so small. But it would take care of the Mac vs. Linux compiling issue (that is, gcc must be used on Mac, which requires working around the clang-centric OS defaults). Definitely not high priority.
This should really be done...
Hello,
I have been using Decisivator with a data set and am somewhat confused about the output and summary status when I use the complete decisiveness metric. When I run the T option, I get the following output:
Searching for presence of all possible taxon triplets...
Triplet: 131071 of 134044
Counted 134044 total triplets, 134044 of which were observed.
Woo-hoo! All possible taxon triplets observed. Matrix is (probably) decisive for all possible trees!
Searching for presence of all possible taxon quartets...
Quartet: 3047423 of 3049501
Counted 3049501 total quartets, 3049501 of which were observed.
Woo-hoo! All possible taxon quartets observed. Matrix IS decisive for all possible trees!
So it seems that the matrix is decisive. But when I print the summary info, I get this:
1 processor available for analysis.
Input file: 'Input.nexus'.
A total of 2139 partitions read for 94 taxa.
No reference taxa found (i.e. have data for all partitions).
1 user tree in memory.
Matrix coverage is currently at: 0.902365
Taxon-character matrix is NOT currently decisive for all possible trees.
Partial tree-wise decisiveness for taxon-character matrix is currently: 1 (determined from 1000 random trees).
Partial branch-wise decisiveness has not yet been determined for this taxon-character matrix.
This says the matrix is NOT decisive. I didn't make any changes to it, so I am not sure why these things seem to contradict one another.
Thanks!
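For reference, the totals in the output above match the number of taxon triplets and quartets expected for 94 taxa. The Python sketch below checks those counts and shows a brute-force version of the "all triplets/quartets observed" test, i.e. whether every k-taxon subset co-occurs in at least one partition. It is an illustration only, not Decisivator's implementation, and it does not explain the discrepancy reported in the summary.

```python
from itertools import combinations
from math import comb

def all_subsets_observed(taxa_by_partition, taxa, k):
    """True if every k-subset of taxa has data together in some partition.

    taxa_by_partition is a list of collections, one per partition, holding
    the taxa that have data for that partition.
    """
    partitions = [frozenset(p) for p in taxa_by_partition]
    return all(
        any(p.issuperset(subset) for p in partitions)
        for subset in combinations(taxa, k)
    )

# Expected totals for 94 taxa.
print(comb(94, 3))  # 134044 triplets
print(comb(94, 4))  # 3049501 quartets

# Toy example: taxon E never co-occurs with D, so some triplets are missing.
toy = [{"A", "B", "C", "D"}, {"A", "B", "C", "E"}]
print(all_subsets_observed(toy, "ABCDE", 3))
```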