arborworkflows / arbor
aRbor, an R package with useful functions for Arbor workflows
We need to put the legend back on the tree for the ancestral state plot for discrete traits.
In R I usually use
colors<-rainbow(nlevels(trait))
tiplabels(pch=21, bg=colors[as.numeric(trait)], cex=2.5, adj=1)
nodelabels(pie = ans$lik.anc, piecol = colors, cex = 0.5)
legend(locator(),c(levels(trait)), fill=colors, cex=1, xpd=TRUE, horiz = FALSE)
For aRbor, should this instead be:
legend("topleft",c(levels(trait)), fill=colors, cex=1, xpd=TRUE, horiz = FALSE)
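A minimal sketch of the non-interactive variant, assuming the ape package is available. The simulated tree and trait are placeholders, and the node pies from `ans$lik.anc` are left out. `locator()` waits for a mouse click, so it needs an interactive graphics device; a keyword position such as "topleft" does not.

```r
library(ape)

set.seed(1)
tree  <- rtree(8)                                   # placeholder tree
trait <- factor(sample(c("a", "b", "c"), 8, replace = TRUE),
                levels = c("a", "b", "c"))          # placeholder discrete trait
colors <- rainbow(nlevels(trait))

pdf(NULL)                                           # headless device, as on a server
plot(tree, label.offset = 0.1)
tiplabels(pch = 21, bg = colors[as.numeric(trait)], cex = 2.5)
# A keyword position needs no mouse input, unlike locator()
leg <- legend("topleft", legend = levels(trait), fill = colors,
              cex = 1, xpd = TRUE, horiz = FALSE)
invisible(dev.off())
```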
Phenomics users requested a better way to find an appropriate analysis than clicking the eyeball icon and scrolling through a long list of names.
For example, where is the tree+heatmap function? How do we tell a user where it is?
Phenomics users requested a button that removes the "wings" on the output plot of the easy mode ancestral state analysis for continuous traits.
In both phylogenetic signal and ancestral state, the trait file parser does not read the header line.
This happens with both .csv and .tsv files.
Instead, the columns are simply named Column 1, Column 2, ...
Subsequent calculations then fail because the length of the traits differs from the length of the tip labels.
Workaround: I left off the headers and just remembered what was in each column.
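One possible cause, not confirmed here, is that the header line is read as a data row. The base R sketch below shows how that produces one extra row, which would explain the length mismatch with the tip labels. The table contents are made up, and whether Arbor's parser uses `read.csv` is an assumption.

```r
# Hypothetical trait table, as uploaded text
input <- "species,body_size\nsp_a,1.2\nsp_b,3.4\nsp_c,2.2\n"

# header = TRUE keeps the first line as column names rather than data
with_header    <- read.csv(text = input, header = TRUE, stringsAsFactors = FALSE)
without_header <- read.csv(text = input, header = FALSE, stringsAsFactors = FALSE)

names(with_header)     # "species" "body_size"
nrow(with_header)      # 3 rows, one per taxon
nrow(without_header)   # 4 rows: the header line is counted as a taxon

# For .tsv input the same call applies with sep = "\t" (or read.delim)
```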
A weird error is returned if the data matrix does not have column names. Minimal example (uncomment to fix):
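library(phytools)  # pbtree
library(geiger)    # sim.char
library(aRbor)     # make.treedata
tree<-pbtree(n=100, scale=1)
mm<-cbind(c(1, -0.5), c(-0.5, 1))  # 2 x 2 evolutionary rate matrix
z<-sim.char(tree, mm, model="BM")[,,1]  # first simulation; no column names
td<-make.treedata(tree, z)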
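A guard like the following could run before the data reach `make.treedata`. This is a sketch only: `ensure_colnames` and the "trait" prefix are hypothetical names, and it assumes that supplying column names is enough to avoid the error.

```r
# Hypothetical helper: give an unnamed trait matrix default column names
ensure_colnames <- function(x, prefix = "trait") {
  x <- as.matrix(x)
  if (is.null(colnames(x))) {
    colnames(x) <- paste0(prefix, seq_len(ncol(x)))
  }
  x
}

z <- matrix(rnorm(6), nrow = 3, dimnames = list(c("t1", "t2", "t3"), NULL))
z <- ensure_colnames(z)
colnames(z)   # "trait1" "trait2"
```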
this has literally no flexibility as it stands.
I identified and (I think) fixed an issue with the ER model in aceArbor. I expect the same issue - relating to diversitree constraints - exists for the SYM model. Needs to be checked and fixed.
Can we implement Easy Mode / Expert Mode code to evaluate what evolutionary model best fits a dataset, e.g., BM, OU, etc.?
This example is in R-phylo-wiki
Can we fit multiple models and then provide the user with a likelihood ratio test or a comparison of AICc scores, so that users can select the best-fitting model?
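The comparison step could look roughly like this in base R, once each model has been fitted by whatever routine is chosen (for example `fitContinuous` in geiger). The helper names are hypothetical and the log-likelihoods are made up for illustration. The chi-square reference for the likelihood ratio test is only approximate when the simpler model sits on a parameter boundary.

```r
# Hypothetical helper: rank fitted models by AICc.
# lnL = log-likelihoods, k = number of free parameters, n = sample size (tips)
compare_models <- function(lnL, k, n) {
  stopifnot(length(lnL) == length(k), all(n - k - 1 > 0))
  aic   <- -2 * lnL + 2 * k
  aicc  <- aic + (2 * k * (k + 1)) / (n - k - 1)
  delta <- aicc - min(aicc)
  w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))   # Akaike weights
  out <- data.frame(model = names(lnL), lnL = lnL, k = k,
                    AICc = aicc, dAICc = delta, weight = w,
                    row.names = NULL)
  out[order(out$AICc), ]
}

# Likelihood ratio test for nested models (e.g. BM nested within OU)
lrt <- function(lnL_null, lnL_alt, df) {
  stat <- 2 * (lnL_alt - lnL_null)
  c(statistic = stat, p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Made-up log-likelihoods purely for illustration
lnL <- c(BM = -52.1, OU = -49.8)
res <- compare_models(lnL, k = c(2, 3), n = 100)
res
lrt(lnL[["BM"]], lnL[["OU"]], df = 1)
```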
The image link is broken for the arbor logo in EasyMode - Ancestral State Reconstruction
Within the Arbor web interface, make.tree seems to ask users to choose whether the characters are discrete or continuous. It might be nicer for make.tree to accept all characters, and for the functions that use its output to ask for discrete or continuous as appropriate.
All the different types of tests should return consistent outputs.
Several users requested a simple tree metrics routine that would output the results of tests like:
Tree Status
is.binary YES
is.ultrametric NO
....
etc.
Basically, a large summary of the descriptive output that is commonly needed for analyses.
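A rough sketch of such a routine using checks that already exist in ape. `tree_status` is a hypothetical name, and the set of checks is only a starting point.

```r
library(ape)

# Hypothetical helper: one row per check for a phylo object
tree_status <- function(phy) {
  has_bl <- !is.null(phy$edge.length)
  checks <- c(
    is.rooted          = is.rooted(phy),
    is.binary          = is.binary(phy),
    has.branch.lengths = has_bl,
    is.ultrametric     = has_bl && is.ultrametric(phy)
  )
  data.frame(check = names(checks),
             status = ifelse(checks, "YES", "NO"),
             row.names = NULL)
}

set.seed(1)
status <- tree_status(rcoal(10))   # coalescent tree: rooted, binary, ultrametric
status
```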
When using expert mode, users requested a small box that could be clicked to read a manual page when an analysis is selected.
I think we need three cases: single column (checks for NAs, removes from data and tree as needed); pairwise (removes any taxa not present in BOTH, for things like PGLS); and multivariate (removes any incomplete taxa, for things like phyloPCA).
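The three cases can share one rule: keep a taxon only if every selected column is observed. Below is a base R sketch with a hypothetical helper name and made-up data; the tree would then be pruned to the returned taxa.

```r
# Hypothetical helper: which taxa to keep under each of the three cases.
# `dat` is a data frame with taxon names as row names.
taxa_to_keep <- function(dat, cols,
                         mode = c("single", "pairwise", "multivariate")) {
  mode <- match.arg(mode)
  sub <- dat[, cols, drop = FALSE]
  if (mode == "single")   stopifnot(ncol(sub) == 1)
  if (mode == "pairwise") stopifnot(ncol(sub) == 2)
  # In every case a taxon is kept only if all selected columns are observed
  rownames(sub)[complete.cases(sub)]
}

dat <- data.frame(x = c(1, NA, 3, 4), y = c(2, 5, NA, 1), z = c(1, 1, 1, NA),
                  row.names = c("sp_a", "sp_b", "sp_c", "sp_d"))

taxa_to_keep(dat, "x", "single")                      # "sp_a" "sp_c" "sp_d"
taxa_to_keep(dat, c("x", "y"), "pairwise")            # "sp_a" "sp_d"
taxa_to_keep(dat, c("x", "y", "z"), "multivariate")   # "sp_a"
# Pruning the tree to match: ape::drop.tip(phy, setdiff(phy$tip.label, kept))
```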
There is no color in the circles at the tips of the EasyMode - Ancestral State Reconstruction plot. The circles at the tips are also very small compared to the circles at the nodes (which are nicely sized and have the correct color).
See the files in GoogleDrive, "files that failed at Biosphere2", KirrWithHeaderUnderscore.csv and KirrGrafen.phy
Travis throws an error about the treedata s3 functions:
See section ‘Generic functions and methods’ of the ‘Writing R
Extensions’ manual.
When I edit an input table (.csv) and drop the new table into easyMode, the file name changes, and the new file is active. However, the input table preview does not change / refresh. It just displays what the previous file was.
I cannot figure out how to use select.treedata() inside another function.
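library(phytools)  # pbtree, sim.history
library(aRbor)     # make.treedata and the treedata method for select
tree<-pbtree(n=100, scale=1)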
Q<-matrix(c(-1,1,1,-1),2,2)
rownames(Q)<-colnames(Q)<-1:2
x<-sim.history(tree,Q)$states
y<-setNames(as.numeric(x),names(x))
ydf <- as.data.frame(y)
ytd<-make.treedata(tree, ydf)
#either of these work
select(ytd, 2)
select(ytd, y)
# now try in a function
foo<-function(ytd, columnSelect) {
ynew<-select(ytd, columnSelect)
return(ynew)
}
# now these don't work, and return what was (to me) a strange error
foo(ytd, 2)
foo(ytd, y)
# I think I get what's happening - the second argument is passed through to dplyr 'select'
# as 'columnSelect' and the variable within the function is being ignored.
# I am not sure how to fix this.
# there are hacks online but this might be a real issue:
# http://stackoverflow.com/questions/22919448/passing-function-argument-to-dplyr-select
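For a plain data frame, current dplyr (1.0 or later) offers two supported ways to do this. The sketch below shows both. Whether the treedata method for `select` forwards its arguments in a way that lets the same fix work is an assumption that would need checking.

```r
library(dplyr)

df <- data.frame(x = 1:3, y = c(0, 1, 0))

# Bare column names or positions: forward the argument with {{ }}
foo <- function(dat, columnSelect) {
  select(dat, {{ columnSelect }})
}
foo(df, y)
foo(df, 2)

# Column name held in a character string: wrap it in all_of()
foo_chr <- function(dat, column_name) {
  select(dat, all_of(column_name))
}
foo_chr(df, "y")
```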
Would be nice to have the ability to "smartly" detect character type. E.g. users often send a vector of 1s and 0s, not a factor - but the data actually does represent a factor.
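One possible heuristic, sketched in base R. The function name and the `max_states` threshold are assumptions; any such guess should remain overridable by the user.

```r
# Hypothetical heuristic; the name and the max_states threshold are assumptions
guess_character_type <- function(x, max_states = 5) {
  if (is.factor(x) || is.character(x) || is.logical(x)) return("discrete")
  x <- x[!is.na(x)]
  # Few distinct whole-number values suggests coded states such as 0/1
  whole <- all(x == round(x))
  if (whole && length(unique(x)) <= max_states) "discrete" else "continuous"
}

guess_character_type(c(0, 1, 1, 0, 1))          # "discrete"
guess_character_type(c(1.3, 2.7, 0.4, 5.1))     # "continuous"
guess_character_type(factor(c("a", "b")))       # "discrete"
```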
Right now this function only fits bisse mk2 using ML. Better to:
handle mkn
allow the user to specify constraints or ER, SYM, ARD
compare trait-independent and trait-dependent models
run a Bayesian version
Zach and I tried with his data and got this:
physigArbor(tree, data$SpongeHost)
Error in physigArbor(tree, data$SpongeHost) :
could not find function "detectCharacterType"
Where can we find "detectCharacterType"?
Just checks for a factor, which is not even what the code expects
require(aRbor)
data(anolis)
td<-make.treedata(anolis$phy, anolis$dat, name_column=1)
summarize(td)
Users continue to attempt to input trees with singleton nodes.
R can read these by:
using read.newick from the phytools package,
followed by collapse.singles,
then write.tree to get a plain Newick tree.
A check with is.binary.tree, followed by multi2di(), is a related issue.
The most common manipulation I need to do to get aRbor to read a tree is to make the tree binary.
Can we add a check when we read trees into any easy mode app:
is.binary.tree(phy) - if it returns FALSE, then do
phy<-multi2di(phy)
is.binary.tree(phy) - if it returns TRUE, continue; if FALSE, need to print an error that Arbor could not generate a binary tree.
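The check described above might be wrapped as follows. `ensure_binary` is a hypothetical name, and the sketch uses `is.binary`, the spelling in current ape releases. Note that `multi2di` resolves polytomies at random and inserts zero-length branches.

```r
library(ape)

# Hypothetical helper for the check described above
ensure_binary <- function(phy) {
  if (is.binary(phy)) return(phy)
  phy <- multi2di(phy)          # resolve polytomies at random
  if (!is.binary(phy)) stop("Arbor could not generate a binary tree.")
  phy
}

poly <- read.tree(text = "((a:1,b:1,c:1):1,d:2);")   # one polytomy
is.binary(poly)                  # FALSE
is.binary(ensure_binary(poly))   # TRUE
```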
See the GoogleDrive folder "files that failed at Biosphere2"
These are from Jenna, a microbiologist in Jonathan's lab at UC-Davis.
The Kirr.csv file needed a header, so I fixed that.
The Kirr.phy file contains singleton nodes.
It draws the following error in EasyMode:
"Analysis failed. Error in read.tree(text = input) : The tree has apparently singleton node(s): cannot read tree file. Reading Newick file aborted at tree no. 1"
For EasyMode to be easy, instead of just giving this error that there is a singleton node, the software should just fix the singleton node for the user:
library(phytools)
phy<-read.newick(input)
phy<-collapse.singles(phy)
phy<-multi2di(phy)
(could also check minimum edge length)
if (min(phy$edge.length) == 0) { phy$edge.length = phy$edge.length + 0.00001 }
because we really want
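The repair steps could be bundled into one function applied to a tree that has already been parsed. This is a sketch with hypothetical names (`repair_tree`, `eps`). Unlike the snippet above, it pads only the zero-length edges rather than adding a constant to every edge, which is a design choice and not what was actually run.

```r
library(ape)

# Hypothetical helper: clean an already-parsed tree before analysis
repair_tree <- function(phy, eps = 1e-5) {
  if (has.singles(phy)) phy <- collapse.singles(phy)   # drop one-child nodes
  if (!is.binary(phy))  phy <- multi2di(phy)           # resolve polytomies
  if (!is.null(phy$edge.length)) {
    zero <- phy$edge.length <= 0
    phy$edge.length[zero] <- eps                       # pad only zero-length edges
  }
  phy
}

# multi2di introduces a zero-length edge here, which is then padded
fixed <- repair_tree(read.tree(text = "((a:1,b:1,c:1):1,d:2);"))
is.binary(fixed)
min(fixed$edge.length)
```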
To fix all this for Jenna, I ran in R
phy<-read.newick("Kirr.phy")
Read 1 item
phy<-collapse.singles(phy)
phy<-multi2di(phy)
min(phy$edge.length)
[1] 1e-06
write.tree(phy,"KirrCollapsed.phy")
Now the error in easy mode is
"Analysis failed. Error in make.treedata(tree, table) : No matching names found between data and tree"
This one is easier. The tree has underscores and the .csv does not. Change the .csv.
A good error message here, so easily noticed and fixed.
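A mismatch report like the one below would make this case even easier to spot, and could optionally fix it by normalising spaces to underscores. The helper name and the species names are made up for illustration.

```r
# Hypothetical helper: report names that fail to match, after normalising
# spaces to underscores on both sides
match_report <- function(tip_labels, data_names) {
  norm <- function(x) gsub("[ ]+", "_", trimws(x))
  tips <- norm(tip_labels)
  dat  <- norm(data_names)
  list(matched   = intersect(tips, dat),
       tree_only = setdiff(tips, dat),
       data_only = setdiff(dat, tips))
}

r <- match_report(c("Homo_sapiens", "Pan_troglodytes", "Mus_musculus"),
                  c("Homo sapiens", "Pan troglodytes", "Gorilla gorilla"))
r$matched     # "Homo_sapiens" "Pan_troglodytes"
r$tree_only   # "Mus_musculus"
r$data_only   # "Gorilla_gorilla"
```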
Now the error in easy mode is
"Analysis failed. Error in check.tree(tree) : 'tree' must be ultrametric"
newphy<-compute.brlen(phy,method="Grafen")
write.tree(newphy,"KirrGrafen.phy")
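If this step were automated, it would be worth checking first and warning the user, because Grafen's method replaces the branch lengths entirely and the originals are lost. A short sketch on a random placeholder tree:

```r
library(ape)

set.seed(1)
phy <- rtree(10)                 # random edge lengths, so not ultrametric
before <- is.ultrametric(phy)

# Grafen's method discards the existing branch lengths and assigns new ones
newphy <- if (before) phy else compute.brlen(phy, method = "Grafen")
after <- is.ultrametric(newphy)

c(before = before, after = after)
```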
Now easyMode works, but 3 other issues came up
Add GeoSSE capabilities.
Phenomics users requested an explanation box for each easy mode analysis.
We need to display the exact test that was completed (e.g. what would the equivalent command(s) in R be?).
We need a brief explanation text of what the test is doing.
We need to re-state clearly the model used in the test (e.g. ace with what parameters?).
Can expert and easy modes "sniff" files to figure out if they are:
nexus vs newick format trees
.csv or .tsv
Major issues include:
several programs reserve ".tre" for trees and ".phy" for phylip-format data files
Arbor expert and easy modes need to be able to read nexus files to obtain a tree and the associated traits
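Sniffing by content rather than by extension could be sketched as follows in base R. Both helper names are hypothetical and the rules are deliberately simple: a Nexus file starts with `#NEXUS`, a Newick string starts with an opening parenthesis, and the delimiter is whichever of tab or comma is more frequent on the first line.

```r
# Hypothetical helpers; both look only at the start of the file's text
sniff_tree_format <- function(lines) {
  first <- trimws(lines[nzchar(trimws(lines))][1])   # first non-blank line
  if (is.na(first)) return("unknown")
  if (grepl("^#NEXUS", first, ignore.case = TRUE)) {
    "nexus"
  } else if (startsWith(first, "(")) {
    "newick"
  } else {
    "unknown"
  }
}

sniff_delimiter <- function(lines) {
  first <- lines[1]
  count <- function(ch) {
    lengths(regmatches(first, gregexpr(ch, first, fixed = TRUE)))
  }
  n_tab <- count("\t")
  n_comma <- count(",")
  if (n_tab > n_comma) "tsv" else if (n_comma > 0) "csv" else "unknown"
}

sniff_tree_format(c("#NEXUS", "begin trees;"))   # "nexus"
sniff_tree_format("((a:1,b:1):1,c:2);")          # "newick"
sniff_delimiter("species\tx\ty")                 # "tsv"
sniff_delimiter("species,x,y")                   # "csv"
```

In practice the input would come from something like `readLines(path, n = 5)`, where `path` is the uploaded file.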
Histograms of simulations versus actual data
Users requested the ability to integrate with their iPlant accounts.
Hosting a version of aRbor at iPlant/atmosphere could allow:
We should check for and prune out empty columns and rows
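A base R sketch of that pruning step, treating a cell as empty when it is NA or a blank string. `drop_empty` is a hypothetical name and the table is made up.

```r
# Hypothetical helper: drop rows and columns that contain no data at all
drop_empty <- function(dat) {
  empty <- function(v) all(is.na(v) | trimws(as.character(v)) == "")
  keep_cols <- !vapply(dat, empty, logical(1))
  dat <- dat[, keep_cols, drop = FALSE]
  keep_rows <- !apply(dat, 1, empty)
  dat[keep_rows, , drop = FALSE]
}

dat <- data.frame(species = c("sp_a", "sp_b", ""),
                  x       = c(1.5, 2, NA),
                  empty   = c(NA, NA, NA),
                  stringsAsFactors = FALSE)
cleaned <- drop_empty(dat)
dim(cleaned)   # 2 rows, 2 columns
```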
Phylogenetic signal (Arbor) - csv=HeliContRare.csv (heliconia) Tree=HeliContRare.phy (make.treedata phy (2) (heliconia)
Error in eval(expr, envir, enclos) :
could not find function "detectCharacterType"
make.treedata - If there are NAs in the csv, the output of make.treedata does not work in some functions. This may be because a taxon with data for only some characters is not pruned from the tree, so when you select a column to run the analysis on, there is missing data and you get an error. If you remove all rows with NAs from the character table and then run make.treedata, the tree is pruned to match and you can use the output in analyses.