bmhyb's People

Contributors

bomeara

Forkers

djhwueng

bmhyb's Issues

Redo sims

Make sure the sim code is right for m!=0.5 (some oddness with the order)

Reload the new BMhyb code to the cluster and run.

Fix wrong model

From Tony:

After considering a more general case (please see the attached figure), we can probably use the following code, if you agree:

V.modified[recipient.index, recipient.index] <-
  (V.original[recipient.index, recipient.index] - sigma.sq*flow$time.from.root.recipient[flow.index]) + # this equals sigma.sq*t3 = sigma.sq*(t1+t2+t3) - sigma.sq*(t1+t2)
  (flow$m[flow.index]^2 + (1 - flow$m[flow.index])^2) * (flow$time.from.root.recipient[flow.index]) + # this is m^2*var(A) + (1-m)^2*var(C) = m^2*(t1+t2) + (1-m)^2*(t1+t2)
  2*m*(1-m)*V.original[recipient.index, donor.index] + # this is 2*m*(1-m)*cov(A,C) = 2*m*(1-m)*t1, which is equal to 2*m*(1-m)*cov(X,Y)
  vh

[Attached figure: 20151102_184315]

Convert m to gamma

The standard term in the field is gamma, not m, for fraction of inheritance from one ancestor.

Time from donor

Hi @bomeara and @djhwueng,

I have been trying to test your function GetVModified on the example network you show on your preprint, but I ran into a problem.

I used the following function to create this network (with t1, t2 and t3 as in the preprint):

create_paper_network <- function(gamma, t1, t2, t3){
    phy <- read.tree(text = paste0("((R:", t3, ",Y:", t3, "):", t1 + t2, ",X:", t1 + t2 + t3, ");"))
    network <- list(phy = phy,
                    flow = data.frame(donor = "X",
                                      recipient = "R",
                                      gamma = gamma,
                                      time.from.root.donor = t1,
                                      time.from.root.recipient = t1 + t2))
    network$flow$donor <- as.character(network$flow$donor)
    network$flow$recipient <- as.character(network$flow$recipient)
    return(network)
}

To plot an example:

gamma <- 0.5
t1 <- 0.3; t2 <- 0.4; t3 <- 0.3; # unit height
network <- create_paper_network(gamma, t1, t2, t3)

PlotNetwork(network$phy, network$flow)
axis(1, at = c(0, t1, t1+t2, t1+t2+t3), labels = c("0", "t1", "t1+t2", "t1+t2+t3"))

Is this network correct? I tried to copy the format of the output of your function SimulateNetwork, but I might have made a mistake.

Using this network, I had a problem computing the induced variance matrix using GetVModified:

sigma2 = 1
x <- c(sigma.sq = sigma2, mu = 0, SE = 0)
actual.params <- c("sigma.sq", "mu", "bt", "vh", "SE")

vcv_BMhyb <- GetVModified(x, network$phy, network$flow, actual.params)

This gave me the following result:

     R   Y    X
R 0.65 0.7 0.35
Y 0.70 1.0 0.00
X 0.35 0.0 1.00

There is a problem here with Cov[Y,R] and Cov[X,R]. Applying the formulas, I get:

Cov[X, R] = sigma^2 * gamma * t1 = 0.5*0.3 = 0.15 \neq 0.35
Cov[Y, R] = sigma^2 * (1-gamma) * (t1 + t2) = 0.35 \neq 0.7
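The expected values above can be checked independently of BMhyb. The following is a minimal sketch, not the package's method: it assumes the donor lineage leading to the hybrid splits from X at time t1 and then evolves independently until t1 + t2, and it writes each tip as a weighted sum of independent Brownian increments on branch segments. The function name and the segment labels a to f are hypothetical.

```r
# Hypothetical helper (not part of BMhyb): expected VCV for the three-taxon
# network, assuming the donor lineage leaves X at t1 and evolves independently
# until the hybridization at t1 + t2.
expected_vcv_paper_network <- function(gamma, t1, t2, t3, sigma2 = 1) {
  # Durations of the independent Brownian segments
  seg_len <- c(a = t1,        # X lineage, root to t1 (shared with the donor)
               b = t2 + t3,   # X lineage, t1 to tip
               c = t2,        # donor lineage, t1 to t1 + t2
               d = t1 + t2,   # ancestor of R and Y, root to t1 + t2
               e = t3,        # R branch
               f = t3)        # Y branch
  # Weight of each segment in each tip value
  W <- rbind(R = c(gamma, 0, gamma, 1 - gamma, 1, 0),
             Y = c(0,     0, 0,     1,         0, 1),
             X = c(1,     1, 0,     0,         0, 0))
  colnames(W) <- names(seg_len)
  sigma2 * W %*% diag(seg_len) %*% t(W)
}

V <- expected_vcv_paper_network(gamma = 0.5, t1 = 0.3, t2 = 0.4, t3 = 0.3)
print(V)
```

Under these assumptions the off-diagonal entries for R reproduce the hand-computed formulas, gamma * t1 for Cov[X, R] and (1 - gamma) * (t1 + t2) for Cov[Y, R].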

I could not explain this discrepancy. Did I misuse your functions? Or are my computations wrong?

One point that is unclear to me is that I could find no reference to the parameter time.from.root.donor (t1) in the code of GetVModified, although it seems essential for the computation of this matrix (but maybe it is hidden in a call to another function, in which case I might have missed it).

Thank you for your help, and for your package!

Session info:

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
[1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C               LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
[5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8    LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
[9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BMhyb_1.5.1 ape_4.1    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12            subplex_1.4-1           msm_1.6.4               mvtnorm_1.0-6          
 [5] lattice_0.20-35         tidyr_0.7.1             corpcor_1.6.9           prettyunits_1.0.2      
 [9] assertthat_0.2.0        digest_0.6.12           foreach_1.4.3           R6_2.2.2               
[13] plyr_1.8.4              phytools_0.6-20         coda_0.19-1             httr_1.3.1             
[17] ggplot2_2.2.1           progress_1.1.2          rlang_0.1.2.9000        uuid_0.1-2             
[21] lazyeval_0.2.0          curl_2.8.1              data.table_1.10.4       taxize_0.9.0           
[25] phangorn_2.2.0          Matrix_1.2-11           RNeXML_2.0.7            combinat_0.0-8         
[29] splines_3.4.2           stringr_1.2.0           igraph_1.1.2            munsell_0.4.3          
[33] compiler_3.4.2          numDeriv_2016.8-1       geiger_2.0.6            pkgconfig_2.0.1        
[37] mnormt_1.5-5            tibble_1.3.4            gridExtra_2.2.1         TreeSim_2.3            
[41] expm_0.999-2            quadprog_1.5-5          codetools_0.2-15        XML_3.98-1.9           
[45] reshape_0.8.7           viridisLite_0.2.0       dplyr_0.7.2             MASS_7.3-47            
[49] crul_0.3.8              grid_3.4.2              nlme_3.1-131            jsonlite_1.5           
[53] gtable_0.2.0            magrittr_1.5            scales_0.5.0            stringi_1.1.5          
[57] reshape2_1.4.2          viridis_0.4.0           bindrcpp_0.2            scatterplot3d_0.3-40   
[61] phylobase_0.8.4         xml2_1.1.1              fastmatch_1.1-0         deSolve_1.20           
[65] iterators_1.0.8         tools_3.4.2             rncl_0.8.2              ade4_1.7-8             
[69] bold_0.5.0              glue_1.1.1              purrr_0.2.3             maps_3.2.0             
[73] plotrix_3.6-6           parallel_3.4.2          survival_2.41-3         colorspace_1.3-2       
[77] bindr_0.1               animation_2.5           clusterGeneration_1.3.4

Variance between hybrid descendants

Hi again, @bomeara and @djhwueng

This might be related to #13.

I tried a slightly more sophisticated network, in which a hybrid has several descendants, here R and Y.

## Underlying tree
gamma <- 0.5  # inheritance fraction, defined here so the snippet runs on its own
t1 <- 0.3; t2 <- 0.4; t3 <- 0.3;
phy <- read.tree(text = paste0("((R:", t3, ",Y:", t3, "):", t1 + t2, ",X:", t1 + t2 + t3, ");"))
## Network
don_recp <- expand.grid(c("X"), c("Y", "R"))
network <- list(phy = phy,
                flow = data.frame(donor = don_recp[,1],
                                  recipient = don_recp[,2],
                                  gamma = rep(gamma, 2),
                                  time.from.root.donor = rep(t1, 2),
                                  time.from.root.recipient = rep(t1, 2)))
network$flow$donor <- as.character(network$flow$donor)
network$flow$recipient <- as.character(network$flow$recipient)
## Plot
PlotNetwork(network$phy, network$flow)
axis(1, at = c(0, t1, t1+t2, t1+t2+t3), labels = c("0", "t1", "t1+t2", "t1+t2+t3"))

I tried to respect your format for the flow matrix, using your description here. Is this network correctly defined?

I then tried to compute the associated variance matrix.

> sigma2 = 1
> x <- c(sigma.sq = sigma2, mu = 0, SE = 0)
> actual.params <- c("sigma.sq", "mu", "bt", "vh", "SE")

> GetVModified(x, network$phy, network$flow, actual.params)
     R    Y    X
R 0.85 0.70 0.15
Y 0.70 0.85 0.15
X 0.15 0.15 1.00

In this matrix, if the network is correctly defined and my computations are right, I think that Cov[R,Y] is not correct. I think it should be:

Cov[Y,R] = sigma^2 * [(gamma^2 + (1-gamma)^2)*t1 + t2] = 0.55 \neq 0.70

What do you think about it? Did I make a mistake somewhere?

I did not dive very deep into your code, but from what I understood of your algorithm, you modify each (recipient, donor) pair one by one (browsing through your flow matrix), but never the (recipient1, recipient2) pairs that arise when a single event has several descendants, as is the case here.
In the example above, the function indeed gives Cov[R,Y]=0.70, which looks like the unmodified covariance one would get from the underlying tree.
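As a cross-check of the expected value 0.55, here is a minimal sketch that does not use BMhyb. It assumes the hybridization happens at time t1 on the stem lineage of (R, Y), so that the hybrid ancestor is gamma times the X lineage plus (1 - gamma) times the stem lineage, and both R and Y inherit that mixture. The function name and segment labels are hypothetical.

```r
# Hypothetical helper (not part of BMhyb): expected VCV when the hybrid
# ancestor, formed at time t1, has two descendants R and Y.
expected_vcv_shared_hybrid <- function(gamma, t1, t2, t3, sigma2 = 1) {
  seg_len <- c(a  = t1,       # X lineage, root to t1
               b  = t2 + t3,  # X lineage, t1 to tip
               d1 = t1,       # stem of (R, Y), root to t1
               d2 = t2,       # hybrid lineage, t1 to t1 + t2
               e  = t3,       # R branch
               f  = t3)       # Y branch
  # R and Y share the mixture gamma * a + (1 - gamma) * d1 and the segment d2
  W <- rbind(R = c(gamma, 0, 1 - gamma, 1, 1, 0),
             Y = c(gamma, 0, 1 - gamma, 1, 0, 1),
             X = c(1,     1, 0,         0, 0, 0))
  colnames(W) <- names(seg_len)
  sigma2 * W %*% diag(seg_len) %*% t(W)
}

V <- expected_vcv_shared_hybrid(gamma = 0.5, t1 = 0.3, t2 = 0.4, t3 = 0.3)
print(V)
```

With these inputs the variances of R and Y and their covariances with X agree with the GetVModified output shown above, while Cov[R, Y] comes out as (gamma^2 + (1 - gamma)^2) * t1 + t2 rather than t1 + t2.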

But it's possible I misunderstood something; please correct me if I'm wrong!

Thanks again !

Session info: identical to the sessionInfo() output listed under "Time from donor" above (R 3.4.2, Ubuntu 16.04.3 LTS, BMhyb_1.5.1, ape_4.1).

Remove metR from dependencies

Hi! I'm the maintainer of the metR package. I'm trying to publish an update to CRAN, and your package, which is listed as a reverse dependency, is failing some tests.
Looking at your code, it seems that you used to import metR::geom_contour_fill(), but those lines are now commented out, so the metR dependency is no longer needed. If this is the case (I might be mistaken), I think you should remove metR from Imports. If you'd like, I can submit a pull request with the change.

Thanks!

ConvertPhyAndFlowToPhygraph failing

data("nicotiana")
p <- BMhyb:::ConvertPhyAndFlowToPhygraph(nicotiana$phy, nicotiana$flow)

This creates an object that looks right, but plotting it causes R to abort. It probably has something to do with the numbering or ordering of the p$edge object.

plot(ConvertEvonetToIgraphWithNodeNumbers(p), vertex.shape="none")

works but the resulting object looks like the flow is all running the wrong way: out of taxon 1, for example.

bad numerical issues

If the final VCV has any eigenvalues < 0 (or just tiny), it is not positive definite. The calculated likelihoods may then have no relation to the truth. And this can happen for realistic values of our parameters. The spline smoothing is supposed to deal with this, but doesn't do it well. We can see if matrices are ultrametric (this does not mean the same as trees being ultrametric), but how do you get a likelihood in a region that is biologically realistic but which isn't appropriate?

We've tried simulating, estimating a var() on the simulated tips to get the VCV, and then use that, but that often has issues as well. We implemented a pseudo-determinant: that also failed. We might do ABC, but it's embarrassing.
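The eigenvalue condition described in this issue can be tested directly before a likelihood is computed. This is a minimal sketch using base R only; the function name and the relative tolerance are assumptions for illustration, not part of BMhyb.

```r
# Hypothetical helper: TRUE if the symmetric matrix V is numerically positive
# definite, i.e. every eigenvalue exceeds a small relative tolerance.
is_positive_definite <- function(V, tol = 1e-8) {
  ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol * max(abs(ev)))
}

# A valid covariance matrix passes the check
ok_vcv <- diag(2)
# This symmetric matrix has eigenvalues 3 and -1, so it fails the check
bad_vcv <- matrix(c(1, 2,
                    2, 1), nrow = 2, byrow = TRUE)

print(is_positive_definite(ok_vcv))
print(is_positive_definite(bad_vcv))
```

A check like this only flags the problem; it does not say what likelihood to report for a parameter region where the matrix fails it.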

Input of enewick?

SNaQ, which is popular for inferring networks, exports the eNewick (extended Newick) format (the one with a hybrid node appearing twice, not the one with multiple trees to represent a network). So perhaps accept this as an input option.
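To illustrate the "hybrid node appearing twice" convention, here is a minimal base-R sketch that finds the hybrid labels in an extended Newick string. The example string is made up for illustration (it is not SNaQ output), and the sketch assumes hybrid nodes are labelled with "#" followed by letters and a number, such as #H1. It does not build a network object.

```r
# Made-up extended Newick string: the hybrid node #H1 appears twice, once
# with its subtree (B) and once as a bare leaf under its other parent.
enewick <- "((A:1,(B:1)#H1:1::0.8):1,(C:1,#H1:1::0.2):1);"

# Extract every hybrid label of the assumed form "#" + letters + digits
hybrid_labels <- regmatches(enewick, gregexpr("#[A-Za-z]+[0-9]+", enewick))[[1]]

# Each hybridization event should be listed exactly twice
hybrid_counts <- table(hybrid_labels)
print(hybrid_counts)
```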

Decompose network into all trees

Do this by messing with flow (setting gamma to all possible combinations of zero and one), then getting the VCV from it (and we can then go back to a tree, as in datelife). Calculate the likelihood on each contained tree, weighting by the probability of that tree being seen (using the product of the gammas for the edges leading to that tree). See if the weighted sum of tree likelihoods is the same as the network likelihood in the positive definite VCV case; if so, use it for cases where the VCV is not positive definite.
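The enumeration step can be sketched as follows. This is only an illustration of the bookkeeping, assuming one gamma per hybridization event and independent events; the function name is hypothetical, and computing the VCV and likelihood for each row is left out.

```r
# Hypothetical helper: list every 0/1 assignment of the hybridization events
# and the probability weight of the corresponding contained tree.
enumerate_contained_trees <- function(gamma) {
  k <- length(gamma)
  # One row per tree: 1 = inherit through the donor edge, 0 = recipient edge
  choices <- as.matrix(expand.grid(rep(list(c(0, 1)), k)))
  colnames(choices) <- paste0("event", seq_len(k))
  # Weight = product of gamma (donor edge) or 1 - gamma (recipient edge)
  weight <- apply(choices, 1, function(ch) prod(ifelse(ch == 1, gamma, 1 - gamma)))
  data.frame(choices, weight = weight)
}

trees <- enumerate_contained_trees(c(0.5, 0.2))
print(trees)
```

Each row could then be turned into a flow table with gamma set to 0 or 1 and passed to the VCV code; the number of rows doubles with every additional event, so this is only practical for a handful of hybridizations.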

Several hybridization events

Hi again, @bomeara and @djhwueng

This might be related to #13 and #14.

I tried a network with several hybridization events:

gamma1 <- 0.5; gamma2 <- 0.5;
## Underlying tree
t1 <- 0.2; t2 <- 0.2; t3 <- 0.2; t4 <- 0.2; t5 <- 0.2;
phy <- read.tree(text = paste0("(((R:",t4+t5,",Y:",t4+t5,"):",t3,",X:",t3+t4+t5,"):",t1+t2,",Z:",t1+t2+t3+t4+t5,");"))
plot(phy)
## Network
don_recp <- rbind(expand.grid(c("Z"), c("Y", "R", "X")), 
                  expand.grid(c("X"), c("R")))
network <- list(phy = phy,
                flow = data.frame(donor = don_recp[,1],
                                  recipient = don_recp[,2],
                                  gamma = c(rep(gamma1, 3), gamma2),
                                  time.from.root.donor = c(rep(t1, 3), t1+t2+t3+t4),
                                  time.from.root.recipient = c(rep(t1, 3), t1+t2+t3+t4)))
network$flow$donor <- as.character(network$flow$donor)
network$flow$recipient <- as.character(network$flow$recipient)
## Plot
PlotNetwork(network$phy, network$flow)
axis(1, at = c(0, t1, t1+t2, t1+t2+t3, t1+t2+t3+t4, t1+t2+t3+t4+t5), 
     labels = c("0", "t1", "t1+t2", "t1+t2+t3", "t1+t2+t3+t4", "t1+t2+t3+t4+t5"))

This gives the following variance matrix:

> sigma2 = 1
> x <- c(sigma.sq = sigma2, mu = 0, SE = 0)
> actual.params <- c("sigma.sq", "mu", "bt", "vh", "SE")

> GetVModified(x, network$phy, network$flow, actual.params)
   R   Y   X   Z
R 0.8 0.6 0.6 0.1
Y 0.6 0.9 0.4 0.1
X 0.6 0.4 0.9 0.1
Z 0.1 0.1 0.1 1.0

I think that the variance of R is not consistent with the model of trait evolution. If my computations are correct, we should have:

Var[R] = (gamma2^2 + (1-gamma2)^2)*((gamma1^2 + (1-gamma1)^2)*t1+t2+t3+t4) 
          + 2*gamma2*(1-gamma2)*((gamma1^2 + (1-gamma1)^2)*t1+t2) + t5     = 0.7 \neq 0.8

(Note that the covariances between R and Y and X might also have problems, see #14).

Browsing through the code, this might be linked to the fact that a new hybridization "erases" an older one in your algorithm. Indeed, all the computations are made using V.original, which does not take ancestral hybrids into account. Here, if there were only one hybridization (the second one), then we would have:

Var[R] = (gamma2^2 + (1-gamma2)^2)*(t1+t2+t3+t4) + 2*gamma2*(1-gamma2)*(t1+t2) + t5 = 0.8

which is the result given by GetVModified.
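Both values of Var[R] can be reproduced without BMhyb. The sketch below assumes the first event mixes the Z lineage into the stem of (R, Y, X) at time t1 and the second mixes the X lineage into R at time t1+t2+t3+t4, and it writes R as a weighted sum of independent Brownian increments. The function name is hypothetical; setting gamma1 = 0 removes the first event.

```r
# Hypothetical helper: expected Var[R] for the four-taxon network with two
# successive hybridization events (Z into the stem, then X into R).
expected_var_R <- function(gamma1, gamma2, t, sigma2 = 1) {
  # Independent segments that contribute to R, with their durations
  seg_len <- c(z1 = t[1],         # Z lineage, root to t1
               s1 = t[1],         # stem of (R, Y, X), root to t1
               s2 = t[2],         # hybrid stem, t1 to t1 + t2
               x1 = t[3] + t[4],  # X lineage up to the second event
               ry = t[3],         # stem of (R, Y)
               r1 = t[4],         # R lineage up to the second event
               r2 = t[5])         # R lineage after the second event
  # Weight of each segment in the value of R
  w <- c(z1 = gamma1, s1 = 1 - gamma1, s2 = 1,
         x1 = gamma2, ry = 1 - gamma2, r1 = 1 - gamma2, r2 = 1)
  sigma2 * sum(w^2 * seg_len)
}

t_all <- rep(0.2, 5)
var_both <- expected_var_R(gamma1 = 0.5, gamma2 = 0.5, t = t_all)
var_second_only <- expected_var_R(gamma1 = 0, gamma2 = 0.5, t = t_all)
print(c(both = var_both, second_only = var_second_only))
```

Under these assumptions the two-event value matches the hand computation above, and dropping the first event gives the value returned by GetVModified.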

I think this is a separate problem from the other two, hence the new issue. Again, I'm sorry if I misused your functions or made mistakes; please correct me if I did.

Thanks !

Session info: identical to the sessionInfo() output listed under "Time from donor" above (R 3.4.2, Ubuntu 16.04.3 LTS, BMhyb_1.5.1, ape_4.1).
