russel88 / miceco
Various functions for analysis of microbial community data
License: GNU General Public License v3.0
Hey,
Thanks for the fantastic package, and function ps_venn.
I have been unable to change the font size of the numbers and/or labels. Could you please help?
Best wishes.
Hi Russel,
Thank you for the package.
As the title suggests, the R squared value from neutral.fit differs from those of other packages, such as iCAMP's snm and reltools' fit_sncm, when tested on the same dataset.
fit_sncm(t(data.frame(otu_table(OR16S_rare)))) # R2=0.4450697 m=0.1035867
fit_sncm(t(data.frame(otu_table(VA16S_rare)))) # R2=0.3859323 m=0.0897298
snm(t(data.frame(otu_table(OR16S_rare)))) # R2=0.4450697 m=0.1035854
snm(t(data.frame(otu_table(VA16S_rare)))) # R2=0.3859323 m=0.08973541
neutral.fit(t(data.frame(otu_table(OR16S_rare)))) # R2=0.7925008 m=0.1035908
neutral.fit(t(data.frame(otu_table(VA16S_rare)))) # R2=0.8073008 m=0.08970018
The R squared is about 0.3 to 0.4 more than the other two. I am new to modeling. Could you give me some insight on the differences?
Thank you,
Xiaoping
Hi,
I'm trying to install the MicEco package using R Studio (version "Ghost Orchid" Release (077589bc, 2021-09-20) for macOS) and this command:
githubinstall("MicEco")
(I installed Git beforehand.)
But I got the following error message :
Downloading GitHub repo Russel88/MicEco@master
Error: Failed to install 'MicEco' from GitHub:
Command failed (1)
In addition: Warning message:
In system(full, intern = TRUE, ignore.stderr = quiet) : running command ''/usr/bin/git' ls-remote https://git.bioconductor.org/packages/phyloseq RELEASE_3_14 2>/dev/null' had status 1
Do you have any idea how I can solve the problem?
Thank you!
Hello!
I'm using two different functions, ps_venn and plot(venn(list_core)), and I'm obtaining two different results. I don't understand why this is happening, as the dataset is the same for both diagrams. I think it has something to do with the parameters.
For the Diagram 1, using eulerr, microbiome and microbiomeutilities packages, the script is:
for (n in situation_list) {
  ps.sub <- subset_samples(ps.rel, Situation == n)
  core_m <- core_members(ps.sub,
                         detection = 0.001,
                         prevalence = 0.50)
  print(paste0(n, length(core_m)))
  list_core[[n]] <- core_m
}
mycols <- c(Wild, Captive)
venn <- plot(venn(list_core),
             fills = mycols)
venn
For the Diagram 2, using MicEco package, the script is:
venn2 <- ps_venn(ps3_dna,
"Situation",
fraction = 0.5,
weight = FALSE,
relative = TRUE,
plot = TRUE, #
)
venn2
Could you, please, help me understand that? Do you have any instructions on how to perform this analysis properly?
Thanks in advance!
Hi Russel!
I am using ps_venn, which I find super useful. My only question: I am trying to draw a diagram for 10 variables, but I get an error stating that the maximum number of groups allowed is 5. Is there a workaround for this? I understand the final plot would look really messy, but I would like to see how it turns out.
Thanks!!
Hi,
I'm trying to install the MicEco package using the command lines in R (version 4.0.4):
install.packages("remotes")
remotes::install_github("Russel88/MicEco")
But I got the following error message:
Error: Failed to install 'MicEco' from GitHub:
Git does not seem to be installed on your system
Do you have any idea what the problem might be?
Thank you for your time!
Hello!
I am having this error message (Error in do.call(c, singles[-x]) : second argument must be a list) when trying to print the list using ps_venn. Here follows the code:
venn_list <- ps_venn(ps3_dna.t,
"Situation",
fraction = 0.9,
weight = FALSE,
relative = TRUE,
plot = FALSE,
)
venn_list
I am using an agglomerated (genus level) phyloseq object.
Any advice on that?
Thanks in advance!
Hi team
Thanks for making things easier to create a Venn diagram directly on phyloseq object.
An area-proportional Venn diagram (also called a Venn diagram by area) is preferable because it lets readers grasp the shared and unique portions quickly.
Would it be possible to implement this in your wonderful package?
Cheers
M
Hi, this is my command and output:
rarefy_rrna.matrix(EQM_spec_Z1_TaxID_sorted_by_Abundance_descending, 1000, copy.database = "v13.5", seed = NULL,
trim = FALSE)
Remember to set seed! Now set to 1562117525.63036
Error in rep(rrna.rev, times = x[i, ]) : invalid 'times' argument
I'm not sure about the usage of 'times' here.
Hello,
I am using the neutral.fit function and I was surprised by the gRsqr results I was getting: they were not sustaining the visual impression I had of the fit. I tried to calculate the generalized R squared using the formula provided in Burns et al. (2015, ISME J 10(3):655-664) and I got very different results that fitted what I expected.
In Burns et al., R2 is calculated with: R2 = 1 - SSerr/SStotal
with SSerr the sum of squares of residuals and SStotal the total sum of squares
In neutral.fit, gRsqr is calculated with: R2 = 1 - exp(-as.numeric(logLik(m.mle))/length(p))
with p the number of observations and logLik(m.mle) the log likelihood of the model predicted by mle2
Do you have an insight on the difference between the two calculations?
Thanks,
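For readers comparing the two expressions quoted above, here is a minimal base R sketch. The observed and predicted frequencies are made-up toy values, and the log-likelihood comes from a simple normal error model, which is an assumption of this sketch and may not match what neutral.fit or mle2 use internally. It only illustrates that the two expressions, as written in the post, are on different scales and need not agree.

```r
# Toy data (hypothetical): observed occurrence frequencies and model predictions.
obs  <- c(0.05, 0.10, 0.20, 0.40, 0.60, 0.80, 0.95)
pred <- c(0.08, 0.12, 0.25, 0.35, 0.55, 0.85, 0.90)

# Sum-of-squares form quoted from Burns et al.: R2 = 1 - SSerr / SStotal
ss_err   <- sum((obs - pred)^2)
ss_total <- sum((obs - mean(obs))^2)
r2_ss    <- 1 - ss_err / ss_total

# Likelihood-based expression as quoted in the post: 1 - exp(-logLik / n).
# Assumption: normal errors with the maximum-likelihood estimate of sigma.
n      <- length(obs)
sigma  <- sqrt(ss_err / n)
loglik <- sum(dnorm(obs, mean = pred, sd = sigma, log = TRUE))
r2_lik <- 1 - exp(-loglik / n)

print(c(sum_of_squares = r2_ss, likelihood_based = r2_lik))
```

Note that the likelihood-based expression, as quoted, has no null-model term, so it is not directly comparable to a sum-of-squares R2 even on identical data.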
Hi Russel, I have been playing with your package and really enjoy it!
Based on the same dataset, the Venn and Euler diagrams look different and I don't understand why. For example in the Euler plot, what happened to the 23 OTUs shared between all forest types in the Venn plot? Am I missing something here?
Thanks!
ps_venn(
data,
group = "ForestType",
quantities = list(type=c("counts")),
plot = TRUE
)
ps_euler(
data,
group = "ForestType",
quantities = list(type=c("counts")),
plot = TRUE
)
Hello,
Thank you for this useful package!
Would there be a way to combine both the ASV/OTU count and the relative abundance they represent using weight = T in the plotted Venn diagram ?
Have a great day,
Simon
When I run my code for ps_venn:
ps_venn (ps.nonc.nocyano, SampleType,
fraction = 0.1,
weight = FALSE,
type = "percent",
relative = TRUE,
plot = TRUE)
SampleType is my grouping variable.
Here is my sample data:
Sample Data: [357 samples by 2 sample variables]:
SampleType Sample2
1001SH CFH 1001SH
1001WH CFH 1001WH
1001WS CS 1001WS
I got this error:
Error in paste("value ~ Var1 +", group) : object 'SampleType' not found
The same happens with other functions, such as ps_euler and ps_pheatmap.
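The error message suggests the grouping variable is pasted into a formula, so a bare name would be looked up as an R object. The base R sketch below reproduces that behaviour with a hypothetical stand-in function (build_formula is not part of MicEco); it is only meant to illustrate the difference between a quoted and an unquoted column name.

```r
# Hypothetical stand-in that builds a formula from `group`,
# mirroring the paste() call shown in the error message.
build_formula <- function(group) {
  as.formula(paste("value ~ Var1 +", group))
}

# Passing the column name as a character string works.
f_ok <- build_formula("SampleType")
print(f_ok)

# Passing a bare name makes R look for an object called SampleType,
# which does not exist, so the call fails.
res <- tryCatch(build_formula(SampleType),
                error = function(e) conditionMessage(e))
print(res)
```

If this is the cause, quoting the variable in the call, as in ps_venn(ps.nonc.nocyano, group = "SampleType", ...), would be the first thing to try.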
Hi MicEco :)
I am using ps_venn function, but the percentage of each group does not add up to 100%. I used this code before taking the number of samples into account:
ps_venn(
ps.prop,
group = "Ecosites", type = "Pond"
)
And this is the diagram:
And then I added the fraction:
fractions <- c(Dugout = 0.2, Upland = 0.2, Lowland = 0.6)
ps_venn(
ps.prop,
group = "Ecosites", type = "Pond", relative = TRUE, quantities = list(type = c("percent", "counts")), fraction = fractions
)
And got this:
But the sum of the unique and common percentages does not add up. Any idea what went wrong?
Thank you,
Z.
Hi,
I'm trying to use the community_rrna function but I have trained my taxa from the Silva database because Greengenes hasn't been updated. The function says I can use an OTU-table with taxa as rows and OTU names as rownames and a dataframe with two variables: "ID" is the OTU id matched by rownames in x and "Copy" is the copy number. However, that doesn't include any sample information. Shouldn't the OTU table have OTUs as rownames and samples as column names and the frequency of each OTU in each sample?
If this doesn't work, can you post the fasta file you used from Greengenes to assign taxonomy in your phyloseq?
Thanks!
Hello
I get this error:
Error in drawVennDiagram(data = x, small = small, showSetLogicLabel = showSetLogicLabel, : gplots.drawVennDiagram: This internal function is used wrongly. Please call the function 'venn' with the same arguments, instead.
What could this mean? ps_euler works fine and I am able to plot.
Thanks in advance.
Hi Russel88,
I like your Venn diagram display. However, I find it difficult to extract the unique and shared taxa from the console output when I set plot to FALSE. I would therefore love to see an add-on feature to export the list, or to convert it into a data frame. Thank you
I'm trying to produce a Venn diagram of ASV overlap between two samples. I have my phyloseq object with my sample_data(ps). I'm trying to group by a categorical variable in my sample data (either spec or environ) and keep getting this error:
ps_venn(stdc2, group = 'Type', type = 'counts')
Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate
After checking my phyloseq object, it looks normal:
phyloseq-class experiment-level object
otu_table() OTU Table: [ 87079 taxa and 62 samples ]
sample_data() Sample Data: [ 62 samples by 7 sample variables ]
tax_table() Taxonomy Table: [ 87079 taxa by 7 taxonomic ranks ]
Note: my sample_names(ps) are only numbers. Is this error because my sample names are not the grouping variable?
It was easy to use ps_venn.
I was wondering if it is possible to move the labels of the circles onto their edges?
Also, can I make the percentages appear only in the intersections between circles and not in every circle (see the screenshot from the first figure I shared)?
Here is my code:
ps_venn(ps.prev,
        group = "treatment",
        quantities = list(type=c("percent","counts"), font = 2),
        labels = list(cex = 1))
Hi, I'm finding the ps_euler command really useful, however I was wondering if there is a way to display the quantities as percentages of each group, rather than across all the groups. I would like to see what proportion of each group is unique vs shared.
Thanks!
I've attached my distance matrix, strata vector, and phylogenetic distance matrix.
I use the following code to load and run them:
library(ape)
library(picante)
library(MicEco)
library(doSNOW)
tsv.data <- read.delim("../otu_data/clustered_sequences/test_abundances.txt", row.names=1)
phydf <- read.delim("../otu_data/clustered_sequences/test_distmat.txt", row.names=1)
strata_df <- read.delim("../otu_data/clustered_sequences/test_strata.txt", row.names=1)
strat_vec <- unname(unlist(strata_df[,'CollectionAgency']))
strata_ <- as.integer(as.factor(strat_vec)[drop = TRUE])
phydist = as.matrix(phydf)
mntd_scores <- ses.comdistnt2(tsv.data, phydist, method = "quasiswap", strata =strata_ , abundance.weighted = TRUE, runs = 5, cores=1)
The error I get is:
Error in match.comm.dist(comm, dis) :
Community data set lacks taxa (column) names, these are required to match distance matrix and community data
I do not get this error when strata = NULL
Hello,
First of all, thank you for coding this package, which I use a lot to speed up ses.mpd calculations on a multi-core machine.
I have a question regarding the number of reads to sample when using rarefy_rrna. I think that choosing the number of reads is not as straightforward as with a simpler rarefaction tool. Something like min(sample_sums(physeq)) does not seem completely correct, because we are correcting and estimating the number of reads based on the fact that some organisms have multiple copies of the rRNA operon.
Do you have any suggestion or recommendation on how to choose the number?
Thank you,
Eduardo
Hi there,
First off, this package is great, thanks for making it. I was wondering, though, if there is a way to show fewer decimal places in the relative abundance Venn diagrams. Currently it shows 7 decimal places, which is a little overwhelming to look at.
Let me know if this is possible,
Thanks!
Hi Russel
How do I cite the MicEco package in my paper?
Hesham
Hi, I'm loving the ps_venn and ps_euler commands. They make great images and are super easy to use, but is there a way to also output the list of shared and not-shared ASVs or OTUs? I'd like to be able to cross-reference the data with some other things I'm doing. I've looked over the readme files, but I'm not very savvy with code, so I couldn't find an apparent way to do this easily.
Thanks!
Alicia Reigel
Hello,
Thank you for the great tool. I am preparing some Venn diagrams based on different fractions (0.3, 0.5 and 0.7). I expected that OTUs present at fraction 0.7 would also be present at the lower fraction settings (0.3 and 0.5), but that is not always the case.
Am I missing something?
Thank you,
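One possible explanation can be sketched in base R. The prevalence values below are hypothetical, and the rule that a taxon belongs to a group when its prevalence is at least the fraction is an assumption of this sketch rather than a confirmed description of ps_venn. Under that rule a taxon never drops out of a group when the fraction is lowered, but it can move to a different region of the diagram because it now also qualifies for another group.

```r
# Hypothetical prevalence (share of samples containing the taxon) per group.
prev <- rbind(
  otu1 = c(A = 0.9, B = 0.4),
  otu2 = c(A = 0.8, B = 0.8),
  otu3 = c(A = 0.2, B = 0.6)
)

# Assign each taxon to a Venn region for a given fraction threshold.
# Assumption: a taxon belongs to a group if its prevalence there is >= fraction.
venn_region <- function(prev, fraction) {
  member <- prev >= fraction
  apply(member, 1, function(m) {
    if (!any(m)) "none" else paste(names(m)[m], collapse = "&")
  })
}

regions <- sapply(c(0.3, 0.5, 0.7), function(f) venn_region(prev, f))
colnames(regions) <- c("f=0.3", "f=0.5", "f=0.7")
print(regions)
```

In this toy case otu1 sits in the A-only region at fractions 0.5 and 0.7 but moves to the shared region at 0.3, so comparing the same region across fractions can make a taxon appear to vanish.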