pyubero / microguilds
Pipeline for the standard quantification of microbial guilds from metagenomic samples.
License: GNU General Public License v3.0
It would be nice to be able to order the appearance of the implementations or clusters in the radial plots.
It would be nice if every implementation of several functions could be visualized at the same time. Things to consider:
Different clusters sometimes represent different genes, so regressions need to be gene- and context-specific.
It would be really good to compare distinct guild_tensors in the same plot.
Some concerns:
i) the k-values of different guilds can differ by orders of magnitude. We should build a smart visualization!
ii) the resulting plot can be hard to interpret. We should keep things simple.
It would be nice to add an argument for visualizing normalized abundances instead of k-values.
When trying to run a master_tab with only GTDB taxonomic assignments, guild_tensor_generate.py displays this error:
File "/Users/juanrivassantisteban/Desktop/uGuilds-main/guild_tensors/guild_tensor_utils.py", line 132, in check_mastertable
idc_null = np.argwhere(df['Species_SQM'].isnull().values)[:, 0]
File "/Users/juanrivassantisteban/miniconda3/lib/python3.10/site-packages/pandas/core/frame.py", line 3807, in __getitem__
indexer = self.columns.get_loc(key)
File "/Users/juanrivassantisteban/miniconda3/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3804, in get_loc
raise KeyError(key) from err
KeyError: 'Species_SQM'
And if you add an empty column called "Species_SQM", the error changes to this one:
File "/Users/juanrivassantisteban/Desktop/uGuilds-main/guild_tensors/guild_tensor_utils.py", line 144, in check_mastertable
df.at[jj, "Genus_SQM"] = "GTDB:"+df.loc[jj, "Genus_GTDB"]
TypeError: can only concatenate str (not "float") to str
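Both tracebacks are consistent with a master table that has no SQM columns and has missing (NaN) GTDB entries, since NaN is a float and cannot be concatenated to a string. A minimal sketch of a defensive fill, assuming the column naming seen in the tracebacks (`Genus_SQM`, `Genus_GTDB`); the helper `fill_from_gtdb` and the `"Unclassified"` placeholder are hypothetical and not part of the repository:

```python
import numpy as np
import pandas as pd


def fill_from_gtdb(df, level="Genus", placeholder="Unclassified"):
    """Hypothetical helper: fill missing <level>_SQM values from <level>_GTDB."""
    sqm, gtdb = f"{level}_SQM", f"{level}_GTDB"
    out = df.copy()
    # Create the SQM column when the table only carries GTDB assignments.
    if sqm not in out.columns:
        out[sqm] = np.nan
    out[sqm] = out[sqm].astype(object)
    # Replace NaN in the GTDB column before concatenating, so that
    # "GTDB:" + value never mixes str with float.
    gtdb_filled = out[gtdb].fillna(placeholder).astype(str)
    missing = out[sqm].isnull()
    out.loc[missing, sqm] = "GTDB:" + gtdb_filled[missing]
    return out


# Toy table with GTDB assignments only, one of them missing.
table = pd.DataFrame({"Genus_GTDB": ["Pelagibacter", np.nan]})
fixed = fill_from_gtdb(table)
print(fixed["Genus_SQM"].tolist())
```

The same call could be repeated for other taxonomic levels, for example `fill_from_gtdb(table, level="Species")`, provided the corresponding `_GTDB` column exists.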
Sometimes the number of colors shown in the plot does not match the colors in the legend.
It would be interesting to compare the sum of abundance values with the k-values. To do that in contexts with very different numbers of samples, we should add another column with Abundance / n samples.
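The per-sample column could be computed along these lines. This is only a sketch: the column names `Context` and `Abundance`, the helper `add_abundance_per_sample`, and the sample counts are assumptions for illustration, not the pipeline's actual schema.

```python
import pandas as pd


def add_abundance_per_sample(df, n_samples):
    """Hypothetical helper: divide abundance by the number of samples per context."""
    out = df.copy()
    # Map each row's context to its sample count, then divide row-wise.
    out["Abundance_per_sample"] = out["Abundance"] / out["Context"].map(n_samples)
    return out


# Toy k-value table: two contexts with very different sampling effort.
kvalues = pd.DataFrame({"Context": ["z1", "z2"], "Abundance": [10.0, 30.0]})
result = add_abundance_per_sample(kvalues, {"z1": 2, "z2": 10})
print(result)
```

A context missing from the `n_samples` mapping yields NaN rather than an error, which makes unsampled contexts easy to spot.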
If anyone wants to screen guilds in reduced contexts, that is, when not every row of the master_tab.tsv is assigned to a context for visualization or k-value generation, it would be useful not to get this error in modify_mastertable.py:
ValueError: Length of values (2375) does not match length of index (91159)
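This kind of ValueError is what pandas raises when a list shorter than the table is assigned as a new column. One possible workaround, sketched under the assumption that contexts are derived from a per-row sample label (the column names `sample` and `Context` and the mapping are hypothetical), is to build the column with an index-aligned operation and then drop the unclassified rows:

```python
import pandas as pd

# Toy master table: only some samples belong to a context of interest.
master = pd.DataFrame({
    "sample": ["s1", "s2", "s3", "s4"],
    "gene": ["potF", "potF", "potF", "potF"],
})
sample_to_context = {"s1": "z1", "s3": "z2"}  # s2 and s4 are left unclassified

# Series.map returns one value per row (NaN where no context applies),
# so the new column always has the same length as the table.
master["Context"] = master["sample"].map(sample_to_context)

# Keep only the rows that were actually assigned to a context.
reduced = master.dropna(subset=["Context"])
print(reduced)
```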
I rewrote the modify_mastertable.py module to generate a master_tab with 4 contexts (instead of depth and latitude). I'm getting this error in guild_tensor_generate.py:
Found 4 contexts in gene subtable.
z1
z2
z3
z4
100%|██████████| 2064/2064 [00:01<00:00, 1746.92it/s]
Bivariate loglog regression results:
gamma = 0.6510109031387529
c = 0.2594626316691924
R2 = 0.8597118038702496
Gene: potF with R2=0.86
Data saved in kvalues_potF_Species_GTDB.tsv.
0%| | 0/2064 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/Users/juanrivassantisteban/miniconda3/lib/python3.10/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
exec(code, globals, locals)
File "/Users/juanrivassantisteban/Desktop/uGuilds-main/guild_tensors/guild_tensor_generate.py", line 129, in
gtutils.export_legacy(adu_table, _filepath, column="Diversity")
File "/Users/juanrivassantisteban/Desktop/uGuilds-main/guild_tensors/guild_tensor_utils.py", line 91, in export_legacy
assert sum(idx) == 1
AssertionError
A major update could consist of a new module for computing and displaying several suitable statistical tests for compositional analysis.
We need to compute diversity as the number of unique identifiers (rows) with a value over 0 in the abundance column for each position in the k-tensor. Otherwise, diversity will always be the same for every taxonomic level, regardless of the context (while abundance, of course, will not).
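The counting rule described above could look roughly like this. It is a sketch, not the pipeline's code: the helper `diversity_by_taxon_and_context` and the column names `ORF_ID`, `Genus`, `Context`, and `Abundance` are assumptions chosen for illustration.

```python
import pandas as pd


def diversity_by_taxon_and_context(df, taxon_col, context_col="Context",
                                   id_col="ORF_ID", abundance_col="Abundance"):
    """Hypothetical helper: count unique identifiers with abundance > 0."""
    # Rows with zero abundance are absent from that context and must not count.
    present = df[df[abundance_col] > 0]
    return present.groupby([taxon_col, context_col])[id_col].nunique()


# Toy table: identifier "b" has zero abundance and should be ignored.
table = pd.DataFrame({
    "ORF_ID": ["a", "b", "c", "d", "e"],
    "Genus": ["G1", "G1", "G1", "G2", "G1"],
    "Context": ["z1", "z1", "z2", "z1", "z1"],
    "Abundance": [3.0, 0.0, 1.5, 2.0, 4.0],
})
diversity = diversity_by_taxon_and_context(table, "Genus")
print(diversity)
```

Filtering before grouping is what makes the count context-dependent: without the `> 0` filter, every taxon would report the same number of identifiers in every context.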
guild_tensor_generate.py