stomics / stereopy
A toolkit of spatial transcriptomic analysis.
License: MIT License
I have extended a couple of functions for saving and reading results and key records from the pipeline as well as the raw expression matrix (https://github.com/nilsmechtel/stereopy). If you are interested in them, feel free to add them to the main branch.
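Analysis areas: data quality control, cell identification, spatial specificity, tissue boundary, cell-cell interaction, gene interaction, cell fate, visualization.
Timeline | Source | Features |
---|---|---|
Completed | Integrated | Summary statistics, filtering, normalization; dimensionality reduction, clustering, find marker gene; static plots |
Completed | In-house | Normalization; direct annotation 1 (RF); gene pattern |
Q3 | Integrated | Pathway enrichment analysis between two regions; static plots (requests welcome) |
Q3 | In-house | Cell bin annotation; interactive visualization (requests welcome); new clustering algorithm (Bai Yong, to be confirmed) |
Q4 | Integrated | |
Q4 | In-house | |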
After creating a conda environment, I get an error when I run setup:
Processing dependencies for stereopy==0.7.0
Searching for gefpy>=0.6.7
Reading https://pypi.org/simple/gefpy/
No local packages or working download links found for gefpy>=0.6.7
error: Could not find suitable distribution for Requirement.parse('gefpy>=0.6.7')
I suppose it could be solved by providing a specific Conda repository where all specified package versions are available.
Hi,
I'm exploring Stereopy with the .h5ad files you have provided in the link here. However, in my Jupyter notebook I get the following error when trying to load stereopy.
Below is what I am loading:
import os
from os.path import join as jn
import warnings
warnings.filterwarnings('ignore')
import stereo as st
Here is the error
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In [7], line 5
2 from os.path import join as jn
3 #import warnings
4 #warnings.filterwarnings('ignore')
----> 5 import stereo as st
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/stereo/__init__.py:10
8 from . import tools
9 from . import utils
---> 10 from . import plots as plt
11 from . import image
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/stereo/plots/__init__.py:13
11 from .marker_genes import marker_genes_text, marker_genes_heatmap
12 from .plot_collection import PlotCollection
---> 13 from .interact_plot.spatial_cluster import interact_spatial_cluster
14 from .interact_plot.interactive_scatter import InteractiveScatter
15 from .interact_plot.poly_selection import PolySelection
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/stereo/plots/interact_plot/spatial_cluster.py:8
1 #!/usr/bin/env python3
2 # coding: utf-8
3 """
4 @author: [email protected]
5 @time:2021/09/06
6 """
----> 8 import holoviews as hv
9 import hvplot.pandas
10 import panel as pn
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/holoviews/__init__.py:12
8 __version__ = str(param.version.Version(fpath=__file__, archive_commit="$Format:%h$",
9 reponame="holoviews"))
11 from . import util # noqa (API import)
---> 12 from .annotators import annotate # noqa (API import)
13 from .core import archive, config # noqa (API import)
14 from .core.boundingregion import BoundingBox # noqa (API import)
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/holoviews/annotators.py:10
6 from inspect import getmro
8 import param
---> 10 from panel.pane import PaneBase
11 from panel.layout import Row, Tabs
12 from panel.util import param_name
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/__init__.py:1
----> 1 from . import layout # noqa
2 from . import links # noqa
3 from . import pane # noqa
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/layout/__init__.py:1
----> 1 from .accordion import Accordion # noqa
2 from .base import Column, ListLike, ListPanel, Panel, Row, WidgetBox # noqa
3 from .card import Card # noqa
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/layout/accordion.py:5
1 import param
3 from bokeh.models import Column as BkColumn, CustomJS
----> 5 from .base import NamedListPanel
6 from .card import Card
9 class Accordion(NamedListPanel):
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/layout/base.py:13
11 from ..io.model import hold
12 from ..io.state import state
---> 13 from ..reactive import Reactive
14 from ..util import param_name, param_reprs
16 _row = namedtuple("row", ["children"])
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/reactive.py:25
22 from param.parameterized import ParameterizedMetaclass
23 from tornado import gen
---> 25 from .config import config
26 from .io.model import hold
27 from .io.notebook import push
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/config.py:20
14 import param
16 from pyviz_comms import (
17 JupyterCommManager as _JupyterCommManager, extension as _pyviz_extension
18 )
---> 20 from .io.notebook import load_notebook
21 from .io.state import state
23 __version__ = str(param.version.Version(
24 fpath=__file__, archive_commit="$Format:%h$", reponame="panel"))
File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/io/__init__.py:9
6 import logging
7 import sys
----> 9 from ..config import config
11 from .callbacks import PeriodicCallback # noqa
12 from .embed import embed_state # noqa
ImportError: cannot import name 'config' from partially initialized module 'panel.config' (most likely due to a circular import) (/home/sama0023/miniconda3/envs/stereopy/lib/python3.8/site-packages/panel/config.py)
Is it because the name config is being used somewhere else? (ref here: https://stackoverflow.com/questions/64807163/importerror-cannot-import-name-from-partially-initialized-module-m)
Many thanks for your reply - looking forward to hearing from you soon.
Shani.
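The traceback above ends inside panel while holoviews is importing it, so one thing worth checking is which versions of these packages are installed side by side. A minimal sketch, assuming the cause is a version mismatch in the environment (an assumption, not confirmed in this thread); the package list is only illustrative and installed_versions is a hypothetical helper:

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Illustrative list: the packages that appear in the traceback.
    names = ["panel", "holoviews", "hvplot", "bokeh", "param", "pyviz_comms"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output along with the traceback makes it easier for maintainers to tell whether the environment matches the versions stereopy was tested with.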
Hi stereopy team!
I have generated cellbin.cgef and I want to convert it to h5ad format, but I got the error: [reader][429][ERROR]: convert to AnnData should have raw data. The code was:
path_high_res = "./data/nom_1d_highres_cellbin.gef"
data_high_res = st.io.read_gef(path_high_res, bin_type='cell_bins')
data_high_res.tl.cal_qc()
ins_high_res = data_high_res.plt.interact_spatial_scatter(width=500, height=500, poly_select=True)
ins_high_res.show()
adata = st.io.stereo_to_anndata(data_high_res,flavor='scanpy',output='nom_1d_highres_cellbin.h5ad')
Exception Traceback (most recent call last)
Cell In[12], line 1
----> 1 adata = st.io.stereo_to_anndata(data_high_res,flavor='scanpy',output='nom_1d_highres_cellbin.h5ad')
File ~/miniconda3/envs/st/lib/python3.8/site-packages/stereo/io/reader.py:430, in stereo_to_anndata(data, flavor, sample_id, reindex, output, split_batches)
428 if data.tl.raw is None:
429 logger.error('convert to AnnData should have raw data')
--> 430 raise Exception
432 exp = data.tl.raw.exp_matrix if issparse(data.tl.raw.exp_matrix) else csr_matrix(data.tl.raw.exp_matrix)
433 cells = data.tl.raw.cells.to_df()
Exception:
Can you check this error, many thanks!
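The traceback shows that stereo_to_anndata raises when data.tl.raw is None. Another report on this page calls data.tl.raw_checkpoint() before normalization, so one possible workaround is to record the raw matrix before converting. A sketch under that assumption; ensure_raw is a hypothetical helper, not part of stereopy:

```python
def ensure_raw(data):
    """Record a raw checkpoint if the pipeline has none yet.

    `data` is expected to expose `data.tl.raw` and `data.tl.raw_checkpoint()`,
    as the StereoExpData objects in the snippets on this page do.
    """
    if data.tl.raw is None:
        # Assumption: raw_checkpoint() stores the current matrix in data.tl.raw.
        data.tl.raw_checkpoint()
    return data


# Intended use (needs stereopy and a real file, so it is left commented out):
# data = ensure_raw(st.io.read_gef(path_high_res, bin_type='cell_bins'))
# adata = st.io.stereo_to_anndata(data, flavor='scanpy', output='out.h5ad')
```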
Dear Sir/Madam,
I cloned Stereopy from https://github.com/BGIResearch/stereopy, then installed the required packages locally (PyCharm). The default installation was gefpy 0.1.0, but the system warned that gefpy>=0.1.1 is required. The installation always fails, whether I install from PyCharm or with pip from the terminal.
Do you have any suggestions for this issue?
Thanks so much!
Hello
Thanks for developing the package. I used read_gef in the io module of stereopy to parse the gef file, and a StereoExpData object was returned, which contains a lot of information. But I don't understand the cell_names of the object, because the gef file doesn't contain the related information. They look like the ids in the mask file. Please advise.
cell_names of StereoExpData:
>>> import stereo as st
>>> dat = st.io.read_gef('./FP200003336_L01_72.raw.gef', bin_size = 1, is_sparse=True)
>>> dat.cell_names
array([ 3590592669555, 37293201051182, 25997437066045, ...,
794568965723, 19477676698069, 9083855849722])
HDF5 "/share/ShareData/Stereo-seq/Mask_file/E11.5_E1S3.barcodeToPos.h5" {
DATASET "/bpMatrix_1" {
DATATYPE H5T_STD_U64LE
DATASPACE SIMPLE { ( 26462, 26462, 1 ) / ( 26462, 26462, 1 ) }
DATA {
(0,0,0): 90894471210542,
(0,1,0): 152917129528969,
(0,2,0): 245238102184754,
(0,3,0): 183680511537661,
(0,4,0): 204000043738565,
(0,5,0): 21505938126074,
(0,6,0): 822201656293554,
(0,7,0): 902627202106568,
(0,8,0): 982822939930488,
(0,9,0): 259155236519271,
(0,10,0): 241356892181875,
(0,11,0): 697765384657892,
(0,12,0): 301210434339907,
(0,13,0): 56003391889515,
(0,14,0): 676778668696025,
(0,15,0): 464497669882956,
(0,16,0): 1099133513140956,
(0,17,0): 1118933043153630,
(0,18,0): 838006647635862,
(0,19,0): 817678488407851,
(0,20,0): 1081564470506869,
(0,21,0): 795831308672738,
(0,22,0): 340919166457846,
(0,23,0): 570096918465330,
(0,24,0): 576679295911424,
(0,25,0): 76328420739723,
(0,26,0): 728368681319792,
(0,27,0): 523708119536022,
(0,28,0): 219845555935278,
(0,29,0): 21202826412319,
(0,30,0): 652047660144813,
(0,31,0): 770467494289177,
(0,32,0): 799251329080094,
(0,33,0): 1107366538726038,
(0,34,0): 1072110141530501,
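One reading of these values, offered as an assumption rather than a confirmed description of the GEF format: each cell name packs the bin's x coordinate into the upper 32 bits and its y coordinate into the lower 32 bits of a 64-bit integer. Decoding the first two cell_names shown above under that assumption gives plausible chip coordinates (decode_cell_name is a hypothetical helper):

```python
def decode_cell_name(cell_name):
    """Split a 64-bit cell name into (x, y), assuming name == (x << 32) | y."""
    x = cell_name >> 32
    y = cell_name & 0xFFFFFFFF
    return x, y


for name in (3590592669555, 37293201051182):
    print(name, "->", decode_cell_name(name))
# 3590592669555 -> (836, 10099)
# 37293201051182 -> (8683, 20014)
```

If the decoded pairs fall inside the x, y range of the chip, the names are encoded positions rather than barcodes from the mask file.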
Quick start notebook is requesting data:
# read the GEF file
mouse_data_path = './stereomics.h5'
But, on the provided link in the notebook, this file is not present. Also, the page is in Chinese, and it is impossible without a translation tool to guess which button is for download.
I want to change the bin size when I read the .gem file. However, changing the 'bin' parameter of read_gem(v0.2.2) didn't work, and read_stereo_data(v0.1) worked. Is it a bug?
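For context, binning groups the bin-1 coordinates of a .gem file into squares of bin_size by bin_size spots and sums the counts per gene. A minimal sketch of that idea in plain Python; the function name and the row layout are illustrative and do not reflect how read_gem is implemented:

```python
from collections import defaultdict


def bin_counts(rows, bin_size):
    """Sum counts per (gene, binned x, binned y).

    rows: iterable of (gene, x, y, count) tuples at bin-1 resolution.
    Each coordinate is snapped to the minimum-coordinate corner of its bin.
    """
    totals = defaultdict(int)
    for gene, x, y, count in rows:
        bx = (x // bin_size) * bin_size
        by = (y // bin_size) * bin_size
        totals[(gene, bx, by)] += count
    return dict(totals)


rows = [("Gene1", 121, 200, 2), ("Gene1", 125, 203, 1), ("Gene2", 234, 300, 1)]
print(bin_counts(rows, bin_size=1))   # every spot stays separate
print(bin_counts(rows, bin_size=50))  # the two Gene1 spots merge into one bin
```

Comparing the number of distinct positions before and after reading with a different bin size is a quick way to tell whether the parameter took effect.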
When I run dyn.tl.leiden(adata, result_key='spatial_leiden_res'), an error is reported:
"Fatal error at src/core/vector.c:483 : Assertion failed: v->stor_begin != NULL"
and the Jupyter kernel appears to have died.
Is there any way to solve it? Thanks.
Hi there,
I just got an error when installing stereopy (version 0.2.4), it seems there is something wrong with the gefpy package, could you please help me out of this?
ERROR: Could not find a version that satisfies the requirement gefpy>=0.1.1 (from stereopy) (from versions: none)
ERROR: No matching distribution found for gefpy>=0.1.1
Hello, I used your cell segmentation model. Can I use my own dataset to train a new model?
I followed the installation instructions on the stereopy documentation page (https://stereopy.readthedocs.io/en/latest/General/Installation.html) by typing the following commands:
conda create --name st python=3.8
conda activate st
pip install stereopy
After installation, trying to use stereopy with the following commands gives an error:
import warnings
warnings.filterwarnings('ignore')
import stereo as st
Any help with this would be appreciated.
I've been trying to follow the cell_segmentation tutorial v0.10.0 using data from MOSTA; however, when I try to run the cell_cut function, I run into the following error:
python3: /workitems/geftools/cgefCellgem.cpp:897: void cgefCellgem::readmask_new(const string&): Assertion `m_rows == cgefParam::GetInstance()->m_max_y - cgefParam::GetInstance()->m_min_y+1' failed.
This is the code I run on a linux computing cluster:
from stereo.tools.cell_cut import CellCut
cgef_out_dir = "./cgef"
bgef_path = "../E12.5_E1S3.bgef"
gem_path = "../E12.5_E1S3_bin1.gem"
mask_path = "./cgef/deep-learning/E12.5_E1S3_mask.tif"
image_path = "../image_spat/E12.5_E1S3.tif"
model_path = "./models/seg_model_20211210.pth"
cc = CellCut(cgef_out_dir=cgef_out_dir)
out_path = cc.cell_cut(gem_path=gem_path, mask_path=mask_path)
# out_path = cc.cell_cut(bgef_path=bgef_path, mask_path=mask_path)
# out_path = cc.cell_cut(bgef_path=bgef_path, image_path=image_path, model_path=model_path)
GEM File Source: https://ftp.cngb.org/pub/SciRAID/stomics/STDS0000058/Bin1_matrix/E12.5_E1S3_GEM_bin1.tsv.gz
Image File Source: https://ftp.cngb.org/pub/SciRAID/stomics/STDS0000058/Image/E12.5_E1S3.tif
I have also tried running the commented-out versions. The mask generation and the conversion from gem to bgef are successful, but I get the same error for the cell binning.
Please check the recent update of the Jinja2 package.
Since version 3.1.0, Markup can no longer be imported from jinja2. You can either update the script or restrict the Jinja2 version in requirement.txt. Otherwise, importing stereo fails.
Another suggestion: I hope the developers can provide a conda environment.yaml for ease of installation. Installing through pip gets messy because of the large number of dependent libraries.
Sincerely,
magcurly
Hi authors,
When I read the information of tissue.gef, I got this error:
data_path = './SS200000385BR_E3.tissue.gef'
st.io.read_gef_info(data_path)
IndexError Traceback (most recent call last)
in
----> 1 st.io.read_gef_info(data_path)
/home/program/conda/anaconda3/envs/stereopy/lib/python3.8/site-packages/stereo/io/reader.py in read_gef_info(file_path)
580 logger.info('Bin size list: {0}'.format(info_dict['bin_list']))
581
--> 582 info_dict['resolution'] = h5_file['geneExp']['bin1']['expression'].attrs['resolution'][0]
583 logger.info('Resolution: {0}'.format(info_dict['resolution']))
584
IndexError: invalid index to scalar variable.
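The failing line indexes attrs['resolution'] with [0], which raises this IndexError when the attribute is stored as a scalar rather than as a one-element array. A defensive sketch of how such a value could be read either way; first_or_scalar is a hypothetical helper, not a stereopy function:

```python
def first_or_scalar(value):
    """Return value[0] for array-like input, or value itself for a scalar."""
    try:
        return value[0]
    except (IndexError, TypeError):
        # NumPy scalars raise IndexError here; plain Python numbers raise TypeError.
        return value


print(first_or_scalar([500]))  # attribute stored as a one-element array
print(first_or_scalar(500))    # attribute stored as a scalar
```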
Hi to the developers,
Thank you very much for the thoughtful .h5ad to Seurat conversion R script. I tried using it on the data I downloaded from the STOMICS database; however, I ran into a problem.
Here is the code I ran and the error (I ran this with R/4.2):
sama0023@arxcss:~/STOMICS$ Rscript annh5ad2rds.R --infile ./Stomics_data/E16.5_E2S3.MOSTA.h5ad --outfile ./Stomics_data/E16.5_E2S3.MOSTA.RDS
Registered S3 method overwritten by 'SeuratDisk':
method from
as.sparse.H5Group Seurat
Attaching SeuratObject
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
[1] "Converting h5ad to h5seurat file."
Warning: Unknown file type: h5ad
Creating h5Seurat file for version 3.1.5.9900
Adding X as data
Adding X as counts
Adding meta.features from var
Adding spatial as cell embeddings for spatial
Warning: Cannot find a reduction named pca (PCs in varm)
Adding annotation_colors to miscellaneous data
Adding layer count as data in assay count
Adding layer count as counts in assay count
[1] "Loading h5seurat file."
Validating h5Seurat file
Initializing Spatial with data
Adding counts for Spatial
Adding feature-level metadata for Spatial
Initializing count with data
Adding counts for count
Adding reduction spatial
Adding cell embeddings for spatial
Adding miscellaneous information for spatial
Adding command information
Adding cell-level metadata
Warning: Invalid name supplied, making object name syntactically valid. New object name is n_genes_by_countslog1p_n_genes_by_countstotal_countslog1p_total_countsannotationRegulon...Acaa1aRegulon...AhrRegulon...Alx1Regulon...Alx4Regulon...Arid3aRegulon...ArntRegulon...Arnt2Regulon...ArxRegulon...Atf2Regulon...Atf3Regulon...Atf4Regulon...Atf6Regulon...Bach2Regulon...Barhl1Regulon...Barhl2Regulon...Bcl6Regulon...Bcl6bRegulon...Bclaf1Regulon...Bhlhe22Regulon...Bhlhe40Regulon...Bhlhe41Regulon...BmycRegulon...Borcs8Regulon...BptfRegulon...Brf1Regulon...CebpaRegulon...CebpbRegulon...CebpdRegulon...CebpgRegulon...CebpzRegulon...Chd1Regulon...CicRegulon...ClockRegulon...Cnot3Regulon...Creb1Regulon...Creb3Regulon...Creb3l1Regulon...Creb3l2Regulon...CrxRegulon...CtcfRegulon...Cux1Regulon...Cux2Regulon...DbpRegulon...Dbx1Regulon...Ddit3Regulon...Ddx4Regulon...Dlx1Regulon...Dlx2Regulon...Dlx3Regulon...Dlx5Regulon...Dlx6Regulon...Dmrt2Regulon...E2f1Regulon...E2f2Regulon...E2f3Regulon...E2f4Regulon...E2f5Regu [... truncated]
Adding miscellaneous information
Adding tool-specific results
Warning message:
Cannot add objects with duplicate keys (offending key: spatial_), setting key to 'spatialla_'
Error in `[.data.frame`([email protected], , c("x", "y")) :
undefined columns selected
Calls: unique -> [ -> [.data.frame
Execution halted
Can you please explain how this can be corrected?
Many thanks in advance,
Shani.
Hi,
In the documentation, the explanation for x and y is:
x, y are the spatial position of the gene in the tissue section.
And the .gem file is described as looking like this:
GeneID | x | y | count |
---|---|---|---|
Gene1 | 121 | 200 | 2 |
Gene2 | 234 | 300 | 1 |
… | … | … | … |
Gene n | 234 | 300 | 1 |
When I load the file SS200000135TL_D1.tissue.gem, I can see the table like this:
geneID | x | y | MIDCount | ExonCount | CellID |
---|---|---|---|---|---|
Camk1d | 7566 | 19777 | 1 | 0 | 56203 |
Gabra1 | 7567 | 19777 | 1 | 1 | 56203 |
The x values of Camk1d and Gabra1 are different (7566 / 7567), but they have the same cell id: 56203. So I have two questions here:
Thank you!
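One way to look at the second table, assuming CellID labels the segmented cell that each spot falls into (an interpretation, not something stated in the quoted documentation), is to group the rows by CellID and list the coordinates each cell covers:

```python
from collections import defaultdict

# The two rows quoted above: (geneID, x, y, MIDCount, ExonCount, CellID).
rows = [
    ("Camk1d", 7566, 19777, 1, 0, 56203),
    ("Gabra1", 7567, 19777, 1, 1, 56203),
]


def spots_per_cell(rows):
    """Map each CellID to the sorted list of distinct (x, y) spots it contains."""
    spots = defaultdict(set)
    for _gene, x, y, _mid, _exon, cell_id in rows:
        spots[cell_id].add((x, y))
    return {cell_id: sorted(points) for cell_id, points in spots.items()}


print(spots_per_cell(rows))
# {56203: [(7566, 19777), (7567, 19777)]}
```

Under this reading, x and y locate a single capture spot, and one cell spans several neighbouring spots.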
Hi stereopy team,
I followed the cell correction tutorial to correct my data, using the code from #96.
But I got this error:
Python 3.8.16 | packaged by conda-forge | (default, Feb 1 2023, 16:01:55)
In [1]: from stereo.tools.cell_correct import cell_correct
...: bgef_path = "./data/C01626E6F6.raw.gef"
...: mask_path = "./data/C01626E6F6_regist.tif"
...: out_dir = "./cell_correct_result"
...: only_save_result = False
...: fast = True
...: data = cell_correct(out_dir=out_dir,
...: bgef_path=bgef_path,
...: mask_path=mask_path,
...: process_count=10,
...: only_save_result=only_save_result,
...: fast=fast)
[2023-02-19 11:03:58][Stereo][328478][140382292883264][time_consume][55][INFO]: start to run cell_correct...
[2023-02-19 11:03:58][Stereo][328478][140382292883264][time_consume][55][INFO]: start to run correcting...
[2023-02-19 11:03:58][Stereo][328478][140382292883264][time_consume][55][INFO]: start to run generate_raw_data...
[2023-02-19 11:03:58][Stereo][328478][140382292883264][cell_correct][99][INFO]: start to generate raw cellbin gef (./cell_correct_result/C01626E6F6.raw.raw.cellbin.gef)
create h5 file: ./cell_correct_result/C01626E6F6.raw.raw.cellbin.gef
minx:0 miny:0 maxx:26459 maxy:44099
genecnt:28263 geneExpcnt:138988822 hashcnt:99139069
readBgef_new - elapsed time: 223833.54450 ms
img row:44100 col:26460
readmask_new - elapsed time: 3465.90027 ms
storeAttr - 0.000116 cpu sec
Segmentation fault (core dumped)
I don't know how to solve it. Thanks!
dim reduce
pca
umap
tsen
low variance
factor analysis
clustering
leiden
louvain
phenograph
cell type annotation
direct annotation using random forest
gene pattern
find pattern
recode sparkx
marker genes
t-test
wilcoxon-test
spatial lag model
enrich
go enrich
pathway activation
RNA velocity
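IO: read and write data files and build the data class; AnnData is used for now.
Preprocessing: quality control, filtering, and normalization of the data class.
Analysis: take the data class and the corresponding parameter settings as input, run the analysis, and write the results back to the data class.
Visualization: visualize the analysis results.
Framework flowchart
1. Modularization
With extensibility in mind, the framework is built around three base classes: a data class, an analysis class, and a result class. The data class is passed into the analysis class, the analysis produces a result class, and the result is written back to the data class or to an output file and then visualized.
2. Base classes
Data base class (StereoData)
To be added in the next release; AnnData is used for now.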
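Class StereoData:
    pass
Class ToolBase(data, method, name=None):
    Parameters:
        data: the data to analyze, AnnData (to become StereoData in the next release)
        method: analysis method, str
        name: analysis name, str, used as the key for storing and retrieving the result
    Attributes:
        self.data
        self.method
        self.name
        self.exp_matrix
    Methods:
        check_param(): check the parameters
        sparse2array(): convert the sparse matrix self.exp_matrix to an np.array
        get_params(var_info): get the parameter names and their values
        add_result(): save the analysis result
Class StereoResult(name='stereo', param=None):
    Parameters:
        name: name, str
        param: parameter information, dict
    Attributes:
        self.name
        self.param
    Methods:
        update_params(v): update the parameters
        __str__(): return the parameter and result information of the class
        __repr__(): print the parameter and result information of the class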
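The data, analysis, and result flow described above can be sketched with plain-Python stand-ins. Everything below is illustrative: the dict-based data object, MeanTool, and the method bodies are assumptions made for the example and are not the stereopy implementation.

```python
class StereoResult:
    """Result base class: keeps a name and the parameters that were used."""

    def __init__(self, name="stereo", param=None):
        self.name = name
        self.param = param if param is not None else {}

    def __str__(self):
        return f"{self.__class__.__name__}(name={self.name!r}, param={self.param})"


class ToolBase:
    """Analysis base class: holds the data, the method name, and a result key."""

    def __init__(self, data, method, name=None):
        self.data = data
        self.method = method
        self.name = name or self.__class__.__name__.lower()
        self.check_param()

    def check_param(self):
        if not isinstance(self.method, str):
            raise ValueError("method must be a string")

    def add_result(self, result, key_added):
        # Write the result back to the data object under the given key.
        self.data.setdefault("results", {})[key_added] = result


class MeanTool(ToolBase):
    """Toy analysis: the mean of each row of the expression matrix."""

    def fit(self):
        result = StereoResult(name=self.name, param={"method": self.method})
        result.values = [sum(row) / len(row) for row in self.data["exp_matrix"]]
        self.add_result(result, key_added=self.name)
        return result


data = {"exp_matrix": [[1, 2, 3], [4, 5, 6]]}
MeanTool(data, method="mean", name="row_mean").fit()
print(data["results"]["row_mean"].values)  # [2.0, 5.0]
```

Keeping the result under a caller-chosen key is what lets several analyses of the same kind coexist on one data object.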
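1 Developing a tool analysis module
Developing a tool analysis module involves two parts: the analysis class and the result class.
1.1 Writing an analysis subclass
1.1.1 The subclass must inherit from the base class and override or implement the relevant methods
1.1.2 The parameters and methods needed by the analysis can be custom-defined
Example: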
from anndata import AnnData

from ..core.tool_base import ToolBase
from ..log_manager import logger
# DimReduceResult, low_variance, u_map and pca are defined elsewhere in the
# package; their imports are not shown in the original example.


class DimReduce(ToolBase):
    def __init__(self, data: AnnData, method='pca', n_pcs=2, min_variance=0.01, n_iter=250,
                 n_neighbors=5, min_dist=0.3, inplace=False, name='dim_reduce'):
        # Collect the parameters
        self.params = self.get_params(locals())
        # Call the parent constructor
        super(DimReduce, self).__init__(data=data, method=method, name=name)
        # Check the parameters
        self.check_param()
        # Attributes specific to this analysis
        self.n_pcs = n_pcs
        self.min_variance = min_variance
        self.n_iter = n_iter
        self.n_neighbors = n_neighbors
        self.min_dist = min_dist
        self.result = DimReduceResult(name=name, param=self.params)

    def check_param(self):
        """
        Check whether the parameters meet the requirements.

        Overrides the base check and constrains the parameters and their
        value ranges for this analysis.
        :return:
        """
        super(DimReduce, self).check_param()
        if self.method.lower() not in ['pca', 'tsen', 'umap', 'factor_analysis', 'low_variance']:
            logger.error(f'{self.method} is out of range, please check.')
            raise ValueError(f'{self.method} is out of range, please check.')

    def fit(self, exp_matrix=None):
        # Analysis logic specific to this tool
        exp_matrix = exp_matrix if exp_matrix is not None else self.exp_matrix
        if self.method == 'low_variance':
            self.result.x_reduce = low_variance(exp_matrix, self.min_variance)
        elif self.method == 'umap':
            self.result.x_reduce = u_map(exp_matrix, self.n_pcs, self.n_neighbors, self.min_dist)
        else:
            pca_res = pca(exp_matrix, self.n_pcs)
            self.result.x_reduce = pca_res['x_pca']
            self.result.variance_ratio = pca_res['variance_ratio']
            self.result.variance_pca = pca_res['variance']
            self.result.pcs = pca_res['pcs']
        # Write the result back to the data class with self.add_result()
        self.add_result(result=self.result, key_added=self.name)
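1.2 Writing a result subclass
The subclass must inherit from the result base class
Define what the result class stores and which methods it implements according to the analysis result
Example: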
from typing import Optional

import pandas as pd

from ..log_manager import logger
# StereoResult is the result base class; its import is not shown in the original example.


class FindMarkerResult(StereoResult):
    def __init__(self, name: str = 'find_marker', param: Optional[dict] = None,
                 degs_data: Optional[pd.DataFrame] = None):
        super(FindMarkerResult, self).__init__(name, param)
        self.degs_data = degs_data

    def __str__(self):
        info = super(FindMarkerResult, self).__str__()
        if self.degs_data is not None:
            info += ' result: a DataFrame which has `genes`,`pvalues`,`pvalues_adj`, `log2fc`, `score` columns.\n'
            info += f' the shape is: {self.degs_data.shape}'
        return info

    def top_k_marker(self, top_k_genes=10, sort_key='pvalues', ascend=False):
        """
        obtain the first k significantly different genes
        :param top_k_genes: the number of top k
        :param sort_key: sort by the column
        :param ascend: the ascend order of sorting.
        :return:
        """
        if self.degs_data is not None:
            top_k_data = self.degs_data.sort_values(by=sort_key, ascending=ascend).head(top_k_genes)
            return top_k_data
        else:
            logger.warning('the result of degs is None, return None.')
            return None
2 Visualization module
3 Coding conventions
Hello
Thanks for developing stereopy. Recently, when I used stereopy for tissue segmentation (function: tissue_extraction_to_bgef), I found that the axolotl data on the official website of Huada spatiotemporal omics (https://db.cngb.org/stomics/artista/download.html) is not aligned with the genetic data. I checked the paper, which mentions that the authors performed the alignment manually. So I would like to know whether there is a way to align the data and complete the tissue segmentation with the stereopy package.
gene scatter plot:
Hello, Stereopy team,
When I use the data_helper.merge function to merge two slices and save them as an anndata file using st.io.stereo_to_anndata(data, flavor='scanpy'......), two files are generated. But I want to get only one merged file.
I am looking forward to your response.
Sincerely
Hi, when I run 'data.tl.leiden(neighbors_res_key='neighbors', res_key='leiden')', there is an error below:
data.tl.filter_cells(min_gene=200, pct_counts_mt=10, min_n_genes_by_counts=3, inplace=True)
data.tl.filter_genes(min_cell=10)
data.tl.raw_checkpoint()
data.tl.normalize_total(target_sum=10000)
data.tl.log1p()
data.tl.highly_variable_genes(min_mean=0.0125, max_mean=3,min_disp=0.5,
n_top_genes=2000, res_key='highly_variable_genes')
data.tl.scale(max_value=10, zero_center=True)
data.tl.pca(use_highly_genes=False, n_pcs=50, res_key='pca')
data.tl.neighbors(pca_res_key='pca', n_pcs=50, res_key='neighbors')
data.tl.umap(pca_res_key='pca', neighbors_res_key='neighbors', res_key='umap')
data.tl.leiden(neighbors_res_key='neighbors', res_key='leiden')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/NAS/lh/software/miniconda3/envs/st/lib/python3.8/site-packages/stereopy-0.9.0-py3.8.egg/stereo/core/st_pipeline.py", line 34, in wrapped
res = func(*args, **kwargs)
File "/NAS/lh/software/miniconda3/envs/st/lib/python3.8/site-packages/stereopy-0.9.0-py3.8.egg/stereo/core/st_pipeline.py", line 590, in leiden
clusters = le(neighbor=neighbor, adjacency=connectivities, directed=directed, resolution=resolution,
File "/NAS/lh/software/miniconda3/envs/st/lib/python3.8/site-packages/stereopy-0.9.0-py3.8.egg/stereo/algorithm/leiden.py", line 89, in leiden
part = leidenalg.find_partition(g, partition_type, **partition_kwargs)
File "/NAS/lh/software/miniconda3/envs/st/lib/python3.8/site-packages/leidenalg/functions.py", line 81, in find_partition
partition = partition_type(graph,
File "/NAS/lh/software/miniconda3/envs/st/lib/python3.8/site-packages/leidenalg/VertexPartition.py", line 840, in __init__
self._partition = _c_leiden._new_RBConfigurationVertexPartition(pygraph_t,
BaseException: Could not construct partition: Weight vector not the same size as the number of edges.
I don't know how to solve it. Thanks!
The stereopy data doesn't seem to be downloadable. I'd like to try the program; is there a way to get it?
core
scatter: the basic func of scatter
heatmap: the basic func of heatmap
qc
plot_spatial_distribution: draw the total_count and n_gene_by_count in the spatial position.
plot_genes_count: draw the relation of total_count with n_gene_by_count and mt_genes_pt.
plot_violin_distribution: draw the violin distribution of total_count, n_gene_by_count and mt_genes_pt.
scatter
plot_scatter: plot the scatter figure of the bins.
dim_reduce
plot_dim_reduce: specify a specific key and draw the distribution of the bin after dimension-reduction
marker_genes
plot_marker_genes: draw the gravel figure of marker gene whose score is in the top few
plot_heatmap_marker_genes: draw the heatmap figure of marker gene whose score is in the top few in cluster groups.
clustering
plot_spatial_cluster: draw the spatial distribution of clusters.
Hi, I am on a mac (2.3 GHz 8-Core Intel Core i9) and I am having a lot of trouble getting stereopy to work. Is there a docker container/conda environment which can be used to install stereopy? While it says it is installed, when I run several tools it says it requires further modules/packages, a lot of these having incompatibilities with the existing modules. Any streamlined installation methods would be greatly appreciated.
Dear authors,
An error occurs when I try 'import stereo' in Python 3.8. Here is the message:
Traceback (most recent call last):
File "", line 1, in <module>
File "D:\Anaconda3\envs\python38\lib\site-packages\stereo\__init__.py", line 6, in <module>
from . import io
File "D:\Anaconda3\envs\python38\lib\site-packages\stereo\io\__init__.py", line 9, in <module>
from .reader import read_gef, read_gem, read_ann_h5ad, read_stereo_h5ad, anndata_to_stereo, stereo_to_anndata
File "D:\Anaconda3\envs\python38\lib\site-packages\stereo\io\reader.py", line 28, in <module>
from shapely.geometry import Point, MultiPoint
File "C:\Users\Administrator\AppData\Roaming\Python\Python38\site-packages\shapely\geometry\__init__.py", line 4, in <module>
from .base import CAP_STYLE, JOIN_STYLE
File "C:\Users\Administrator\AppData\Roaming\Python\Python38\site-packages\shapely\geometry\base.py", line 19, in <module>
from shapely.coords import CoordinateSequence
File "C:\Users\Administrator\AppData\Roaming\Python\Python38\site-packages\shapely\coords.py", line 8, in <module>
from shapely.geos import lgeos
File "C:\Users\Administrator\AppData\Roaming\Python\Python38\site-packages\shapely\geos.py", line 154, in <module>
lgeos = CDLL(os.path.join(sys.prefix, 'Library', 'bin', 'geos_c.dll'))
File "D:\Anaconda3\envs\python38\lib\ctypes\__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'D:\Anaconda3\envs\python38\Library\bin\geos_c.dll' (or one of its dependencies). Try using the full path with constructor syntax.
How can I fix it?
best wishes, thank you.
han
Hi,
So I managed to install Stereopy correctly and load it in my Jupyter notebook. However, when I try to load my .h5ad data, I get the error FileNotFoundError: Please ensure there is a file.
Below is the code I used:
import os
from os.path import join as jn
import warnings
#warnings.filterwarnings('ignore')
import stereo as st
mouse_data = '~/STOMICS/Stomics_data/E16.5_E2S3.MOSTA.h5ad'
mouse_data
data = st.io.read_stereo_h5ad(file_path=mouse_data, use_raw=True, use_result=True)
This is the error in detail:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In [21], line 1
----> 1 data = st.io.read_stereo_h5ad(file_path=mouse_data, use_raw=True, use_result=True)
File ~/miniconda3/envs/st/lib/python3.8/site-packages/stereo/utils/read_write_utils.py:19, in ReadWriteUtils.check_file_exists.<locals>.wrapped(*args, **kwargs)
17 if kwargs.get('file_path'):
18 if not os.path.exists(kwargs.get('file_path')):
---> 19 raise FileNotFoundError("Please ensure there is a file")
20 else:
21 if args:
FileNotFoundError: Please ensure there is a file
This is my Stereopy version:
(st) pip show stereopy
Name: stereopy
Version: 0.6.0
Summary: Spatial transcriptomic analysis in python.
Home-page: https://github.com/BGIResearch/stereopy
Author: BGIResearch
Author-email: [email protected]
License:
Location: /mnt/nectar_volume/home/sama0023/miniconda3/envs/st/lib/python3.8/site-packages
Requires: albumentations, anndata, arboreto, bokeh, colorcet, dask, datashader, distributed, gefpy, glog, gtfparse, h5py, harmonypy, holoviews, hotspotsc, hvplot, igraph, imageio, Jinja2, joblib, KDEpy, keras, leidenalg, loompy, louvain, matplotlib, natsort, numba, numpy, opencv-python, packaging, pandas, panel, param, patsy, phenograph, pillow, protobuf, pyarrow, requests, scikit-image, scikit-learn, scipy, seaborn, setuptools, shapely, slideio, spatialpandas, squidpy, statsmodels, tables, tensorflow, tensorflow-io-gcs-filesystem, tifffile, torch, torchvision, tqdm, typing-extensions, umap-learn, urllib3, xarray
Required-by:
This is my file structure:
sama0023@xxx:~/STOMICS$ tree
.
├── requirements.txt
├── Stomics_data
│ └── E16.5_E2S3.MOSTA.h5ad
└── STOMICS.ipynb
1 directory, 3 files
Please help!
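The traceback shows that the check is a plain os.path.exists() call, and Python does not expand a leading ~ in such a call; only the shell does. Expanding the path first is one thing to try (a sketch; the file path is the one from the report above):

```python
import os

mouse_data = '~/STOMICS/Stomics_data/E16.5_E2S3.MOSTA.h5ad'

# os.path.exists('~/...') looks for a directory literally named '~',
# so expand the home directory before passing the path on.
resolved = os.path.expanduser(mouse_data)
print(resolved, os.path.exists(resolved))

# data = st.io.read_stereo_h5ad(file_path=resolved, use_raw=True, use_result=True)
```

A relative path such as './Stomics_data/E16.5_E2S3.MOSTA.h5ad' would also avoid the issue when the notebook runs from ~/STOMICS.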
Running data.tl.find_marker_genes(cluster_res_key='phenograph', method='t_test', use_highly_genes=False, use_raw=True) resulted in a ZeroDivisionError:
[2023-02-28 11:28:00][Stereo][3139][47732363878080][st_pipeline][35][INFO]: start to run find_marker_genes...
[2023-02-28 11:28:00][Stereo][3139][47732363878080][tool_base][117][INFO]: read group information, grouping by group column.
[2023-02-28 11:28:00][Stereo][3139][47732363878080][tool_base][155][INFO]: start to run...
Find marker gene: 64%|██████████████████████████████████████████████▍ | 67/104 [00:15<00:08, 4.37it/s]
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
Cell In[6], line 1
----> 1 data.tl.find_marker_genes(cluster_res_key='phenograph', method='t_test', use_highly_genes=False, use_raw=True)
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/core/st_pipeline.py:37, in logit.<locals>.wrapped(*args, **kwargs)
35 logger.info('start to run {}...'.format(func.__name__))
36 tk = tc.start()
---> 37 res = func(*args, **kwargs)
38 logger.info('{} end, consume time {:.4f}s.'.format(func.__name__, tc.get_time_consumed(key=tk, restart=False)))
39 return res
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/core/st_pipeline.py:745, in StPipeline.find_marker_genes(self, cluster_res_key, method, case_groups, control_groups, corr_method, use_raw, use_highly_genes, hvg_res_key, res_key, output)
743 data = self.raw if use_raw else self.data
744 data = self.subset_by_hvg(hvg_res_key, use_raw=use_raw, inplace=False) if use_highly_genes else data
--> 745 tool = FindMarker(data=data, groups=self.result[cluster_res_key], method=method, case_groups=case_groups,
746 control_groups=control_groups, corr_method=corr_method, raw_data=self.raw)
747 self.result[res_key] = tool.result
748 if output is not None:
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/tools/find_markers.py:63, in FindMarker.__init__(self, data, groups, method, case_groups, control_groups, corr_method, tie_term, raw_data)
61 self.tie_term = tie_term
62 self.raw_data = raw_data
---> 63 self.fit()
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/core/tool_base.py:156, in ToolBase.fit_log.<locals>.wrapper(*args, **kwargs)
153 @functools.wraps(func)
154 def wrapper(*args, **kwargs):
155 logger.info('start to run...')
--> 156 func(*args, **kwargs)
157 logger.info('end to run.')
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/tools/find_markers.py:159, in FindMarker.fit(self)
157 # self.logger.info('end selelct group')
158 if self.method == 't_test':
--> 159 result = statistics.ttest(g_data, other_data, self.corr_method)
160 elif self.method == 'logreg':
161 if logres_score is None:
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/algorithm/statistics.py:76, in ttest(group, other_group, corr_method)
75 def ttest(group, other_group, corr_method=None):
---> 76 mean_group, var_group = get_mean_var(group)
77 mean_rest, var_rest = get_mean_var(other_group)
78 with np.errstate(invalid="ignore"):
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/functools.py:875, in singledispatch.<locals>.wrapper(*args, **kw)
871 if not args:
872 raise TypeError(f'{funcname} requires at least '
873 '1 positional argument')
--> 875 return dispatch(args[0].__class__)(*args, **kw)
File /stornext/HPCScratch/home/wang.ch/conda_dir/envs/rapids-22.12/lib/python3.8/site-packages/stereo/utils/hvg_utils.py:41, in _(X, axis)
39 var = mean_sq - mean ** 2
40 # enforce R convention (unbiased estimator) for variance
---> 41 var *= X.shape[axis] / (X.shape[axis] - 1)
42 return mean, var
ZeroDivisionError: division by zero
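The failing line rescales the variance by `X.shape[axis] / (X.shape[axis] - 1)`, so the denominator is zero whenever `X.shape[axis]` is 1. That is presumably the case when one of the phenograph clusters (or its comparison group) contains a single cell. The sketch below is not stereopy code; `bessel_factor` and `undersized_groups` are hypothetical helpers, and the labels are made up. It shows the failure mode and a simple way to spot undersized clusters before calling `find_marker_genes`.

```python
from collections import Counter


def bessel_factor(n):
    """Return n / (n - 1), the unbiased-variance correction seen in the traceback."""
    return n / (n - 1)


def undersized_groups(labels, min_cells=2):
    """Return the cluster labels that have fewer than `min_cells` members."""
    counts = Counter(labels)
    return sorted(label for label, n in counts.items() if n < min_cells)


# Hypothetical cluster labels: cluster "3" contains a single cell.
labels = ["1", "1", "2", "2", "2", "3"]
small = undersized_groups(labels)
print("clusters with fewer than 2 cells:", small)

# A group of size 1 reproduces the error from the traceback.
try:
    bessel_factor(1)
except ZeroDivisionError as exc:
    print("single-cell group:", exc)
```

If such clusters show up, merging or dropping them before the t-test is one option to try; whether that suits the analysis is a judgment call.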
When I pip install stereopy
using Python 3.8.10 there are no problems. However, I run into an error when I try the same with Python 3.9.5:
pip install stereopy
...
Attempting uninstall: setuptools
Found existing installation: setuptools 56.0.0
Uninstalling setuptools-56.0.0:
Successfully uninstalled setuptools-56.0.0
Running setup.py install for pynndescent ... error
ERROR: Command errored out with exit status 1:
command: /home/mf/.pyenv/versions/3.9.5/bin/python3.9 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-hkvk21ac/pynndescent_580104e55fcd428a8187c7566ef5b0a0/setup.py'"'"'; __file__='"'"'/tmp/pip-install-hkvk21ac/pynndescent_580104e55fcd428a8187c7566ef5b0a0/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-nsj8ft48/install-record.txt --single-version-externally-managed --compile --install-headers /home/mf/.pyenv/versions/3.9.5/include/python3.9/pynndescent
cwd: /tmp/pip-install-hkvk21ac/pynndescent_580104e55fcd428a8187c7566ef5b0a0/
Complete output (11 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/mf/.pyenv/versions/3.9.5/lib/python3.9/site-packages/setuptools/__init__.py", line 18, in <module>
from setuptools.dist import Distribution, Feature
File "/home/mf/.pyenv/versions/3.9.5/lib/python3.9/site-packages/setuptools/dist.py", line 30, in <module>
from setuptools.depends import Require
File "/home/mf/.pyenv/versions/3.9.5/lib/python3.9/site-packages/setuptools/depends.py", line 7, in <module>
from .py33compat import Bytecode
File "/home/mf/.pyenv/versions/3.9.5/lib/python3.9/site-packages/setuptools/py33compat.py", line 55, in <module>
unescape = getattr(html, 'unescape', html_parser.HTMLParser().unescape)
AttributeError: 'HTMLParser' object has no attribute 'unescape'
----------------------------------------
ERROR: Command errored out with exit status 1: /home/mf/.pyenv/versions/3.9.5/bin/python3.9 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-hkvk21ac/pynndescent_580104e55fcd428a8187c7566ef5b0a0/setup.py'"'"'; __file__='"'"'/tmp/pip-install-hkvk21ac/pynndescent_580104e55fcd428a8187c7566ef5b0a0/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-nsj8ft48/install-record.txt --single-version-externally-managed --compile --install-headers /home/mf/.pyenv/versions/3.9.5/include/python3.9/pynndescent Check the logs for full command output.
A related problem seems to be: coursera-dl/coursera-dl#778
thanks
Mark
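The last frame of the traceback passes `html_parser.HTMLParser().unescape` as the default argument to `getattr`, so that expression is evaluated even when `html.unescape` exists. `HTMLParser.unescape` was removed in Python 3.9, which would explain why the same install succeeds on 3.8.10. The `py33compat` module in the traceback suggests that an older setuptools release ended up in the environment after 56.0.0 was uninstalled, but that is an inference from the log. The sketch below is not setuptools code; `resolve_unescape` is a hypothetical helper that shows the lazy lookup that avoids the error.

```python
import html
import sys
from html.parser import HTMLParser


def resolve_unescape():
    """Prefer html.unescape; touch the legacy method only if html.unescape is missing."""
    func = getattr(html, "unescape", None)
    if func is not None:
        return func
    # Only reached on interpreters old enough to lack html.unescape.
    return HTMLParser().unescape


unescape = resolve_unescape()
print("Python", sys.version_info[:2],
      "HTMLParser has unescape:", hasattr(HTMLParser, "unescape"))
print(unescape("&lt;stereopy&gt; &amp; gefpy"))
```

In practice, it may be worth checking which setuptools version pip installs during the build and whether a newer one can be used with Python 3.9.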
Hi,
Sorry to trouble you!
I have some questions about Cell Correction. I used the following code to generate "SS200000314TL_D1.raw.adjusted.cellbin.gef". I then followed the "Quick Start (Cell Bin)" tutorial, but I got very few cells, and I don't know why.
###########1.Cell Correction
from stereo.tools.cell_correct import cell_correct
bgef_path = "02.count/SS200000314TL_D1.raw.gef"
mask_path = "03.register/SS200000314TL_D1_regist.tif"
out_dir = "8.analysis/cell_correct_result_1"
only_save_result = False
fast = True
data = cell_correct(out_dir=out_dir,
bgef_path=bgef_path,
mask_path=mask_path,
process_count=10,
only_save_result=only_save_result,
fast=fast)
#########2.Quick Start (Cell Bin)
import warnings
warnings.filterwarnings('ignore')
import stereo as st
data_path = ['8.analysis/cell_correct_result_1/SS200000314TL_D1.raw.adjusted.cellbin.gef']
st.io.read_gef_info(data_path)
[2023-02-12 20:46:53][Stereo][149731][139721301030720][reader][762][INFO]: This is GEF file which contains cell bin infomation.
[2023-02-12 20:46:53][Stereo][149731][139721301030720][reader][763][INFO]: bin_type: cell_bins
[2023-02-12 20:46:53][Stereo][149731][139721301030720][reader][769][INFO]: Number of cells: 87
[2023-02-12 20:46:53][Stereo][149731][139721301030720][reader][772][INFO]: Number of gene: 26735
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][775][INFO]: Resolution: 500
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][778][INFO]: offsetX: 3
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][781][INFO]: offsetY: 0
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][784][INFO]: Average number of genes: 1.390804648399353
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][787][INFO]: Maximum number of genes: 28
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][790][INFO]: Average expression: 523.9310302734375
[2023-02-12 20:46:54][Stereo][149731][139721301030720][reader][793][INFO]: Maximum expression: 45450
{'cell_num': 87, 'gene_num': 26735, 'resolution': 500, 'offsetX': 3, 'offsetY': 0, 'averageGeneCount': 1.3908046, 'maxGeneCount': 28, 'averageExpCount': 523.931, 'maxExpCount': 45450}
When I used the GEM file for cell correction instead, I got an error: bin 1 matrix: min_x=3 len_x=65883 min_y=0 len_y=740227421 matrix_len=48768403177743
python3: /workitems/geftools/main_bgef.cpp:231: void gem2gef(BgefOptions*): Assertion `dnb_matrix.pmatrix_us' failed.
Aborted (core dumped)
from stereo.tools.cell_correct import cell_correct
gem_path = "041.cellcut/SS200000314TL_D1.cellbin.gem"
mask_path = "03.register/SS200000314TL_D1_regist.tif"
out_dir = "8.analysis/cell_correct_result_gem_fast"
only_save_result = False
fast = True
data = cell_correct(out_dir=out_dir,
gem_path=gem_path,
mask_path=mask_path,
process_count=10,
only_save_result=only_save_result,
fast=fast)
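The numbers printed just before the failed assertion are internally consistent: `matrix_len` equals `len_x * len_y`. A `len_y` of about 740 million is far larger than `len_x`, so the y column of the GEM may have been parsed from the wrong field, though that is only a guess from the log. The sketch below checks the arithmetic and estimates the size of a dense matrix with that many entries. `dense_size_gib` is a hypothetical helper, and one byte per entry is an assumed lower bound, not the storage type geftools actually uses.

```python
# Values copied from the "bin 1 matrix" log line.
len_x = 65883
len_y = 740227421
reported_matrix_len = 48768403177743


def dense_size_gib(n_entries, bytes_per_entry=1):
    """Size in GiB of a dense matrix with `n_entries` cells (assumed entry width)."""
    return n_entries * bytes_per_entry / 2**30


entries = len_x * len_y
print("len_x * len_y matches matrix_len:", entries == reported_matrix_len)
print(f"dense size at 1 byte per entry: {dense_size_gib(entries):,.0f} GiB")
```

Even at one byte per entry this is tens of tebibytes, so a failed allocation is a plausible reason for the assertion. Checking the coordinate columns of the `.cellbin.gem` file would be a reasonable first step.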
Dear authors,
I am trying to install stereopy using the pip install stereopy command.
Unfortunately, I am facing the following error: " FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt' "
Could you please help me to solve this issue?
Best regards,
Davide Maspero
A group of functions for reading data in different sequencing formats.
Following the example code, I am wondering if it is possible to add a scale bar to the cluster scatter visualisation. Many thanks.
filter
filter bins by total_count, n_genes_by_count, mt_gene_pt and bins_list.
filter genes by gene count and genes_list.
filter bins by position.
normalize
total normalize
quantile normalize
z-score normalize
sc_transform
recipes seurat
qc
total count
n_gene_by_count
mt_gene_pt
Hello, Stereopy team:
How can I visualize a single gene, like Seurat's SpatialFeaturePlot?
Sincerely
aaa
I followed the example and ran the code in Jupyter, but the interactive scatter plot did not appear. How can I solve this? Thanks.
Dear authors,
I am having difficulty installing the stereopy package.
When I run:
pip install stereopy==0.2.4
I get the error:
ERROR: Could not find a version that satisfies the requirement gefpy>=0.1.1 (from stereopy) (from versions: none)
ERROR: No matching distribution found for gefpy>=0.1.1
When I run:
pip install stereopy
I get the error:
FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'
The package also seems to have heavy dependencies, so it is hard to build it directly.
Could you please help me in understanding the problem and in getting around it?
Kind regards,
Arinze.
Hello, I ran the tutorial code for interactively selecting a region of interest, but the interface is not displayed.
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
import stereo as st
import os
os.chdir('C:/Users/llmt/Desktop/jupyter notebook')
path = 'data/demo/SS200000135TL_D1.tissue.gef'
data = st.io.read_gef(path, bin_size=40)
data.tl.cal_qc()
ins = data.plt.interact_spatial_scatter(width=500, height=500, poly_select=False)
ins.show()
Hi,
I have tried to run the Quick Start (Square Bin). But it keeps reporting errors on the following line:
fig = data.plt.umap(gene_names=['Atpif1', 'Tmsb4x'], res_key='umap')
The error message is shown below:
I used the file /home/Jianning/Stereoseq_Demo_Data/SS200000135TL_D1.tissue.gef to run the script. I don't know why it does not return the image shown in the example. I would appreciate your help with this.
With thanks,
Jianning
Hi everyone,
I'm attempting to run stereopy with python=3.9.5 and python=3.8 in separate conda environments. Each installation reports success. However, importing fails in Python with messages such as
'ImportError: Could not import 'Markup' from 'jinja2' (./miniconda3/lib/python3.9/site-packages/jinja2/__init__.py)'
'ImportError: cannot import name 'Markup' from 'jinja2' (./miniconda3/envs/Stereopy_env/lib/python3.8/site-packages/jinja2/__init__.py)'
Someone stated that it can only be used with Python 3.8 or Python 3.7.
#59
However, another post says that Python 3.9 is also supported.
#11
Could you please provide detailed instructions for a successful installation, as well as a test case?
Thank you in advance!
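Both error messages point at Jinja2 rather than at the Python version: `Markup` was removed from the `jinja2` namespace in Jinja2 3.1 and now has to be imported from `markupsafe`. The sketch below is not part of stereopy; `jinja2_exports_markup` and `installed_jinja2_version` are hypothetical helpers, and the version parsing assumes a plain `major.minor[.patch]` string. It reports whether the installed Jinja2 is new enough to trigger the error.

```python
from importlib import metadata


def jinja2_exports_markup(version):
    """Return True if `from jinja2 import Markup` is expected to work for this version."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (3, 1)


def installed_jinja2_version():
    """Return the installed Jinja2 version string, or None if it is not installed."""
    try:
        return metadata.version("jinja2")
    except metadata.PackageNotFoundError:
        return None


version = installed_jinja2_version()
if version is None:
    print("jinja2 is not installed in this environment")
elif jinja2_exports_markup(version):
    print(f"jinja2 {version}: 'from jinja2 import Markup' should work")
else:
    print(f"jinja2 {version}: import Markup from markupsafe instead")
```

If the check reports 3.1 or later, installing an older Jinja2 in the stereopy environment is one thing to try. Whether that pin is compatible with the other dependencies is an assumption that needs checking.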