wanglab-broad / clustermap

ClusterMap for multi-scale clustering analysis of spatial gene expression

License: GNU General Public License v3.0

Python 0.08% Jupyter Notebook 99.92%

in-situ-sequencing spatial-transcriptomics single-cell-analysis spatial-omics bioinformatics cell-segmentation transcriptomics

clustermap's People

morganwu1998 eleozzr lmh12580 shaobo-bio cmzuo11 shachafl kang-bioinfo dhtc xintangg pvtodorov mengyuanchen-utokyo

clustermap's Issues

missing dependencies

In the environment.yml, the following pip dependencies are missing:

fastdist
open-python
scanpy

The images from my spatial transcriptomics data don't fill the entire image, so when I split the image into smaller tiles I get an error that there are no cells (see the error message below). Is there a way to skip empty tiles instead of terminating the for loop?

ValueError Traceback (most recent call last)
File :15, in

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/python/ClusterMap/ClusterMap/clustermap.py:87, in ClusterMap.preprocess(self, dapi_grid_interval, LOF, contamination, pct_filter)
86 def preprocess(self,dapi_grid_interval=5, LOF=False, contamination=0.1, pct_filter=0.1):
---> 87 preprocessing_data(self.spots, dapi_grid_interval, self.dapi_binary, LOF,contamination, self.xy_radius,pct_filter)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/python/ClusterMap/ClusterMap/preprocessing.py:136, in preprocessing_data(spots, dapi_grid_interval, dapi_binary, LOF, contamination, xy_radius, pct_filter)
134 #compute neighbors within radius for local density
135 knn = NearestNeighbors(radius=xy_radius)
--> 136 knn.fit(all_points)
137 spots_array = np.array(spots.loc[:, ['spot_location_2', 'spot_location_1']])
138 neigh_dist, neigh_array = knn.radius_neighbors(spots_array)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/neighbors/_unsupervised.py:166, in NearestNeighbors.fit(self, X, y)
149 def fit(self, X, y=None):
150 """Fit the nearest neighbors estimator from the training dataset.
151
152 Parameters
(...)
164 The fitted nearest neighbors estimator.
165 """
--> 166 return self._fit(X)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/neighbors/_base.py:435, in NeighborsBase._fit(self, X, y)
433 else:
434 if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
--> 435 X = self._validate_data(X, accept_sparse="csr")
437 self._check_algorithm_metric()
438 if self.metric_params is None:

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/base.py:561, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
559 raise ValueError("Validation should be done on X, y or both.")
560 elif not no_val_X and no_val_y:
--> 561 X = check_array(X, **check_params)
562 out = X
563 elif no_val_X and not no_val_y:

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/utils/validation.py:797, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
795 n_samples = _num_samples(array)
796 if n_samples < ensure_min_samples:
--> 797 raise ValueError(
798 "Found array with %d sample(s) (shape=%s) while a"
799 " minimum of %d is required%s."
800 % (n_samples, array.shape, ensure_min_samples, context)
801 )
803 if ensure_min_features > 0 and array.ndim == 2:
804 n_features = array.shape[1]

ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required.
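One way to keep the tile loop running is to skip tiles that have no spots and to catch this specific ValueError for tiles where the array reaching knn.fit turns out to be empty. The sketch below is a generic wrapper, not part of ClusterMap: the names run_over_tiles and process_tile are hypothetical, and process_tile stands for whatever per-tile calls (for example preprocess) the loop currently makes.

```python
def run_over_tiles(tiles, process_tile):
    """Apply process_tile to every tile, skipping empty ones.

    tiles: mapping of tile id -> spots for that tile (any sized container,
           e.g. a pandas DataFrame). Hypothetical layout for illustration.
    process_tile: callable that runs the per-tile pipeline.
    Returns (results, skipped_tile_ids).
    """
    results, skipped = {}, []
    for tile_id, spots in tiles.items():
        # Cheap check first: a tile without any spots cannot contain cells.
        if len(spots) == 0:
            skipped.append(tile_id)
            continue
        try:
            results[tile_id] = process_tile(spots)
        except ValueError as err:
            # Only swallow the "empty array" error seen in the traceback;
            # any other ValueError is re-raised unchanged.
            if "0 sample(s)" not in str(err):
                raise
            skipped.append(tile_id)
    return results, skipped
```

Matching on the message text is brittle by design here: it avoids hiding unrelated ValueErrors, at the cost of depending on scikit-learn's current wording.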

Failed to install

Hi authors,

I tried to use pip install git+https://github.com/LiuLab-Bioelectronics-Harvard/ClusterMap.git to install, but it showed "Command errored out with exit status 128".

I changed "git+https" to "git+git". This time it could be downloaded successfully, but the site-packages folder only contained "ClusterMap-0.0.1.dist-info", which cannot be imported.

I finally tried to use git clone and used the command python setup.py install to install. But it showed "warning: install_lib:'build/lib' does not exist - no Python modules to install".

What should I do to successfully install ClusterMap?

Detailed explanation about plot_with_dapi parameter

Hi @heihaizhengdong, I'd like to know what plot_with_dapi means in most clustermap functions. My understanding is that if it is set to False, the spots overlapping with DAPI will be discarded. Is that true?

ClusterMap_STARmap_V1_1020_BY1.ipynb file corrupted

This notebook is not rendered on GitHub, and it cannot be opened on a local computer. I hope the authors can fix or update it. Thanks a lot!

Questions and Enhancements about ClusterMap

Hello @heihaizhengdong, I’m very interested in ClusterMap. I’d like to ask several questions that I think are important for improving ClusterMap.

Questions

Is the position of the centroid determined by DAPI image? Is the centroid position of a cell equal to the DAPI position of a cell? What is the function of DAPI image in cell segmentation?
What’s the unit of xy_radius and z_radius? μm? How can one know the area of an individual cell? Can I extract this information from the AnnData produced by ClusterMap pipeline?
In general, I processed my data using the ‘split-and-stitch’ pipeline. Is a spot in an individual tile image equal to a fluorescent signal? If so, why do spots form a cluster-like pattern even if they are not so crowded in the scatter distribution plot? Is it a matter of plotting, which means the spots will be clearly separated after I decrease the size of a spot when visualizing?
How can I improve the accuracy of cell segmentation? A tactic I commonly use to downsize a cell is to decrease thresh and decrease min_spot_per_cell.
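On the question about the area of an individual cell, one rough proxy is the area of the convex hull around the spots assigned to each cell. The sketch below assumes 2D spot coordinates and one integer cell label per spot, with negative labels meaning "unassigned"; those conventions and the pixel_size parameter are assumptions for illustration, not documented ClusterMap behavior. The result is in squared pixels unless a physical pixel size is supplied.

```python
import numpy as np
from scipy.spatial import ConvexHull


def hull_areas(xy, labels, pixel_size=1.0):
    """Convex-hull area per cell from spot coordinates.

    xy: (n_spots, 2) array of spot positions in pixels.
    labels: (n_spots,) integer cell label per spot; negative values are
            treated as unassigned (an assumption of this sketch).
    pixel_size: physical size of one pixel (e.g. in micrometers).
    """
    xy = np.asarray(xy, dtype=float)
    labels = np.asarray(labels)
    areas = {}
    for cell in np.unique(labels):
        if cell < 0:
            continue
        pts = xy[labels == cell]
        # A hull needs at least 3 points that are not all on one line.
        if len(pts) < 3 or np.linalg.matrix_rank(pts - pts[0]) < 2:
            continue
        # For 2D input, ConvexHull.volume is the enclosed area.
        areas[int(cell)] = ConvexHull(pts).volume * pixel_size ** 2
    return areas
```

A hull over detected spots tends to underestimate the true cell footprint, so it is best read as a lower bound.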

Enhancements

Explain some important elements in model.cell_adata so users can easily interoperate it with other tools, such as scanpy. In addition, there is an error when I try to export the ClusterMap AnnData to an .h5ad file using adata.write, which says TypeError: 0 of type <class 'int'> is an invalid key. Should be str. Above error raised while writing key 'var' of <class 'h5py._hl.files.File'> from /.. I tried to resolve this with adata = ad.AnnData(model.cell_adata.X, obs=model.cell_adata.obs,var=model.cell_adata.var) but it still does not work. Any recommendations?
How can I change the color of a convex hull so that it matches the unsupervised clustering results from the scanpy pipeline?
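Regarding the adata.write error in the first enhancement item: a likely cause is that the var (or obs) table has integer column names or index labels, which cannot be used as string keys in the HDF5 file. Rebuilding the AnnData from the same obs and var would carry those labels over unchanged. A minimal sketch, using only pandas and a hypothetical helper name, converts the labels to strings first:

```python
import pandas as pd


def with_string_labels(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose column names and index labels are strings."""
    out = df.copy()
    out.columns = out.columns.astype(str)
    out.index = out.index.astype(str)
    return out


# Possible usage with an AnnData object named adata (not run here):
#   adata.var = with_string_labels(adata.var)
#   adata.obs = with_string_labels(adata.obs)
#   adata.write("cells.h5ad")   # file name is a placeholder
```

If the write still fails, the error message usually names the offending key, which helps narrow down which table or column needs converting.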

In short, ClusterMap performs well in analyzing spatial transcriptomics data. Thanks again for your excellent work. I would appreciate it if we could discuss more details about ClusterMap.

Could you please share the expert-annotated labels in STARmap datasets?

Dear,

Thanks so much for your effort in developing such a great technology and tool. I would like to reproduce your results. Could you please share the expert-annotated labels of the two 1020 STARmap datasets with me?

Thanks so much for your kind help!

Best,
zuo

Export cell segmentation plot

Hi @heihaizhengdong, I'd like to know how to export the cell segmentation plot to a local .tiff file after stitching. plt.savefig() does not seem to work because the figure produced by model.plot_segmentation is of NoneType.
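A plotting function that returns None can still leave its drawing on matplotlib's current figure, which can be fetched with plt.gcf() and saved in the same cell or script. The sketch below shows that pattern with a hypothetical helper; it only works if the plotting function does not close or show-and-discard the figure itself, which is an assumption about model.plot_segmentation rather than something confirmed here.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit inside a notebook
import matplotlib.pyplot as plt


def save_current_figure(plot_func, path, dpi=300):
    """Call plot_func (which may return None) and save the current figure.

    The output format is inferred from the file extension of path,
    e.g. ".png" or ".tiff".
    """
    plot_func()
    fig = plt.gcf()  # the figure plot_func drew on, if it is still open
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)
    return path


# Possible usage (not run here; "segmentation.tiff" is a placeholder name):
#   save_current_figure(model.plot_segmentation, "segmentation.tiff")
```

In a notebook with the inline backend, the save has to happen in the same cell as the plotting call, because figures are closed when the cell finishes.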

Vignette Enhancement

Hi @heihaizhengdong, thanks for your excellent work! I'd like to use clustermap to analyze my ISS data. However, I'm a complete beginner in Python. Could you provide a more detailed vignette covering the module installation and a test run using test data? Thanks!

optimization of segmentation

Hi,
We are using very large images, and I noticed that the slowest part seems to be the segmentation.
I also noticed that only 1/2 cores are being utilized throughout the pipeline.

Is there any chance to optimize or parallelize either this part or the loop over tiles?

thanks
Shahaf
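If tiles are processed independently, the loop over tiles can in principle be spread across cores with the standard library. The sketch below is a generic pattern, not a ClusterMap feature: map_tiles and worker are hypothetical names, and it assumes that each tile's inputs and results can be pickled and that worker is a top-level function, which process-based pools require.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def map_tiles(worker, tiles, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Run worker(tile) for every tile in parallel, preserving input order.

    worker: top-level function that processes one tile and returns a result.
    tiles: iterable of per-tile inputs.
    executor_cls: pool type; a thread pool can be swapped in when the
                  heavy work already releases the GIL.
    """
    tiles = list(tiles)
    results = {}
    with executor_cls(max_workers=max_workers) as pool:
        # Remember which position each future belongs to.
        futures = {pool.submit(worker, tile): i for i, tile in enumerate(tiles)}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()  # re-raises worker errors
    return [results[i] for i in range(len(tiles))]
```

Memory use grows with the number of workers, since each process holds its own tile, so max_workers may need to stay below the core count for very large images.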

Export clustermap object

Hi @heihaizhengdong, how can I export the clustermap object produced by clustermap pipeline? Can I export it as a pickle file? Any other recommendations?
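For reference, a generic pickle round trip looks like the sketch below. Whether the ClusterMap object pickles cleanly depends on its attributes and is not confirmed here; the helper names are hypothetical. Pickle files should only be loaded from trusted sources and may not load across different package versions.

```python
import pickle
from pathlib import Path


def save_object(obj, path):
    """Serialize obj to path with pickle and return the path."""
    path = Path(path)
    with path.open("wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_object(path):
    """Load an object previously written by save_object."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Possible usage (not run here; the file name is a placeholder):
#   save_object(model, "clustermap_model.pkl")
#   model = load_object("clustermap_model.pkl")
```

For long-term storage, exporting the tabular parts (for example the spots table and the AnnData) to standard formats is usually more robust than pickling the whole object.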
