wanglab-broad / clustermap

ClusterMap for multi-scale clustering analysis of spatial gene expression

License: GNU General Public License v3.0

Python 0.08% Jupyter Notebook 99.92%
in-situ-sequencing spatial-transcriptomics single-cell-analysis spatial-omics bioinformatics cell-segmentation transcriptomics

clustermap's People

Contributors

yichunher

clustermap's Issues

missing dependencies

The environment.yml is missing the following pip dependencies:

  • fastdist
  • open-python
  • scanpy

how to handle empty tiles

Hi,

The images from my spatial transcriptomics data don't fill the entire field of view, so when I split the image into smaller tiles I get an error that there are no cells (see the error message below). Is there a way to skip empty tiles instead of terminating the for loop?

ValueError Traceback (most recent call last)
File :15, in

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/python/ClusterMap/ClusterMap/clustermap.py:87, in ClusterMap.preprocess(self, dapi_grid_interval, LOF, contamination, pct_filter)
86 def preprocess(self,dapi_grid_interval=5, LOF=False, contamination=0.1, pct_filter=0.1):
---> 87 preprocessing_data(self.spots, dapi_grid_interval, self.dapi_binary, LOF,contamination, self.xy_radius,pct_filter)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/python/ClusterMap/ClusterMap/preprocessing.py:136, in preprocessing_data(spots, dapi_grid_interval, dapi_binary, LOF, contamination, xy_radius, pct_filter)
134 #compute neighbors within radius for local density
135 knn = NearestNeighbors(radius=xy_radius)
--> 136 knn.fit(all_points)
137 spots_array = np.array(spots.loc[:, ['spot_location_2', 'spot_location_1']])
138 neigh_dist, neigh_array = knn.radius_neighbors(spots_array)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/neighbors/_unsupervised.py:166, in NearestNeighbors.fit(self, X, y)
149 def fit(self, X, y=None):
150 """Fit the nearest neighbors estimator from the training dataset.
151
152 Parameters
(...)
164 The fitted nearest neighbors estimator.
165 """
--> 166 return self._fit(X)

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/neighbors/_base.py:435, in NeighborsBase._fit(self, X, y)
433 else:
434 if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
--> 435 X = self._validate_data(X, accept_sparse="csr")
437 self._check_algorithm_metric()
438 if self.metric_params is None:

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/base.py:561, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
559 raise ValueError("Validation should be done on X, y or both.")
560 elif not no_val_X and no_val_y:
--> 561 X = check_array(X, **check_params)
562 out = X
563 elif no_val_X and not no_val_y:

File /allen/programs/celltypes/workgroups/rnaseqanalysis/mFISH/michaelkunst/miniconda3/envs/clustermap_hpc/lib/python3.8/site-packages/sklearn/utils/validation.py:797, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
795 n_samples = _num_samples(array)
796 if n_samples < ensure_min_samples:
--> 797 raise ValueError(
798 "Found array with %d sample(s) (shape=%s) while a"
799 " minimum of %d is required%s."
800 % (n_samples, array.shape, ensure_min_samples, context)
801 )
803 if ensure_min_features > 0 and array.ndim == 2:
804 n_features = array.shape[1]

ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required.
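
One way to skip empty tiles rather than abort the loop, as a minimal sketch under stated assumptions: `run_tiles` and `process_tile` are hypothetical names and not part of ClusterMap; `process_tile` stands in for whatever per-tile steps the loop runs. The guard checks the spot count first and, as a fallback, catches only the scikit-learn "0 sample(s)" ValueError shown in the traceback above, so any other error still surfaces.

```python
def run_tiles(tiles, process_tile):
    """Apply process_tile to every tile, skipping tiles that hold no spots.

    tiles: iterable of (tile_id, spots) pairs, where spots supports len().
    process_tile: hypothetical callable standing in for the per-tile steps.
    Returns (results, skipped): a dict of outputs and a list of skipped ids.
    """
    results, skipped = {}, []
    for tile_id, spots in tiles:
        # Cheap guard: nothing to segment if the tile has no spots at all.
        if len(spots) == 0:
            skipped.append(tile_id)
            continue
        try:
            results[tile_id] = process_tile(spots)
        except ValueError as err:
            # Only swallow the "empty array" error; re-raise anything else.
            if "0 sample(s)" not in str(err):
                raise
            skipped.append(tile_id)
    return results, skipped
```

Returning the list of skipped tile ids makes it easy to check afterwards that only genuinely empty tiles were dropped.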

Failed to install

Hi authors,

I tried to install with pip install git+https://github.com/LiuLab-Bioelectronics-Harvard/ClusterMap.git, but it showed "Command errored out with exit status 128".

I then changed "git+https" to "git+git". This time the download succeeded, but the site-packages folder contained only "ClusterMap-0.0.1.dist-info", so the package cannot be imported.

Finally, I cloned the repository with git clone and ran python setup.py install, but it showed "warning: install_lib:'build/lib' does not exist - no Python modules to install".

What should I do to successfully install ClusterMap?

Detailed explanation about plot_with_dapi parameter

Hi @heihaizhengdong, what does plot_with_dapi mean in most ClusterMap functions? My understanding is that if it is set to False, the spots overlapping with DAPI are discarded. Is that correct?

Questions and Enhancements about ClusterMap

Hello @heihaizhengdong, I’m very interested in ClusterMap. I’d like to ask several questions that I think are important for improving ClusterMap.

Questions

  1. Is the position of the centroid determined by the DAPI image? Is the centroid position of a cell equal to its DAPI position? What role does the DAPI image play in cell segmentation?
  2. What are the units of xy_radius and z_radius? μm? How can one find the area of an individual cell? Can I extract this information from the AnnData produced by the ClusterMap pipeline?
  3. I generally process my data with the ‘split-and-stitch’ pipeline. Does each spot in an individual tile image correspond to one fluorescent signal? If so, why do spots form a cluster-like pattern even though they are not crowded in the scatter plot? Is it a plotting artifact, meaning the spots would be clearly separated if I decreased the marker size when visualizing?
    [image]
  4. How can I improve the accuracy of cell segmentation? A tactic I commonly use to make cells smaller is to decrease thresh and decrease min_spot_per_cell.

Enhancements

  1. Please explain the important elements in model.cell_adata so users can easily use it with other tools, such as scanpy. In addition, there is an error when I try to export the ClusterMap AnnData to an .h5ad file using adata.write, which says TypeError: 0 of type <class 'int'> is an invalid key. Should be str. Above error raised while writing key 'var' of <class 'h5py._hl.files.File'> from /.. I tried to resolve this with adata = ad.AnnData(model.cell_adata.X, obs=model.cell_adata.obs,var=model.cell_adata.var) but it still does not work. Any recommendations?
  2. How can I change the color of a convex hull so that it matches the unsupervised clustering results from the scanpy pipeline?
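
On the adata.write error in item 1: one common cause of "is an invalid key. Should be str" is a DataFrame in the AnnData whose column labels are integers (for example a single column named 0). Whether that is the cause here is an assumption; the sketch below only shows how to turn such labels into strings with pandas. `with_string_keys` is a hypothetical helper name, and the gene names are made up for illustration.

```python
import pandas as pd

def with_string_keys(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column labels and index labels are all strings."""
    out = df.copy()
    out.columns = out.columns.astype(str)
    out.index = out.index.astype(str)
    return out

# Example: a var-like table whose only column is labeled with the integer 0.
var = pd.DataFrame(["GeneA", "GeneB"])
fixed = with_string_keys(var)
print(list(fixed.columns))
```

With an AnnData object, the same helper could be applied as adata.var = with_string_keys(adata.var) (and likewise for adata.obs) before calling adata.write.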

In short, ClusterMap performs well in analyzing spatial transcriptomics data. Thanks again for your excellent work. I would appreciate it if we could discuss ClusterMap in more detail.

Export cell segmentation plot

Hi @heihaizhengdong, I'd like to know how to export the cell segmentation plot to a local .tiff file after stitching. plt.savefig() does not seem to work because model.plot_segmentation returns None.
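
A plotting function that returns None can still leave its drawing on matplotlib's current figure. Assuming that is the case here, and that the figure has not been shown and closed yet, the current figure can be fetched with plt.gcf() and saved directly. In the sketch below, `draw_without_return` is a hypothetical stand-in for the plotting call and `save_current_figure` is a hypothetical helper; neither is part of ClusterMap.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so the sketch runs without a display
import matplotlib.pyplot as plt

def draw_without_return():
    """Hypothetical stand-in for a plotting call that returns None."""
    plt.figure(figsize=(4, 4))
    plt.scatter([1, 2, 3], [3, 1, 2], s=5)

def save_current_figure(path, dpi=300):
    """Save whatever pyplot currently treats as the active figure."""
    fig = plt.gcf()
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    return fig
```

Usage would be to call the plotting function and then save_current_figure("segmentation.tiff") in the same notebook cell, before any plt.show(); matplotlib picks the output format from the file extension.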

Vignette Enhancement

Hi @heihaizhengdong, thanks for your excellent work! I'd like to use ClusterMap to analyze my ISS data, but I'm a complete beginner in Python. Could you provide a more detailed vignette covering module installation and a test run on test data? Thanks!

optimization of segmentation

Hi,
We are using very large images, and I noticed that the slowest part seems to be the segmentation.
I also noticed that only 1-2 cores are being utilized throughout the pipeline.

Is there any chance to optimize or parallelize either this part or the loop over tiles?

thanks
Shahaf
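
If the tiles can be processed independently of each other (an assumption, not something the thread confirms), the loop over tiles can be spread across cores with the standard library. The sketch below is generic: `map_tiles` and `process_tile` are hypothetical names, and `process_tile` must be a picklable top-level function when a process pool is used.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def map_tiles(process_tile, tiles, max_workers=None,
              executor_cls=ProcessPoolExecutor):
    """Run process_tile on every tile in parallel, keeping input order.

    executor_cls defaults to a process pool (one Python interpreter per
    worker, suited to CPU-bound work); ThreadPoolExecutor can be passed
    instead when the work releases the GIL or is I/O-bound.
    """
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(process_tile, tiles))
```

Each worker process holds its own copy of the tile data, so memory use grows with max_workers; for very large images it may be worth starting with a small worker count.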

Export clustermap object

Hi @heihaizhengdong, how can I export the ClusterMap object produced by the ClusterMap pipeline? Can I export it as a pickle file? Do you have any other recommendations?
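
For reference, a minimal pickle round trip looks like the sketch below. Whether the ClusterMap object is actually picklable is an assumption that would need to be tested; `save_object` and `load_object` are hypothetical helper names.

```python
import pickle

def save_object(obj, path):
    """Serialize obj to path with the highest available pickle protocol."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)

def load_object(path):
    """Load an object previously written by save_object."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Loading a pickled object requires the defining class to be importable in the loading session, so the same version of the package should be installed there. Pickle files should only be loaded from trusted sources.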
