
baraslab / spatialumap


SpatialUMAP - UMAP applied to spatially derived feature vectors from segmented single cell imaging data.

Python 100.00%
umap spatial-analysis single-cell

spatialumap's Introduction

SpatialUMAP

SpatialUMAP.py

Using cell coordinates and cell labels, a table of densities across a defined grid of distances, stratified by cell label, is first created from the data. These densities describe the local microenvironment of a cell in terms of the composition of the cellular milieu at different distances. By default, the system examines 5 concentric spatial bounds (0-25, 26-50, 51-100, 101-150, 151-200 µm) around each individual cell. The total number of features extracted is therefore 5 times the number of unique cell labels (in these data: Tumor, CD163+, CD8+, FoxP3+), i.e. 20 features per cell. This feature vector of densities is then used as input for UMAP: the embedding function is fit to over 200,000 cells and subsequently applied to over 200,000 different cells for visualization and analytics.
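The density table described above can be illustrated with a minimal sketch. This is not the code in SpatialUMAP.py: the function name `annulus_densities`, the band edges written as a single list of radii, and the choice to divide neighbor counts by annulus area are all assumptions made for illustration. It uses a brute-force pairwise distance matrix, so it is only suitable for small inputs.

```python
import numpy as np

def annulus_densities(xy, labels, edges=(0, 25, 50, 100, 150, 200)):
    """Hypothetical helper: per-cell neighbor densities by distance band and label.

    Returns an array of shape (n_cells, n_bands * n_labels) and the sorted
    unique labels. Column order is band-major: band 0 for every label,
    then band 1 for every label, and so on.
    """
    xy = np.asarray(xy, dtype=float)
    labels = np.asarray(labels)
    edges = np.asarray(edges, dtype=float)
    unique_labels = np.unique(labels)
    n_bands = len(edges) - 1

    # Area of each concentric annulus (assumed normalization).
    areas = np.pi * (edges[1:] ** 2 - edges[:-1] ** 2)

    # Brute-force pairwise distances; O(n^2) memory, illustration only.
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor

    features = np.zeros((len(xy), n_bands, len(unique_labels)))
    for b in range(n_bands):
        # Half-open band (inner, outer]; cells at distance exactly 0 are skipped.
        in_band = (dist > edges[b]) & (dist <= edges[b + 1])
        for k, lab in enumerate(unique_labels):
            counts = (in_band & (labels == lab)[None, :]).sum(axis=1)
            features[:, b, k] = counts / areas[b]
    return features.reshape(len(xy), -1), unique_labels

# Toy input: three cells, two labels, so 5 bands * 2 labels = 10 features.
toy_xy = [[0, 0], [10, 0], [0, 60]]
toy_labels = ["Tumor", "CD8+", "Tumor"]
toy_features, toy_label_order = annulus_densities(toy_xy, toy_labels)
```

With the four labels named above, the same function would return 20 columns per cell. Assuming the umap-learn package is used for the embedding, such a feature matrix could be passed to `umap.UMAP().fit(...)` for one set of cells and to `.transform(...)` for a different set.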

PlottingTool.py

Conventional 2D density plotting is performed using np.histogram2d and Gaussian smoothing. We also allow weighting by a scalar value for each cell, such as the mean fluorescence intensity (MFI) measured for that cell for markers like PD-L1 and PD-1. Note that these data are not part of the feature vector used as input for UMAP, described above.
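A minimal sketch of this kind of weighted 2D density follows. It is not the code in PlottingTool.py: the names `density_2d`, `smooth2d`, and `gaussian_kernel1d`, the `extent` parameter, and the default `bins` and `sigma` values are hypothetical. The smoothing is written as a separable NumPy convolution to keep the example dependency-free; it assumes the kernel is shorter than each image axis and zero-pads at the borders.

```python
import numpy as np

def gaussian_kernel1d(sigma, truncate=3.0):
    """Normalized 1D Gaussian kernel, truncated at `truncate` standard deviations."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def smooth2d(img, sigma):
    """Separable Gaussian smoothing: convolve along axis 0, then axis 1."""
    kernel = gaussian_kernel1d(sigma)
    conv = lambda v: np.convolve(v, kernel, mode="same")
    out = np.apply_along_axis(conv, 0, img)
    return np.apply_along_axis(conv, 1, out)

def density_2d(x, y, weights=None, bins=100, sigma=1.0, extent=None):
    """Hypothetical helper: binned (optionally weighted) density, then smoothed.

    `weights` holds one scalar per cell (for example an MFI value); each bin
    then contains the sum of the weights of the cells that fall into it.
    """
    hist, xedges, yedges = np.histogram2d(
        x, y, bins=bins, range=extent, weights=weights
    )
    return smooth2d(hist, sigma), xedges, yedges

# Toy usage: random 2D embedding coordinates with a per-cell scalar weight.
rng = np.random.default_rng(0)
emb_x, emb_y = rng.normal(size=500), rng.normal(size=500)
per_cell_value = rng.uniform(0.0, 1.0, size=500)
density, xedges, yedges = density_2d(emb_x, emb_y, weights=per_cell_value, bins=50)
```

Summing the weights per bin is one possible design choice; dividing the weighted histogram by the unweighted one would instead give a per-bin mean. In practice `scipy.ndimage.gaussian_filter` is a common replacement for the hand-written smoothing step.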

spatialumap's People

Contributors

alexbaras


spatialumap's Issues

Question for the input file format.

Dear author,
I have read your paper, and it's great work. What is the input file format (cell_datafile & clinical_datafile)? Could you share a toy example of the two .csv files? Thank you very much!

Parameter n in Train/Test split step

Hello there! I have a question regarding an analysis step seen at line 42 of generate_umap.py. At this step the code performs a train/test split of the data before performing the UMAP. I noticed two things about this function.

  1. The value of n applied to the function is 2500. From my understanding of the train/test split this will find 2500 useable cells for the training sample and an additional 2500 useable cells for the testing sample. Does this value have significance for the UMAP calculation? Why was this number chosen over another? Are there decisions/pitfalls to consider when choosing this value of n?
  2. I noticed you used a train/test split of 50/50. Could you share how you came to choose these split values? Is there something specific to your data that made you choose this split? Again, are there decisions/pitfalls to consider when choosing the percentage split inherent to these sorts of experiments?

Thank you for your time and for sharing your thoughts
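The reading of the split given in the question above, n cells for training and a disjoint n cells for testing, can be sketched as follows. This only illustrates that interpretation and is not the code in generate_umap.py: the function name `split_train_test`, its arguments, and the use of a single random permutation are assumptions.

```python
import numpy as np

def split_train_test(n_cells, n, seed=0):
    """Hypothetical helper: draw two disjoint index sets of size n each.

    With equal sizes for both sets, the sampled cells are split 50/50
    between training and testing.
    """
    if 2 * n > n_cells:
        raise ValueError("need at least 2 * n cells to draw disjoint sets")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_cells)  # shuffled indices 0..n_cells-1
    return perm[:n], perm[n:2 * n]

# Toy usage with n = 2500, the value mentioned in the question.
train_idx, test_idx = split_train_test(n_cells=10_000, n=2500)
```

Under this reading, n bounds how many cells are used to fit the embedding and how many are held out to be projected with the fitted embedding.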
