iancovert / persist
License: MIT License
Hello,
I'm running PERSIST on my scRNA-seq dataset, but I have >100K cells and training is taking quite a while. What can I do to speed up training? For reference, I have 48 GB of GPU memory.
Best,
Chang
Hi!
Really cool package that's easy to use. I've already compared it to some prior selections made with geneBasis and think I'd like to switch over. One thing I'd like to make sure I understand is how the binarization thresholds are set. In the "Expression quantization" section, you recommend a threshold matching approach, where a threshold value of zero is used in the scRNA-seq measurements. You then suggest finding the corresponding quantile in the scRNA-seq data and identifying the matching threshold in the FISH data.
I'm starting from scRNA-seq data and plan to select a 100-gene panel for smFISH. I was wondering whether the following is an appropriate procedure:
All the best,
Petar
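For readers following the threshold matching idea described in this question, here is a minimal sketch of one way to read it, using NumPy only. It is not taken from the PERSIST code base: `matched_thresholds` and `binarize` are hypothetical helper names, the toy matrices are made up for illustration, and it assumes both matrices share the same gene order. For each gene it measures the fraction of scRNA-seq cells at or below the threshold (zero here) and uses the FISH value at that same quantile as the FISH threshold.

```python
import numpy as np


def matched_thresholds(rna, fish, rna_threshold=0.0):
    """Per-gene FISH thresholds matching the quantile of `rna_threshold` in scRNA-seq.

    rna:  (cells_rna, genes) expression matrix
    fish: (cells_fish, genes) expression matrix with the same gene order (assumption)
    """
    rna = np.asarray(rna, dtype=float)
    fish = np.asarray(fish, dtype=float)
    # Fraction of scRNA-seq cells at or below the threshold, one value per gene.
    quantiles = (rna <= rna_threshold).mean(axis=0)
    # FISH value sitting at that same quantile, computed gene by gene.
    return np.array(
        [np.quantile(fish[:, g], quantiles[g]) for g in range(fish.shape[1])]
    )


def binarize(x, thresholds):
    """Mark a gene as 'on' where expression is strictly above its threshold."""
    return (np.asarray(x, dtype=float) > thresholds).astype(np.float32)


# Toy data: 4 scRNA-seq cells and 5 FISH cells, 2 genes each (illustrative only).
rna = np.array([[0, 1], [0, 2], [3, 0], [5, 4]])
fish = np.array([[10, 1], [20, 2], [30, 3], [40, 4], [50, 5]])

thresholds = matched_thresholds(rna, fish)
fish_bin = binarize(fish, thresholds)
rna_bin = binarize(rna, 0.0)
```

With real data the "on" fractions of `rna_bin` and `fish_bin` should be roughly similar per gene, which is a quick sanity check on the chosen thresholds.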
I assume that the title might be enough.
In your example 01_persist_supervised.ipynb you have this statement:
# Initialize the dataset for PERSIST
# Note: Here, data_train.layers['bin'] is a sparse array
# data_train.layers['bin'].A converts it to a dense array
train_dataset = ExpressionDataset(adata_train.layers['bin'].A, adata_train.obs['cell_types_25_codes'])
val_dataset = ExpressionDataset(adata_val.layers['bin'].A, adata_val.obs['cell_types_25_codes'])
And of course, if you do not do that, it does not work.
Could this tool support sparse data instead?
This does not feel state of the art - sorry.
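As a generic illustration of the memory concern raised here, the sketch below densifies a sparse matrix one row block at a time instead of all at once. It does not change or describe `ExpressionDataset`, which in the notebook snippet above is given a dense array; `dense_batches` is a hypothetical helper, the random matrix stands in for a binarized layer, and whether PERSIST could consume batches produced this way is an assumption that has not been checked. It uses `.toarray()`, the explicit SciPy method for the conversion that `.A` performs in the snippet.

```python
import numpy as np
from scipy import sparse


def dense_batches(matrix, batch_size):
    """Yield dense float32 row blocks of a sparse matrix, one batch at a time."""
    matrix = sparse.csr_matrix(matrix)  # CSR keeps row slicing cheap
    for start in range(0, matrix.shape[0], batch_size):
        # Only this block is ever held in dense form.
        yield matrix[start:start + batch_size].toarray().astype(np.float32)


# Stand-in for a binarized expression layer: 1000 cells x 50 genes, 10% nonzero.
binary = sparse.random(1000, 50, density=0.1, format="csr", random_state=0)
binary.data[:] = 1.0

batches = list(dense_batches(binary, batch_size=256))
```

Collecting the batches into a list, as done here for demonstration, defeats the purpose on real data; in practice each block would be consumed and discarded before the next one is requested.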
Hi all,
I have come across this very promising gene marker selection tool and want to try it on some example data that I have.
In the block where PERSIST is run, the following line:
candidates, model = selector.eliminate(target=500, max_nepochs=250)
raises a RuntimeError. More details are in the attached file.
Has anyone come across this bug? Could it be that it is trying to force us to use a GPU?
Thanks in advance for looking into this.
Hi, @iancovert
In your paper, you mentioned using LightGBM models for cell type classification with a learning rate of 0.05 and 10,000 boosters (with early stopping). However, I couldn't locate the complete code implementation for LightGBM in the materials you provided. Since LightGBM has several hyperparameters, I'm interested in details about additional parameters beyond the learning rate and booster count, specifically:
If possible, could you please share the code or provide guidance on how these parameters were configured in your experiments?
Thank you for your time and consideration.