simonwm / tacco

TACCO: Transfer of Annotations to Cells and their COmbinations

License: BSD 3-Clause "New" or "Revised" License

Shell 0.08% Python 99.75% Makefile 0.09% CSS 0.07%

tacco's Introduction

TACCO: Transfer of Annotations to Cells and their COmbinations

TACCO is a Python framework for working with categorical and compositional annotations of high-dimensional observations, in particular for transferring annotations from single-cell to spatial transcriptomics data. TACCO comes with extensive, ever-expanding documentation and a set of example notebooks. If TACCO is useful for your research, you can cite the accompanying Nat Biotechnol (2023) paper.

How to install TACCO

Clean

The simplest way to install TACCO is to create a clean environment with conda using the environment.yml file from the TACCO repository:

conda env create -f "https://raw.githubusercontent.com/simonwm/tacco/master/environment.yml"

(For older versions of conda one needs to download the environment.yml and use the local file for installation.)

Conda

To install TACCO in an already existing environment, use conda to install from the conda-forge channel:

conda install -c conda-forge tacco

Pip

It is also possible to install from PyPI via pip:

pip install tacco

This is, however, not recommended. Unlike conda, pip cannot treat Python itself as a package, so if you start with the wrong Python version you will run into dependency errors (e.g. at the time of writing, mkl-service is not available for Python 3.10 and numba is not available for Python 3.11).
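
One way to avoid this (a hedged example; the pinned version 3.9 is illustrative, chosen only because it sidesteps both issues named above) is to fix the Python version explicitly before pip-installing:

conda create -n tacco-env python=3.9
conda activate tacco-env
pip install tacco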

GitHub

To access the most recent pre-release versions, it is also possible to pip-install directly from GitHub:

pip install tacco@git+https://github.com/simonwm/tacco.git

Obviously, this is not recommended for production environments.

How to use TACCO

TACCO features a fast and straightforward API for the compositional annotation of one dataset, given as an anndata object adata, using a categorically annotated second dataset, given as an anndata object reference. The annotation is wrapped in a single function call:

import tacco as tc
tc.tl.annotate(adata, reference, annotation_key='my_categorical_annotation', result_key='my_compositional_annotation')

where 'my_categorical_annotation' is the name of the categorical .obs annotation in reference and 'my_compositional_annotation' is the name of the new compositional .obsm annotation to be created in adata. There are many options for customizing this function, e.g. to call external annotation tools; these are described in the documentation of the annotate function.
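
A minimal sketch of inspecting the result (assuming, as in TACCO's example notebooks, that the .obsm entry is a cells-by-categories table of fractions; all variable names are placeholders):

import tacco as tc

# adata: dataset to annotate; reference: categorically annotated dataset
tc.tl.annotate(adata, reference,
               annotation_key='my_categorical_annotation',
               result_key='my_compositional_annotation')

# each row gives the inferred composition of one observation over the categories
fractions = adata.obsm['my_compositional_annotation']
print(fractions.head())
print(fractions.sum(axis=1).head())  # rows should sum to (approximately) 1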

As the TACCO framework contains much more than a compositional annotation method (single-molecule annotation, object-splitting, spatial co-occurrence analysis, enrichments, visualization, ...), its documentation does not fit into a README.

tacco's People

Contributors

simonwm, jwatter


tacco's Issues

How to split objects

Hi all,

First of all, thanks for developing useful tools for analysis.

Thanks to your tacco, I have obtained a compositional matrix by cell type (using the command below):

tc.tl.annotate(adata, reference, annotation_key='Annotation', result_key='my_compositional_annotation')

However, I also want to obtain compositionally annotated count data that is amenable to standard downstream single-cell analysis workflows.

Could you please let me know how to split objects by cell types or what command to use?

Thanks!

Best,
KJ
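
For readers with the same question: the README's feature list mentions object-splitting, and a later issue on this page refers to tc.tl.split_observations for exactly this purpose. A minimal, hedged sketch (the argument layout is an assumption; see the TACCO documentation for the actual signature):

import tacco as tc

# hypothetical sketch: split each observation's counts across the cell types of
# its compositional annotation; 'my_compositional_annotation' is the .obsm key
# written by tc.tl.annotate above
sdata = tc.tl.split_observations(adata, 'my_compositional_annotation')
# sdata would then contain one observation per (cell, cell type) contribution,
# amenable to standard single-cell workflows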

Installation succeeds in a Python 3.9 environment, but importing tacco raises a TypeError. Do you know the reason?

The specific error is as follows:
import tacco as tc

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/tacco/__init__.py", line 6, in <module>
from . import plots as pl
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/tacco/plots/__init__.py", line 6, in <module>
from ._plots import get_default_colors, mix_base_colors, subplots, scatter, cellsize, frequency_bar, frequency, comparison, compositions, contribution, significances, heatmap, sigmap, co_occurrence, co_occurrence_matrix, annotated_heatmap, annotation_coordinate, dotplot
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/tacco/plots/_plots.py", line 4, in <module>
import scanpy as sc
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/scanpy/__init__.py", line 17, in <module>
from . import plotting as pl
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/scanpy/plotting/__init__.py", line 1, in <module>
from ._anndata import (
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/scanpy/plotting/_anndata.py", line 28, in <module>
from . import _utils
File "/home/baism/anaconda3/envs/TACCO_env/lib/python3.9/site-packages/scanpy/plotting/_utils.py", line 36, in <module>
class _AxesSubplot(Axes, axes.SubplotBase, ABC):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

May I ask why this error is reported and how to solve it?
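A commonly reported cause of this particular metaclass conflict (an assumption; the thread itself does not confirm it) is a matplotlib release newer than the installed scanpy supports. A quick version check:

import matplotlib
import scanpy

# scanpy's _AxesSubplot shim is known to break against newer matplotlib releases
print(matplotlib.__version__, scanpy.__version__)

If the versions are mismatched, recreating the environment from TACCO's environment.yml (see above) or pinning matplotlib (e.g. pip install "matplotlib<3.7") are possible workarounds.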

Problems when annotating the spatial data

I ran into a problem when using tc.tl.annotate. The code is:
tc.tl.annotate(adata_st1, adata_sc, annotation_key='fine', result_key='fine', multi_center=10)

Both the reference scRNA-seq data and the spatial transcriptomics data were read with Scanpy, and the error is:

ValueError Traceback (most recent call last)
Input In [19], in <cell line: 1>()
----> 1 tc.tl.annotate(adata_st1, adata_sc, annotation_key='fine',result_key='fine',multi_center=10,)

File ~/.local/lib/python3.9/site-packages/tacco/tools/_annotate.py:802, in annotate(adata, reference, annotation_key, result_key, counts_location, method, bisections, bisection_divisor, platform_iterations, normalize_to, annotation_prior, multi_center, multi_center_amplitudes, reconstruction_key, max_annotation, min_counts_per_gene, min_counts_per_cell, min_cells_per_gene, min_genes_per_cell, remove_constant_genes, remove_zero_cells, min_log2foldchange, min_expression, remove_mito, n_hvg, skip_checks, assume_valid_counts, return_reference, gene_keys, verbose, **kw_args)
800 print('\n'.join(method_construction_info[::-1]))
801 start = time.time()
--> 802 cell_type = _method(tdata, reference, annotation_key, annotation_prior, verbose)
803 if verbose > 0:
804 print(f'Finished annotation in {np.round(time.time() - start, 2)} seconds.')

File ~/.local/lib/python3.9/site-packages/tacco/tools/_annotate.py:328, in platform_normalize_annotation_method.<locals>._method(adata, reference, annotation_key, annotation_prior, verbose)
325 # renormalize profiles as they have been denormalized by platform normalization
326 reference.varm[annotation_key] /= reference.varm[annotation_key].sum(axis=0).to_numpy()
--> 328 cell_type = annotation_method(adata, reference, annotation_key, annotation_prior, verbose)
329 return cell_type

File ~/.local/lib/python3.9/site-packages/tacco/tools/_annotate.py:408, in multi_center_annotation_method.<locals>._method(adata, reference, annotation_key, annotation_prior, verbose)
406 utils.log1p(preped)
407 sc.pp.scale(preped)
--> 408 sc.pp.pca(preped, random_state=42, n_comps=min(10,min(preped.shape[0],preped.shape[1])-1))
410 new_cats = []
411 for cat, df in reference.obs.groupby(annotation_key):

File ~/.local/lib/python3.9/site-packages/scanpy/preprocessing/pca.py:188, in pca(data, n_comps, zero_center, svd_solver, random_state, return_info, use_highly_variable, dtype, copy, chunked, chunk_size)
184 X = X.toarray()
185 pca_ = PCA(
186     n_components=n_comps, svd_solver=svd_solver, random_state=random_state
187 )
--> 188 X_pca = pca_.fit_transform(X)
189 elif issparse(X) and zero_center:
190 from sklearn.decomposition import PCA

File ~/anaconda3/lib/python3.9/site-packages/sklearn/decomposition/_pca.py:407, in PCA.fit_transform(self, X, y)
385 def fit_transform(self, X, y=None):
386 """Fit the model with X and apply the dimensionality reduction on X.
387
388 Parameters
(...)
405 C-ordered array, use 'np.ascontiguousarray'.
406 """
--> 407 U, S, Vt = self._fit(X)
408 U = U[:, : self.n_components_]
410 if self.whiten:
411 # X_new = X * V / S * sqrt(n_samples) = U * sqrt(n_samples)

File ~/anaconda3/lib/python3.9/site-packages/sklearn/decomposition/_pca.py:430, in PCA._fit(self, X)
424 if issparse(X):
425 raise TypeError(
426 "PCA does not support sparse input. See "
427 "TruncatedSVD for a possible alternative."
428 )
--> 430 X = self._validate_data(
431 X, dtype=[np.float64, np.float32], ensure_2d=True, copy=self.copy
432 )
434 # Handle n_components==None
435 if self.n_components is None:

File ~/anaconda3/lib/python3.9/site-packages/sklearn/base.py:566, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
564 raise ValueError("Validation should be done on X, y or both.")
565 elif not no_val_X and no_val_y:
--> 566 X = check_array(X, **check_params)
567 out = X
568 elif no_val_X and not no_val_y:

File ~/anaconda3/lib/python3.9/site-packages/sklearn/utils/validation.py:800, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
794 raise ValueError(
795 "Found array with dim %d. %s expected <= 2."
796 % (array.ndim, estimator_name)
797 )
799 if force_all_finite:
--> 800 _assert_all_finite(array, allow_nan=force_all_finite == "allow-nan")
802 if ensure_min_samples > 0:
803 n_samples = _num_samples(array)

File ~/anaconda3/lib/python3.9/site-packages/sklearn/utils/validation.py:114, in _assert_all_finite(X, allow_nan, msg_dtype)
107 if (
108 allow_nan
109 and np.isinf(X).any()
110 or not allow_nan
111 and not np.isfinite(X).all()
112 ):
113 type_err = "infinity" if allow_nan else "NaN, infinity"
--> 114 raise ValueError(
115 msg_err.format(
116 type_err, msg_dtype if msg_dtype is not None else X.dtype
117 )
118 )
119 # for object dtype data, we only check for NaNs (GH-13254)
120 elif X.dtype == np.dtype("object") and not allow_nan:

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

How can I solve it?
Thanks.
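
A hedged diagnostic sketch (not from the thread): the ValueError is raised by PCA inside the multi_center step, so non-finite values in the inputs, or all-zero cells/genes that become NaN after scaling, are plausible causes worth checking first. Names mirror the call above:

import numpy as np
import scipy.sparse as sp

def check_counts(adata, name):
    # report non-finite entries and all-zero rows/columns of the count matrix
    X = adata.X
    data = X.data if sp.issparse(X) else np.asarray(X)
    print(name, 'all finite:', bool(np.isfinite(data).all()))
    print(name, 'all-zero cells:', int((np.asarray(X.sum(axis=1)).ravel() == 0).sum()))
    print(name, 'all-zero genes:', int((np.asarray(X.sum(axis=0)).ravel() == 0).sum()))

check_counts(adata_st1, 'spatial')
check_counts(adata_sc, 'reference')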

"Exception: type_prior contains negative values!" when running tc.tl.annotate()

Hello,
I tried to run tc.tl.annotate() and encountered this error:
tc.tl.annotate(puck,reference,'subtype',result_key='subtype',)

Starting preprocessing
Annotation profiles were not found in reference.varm["subtype"]. Constructing reference profiles with tacco.preprocessing.construct_reference_profiles and default arguments...
Finished preprocessing in 118.54 seconds.
Starting annotation of data with shape (173055, 2841) and a reference of shape (56131, 2841) using the following wrapped method:
+- platform normalization: platform_iterations=0, gene_keys=subtype, normalize_to=adata
   +- multi center: multi_center=None multi_center_amplitudes=True
      +- bisection boost: bisections=4, bisection_divisor=3
         +- core: method=OT annotation_prior=None
mean,std( rescaling(gene) ) 29.675947315286834 623.3959057199611
bisection run on 1
/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/utils/_dist.py:393: RuntimeWarning: invalid value encountered in sqrt
A = np.sqrt(A)
Traceback (most recent call last):
File "", line 1, in
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/tools/_annotate.py", line 802, in annotate
cell_type = _method(tdata, reference, annotation_key, annotation_prior, verbose)
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/tools/_annotate.py", line 328, in _method
cell_type = annotation_method(adata, reference, annotation_key, annotation_prior, verbose)
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/tools/_annotate.py", line 381, in _method
cell_type = annotation_method(adata, reference, annotation_key, annotation_prior, verbose)
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/tools/_annotate.py", line 208, in _method
cell_type = annotation_method(adata, reference, annotation_key, annotation_prior, verbose)
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/tools/_annotate.py", line 115, in _method
cell_type = annotate(adata, reference, annotation_key, annotation_prior=_annotation_prior, **verbose_arg, **kw_args)
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/tools/_OT.py", line 75, in _annotate_OT
cell_type = _run_OT(type_cell_dist, annotation_prior, cell_prior=cell_prior, epsilon=epsilon, lamb=lamb)
File "/USER/lizeyu/00.software/Miniconda3/envs/TACCO_env/lib/python3.10/site-packages/tacco/utils/_utils.py", line 247, in _run_OT
raise Exception('type_prior contains negative values!')
Exception: type_prior contains negative values!
Since I couldn't find an example data format, could you give me a hint about what causes this and where my data format is wrong?

The puck and reference I used looked like this:

[screenshot of the puck and reference objects]

Question - can TACCO examine regions of a Visium slide

I am analysing a 10x Visium dataset and I would like to examine 2 different areas or regions of the same Visium slide (image below). The data is from a mouse colon and some of the tissue is tumour and some is not. We would like to examine and compare the spots / genes in both regions (tumour vs non).
Can TACCO assist with this task?
Thanks
[image: Visium mouse pathology slide]
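
A minimal sketch of one way to start on this (assumptions: the spots already carry a manual region label in .obs['region'], e.g. from pathology annotation, and a compositional annotation from tc.tl.annotate stored as a table in .obsm; TACCO's enrichment and co-occurrence tools mentioned in the README would be the natural next step):

# compare mean inferred cell-type composition between tumour and non-tumour spots
comp = adata.obsm['my_compositional_annotation']  # spots x cell types
print(comp.groupby(adata.obs['region']).mean())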

Error when using annotate

When I try to use the annotate function on my own Slide-seq data with my own annotated reference, I run into this error:

[screenshot of the error]

I am unsure what is causing this, but somehow it seems like the variables are not found in the datasets:

[screenshot of the datasets]
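
The screenshots are not preserved here; if the message is about variables (genes) not being found, a cheap first check is the gene-name overlap between the two objects (a sketch with placeholder names):

shared = adata.var_names.intersection(reference.var_names)
print(len(adata.var_names), len(reference.var_names), len(shared))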

Met problems when using tc.tl.annotate()

When I use tc.tl.annotate(), with the reference and spatial data read by scanpy, I get the following error:

InvalidIndexError Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 tc.tl.annotate(adata_spatial, reference,'celltype1',result_key='celltype1',)

File D:\Basic Tools\Anaconda\lib\site-packages\tacco\tools\_annotate.py:734, in annotate(adata, reference, annotation_key, result_key, counts_location, method, bisections, bisection_divisor, platform_iterations, normalize_to, annotation_prior, multi_center, multi_center_amplitudes, reconstruction_key, max_annotation, min_counts_per_gene, min_counts_per_cell, min_cells_per_gene, min_genes_per_cell, remove_constant_genes, remove_zero_cells, min_log2foldchange, min_expression, remove_mito, n_hvg, skip_checks, assume_valid_counts, return_reference, gene_keys, verbose, **kw_args)
732 except ValueError as e: # as e syntax added in ~python2.5
733 raise ValueError(f'{str(e)}\nYou can deactivate checking for invalid counts by specifying assume_valid_counts=True.')
--> 734 tdata,reference = preprocessing.filter(adata=(tdata,reference), min_counts_per_cell=min_counts_per_cell, min_counts_per_gene=min_counts_per_gene, min_cells_per_gene=min_cells_per_gene, min_genes_per_cell=min_genes_per_cell, remove_constant_genes=remove_constant_genes, remove_zero_cells=remove_zero_cells, assume_valid_counts=True) # ensure consistent gene selection
735 if verbose > 0:
736 print(f'Finished preprocessing in {np.round(time.time() - start, 2)} seconds.')

File D:\Basic Tools\Anaconda\lib\site-packages\tacco\preprocessing\_qc.py:163, in filter(adata, min_counts_per_gene, min_counts_per_cell, min_cells_per_gene, min_genes_per_cell, remove_constant_genes, remove_zero_cells, assume_valid_counts, return_view)
161 for i in range(len(adatas)):
162 if len(adatas[i].var.index) != len(good_genes): # filter happened
--> 163 adatas[i] = adatas[i][:,good_genes]
164 changed = True
165 elif (adatas[i].var.index != good_genes).any(): # reordering happened: no side effects on cell filtering

File D:\Basic Tools\Anaconda\lib\site-packages\anndata\_core\anndata.py:1113, in AnnData.__getitem__(self, index)
1111 def __getitem__(self, index: Index) -> "AnnData":
1112 """Returns a sliced view of the object."""
-> 1113 oidx, vidx = self._normalize_indices(index)
1114 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)

File D:\Basic Tools\Anaconda\lib\site-packages\anndata\_core\anndata.py:1094, in AnnData._normalize_indices(self, index)
1093 def _normalize_indices(self, index: Optional[Index]) -> Tuple[slice, slice]:
-> 1094 return _normalize_indices(index, self.obs_names, self.var_names)

File D:\Basic Tools\Anaconda\lib\site-packages\anndata\_core\index.py:36, in _normalize_indices(index, names0, names1)
34 ax0, ax1 = unpack_index(index)
35 ax0 = _normalize_index(ax0, names0)
---> 36 ax1 = _normalize_index(ax1, names1)
37 return ax0, ax1

File D:\Basic Tools\Anaconda\lib\site-packages\anndata\_core\index.py:98, in _normalize_index(indexer, index)
96 return positions # np.ndarray[int]
97 else: # indexer should be string array
---> 98 positions = index.get_indexer(indexer)
99 if np.any(positions < 0):
100 not_found = indexer[positions < 0]

File D:\Basic Tools\Anaconda\lib\site-packages\pandas\core\indexes\base.py:3905, in Index.get_indexer(self, target, method, limit, tolerance)
3902 self._check_indexing_method(method, limit, tolerance)
3904 if not self._index_as_unique:
-> 3905 raise InvalidIndexError(self._requires_unique_msg)
3907 if len(target) == 0:
3908 return np.array([], dtype=np.intp)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Can you help?
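
The traceback ends in pandas' "Reindexing only valid with uniquely valued Index objects", which points at duplicated gene names. A hedged first thing to try (standard anndata API, though not confirmed as the fix in this thread):

# list duplicated gene names, then make them unique before annotating
print(adata_spatial.var_names[adata_spatial.var_names.duplicated()])
adata_spatial.var_names_make_unique()
reference.var_names_make_unique()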

Question regarding osmFISH mouse somato-sensory cortex

Hello,

I am running the tacco osmFISH mouse somato-sensory cortex example with my own data, but the result file full_anno.csv of osmFISH_run_TACCO_segmentation.py contains double the expected number of segments. What parameters can I change to make the segmentation more reasonable?

Thank you!

Tacco key in adata.uns causes issues when saving h5ad

Hi,

I am on tacco 0.3.0 and anndata 0.10.5.post1.

When I try to save an h5ad after running tacco, I receive the error:

 merged.write_h5ad("merged_4heart_counts.h5ad")
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_core/anndata.py", line 2017, in write_h5ad
    write_h5ad(
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/h5ad.py", line 111, in write_h5ad
    write_elem(f, "uns", dict(adata.uns), dataset_kwargs=dataset_kwargs)
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 359, in write_elem
    Writer(_REGISTRY).write_elem(store, k, elem, dataset_kwargs=dataset_kwargs)
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/utils.py", line 243, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 309, in write_elem
    return write_func(store, k, elem, dataset_kwargs=dataset_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 57, in wrapper
    result = func(g, k, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/methods.py", line 312, in write_mapping
    _writer.write_elem(g, sub_k, sub_v, dataset_kwargs=dataset_kwargs)
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/utils.py", line 243, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 304, in write_elem
    self.find_writer(dest_type, elem, modifiers),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 269, in find_writer
    return self.registry.get_writer(dest_type, type(elem), modifiers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/rwth1209/enviroments/spatial_analysis/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 117, in get_writer
    raise IORegistryError._from_write_parts(dest_type, src_type, modifiers)
anndata._io.specs.registry.IORegistryError: No method registered for writing <class 'pandas.core.series.Series'> into <class 'h5py._hl.group.Group'>
Error raised while writing key 'tacco_mc8' of <class 'h5py._hl.group.Group'> to /uns

Deleting the key fixes the issue.
Any idea what is going wrong?
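
As a sketch, the workaround mentioned above (the key name is the one from the error message; whether the entry is safe to drop depends on downstream use):

# drop the offending .uns entry before writing; per the traceback, anndata has
# no writer for the pandas Series stored under this key
merged.uns.pop('tacco_mc8', None)
merged.write_h5ad("merged_4heart_counts.h5ad")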

Different result using annotate_OT or annotate

Dear author,
I tested annotate and annotate_OT, but their results were very different. The annotate_OT result is better according to prior knowledge. I would like to know why they are different and how to choose the better method.

tc.tl.annotate(adata, ref, method='OT',result_key='TACCO',annotation_key='cell_type',assume_valid_counts=True)
df = tc.tl.annotate_OT(adata, ref,annotation_key='cell_type')

Looking forward to your reply, and thank you in advance.

quanlong
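
A hedged note from elsewhere on this page, not an authoritative answer: the logs quoted in other issues show that tc.tl.annotate wraps the OT core in additional layers (platform normalization, multi center, bisection boost), while tc.tl.annotate_OT is the bare core, so differing results are expected. A sketch of turning the knobs that control those layers (parameter names come from the annotate signature in the tracebacks above; the exact values that reproduce the bare core are an assumption):

tc.tl.annotate(adata, ref, method='OT', annotation_key='cell_type',
               result_key='TACCO_bare',
               platform_iterations=0,  # platform normalization iterations (assumed semantics)
               multi_center=None,      # no multi-center refinement
               bisections=0)           # assumed to disable the bisection boost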

type_prior contains na!

Hi,

I am trying to annotate a dataset like this:

def tacco_annotation(adata, ref, ct_column="cell_type", **kwargs):
    assert (
        adata.X.max().is_integer() and ref.X.max().is_integer()
    ), "Data must be raw counts"
    adata = tc.tl.annotate(
        adata,
        ref,
        annotation_key=ct_column,
        **kwargs,
        result_key="tacco",
    )
    adata = tc.utils.get_maximum_annotation(adata, "tacco", "tacco")
    adata.obs["tacco_score"] = adata.obsm["tacco"].max(axis=1)
    return adata

adata = tacco_annotation(
    adata,
    ref,
    ct_column="cell_subtype",
)

I get the following error:

Starting preprocessing
Annotation profiles were not found in `reference.varm["cell_subtype"]`. Constructing reference profiles with `tacco.preprocessing.construct_reference_profiles` and default arguments...
Finished preprocessing in 0.5 seconds.
Starting annotation of data with shape (67344, 483) and a reference of shape (2736, 483) using the following wrapped method:
+- platform normalization: platform_iterations=0, gene_keys=cell_subtype, normalize_to=adata
   +- multi center: multi_center=None multi_center_amplitudes=True
      +- bisection boost: bisections=4, bisection_divisor=3
         +- core: method=OT annotation_prior=None
mean,std( rescaling(gene) )  51.64145217636585 121.51160341211514
bisection run on 1
File tacco/utils/_utils.py:245, in _run_OT
241 def _run_OT(type_cell_dist, type_prior=None, cell_prior=None, epsilon=5e-3, lamb=None, inplace=False):
242
243     # check sanity of arguments
244     if type_prior is not None and type_prior.isna().any():
--> 245         raise Exception('type_prior contains na!')
246     if type_prior is not None and (type_prior<0).any():
247         raise Exception('type_prior contains negative values!')

Exception: type_prior contains na!

I previously annotated other datasets successfully.
I also used the novosparc method with this reference and dataset, and it completed with a warning:

Trying with epsilon: 5.00e-04
ot/bregman/_sinkhorn.py:498: RuntimeWarning: divide by zero encountered in divide
  v = b / KtransposeU
ot/bregman/_sinkhorn.py:498: RuntimeWarning: overflow encountered in divide
  v = b / KtransposeU
ot/bregman/_sinkhorn.py:506: UserWarning: Warning: numerical errors at iteration 0
  warnings.warn('Warning: numerical errors at iteration %d' % ii)

I made sure that both datasets contain positive integers and no NaNs.
Any ideas what is going wrong?
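
A hedged sanity-check sketch (the thread does not identify the cause): with annotation_prior=None the type prior is presumably derived from the reference annotation, so unused or empty categories are one plausible way for NaNs to enter:

# count cells per category; categorical dtypes keep empty categories around
counts = ref.obs['cell_subtype'].value_counts()
print(counts[counts == 0])

# drop unused categories before annotating (only applies to categorical columns)
if hasattr(ref.obs['cell_subtype'], 'cat'):
    ref.obs['cell_subtype'] = ref.obs['cell_subtype'].cat.remove_unused_categories()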

max_annotation parameter not considered during observation splitting

I am using TACCO to analyse some Visium samples, and after annotating the reference cell types I have tried to split the gene expression counts of Visium spots across the contributing cell types with the tc.tl.split_observations() function.
Although I set the max_annotation parameter to 7 during the annotation step, the Visium counts are then split across all of the cell types defined in the single-cell reference (51).
Is there a way to preserve the max_annotation information during the observation splitting step?

Thank you for your help!
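
A hedged workaround sketch (not an official TACCO API; it assumes the compositional annotation is stored as a spots-by-types table in .obsm): re-apply the top-k cut to the annotation before calling tc.tl.split_observations, so the split only ever sees the 7 strongest cell types per spot:

comp = adata.obsm['my_compositional_annotation'].copy()
k = 7
# rank cell types per spot and zero out everything beyond the top k
ranks = comp.rank(axis=1, ascending=False, method='first')
comp[ranks > k] = 0.0
comp = comp.div(comp.sum(axis=1), axis=0)  # renormalize each spot to sum to 1
adata.obsm['my_compositional_annotation'] = comp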

Effect of Batch Effect in scRNA reference

Thanks for the tool, very impressive!

I am curious, since TACCO expects counts for the scRNA reference and most reference sets are composed of multiple patients or 10x batches:

Is this a problem for TACCO? Should I subset to a single batch, or maybe calculate "batch-corrected counts" with something like scVI?

Example request - Visium

Would it be possible to show a simple example of a 10x Visium output from Space Ranger and how TACCO can interact with it? Even just reading in and annotating the spatial data would help. I am having trouble with the formatting when following the "Slide-Seq Mouse Colon" and "Mapping single cells into space" tutorials at the tc.tl.annotate() step, and the same method for Visium would be useful to many.

Clarification on the Implementation of _count_soft_co_occurrences_dense Function

Hi,

Thank you for the great software and informative documentation. Your approach to the cell-type co-occurrence problem is particularly intriguing to me. After reading the source code, I have a clarification question about the implementation of the _count_soft_co_occurrences_dense function, specifically concerning this line of code:

temp_i[k, rc] += reference_contributions_j[rc]

The code seems to aggregate the composition of reference cells into bins by summing up contributions within the same bin. I am uncertain whether summing compositional data within the same bin effectively estimates the overall composition. Would it be more accurate to use the mean of these compositional contributions instead? After all, this is a 2D surface, so increasing distance means a larger area of coverage. Any insights or comments you could provide would be greatly appreciated.

Thank you for your time and assistance.
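
A toy illustration of the distinction being asked about (illustrative numbers only):

import numpy as np

# two reference cells fall into the same distance bin; rows are their
# compositional contributions over three cell types
contribs = np.array([[0.5, 0.5, 0.0],
                     [1.0, 0.0, 0.0]])
print(contribs.sum(axis=0))   # summed mass per type: [1.5 0.5 0. ]
print(contribs.mean(axis=0))  # mean composition:     [0.75 0.25 0. ]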

Raw counts data as reference or normalized data as reference?

Hello,

I am using single-cell reference data processed via Seurat. When extracting expression data from the Seurat object, should I use the raw counts or the normalized data in tacco?
Although the results did not deviate much from each other, I wanted to be sure which is right.

Thank you!
