#!/usr/bin/env python
# coding: utf-8

# # Mapping single-cell profiles onto spatial profiles
#
# Tangram is a method for mapping single-cell (or single-nucleus) gene expression data onto spatial gene expression data. Tangram takes as input a single-cell dataset and a spatial dataset, collected from the same anatomical region/tissue type. Via integration, Tangram creates new spatial data by aligning the scRNA-seq profiles in space. This makes it possible to project every annotation in the scRNA-seq data (e.g. cell types, program usage) onto space.
#
# The most common application of Tangram is to resolve cell types in space. Another usage is to correct gene expression from spatial data: as scRNA-seq data are less prone to dropout than (e.g.) Visium or Slide-seq, the “new” spatial data generated by Tangram resolve many more genes. As a result, we can visualize program usage in space, which can be used for ligand-receptor pair discovery or, more generally, for studying cell-cell communication mechanisms. If cell segmentation is available, Tangram can also be used for deconvolution of spatial data. If your single-cell data are multimodal, Tangram can be used to spatially resolve other modalities, such as chromatin accessibility.
#
# Biancalani, T., Scalia, G., Buffoni, L. et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat Methods 18, 1352–1362 (2021). https://doi.org/10.1038/s41592-021-01264-7
#
# ![img](https://tangram-sc.readthedocs.io/en/latest/_images/tangram_overview.png)

# In[1]:


import omicverse as ov
#print(f"omicverse version: {ov.__version__}")
import scanpy as sc
#print(f"scanpy version: {sc.__version__}")
ov.utils.ov_plot_set()


# ## Prepared scRNA-seq
#
# Published scRNA-seq datasets of lymph nodes have typically lacked an adequate representation of germinal centre-associated immune cell populations due to the age of patient donors. We therefore include scRNA-seq datasets spanning lymph nodes, spleen and tonsils in our single-cell reference, to ensure that we capture the full diversity of immune cell states likely to exist in the spatial transcriptomic dataset.
#
# Here we download this dataset, import it into anndata and change variable names to ENSEMBL gene identifiers.
#
# Link: https://cell2location.cog.sanger.ac.uk/paper/integrated_lymphoid_organ_scrna/RegressionNBV4Torch_57covariates_73260cells_10237genes/sc.h5ad

# In[2]:


# Load the single-cell reference (expects the file linked above at data/sc.h5ad)
adata_sc = ov.read('data/sc.h5ad')

# Plot the UMAP embedding, coloured by the 'Subset' annotation
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(3, 3))
ov.utils.embedding(
    adata_sc,
    basis="X_umap",
    color=['Subset'],
    title='Subset',
    frameon='small',
    #ncols=1,
    wspace=0.65,
    #palette=ov.utils.pyomic_palette()[11:],
    show=False,
    ax=ax
)


# For data quality control and preprocessing, we can use omicverse's own preprocessing functions.

# In[3]:


print("RAW", adata_sc.X.max())
# Normalize the counts and select 3000 highly variable genes
adata_sc = ov.pp.preprocess(adata_sc, mode='shiftlog|pearson', n_HVGs=3000, target_sum=1e4)
# Keep the full normalized matrix in .raw before subsetting to the selected genes
adata_sc.raw = adata_sc
adata_sc = adata_sc[:, adata_sc.var.highly_variable_features]
print("Normalize", adata_sc.X.max())


# ## Prepared stRNA-seq
#
# First let’s read spatial Visium data from 10X Space Ranger output. Here we use lymph node data generated by 10X and presented in [Kleshchevnikov et al (section 4, Fig 4)](https://www.biorxiv.org/content/10.1101/2020.11.15.378125v1). This dataset can be conveniently downloaded and imported using scanpy. See [this tutorial](https://cell2location.readthedocs.io/en/latest/notebooks/cell2location_short_demo.html) for a more extensive and practical example of data loading (multiple Visium samples).

# In[5]:


# Download and load the 10X Visium human lymph node dataset
adata = sc.datasets.visium_sge(sample_id="V1_Human_Lymph_Node")
# Label every spot with the sample name stored in adata.uns['spatial']
adata.obs['sample'] = list(adata.uns['spatial'].keys())[0]
adata.var_names_make_unique()


# We used the same pre-processing steps as for scRNA-seq.
#
# Note
#
# omicverse versions later than `1.6.0` include `prost`, a module for computing spatially variable genes (SVGs), which replaces scanpy's HVGs in this step. If you want to use scanpy's HVGs instead, you can set `mode='scanpy'` in `ov.space.svg` or use the following code.
#