#!/usr/bin/env python
# coding: utf-8

# # Identifying Pseudo-Spatial Map
# 
# SpaceFlow is a Python package for identifying spatiotemporal patterns and spatial domains from Spatial Transcriptomic (ST) data. Based on a deep graph network, SpaceFlow provides the following functions:
# 1. Encodes the ST data into **low-dimensional embeddings** that reflect both expression similarity and the spatial proximity of cells in ST data.
# 2. Incorporates **spatiotemporal** relationships of cells or spots in ST data through a **pseudo-Spatiotemporal Map (pSM)** derived from the embeddings.
# 3. Identifies **spatial domains** with spatially-coherent expression patterns.
# 
# Check out [(Ren et al., Nature Communications, 2022)](https://www.nature.com/articles/s41467-022-31739-w) for the detailed methods and applications.
# 
# 
# ![fig](https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41467-022-31739-w/MediaObjects/41467_2022_31739_Fig1_HTML.png)
# 
# 

# In[1]:


import omicverse as ov
#print(f"omicverse version: {ov.__version__}")
import scanpy as sc
#print(f"scanpy version: {sc.__version__}")
ov.utils.ov_plot_set()


# ## Preprocess data
# 
# Here we present our re-analysis of the 151676 sample of the dorsolateral prefrontal cortex (DLPFC) dataset. Maynard et al. have manually annotated the DLPFC layers and white matter (WM) based on morphological features and gene markers.
# 
# This tutorial demonstrates how to identify spatial domains on 10x Visium data using SpaceFlow. The processed data are available at https://github.com/LieberInstitute/spatialLIBD. We downloaded the manual annotation from the spatialLIBD package and provide it at https://drive.google.com/drive/folders/10lhz5VY7YfvHrtV40MwaqLmWz56U9eBP?usp=sharing.

# In[2]:


adata = sc.read_visium(path='data', count_file='151676_filtered_feature_bc_matrix.h5')
adata.var_names_make_unique()


# <div class="admonition warning">
#   <p class="admonition-title">Note</p>
#   <p>
#     We introduced the spatially variable gene (SVG) calculation module `prost` in omicverse versions greater than `1.6.0` to replace scanpy's HVGs. If you want to use scanpy's HVGs, you can set `mode='scanpy'` in `ov.space.svg` or use the following code.
#   </p>
# </div>
# 
# ```python
# #adata=ov.pp.preprocess(adata,mode='shiftlog|pearson',n_HVGs=3000,target_sum=1e4)
# #adata.raw = adata
# #adata = adata[:, adata.var.highly_variable_features]
# ```

# In[3]:


sc.pp.calculate_qc_metrics(adata, inplace=True)
adata = adata[:,adata.var['total_counts']>100]
adata=ov.space.svg(adata,mode='prost',n_svgs=3000,target_sum=1e4,platform="visium",)
adata.raw = adata
adata = adata[:, adata.var.space_variable_features]
adata


# We read the ground-truth annotation of our spatial data.

# In[4]:


# read the annotation
import pandas as pd
import os
Ann_df = pd.read_csv(os.path.join('data', '151676_truth.txt'), sep='\t', header=None, index_col=0)
Ann_df.columns = ['Ground Truth']
adata.obs['Ground Truth'] = Ann_df.loc[adata.obs_names, 'Ground Truth']
sc.pl.spatial(adata, img_key="hires", color=["Ground Truth"])


# ## Training the SpaceFlow Model
# 
# Here, we use `ov.space.pySpaceFlow` to construct a SpaceFlow object and train the model.
# 
# The spatial coordinates must be stored in `adata.obsm['spatial']`.

# In[5]:


sf_obj=ov.space.pySpaceFlow(adata)


# We then train a spatially regularized deep graph network model to learn a low-dimensional embedding that reflects both expression similarity and the spatial proximity of cells in ST data.
# 
# Parameters:
# - `spatial_regularization_strength`: the strength of spatial regularization; larger values give more spatial coherence in the identified spatial domains and spatiotemporal patterns. (default: 0.1)
# - `z_dim`: the target size of the learned embedding. (default: 50)
# - `lr`: the learning rate for optimizing the model. (default: 1e-3)
# - `epochs`: the maximum number of epochs for model training. (default: 1000)
# - `max_patience`: the maximum number of epochs to wait for the loss to decrease. If the loss does not decrease for more epochs than this threshold, training stops and the parameters with the minimal loss are kept as the best model. (default: 50)
# - `min_stop`: the earliest epoch at which training can stop when the loss has not decreased for more than `max_patience` epochs. (default: 100)
# - `random_seed`: the random seed set for the random generators of the `random`, `numpy`, and `torch` packages. (default: 42)
# - `gpu`: the index of the Nvidia GPU; if no GPU is available, the model is trained on the CPU, which is slower. (default: 0)
# - `regularization_acceleration`: whether to accelerate the calculation of the regularization loss using an edge subsetting strategy. (default: True)
# - `edge_subset_sz`: the edge subset size for regularization acceleration. (default: 1000000)
# 

# In[6]:


sf_obj.train(spatial_regularization_strength=0.1,
             z_dim=50, lr=1e-3, epochs=1000,
             max_patience=50, min_stop=100,
             random_seed=42, gpu=0,
             regularization_acceleration=True, edge_subset_sz=1000000)


# ## Calculating the Pseudo-Spatial Map
# 
# Unlike the original SpaceFlow, we only need to call the `cal_pSM` function to compute the pSM when using SpaceFlow through omicverse.

# In[7]:


sf_obj.cal_pSM(n_neighbors=20,resolution=1,
               max_cell_for_subsampling=5000,psm_key='pSM_spaceflow')


# In[8]:


sc.pl.spatial(adata, color=['pSM_spaceflow','Ground Truth'],cmap='RdBu_r')


# ## Clustering the space
# 
# We can use `GMM`, `leiden` or `louvain` to cluster the space.
# 
# ```python
# sc.pp.neighbors(adata, n_neighbors=15, n_pcs=50,
#                use_rep='spaceflow')
# ov.utils.cluster(adata,use_rep='spaceflow',method='louvain',resolution=1)
# ov.utils.cluster(adata,use_rep='spaceflow',method='leiden',resolution=1)
# ```

# In[9]:


ov.utils.cluster(adata,use_rep='spaceflow',method='GMM',n_components=7,covariance_type='full',
                 tol=1e-9, max_iter=1000, random_state=3607)


# In[10]:


sc.pl.spatial(adata, color=['gmm_cluster',"Ground Truth"])


# In[ ]:
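

# A possible follow-up that is not part of the original tutorial: the agreement between `gmm_cluster` and the manual annotation can be quantified with the adjusted Rand index (ARI). The sketch below uses scikit-learn's `adjusted_rand_score` on small made-up label lists. The helper name `ari_against_truth` is hypothetical, and it assumes that unannotated spots, if any, appear as missing values. With the real data one would presumably pass `adata.obs['Ground Truth']` and `adata.obs['gmm_cluster']` instead of the toy lists.

```python
import pandas as pd
from sklearn.metrics import adjusted_rand_score


def ari_against_truth(truth, pred):
    """Return the ARI between two label vectors, skipping unannotated spots.

    Hypothetical helper for illustration; it is not part of omicverse.
    """
    df = pd.DataFrame({"truth": list(truth), "pred": list(pred)})
    # Spots without a manual annotation (None/NaN) cannot be scored, so drop them.
    df = df.dropna(subset=["truth"])
    return adjusted_rand_score(df["truth"].astype(str), df["pred"].astype(str))


# Toy labels standing in for adata.obs['Ground Truth'] and adata.obs['gmm_cluster'].
toy_truth = ["Layer1", "Layer1", "Layer2", "Layer2", "WM", "WM", None]
toy_pred = ["0", "0", "1", "1", "2", "2", "2"]

# ARI ignores label names, so identical partitions score 1.0 even when
# the cluster ids differ from the annotation names.
toy_ari = ari_against_truth(toy_truth, toy_pred)
print(f"ARI on toy labels: {toy_ari:.3f}")
```

# On the real data, a higher ARI would indicate closer agreement with the manually annotated layers; the value depends on the run and is not reported here.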