#!/usr/bin/env python
# coding: utf-8

# # Trajectory Inference with VIA
# 
# VIA is a single-cell trajectory inference method that offers topology construction, pseudotime estimation, automated terminal state prediction, and automated plotting of temporal gene dynamics along lineages. Here, we have adapted the original authors' colouring logic and usage conventions so that users can run the analysis directly on an anndata object.
# 
# This tutorial follows the analysis provided by the original VIA authors.
# 
# Paper: [Generalized and scalable trajectory inference in single-cell omics data with VIA](https://www.nature.com/articles/s41467-021-25773-3)
# 
# Code: https://github.com/ShobiStassen/VIA
# 
# Colab_Reproducibility: https://colab.research.google.com/drive/1A2X23z_RLJaYLbXaiCbZa-fjNbuGACrD?usp=sharing

# In[1]:


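import os
import omicverse as ov
import scanpy as sc
import matplotlib.pyplot as plt
ov.utils.ov_plot_set()
# The figures below are saved under figures/, so create the directory up front.
os.makedirs('figures', exist_ok=True)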


# ## Data loading and preprocessing
# 
# We use the dataset scRNA_hematopoiesis provided by the authors for this analysis. Note that the data have been normalized and log-transformed but not scaled.

# In[2]:


adata = ov.single.scRNA_hematopoiesis()
sc.tl.pca(adata, svd_solver='arpack', n_comps=200)
adata


# ## Model construction and run
# 
# We need to specify the cell feature vector `adata_key` used for VIA inference. It can be X_pca, X_scVI or X_glue, depending on the purpose of your analysis; here we use X_pca directly. We also need to specify how many components to use via `adata_ncomps`, which should not exceed the number of columns in that feature vector.
# 
# We also need to specify `clusters`, the key in `adata.obs` that holds the cell-type labels; VIA uses it for colouring and for its calculations. If `root_user` is None, the root cell is chosen automatically.
# 
# The `basis` argument names an embedding stored in `adata.obsm`. Here we set it to `tsne`, because this dataset stores `obsm: 'tsne', 'MAGIC_imputed_data', 'palantir_branch_probs', 'X_pca'`.
# 
# Explanations of the other arguments and attributes can be found at https://pyvia.readthedocs.io/en/latest/Parameters%20and%20Attributes.html
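# 
# The constraint on the number of components can be checked before building the model. The helper below is a minimal sketch, not part of omicverse or VIA: `clamp_ncomps` is a hypothetical name, and it simply caps the requested count at the width of the chosen feature matrix.

# In[ ]:


def clamp_ncomps(n_available, n_requested):
    """Return a component count that never exceeds the embedding width."""
    if n_available < 1:
        raise ValueError("the embedding has no columns")
    if n_requested < 1:
        raise ValueError("n_requested must be at least 1")
    return min(n_requested, n_available)

# Hypothetical usage with the objects from this tutorial (the PCA above stores
# its result in adata.obsm['X_pca']):
# adata_ncomps = clamp_ncomps(adata.obsm['X_pca'].shape[1], 80)
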

# In[3]:


v0 = ov.single.pyVIA(adata=adata,adata_key='X_pca',adata_ncomps=80, basis='tsne',
                         clusters='label',knn=30,random_seed=4,root_user=[4823],)

v0.run()


# ## Visualization and analysis
# 
# Before the subsequent analysis, we need to assign a colour to each cluster. Here we use sc.pl.embedding to colour the clusters automatically; to use your own colours, pass the `palette` parameter.
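# 
# As an alternative sketch for fixing colours by hand: scanpy reads categorical colours from `adata.uns['<key>_colors']`, a list ordered like the categories. The helper `ordered_colors`, the cluster names and the hex codes below are illustrative assumptions, not values from this dataset.

# In[ ]:


def ordered_colors(categories, color_map, default="#bdbdbd"):
    """List one colour per category, in category order; unknown categories get `default`."""
    return [color_map.get(cat, default) for cat in categories]

toy_categories = ["HSC", "Ery", "Mono"]                 # hypothetical cluster names
toy_color_map = {"HSC": "#1f77b4", "Ery": "#d62728"}    # hypothetical colours
toy_colors = ordered_colors(toy_categories, toy_color_map)

# Hypothetical usage, assuming adata.obs['label'] is categorical and
# my_color_map is a dict you define yourself:
# adata.uns['label_colors'] = ordered_colors(adata.obs['label'].cat.categories, my_color_map)
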

# In[4]:


fig, ax = plt.subplots(1,1,figsize=(4,4))
sc.pl.embedding(
    adata,
    basis="tsne",
    color=['label'],
    frameon=False,
    ncols=1,
    wspace=0.5,
    show=False,
    ax=ax
)
fig.savefig('figures/via_fig1.png',dpi=300,bbox_inches = 'tight')


# ## VIA graph
# 
# VIA offers several plotting functions for visualizing the trajectory inference results. We first show the cluster-graph level trajectory abstraction, which consists of two subplots: one coloured by annotated (true_label) composition and one by pseudotime.

# In[5]:


fig, ax, ax1 = v0.plot_piechart_graph(clusters='label',cmap='Reds',dpi=80,
                                   show_legend=False,ax_text=False,fontsize=4)
fig.savefig('figures/via_fig2.png',dpi=300,bbox_inches = 'tight')


# In[ ]:


# You can also read the pseudotime directly from `v0.model.single_cell_pt_markov`.
v0.get_pseudotime(v0.adata)
v0.adata


# ## Visualise gene/feature graph
# 
# View the gene expression along the VIA graph. We use the HNSW small-world graph already computed by VIA to accelerate the gene imputation calculations (using an approach similar to MAGIC) as follows:
# 
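# To give a feel for graph-based imputation, here is a toy sketch of diffusion smoothing over a neighbour graph. It is a simplification with uniform weights and pure-Python lists, not VIA's or MAGIC's implementation; `diffuse_expression` and the toy data are hypothetical.

# In[ ]:


def diffuse_expression(neighbors, expression, t=3):
    """Smooth one gene over a kNN graph by t rounds of neighbourhood averaging.

    neighbors[i] lists the neighbours of cell i; each cell also counts itself,
    so one round applies a row-normalized transition matrix once.
    """
    values = list(expression)
    for _ in range(t):
        values = [
            (values[i] + sum(values[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
            for i in range(len(values))
        ]
    return values

toy_neighbors = [[1], [0, 2], [1]]    # a three-cell chain: 0 - 1 - 2
toy_expression = [0.0, 3.0, 0.0]      # dropout-like zeros around one measured cell
smoothed = diffuse_expression(toy_neighbors, toy_expression, t=1)
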

# In[6]:


gene_list_magic = ['IL3RA', 'IRF8', 'GATA1', 'GATA2', 'ITGA2B', 'MPO', 'CD79B', 'SPI1', 'CD34', 'CSF1R', 'ITGAX']
fig,axs=v0.plot_clustergraph(gene_list=gene_list_magic[:4],figsize=(12,3),)
fig.savefig('figures/via_fig2_1.png',dpi=300,bbox_inches = 'tight')


# ## Trajectory projection
# 
# Visualize the overall VIA trajectory projected onto a 2D embedding (UMAP, PHATE, t-SNE, etc.) in different ways:
# 
# - Draw the high-level clustergraph abstraction onto the embedding.
# - Draw a vector field plot of the more fine-grained directionality of cells along the trajectory projected onto an embedding.
# - Draw a high-edge-resolution directed graph.

# In[7]:


fig,ax1,ax2=v0.plot_trajectory_gams(basis='tsne',clusters='label',draw_all_curves=False)
fig.savefig('figures/via_fig3.png',dpi=300,bbox_inches = 'tight')


# In[8]:


fig,ax=v0.plot_stream(basis='tsne',clusters='label',
               density_grid=0.8, scatter_size=30, scatter_alpha=0.3, linewidth=0.5)
fig.savefig('figures/via_fig4.png',dpi=300,bbox_inches = 'tight')


# In[9]:


fig,ax=v0.plot_stream(basis='tsne',density_grid=0.8, scatter_size=30, color_scheme='time', linewidth=0.5,
                             min_mass = 1, cutoff_perc = 5, scatter_alpha=0.3, marker_edgewidth=0.1,
                             density_stream = 2, smooth_transition=1, smooth_grid=0.5)
fig.savefig('figures/via_fig5.png',dpi=300,bbox_inches = 'tight')


# ## Probabilistic pathways
# 
# Visualize the probabilistic pathways from root to terminal state as indicated by the lineage likelihood. The higher the lineage likelihood, the greater the potential of that particular cell to differentiate towards the terminal state of interest.
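# 
# One way to build intuition for a lineage likelihood is the probability that a random walk started in a given state ends in a chosen terminal state. The sketch below is a toy illustration under that assumption and does not reproduce VIA's algorithm; the function name, state names and transition probabilities are all hypothetical.

# In[ ]:


def lineage_likelihood(transitions, terminal, n_iter=200):
    """Probability that a random walk from each state is absorbed at `terminal`.

    transitions[s] maps each successor of s to its transition probability;
    states without successors are treated as absorbing.
    """
    prob = {s: 1.0 if s == terminal else 0.0 for s in transitions}
    for _ in range(n_iter):
        prob = {
            s: (prob[s] if not nxt else sum(p * prob[k] for k, p in nxt.items()))
            for s, nxt in transitions.items()
        }
    return prob

toy_graph = {
    "root": {"progenitor": 1.0},
    "progenitor": {"erythroid": 0.75, "myeloid": 0.25},
    "erythroid": {},    # terminal state
    "myeloid": {},      # terminal state
}
likelihood = lineage_likelihood(toy_graph, terminal="erythroid")
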

# In[10]:


fig,axs=v0.plot_lineage_probability(figsize=(8,4),)
fig.savefig('figures/via_fig6.png',dpi=300,bbox_inches = 'tight')


# We can also restrict the plot to specific lineages with `marker_lineages`.

# In[11]:


fig,axs=v0.plot_lineage_probability(figsize=(6,3),marker_lineages = [2,3])
fig.savefig('figures/via_fig7.png',dpi=300,bbox_inches = 'tight')


# ## Gene Dynamics
# 
# The gene dynamics along pseudotime for all detected lineages are automatically inferred by VIA. These can be interpreted as the change in gene expression along any given lineage. 

# In[12]:


fig,axs=v0.plot_gene_trend(gene_list=gene_list_magic,figsize=(8,6),)
fig.savefig('figures/via_fig8.png',dpi=300,bbox_inches = 'tight')


# In[14]:


fig,ax=v0.plot_gene_trend_heatmap(gene_list=gene_list_magic,figsize=(4,4),
                          marker_lineages=[2])
fig.savefig('figures/via_fig9.png',dpi=300,bbox_inches = 'tight')

