Commit 82370ff
1 Parent(s): 4652972
Update app.py
app.py CHANGED
@@ -168,11 +168,27 @@ st.latex("A \in \mathbb{R}^{MxK}, B \in R^{KxN}, C \in \mathbb{R}^{MxN}")
 
 st.markdown('''
 To execute this operation on the GPU, we need to
-
-
-
+1. Read A, B from memory
+2. Perform math operations
+3. Write C to memory
 ''')
 
+st.latex('''
+For float16 operations (2 bytes), we can estimate the memory access time of A as follows:
+T_mem(A) = 2*M*K / BW_mem
+where BW_mem is the memory bandwidth of the GPU (e.g. 1935 GB/s for A100)
+''')
+
+st.latex('''
+For float16 operations (2 bytes), we can estimate the memory access time of A as follows:
+T_mem(A) = 2*M*K / BW_mem
+where BW_mem is the memory bandwidth of the GPU (e.g. 1935 GB/s for A100)
+''')
+
+
+
+
+
 
 breakdown = st.checkbox("Show breakdown per operation")
 if breakdown:
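The estimate added in this hunk, T_mem(A) = 2*M*K / BW_mem, can be checked numerically. The following is a minimal sketch, not code from app.py: the function names `mem_time_seconds` and `matmul_mem_times` are hypothetical, the bandwidth is assumed to be in decimal GB/s (1 GB = 1e9 bytes), and applying the same formula to B and C (including using the same bandwidth for the write of C) is an assumption that follows the three steps listed in the diff.

```python
# Sketch of the memory-time estimate from the hunk: T_mem(A) = 2*M*K / BW_mem.
# Names and the extension to B and C are illustrative assumptions, not app.py code.

BYTES_PER_FLOAT16 = 2


def mem_time_seconds(rows: int, cols: int, bw_gb_per_s: float,
                     bytes_per_element: int = BYTES_PER_FLOAT16) -> float:
    """Time to move a rows x cols matrix at the given bandwidth.

    Bandwidth is taken in GB/s with 1 GB = 1e9 bytes (assumption).
    """
    if bw_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return bytes_per_element * rows * cols / (bw_gb_per_s * 1e9)


def matmul_mem_times(m: int, k: int, n: int, bw_gb_per_s: float) -> dict:
    """Apply the same estimate to reading A (MxK), reading B (KxN), writing C (MxN)."""
    return {
        "read_A": mem_time_seconds(m, k, bw_gb_per_s),
        "read_B": mem_time_seconds(k, n, bw_gb_per_s),
        "write_C": mem_time_seconds(m, n, bw_gb_per_s),
    }


if __name__ == "__main__":
    # 1935 GB/s is the A100 figure quoted in the diff.
    times = matmul_mem_times(m=4096, k=4096, n=4096, bw_gb_per_s=1935.0)
    for name, seconds in times.items():
        print(f"{name}: {seconds * 1e6:.2f} us")
```

With M = K = 4096 and the 1935 GB/s figure, reading A moves about 33.5 MB and comes out at roughly 17 microseconds under these assumptions. This is a peak-bandwidth lower bound; it ignores caching, latency, and any overlap with compute.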