Rsr2425 committed on
Commit 2f47bd0 · 1 Parent(s): 9178758

[HW3] Completed questions

Files changed (1): README.md (+21 -2)
README.md CHANGED
@@ -168,9 +168,9 @@ Simply put, this downloads the file as a temp file, we load it in with `TextFile
 
 Why do we want to support streaming? What about streaming is important, or useful?
 
-#### :check_mark: ANSWER #1:
+#### ✅ Answer #1:
 
-Streaming is important because it allows us to send the response to the user as soon as it is available. LLM responses can be quite slow, especially when there are a large number of input tokens. Streaming provides a much more responsive experience for the user.
+LLMs can introduce a ton of latency, especially when we're providing them with a lot of input tokens. Streaming allows us to send the response to the user as soon as it is available, which provides a much more responsive experience for the user.
 
 ### On Chat Start:
 
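The streaming rationale in Answer #1 can be sketched in framework-free Python. The async generator below is a stand-in for an LLM that produces tokens one at a time (the token strings and delays are made up for illustration); in the actual app the tokens would come from the model's streaming API:

```python
import asyncio
import time

async def fake_llm_stream():
    """Stand-in for an LLM that streams tokens with per-token latency."""
    for token in ["Streaming", " keeps", " the", " app", " responsive."]:
        await asyncio.sleep(0.05)  # simulated inference delay per token
        yield token

async def main():
    start = time.perf_counter()
    first_token_at = None
    response = ""
    async for token in fake_llm_stream():
        if first_token_at is None:
            # The user sees output here, long before the response completes.
            first_token_at = time.perf_counter() - start
        response += token  # in the real app, forward each token to the UI here
    done_at = time.perf_counter() - start
    return first_token_at, done_at, response

first_token_at, done_at, response = asyncio.run(main())
print(f"first token after {first_token_at:.2f}s; full response after {done_at:.2f}s")
```

With five tokens, the first arrives after roughly a fifth of the total wait; that gap between time-to-first-token and time-to-full-response is the responsiveness win streaming buys.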
 
@@ -214,6 +214,10 @@ Now, we'll save that into our user session!
 
 Why are we using User Session here? What about Python makes us need to use this? Why not just store everything in a global variable?
 
+#### ✅ Answer #2:
+
+We're using User Session here because we want to maintain state that is unique to each chat session. This way we can keep data specific to each user chatting with the application, without accidentally leaking (potentially private) data to other users. A global variable in Python would be shared by every session the process serves.
+
 ### On Message
 
 First, we load our chain from the user session:
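The isolation argument in Answer #2 can be illustrated without any framework. `SessionStore` below is a hypothetical stand-in for Chainlit's user session (a mapping keyed by session id), contrasted with a single module-level variable:

```python
class SessionStore:
    """Hypothetical stand-in for a per-session store like Chainlit's user session."""

    def __init__(self):
        self._data = {}  # session_id -> {key: value}

    def set(self, session_id, key, value):
        self._data.setdefault(session_id, {})[key] = value

    def get(self, session_id, key):
        return self._data.get(session_id, {}).get(key)

store = SessionStore()

# Two users chat concurrently; each gets their own chain built from their own upload.
store.set("alice-session", "chain", "chain-built-from-alices-pdf")
store.set("bob-session", "chain", "chain-built-from-bobs-pdf")

# Alice's message handler only ever sees Alice's data.
print(store.get("alice-session", "chain"))

# With a single global, Bob's upload would silently replace Alice's state:
global_chain = "chain-built-from-alices-pdf"
global_chain = "chain-built-from-bobs-pdf"  # Bob connects; Alice's chain is gone
```

The global-variable version fails exactly because one Python process serves many websocket sessions, so any shared mutable state is visible to all of them.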
 
@@ -339,6 +343,21 @@ Upload a PDF file of the recent DeepSeek-R1 paper and ask the following question
 
 Does this application pass your vibe check? Are there any immediate pitfalls you're noticing?
 
+#### ✅ Discussion Answer #1:
+1. What is RL and how does it help reasoning?
+Reinforcement Learning (RL) is a type of machine learning in which an agent learns to make decisions by taking actions in an environment to maximize some notion of cumulative reward.
+
+In the context of reasoning, RL helps improve the reasoning capabilities of language models by allowing them to learn from their interactions and the feedback they receive, rather than relying solely on supervised data. By using RL, models like DeepSeek-R1-Zero can self-evolve their reasoning capabilities through a process of trial and error. This approach has demonstrated significant effectiveness, leading to improved reasoning performance on various benchmarks. Specifically, RL encourages the development of powerful reasoning behaviors without the need for extensive manually labeled datasets, making the learning process more efficient.
+2. What is the difference between DeepSeek-R1 and DeepSeek-R1-Zero?
+The main difference between DeepSeek-R1 and DeepSeek-R1-Zero lies in their approaches to reinforcement learning and the incorporation of data. DeepSeek-R1-Zero demonstrates strong reasoning capabilities and exhibits self-evolution through reinforcement learning, but it faces issues like poor readability and language mixing. In contrast, DeepSeek-R1 builds on the strengths of DeepSeek-R1-Zero by using a small amount of high-quality data as a cold start, aiming to improve reasoning performance and create a more user-friendly model. Additionally, DeepSeek-R1 is designed to make its reasoning processes more readable and accessible to the community, addressing some of the drawbacks identified in DeepSeek-R1-Zero.
+3. What is this paper about? I don't know the answer.
+
+
+Does this application pass your vibe check? Are there any immediate pitfalls you're noticing? The app particularly struggled to answer broad, high-level questions, like what the paper as a whole is about. It did okay, but there are clearly gaps. This particular issue is likely the result of suboptimal chunking.
+
+-----------------
+
+
 ## 🚧 CHALLENGE MODE 🚧
 
 For the challenge mode, please instead create a simple FastAPI backend with a simple React (or any other JS framework) frontend.
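The "suboptimal chunking" pitfall flagged in the discussion answer can be made concrete with a minimal character-based chunker (parameter names and sizes are illustrative; production splitters such as LangChain's are more robust). Overlap reduces the chance a fact is cut in half at a chunk boundary, but note that no fixed-size chunk ever summarizes the whole document, which is one reason broad "what is this paper about?" questions fail:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks with overlap.

    Overlap keeps boundary-spanning facts intact in at least one chunk,
    one common cause of the retrieval gaps noted above.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

doc = "x" * 500
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```

Tuning `chunk_size` and `overlap` to the document's structure (sections, paragraphs) is usually the first lever to pull when retrieval quality is poor.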