Commit: 8a75988
Parent(s): 32b0d27
Update README.md
README.md CHANGED
@@ -18,6 +18,10 @@ pinned: true
 
 Rogue GPT is an attempt not only to instruction-tune LLM-powered agents (treating LLMs as reasoning engines) for tasks in the MiniHack environment, but also to explore reinforcement learning and continuous learning for embodied agents inside environments, using only LLMs so that the lessons learned can be abstracted to other modalities.
 
+I want to use small LLMs and a focused dataset so I can get a clear picture of how the moving parts perform and what data is necessary beyond broad general knowledge. My working assumption is that a carefully curated dataset, tailored specifically to continuous learning in an embodied agent, can yield desirable results even in models with fewer than a billion parameters.
+
+My rough strategy is to use the Tiny Stories dataset, a trajectory dataset containing only the human Monk trajectories, and selected categories from the NetHack Wiki. I plan to run ablations to see which datasets are critical. Once basic instruction tuning works, so that the agent can follow short, simple instructions, I will attempt to implement some combination of ideas from papers I have been interested in.
+
 ## Justifications for the Datasets
 
 ### Tiny Stories Dataset