Update README.md
README.md
CHANGED
@@ -6,6 +6,28 @@ license: apache-2.0
 
 # Rationalyst (with rationales extracted from reasoning datasets)
 
-This model is a
+This model is a fine-tuned version of the [LLaMa-3-Instruct-8B](https://huggingface.co/bert-base-uncased). It was
 introduced in RATIONALYST: Supervising Reasoning via Self-Supervised Rationale Extraction. The code for the rationale extraction, model training and
 inference can be found [here](https://github.com/JHU-CLSP/reasoning_world_model).
+
+## Model description
+Implicit rationales are often embedded in the unlabelled text, reflecting the natural thought processes behind speech and writing.
+RATIONALYST is a self-supervised approach to extract and filter these implicit rationales from unlabelled text and apply
+them to supervise reasoning.
+
+## How to use
+To use it, input a question and a partial reasoning trajectory; the model will output a rationale to supervise the next reasoning step.
+
+## Training data
+
+This Rationalyst model was trained on 17566 rationales from GSM8K and 19669 rationales from ECQA. The data used can be found
+[here](https://huggingface.co/datasets/Dongwei/reasoning_world_model).
+
+
+## Evaluation results
+
+When evaluated on downstream tasks, this model achieves the following results:
+
+| Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
+|:----:|:----:|:----:|:----:|:-----:|:----:|:-----:|:----:|
+| | 76.2 | 32.5 | 76.2 | 59.4 | 90.1 | 79.3 | 32.1 |
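
The "How to use" text added in this change describes the model's input as a question plus a partial reasoning trajectory. A minimal sketch of that flow is shown below. The helper names `build_rationalyst_input` and `generate_rationale`, the "Question / Reasoning so far" prompt layout, and the sample question are all assumptions made for illustration; the README does not specify the exact input format or the model's repository id, so `model_id` is left as a parameter.

```python
from typing import List


def build_rationalyst_input(question: str, partial_steps: List[str]) -> str:
    """Join a question and the reasoning steps produced so far into one prompt.

    The "Question / Reasoning so far" layout is a hypothetical format chosen
    for illustration; it is not taken from the README.
    """
    lines = [f"Question: {question.strip()}", "Reasoning so far:"]
    for index, step in enumerate(partial_steps, start=1):
        lines.append(f"{index}. {step.strip()}")
    return "\n".join(lines) + "\n"


def generate_rationale(model_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a rationale for the next reasoning step with Hugging Face transformers."""
    # Imported lazily so the prompt helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Hypothetical example input: one question and one completed reasoning step.
prompt = build_rationalyst_input(
    "Tom has 3 boxes with 4 apples each. How many apples does he have?",
    ["Each box holds 4 apples."],
)
```

Calling `generate_rationale(model_id, prompt)` with the repository id of this model would then return the generated rationale text; the prompt helper is kept separate so the input format can be adjusted to whatever the released training code actually expects.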