Dongwei committed (verified)
Commit 4d16990 · 1 parent: 74d3f8c

Update README.md

Files changed (1): README.md (+23 −1)
README.md CHANGED

@@ -6,6 +6,28 @@ license: apache-2.0

# Rationalyst (with rationales extracted from reasoning datasets)

- This model is a distilled version of the [LLaMa-3-Instruct-8B](https://huggingface.co/bert-base-uncased). It was
+ This model is a fine-tuned version of the [LLaMa-3-Instruct-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct). It was
introduced in RATIONALYST: Supervising Reasoning via Self-Supervised Rationale Extraction. The code for the rationale extraction, model training and
inference can be found [here](https://github.com/JHU-CLSP/reasoning_world_model).
+
+ ## Model description
+ Implicit rationales are often embedded in unlabelled text, reflecting the natural thought processes behind speech and writing.
+ RATIONALYST is a self-supervised approach that extracts and filters these implicit rationales from unlabelled text and applies
+ them to supervise reasoning.
+
+ ## How to use
+ Input a question and a partial reasoning trajectory, and the model will output a rationale to supervise the next reasoning step.
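As a rough illustration of this interface, the sketch below shows how such an input might be assembled. The helper name and the plain-text prompt layout are assumptions for illustration, not taken from the released code:

```python
# Minimal sketch of the input format, assuming the model expects the question
# followed by the partial reasoning trajectory as plain text.
# `build_prompt` is an illustrative helper, not part of the released code.

def build_prompt(question: str, partial_trajectory: str) -> str:
    """Concatenate a question with the reasoning steps generated so far."""
    return question.strip() + "\n" + partial_trajectory.strip()


prompt = build_prompt(
    "Natalia sold 48 clips in April and half as many in May. How many in total?",
    "Step 1: In May she sold 48 / 2 = 24 clips.",
)
# Pass `prompt` to the fine-tuned model (e.g. with transformers' `generate()`);
# the decoded continuation is the rationale that supervises the next step.
print(prompt)
```

The decoded rationale can then be appended to the trajectory before generating the next reasoning step.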
+
+ ## Training data
+
+ This Rationalyst model was trained on 17,566 rationales from GSM8K and 19,669 rationales from ECQA. The data can be found
+ [here](https://huggingface.co/datasets/Dongwei/reasoning_world_model).
+
+ ## Evaluation results
+
+ When evaluated on downstream tasks, this model achieves the following results:
+
+ | Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
+ |:----:|:-----:|:----:|:----:|:---------:|:-----------:|:---:|:--------:|
+ | Score | 76.2 | 32.5 | 76.2 | 59.4 | 90.1 | 79.3 | 32.1 |