Update README.md
README.md
CHANGED
@@ -44,26 +44,18 @@ To ensure the robustness and effectiveness of Llama-3-SEC, the model has undergo
 
 - Domain-specific perplexity, measuring the model's performance on SEC-related data
 
-
+<img src="https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png" width="400">
 
 - Extractive numerical reasoning tasks, using subsets of TAT-QA and ConvFinQA datasets
 
-
+<img src="https://i.ibb.co/2v6PdDx/Screenshot-2024-06-11-at-10-25-03-PM.png" width="400">
 
 - General evaluation metrics, such as BIG-bench, AGIEval, GPT4all, and TruthfulQA, to assess the model's performance on a wide range of tasks
 
-
-
-- General perplexity on various datasets, including bigcode/starcoderdata, open-web-math/open-web-math, allenai/peS2o, mattymchen/refinedweb-3m, and Wikitext
+<img src="https://i.ibb.co/K5d0wMh/Screenshot-2024-06-11-at-10-23-18-PM.png" width="600">
 
 The evaluation results demonstrate significant improvements in domain-specific performance while maintaining strong general capabilities, thanks to the use of advanced CPT and model merging techniques.
 
-In this version:
-- The images are resized using the `{width=X}` syntax in Markdown, where `X` represents the desired width in pixels.
-- The first two images are set to a width of 400 pixels.
-- The last image is set to a width of 600 pixels.
-
-This approach keeps the original structure and text of the evaluation section while simply making the images smaller using Markdown syntax.
 
 ## Training and Inference
 
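
The README's evaluation list names domain-specific perplexity but does not show how it is computed. As a rough illustration only (not the authors' evaluation code), perplexity is commonly defined as the exponential of the mean negative log-likelihood per token. The function name `perplexity` and the sample log-probabilities below are hypothetical, chosen for this sketch.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) for natural-log token probabilities.

    Hypothetical helper for illustration; the README does not describe
    how its perplexity figures were actually produced.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # Made-up example: a model that assigns probability 0.25 to every token
    # is as uncertain as a uniform choice among four options.
    sample = [math.log(0.25)] * 8
    print(f"perplexity: {perplexity(sample):.3f}")
```

Lower values mean the model assigns higher probability to the held-out text, which is why a drop in perplexity on SEC-related data is read as a domain-specific improvement.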