Crystalcareai committed
Commit ff7b4af · verified · 1 Parent(s): c6b7bd7

Update README.md

Files changed (1): README.md (+3, -11)
README.md CHANGED
@@ -44,26 +44,18 @@ To ensure the robustness and effectiveness of Llama-3-SEC, the model has undergo

  - Domain-specific perplexity, measuring the model's performance on SEC-related data

- ![Domain Specific Perplexity of Model Variants](https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png){width=400}
+ <img src="https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png" width="400">

  - Extractive numerical reasoning tasks, using subsets of TAT-QA and ConvFinQA datasets

- ![Domain Specific Evaluations of Model Variants](https://i.ibb.co/2v6PdDx/Screenshot-2024-06-11-at-10-25-03-PM.png){width=400}
+ <img src="https://i.ibb.co/2v6PdDx/Screenshot-2024-06-11-at-10-25-03-PM.png" width="400">

  - General evaluation metrics, such as BIG-bench, AGIEval, GPT4all, and TruthfulQA, to assess the model's performance on a wide range of tasks

- ![General Evaluations of Model Variants](https://i.ibb.co/K5d0wMh/Screenshot-2024-06-11-at-10-23-18-PM.png){width=600}
-
- - General perplexity on various datasets, including bigcode/starcoderdata, open-web-math/open-web-math, allenai/peS2o, mattymchen/refinedweb-3m, and Wikitext
+ <img src="https://i.ibb.co/K5d0wMh/Screenshot-2024-06-11-at-10-23-18-PM.png" width="600">

  The evaluation results demonstrate significant improvements in domain-specific performance while maintaining strong general capabilities, thanks to the use of advanced CPT and model merging techniques.

- In this version:
- - The images are resized using the `{width=X}` syntax in Markdown, where `X` represents the desired width in pixels.
- - The first two images are set to a width of 400 pixels.
- - The last image is set to a width of 600 pixels.
-
- This approach keeps the original structure and text of the evaluation section while simply making the images smaller using Markdown syntax.

  ## Training and Inference
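
Note on the change: the previous revision sized these images with Pandoc-style attribute syntax (`{width=X}`), which CommonMark-based renderers such as the Hub's typically do not interpret and instead display as literal text; the new revision switches to raw HTML `<img>` tags, which renderers that allow inline HTML honor. A minimal before/after for the first figure, taken directly from the hunk above:

```markdown
<!-- Before: Pandoc-style attribute, not part of CommonMark -->
![Domain Specific Perplexity of Model Variants](https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png){width=400}

<!-- After: inline HTML with an explicit pixel width -->
<img src="https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png" width="400">
```

The deleted "In this version: ..." lines read as editing notes about the earlier resize rather than README content, which is presumably why this commit removes them as well.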
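
Since domain-specific perplexity is the headline metric in the hunk's context lines, a brief definitional aside may help: for a token sequence $x_{1:N}$, perplexity is the exponentiated average negative log-likelihood under the model $p_\theta$, so lower is better:

$$
\mathrm{PPL}(x_{1:N}) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p_\theta(x_i \mid x_{<i})\right)
$$

The domain-specific variant evaluates this quantity on SEC-related text, while the general variant (the bullet this commit removes) evaluates it on broader corpora such as bigcode/starcoderdata, open-web-math, peS2o, refinedweb-3m, and Wikitext.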