Crystalcareai committed
Commit 25213ca (verified) · 1 parent: 9022eb4

Update README.md

Files changed (1)
  1. README.md +9 -12
README.md CHANGED
@@ -38,19 +38,16 @@ The model's deep understanding of SEC filings and related financial data makes i

  To ensure the robustness and effectiveness of Llama-3-SEC, the model has undergone rigorous evaluation on both domain-specific and general benchmarks. Key evaluation metrics include:

- - Domain-specific perplexity, measuring the model's performance on SEC-related data
+ <table>
+ <tr>
+ <td><img src="https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png" alt="Domain Specific Perplexity of Model Variants" width="300"></td>
+ <td><img src="https://i.ibb.co/2v6PdDx/Screenshot-2024-06-11-at-10-25-03-PM.png" alt="Domain Specific Evaluations of Model Variants" width="300"></td>
+ </tr>
+ <tr>
+ <td colspan="2" style="text-align:center;"><img src="https://i.ibb.co/K5d0wMh/Screenshot-2024-06-11-at-10-23-18-PM.png" alt="General Evaluations of Model Variants" width="600"></td>
+ </tr>
+ </table>

- ![Domain Specific Perplexity of Model Variants](https://i.ibb.co/xGHRfLf/Screenshot-2024-06-11-at-10-23-59-PM.png)
-
- - Extractive numerical reasoning tasks, using subsets of TAT-QA and ConvFinQA datasets
-
- ![Domain Specific Evaluations of Model Variants](https://i.ibb.co/2v6PdDx/Screenshot-2024-06-11-at-10-25-03-PM.png)
-
- - General evaluation metrics, such as BIG-bench, AGIEval, GPT4all, and TruthfulQA, to assess the model's performance on a wide range of tasks
-
- ![General Evaluations of Model Variants](https://i.ibb.co/K5d0wMh/Screenshot-2024-06-11-at-10-23-18-PM.png)
-
- - General perplexity on various datasets, including bigcode/starcoderdata, open-web-math/open-web-math, allenai/peS2o, mattymchen/refinedweb-3m, and Wikitext

  The evaluation results demonstrate significant improvements in domain-specific performance while maintaining strong general capabilities, thanks to the use of advanced CPT and model merging techniques.