davidkim205
committed on
Commit · 40aaaea
1 Parent(s): e3d48b3
Update README.md
README.md
CHANGED
@@ -31,6 +31,7 @@ This study addresses these challenges by introducing a multi-task instruction te
 Refer https://github.com/davidkim205/komt
 
 ## Evaluation
+For objective model evaluation, we initially used EleutherAI's lm-evaluation-harness but obtained unsatisfactory results. Consequently, we conducted evaluations using ChatGPT, a widely used model, as described in [Self-Alignment with Instruction Backtranslation](https://arxiv.org/pdf/2308.06502.pdf) and [Three Ways of Using Large Language Models to Evaluate Chat](https://arxiv.org/pdf/2308.06259.pdf).
 
 | model | score | average(0~5) | percentage |
 | --------------------------------------- | ------- | ------------ | ---------- |
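
The table columns in this hunk (`score`, `average(0~5)`, `percentage`) suggest that per-answer ratings from the ChatGPT judge are aggregated per model. The commit does not state the formula, so the following is only a minimal sketch under assumptions: each answer gets one rating on the 0~5 scale, `score` is the sum of ratings, `average` is the mean, and `percentage` is the mean relative to the maximum rating. The function name `summarize_ratings` and the constant `MAX_RATING` are hypothetical and not taken from the komt repository.

```python
from typing import Dict, Sequence

# Assumption: ratings use the 0~5 scale named in the table header.
MAX_RATING = 5.0


def summarize_ratings(ratings: Sequence[float],
                      max_rating: float = MAX_RATING) -> Dict[str, float]:
    """Aggregate per-answer judge ratings into the three table columns.

    Hypothetical helper: the aggregation rule is assumed, not confirmed
    by the README or the commit.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 0 or r > max_rating for r in ratings):
        raise ValueError("every rating must lie in [0, max_rating]")

    total = float(sum(ratings))            # assumed meaning of "score"
    average = total / len(ratings)         # "average(0~5)"
    percentage = 100.0 * average / max_rating  # "percentage"
    return {"score": total, "average": average, "percentage": percentage}


if __name__ == "__main__":
    # Illustrative ratings only, not real evaluation results.
    print(summarize_ratings([5, 4, 3, 0]))
```

If the real evaluation weights answers differently or normalizes by a different maximum, only the three assignment lines would need to change.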