Update README.md
README.md
CHANGED
@@ -116,7 +116,7 @@ The ICE methodology provides metrics for Usefulness and Functional Correctness a
 * Functional Correctness: An LLM that has complex reasoning capabilities is utilized to conduct unit tests while considering the given question and the reference code.


-We utilized GPT4 to measure the above metrics and provide a score from 0-4. This is the test dataset
+We utilized GPT4 to measure the above metrics and provide a score from 0-4. This is the [test dataset](https://drive.google.com/file/d/1R6DDyBhcR6TSUYFTgUosJxrvibkR1BHC/view) and [Jupyter notebook] (https://colab.research.google.com/drive/1USuNLFxLex-C5tLHYET_nQfpM4ALCbc5?usp=sharing#scrollTo=lNCZTBj1nBsJ) we used to perform the benchmark.

 You can read more about the ICE methodology in this paper.