Why do you use pass@10 to test coding performance...
#4 opened by Leon-Leee
... while mainstream benchmarks usually use greedy decoding and pass@1?
@Leon-Leee we were mostly matching previous Tulu papers. Agreed that the community has moved on quite a bit, so we'll probably update in the future!
natolambert changed discussion status to closed
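
For readers comparing the two settings, here is a minimal sketch of the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of sampled completions per problem and c is the number that pass the tests. This is an illustration only: the function name `pass_at_k` is made up here, and the thread does not confirm that the evaluation in question computes the metric exactly this way.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Hypothetical helper for illustration; not taken from any evaluation code
    referenced in this thread.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k incorrect samples: every size-k subset contains a correct one.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains at least one correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Greedy decoding yields a single deterministic sample, so pass@1 reduces to n = 1, k = 1.
greedy_pass_at_1 = pass_at_k(n=1, c=1, k=1)

# With sampling, the same per-sample success rate gives a higher pass@10 than pass@1.
sampled_pass_at_1 = pass_at_k(n=20, c=5, k=1)
sampled_pass_at_10 = pass_at_k(n=20, c=5, k=10)
```

Because pass@10 only needs one of ten samples to succeed, it is always at least as high as pass@1 for the same model, which is why the two numbers are not directly comparable across benchmarks.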