Why do you use pass@10 to test coding performance...

#4
by Leon-Leee - opened

... while mainstream benchmarks usually use greedy decoding and pass@1?

@Leon-Leee we were mostly matching previous Tulu papers. Agreed that the community has moved a bunch, so we'll probably update in the future!
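For readers comparing the two metrics, here is a minimal sketch of why pass@10 and pass@1 numbers are not directly comparable. It assumes the widely used unbiased pass@k estimator (draw n samples per problem, count the c that pass the tests), which may not be exactly how the evaluation discussed here computes it. The function name `pass_at_k` and the sample counts are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed.

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that
    a random size-k subset of the n samples contains no passing sample.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 whenever
    # every size-k subset must contain at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: 20 sampled completions, 3 of them pass the tests.
p1 = pass_at_k(n=20, c=3, k=1)    # equals c / n
p10 = pass_at_k(n=20, c=3, k=10)  # much higher for the same model
print(f"pass@1 = {p1:.3f}, pass@10 = {p10:.3f}")
```

With sampling, pass@1 under this estimator reduces to c / n. Greedy-decoding pass@1 is a different measurement again: it scores a single deterministic completion per problem, so it does not need n samples at all.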

natolambert changed discussion status to closed
