Commit History

mistral 5-shot
5158692

inflaton committed on

counted few-shot prompts for all models
a8683cf

dh-mc committed on

o1-preview 20-shot
0baa6cc

inflaton committed on

Update eval-mgtv-shots_4bit.sh
492d1d4

dh-mc committed on

log
fe51ea8

inflaton committed on

o1-preview 5-shot
f2a583b

inflaton committed on

o1-mini 5/20 shots results
9042941

inflaton committed on

try 5-shot for open source models
d2150e8

dh-mc committed on

o1-preview 0-shot
545719f

inflaton committed on

o1-mini 0-shot
16adfc9

inflaton committed on

o1-preview 10-shot
6838eea

inflaton committed on

ready to run 10-shots for 70/72B models
809e98c

dh-mc committed on

10-shot results ready for 7/8 B models
3db2ae5

dh-mc committed on

logs/internlm2_5-20b-chat_tune_and_few_shots.txt
d8cfffb

inflaton committed on

10-shot results
6bc1181

inflaton committed on

internlm 20b results
47d6ce1

inflaton committed on

0-shot notebook
5b276b0

dh-mc committed on

Create eval-mgtv-internlm-20b.sh
75c4663

dh-mc committed on

mistral 10-shot
33cd694

dh-mc committed on

rtx4090 0-shot
d028752

dh-mc committed on

ready for few shots eval
cf912f1

dh-mc committed on

claude 0-shot
397a2fa

inflaton committed on

added original data from MGTV challenge
5f9686b

dh-mc committed on

https://github.com/mazzzystar/TurtleBenchmark
444a581

dh-mc committed on

compare o1 vs gpt-4o
4cd13da

dh-mc committed on

o1-mini analyzed
f1b0a53

dh-mc committed on

o1-mini results
fd14581

inflaton committed on

LogiQA2.0 dataset
bf13772

dh-mc committed on

openai batch
921fa92

dh-mc committed on

Create 04e_OpenAI_comparison.ipynb
2bb5512

dh-mc committed on

internlm_v2 results
83818dc

inflaton committed on

internlm2_5-7b-chat fine-tune results
e4bce5e

inflaton committed on

added scripts/eval-mgtv-internlm_v2.sh
71dcee7

inflaton committed on

Update 04_Few-shot_Prompting_OpenAI.ipynb
8e678e8

dh-mc committed on

ready for fine-tuning internlm2_5-20b-chat
62c2b84

dh-mc committed on

saved best results/metrics
573f5d1

dh-mc committed on

completed eval/analysis
468b88d

dh-mc committed on

qwen2-72b full results
6e932d8

inflaton committed on

openai zero-shot results
8b9bb19

inflaton committed on

Update eval_logical_reasoning_all_epochs.py
090acf8

dh-mc committed on

change BATCH_SIZE to 1 for qwen2-72b eval
4c31851

dh-mc committed on

open source LLM results almost done
5a8f8d2

dh-mc committed on

llama3.1-70b done
5dc41da

inflaton committed on

mistral updated
a9f4f1f

dh-mc committed on

llama-3.1-70b wip
60dc2c4

inflaton committed on

llama-3.1-70b wip
717ab95

inflaton committed on

mistral wip
9129c41

dh-mc committed on

llama3.1-70b wip
e5b5f58

inflaton committed on

Update llm_utils.py
71af822

dh-mc committed on

mistral complete
1e26971

dh-mc committed on