Commits · inflaton-ai/logical-reasoning

mistral 5-shot

5158692

inflaton committed on Sep 20, 2024

counted few-shot prompts for all models

a8683cf

dh-mc committed on Sep 20, 2024

o1-preview 20-shot

0baa6cc

inflaton committed on Sep 19, 2024

Update eval-mgtv-shots_4bit.sh

492d1d4

dh-mc committed on Sep 19, 2024

log

fe51ea8

inflaton committed on Sep 19, 2024

o1-preview 5-shot

f2a583b

inflaton committed on Sep 19, 2024

o1-mini 5/20 shots results

9042941

inflaton committed on Sep 19, 2024

try 5-shot for open source models

d2150e8

dh-mc committed on Sep 18, 2024

o1-preview 0-shot

545719f

inflaton committed on Sep 18, 2024

o1-mini 0-shot

16adfc9

inflaton committed on Sep 16, 2024

o1-preview 10-shot

6838eea

inflaton committed on Sep 16, 2024

ready to run 10-shots for 70/72B models

809e98c

dh-mc committed on Sep 16, 2024

10-shot results ready for 7/8 B models

3db2ae5

dh-mc committed on Sep 16, 2024

logs/internlm2_5-20b-chat_tune_and_few_shots.txt

d8cfffb

inflaton committed on Sep 16, 2024

10-shot results

6bc1181

inflaton committed on Sep 15, 2024

internlm 20b results

47d6ce1

inflaton committed on Sep 15, 2024

0-shot notebook

5b276b0

dh-mc committed on Sep 15, 2024

Create eval-mgtv-internlm-20b.sh

75c4663

dh-mc committed on Sep 15, 2024

mistral 10-shot

33cd694

dh-mc committed on Sep 15, 2024

rtx4090 0-shot

d028752

dh-mc committed on Sep 15, 2024

ready for few shots eval

cf912f1

dh-mc committed on Sep 14, 2024

claude 0-shot

397a2fa

inflaton committed on Sep 14, 2024

added original data from MGTV challenge

5f9686b

dh-mc committed on Sep 14, 2024

https://github.com/mazzzystar/TurtleBenchmark

444a581

dh-mc committed on Sep 14, 2024

compare o1 vs gpt-4o

4cd13da

dh-mc committed on Sep 14, 2024

o1-mini analyzed

f1b0a53

dh-mc committed on Sep 13, 2024

o1-mini results

fd14581

inflaton committed on Sep 13, 2024

LogiQA2.0 dataset

bf13772

dh-mc committed on Sep 13, 2024

openai batch

921fa92

dh-mc committed on Sep 13, 2024

Create 04e_OpenAI_comparison.ipynb

2bb5512

dh-mc committed on Sep 13, 2024

internlm_v2 results

83818dc

inflaton committed on Sep 13, 2024

internlm2_5-7b-chat fine-tune results

e4bce5e

inflaton committed on Sep 13, 2024

added scripts/eval-mgtv-internlm_v2.sh

71dcee7

inflaton committed on Sep 13, 2024

Update 04_Few-shot_Prompting_OpenAI.ipynb

8e678e8

dh-mc committed on Sep 12, 2024

ready for fine-tuning internlm2_5-20b-chat

62c2b84

dh-mc committed on Sep 12, 2024

saved best results/metrics

573f5d1

dh-mc committed on Sep 12, 2024

completed eval/analysis

468b88d

dh-mc committed on Sep 12, 2024

qwen2-72b full results

6e932d8

inflaton committed on Sep 11, 2024