Jam-Contextsum

Jam-Contextsum is a GPT-2-like model fine-tuned to generate summaries of why a method exists.

Jam-Contextsum Training Details

  • ckpt_pretrain is the pretrained checkpoint from which we fine-tune the model to generate summaries of why a method exists.
  • Our GitHub repo contains the code for reproduction using the same data.

ckpt_pretrain.pt

| Hyperparameter | Description | Value |
| --- | --- | --- |
| e | embedding dimensions | 512 |
| L | number of layers | 4 |
| h | attention heads | 4 |
| c | block size / context length | 1,024 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| d | dropout | 0.20 |
| r | learning rate | 3e-5 |
| y | iterations | 1e-5 |
| iter | number of iterations after pretraining | 137,900 |
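
As a rough illustration of how these values relate to each other, the sketch below collects a subset of the rows into a small Python config and derives a few quantities from them. The class and field names (`JamContextsumConfig`, `n_embd`, `gradient_accumulation_steps`, and so on) are hypothetical and chosen only for readability; they are not taken from the training code in the GitHub repo. The token count is an upper bound that assumes every sequence fills the whole context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JamContextsumConfig:
    """Hypothetical container for values copied from the table above."""

    n_embd: int = 512                      # e: embedding dimensions
    n_layer: int = 4                       # L: number of layers
    n_head: int = 4                        # h: attention heads
    block_size: int = 1024                 # c: block size / context length
    batch_size: int = 4                    # b: batch size
    gradient_accumulation_steps: int = 32  # a: accumulation steps
    dropout: float = 0.20                  # d: dropout
    learning_rate: float = 3e-5            # r: learning rate

    @property
    def head_dim(self) -> int:
        # Each attention head works on an equal slice of the embedding.
        return self.n_embd // self.n_head

    @property
    def sequences_per_step(self) -> int:
        # Gradients are accumulated over several micro-batches
        # before each optimizer update.
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound: assumes every sequence is exactly block_size tokens.
        return self.sequences_per_step * self.block_size


cfg = JamContextsumConfig()
print("head dimension:", cfg.head_dim)
print("sequences per optimizer step:", cfg.sequences_per_step)
print("max tokens per optimizer step:", cfg.max_tokens_per_step)
```

With the values in the table, each head operates on 128 dimensions, and one optimizer step sees 4 × 32 = 128 sequences, or at most 131,072 tokens.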