
A Lossless Syntax Tree Generator with Zero-shot Error Correction

  • We follow jam's pretraining procedure and use the same pretraining data, except that we also use srcml when pretraining the models.
  • In the finetuning stage, we finetune our models for 3 epochs.
  • Our GitHub repo contains the code to reproduce our results using the same data.

Pretrained model parameters

| Hyperparameter | Description                  | Value  |
|----------------|------------------------------|--------|
| e              | embedding dimensions         | 1024   |
| L              | number of layers             | 24     |
| h              | attention heads              | 16     |
| c              | block size / context length  | 256    |
| b              | batch size                   | 4      |
| a              | accumulation steps           | 32     |
| r              | learning rate                | 3e-5   |
| y              | weight decay                 | 1e-5   |
| iter           | iterations                   | 570000 |
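
For reference, here is a minimal sketch of how these values map onto a nanoGPT-style training config, the kind jam-based training scripts typically use. The key names below are illustrative assumptions, not necessarily the exact identifiers in our repo.

```python
# Sketch only: nanoGPT-style config keys are assumptions, not the repo's exact names.
config = dict(
    n_embd=1024,                      # e: embedding dimensions
    n_layer=24,                       # L: number of layers
    n_head=16,                        # h: attention heads
    block_size=256,                   # c: block size / context length
    batch_size=4,                     # b: per-step batch size
    gradient_accumulation_steps=32,   # a: accumulation steps
    learning_rate=3e-5,               # r: learning rate
    weight_decay=1e-5,                # y: weight decay
    max_iters=570000,                 # iter: iterations
)
```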

Model files

| Filename         | Description                                                                          |
|------------------|--------------------------------------------------------------------------------------|
| ckpt.pt          | A model file for finetuning                                                          |
| ckpt_base.pt     | A model file for generating syntax trees with error correction in the zero-shot setting |
| ckpt_finetune.pt | A model finetuned on the syntactic error dataset                                     |
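
A hedged sketch of inspecting one of these checkpoints, assuming they are standard PyTorch checkpoint files as saved by nanoGPT-style trainers; the dictionary keys printed are whatever the trainer actually stored, not guaranteed names.

```python
import torch

# Load a checkpoint on CPU for inspection. Checkpoints saved with older
# PyTorch may need weights_only=False if they contain non-tensor objects.
ckpt = torch.load("ckpt_finetune.pt", map_location="cpu")

# Check the structure before wiring it into a model; keys such as
# 'model' or 'config' are assumptions about nanoGPT-style checkpoints.
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))
```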
  • Note that you can adjust the batch size and accumulation steps based on your GPU memory, but the product batch size × accumulation steps should remain 128.
  • If you finetune your models with multiple GPUs, you can reduce the accumulation steps proportionally. For example, if you finetune with 2 GPUs, you should halve the accumulation steps, as in the sketch below.
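
A minimal sketch of that constraint, using a hypothetical helper (not from our repo) that keeps the effective batch size at 128 as the per-GPU batch size and GPU count change:

```python
def accumulation_steps(batch_size: int, num_gpus: int, effective_batch: int = 128) -> int:
    """Return gradient-accumulation steps so that
    batch_size * num_gpus * steps == effective_batch."""
    total = batch_size * num_gpus
    assert effective_batch % total == 0, "effective batch must divide evenly"
    return effective_batch // total

print(accumulation_steps(batch_size=4, num_gpus=1))  # 32, the default above
print(accumulation_steps(batch_size=4, num_gpus=2))  # 16, halved for 2 GPUs
print(accumulation_steps(batch_size=8, num_gpus=1))  # 16, larger per-GPU batch
```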