A Lossless Syntax Tree Generator with Zero-shot Error Correction
- We follow jam's pretraining procedure and use the same pretraining data, except that we also use srcML when pretraining the models (a minimal sketch of producing srcML output follows this list).
- In the finetuning stage, we finetune our models for 3 epochs.
- Our GitHub repo contains the code for reproducing our results with the same data.
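Below is a minimal sketch of converting a source file into srcML markup for pretraining data. It assumes the `srcml` command-line tool is installed and on the PATH; the file name is only an illustrative example, not a file from our repo.

```python
import subprocess

def to_srcml(source_path: str) -> str:
    """Convert a source file into srcML markup (XML) using the srcml CLI.

    Assumes the `srcml` command-line tool is installed; with no output
    option, srcml writes the XML markup to stdout.
    """
    result = subprocess.run(
        ["srcml", source_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

if __name__ == "__main__":
    xml = to_srcml("Example.java")  # hypothetical input file
    print(xml[:200])                # preview the start of the srcML output
```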
Pretrained model parameters
| Hyperparameter | Description | Value |
|---|---|---|
| e | embedding dimensions | 1024 |
| L | number of layers | 24 |
| h | attention heads | 16 |
| c | block size / context length | 256 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| r | learning rate | 3e-5 |
| y | weight decay | 1e-5 |
| iter | iterations | 570000 |
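As a rough illustration, the table above corresponds to a nanoGPT-style training configuration like the sketch below. The attribute names are assumptions for illustration; check the config files in our repo for the exact names used in pretraining and finetuning.

```python
# Hypothetical nanoGPT-style configuration mirroring the hyperparameter table.
config = dict(
    n_embd=1024,                     # e: embedding dimensions
    n_layer=24,                      # L: number of layers
    n_head=16,                       # h: attention heads
    block_size=256,                  # c: block size / context length
    batch_size=4,                    # b: per-step batch size
    gradient_accumulation_steps=32,  # a: accumulation steps
    learning_rate=3e-5,              # r: learning rate
    weight_decay=1e-5,               # y: weight decay
    max_iters=570_000,               # iter: iterations
)
```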
Model files
| Filename | Description |
|---|---|
| ckpt.pt | A model file for finetuning |
| ckpt_base.pt | A model file for generating syntax trees with error correction in a zero-shot setting |
| ckpt_finetune.pt | A model finetuned on the syntactic error dataset |
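The sketch below shows one way to inspect and load one of these checkpoints with PyTorch. The `"model"` key follows the nanoGPT checkpoint convention and is an assumption; print the checkpoint keys to see the actual layout.

```python
import torch

# Minimal sketch of loading a checkpoint such as ckpt_base.pt.
# On recent PyTorch versions you may need weights_only=False to load a
# full training checkpoint rather than a bare state dict.
checkpoint = torch.load("ckpt_base.pt", map_location="cpu")
print(checkpoint.keys())

# Fall back to treating the file as a raw state dict if "model" is absent.
state_dict = checkpoint.get("model", checkpoint)
num_params = sum(v.numel() for v in state_dict.values() if torch.is_tensor(v))
print(f"parameters in state dict: {num_params:,}")
```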
- Note that you can adjust the batch size and accumulation steps based on your GPU memory, but the product batch size * accumulation steps should remain 128.
- If you finetune your models with multiple GPUs, reduce the accumulation steps accordingly. For example, with 2 GPUs you should halve the accumulation steps (see the sketch below).
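As a quick sanity check, the helper below (hypothetical, not part of our repo) computes the accumulation steps needed to keep the effective batch size at 128 for a given per-GPU batch size and GPU count.

```python
def accumulation_steps(batch_size: int, num_gpus: int, effective_batch: int = 128) -> int:
    """Accumulation steps so that batch_size * steps * num_gpus == effective_batch."""
    steps, remainder = divmod(effective_batch, batch_size * num_gpus)
    assert remainder == 0, "effective batch must be divisible by batch_size * num_gpus"
    return steps

print(accumulation_steps(batch_size=4, num_gpus=1))  # 32, as in the table above
print(accumulation_steps(batch_size=4, num_gpus=2))  # 16, i.e. halved for 2 GPUs
```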