Jam_so

Jam_so is a GPT2-like model for research on fine-grained analysis of Java source code at the level of methods, statements, and variables. It is intended as a foundation for downstream tasks such as code completion, comment generation, and automated bug repair.


Jam_so Training Details

  • We train the jam_so model using the training procedures from Daniel Grittner's NanoGPT-LoRA.

  • We train the model on our own so13m dataset, processed from 13 million StackOverflow posts drawn from a Stack Exchange data dump covering January 2014 through December 2022.

  • We train the model on the training set for 1 epoch, roughly 300,000 training iterations.

  • Our GitHub repo contains the code for re-training using the raw data.

Hyperparameter   Description                   Value
e                embedding dimensions          1024
L                number of layers              24
h                attention heads               16
c                block size / context length   256
b                batch size                    4
a                accumulation steps            32
d                dropout                       0.20
r                learning rate                 3e-5
y                weight decay                  1e-1

We train our models using a single NVIDIA A5000 GPU.
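
The table above maps onto a nanoGPT-style training configuration roughly as shown below. This is a minimal sketch: the variable names follow Karpathy's nanoGPT, which NanoGPT-LoRA builds on, and the actual configuration file in our GitHub repo may use different names or additional settings.

```python
# Sketch of a nanoGPT-style config matching the hyperparameter table above.
# Names follow nanoGPT conventions; the actual Jam config may differ.

# model geometry
n_embd = 1024                      # e: embedding dimensions
n_layer = 24                       # L: number of transformer layers
n_head = 16                        # h: attention heads
block_size = 256                   # c: context length in tokens
dropout = 0.20                     # d: dropout probability

# optimization
batch_size = 4                     # b: micro-batch size per GPU step
gradient_accumulation_steps = 32   # a: effective batch = 4 * 32 sequences
learning_rate = 3e-5               # r: peak learning rate
weight_decay = 1e-1                # y: AdamW weight decay
max_iters = 300_000                # roughly 1 epoch over so13m
```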


Jam Projects

Current projects using the jam_so pre-trained model can be found at our GitHub repository:

https://github.com/apcl-research/jam
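
To experiment with the pre-trained weights directly, the sketch below downloads them from the Hugging Face Hub. It assumes the checkpoint is published as a single nanoGPT-style ckpt.pt file in the apcl/jam_so repository; the filename is illustrative, so check the repository above for the exact file name and loading code.

```python
# Minimal sketch: fetch the pre-trained Jam_so checkpoint from the HF Hub.
# The filename "ckpt.pt" is an assumption (nanoGPT convention) and may differ.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(repo_id="apcl/jam_so", filename="ckpt.pt")
print(f"Checkpoint downloaded to: {ckpt_path}")
```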
