license: mit
datasets:
- ammarnasr/the-stack-java-clean
library_name: adapter-transformers
tags:
- code
pipeline_tag: text-generation
language:
- code
CodeGen (CodeGen-Mono 350M LoRa Java)
Model description
CodeGen LoRa Java is a family of autoregressive language models fine-tuned using LoRa on Different Programming Langauges.
Training data
This model was fine-tuned on the cleaned Java subset from TheStack Avilable here. The data consists of 1 Million Java code files.
Training procedure
This model was fine-tuned using LoRa on 1 T4 GPU. The model was trained for 10,000 steps with batch size of 4. The model was trained using causal language modeling loss.
Evaluation results
We evaluate our models on the MultiPle-E bencchmark. The model achieves 8.9 Pass@10 Rate. 
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, model_name)
model.print_trainable_parameters()
text = "public static void main(String[] args) {"
input_ids = tokenizer.encode(text, return_tensors="pt")
generated_ids = model.generate(input_ids=input_ids, max_length=100)
print('Generated: \n')
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
BibTeX entry and citation info
@article{Nijkamp2022ACP,
title={A Conversational Paradigm for Program Synthesis},
author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
journal={arXiv preprint},
year={2022}
}