roberta_python
language: code
datasets:
- code_search_net
- Fraser/python-lines
tags:
- python
- code
- masked-lm
widget:
- text "assert 6 == sum([i for i in range(
Details
This is a RoBERTa-base model trained on the Python portion of CodeSearchNet. It reached a dev perplexity of 3.296.
This model was used for the enumerative solver baseline described in the Programming Puzzles paper.
See also the Python Programming Puzzles (P3) Repository for more details.
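For readers who want to relate the reported dev perplexity to a loss value, here is a minimal sketch. It assumes the usual definition of perplexity as the exponential of the mean per-token cross-entropy (in nats); the card does not state how the figure was computed, and the function names are illustrative only.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity under the assumed definition: exp(mean cross-entropy in nats)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse mapping: the mean cross-entropy implied by a perplexity value."""
    return math.log(perplexity)


# Under this assumption, the reported perplexity implies the following loss.
implied_loss = loss_from_perplexity(3.296)
print(f"implied mean cross-entropy: {implied_loss:.3f} nats")
```

Under that assumption, a perplexity of 3.296 corresponds to a mean cross-entropy of roughly 1.19 nats per predicted token.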
Usage
You can either load the model and further fine-tune it for a target task (as done for the puzzle solver), or you can experiment with mask-filling directly with this model as in the following example:
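from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Load the tokenizer and masked-LM weights from the Hugging Face Hub.
# AutoModelForMaskedLM replaces the deprecated AutoModelWithLMHead.
tokenizer = AutoTokenizer.from_pretrained("tals/roberta_python")
model = AutoModelForMaskedLM.from_pretrained("tals/roberta_python")
demo = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# The model predicts the token at the <mask> position.
code = """sum= 0
for i in range(<mask>):
    sum += i
assert sum == 6
"""
demo(code)  # returns candidate fills ranked by score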
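The fill-mask pipeline returns a list of dictionaries, each with a `token_str` and a `score`. One way to check which candidates are actually correct is to substitute each into the snippet and run it. The sketch below is only an illustration, not the solver from the paper: `passing_fills` is a hypothetical helper, and the `candidates` list is a made-up stand-in for the `token_str` values so that the example runs without downloading the model.

```python
def passing_fills(template, candidates, mask="<mask>"):
    """Return the candidates whose substitution lets the snippet run without error."""
    passing = []
    for cand in candidates:
        filled = template.replace(mask, cand.strip())
        try:
            # Run in a fresh namespace. exec is unsafe on untrusted code,
            # so use this only with snippets you control.
            exec(filled, {})
        except Exception:
            # A failed assertion, NameError, SyntaxError, etc. rejects the candidate.
            continue
        passing.append(cand.strip())
    return passing


code = """sum = 0
for i in range(<mask>):
    sum += i
assert sum == 6
"""

# Hypothetical stand-ins for the `token_str` values returned by the pipeline.
candidates = ["3", "4", "10", "n"]
print(passing_fills(code, candidates))
```

With real pipeline output, the candidates could be collected with `[p["token_str"] for p in demo(code)]`.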
BibTeX entry and citation info
@inproceedings{
schuster2021programming,
title={Programming Puzzles},
author={Tal Schuster and Ashwin Kalyan and Alex Polozov and Adam Tauman Kalai},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
year={2021},
url={https://openreview.net/forum?id=fe_hCc4RBrg}
}