---
title: CodeBLEU
sdk: gradio
sdk_version: 3.0.2
app_file: app.py
pinned: false
tags:
  - evaluate
  - metric
description: CodeBLEU metric for Python and C++
---

# Metric Card for CodeBLEU


## Metric Description

CodeBLEU is a metric for code synthesis. Unlike the original BLEU, it considers not only surface-level n-gram matches but also grammatical and logical correctness, by leveraging the abstract syntax tree and the data-flow structure of the code.
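The final score is a weighted combination of four components: the standard n-gram match, a keyword-weighted n-gram match, the AST match, and the data-flow match. A minimal sketch of that combination (the component scores below are illustrative placeholders, not values produced by this library):

```python
# Hypothetical component scores for one prediction (illustrative only)
ngram_match = 0.80      # standard BLEU n-gram overlap
weighted_ngram = 0.85   # n-gram overlap with extra weight on language keywords
ast_match = 0.90        # abstract-syntax-tree subtree match
dataflow_match = 0.75   # data-flow graph match

# The four weights correspond to alpha, beta, gamma, theta in this metric's API
alpha, beta, gamma, theta = 0.25, 0.25, 0.25, 0.25
code_bleu = (alpha * ngram_match + beta * weighted_ngram
             + gamma * ast_match + theta * dataflow_match)
print(code_bleu)  # 0.825
```

With equal weights of 0.25, the score is simply the average of the four components.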

## How to Use

- Clone the repository:

```bash
git clone https://huggingface.co/spaces/giulio98/codebleu.git
```

- Import the metric:

```python
from codebleu.calc_code_bleu import calculate
```

- Compute the score:

```python
true_codes = [["def hello_world():\n    print('hello world!')"],
              ["def add(a, b):\n    return a + b"]]
code_gens = ["def hello_world():\n    print('hello world!')",
             "def add(a, b):\n    return a + b"]
codebleu = calculate(references=true_codes, predictions=code_gens,
                     language="python", alpha=0.25, beta=0.25,
                     gamma=0.25, theta=0.25)
print(codebleu['code_bleu_score'])
```

## Inputs


- **references** (list of list of string): n possible reference solutions for each problem
- **predictions** (list of string): a single generated prediction for each problem
- **language** (string): `python` or `cpp`
- **alpha**, **beta**, **gamma**, **theta** (float): weights of the n-gram match, weighted n-gram match, AST match, and data-flow match components, respectively (e.g. 0.25 each)
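Note that `references` is a list of lists because each problem may have several acceptable solutions, while `predictions` holds exactly one candidate per problem. A small sketch of the expected shapes (the example strings are illustrative):

```python
# Each problem may have several acceptable reference solutions,
# so references is a list of lists; predictions stays flat.
references = [
    ["def add(a, b):\n    return a + b",    # solution 1 for problem 0
     "def add(x, y):\n    return x + y"],   # solution 2 for problem 0
]
predictions = ["def add(a, b):\n    return a + b"]  # one candidate per problem

# The two lists must be aligned problem-by-problem.
assert len(references) == len(predictions)
```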

## Output Values

The metric returns a dictionary; the final score is stored under the key `code_bleu_score`, a float between 0 and 1, with higher values indicating a closer match to the references.

## Values from Popular Papers

## Limitations and Bias

This implementation currently supports only Python and C++.

## Citation

```bibtex
@misc{ren2020codebleu,
  author = {Ren, Shuo and Guo, Daya and Lu, Shuai and Zhou, Long and Liu, Shujie and Tang, Duyu and Zhou, Ming and Blanco, Ambrosio and Ma, Shuai},
  title = {CodeBLEU: a Method for Automatic Evaluation of Code Synthesis},
  year = {2020},
  eprint = {2009.10297},
  archivePrefix = {arXiv}
}
```