---
datasets:
- nvidia/OpenCodeReasoning
- future-technologies/Universal-Transformers-Dataset
metrics:
- bleu
---

# AI FixCode Model 🛠️

A Transformer-based code-fixing model trained on diverse buggy → fixed code pairs. Built on [CodeT5+](https://huggingface.co/Salesforce/codet5p-220m), this model identifies and corrects syntactic and semantic errors in source code.

## Model Details

- **Base Model**: `Salesforce/codet5p-220m`
- **Type**: Seq2Seq (Encoder-Decoder)
- **Trained On**: Custom dataset of real-world buggy → fixed examples.
- **Languages**: Python initially; can be extended to JavaScript, Go, and other languages.

## Intended Use

Input a buggy function or script and receive a syntactically and semantically corrected version.

**Example**:

```python
# Input:
def add(x, y)
    return x + y

# Output:
def add(x, y):
    return x + y
```

## How it Works

The model learns from training examples that map erroneous code to corrected code. Given buggy code as input, it generates the corrected code token by token; the patch is implicit in the difference between the input and the output.

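Because the model emits corrected code rather than a diff, one way to inspect what it changed is to diff its output against the input. The following is a minimal sketch using Python's standard `difflib` module; the helper name `as_unified_diff` and the `snippet.py` filename are illustrative assumptions, not part of this model's tooling.

```python
import difflib


def as_unified_diff(buggy: str, fixed: str, filename: str = "snippet.py") -> str:
    """Render the change from buggy to fixed code as a unified diff.

    `filename` is only a label for the diff header (hypothetical default).
    """
    diff = difflib.unified_diff(
        buggy.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(diff)


# The buggy/fixed pair from the Intended Use example.
buggy = "def add(x, y)\n    return x + y\n"
fixed = "def add(x, y):\n    return x + y\n"

patch = as_unified_diff(buggy, fixed)
print(patch)
```

An empty string means the model returned the input unchanged, which is a cheap signal that no fix was proposed.
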
## Inference

Load the model and tokenizer with the `transformers` library, then call `generate`:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("khulnasoft/aifixcode-model")
tokenizer = AutoTokenizer.from_pretrained("khulnasoft/aifixcode-model")

# Buggy input: the call to print() is missing its closing parenthesis.
input_code = "def foo(x):\n    print(x"
inputs = tokenizer(input_code, return_tensors="pt")

# Generate the corrected code and decode it back to text.
out = model.generate(**inputs, max_length=512)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

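Generated fixes are not guaranteed to be valid, so it can help to check that the output at least parses before accepting it. Below is a minimal sketch for Python output using the standard `ast` module. `is_valid_python` and `accept_fix` are hypothetical helpers, and parsing only catches syntax errors, not semantic ones.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False otherwise."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as embedded null bytes on older Pythons.
        return False
    return True


def accept_fix(original: str, candidate: str) -> str:
    """Keep the candidate fix only if it parses; otherwise fall back to the original."""
    return candidate if is_valid_python(candidate) else original


buggy = "def foo(x):\n    print(x"       # unclosed parenthesis
candidate = "def foo(x):\n    print(x)"  # what a successful fix might look like

print(is_valid_python(buggy))
print(is_valid_python(candidate))
```

In practice, `candidate` would be the decoded string returned by `tokenizer.decode(...)` in the snippet above.
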
## Dataset Format

Each record pairs buggy code (`input`) with its corrected version (`output`):

```json
[
  {
    "input": "def add(x, y)\n    return x + y",
    "output": "def add(x, y):\n    return x + y"
  }
]
```

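A file in this format can be loaded and sanity-checked with the standard `json` module before training. This is a minimal sketch, not the project's actual data loader: the `load_pairs` name is a hypothetical helper, and the only assumption about the schema is the `input` and `output` keys shown above.

```python
import json


def load_pairs(text: str) -> list[tuple[str, str]]:
    """Parse a JSON array of {"input", "output"} records into (buggy, fixed) pairs."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")

    pairs = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {index} is not a JSON object")
        missing = {"input", "output"} - record.keys()
        if missing:
            raise ValueError(f"record {index} is missing keys: {sorted(missing)}")
        pairs.append((record["input"], record["output"]))
    return pairs


# Build a sample document equivalent to the one shown above.
sample = json.dumps(
    [
        {
            "input": "def add(x, y)\n    return x + y",
            "output": "def add(x, y):\n    return x + y",
        }
    ]
)

pairs = load_pairs(sample)
print(len(pairs), "pair(s) loaded")
```
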
## License

MIT License

## Acknowledgements

Built using 🤗 Hugging Face Transformers and Salesforce CodeT5.
