# Design Pattern Detection Model

This model detects software design patterns in Java source code. It is a CodeBERT model fine-tuned for single-label classification, and it assigns each input one of the design pattern labels listed below.

## Supported Labels

| Label ID | Design Pattern  |
|----------|-----------------|
| 0        | Observer        |
| 1        | Decorator       |
| 2        | Adapter         |
| 3        | Proxy           |
| 4        | Singleton       |
| 5        | Facade          |
| 6        | AbstractFactory |
| 7        | Memento         |
| 8        | FactoryMethod   |
| 9        | Prototype       |
| 10       | Visitor         |
| 11       | Builder         |
| 12       | Unknown         |
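
For scripts that need the labels without loading the model, the table can be mirrored as a plain Python mapping. This is only a sketch that copies the table; the names `ID2LABEL` and `LABEL2ID` are illustrative, and the mapping stored in the model's own configuration should be preferred when the model is loaded.

```python
# Illustrative copy of the label table; ID2LABEL and LABEL2ID are hypothetical names.
ID2LABEL = {
    0: "Observer",
    1: "Decorator",
    2: "Adapter",
    3: "Proxy",
    4: "Singleton",
    5: "Facade",
    6: "AbstractFactory",
    7: "Memento",
    8: "FactoryMethod",
    9: "Prototype",
    10: "Visitor",
    11: "Builder",
    12: "Unknown",
}

# Reverse lookup: pattern name -> label ID
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

print(ID2LABEL[4])          # Singleton
print(LABEL2ID["Unknown"])  # 12
```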

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ichsanbudiman/design-pattern-detection-codebert")
model = AutoModelForSequenceClassification.from_pretrained("ichsanbudiman/design-pattern-detection-codebert")

# Example input
input_code = """
public class Singleton {
    private static Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {
            instance = new Singleton();
        }
        return instance;
    }
}
"""

# Tokenize the input (pad or truncate to 512 tokens)
inputs = tokenizer(input_code, return_tensors="pt", padding="max_length", truncation=True, max_length=512)

# Make predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Get the predicted class and label
predicted_class = torch.argmax(outputs.logits, dim=1).item()
predicted_label = model.config.id2label[predicted_class]

print(f"Predicted label: {predicted_label}")
```
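
To see how confident a prediction is, the raw logits can be converted to probabilities with a softmax and ranked. The sketch below uses plain Python and made-up logits for three classes; the helper names `softmax` and `top_k` are illustrative. With the real model, the inputs would be `outputs.logits[0].tolist()` and `model.config.id2label`.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Made-up values for illustration only; not real model output.
example_logits = [0.2, -1.0, 3.5]
example_labels = {0: "Observer", 1: "Decorator", 2: "Adapter"}

for label, prob in top_k(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

A low top probability, or two labels with similar probabilities, may be a reason to treat the prediction with caution.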

## Input Requirements

- **Input Format**: Java code snippets as strings.
- **Max Length**: Inputs longer than 512 tokens are truncated (`truncation=True`, `max_length=512`).
- **Padding**: The usage example pads every input to 512 tokens (`padding="max_length"`) so that batches have a uniform shape.
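
The effect of truncation and padding on a sequence of token IDs can be sketched in plain Python. This is a simplified illustration, not the tokenizer's implementation: the helper `pad_or_truncate` is hypothetical, the padding ID of 1 is an assumption based on RoBERTa-style vocabularies, and special tokens are ignored.

```python
MAX_LENGTH = 512  # maximum sequence length stated above
PAD_ID = 1        # assumption: pad token ID of RoBERTa-style vocabularies

def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Return (ids, attention_mask), both exactly max_length long."""
    ids = list(token_ids[:max_length])  # drop anything past the limit
    mask = [1] * len(ids)               # 1 marks real tokens
    missing = max_length - len(ids)
    ids += [pad_id] * missing           # fill the remainder with padding
    mask += [0] * missing               # 0 marks padding positions
    return ids, mask

short_ids, short_mask = pad_or_truncate(list(range(10, 20)))  # 10 tokens
long_ids, long_mask = pad_or_truncate(list(range(1000)))      # 1000 tokens

print(len(short_ids), sum(short_mask))  # 512 10
print(len(long_ids), sum(long_mask))    # 512 512
```

Because everything past the limit is discarded, a pattern whose defining code sits late in a long file may not be visible to the model.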

## Task

The model performs single-label classification: each Java snippet is assigned exactly one of the design pattern labels listed above.

## Fine-Tuning Details

- **Base Model**: [CodeBERT](https://huggingface.co/microsoft/codebert-base)
- **Dataset**: A curated dataset of labeled Java code examples, sourced from the following research article:

  > Najam Nazar, Aldeida Aleti, Yaokun Zheng, Feature-based software design pattern detection, Journal of Systems and Software, Volume 185, 2022, 111179, ISSN 0164-1212, [https://doi.org/10.1016/j.jss.2021.111179](https://doi.org/10.1016/j.jss.2021.111179).

- **Metrics**: The model achieves high accuracy in detecting design patterns, making it suitable for software engineering tasks.
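
To check accuracy on your own labeled snippets, predictions can be compared against ground-truth labels. The sketch below is a minimal example in plain Python; the helper names are illustrative and the label lists are toy data, not results from this model.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_label_recall(y_true, y_pred):
    """For each true label, the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Toy labels for illustration only; not results from this model.
y_true = ["Singleton", "Observer", "Singleton", "Builder"]
y_pred = ["Singleton", "Observer", "Unknown", "Builder"]

print(accuracy(y_true, y_pred))          # 0.75
print(per_label_recall(y_true, y_pred))
```

Per-label recall is worth reporting alongside overall accuracy, since a single frequent label can dominate the aggregate figure.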

## Contact

For inquiries or feedback, please reach out to [Ichsan Budiman](mailto:[email protected]).

## License

This model is licensed under the Apache 2.0 License.