# Design Pattern Detection Model

This model is a fine-tuned [CodeBERT](https://huggingface.co/microsoft/codebert-base) that detects software design patterns in Java source code. It performs single-label classification: each input snippet is assigned exactly one of the following labels.

## Supported Labels

| Label ID | Design Pattern     |
|----------|--------------------|
| 0        | Observer           |
| 1        | Decorator          |
| 2        | Adapter            |
| 3        | Proxy              |
| 4        | Singleton          |
| 5        | Facade             |
| 6        | AbstractFactory    |
| 7        | Memento            |
| 8        | FactoryMethod      |
| 9        | Prototype          |
| 10       | Visitor            |
| 11       | Builder            |
| 12       | Unknown            |
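
For code that post-processes predictions without loading the model, the table above can be mirrored as a plain Python mapping. This is only a sketch: the names `ID2LABEL`, `LABEL2ID`, and `label_for` are illustrative, and the mapping is assumed to match the model's own `model.config.id2label`, which should be treated as the source of truth.

```python
# Label mapping transcribed from the table above.
# Assumption: it matches model.config.id2label in the published checkpoint.
ID2LABEL = {
    0: "Observer",
    1: "Decorator",
    2: "Adapter",
    3: "Proxy",
    4: "Singleton",
    5: "Facade",
    6: "AbstractFactory",
    7: "Memento",
    8: "FactoryMethod",
    9: "Prototype",
    10: "Visitor",
    11: "Builder",
    12: "Unknown",
}

# Reverse lookup, e.g. for preparing training labels from pattern names.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the pattern name for a class ID, failing loudly on unknown IDs."""
    if class_id not in ID2LABEL:
        raise ValueError(f"Unknown class ID: {class_id}")
    return ID2LABEL[class_id]
```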

## How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ichsanbudiman/design-pattern-detection-codebert")
model = AutoModelForSequenceClassification.from_pretrained("ichsanbudiman/design-pattern-detection-codebert")

# Example input
input_code = """
public class Singleton {
    private static Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {
            instance = new Singleton();
        }
        return instance;
    }
}
"""

# Tokenize the input
inputs = tokenizer(input_code, return_tensors="pt", padding="max_length", truncation=True, max_length=512)

# Make predictions
with torch.no_grad():
    outputs = model(**inputs)

# Get the predicted class and label
predicted_class = torch.argmax(outputs.logits, dim=1).item()
predicted_label = model.config.id2label[predicted_class]

print(f"Predicted label: {predicted_label}")
```
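
The snippet above reports only the single most likely label. When a confidence estimate or a ranked shortlist is useful, the logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python; the helper name `top_k_patterns` and the example logits are made up for illustration, and in practice the inputs would be `outputs.logits[0].tolist()` and `model.config.id2label`.

```python
import math


def top_k_patterns(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs from raw logits."""
    # Subtract the largest logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Illustrative values only; these are not real model outputs.
example_labels = dict(enumerate([
    "Observer", "Decorator", "Adapter", "Proxy", "Singleton", "Facade",
    "AbstractFactory", "Memento", "FactoryMethod", "Prototype", "Visitor",
    "Builder", "Unknown",
]))
example_logits = [0.1, -1.2, 0.3, 0.0, 4.5, -0.7, 0.2, -2.0, 1.1, 0.4, -0.3, 0.9, 0.5]

for label, prob in top_k_patterns(example_logits, example_labels):
    print(f"{label}: {prob:.3f}")
```

Softmax probabilities from a fine-tuned classifier are not necessarily well calibrated, so they are better read as a relative ranking than as exact confidence values.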

## Input Requirements
- **Input Format**: Java code snippets as strings.
- **Max Length**: Input code longer than 512 tokens will be truncated.
- **Padding**: The example above pads every input to 512 tokens (`padding="max_length"`), which keeps tensor shapes uniform for batch processing.
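
To make the truncation and padding rules concrete, here is a simplified sketch of what happens to a list of token IDs. The helper `pad_or_truncate` is hypothetical, and `pad_id=1` is an assumption based on the RoBERTa vocabulary that CodeBERT uses. The real tokenizer also keeps the special start and end tokens in place when truncating, which this sketch ignores.

```python
def pad_or_truncate(token_ids, max_length=512, pad_id=1):
    """Clip token IDs to max_length or right-pad them, and build the attention mask.

    pad_id=1 is an assumed padding ID; check tokenizer.pad_token_id for the real value.
    """
    # Anything beyond max_length is dropped, so the model never sees it.
    clipped = list(token_ids[:max_length])
    # Real tokens are marked 1 in the attention mask, padding positions 0.
    attention_mask = [1] * len(clipped)
    padding = max_length - len(clipped)
    return clipped + [pad_id] * padding, attention_mask + [0] * padding


# A short input is padded; a long input loses everything past the limit.
short_ids, short_mask = pad_or_truncate([10, 11, 12], max_length=5)
long_ids, long_mask = pad_or_truncate(list(range(600)))
print(short_ids, short_mask)
print(len(long_ids), sum(long_mask))
```

Because content past the limit is discarded, a long class whose pattern-defining members appear late in the file may be classified from incomplete evidence.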

## Task
This model performs single-label classification for the detection of design patterns in Java source code. The supported design patterns are listed above.

## Fine-Tuning Details
- **Base Model**: [CodeBERT](https://huggingface.co/microsoft/codebert-base)
- **Dataset**: Fine-tuned on a curated dataset of labeled Java code examples. The dataset was sourced from the following research article:

  > Najam Nazar, Aldeida Aleti, Yaokun Zheng, Feature-based software design pattern detection, Journal of Systems and Software, Volume 185, 2022, 111179, ISSN 0164-1212, [https://doi.org/10.1016/j.jss.2021.111179](https://doi.org/10.1016/j.jss.2021.111179).

- **Metrics**: The model achieves high accuracy on detecting design patterns, making it suitable for software engineering tasks.

## Contact
For inquiries or feedback, please reach out to [Ichsan Budiman](mailto:[email protected]).

## License
This model is licensed under the Apache 2.0 License.