---
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- Text Classification
license: gpl-3.0
language:
- en
---

# FewShotIssueClassifier-NLBSE23

This is a SetFit model that uses Sentence Transformers to map sentences and paragraphs to a 768-dimensional dense vector space. It can be used for tasks like clustering or semantic search.

This specific model is fine-tuned for issue report classification into four classes: bug, documentation, feature, and question.

## Usage

You can use the model like this:

```python
from setfit import SetFitModel

# Example issue texts to classify
sentences = ["error in line 20", "add method list_features"]

# Class index -> issue type
label_mapping = {
    0: "bug",
    1: "documentation",
    2: "feature",
    3: "question",
}

# Download the fine-tuned model from the Hugging Face Hub
model = SetFitModel.from_pretrained("PeppoCola/FewShotIssueClassifier-NLBSE23")
predictions = model.predict(sentences)

# int() lets the lookup work whether predictions are tensor or NumPy scalars
print([label_mapping[int(i)] for i in predictions])
```

## Dataset
This model is trained on a subset of the [NLBSE23](https://nlbse2023.github.io/tools/) dataset. The sample was hand-labeled and is available on [Zenodo](https://zenodo.org/record/7628150#.ZBnM3XbMJD8).
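
The note in the Zenodo citation says to merge the CSV with the original dataset after removing duplicates on the `id` column. One possible reading of that step is sketched below with pandas. Only the `id` column name comes from the note. The helper `merge_hand_labeled`, the `title` and `hand_label` columns, and the toy rows are hypothetical stand-ins for the real files.

```python
import pandas as pd


def merge_hand_labeled(original: pd.DataFrame, labeled: pd.DataFrame) -> pd.DataFrame:
    """Join hand-labeled rows onto the original dataset by 'id' (hypothetical helper)."""
    # Drop repeated ids first so the join cannot multiply rows.
    original = original.drop_duplicates(subset="id")
    labeled = labeled.drop_duplicates(subset="id")
    # Keep only the issues that appear in both tables.
    return original.merge(labeled, on="id", how="inner")


# Toy frames standing in for the real files; every column except 'id' is assumed.
original = pd.DataFrame(
    {
        "id": [1, 1, 2, 3],
        "title": [
            "crash on start",
            "crash on start",
            "add export option",
            "how to install?",
        ],
    }
)
labeled = pd.DataFrame({"id": [1, 3], "hand_label": ["bug", "question"]})

merged = merge_hand_labeled(original, labeled)
print(merged)
```

With the real data, the two frames would come from `pd.read_csv` on the downloaded files. The inner join here is a choice made for the sketch; use `how="left"` if the unlabeled rows of the original dataset should be kept.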

## Citing & Authors

```
@software{Colavito_Few-Shot_Learning_for_2023,
	title        = {{Few-Shot Learning for Issue Report Classification}},
	author       = {Colavito, Giuseppe and Lanubile, Filippo and Novielli, Nicole},
	year         = 2023,
	month        = 2,
	url          = {https://github.com/collab-uniba/Issue-Report-Classification-NLBSE2023},
	version      = {1.0.0}
}
```

```
@dataset{colavito_giuseppe_2023_7628150,
  author       = {Colavito Giuseppe and
                  Lanubile Filippo and
                  Novielli Nicole},
  title        = {Few-Shot Learning for Issue Report Classification},
  month        = feb,
  year         = 2023,
  note         = {{To use this, merge the CSV with the original 
                   dataset (after removing duplicates on the 'id'
                   column)}},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.7628150},
  url          = {https://doi.org/10.5281/zenodo.7628150}
}
```

```
@inproceedings{Colavito-2023,
	title        = {Few-Shot Learning for Issue Report Classification},
	author       = {Colavito, Giuseppe and Lanubile, Filippo and Novielli, Nicole},
	year         = 2023,
	booktitle    = {2nd International Workshop on Natural Language-Based Software Engineering (NLBSE)}
}
```