---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
---
# Model Card for cssupport/mobilebert-sql-injection-detect


Based on [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) (MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks), this model detects SQL injection attacks in an input string (see How to Get Started below). It is a very lightweight model (~100 MB) and is well suited for edge-computing use cases. It was trained on the [SQL_Injection](https://www.kaggle.com/datasets/sajid576/sql-injection-dataset) dataset from [Kaggle](https://www.kaggle.com).
**Please test the model before deploying it into any environment**.
Contact us for more info: [email protected]
### Code Repo
The code repository is available at [cssupport23/AI-Model---SQL-Injection-Attack-Detector](https://github.com/cssupport23/AI-Model---SQL-Injection-Attack-Detector).

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->



- **Developed by:** cssupport ([email protected])
- **Model type:** Language model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased)

### Model Sources 

<!-- Provide the basic links for the model. -->

Please refer to [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for model sources.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import MobileBertTokenizer, MobileBertForSequenceClassification


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = MobileBertTokenizer.from_pretrained('google/mobilebert-uncased')
model = MobileBertForSequenceClassification.from_pretrained('cssupport/mobilebert-sql-injection-detect')
model.to(device)
model.eval()

def predict(text):
    # Tokenize the input string (truncated to MobileBERT's 512-token limit)
    inputs = tokenizer(text, padding=False, truncation=True, return_tensors='pt', max_length=512)
    input_ids = inputs['input_ids'].to(device)
    attention_mask = inputs['attention_mask'].to(device)

    # Run the classifier without tracking gradients
    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)

    # Convert logits to probabilities and return the most likely class with its confidence
    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=1)
    predicted_class = torch.argmax(probabilities, dim=1).item()
    return predicted_class, probabilities[0][predicted_class].item()


#text = "SELECT * FROM users WHERE username = 'admin' AND password = 'password';"
#text = "select * from users where username = 'admin' and password = 'password';"
#text = "SELECT * from USERS where id  =  '1' or @ @1  =  1 union select 1,version  (    )   -- 1'"
#text = "select * from data where id  =  '1'  or @"
text ="select * from users where id  =  1 or 1#\"?  =  1 or 1  =  1 -- 1"
predicted_class, confidence = predict(text)

# Class 1 corresponds to "SQL injection detected"
if predicted_class == 1:
    print("Prediction: SQL Injection Detected")
else:
    print("Prediction: No SQL Injection Detected")
    
print(f"Confidence: {confidence:.2f}")
# OUTPUT
# Prediction: SQL Injection Detected
# Confidence: 1.00
```
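
For screening many strings at once (request batches, log replay), the same call can be batched. The sketch below reuses the `tokenizer`, `model`, and `device` set up above; `predict_batch` and the batch size are illustrative choices, not part of the released code.

```python
def predict_batch(texts, batch_size=32):
    """Classify a list of strings; returns (predicted_class, confidence) per input."""
    results = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Pad to the longest sequence in the batch so the tensors line up
        inputs = tokenizer(batch, padding=True, truncation=True,
                           return_tensors='pt', max_length=512)
        inputs = {k: v.to(device) for k, v in inputs.items()}

        with torch.no_grad():
            logits = model(**inputs).logits

        probabilities = torch.softmax(logits, dim=1)
        confidences, classes = probabilities.max(dim=1)
        results.extend(zip(classes.tolist(), confidences.tolist()))
    return results

# Example: screen a mix of benign and suspicious strings
samples = [
    "SELECT name FROM products WHERE id = 42;",
    "select * from users where id = 1 or 1 = 1 -- ",
]
for text, (cls, conf) in zip(samples, predict_batch(samples)):
    label = "SQL Injection" if cls == 1 else "Benign"
    print(f"{label} ({conf:.2f}): {text}")
```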


## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

[More Information Needed]

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
Could be used in applications that need to screen user-supplied input or SQL strings for injection attempts before they reach a database, for example in web backends, API gateways, or edge devices. A minimal integration sketch follows.
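
The sketch below assumes the `predict` helper from the How to Get Started section; the 0.7 threshold and the `is_suspicious` helper name are illustrative choices, not calibrated or published values.

```python
def is_suspicious(user_input, threshold=0.7):
    """Return True when the classifier flags the input with enough confidence."""
    predicted_class, confidence = predict(user_input)
    return predicted_class == 1 and confidence >= threshold

# Example: reject a request before any SQL built from it reaches the database
if is_suspicious("1' OR '1'='1"):
    print("Rejecting request: possible SQL injection")
```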



### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.



## Technical Specifications 

### Model Architecture and Objective

The model is [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) fine-tuned with a sequence-classification head for binary SQL injection detection.
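
A quick way to double-check the architecture and label count at load time is to inspect the checkpoint's configuration. This is a minimal sketch; the label semantics are inferred from the usage example above, not documented in the checkpoint.

```python
from transformers import AutoConfig

# Inspect the published checkpoint's configuration
config = AutoConfig.from_pretrained('cssupport/mobilebert-sql-injection-detect')

print(config.model_type)   # expected: "mobilebert"
print(config.num_labels)   # expected: 2 (benign vs. injection) -- inferred, not documented
```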

### Compute Infrastructure



#### Hardware

One NVIDIA P6000 GPU

#### Software

PyTorch and Hugging Face Transformers