selectorseb committed on
Commit
1538836
·
verified ·
1 Parent(s): e91d738

Update README.md

Files changed (1)
  1. README.md +110 -3
README.md CHANGED
@@ -1,3 +1,110 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language: en
+ library_name: LogClassifier
+ tags:
+ - log-classification
+ - log feature
+ - log-similarity
+ - transformers
+ - AIOps
+ pipeline_tag: text-classification
+ ---
+
+
+ # s2-log-classifier-BERT-v1
+ This model is a Transformers sequence-classification model, trained with `BertForSequenceClassification` and designed for network and device log mining tasks.
+ Developed by [Selector AI](https://www.selector.ai/).
+
+ ## Model Usage
+ ```python
+ import torch
+ from transformers import BertForSequenceClassification, BertTokenizer
+
+ # Step 1: Load the model and tokenizer from Hugging Face
+ model = BertForSequenceClassification.from_pretrained("rahulm-selector/log-classifier-BERT-v1")
+ tokenizer = BertTokenizer.from_pretrained("rahulm-selector/log-classifier-BERT-v1")
+
+ # Put the model in inference mode (disables dropout)
+ model.eval()
+
+ # Step 2: Prepare the input data (example log text)
+ log_text = "Error occurred while accessing the database."
+
+ # Tokenize the input, truncating to at most 128 tokens
+ inputs = tokenizer(log_text, return_tensors="pt", padding=True, truncation=True, max_length=128)
+
+ # Step 3: Make predictions without tracking gradients
+ with torch.no_grad():
+     outputs = model(**inputs)
+     logits = outputs.logits
+
+ # Step 4: Get the predicted class (the class with the highest score)
+ predicted_class = torch.argmax(logits, dim=1).item()
+
+ # Label mapping (can be loaded from the JSON file in the repo or from the config)
+ label_mapping = model.config.id2label
+
+ # Step 5: Get the event name
+ predicted_event = label_mapping[predicted_class]
+ print(f"Predicted Event: {predicted_event}")
+ ```
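Step 4 keeps only the single highest-scoring class. When a confidence value or a few runner-up events are useful, the logits can be converted to probabilities with a softmax. The following is a minimal sketch in plain Python, assuming the logits have already been converted to a list of floats (for example with `logits[0].tolist()`). The helper names `softmax` and `top_k_events`, the example logits, and the `event_*` labels are hypothetical and not part of the model repository.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_events(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, most probable first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical logits and label names, for illustration only
example_logits = [0.2, 3.1, -1.0, 1.4]
example_labels = {0: "event_a", 1: "event_b", 2: "event_c", 3: "event_d"}

for label, prob in top_k_events(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, `model.config.id2label` would take the place of `example_labels`.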
+
+ ## Background
+
+ The model focuses on structured and semi-structured log data and outputs one of around 60 event categories. It is intended
+ for real-time log analysis, anomaly detection, and operational monitoring, helping organizations manage
+ large-scale network data by automatically classifying logs into predefined categories, which supports faster
+ and more accurate diagnosis of network issues.
+
+ ## Intended uses
+
+ The model is intended to be used as a classifier. Given an input text (a log from a network, device, or router), it outputs the event most closely associated with that log.
+ The events it can predict are listed in [encoder-main.json](https://huggingface.co/rahulm-selector/log-classifier-BERT-v1/blob/main/encoder-main.json).
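If the label names are read from `encoder-main.json` rather than from the model config, the file has to be turned into an id-to-event lookup. The layout of that file is not described here, so the sketch below assumes it is a flat JSON object that maps either event names to integer ids or stringified ids to event names, and handles both. The helper `build_id_to_event` and the sample JSON string are hypothetical.

```python
import json


def build_id_to_event(mapping):
    """Normalize a flat label mapping into {int id: event name}.

    Assumption: entries are either name -> id or "id" -> name.
    """
    id_to_event = {}
    for key, value in mapping.items():
        if isinstance(value, int):
            id_to_event[value] = key       # name -> id entry
        else:
            id_to_event[int(key)] = value  # "id" -> name entry
    return id_to_event


# Hypothetical file contents, for illustration only
sample_json = '{"event_a": 0, "event_b": 1}'
id_to_event = build_id_to_event(json.loads(sample_json))
print(id_to_event[1])
```

The resulting dictionary can then be indexed with the predicted class id in the same way as `model.config.id2label`.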
+
+
+ ## Training Details
+
+ ### Data
+
+ The model was trained on a variety of network events and system logs covering state changes,
+ protocol behaviors, and hardware interactions across infrastructure components. The data includes routing issues,
+ protocol neighbor state changes, link stability, and security events, so that the model can recognize and
+ classify critical patterns in device communications, network health, and configuration activities.
+
+ ### Train/Test Split
+
+ - **Train Data Size**: `~80K Logs`
+ - **Test Data Size**: `~20K Logs`
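The sizes above correspond to roughly an 80/20 split. How the split was actually produced (random, stratified by event, or by time) is not stated, so the following is only a sketch of a plain shuffled split under that assumption. The function `split_logs`, its parameters, and the sample log lines are hypothetical.

```python
import random


def split_logs(logs, test_fraction=0.2, seed=0):
    """Shuffle a copy of the logs and split it into (train, test) lists."""
    shuffled = list(logs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical log lines, for illustration only
logs = [f"log line {i}" for i in range(10)]
train, test = split_logs(logs)
print(len(train), len(test))
```

For a classifier with around 60 categories, a split stratified by event label may be preferable so that rare events appear in both sets.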
+
+ ### Hyperparameters
+
+ The following hyperparameters were used during training:
+
+ - **Batch Size**: `32`
+ - **Learning Rate**: `0.001`
+ - **Optimizer**: `Adam`
+ - **Epochs**: `10`
+ - **Dropout Rate**: N/A
+ - **LSTM Hidden Dimension**: `384`
+ - **Embedding Dimension**: `384`
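To give a sense of scale, the batch size and epoch count can be combined with the approximate training set size to estimate the number of optimizer steps. This is a rough sketch: the `TrainingConfig` class is hypothetical, and the training size of 80,000 is an approximation taken from the `~80K Logs` figure, since the exact count is not stated.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters listed in this model card (hypothetical container)."""
    batch_size: int = 32
    learning_rate: float = 1e-3
    optimizer: str = "Adam"
    epochs: int = 10

    def steps_per_epoch(self, num_train_examples):
        # The last batch may be partial, so round up
        return math.ceil(num_train_examples / self.batch_size)


config = TrainingConfig()
approx_train_size = 80_000  # assumption: "~80K Logs" taken as exactly 80,000
steps = config.steps_per_epoch(approx_train_size)
total_steps = steps * config.epochs
print(steps, total_steps)
```

Under these assumptions, training runs for about 2,500 steps per epoch and about 25,000 steps in total.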
+
+ ## Credits
+
+ This project was developed by a collaborative team at [Selector AI](https://www.selector.ai/). Below are the key contributors:
+
+ ### Authors
+ - **Rahul Muthuswamy**
+ Role: Project Lead and Model Developer
+ Email: [[email protected]]
+
+ - **Alex Lau**
+ Role: Mentor
+ Email: [[email protected]]
+
+ - **Sebastian Reyes**
+ Role: Mentor
+ Email: [[email protected]]
+
+ - **Surya Nimmagadda**
+ Role: Mentor
+ Email: [[email protected]]