shreyasmeher committed: Update README.md
- 4-bit Quantization: Enabled
- Max Sequence Length: 1024

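
As a minimal sketch, a 4-bit setup like the one above is typically expressed through a bitsandbytes quantization config when loading the base model. The exact values below are illustrative assumptions (common QLoRA-style defaults), not the recorded hyperparameters of this training run:

```python
# Illustrative only: a typical 4-bit quantization config via bitsandbytes.
# The actual settings used for this model may differ.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # matches "4-bit Quantization: Enabled"
    bnb_4bit_quant_type="nf4",              # NormalFloat4, a common QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 while weights stay 4-bit
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)
```

Such a config would be passed as `quantization_config` to `from_pretrained` when loading the base model.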
## Model Architecture

The model uses a combination of efficient fine-tuning techniques and optimizations for handling conflict event classification:

<p align="center">
  <img src=".github/images/model-arch.png" alt="Model Training Architecture" width="800"/>
</p>
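
The "efficient fine-tuning techniques" are not spelled out in this section; a common pairing with a 4-bit base model is a low-rank adapter (LoRA, as in QLoRA). The idea can be sketched in plain Python with toy dimensions — this is an assumption-laden illustration of the technique, not this model's actual implementation:

```python
# Toy LoRA forward pass: y = W x + (alpha / r) * B (A x).
# W is the frozen base weight; only the small matrices A and B are trained.

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=16, r=2):
    base = matvec(W, x)              # frozen base projection
    delta = matvec(B, matvec(A, x))  # low-rank update B @ (A @ x)
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

# 2x2 base weight, rank-1 adapter with B initialised to zero,
# so before training the adapter is a no-op.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]          # r x d_in  (1 x 2)
B = [[0.0], [0.0]]        # d_out x r (2 x 1)
x = [2.0, 3.0]

print(lora_forward(W, A, B, x))  # B == 0, so identical to W @ x: [2.0, 3.0]
```

Because `B` starts at zero, fine-tuning begins exactly at the base model and only gradually learns a low-rank correction.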

### Data Processing Pipeline

The preprocessing pipeline transforms raw GTD data into a format suitable for fine-tuning:

<p align="center">
  <img src=".github/images/preprocessing.png" alt="Data Preprocessing Pipeline" width="800"/>
</p>
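
A minimal sketch of such a transformation, assuming GTD-style field names from the public codebook (`summary`, `attacktype1_txt`, `targtype1_txt`) and an illustrative prompt template — not the exact template used for this model:

```python
# Turn a raw GTD-style record into an instruction/response pair.
# Field names follow the public GTD codebook; the prompt wording is
# a hypothetical example, not the model's actual template.

def gtd_to_example(record, max_chars=1024):
    summary = (record.get("summary") or "").strip()[:max_chars]  # crude length cap
    prompt = (
        "Classify the following conflict event description.\n"
        f"Event: {summary}"
    )
    response = (
        f"Attack type: {record['attacktype1_txt']}; "
        f"Target type: {record['targtype1_txt']}"
    )
    return {"instruction": prompt, "output": response}

example = gtd_to_example({
    "summary": "An explosive device detonated near a government building.",
    "attacktype1_txt": "Bombing/Explosion",
    "targtype1_txt": "Government (General)",
})
print(example["output"])
```

In a real pipeline the character cap would be replaced by tokenizer-level truncation to the 1024-token maximum sequence length listed above.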

### Memory Optimizations
- Used 4-bit quantization
- Gradient accumulation steps: 8
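
Gradient accumulation trades compute for memory: gradients from 8 small micro-batches are averaged before a single optimizer step, so the effective batch size is 8x the per-device batch size while only one micro-batch resides in memory at a time. A toy scalar sketch of the idea (not the actual training loop):

```python
# Gradient accumulation over 8 micro-batches for a scalar model y = w * x
# with mean-squared-error loss. One optimizer update per 8 micro-batches.

ACCUM_STEPS = 8

def grad(w, batch):
    """d/dw of MSE loss for y = w * x on one micro-batch of (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train_step(w, micro_batches, lr=0.01):
    accumulated = 0.0
    for batch in micro_batches:               # forward/backward per micro-batch
        accumulated += grad(w, batch) / ACCUM_STEPS  # average across micro-batches
    return w - lr * accumulated               # single optimizer update

data = [[(1.0, 2.0)]] * ACCUM_STEPS           # 8 identical one-sample micro-batches
w = train_step(0.0, data)
print(w)  # 0.04: same update as one full batch of all 8 samples
```

The update is mathematically equivalent to training on the concatenated full batch, which is why accumulation is a safe way to fit large effective batch sizes on limited GPU memory.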