shreyasmeher committed (commit 1ee0121, verified, parent 5553045)

Update README.md

Files changed (1): README.md (+116 -3)
---
base_model: unsloth/llama-3-8b-bnb-4bit
tags:
- llama.cpp
- gguf
- quantized
- q4_k_m
license: apache-2.0
language:
- en
---
# ConflLlama: GTD-Finetuned LLaMA-3 8B

## Model Details
- **Model Type:** GGUF quantized (q4_k_m and q8_0)
- **Base Model:** unsloth/llama-3-8b-bnb-4bit
- **Quantization Details:**
  - Methods: q4_k_m and q8_0
  - q4_k_m uses Q6_K for half of the attention.wv and feed_forward.w2 tensors
  - q8_0 is near-lossless but larger; q4_k_m trades a small amount of quality for a much smaller file

### Training Data
- **Dataset:** Global Terrorism Database (GTD)
- **Time Period:** Events before January 1, 2017
- **Format:** Event summaries with associated attack types
- **Labels:** Attack type classifications from GTD

### Data Processing
1. **Date Filtering:**
   - Kept only events occurring before 2017-01-01
   - Handled missing dates by defaulting month/day to 1
2. **Data Cleaning:**
   - Removed entries with missing summaries
   - Cleaned summary text by removing special characters and formatting artifacts
3. **Attack Type Processing:**
   - Combined multiple attack types with the separator '|'
   - Included primary, secondary, and tertiary attack types when available
4. **Training Format:**
   - Input: processed event summaries
   - Output: combined attack types
   - Chat template:
```
Below describes details about terrorist events.
>>> Event Details:
{summary}
>>> Attack Types:
{combined_attacks}
```
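
The four steps above can be sketched as a single record-to-string function. This is an illustrative reconstruction, not the card's actual preprocessing code; the column names (`iyear`, `imonth`, `iday`, `summary`, `attacktype1_txt`, …) follow GTD conventions but are assumptions here:

```python
import re

CUTOFF = (2017, 1, 1)
TEMPLATE = (
    "Below describes details about terrorist events.\n"
    ">>> Event Details:\n{summary}\n"
    ">>> Attack Types:\n{combined_attacks}"
)

def make_example(row):
    """Turn one GTD record (a dict) into a training string, or None if filtered out."""
    # 1. Date filtering: default missing month/day to 1, keep pre-2017 events only.
    date = (row["iyear"], row.get("imonth") or 1, row.get("iday") or 1)
    if date >= CUTOFF:
        return None
    # 2. Data cleaning: drop missing summaries, strip special characters.
    summary = row.get("summary")
    if not summary:
        return None
    summary = re.sub(r"[^\w\s.,;:()'-]", " ", summary).strip()
    # 3. Combine up to three attack types with the '|' separator.
    attacks = (row.get(f"attacktype{i}_txt") for i in (1, 2, 3))
    combined = "|".join(a for a in attacks if a)
    # 4. Fill the chat template.
    return TEMPLATE.format(summary=summary, combined_attacks=combined)
```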

### Training Details
- **Framework:** Unsloth optimization framework
- **Hardware:** NVIDIA A100-SXM4-40GB GPU on the NCSA Delta supercomputer
- **Training Configuration:**
  - Batch Size: 1 per device
  - Gradient Accumulation Steps: 8
  - Learning Rate: 2e-4
  - Max Steps: 1000
  - Save Steps: 200
  - Logging Steps: 10
- **LoRA Configuration:**
  - Rank: 8
  - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - Alpha: 16
  - Dropout: 0
- **Optimizations:**
  - Gradient Checkpointing: Enabled
  - 4-bit Quantization: Enabled
  - Max Sequence Length: 1024

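
As a sanity check on adapter size, the LoRA configuration above implies a rough trainable-parameter count. The model dimensions used below (32 layers, hidden size 4096, MLP size 14336, grouped-query KV projection size 1024) are standard Llama-3-8B values but are assumptions, not figures stated in this card:

```python
# Each LoRA adapter on a (d_in, d_out) projection adds r * (d_in + d_out)
# parameters (two low-rank factors). Dimensions are assumed Llama-3-8B values.
r = 8
layers = 32
hidden, mlp, kv = 4096, 14336, 1024

shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv),
    "v_proj": (hidden, kv),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),
    "down_proj": (mlp, hidden),
}
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * layers
print(f"~{total / 1e6:.1f}M trainable parameters")  # → ~21.0M trainable parameters
```

Roughly 21M trainable parameters against an 8B-parameter base is what makes single-GPU finetuning feasible here.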
### Memory Optimizations
- 4-bit quantization of the base model
- Gradient accumulation (8 steps) in place of larger batches
- Memory-efficient gradient checkpointing
- Maximum sequence length reduced to 1024
- Dataloader pin memory disabled
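
Back-of-envelope arithmetic shows why the 4-bit base weights matter on a 40 GB card; the 8B figure is the model family's nominal parameter count, not an exact measurement:

```python
params = 8e9                  # nominal Llama-3-8B parameter count (assumption)
fp16_gb = params * 2 / 1e9    # 16-bit weights: 2 bytes per parameter
q4_gb = params * 0.5 / 1e9    # 4-bit weights: 0.5 bytes per parameter
print(fp16_gb, q4_gb)         # → 16.0 4.0
```

The remaining headroom goes to activations, optimizer state for the LoRA adapters, and gradient-checkpointed recomputation.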

## Intended Use
This model is designed for:
1. Classification of terrorist events based on event descriptions
2. Research in conflict studies and terrorism analysis
3. Understanding attack-type patterns in historical events
4. Academic research in security studies
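
For the classification use case, a caller would wrap an event description in the training template and split the model's '|'-separated completion back into labels. The helpers below are a hypothetical sketch, not code shipped with the model:

```python
def build_prompt(summary: str) -> str:
    """Wrap an event summary in the template the model was trained on."""
    return (
        "Below describes details about terrorist events.\n"
        ">>> Event Details:\n"
        f"{summary}\n"
        ">>> Attack Types:\n"
    )

def parse_attack_types(completion: str) -> list[str]:
    """Split the model's '|'-separated answer into individual labels."""
    return [label.strip() for label in completion.split("|") if label.strip()]
```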

## Limitations
1. Training data limited to pre-2017 events
2. Maximum sequence length limited to 1024 tokens
3. May not capture recent changes in attack patterns
4. Performance depends on the quality of event descriptions

## Ethical Considerations
1. Model trained on sensitive terrorism-related data
2. Should be used responsibly, for research purposes only
3. Not intended for operational security decisions
4. Results should be interpreted with appropriate context

## Citation
```bibtex
@misc{conflllama,
  author = {Meher, Shreyas},
  title = {ConflLlama: GTD-Finetuned LLaMA-3 8B},
  year = {2024},
  publisher = {HuggingFace},
  note = {Based on Unsloth's LLaMA-3 8B and GTD Dataset}
}
```

## Acknowledgments
- Unsloth for the optimization framework and base model
- Hugging Face for the transformers infrastructure
- Global Terrorism Database team
- NCSA Delta for computing resources
- BBOV project support

<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>