---
base_model: unsloth/llama-3-8b-bnb-4bit
tags:
- llama.cpp
- gguf
- quantized
- q4_k_m
- text-classification
- bf16
license: apache-2.0
language:
- en
widget:
- text: >-
    On the morning of June 15th, armed individuals forced their way into a
    local bank in Mexico City. They held bank employees and customers at
    gunpoint for several hours while demanding access to the vault. The
    perpetrators escaped with an undisclosed amount of money after a
    prolonged standoff with local authorities.
  example_title: Armed Assault Example
  output:
  - label: Armed Assault | Hostage Taking
    score: 0.9
- text: >-
    A massive explosion occurred outside a government building in Baghdad.
    The blast, caused by a car bomb, killed 12 people and injured over 30
    others. The explosion caused significant damage to the building's facade
    and surrounding structures.
  example_title: Bombing Example
  output:
  - label: Bombing/Explosion
    score: 0.95
pipeline_tag: text-classification
inference:
  parameters:
    temperature: 0.7
    max_new_tokens: 128
    do_sample: true
---

# ConflLlama: GTD-Finetuned LLaMA-3 8B

- **Model Type:** GGUF quantized (q4_k_m and q8_0)
- **Base Model:** unsloth/llama-3-8b-bnb-4bit
- **Quantization Details:**
  - Methods: q4_k_m, q8_0, BF16
  - q4_k_m uses Q6_K for half of attention.wv and feed_forward.w2 tensors
  - Optimized for both speed (q8_0) and quality (q4_k_m)

### Training Data

- **Dataset:** Global Terrorism Database (GTD)
- **Time Period:** Events before January 1, 2017
- **Format:** Event summaries with associated attack types
- **Labels:** Attack type classifications from GTD

### Data Processing

1. **Date Filtering:**
   - Kept only events dated before 2017-01-01
   - Filled missing month or day values with a default of 1
2. **Data Cleaning:**
   - Removed entries with missing summaries
   - Cleaned summary text by removing special characters and formatting
3. **Attack Type Processing:**
   - Combined multiple attack types with separator '|'
   - Included primary, secondary, and tertiary attack types when available
4. **Training Format:**
   - Input: Processed event summaries
   - Output: Combined attack types
   - Used chat template:
     ```
     Below describes details about terrorist events.
     >>> Event Details:
     {summary}
     >>> Attack Types:
     {combined_attacks}
     ```

### Training Details

- **Framework:** QLoRA
- **Hardware:** NVIDIA A100-SXM4-40GB GPU on Delta Supercomputer
- **Training Configuration:**
  - Batch Size: 1 per device
  - Gradient Accumulation Steps: 8
  - Learning Rate: 2e-4
  - Max Steps: 1000
  - Save Steps: 200
  - Logging Steps: 10
- **LoRA Configuration:**
  - Rank: 8
  - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - Alpha: 16
  - Dropout: 0
- **Optimizations:**
  - Gradient Checkpointing: Enabled
  - 4-bit Quantization: Enabled
  - Max Sequence Length: 1024

## Model Architecture

The model uses a combination of efficient fine-tuning techniques and optimizations for handling conflict event classification: