---
base_model:
- unsloth/Mistral-Nemo-Base-2407-bnb-4bit
library_name: transformers
tags:
- unsloth
- trl
- sft
license: apache-2.0
---

I have no idea what I’m doing… if this causes the apocalypse, someone please let me know.

# Luca-MN-bf16 8.0bpw h8 EXL2

Includes a [measurement.json](https://huggingface.co/FuturisticVibes/Luca-MN-bf16-8.0bpw-h8-exl2/tree/measurement) file for further quantization.

I’ll be taking a break until mid-2025 to recoup some funds. I might still do small models occasionally, but anything big will have to wait. See you all next year! 😊

Original Model: https://huggingface.co/rAIfle/Luca-MN-bf16

# Original Model Card

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6569a4ed2419be6072890cf8/T_ITjuaHakgamjwuElcAs.png)

# Luca-MN-bf16

This thing was just intended as an experiment, but it turned out quite good. I had it both name itself and write the imagegen prompt for its own card image.

Created by running a high-rank LoRA pass over Nemo-Base with 2 epochs of some RP data, then a low-rank pass with 0.5 epochs of the c2 data, then 3 epochs of DPO using [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1).

## Prompting

Use the `Mistral V3-Tekken` context and instruct templates. Temperature at about `1.25` seems to be the sweet spot, with either MinP at `0.05` or TopP at `0.9`. DRY, Smoothing, etc. are up to your preference. A minimal generation sketch using these settings follows at the end of this card.

## Quantized versions

- [iMat GGUFs](https://huggingface.co/Quant-Cartel/Luca-MN-iMat-GGUF), courtesy of the [Quant-Cartel](https://huggingface.co/Quant-Cartel/)
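## Example: recommended settings in code

As a concrete reference for the sampler values above, here is a minimal sketch of loading this EXL2 quant and generating with them. It assumes the `exllamav2` Python package and its classic `ExLlamaV2BaseGenerator` API (newer releases also offer a dynamic generator); the model path and the prompt are placeholders, and most frontends will handle the instruct template for you.

```python
# Minimal sketch: load this EXL2 quant with exllamav2 and generate using the
# sampler values recommended above. Model path and prompt are placeholders.
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Luca-MN-bf16-8.0bpw-h8-exl2"  # local download of this repo
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Recommended samplers: temperature ~1.25 with either MinP 0.05 or TopP 0.9.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 1.25
settings.min_p = 0.05
settings.top_p = 1.0  # disabled here; set to 0.9 (and min_p to 0) for the TopP variant

# Mistral V3-Tekken instruct format: no spaces inside the [INST] tags.
prompt = "<s>[INST]Write the opening scene of a mystery set on a night train.[/INST]"

output = generator.generate_simple(
    prompt, settings, num_tokens=300, encode_special_tokens=True
)
print(output)
```

DRY and Smoothing are not part of this basic sampler object; configure them in a frontend such as SillyTavern or TabbyAPI if you want them.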