Commit 5b80ffd (verified) · parent: 18e3a44 · by rootxhacker

Add model card

Files changed (1): README.md (+44 lines)
---
license: apache-2.0
base_model: unsloth/Meta-Llama-3.1-8B-Instruct
tags:
- diffusion
- language-model
- llama
- text-generation
library_name: transformers
pipeline_tag: text-generation
---

# Llama-3.1-8B Diffusion Model (LAD)

This is a **Language Autoregressive Diffusion (LAD)** model based on Llama-3.1-8B-Instruct.

## Features
- 🎯 Dual mode: autoregressive + diffusion generation
- 🚀 Cosine noise schedule with 1000 timesteps
- 🧠 LoRA fine-tuning (rank 32)
- ⚡ Custom diffusion components

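The card names a cosine noise schedule with 1000 timesteps but does not give the formula. A common choice (an assumption here, not confirmed by this repo) is the cosine schedule of Nichol & Dhariwal, which defines the cumulative signal-retention coefficient ᾱ_t:

```python
import math

def cosine_alphas_cumprod(T=1000, s=0.008):
    """Cumulative alpha-bar for t = 0..T under a cosine noise schedule.

    This follows the Nichol & Dhariwal (2021) formulation; whether LAD
    uses exactly this variant is an assumption.
    """
    def f(t):
        return math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2
    # Normalize so that alpha_bar(0) == 1 (clean data at t = 0).
    return [f(t) / f(0) for t in range(T + 1)]

abar = cosine_alphas_cumprod()
# abar starts at 1.0 (no noise) and decays smoothly to ~0 at t = T.
```

The small offset `s` keeps the noise level from being vanishingly small at early timesteps, which is why this schedule is often preferred over a linear one for shorter sequences.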
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("rootxhacker/llama3-diffusion")
tokenizer = AutoTokenizer.from_pretrained("rootxhacker/llama3-diffusion")

# Generate text (standard autoregressive decoding)
inputs = tokenizer("The future of AI", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details
- Base: Meta-Llama-3.1-8B-Instruct
- Dataset: PatrickHaller/cosmopedia-v2-1B
- Framework: Unsloth + custom diffusion components
- Context length: 256 tokens
- Objective mix: 60% AR + 40% diffusion

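The card states only the overall 60/40 split between the autoregressive and diffusion objectives, not how batches are assigned. One simple way to realize such a mix (purely illustrative; the actual scheduling used for LAD is unknown) is to sample the objective per batch:

```python
import random

def pick_objective(rng):
    """Pick the training objective for one batch.

    Bernoulli sampling with p = 0.6 for the AR objective; the card only
    states the overall 60/40 split, so this scheduling is an assumption.
    """
    return "ar" if rng.random() < 0.6 else "diffusion"

rng = random.Random(0)
modes = [pick_objective(rng) for _ in range(10_000)]
frac_ar = modes.count("ar") / len(modes)
# frac_ar is close to 0.6 over many batches.
```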
Uploaded: 2025-06-08 23:13