---
license: llama3.3
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
library_name: transformers
tags:
- not-for-all-audiences
- nsfw
- mergekit
- merge
---
# L3.3-70B-Lycosa-v0.2

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

Changes from v0.1:
* Dropped llama-3.3-70b-instruct as a pivot to further reduce positive bias. No noticeable impact on reasoning.
* Added DeepSeek-R1-Distill-Llama-70B as a target model for improved reasoning.

An RP merge with a focus on:
* model intelligence
* removing positive bias
* creativity

28
+ This model was merged using the sce merge method using deepseek-ai/DeepSeek-R1-Distill-Llama-70B as a base.
29
+ <br>\
30
+ The included DeepSeek-R1-Distill-Llama-70B chat template is recommended.
31
+ ```txt
32
+ <|begin▁of▁sentence|>system prompt here<|User|>user 1st message here<|Assistant|>assistant 1st response here<|end▁of▁sentence|><|User|>user 2nd message here<|Assistant|>
33
+
34
+ ```
35
+
36
+ The llama3 chat template is no longer recommended due to the increased Deepseek-R1 influence in this v0.2 merge.
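For reference, here is a minimal sketch of rendering that template with `transformers`. The repo id `divinetaco/L3.3-70B-Lycosa-v0.2` is an assumption based on this card's name; substitute a local path or quantized copy as needed.

```python
# Minimal sketch: render the bundled DeepSeek-R1-Distill chat template.
# The repo id below is assumed from this card's name, not confirmed by it.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("divinetaco/L3.3-70B-Lycosa-v0.2")

messages = [
    {"role": "system", "content": "system prompt here"},
    {"role": "user", "content": "user 1st message here"},
]

# add_generation_prompt=True appends the trailing <|Assistant|> tag so the
# model continues as the assistant, matching the template shown above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```
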
<img src="https://huggingface.co/divinetaco/L3.3-70B-Lycosa-v0.1/resolve/main/lycosa.png">

### Models Merged

The following models were included in the merge:
* deepseek-ai/DeepSeek-R1-Distill-Llama-70B
* Sao10K/70B-L3.3-Cirrus-x1
* TheDrummer/Nautilus-70B-v0.1
* Doctor-Shotgun/L3.3-70B-Magnum-v4-SE
* SicariusSicariiStuff/Negative_LLAMA_70B

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  # Pivot model
  - model: SicariusSicariiStuff/Negative_LLAMA_70B
  # Target models
  - model: Sao10K/70B-L3.3-Cirrus-x1
  - model: TheDrummer/Nautilus-70B-v0.1
  - model: Doctor-Shotgun/L3.3-70B-Magnum-v4-SE
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
merge_method: sce
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
parameters:
  select_topk: 1.0
dtype: bfloat16
```

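The config above can be re-run with mergekit's `mergekit-yaml` CLI, or via its Python API as in this sketch. The filename and option values here are illustrative assumptions, not the settings used for this release.

```python
# Sketch: re-running the merge via mergekit's Python API.
# Roughly equivalent CLI: mergekit-yaml lycosa-v0.2.yaml ./L3.3-70B-Lycosa-v0.2 --cuda
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# "lycosa-v0.2.yaml" is a hypothetical local copy of the YAML shown above.
with open("lycosa-v0.2.yaml", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    out_path="./L3.3-70B-Lycosa-v0.2",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # GPU strongly recommended at 70B scale
        copy_tokenizer=True,             # carry the base tokenizer (and its chat template) over
        lazy_unpickle=True,              # lower peak RAM while reading shards
    ),
)
```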