TroyDoesAI committed
Commit e7f94f8 · verified · 1 parent: 5b5186f

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -3,3 +3,20 @@ license: apache-2.0
---

Base Model: Just Merged ~ No Training Gates After Merge
+
+ ### Model Overview
+
+ I have developed a Mixture of Experts (MoE) architecture with two always-active experts designed to work together for Python instruction tuning. Each expert possesses a distinct skill:
+
+ - **Expert 1**: Specializes in generating Mermaid diagrams, primarily from Python code, which requires a deep understanding of code structures and logic.
+ - **Expert 2**: Focuses on strict context obedience, ensuring that the model only generates outputs based on the provided instructions.
+
+ ### Why Always-Active MoE is Optimal
+
+ In this model, both experts are always active for each token, allowing them to complement each other:
+
+ - **Expert 1’s competence in Python structures** enhances the model's ability to generate correct and structured Python code.
+ - **Expert 2’s context obedience** ensures that the output remains aligned with the user’s instructions, preventing unnecessary or irrelevant outputs, such as Mermaid diagrams, unless explicitly requested.
+
+ This setup allows me to efficiently train the model for Python instruction following. By leveraging both experts simultaneously, I ensure that the model generates syntactically correct Python code while strictly adhering to user prompts.
+
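The "always-active" routing described in the README corresponds to a dense (non-sparse) MoE layer: a gate blends the output of every expert for every token instead of selecting a top-k subset. Below is a minimal PyTorch sketch of such a two-expert block, assuming a standard feed-forward expert design; the class name, expert names, and dimensions are illustrative placeholders, not the merged model's actual implementation.

```python
# Minimal sketch (not the author's code) of an "always-active" two-expert
# MoE feed-forward block. With only two experts and no top-k routing, the
# gate produces per-token mixing weights and both experts contribute to
# every token.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoExpertDenseMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        # Two independent feed-forward "experts"; the names are illustrative.
        self.expert_code_structure = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.expert_context_obedience = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        # The gate emits two mixing weights per token; softmax keeps both
        # weights strictly positive, so neither expert is ever switched off.
        self.gate = nn.Linear(d_model, 2)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, d_model)
        weights = F.softmax(self.gate(hidden_states), dim=-1)  # (B, T, 2)
        out_a = self.expert_code_structure(hidden_states)
        out_b = self.expert_context_obedience(hidden_states)
        # Both experts are applied to every token and blended by the gate.
        return weights[..., 0:1] * out_a + weights[..., 1:2] * out_b


# Quick shape check with random inputs.
layer = TwoExpertDenseMoE(d_model=64, d_ff=256)
x = torch.randn(2, 8, 64)
print(layer(x).shape)  # torch.Size([2, 8, 64])
```

Because the softmax gate assigns a nonzero weight to both experts, gradients flow through both on every token, which is what would let one expert's knowledge of code structure and the other's instruction adherence be trained jointly rather than in isolation.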