jisukim8873 committed on
Commit
b4bbe96
1 Parent(s): b184dbb

Update README.md

Files changed (1)
  1. README.md +19 -1
README.md CHANGED
@@ -9,4 +9,22 @@ language:
  # Model Details
  * Model Description: This model is a test for data ordering.
  * Developed by: Juhwan Lee
- * Model Type: Large Language Model
+ * Model Type: Large Language Model
+
+ # Model Architecture
+ This model is based on Mistral-7B-v0.1. We fine-tuned it for the data ordering task.
+
+ Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
+
+ * Grouped-Query Attention
+ * Sliding-Window Attention
+ * Byte-fallback BPE tokenizer
+
+ # Dataset
+ We randomly sampled the Open-Orca dataset and fine-tuned the model on 100,000 examples.
+
+ # GitHub
+ https://github.com/trailerAI
+
+ # License
+ Apache License 2.0
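
The Dataset section of the updated README says the training data is a random sample of 100,000 examples from Open-Orca, but it does not say how the sample was drawn. The following is a minimal sketch of one plausible approach, assuming uniform sampling without replacement and a fixed seed for reproducibility. The function name `sample_subset`, the seed value, and the toy records are hypothetical and stand in for the real dataset; this is not the authors' actual pipeline.

```python
import random


def sample_subset(records, k=100_000, seed=0):
    """Draw k records uniformly at random, without replacement.

    `records` is any list-like collection of examples. The seed is an
    assumption added here so that the same subset can be reproduced.
    """
    if k > len(records):
        raise ValueError(f"cannot sample {k} records from {len(records)}")
    rng = random.Random(seed)  # local generator, leaves global state untouched
    return rng.sample(records, k)


# Toy stand-in for the real dataset (hypothetical fields).
toy_records = [{"id": i, "question": f"question {i}"} for i in range(1000)]
subset = sample_subset(toy_records, k=100, seed=42)
```

With the real data, `records` would be the loaded Open-Orca examples and `k` would stay at its default of 100,000. Using a local `random.Random` instance keeps the sample independent of any other random calls in the training script.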