loiccabannes committed
Update README.md
README.md CHANGED
@@ -1,3 +1,25 @@
 ---
 license: apache-2.0
+datasets:
+- SkelterLabsInc/JaQuAD
+language:
+- ja
 ---
+
+MambaSan-370m 🐍
+
+MambaSan-370m is the first Japanese chat language model based on a state-space model architecture (Mamba).
+
+The model is based on Albert Gu and Tri Dao's work Mamba: Linear-Time Sequence Modeling with Selective State Spaces (paper), as well as their model implementation.
+
+The code used for pretraining will soon be published on my GitHub: https://github.com/lcabannes
+Citation
+
+bibtex
+@misc{lcabannes2024MambaSan-370m-instruct,
+title = {MambaSan-370-instruct},
+author = {Loïc Cabannes},
+year = {2024},
+howpublished = {HuggingFace},
+url = {https://huggingface.co/loiccabannes/MambaSan-370m/}
+}