XT-60M-v0.1 / config.json
{
  "device": "cuda",
  "dropout": 0.2,
  "n_blocks": 3,
  "transformer_config": {
    "block_size": 250,
    "dropout": 0.2,
    "n_embd": 244,
    "n_head": 6
  },
  "vocab_size": 8010,
  "xlstm_config": {
    "batch_size": 8,
    "block_size": 250,
    "config_block": "msm",
    "device": "cuda",
    "n_embd": 244
  }
}
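
A minimal sketch of how this config could be loaded and sanity-checked before building a model. The `summarize` helper and the `CONFIG_TEXT` constant are hypothetical names introduced for illustration and are not part of the repository. How the model code actually consumes `n_head` is not shown in this file, so the sketch only reports the integer head size and its remainder and does not treat a non-zero remainder as an error.

```python
import json

# The config shown above, embedded as a string so the sketch is self-contained.
CONFIG_TEXT = """
{
  "device": "cuda",
  "dropout": 0.2,
  "n_blocks": 3,
  "transformer_config": {
    "block_size": 250,
    "dropout": 0.2,
    "n_embd": 244,
    "n_head": 6
  },
  "vocab_size": 8010,
  "xlstm_config": {
    "batch_size": 8,
    "block_size": 250,
    "config_block": "msm",
    "device": "cuda",
    "n_embd": 244
  }
}
"""


def summarize(cfg):
    """Hypothetical helper: compare the two sub-configs and derive the head size."""
    t = cfg["transformer_config"]
    x = cfg["xlstm_config"]
    # Integer head size and leftover embedding dimensions (assumes a
    # per-head split of n_embd, which this file does not confirm).
    head_size, remainder = divmod(t["n_embd"], t["n_head"])
    return {
        "shared_block_size": t["block_size"] == x["block_size"],
        "shared_n_embd": t["n_embd"] == x["n_embd"],
        "shared_dropout": t["dropout"] == cfg["dropout"],
        "head_size": head_size,
        "head_remainder": remainder,
    }


cfg = json.loads(CONFIG_TEXT)
summary = summarize(cfg)
print(summary)
```

The two sub-configs agree on `block_size` and `n_embd`, which is what one would expect if their blocks are stacked on the same residual stream. That reading is an assumption, since the file itself does not describe the architecture.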