Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
#13 opened 6 days ago by septemberlemon, 1 comment

About the rope_theta
#12 opened 11 days ago by HongyuZang

about the "model_max_length": 16384
#11 opened 15 days ago by AlexWuKing

Questions about the input format
#10 opened 18 days ago by volcanos

Update tokenizer_config.json
#8 opened about 2 months ago by robinxw

Two SafeTensor Files
#7 opened about 2 months ago by RadwanBakor, 1 comment

Add pipeline tag
#5 opened 2 months ago by nielsr

Vocab size in config.json mismatches the actual tokenizer size
#4 opened 2 months ago by Fizzarolli, 5 comments

high-low mode
#3 opened 2 months ago by LD-inform, 1 comment

System Prompt
#2 opened 2 months ago by Wanfq, 13 comments