Zoran
zokica
zokica's activity
Gemma 2's Flash attention 2 implementation is strange...
61
#23 opened 8 months ago
by
GPT007

Problem with LoRA fine-tuning: out of memory
3
#13 opened 7 months ago
by
zokica
OOM when fine-tuning with LoRA
5
#1 opened 7 months ago
by
zokica
PEFT out of memory
#2 opened 7 months ago
by
zokica
Model repeating information and "spitting out" random characters
8
#14 opened 8 months ago
by
brazilianslib
Gemma2FlashAttention2 missing sliding_window variable
2
#8 opened 8 months ago
by
emozilla

Why does batch size > 1 not increase model speed?
#41 opened 8 months ago
by
zokica
why UMT5
6
#1 opened 11 months ago
by
pszemraj

Something broken on last update
7
#85 opened 10 months ago
by
Nayjest
Can't get it to generate the EOS token and beam search is not supported
2
#3 opened about 1 year ago
by
miguelcarv
How to fine-tune this? + Training code
43
#19 opened about 1 year ago
by
cekal

Added token
1
#5 opened 11 months ago
by
zokica
Generation after fine-tuning does not end at EOS token
1
#123 opened 11 months ago
by
zokica
Attention mask for generation function in the future?
21
#7 opened over 1 year ago
by
rchan26

The model is extremely slow in 4-bit; is my code for loading OK?
#7 opened over 1 year ago
by
zokica
guanaco-65b
6
#1 opened almost 2 years ago
by
bodaay
Speed on CPU
13
#8 opened almost 2 years ago
by
zokica
Will you make a 3B model as well?
4
#7 opened almost 2 years ago
by
zokica
How do you run this?
3
#2 opened almost 2 years ago
by
zokica
How to run this?
3
#13 opened almost 2 years ago
by
zokica