Hafez Mousavi
hafezmg48
AI & ML interests
LLMs, Transformers
Organizations
None yet
hafezmg48's activity
Does Qwen use RMSNorm or LayerNorm?
1
#21 opened 6 months ago
by
hafezmg48
Why does the 72B model have a different vocab size compared with other models?
7
#1 opened 11 months ago
by
Mikasaka
Intermediate_size is doubled in config.json
1
#3 opened 9 months ago
by
hafezmg48