FP4 in attention proj
2
#9 opened 8 days ago by yoursmin

Can this model run on a Hopper GPU?
3
#8 opened 9 days ago by simonlindelta

Can this model work with vLLM?
3
#7 opened 11 days ago by KimChen

Request for Detailed Benchmarking Setup with TensorRT-LLM on B200
#6 opened 12 days ago by StardusterLiu

Benchmark results compared to the original FP8 / INT4 quants, etc.?
4
#1 opened 18 days ago by CHNtentes