Even though the perplexity scores of the pruned version are three times higher, the ARC, HellaSwag, MMLU, TruthfulQA, and WinoGrande scores hold up remarkably well considering that two layers (5 and 39) were removed. This seems to support Xin Men et al.'s conclusions in ShortGPT: Layers in Large Language Models are More Redundant Than You Expect (arXiv:2403.03853).
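For anyone who wants to reproduce this kind of ablation, here is a minimal sketch of dropping two decoder blocks from a causal LM, assuming a LLaMA-style architecture whose blocks live in model.model.layers (the model ID is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "your/base-model"  # placeholder; any causal LM with 40+ decoder blocks

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Drop decoder blocks 5 and 39 and keep the rest in their original order.
drop = {5, 39}
kept = torch.nn.ModuleList(
    layer for i, layer in enumerate(model.model.layers) if i not in drop
)
model.model.layers = kept
model.config.num_hidden_layers = len(kept)
# Depending on the transformers version, each block's self_attn.layer_idx
# may also need renumbering before the KV cache works correctly.

model.save_pretrained("pruned-model")
```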
A results summary is in the model card, and the full test results are in the ./scores directory. Questions and feedback are always welcome.
Introducing the ONNX model explorer: Browse, search, and visualize neural networks directly in your browser. 🤯 A great tool for anyone studying Machine Learning! We're also releasing the entire dataset of graphs so you can use them in your own projects! 🤗
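If you prefer poking at a graph from code rather than the browser UI, here is a quick sketch using the onnx Python package (the model path is a placeholder, and this is not the explorer's own code):

```python
from collections import Counter
import onnx

model = onnx.load("model.onnx")  # placeholder path to any ONNX file

# Count operator types for a quick overview of the graph's structure.
op_counts = Counter(node.op_type for node in model.graph.node)
for op, count in op_counts.most_common():
    print(f"{op}: {count}")

# Graph-level inputs and outputs by name.
print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])
```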
We just released a paper (NeuZip) that losslessly compresses model weights to cut VRAM usage, letting you run larger models. This should be particularly useful when VRAM is insufficient for training or inference. Specifically, we look inside each floating-point number and find that the exponent bits are highly compressible (as shown in the figure below).
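As a rough way to see the effect yourself, the sketch below (my own illustration, not the paper's code) compresses the exponent bytes and the low mantissa bytes of a random float32 matrix separately; the exponent bytes shrink substantially while the mantissa bytes barely compress:

```python
import zlib
import numpy as np

# Stand-in "weights": a normally distributed float32 matrix, roughly like an initialized layer.
w = np.random.randn(1024, 1024).astype(np.float32)
bits = w.view(np.uint32)

exponent = ((bits >> 23) & 0xFF).astype(np.uint8)  # the 8 exponent bits of each float32
mantissa_lo = (bits & 0xFF).astype(np.uint8)       # the lowest 8 mantissa bits

def compressed_fraction(x: np.ndarray) -> float:
    raw = x.tobytes()
    return len(zlib.compress(raw, 9)) / len(raw)

print(f"exponent bytes:    {compressed_fraction(exponent):.2f} of original size")
print(f"mantissa lo bytes: {compressed_fraction(mantissa_lo):.2f} of original size")
```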
You can use it to build patchflows: workflows that use LLMs for software development tasks like bug fixing, pull request review, library migration, and documentation.
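To make the idea concrete, here is a hypothetical, stripped-down illustration of what a patchflow amounts to; this is not the library's actual API, just a sketch of chained steps sharing a context, with the LLM call stubbed out:

```python
from typing import Callable

Context = dict
Step = Callable[[Context], Context]

def run_patchflow(steps: list[Step], context: Context) -> Context:
    # Each step reads the shared context, adds its results, and passes it on.
    for step in steps:
        context = step(context)
    return context

def scan_repo(ctx: Context) -> Context:
    # Placeholder scanner; a real step might run a linter or a vulnerability scan.
    ctx["issues"] = ["unused variable in main.py"]
    return ctx

def propose_fixes(ctx: Context) -> Context:
    # Placeholder for the LLM call that turns each issue into a candidate patch.
    ctx["patches"] = [f"suggested patch for: {issue}" for issue in ctx["issues"]]
    return ctx

print(run_patchflow([scan_repo, propose_fixes], {"repo": "./"}))
```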