It's incredible!
Nice work, thanks for sharing!
Can you share the code?
Glad you like it! You can replicate it with wassname's notebook: https://gist.github.com/wassname/42aba7168bb83e278fcfea87e70fa3af
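The core step is roughly this (a minimal sketch of the refusal-direction orthogonalization idea, not the notebook's exact code; which layer you read activations from, which prompt sets you use, and which weight matrices you edit are choices you make yourself):

```python
# Minimal sketch of refusal-direction removal, assuming you have already
# collected residual-stream activations for harmful and harmless prompts.
import torch

def refusal_direction(harmful_acts: torch.Tensor, harmless_acts: torch.Tensor) -> torch.Tensor:
    """Unit vector along the mean activation difference.

    harmful_acts / harmless_acts: [n_prompts, d_model] activations taken at
    one layer/position for the two prompt sets.
    """
    direction = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
    return direction / direction.norm()

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component of each output column that points along `direction`,
    so this layer can no longer write that direction into the residual stream.

    weight: [d_model, d_in] matrix that writes into the residual stream
    (e.g. attention output projection or MLP down-projection).
    """
    direction = direction / direction.norm()
    return weight - torch.outer(direction, direction @ weight)
```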
Thanks for the answer! What is the difference between your models?
Just kept retrying until there were only 2 failures out of 128. I plan to do one with ~500 instructions too.
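For counting failures I just look for refusal-looking replies over the test instructions, something along these lines (a rough sketch; `generate_fn` is a placeholder for whatever generation wrapper you already have, and the marker strings are my own guesses, not from the notebook):

```python
# Crude refusal counter: string matching is rough, but it's enough to
# compare one ablation run against another.
REFUSAL_MARKERS = ("I cannot", "I can't", "I'm sorry", "As an AI")

def count_refusals(generate_fn, instructions: list[str]) -> int:
    failures = 0
    for inst in instructions:
        reply = generate_fn(inst)  # placeholder generation wrapper
        if any(marker.lower() in reply.lower() for marker in REFUSAL_MARKERS):
            failures += 1
    return failures

# e.g. re-run the ablation with a different layer/direction until
# count_refusals(generate_fn, test_instructions) <= 2 out of 128
```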
Released both 500 and 1000 now, though I haven't had the time yet to try them and see whether response quality scales with the number of instructions:
https://huggingface.co/Apel-sin/llama-3-8B-ortho-by-edgerunners-v4_exl2_8.0bpw
Sometimes (consistently on certain queries) the model cuts the message off abruptly. But in my short, subjective tests, this is the best version so far.
Have you ever used the cognitivecomputations/Llama-3-8B-Instruct-abliterated-v2 model? In my short tests, it works more stably and predictably.
What GPU are you using for experiments?
I haven't used it; an H100 SXM.
Try it, it's really interesting. Based on: https://huggingface.co/failspy/llama-3-70B-Instruct-abliterated/blob/main/ortho_cookbook.ipynb
Unfortunately, I don't understand how it works :(
At a quick glance, that's just wassname's notebook copy-pasted; it uses the default CSV instead of 3000 instructions.
Can you share your "3000 instructions" dataset? It would be very interesting to see :)
It's not my own for now; I just concatenated a bunch of toxic DPO datasets. I'm working on a dedicated one, however, and will open-source it when it's ready.
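Roughly what "concatenated a bunch of toxic DPO datasets" looks like in practice (a sketch with placeholder dataset names and column names, not the exact sets I used):

```python
# Pull the prompt column out of each DPO-style dataset and stack them into
# one instruction list to feed the notebook instead of its default CSV.
from datasets import load_dataset, concatenate_datasets

SOURCES = [
    ("some-org/toxic-dpo-set-a", "prompt"),   # placeholder repo/column
    ("some-org/toxic-dpo-set-b", "prompt"),   # placeholder repo/column
]

parts = []
for repo_id, prompt_col in SOURCES:
    ds = load_dataset(repo_id, split="train")
    parts.append(ds.select_columns([prompt_col]).rename_column(prompt_col, "instruction"))

instructions = concatenate_datasets(parts)
instructions.to_csv("instructions.csv")  # drop-in replacement for the notebook's default CSV
```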