VNTL Settings Guide for ST

by Casual-Autopsy - opened 15 days ago

Discussion

Casual-Autopsy

15 days ago

•

edited 1 day ago

Alright, I think I've mess around with this model enough to create a guide, so here we go...

SillyTavern

Prerequisite

lmg-anon/vntl-llama3-8b-v2-hf (Obviously)
- Imatrix GGUFs by yours truly (Uses a multilingual fork of Bartowski's imatrix dataset. GGUF requests are welcomed.)
Snowflake/snowflake-arctic-embed-l-v2.0 (Optional. If you're using RAG and your pc can handle a bigger embedding model, use this instead of default)
- GGUFs by yours truly (GGUF requests are welcomed.)
SillyTavern
Extensions
- Required:
  - LALib https://github.com/LenAnderson/SillyTavern-LALib
  - Message Actions https://github.com/LenAnderson/SillyTavern-MessageActions
  - Send Button https://github.com/LenAnderson/SillyTavern-SendButton
- Recommended:
  - Input History https://github.com/LenAnderson/SillyTavern-InputHistory
  - Keyboard https://github.com/LenAnderson/SillyTavern-Keyboard
  - Notebook https://github.com/SillyTavern/Extension-Notebook
  - Chat Top Bar https://github.com/SillyTavern/Extension-TopInfoBar
  - Backup Browser https://github.com/LenAnderson/SillyTavern-BackupsBrowser
  - Message Limit https://github.com/SillyTavern/Extension-MessageLimit
  - Auto Focus https://github.com/LenAnderson/SillyTavern-AutoFocus
My ST Master Preset
AutoHotkey
Luna Translator
an LLM Backend(KoboldCpp recommended. If you want to host a Quanted embedding model, use llama.cpp with the --embedding flag)
PC specs that allow you to run VNTL at atleast 10 t/s(5 t/s if you're desperate enough)

Backend Settings

Due to overfitting with 1k token only dataset, translation quality diminishes past the 1k point, but with the help of RoPE, this isn't too much of an issue, so if you use SillyTavern instead of Luna Translator as your LLM frontend, then use these backend configs to fix the quality drop.

Context: 4096

Custom RoPE:
  - RoPE Base: 6315084.4

TextGen

While the model card says to use neutral samplers with 0 temp, I've found these settings to work rather well, increasing writing quality, and even translation accuracy. Keep in mind the golden garbage in, garbage out rule and fix any mistakes you find when using these settings.

QR's

Note: Import my QR set instead. It's more up-to-date.

Eventually you'll end up with some translation weirdness that's hard to figure out and fix, in cases like that you can start the context anew by hiding all messages from the prompt. Here's the QR to do so.

/hide 0-{{lastMessageId}}

There will also be cases where you notice lag. SillyTavern isn't made to handle large amounts of messages loaded at once, so make sure to enable a message load limit when opening a chat in User Settings > Chat/Message Handling > # Msg. to Load

Once done that, create a QR to reload the chat whenever things start to get a bit laggy:

/chat-reload

QoL

Document Mode: setting chat style tp document mode allows to quickly enter message edit mode with any chat message by simply double clicking.

Auto-save message edits: Enabling auto-saving for message edits allows to quickly leave message edit mode with Esc and pairs well with document mode.

Setting Up RAG/Vector Storage for Better Translation Quality/Consistency

1. Vector Storage

Use the following images to set up Vector Storage:

1. Main STscript

Note: You can also download and import my Message Actions QR Set and skip to the final paragraph of this part.

Create a new QRset called MTL MesAct and create a QR called Chat-Pair RAG. Paste the following script:

/message-get {{mes::id}} |
/= pipe.is_user |
/let isuser {{pipe}} |

/if left={{var::isuser}} else={:
    /abort quiet=false QR must be used on a user role message. Aborting. |
:}
{:
    /if left={{mes::id}} right={{lastmessageid}} rule=eq {:
        /abort quiet=false No message pair. Aborting. |
    :}|
:}|


/add {{mes::id}} 1 |
/let engid {{pipe}} |

/message-get {{var::engid}} |
/= pipe.is_user |
/let isassist {{pipe}} |

/if left=isassist {:
    /abort quiet=false Improper pair. Aborting. |
:}|

/message-get {{mes::id}} |
/= pipe.name |
/let name_user {{pipe}} |

/message-get {{var::engid}} |
/= pipe.name |
/let name_assist {{pipe}} |


/messages names=on {{mes::id}}-{{var::engid}} |

/let mesGrab {{pipe}} |

/re-replace find="/{{var::name_user}}: /" replace="<\|start_header_id\|>Japanese<\|end_header_id\|>{{newline}}{{newline}}" {{var::mesGrab}} |
/re-replace find="/\n\n{{var::name_assist}}: /" replace="<\|eot_id\|><\|start_header_id\|>English<\|end_header_id\|>{{newline}}{{newline}}" {{pipe}} |

/let mesInst "{{pipe}}<\|eot_id\|>" |

/let entName "" |
/input rows=1 Write data-back entry name(Recommended to use the name of the character speaking) |

/if left="{{pipe}}" right="" rule=eq else={:
    /var key=entName as=string "{{pipe}}" |
:}
{:
    /var key=entName as=string "Translation-Snip" |
:}|


/databank-add name="{{var::entName}}_{{mes::id}}-{{var::engid}}" "{{var::mesInst}}" |

Set to the following Icon:

Once done, send the cmd /messageactions "MTL MesAct" | to add it as a button to messages in the expandable menu.

3. Usage

The best way to use RAG to increase quality is to turn every chat pair where you've fixed the translation, and only your fixed translations.

Make sure to only add fixes you are absolutely positive is correct, else RAG will have the opposite effect on translation quality

Casual-Autopsy

15 days ago

•

edited 1 day ago

Luna

ToDo

Casual-Autopsy

14 days ago

•

edited 1 day ago

AutoHotkey Setup

Finally, The AutoHotkey script to tie it togetther. When the Japanese text is sent to the clipboard, AutoHotkey will send the text in SillyTavern to be translated before switching back to the set target window.

Hotkeys

Ctrl + Alt + T: Target window for switching back to after sending the Japanese text.
Ctrl + Alt + E: Enable/Disable Auto-pasting to SillyTavern

The Script

#SingleInstance Force
#Requires AutoHotKey v2.0+

previousClipboard := A_Clipboard
targetWindow := ""
grabToggle := true

CheckClipboard() {
    global previousClipboard
    currentClipboard := A_Clipboard

    if (currentClipboard != previousClipboard) {
        previousClipboard := currentClipboard
        HandleClipboardChange(currentClipboard)
    }
}

HandleClipboardChange(currentClipboard) {
    win := WinExist("SillyTavern")
    if (win AND grabToggle) {
        WinActivate(win)

        WinWaitActive(win)

        Send(currentClipboard)

        Sleep(500)

        Send("{Enter}")

        Sleep(500)

        WindowSwitchBack()
    }
}

WindowSwitchBack() {
    global targetWindow
    win := WinExist(targetWindow)
    if (win) {
        WinActivate(win)

        WinWaitActive(win)
    }
}

SetTimer(CheckClipboard, 500)

^!T:: {
    global targetWindow
    targetWindow := WinGetTitle("A")
    MsgBox("Target window set to: " targetWindow,,"T2")
}

^!E:: {
    global grabToggle
    if (grabToggle) {
        grabToggle := false
        MsgBox("Auto-paste: Disabled",,"T2")
    } else {
        grabToggle := true
        MsgBox("Auto-paste: Enabled",,"T2")
    }
}

Casual-Autopsy

12 days ago

•

edited 1 day ago

Update 1:

Added QoL section and updated sampler settings in TextGen

Update 2:

Added RAG setup guide
Added Prerequisite section
Updated AutoHotkey script

Update 3:

Update Vector Storage settings

Update 4:

Update Prerequisite section
- Added Auto Focus extension
- Added embedding model recommendation and GGUFs
- Added VNTL and GGUFs
- Update recommended backends

Update 5:

Update QR STscripts
Added links to QR set exports
Edit and reformat AutoHotkey setup

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment