metadata
language:
  - en
tags:
  - llama-2
  - instruct
  - instruction
  - writing
  - story
pipeline_tag: text-generation
license: other

Waxwing-Storytelling-70B-LoRA model card

Waxwing is a storytelling LoRA for Llama 2 70B.

  • Guide the story with Waxwing's turn-based instruction system.
  • Tailor the feel of your story using style tags.
  • Experience storytelling free of ChatGPT's idiosyncrasies, thanks to a "human-generated" dataset of public domain writing. Waxwing avoids GPT-isms like positivity bias, "bond" emphasis, rushed endings and exaggerated stylistic tics.

Waxwing is available*:

  • LoRA: Available on this branch; it can be applied at runtime to any variant of the Llama 2 70B base model.
  • 16fp model: Merged into the base Llama 2 70B model, unquantized at 16-bit floating point, in the 16fp branch.
  • Quantized for use with Exllama 2:

*This is roughly 400 GB of model files, so it might take a while to upload.

By using this model, you take full responsibility for anything done with its outputs.

Model Details

Model Description

  • Developed by: alac
  • Model Type: QLoRA
  • Finetuned from model: Llama-2 70B
  • Language(s): English

Dataset

Waxwing was trained on a small dataset gathered from public domain writing. The exact dataset will remain private, but the code used to generate prompts and metadata is available on GitHub. Upstage's SOLAR model was used to tag the dataset.

Prompt Template

### System:
A chat between a user and a writing assistant.
{context}

### User:
{style tags}
Write a scene where: {events that should happen in the next scene}

### Assistant:
{output}

{context} is an optional story synopsis. {style tags} should be a string along the lines of:

Tone: {list of tones}. Writing style: {list of writing styles}.
Written with {slow|medium|fast} pacing, in moment to moment detail, in {abstract|selective|vivid sensory} detail, from a {First|Third Person (Character)} perspective.

The exact values it was trained on are in the dataset_tags.json file. Anecdotally, it works better when given only a subset of the style tags (e.g. "Tone: tense") or tags that complement each other (e.g. "Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail."). It's unclear how well Waxwing responds to tags that it was not trained on (e.g. 'genre').
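The template and style tags above can be assembled programmatically. The following is a minimal sketch, not the author's code: the function names `build_style_tags` and `build_prompt` are hypothetical, the tags are joined on a single line rather than two, the fixed "in moment to moment detail" clause is left out for brevity, and ending the prompt with a newline after "### Assistant:" is an assumption. Valid tag values should be taken from dataset_tags.json.

```python
from typing import Optional, Sequence

# System line copied from the prompt template above.
SYSTEM_LINE = "A chat between a user and a writing assistant."


def build_style_tags(
    tones: Sequence[str] = (),
    writing_styles: Sequence[str] = (),
    pacing: Optional[str] = None,       # e.g. "slow", "medium", "fast"
    detail: Optional[str] = None,       # e.g. "abstract", "selective", "vivid sensory"
    perspective: Optional[str] = None,  # e.g. "First Person (Character)"
) -> str:
    """Assemble a style-tag string, skipping any tag that was not supplied."""
    sentences = []
    if tones:
        sentences.append(f"Tone: {', '.join(tones)}.")
    if writing_styles:
        sentences.append(f"Writing style: {', '.join(writing_styles)}.")
    clauses = []
    if pacing:
        clauses.append(f"with {pacing} pacing")
    if detail:
        clauses.append(f"in {detail} detail")
    if perspective:
        clauses.append(f"from a {perspective} perspective")
    if clauses:
        sentences.append("Written " + ", ".join(clauses) + ".")
    return " ".join(sentences)


def build_prompt(scene_events: str, style_tags: str = "", context: str = "") -> str:
    """Fill the prompt template; context and style tags are both optional."""
    system_lines = [SYSTEM_LINE]
    if context:
        system_lines.append(context)
    user_lines = []
    if style_tags:
        user_lines.append(style_tags)
    user_lines.append(f"Write a scene where: {scene_events}")
    system_block = "\n".join(system_lines)
    user_block = "\n".join(user_lines)
    # Assumption: the model's output is generated directly after this header.
    return f"### System:\n{system_block}\n\n### User:\n{user_block}\n\n### Assistant:\n"


if __name__ == "__main__":
    tags = build_style_tags(
        tones=["tense", "mysterious"],
        writing_styles=["dramatic"],
        detail="abstract",
    )
    print(build_prompt("the detective finds the letter", style_tags=tags))
```

Passing only a few tags, as in the usage example, follows the anecdotal advice that a small or complementary set of tags tends to work better.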

For SillyTavern users, the style tags work well in the "Author's Note" field at depth 1. User messages should begin with "Write a scene where: "; to continue a scene, just type "continue". Most testing was done using the Genesis preset.

Training

Waxwing was trained on a single machine with 72GB of VRAM. The training parameters are available in the training_parameters.json file of the main branch. The software used to train was FartyPants' Training_PRO extension for the Oobabooga Text Generation WebUI.