---
license: other
license_name: mrl
license_link: https://mistral.ai/licenses/MRL-0.1.md
tags:
  - not-for-all-audiences
  - nsfw
language:
  - ja
---

# Kudryavka 8B Alpha


## Base Model

This model is built upon Ministral-8B-Instruct-2410.

## NOTICE

Licensed by Mistral AI under the Mistral AI Research License. By using this model, you agree to be bound by the terms of the Mistral AI Research License.

## Description

First and foremost, this model is under development and currently falls far short of the Berghof series.

I chose the 8B model simply because, in my environment, it's difficult to load models larger than 9B without quantization, and I wanted the largest size I could handle.
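As a rough illustration of why the size cutoff sits around 9B, the sketch below estimates weight memory alone at bfloat16 precision (2 bytes per parameter). The GPU capacity is not stated in this card, and real usage is higher once the KV cache and activations are included, so treat the numbers as a lower bound.

```python
# Back-of-the-envelope estimate of weight memory in bfloat16 (2 bytes per parameter).
# KV cache and activations add to this, so actual VRAM usage is noticeably higher.
for name, n_params in [("8B", 8e9), ("9B", 9e9)]:
    gib = n_params * 2 / 1024**3
    print(f"{name} weights: ~{gib:.1f} GiB")  # 8B ≈ 14.9 GiB, 9B ≈ 16.8 GiB
```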

Gemma2 9B was also an option, but I decided to pass on it after testing.

This model has the following characteristics:

- It can write relatively consistent continuations if given a sufficiently long prompt (roughly 1,000 tokens or more) beforehand
- It tends to be roundabout and verbose, sometimes providing excessive explanations in the narrative text, which often results in slower-paced story development
- It occasionally generates NSFW content in unexpected ways
- Its output undoubtedly tends to be repetitive, especially when the initial prompt is sparse
- It often fails to insert line breaks between dialogue and monologue. This is expected to be fixed in a later release of the series (hopefully). For now, you can simply ban the following tokens, which fixes the problem without any noticeable deterioration in quality (see the sketch below this list):
  `1010, 1032, 1548, 8620, 114840, 19397, 19428, 16850, 42260, 109315`
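Below is a minimal sketch of one way to ban those token ids with Hugging Face transformers via `bad_words_ids`. The repository id `Elizezen/Kudryavka-8B-alpha`, the prompt, and the sampling settings are assumptions for illustration, not the author's exact setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Elizezen/Kudryavka-8B-alpha"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Token ids the model tends to emit instead of proper line breaks.
banned_ids = [1010, 1032, 1548, 8620, 114840, 19397, 19428, 16850, 42260, 109315]

# A longer excerpt (~1,000 tokens of existing prose) works best as the prompt.
prompt = "ここに続きを書かせたい小説の本文を入れる。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,
    # bad_words_ids expects a list of token-id sequences,
    # so each banned id is wrapped in its own single-element list.
    bad_words_ids=[[i] for i in banned_ids],
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Most inference frontends expose an equivalent "banned tokens" or "token bans" setting, so the same fix should carry over outside of transformers.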

As a rule of thumb, I think it's best suited to situations where you already have a relatively complete piece of writing and want help continuing it, or to collaborative writing where you manually correct inconsistencies here and there. And for those who don't know, repetitiveness can be greatly reduced by using the DRY multiplier (a repetition-suppressing sampler available in several inference frontends).

## Intended Use

Primarily designed for novel generation. Not optimized for:

- Role-playing (RP) scenarios
- Instruction-based responses