---
tags:
- lora
---
<!-- header start -->
<div style="width: 100%;">
    <img src="https://media.tenor.com/frGCmLDFbkMAAAAC/karen-ok.gif" alt="FPHam's Karen" style="width: 30%; min-width: 200px; display: block; margin: auto;">
</div>
<!-- header end -->

## Karen is an editor for your fiction. (v.0.2)

She fixes grammar and wording issues, but doesn't necessarily start rewording everything into corporate talk like ChatGPT does. So she should keep your style intact.

Based on LLAMA 13b and the Wizard-Vicuna-uncensored finetune, then finetuned with about 20k grammar examples (bad grammar/good grammar).

## Quantized version (Quantized by TheBloke)

* [4-bit GPTQ models for GPU inference](https://huggingface.co/FPHam/Karen_theEditor-13B-4bit-128g-GPTQ)
* [4-bit, 5-bit and 8-bit GGML models for CPU(+GPU) inference](https://huggingface.co/TheBloke/Karen_theEditor_13B-GGML)

Karen gets triggered by this prompt (pun intended):

```
USER: Edit the following for spelling and grammar mistakes: 
ASSISTANT:
```

Feed it one paragraph at a time, just a few sentences - that's where she works best.
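
If you want to script the paragraph-at-a-time approach, here is a minimal sketch. It assumes paragraphs are separated by blank lines and that the text goes on the line after the instruction, as in the example further down. The helper names (`split_paragraphs`, `build_prompt`, `build_prompts`) are made up for illustration and are not part of any library; sending the prompts to the model is left to whatever backend you use.

```python
import re

# Instruction line taken from the prompt format above.
INSTRUCTION = "USER: Edit the following for spelling and grammar mistakes:"


def split_paragraphs(text):
    """Split text on blank lines and drop empty chunks (assumed paragraph rule)."""
    chunks = re.split(r"\n\s*\n", text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def build_prompt(paragraph):
    """Wrap one paragraph in the USER/ASSISTANT prompt format."""
    return f"{INSTRUCTION}\n{paragraph}\nASSISTANT:"


def build_prompts(text):
    """Return one prompt per paragraph, in document order."""
    return [build_prompt(paragraph) for paragraph in split_paragraphs(text)]
```

Each string returned by `build_prompts` can then be sent to the model on its own, and the replies joined back together with blank lines between them.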

Here is my instruct yaml for oobabooga webui (in Instruct mode):

```
user: "USER: Edit the following for spelling and grammar mistakes: "
bot: "ASSISTANT:"
turn_template: "<|user|> <|user-message|>\n<|bot|> <|bot-message|></s>\n"
context: "Assistant is a meticulous senior editor with a specialization in editing fictional stories. When given text, Assistant detects and corrects grammatical errors, including subject-verb agreement, tense consistency, punctuation, capitalization, use of correct articles and correct present perfect and past perfect tense.\n"
```

I am not sure to what extent the context part has any effect. It doesn't hurt though.

This is version 0.2, finetuned just enough that it retains the underlying Vicuna qualities.

## Example:

Input
```
USER: Edit the following for spelling and grammar mistakes:
unce upon time lived name John who have pet monkey Jackie.
one day while walkin down the street he saw his friend Tom coming to him.
"Hey Tom" said John "hows it going?"

Tom replyied "Not too bad how 'bout yourself?". as they was talk Jackie sudden jump onto Tom shoulder and started play with his hair.
Tom look at John and ask "Is that yours?"

John reply "Yeah, this crazy lil guy follow me everywhere" just then Jackie grab hold of Tom glasses and try to take thmm off.
Tom struggle to keep his balance as he laugh hystericaly.
```
Output
```
ASSISTANT: Once upon a time there lived a man named John who had a pet monkey called Jackie.
One day while walking down the street he saw his friend Tom approaching him.
"Hey Tom," said John. "How's it going?"

Tom replied, "Not too bad, how about yourself?" As they were talking, Jackie suddenly jumped onto Tom's shoulder and began playing with his hair.
Tom looked at John and asked, "Is that yours?"

John replied, "Yeah, this crazy little guy follows me everywhere." Just then Jackie grabbed hold of Tom's glasses and tried to take them off.
Tom struggled to keep his balance as he laughed hysterically.
```

## Goal: to create the best grammar checker you have ever seen

## To do:
- train on a larger dataset: big, enormous, gargantuan
- see if finetuning on just plain LLAMA without Vicuna would work better or worse (the theory is that it will be very focused on editing and nothing else)
- explore what different settings (temperature, top_p, top_k) do for this type of finetune
- create Rachel, the paraphrasing editor