---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
base_model: teknium/OpenHermes-2.5-Mistral-7B
datasets:
- cnbeining/sentence-segmentation-dpo-raw
---

# OpenHermes-2.5-Mistral-7B-Sentence-Segmentation

_See the repository files for the original notebook used for finetuning._

## Model description

`OpenHermes-2.5-Mistral-7B-Sentence-Segmentation` is an OpenHermes model finetuned with DPO for sentence segmentation: given a sentence as a list of words, it splits that list into shorter segments.

This model is based on `teknium/OpenHermes-2.5-Mistral-7B`, a state-of-the-art chat-aligned 7B model.  

## Example Outputs

The model has been finetuned with the [ChatML](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/openai/includes/chat-markup-language.md#messages) template. A prompt looks like this:

```
<|im_start|>system
Segment:<|im_end|>
<|im_start|>user
```yaml
"input":
  "sentence":
    "segment":
    - "word": "Shere,"
    - "word": "in"
    - "word": "your"
    - "word": "report"
    - "word": "on"
    - "word": "female"
    - "word": "sexuality,"
    - "word": "men"
    - "word": "were"
    - "word": "staggered"
    - "word": "to"
    - "word": "learn"
    - "word": "that"
    - "word": "clitoral"
    - "word": "stimulation"
    - "word": "was"
    - "word": "much"
    - "word": "more"
    - "word": "important"
    - "word": "than"
    - "word": "penetration."
```<|im_end|>
<|im_start|>assistant
```
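
As a minimal sketch, a prompt in the layout above could be assembled from a list of words as follows. The helper name `build_prompt` is hypothetical and not part of this repository, and the trailing newline after `<|im_start|>assistant` is an assumption based on common ChatML usage.

```python
FENCE = "`" * 3  # the literal triple-backtick fence used in the prompt


def build_prompt(words):
    """Return a ChatML prompt for a list of word strings (hypothetical helper)."""
    lines = ['"input":', '  "sentence":', '    "segment":']
    for word in words:
        # Escape backslashes and double quotes so each word stays a quoted scalar.
        escaped = word.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    - "word": "{escaped}"')
    yaml_block = "\n".join(lines)
    return (
        "<|im_start|>system\n"
        "Segment:<|im_end|>\n"
        "<|im_start|>user\n"
        f"{FENCE}yaml\n{yaml_block}\n{FENCE}<|im_end|>\n"
        # Assumption: generation starts on a new line after the assistant tag.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_prompt(["Shere,", "in", "your", "report"]))
```

The resulting string can be passed to whichever inference stack is in use, provided it does not apply a second chat template on top.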

The model responds with output in the following format:

````
```yaml
"output":
  "sentence":
    "segment":
    - "word": "Shere,"
    - "word": "in"
    - "word": "your"
    - "word": "report"
    - "word": "on"
    - "word": "female"
    - "word": "sexuality,"
    "segment":
    - "word": "men"
    - "word": "were"
    - "word": "staggered"
    - "word": "to"
    - "word": "learn"
    - "word": "that"
    "segment":
    - "word": "clitoral"
    - "word": "stimulation"
    - "word": "was"
    - "word": "much"
    - "word": "more"
    - "word": "important"
    - "word": "than"
    - "word": "penetration."
```
````
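
The `"segment"` key repeats within a single mapping in this output, so a generic YAML loader may keep only one of the segments or reject the document. One way around that, sketched here under that assumption, is to read the reply line by line. The function name `parse_segments` is hypothetical, and the sketch does not unescape quotes inside words.

```python
import re

# A line that opens a new segment, e.g.   "segment":
SEGMENT_LINE = re.compile(r'^\s*"segment":\s*$')
# A line that carries one word, e.g.   - "word": "report"
WORD_LINE = re.compile(r'^\s*-\s*"word":\s*"(.*)"\s*$')


def parse_segments(reply):
    """Group the words of a model reply into a list of segments (hypothetical helper)."""
    segments = []
    for line in reply.splitlines():
        if SEGMENT_LINE.match(line):
            segments.append([])
            continue
        match = WORD_LINE.match(line)
        # Words that appear before any "segment" key are ignored.
        if match and segments:
            segments[-1].append(match.group(1))
    # Drop segments that ended up with no words.
    return [segment for segment in segments if segment]


if __name__ == "__main__":
    sample = "\n".join([
        '"output":',
        '  "sentence":',
        '    "segment":',
        '    - "word": "Shere,"',
        '    - "word": "in"',
        '    "segment":',
        '    - "word": "men"',
        '    - "word": "were"',
    ])
    for segment in parse_segments(sample):
        print(" ".join(segment))
```

Fence lines and the `"output":` and `"sentence":` headers match neither pattern, so they are skipped without special handling.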

## Misc

This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)