---
license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-72B
tags:
- chat
library_name: transformers
---

<div style="width: 80px; height: 80px; border-radius: 15px;">
    <img
        src="https://shuttleai.com/shuttle.png"
        alt="ShuttleAI Thumbnail"
        style="width: auto; height: auto; margin-left: 0; object-fit: cover; border-radius: 15px;">
</div>

<p align="left">
    💻 <a href="https://shuttleai.com/" target="_blank">Use via API</a>
</p>

## Shuttle-3 (beta) [2024/10/25]

We are excited to introduce Shuttle-3, our next-generation state-of-the-art language model designed to excel in complex chat, multilingual communication, reasoning, and agent tasks.

- **Shuttle-3** is a fine-tuned version of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct), trained to emulate the writing style of Claude 3 models and trained thoroughly on role-playing data.

## Model Details

* **Model Name**: Shuttle-3
* **Developed by**: ShuttleAI Inc.
* **Base Model**: [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
* **Parameters**: 72B
* **Language(s)**: Multilingual
* **Repository**: [https://huggingface.co/shuttleai](https://huggingface.co/shuttleai)
* **Fine-Tuned Model**: [https://huggingface.co/shuttleai/shuttle-3](https://huggingface.co/shuttleai/shuttle-3)

### Key Features

- Pretrained on a large volume of multilingual and code data (inherited from the Qwen2.5 base model)
- Fine-tuned to emulate the prose quality of Claude 3 models and trained extensively on role-play data

## Fine-Tuning Details

- **Training Setup**: Trained on 130 million tokens for 12 hours using 4 A100 PCIe GPUs.

## Prompting

Shuttle-3 uses ChatML as its prompting format:

```
<|im_start|>system
You are a pirate! Yardy harr harr!<|im_end|>
<|im_start|>user
Where are you currently?<|im_end|>
<|im_start|>assistant
Look ahoy ye scallywag! We're on the high seas!<|im_end|>
```
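The chat template bundled with the tokenizer produces this ChatML format automatically. Below is a minimal sketch of local inference with the `transformers` library, using the fine-tuned repository listed above (`shuttleai/shuttle-3`); the generation settings are illustrative, not the recommended defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shuttleai/shuttle-3"

# Load the tokenizer and model. A 72B model requires multiple GPUs
# or quantization; device_map="auto" shards it across available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate! Yardy harr harr!"},
    {"role": "user", "content": "Where are you currently?"},
]

# apply_chat_template renders the ChatML prompt shown above and appends
# an open <|im_start|>assistant turn for the model to complete.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```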