---
license: apache-2.0
datasets:
- nomic-ai/gpt4all-j-prompt-generations
language:
- en
pipeline_tag: text-generation
---

# Model Card for GPT4All-J-v1.0

An Apache-2 licensed chatbot trained over a massive curated corpus of assistant interactions including word problems, multi-turn dialogue, code, poems, songs, and stories.

## Model Details

### Model Description


This model has been finetuned from [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B).

- **Developed by:** [Nomic AI](https://home.nomic.ai)
- **Model Type:** A GPT-J model finetuned on assistant-style interaction data
- **Language(s) (NLP):** English
- **License:** Apache-2
- **Finetuned from model:** [GPT-J](https://huggingface.co/EleutherAI/gpt-j-6B)


We have released several versions of our finetuned GPT-J model using [different dataset versions](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations):

- v1.0: The original model trained on the v1.0 dataset
- v1.1-breezy: Trained on a filtered dataset where we removed all instances of "AI language model"
- v1.2-jazzy: Trained on a filtered dataset where we also removed instances like "I'm sorry, I can't answer..." and "AI language model"

To download a model with a specific revision, run:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-j", revision="v1.2-jazzy")
```

Downloading without specifying `revision` defaults to `main`/`v1.0`.
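
Once downloaded, the model can be used with a standard `transformers` text-generation workflow. The snippet below is a minimal sketch: the prompt template shown is only an assumption for illustration, and the exact instruction format used during finetuning is documented in the GPT4All repository.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nomic-ai/gpt4all-j"
tokenizer = AutoTokenizer.from_pretrained(model_id, revision="v1.2-jazzy")
model = AutoModelForCausalLM.from_pretrained(model_id, revision="v1.2-jazzy")

# Assumed prompt format for illustration only; see the GPT4All repository
# for the template actually used during finetuning.
prompt = "### Prompt:\nWrite a short poem about the ocean.\n### Response:\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,  # decoding settings here are illustrative defaults
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```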

### Model Sources


- **Repository:** [https://github.com/nomic-ai/gpt4all](https://github.com/nomic-ai/gpt4all)
- **Base Model Repository:** [https://github.com/kingoflolz/mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax)
- **Paper:** [GPT4All-J: An Apache-2 Licensed Assistant-Style Chatbot](https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf)
- **Demo:** [https://gpt4all.io/](https://gpt4all.io/)


### Training Procedure 
GPT4All is made possible by our compute partner [Paperspace](https://www.paperspace.com/).

The model was trained on a DGX cluster with 8 A100 80GB GPUs for ~12 hours. Using DeepSpeed + Accelerate, we used a global batch size of 256 with a learning rate of 2e-5. More information can be found in the repo.
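
As a rough illustration of that setup (not the project's actual training script: the per-device batch size and gradient accumulation values below are placeholders chosen so that 8 GPUs x 8 x 4 = 256, and DeepSpeed is assumed to be enabled via `accelerate config`), a minimal Accelerate training step might look like:

```python
from accelerate import Accelerator
from torch.optim import AdamW
from transformers import AutoModelForCausalLM

# Hypothetical sketch, not the GPT4All-J training script.
# DeepSpeed is assumed to be configured through `accelerate config`.
accelerator = Accelerator(gradient_accumulation_steps=4)  # placeholder value

model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")
optimizer = AdamW(model.parameters(), lr=2e-5)  # learning rate from above
model, optimizer = accelerator.prepare(model, optimizer)


def training_step(batch):
    # `batch` is assumed to already hold input_ids, attention_mask, and labels.
    with accelerator.accumulate(model):
        loss = model(**batch).loss
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()
```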


### Results

Results on common sense reasoning benchmarks:

| Model                   | BoolQ    | PIQA     | HellaSwag | WinoGrande | ARC-e    | ARC-c    | OBQA     |
|-------------------------|----------|----------|-----------|------------|----------|----------|----------|
| GPT4All-J 6.7B v1.0     | 73.4     | 74.8     | 63.4      | 64.7       | 54.9     | 36.0     | 40.2     |
| GPT4All-J v1.1-breezy   | 74.0     | 75.1     | 63.2      | 63.6       | 55.4     | 34.9     | 38.4     |
| GPT4All-J v1.2-jazzy    | **74.8** | 74.9     | 63.6      | 63.8       | 56.6     | 35.3     | 41.0     |
| GPT4All-J Lora 6.7B     | 68.6     | 75.8     | 66.2      | 63.5       | 56.4     | 35.7     | 40.2     |
| GPT4All LLaMa Lora 7B   | 73.1     | 77.6     | 72.1      | 67.8       | 51.1     | 40.4     | 40.2     |
| Dolly 6B                | 68.8     | 77.3     | 67.6      | 63.9       | 62.9     | 38.7     | 41.2     |
| Dolly 12B               | 56.7     | 75.4     | 71.0      | 62.2       | **64.6** | 38.5     | 40.4     |
| Alpaca 7B               | 73.9     | 77.2     | 73.9      | 66.1       | 59.8     | 43.3     | 43.4     |
| Alpaca Lora 7B          | 74.3     | **79.3** | **74.0**  | **68.8**   | 56.6     | **43.9** | **42.6** |
| GPT-J 6.7B              | 65.4     | 76.2     | 66.2      | 64.1       | 62.2     | 36.6     | 38.2     |
| LLaMa 7B                | 73.1     | 77.4     | 73.0      | 66.9       | 52.5     | 41.4     | 42.4     |
| Pythia 6.7B             | 63.5     | 76.3     | 64.0      | 61.1       | 61.3     | 35.2     | 37.2     |
| Pythia 12B              | 67.7     | 76.6     | 67.3      | 63.8       | 63.9     | 34.8     | 38.0     |
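
The card does not state which evaluation framework produced these numbers. As one hedged illustration, similar zero-shot scores can be collected with EleutherAI's `lm-evaluation-harness`; the call below assumes its v0.4.x Python API and is not part of the GPT4All release.

```python
# Illustrative only: the evaluation tool and task names here are assumptions,
# not the pipeline used to produce the table above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nomic-ai/gpt4all-j,revision=v1.2-jazzy",
    tasks=[
        "boolq", "piqa", "hellaswag", "winogrande",
        "arc_easy", "arc_challenge", "openbookqa",
    ],
)
print(results["results"])
```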