# Data query example

An example of integrating with the HuggingFace Inference API using [langchaingo](https://github.com/tmc/langchaingo).

## Setup

Clone LocalAI and start the API:

```bash
# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI

cd LocalAI/examples/langchain-huggingface

docker-compose up -d
```

Note: Ensure you have set the `HUGGINGFACEHUB_API_TOKEN` environment variable; you can generate a token
on the [Settings / Access Tokens](https://huggingface.co/settings/tokens) page of the HuggingFace site.
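For a one-off shell session, the token can also be exported directly before starting the containers (the value below is a placeholder, not a real token):

```shell
# Placeholder value for illustration; replace with your real hf_... token.
export HUGGINGFACEHUB_API_TOKEN=hf_123456

# Verify the variable is visible to child processes such as docker-compose.
echo "$HUGGINGFACEHUB_API_TOKEN"
```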

This is an example `.env` file for LocalAI:

```ini
MODELS_PATH=/models
CONTEXT_SIZE=512
HUGGINGFACEHUB_API_TOKEN=hf_123456
```

## Using remote models

Now you can use any remote model available via the HuggingFace API. For example, let's enable the
[gpt2](https://huggingface.co/gpt2) model in the `gpt-3.5-turbo.yaml` config:

```yml
name: gpt-3.5-turbo
parameters:
  model: gpt2
  top_k: 80
  temperature: 0.2
  top_p: 0.7
context_size: 1024
backend: "langchain-huggingface"
stopwords:
- "HUMAN:"
- "GPT:"
roles:
  user: " "
  system: " "
template:
  completion: completion
  chat: gpt4all
```

Here you can see that `parameters.model` is set to `gpt2` and `backend` to `langchain-huggingface`.

## How to use

```shell
# Now API is accessible at localhost:8080
curl http://localhost:8080/v1/models
# {"object":"list","data":[{"id":"gpt-3.5-turbo","object":"model"}]}

curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo",
  "prompt": "A long time ago in a galaxy far, far away",
  "temperature": 0.7
}'
```
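
Since this example is built around langchaingo, you can also call the local endpoint from Go with langchaingo's OpenAI-compatible client instead of `curl`. This is a minimal sketch: it assumes a recent langchaingo version (the `llms/openai` package and its `WithBaseURL`/`WithModel`/`WithToken` options), and a server already running on `localhost:8080`.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/tmc/langchaingo/llms"
	"github.com/tmc/langchaingo/llms/openai"
)

func main() {
	// Point the OpenAI-compatible client at the local LocalAI server.
	// LocalAI ignores the token, but the client constructor requires one.
	llm, err := openai.New(
		openai.WithBaseURL("http://localhost:8080/v1"),
		openai.WithModel("gpt-3.5-turbo"),
		openai.WithToken("not-needed"),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Send the same prompt as the curl example above.
	completion, err := llms.GenerateFromSinglePrompt(
		context.Background(), llm, "A long time ago in a galaxy far, far away")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(completion)
}
```

The generated text is produced remotely by the HuggingFace API via the `langchain-huggingface` backend, so output will vary between runs.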