Devarui379 committed (verified)
Commit 3b5ca6f · Parent: 4658c90

Upload 2 files
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+numind.NuExtract-v1.5.Q5_K_M.llamafile filter=lfs diff=lfs merge=lfs -text
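For reference, a tracking rule like the added line is what `git lfs track` writes into `.gitattributes`. The command below is only a sketch of how such a rule is typically produced; it is not part of this commit.

```sh
# Appends "numind.NuExtract-v1.5.Q5_K_M.llamafile filter=lfs diff=lfs merge=lfs -text"
# to .gitattributes so the large file is stored via Git LFS.
git lfs track "numind.NuExtract-v1.5.Q5_K_M.llamafile"
```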
README.md ADDED
@@ -0,0 +1,143 @@
# llamafile

[![ci status](https://github.com/Mozilla-Ocho/llamafile/actions/workflows/ci.yml/badge.svg)](https://github.com/Mozilla-Ocho/llamafile/actions/workflows/ci.yml)

**llamafile lets you distribute and run LLMs with a single file. ([announcement blog post](https://hacks.mozilla.org/2023/11/introducing-llamafile/))**

llamafile aims to make open LLMs much more accessible to both developers and end users. It does this by combining [llama.cpp](https://github.com/ggerganov/llama.cpp) with [Cosmopolitan Libc](https://github.com/jart/cosmopolitan) into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation required.

llamafile is a Mozilla Builders project.
## Quickstart

The easiest way to try it for yourself is to download the example llamafile for the [numind.NuExtract](https://huggingface.co/numind/NuExtract-1.5) model (license: MIT, [OpenAI terms of use](https://openai.com/policies/terms-of-use)). With llamafile, you can run this model locally while consuming comparatively fewer resources and getting better performance on CPU alone.
1. Download [numind.NuExtract-v1.5.Q5_K_M.llamafile](https://huggingface.co/Devarui379/numind.NuExtract-v1.5-Q5_K_M-llamafile/resolve/main/numind.NuExtract-v1.5.Q5_K_M.llamafile?download=true) (2.78 GB).

2. Open your computer's terminal.

3. If you're using macOS, Linux, or BSD, you'll need to grant permission for your computer to execute this new file. (You only need to do this once.)

```sh
chmod +x numind.NuExtract-v1.5.Q5_K_M.llamafile
```

4. If you're on Windows, rename the file by adding ".exe" to the end.

5. Run the llamafile, e.g.:

```sh
./numind.NuExtract-v1.5.Q5_K_M.llamafile
```

6. Your browser should open automatically and display a chat interface. (If it doesn't, just open your browser and point it at http://localhost:8080.) You can also talk to the server from the command line, as sketched just after this list.

7. When you're done chatting, return to your terminal and hit `Control-C` to shut down llamafile.
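While the server is running, you can also hit its OpenAI-compatible completions endpoint directly. The snippet below is a minimal sketch: it assumes the default port 8080 and the standard `/v1/chat/completions` route that llamafile inherits from the llama.cpp server, and the prompt is only an illustration.

```sh
# Ask the running llamafile server a question over its OpenAI-compatible API.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Extract every date from: The review moved from 2024-01-05 to 2024-01-12."}
        ]
      }'
```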
**Having trouble? See the "Gotchas" section of the official [llamafile](https://github.com/Mozilla-Ocho/llamafile) GitHub page.**
## Distribution

One good way to share a llamafile with your friends is by posting it on Hugging Face. If you do that, it's recommended that you mention in your Hugging Face commit message which git revision or released version of llamafile you used when building your llamafile. That way everyone online will be able to verify the provenance of its executable content. If you've made changes to the llama.cpp or cosmopolitan source code, the Apache 2.0 license requires you to explain what changed. One way you can do that is by embedding a notice in your llamafile using `zipalign` that describes the changes, and mentioning it in your Hugging Face commit.
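For example, one way to embed such a notice is to add a small text file into the llamafile's zip archive with `zipalign`. This is only a sketch: the notice filename and its contents are placeholders, and it assumes the `zipalign` program from a llamafile release is on your PATH.

```sh
# Describe how this llamafile was built (contents are illustrative).
echo "Built with an unmodified llamafile release; no changes to llama.cpp or cosmopolitan." > notice.txt

# Add the notice into the llamafile's embedded zip archive (flags follow the upstream zipalign examples).
zipalign -j0 numind.NuExtract-v1.5.Q5_K_M.llamafile notice.txt
```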
## Documentation

There's a manual page for each of the llamafile programs installed when you run `sudo make install`. The command manuals are also typeset as PDF files that you can download from the GitHub releases page. Lastly, most commands will display that information when passing the `--help` flag.
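For instance, assuming you've built and installed llamafile from source with `sudo make install`:

```sh
man llamafile      # manual page installed alongside the binaries
llamafile --help   # the same reference material, straight from the command
```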
## Running llamafile with models downloaded by third-party applications

This section answers the question *"I already have a model downloaded locally by application X, can I use it with llamafile?"*. The general answer is "yes, as long as those models are stored locally in GGUF format", but the implementation can be more or less hacky depending on the application. A few examples (tested on a Mac) follow.

### LM Studio

[LM Studio](https://lmstudio.ai/) stores downloaded models in `~/.cache/lm-studio/models`, in subdirectories with the same name as the models (following Hugging Face's `account_name/model_name` format), with the same filename you saw when you chose to download the file.

So if you have downloaded e.g. the `llama-2-7b.Q2_K.gguf` file for `TheBloke/Llama-2-7B-GGUF`, you can run llamafile as follows:

```
cd ~/.cache/lm-studio/models/TheBloke/Llama-2-7B-GGUF
llamafile -m llama-2-7b.Q2_K.gguf
```
### Ollama

When you download a new model with [ollama](https://ollama.com), all its metadata will be stored in a manifest file under `~/.ollama/models/manifests/registry.ollama.ai/library/`. The directory and manifest file name are the model name as returned by `ollama list`. For instance, for `llama3:latest` the manifest file will be named `.ollama/models/manifests/registry.ollama.ai/library/llama3/latest`.

The manifest maps each file related to the model (e.g. GGUF weights, license, prompt template, etc.) to a sha256 digest. The digest corresponding to the element whose `mediaType` is `application/vnd.ollama.image.model` is the one referring to the model's GGUF file.
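If you want to pull that digest out of the manifest programmatically, something like the following works. It's a sketch: it assumes `jq` is installed and that the manifest uses the OCI-style layout with a top-level `layers` array, which is what current ollama versions write.

```sh
# Print the sha256 digest of the GGUF weights for llama3:latest.
# Blob filenames on disk use '-' instead of ':', hence the tr at the end.
jq -r '.layers[] | select(.mediaType == "application/vnd.ollama.image.model") | .digest' \
  ~/.ollama/models/manifests/registry.ollama.ai/library/llama3/latest | tr ':' '-'
```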
Each sha256 digest is also used as a filename in the `~/.ollama/models/blobs` directory (if you look into that directory you'll see *only* those sha256-* filenames). This means you can directly run llamafile by passing the sha256 digest as the model filename. So if e.g. the `llama3:latest` GGUF file digest is `sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29`, you can run llamafile as follows:

```
cd ~/.ollama/models/blobs
llamafile -m sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
```
## Security

llamafile adds pledge() and SECCOMP sandboxing to llama.cpp. This is enabled by default. It can be turned off by passing the `--unsecure` flag. Sandboxing is currently only supported on Linux and OpenBSD on systems without GPUs; on other platforms it'll simply log a warning.

Our approach to security has these benefits:

1. After it starts up, your HTTP server isn't able to access the filesystem at all. This is good, since it means that if someone discovers a bug in the llama.cpp server, it's much less likely they'll be able to access sensitive information on your machine or make changes to its configuration. On Linux, we're able to sandbox things even further; the only networking-related system call the HTTP server will be allowed to use after starting up is accept(). That further limits an attacker's ability to exfiltrate information in the event that your HTTP server is compromised.

2. The main CLI command won't be able to access the network at all. This is enforced by the operating system kernel. It also won't be able to write to the file system. This keeps your computer safe in the event that a bug is ever discovered in the GGUF file format that lets an attacker craft malicious weights files and post them online. The only exception to this rule is if you pass the `--prompt-cache` flag without also specifying `--prompt-cache-ro`. In that case, security currently needs to be weakened to allow `cpath` and `wpath` access, but network access will remain forbidden (see the sketch just after this list).
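To make that last point concrete, here is a sketch of the two prompt-cache modes. The prompt text and cache filename are placeholders; the flags are the standard llama.cpp options that llamafile passes through.

```sh
# First run: creating the cache file needs write access, so the sandbox is relaxed
# to allow cpath/wpath (network access stays blocked).
./numind.NuExtract-v1.5.Q5_K_M.llamafile -p "Hello" --prompt-cache cache.bin

# Subsequent runs: opening the cache read-only keeps the full filesystem sandbox.
./numind.NuExtract-v1.5.Q5_K_M.llamafile -p "Hello" --prompt-cache cache.bin --prompt-cache-ro
```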
Therefore your llamafile is able to protect itself against the outside world, but that doesn't mean you're protected from llamafile. Sandboxing is self-imposed. If you obtained your llamafile from an untrusted source, then its author could have simply modified it to not do that. In that case, you can run the untrusted llamafile inside another sandbox, such as a virtual machine, to make sure it behaves how you expect.
## Licensing

While the llamafile project is Apache 2.0-licensed, the changes to llama.cpp are licensed under MIT (just like the llama.cpp project itself) so as to remain compatible and upstreamable in the future, should that be desired.

[![Star History Chart](https://api.star-history.com/svg?repos=Mozilla-Ocho/llamafile&type=Date)](https://star-history.com/#Mozilla-Ocho/llamafile&Date)
numind.NuExtract-v1.5.Q5_K_M.llamafile ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e60f226e3832221302892fd2b64ec9045c4c2b758a85a10bede9dd42fc3a116
+size 2985572436