Commit 0efaa38 (verified) · zhaode committed · 1 Parent(s): 19a2057

Upload folder using huggingface_hub

Files changed (6)
  1. README.md +38 -1
  2. config.json +7 -1
  3. llm.mnn +2 -2
  4. llm.mnn.json +0 -0
  5. llm.mnn.weight +1 -1
  6. visual.mnn +2 -2
README.md CHANGED
@@ -9,5 +9,42 @@ tags:
  # Qwen-VL-Chat-MNN

  ## Introduction
+ This model is a 4-bit quantized version of the MNN model exported from [Qwen-VL-Chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary) using [llmexport](https://github.com/alibaba/MNN/tree/master/transformers/llm/export).

- This model is a 4-bit quantized version of the MNN model exported from Qwen-VL-Chat using [llm-export](https://github.com/wangzhaode/llm-export).
+ ## Download
+ ```bash
+ # install huggingface_hub (provides the huggingface-cli command)
+ pip install huggingface_hub
+ ```
+ ```bash
+ # shell download
+ huggingface-cli download taobao-mnn/Qwen-VL-Chat-MNN --local-dir 'path/to/dir'
+ ```
+ ```python
+ # SDK download
+ from huggingface_hub import snapshot_download
+ model_dir = snapshot_download('taobao-mnn/Qwen-VL-Chat-MNN')
+ ```
+
+ ```bash
+ # git clone
+ git clone https://www.modelscope.cn/taobao-mnn/Qwen-VL-Chat-MNN
+ ```
+
+ ## Usage
+ ```bash
+ # clone MNN source
+ git clone https://github.com/alibaba/MNN.git
+
+ # compile
+ cd MNN
+ mkdir build && cd build
+ cmake .. -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true
+ make -j
+
+ # run
+ ./llm_demo /path/to/Qwen-VL-Chat-MNN/config.json prompt.txt
+ ```
+
+ ## Document
+ [MNN-LLM](https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html#)
config.json CHANGED
@@ -4,5 +4,11 @@
  "backend_type": "cpu",
  "thread_num": 4,
  "precision": "low",
- "memory": "low"
+ "memory": "low",
+ "mllm": {
+ "backend_type": "cpu",
+ "thread_num": 4,
+ "precision": "low",
+ "memory": "low"
+ }
  }
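
The config.json change above adds an `mllm` block that repeats the four top-level runtime settings. As a rough illustration only, the Python sketch below derives such a block from an existing config dictionary. The helper name `with_mllm_section` is hypothetical, the sample dictionary contains only the keys visible in this hunk (the real file has more lines above line 4 that are not shown here), and how MNN interprets `mllm` is not stated in this diff.

```python
import json

# Runtime keys that appear both at the top level and inside "mllm" in this diff.
RUNTIME_KEYS = ("backend_type", "thread_num", "precision", "memory")


def with_mllm_section(config: dict) -> dict:
    """Return a copy of config whose "mllm" block mirrors the top-level runtime keys.

    Hypothetical helper for illustration; it is not part of MNN or llmexport.
    """
    updated = dict(config)
    updated["mllm"] = {key: config[key] for key in RUNTIME_KEYS if key in config}
    return updated


# Only the keys shown in the hunk; the real config.json has additional entries.
old_config = {
    "backend_type": "cpu",
    "thread_num": 4,
    "precision": "low",
    "memory": "low",
}

new_config = with_mllm_section(old_config)
print(json.dumps(new_config, indent=4))
```

The helper copies rather than mutates its input, so the original dictionary can still be compared against the result.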
llm.mnn CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9b82d2d344a1c53950074f8b0a1e6f8fa9fd0fe70b99a25c0ff164dee05e9759
- size 1567904
+ oid sha256:f88cc299570943361d8abc41015c4fe89e501254c6b3bbe7315bb6edffa2b984
+ size 2630400
llm.mnn.json CHANGED
The diff for this file is too large to render. See raw diff
 
llm.mnn.weight CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7ce25adc9b09b6fe0c8e7fe3155b409cfb27c7b8d4b658a2f0f65965001e35e9
+ oid sha256:7dfcd413c69dbdeca8b3456016136aa55857d547d71b69be049a08c3ea5c6fc3
  size 3994391386
visual.mnn CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5ff8ad74df0aa7846f364cfdacbe68e01c8b24aa62f4109ba18d285e6ccf67ea
- size 17084864
+ oid sha256:ec0bcf800a7a370d4ab731c6ed2dba1106728fe34374ef198f2be3cbf43df5b5
+ size 17015352
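
The `.mnn` entries above are Git LFS pointer files, so each records only the SHA-256 digest and byte size of the real binary. A minimal Python sketch for checking a downloaded file against such a pointer might look like the following. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path `visual.mnn` is an assumption about where the file was downloaded.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at path has the size and digest the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so multi-gigabyte weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for visual.mnn, copied from the diff above.
pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:ec0bcf800a7a370d4ab731c6ed2dba1106728fe34374ef198f2be3cbf43df5b5
size 17015352
"""
)

# Assumed local path; the check is skipped when the file is not present.
if Path("visual.mnn").exists():
    print("visual.mnn matches pointer:", matches_pointer("visual.mnn", pointer))
```

Comparing the size first is a cheap way to reject a truncated download before hashing the whole file.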