mikitona committed
Commit 0efe4b7 · verified · 1 parent: c8de36d

Upload 4 files

Files changed (4):
  1. README.md +44 -0
  2. special_tokens_map.json +24 -0
  3. tokenizer.model +3 -0
  4. tokenizer_config.json +35 -0
README.md ADDED
@@ -0,0 +1,44 @@
---
inference: false
pipeline_tag: image-text-to-text
---

<br>
<br>

# LLaVA Model Card

## Model details

**Model type:**
LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
It is an auto-regressive language model, based on the transformer architecture.

**Model date:**
LLaVA-v1.5-13B was trained in September 2023.

**Paper or resources for more information:**
https://llava-vl.github.io/

## License
Llama 2 is licensed under the LLAMA 2 Community License,
Copyright (c) Meta Platforms, Inc. All Rights Reserved.

**Where to send questions or comments about the model:**
https://github.com/haotian-liu/LLaVA/issues

## Intended use
**Primary intended uses:**
The primary use of LLaVA is research on large multimodal models and chatbots.

**Primary intended users:**
The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

## Training dataset
- 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
- 158K GPT-generated multimodal instruction-following data.
- 450K academic-task-oriented VQA data mixture.
- 40K ShareGPT data.

## Evaluation dataset
A collection of 12 benchmarks, including 5 academic VQA benchmarks and 7 recent benchmarks specifically proposed for instruction-following LMMs.
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<unk>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
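Note that `special_tokens_map.json` mixes two value shapes: most entries are AddedToken-style dicts with a `"content"` field, while `pad_token` is a bare string. A minimal sketch of reading it (plain `json`, no transformers dependency assumed; the file content is reproduced inline):

```python
import json

# special_tokens_map.json content from this commit, reproduced inline.
special_tokens_map = json.loads("""
{
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "pad_token": "<unk>",
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
""")

def token_text(entry):
    # Entries are either a bare string or an AddedToken-style dict
    # whose surface form lives under "content".
    return entry if isinstance(entry, str) else entry["content"]

print(token_text(special_tokens_map["bos_token"]))  # <s>
print(token_text(special_tokens_map["pad_token"]))  # <unk>
```

Handling both shapes matters here because padding reuses the `<unk>` token rather than defining a dedicated pad token.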
tokenizer.model ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
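The `tokenizer.model` blob is stored as a Git LFS pointer: a short `key value` text file per the spec v1 format above, not the SentencePiece model itself. A sketch of extracting its fields (hypothetical `parse_lfs_pointer` helper, pointer text taken from this commit):

```python
# Parse a Git LFS pointer file (spec v1): one "key value" pair per line.
def parse_lfs_pointer(text):
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 499723
print(info["oid"].partition(":")[0])  # sha256
```

The `oid` is the SHA-256 of the real file, so after downloading you can verify the ~500 KB tokenizer model against it.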
tokenizer_config.json ADDED
@@ -0,0 +1,35 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "legacy": false,
  "model_max_length": 2048,
  "pad_token": null,
  "padding_side": "right",
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
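The behavioral effect of `add_bos_token: true` / `add_eos_token: false` is that every encoded sequence gets `<s>` prepended but no trailing `</s>`. A minimal sketch of that wrapping logic (the JSON below is abridged to the scalar fields of the config above; token ids 1 and 2 are the conventional LLaMA bos/eos ids, assumed here, not stated in this commit):

```python
import json

# Abridged tokenizer_config.json from this commit (scalar fields only).
tokenizer_config = json.loads("""
{
  "add_bos_token": true,
  "add_eos_token": false,
  "model_max_length": 2048,
  "padding_side": "right",
  "tokenizer_class": "LlamaTokenizer",
  "legacy": false
}
""")

def wrap_ids(ids, bos_id=1, eos_id=2, cfg=tokenizer_config):
    # Prepend bos / append eos according to the config flags.
    out = ([bos_id] if cfg["add_bos_token"] else []) + list(ids)
    if cfg["add_eos_token"]:
        out.append(eos_id)
    return out

print(wrap_ids([100, 200]))  # [1, 100, 200] -- bos added, no eos
```

Leaving eos off the input is the usual choice for generation-time prompts: the model is expected to produce `</s>` itself, and `model_max_length: 2048` caps the total sequence it will accept.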