abalakrishnaTRI committed on
Commit f7ee94c · verified · 1 parent: 1a055a3

Update README.md

Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -25,4 +25,20 @@ The primary use of PRISMs are for research and development on visually-condition
 
  PRISM models are released under an MIT License. Copyright (c) 2023 Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna and Toyota Research Institute. Toyota did not provide any of the materials used to train these models. They are here for reference and verification and evaluation of the training procedures described in the [paper](https://arxiv.org/abs/2402.07865) and as enabled in the [code](https://github.com/TRI-ML/prismatic-vlms). See the paper and the README in the codebase for more details.
 
- These models are provided as-is. Toyota Research Institute disclaims all warranties, express or implied, including any warranty of merchantability and fitness for a particular purpose.
+ These models are provided as-is. Toyota Research Institute disclaims all warranties, express or implied, including any warranty of merchantability and fitness for a particular purpose.
+
+ ## Training Procedures
+
+ All models are trained as described in the [paper](https://arxiv.org/abs/2402.07865) using the associated [training codebase](https://github.com/TRI-ML/prismatic-vlms). The following datasets are used for training:
+
+ - All LLaVA 1.5 Training Data
+ - LVIS-Instruct-4V
+ - LRV-Instruct
+
+ ## Evaluation Procedures
+
+ Models are evaluated as described in the [paper](https://arxiv.org/abs/2402.07865) using the associated [evaluation codebase](https://github.com/TRI-ML/vlm-evaluation). Evaluation datasets span a number of visual reasoning tasks including:
+
+ - General visual question answering
+ - Bounding box prediction
+ - Challenge sets which evaluate counting, identifying spatial relationships, and propensity to hallucinate