## LLaVA-Med-1.5 Performance

<p align="center">
    <img src="https://hanoverprod.z21.web.core.windows.net/med_llava/web/llava-med_1.5_eval.png" width="90%"> <br>
 
  *Performance comparison of multimodal chat instruction-following abilities, measured by the relative score from language-only GPT-4 evaluation.*
</p>


## LLaVA-Med-1.0 Performance

<p align="center">
    <img src="../images/llava_med_chat_example1.png" width="90%"> <br>
 
  *Example 1: comparison of medical visual chat. Language-only GPT-4 is treated as the performance upper bound because the golden captions and inline mentions are fed to it as context, so it does not need to understand the raw image.*
</p>

<p align="center">
    <img src="../images/llava_med_chat_example2.png" width="90%"> <br>
 
  *Example 2: comparison of medical visual chat. LLaVA tends to hallucinate or refuse to provide knowledgeable, domain-specific responses.*
</p>


<p align="center">
    <img src="../images/llava_med_vqa.png" width="90%"> <br>
 
  *Performance comparison of fine-tuned LLaVA-Med on established medical VQA datasets.*
</p>