---
license: apache-2.0
language:
- en
metrics:
- accuracy
library_name: transformers
tags:
- misinformation
- fake news
- vlm
- mllm
- llm
---

# Model Card

SNIFFER is a multimodal large language model built for out-of-context misinformation detection and explanation. It applies two-stage instruction tuning to [InstructBLIP](https://huggingface.co/Salesforce/instructblip-vicuna-13b), comprising news-domain alignment and task-specific tuning.

The full model consists of three parts:

1. _Internal checking_, which analyzes the consistency of the image and text content.
2. _External checking_, which analyzes the relevance between the context of the retrieved image and the provided text.
3. _Composed reasoning_, which combines the two analyses to arrive at a final judgment and explanation.

This checkpoint is the one used for the _internal checking_ part.

## Model Sources

- **Paper:** https://arxiv.org/abs/2403.03170 (to appear in CVPR 2024)
- **Project:** https://pengqi.site/Sniffer/
- **Repository:** https://github.com/MischaQI/Sniffer

## Results

Dataset: [NewsCLIPpings](https://github.com/g-luo/news_clippings)