A great vision-language benchmark: MM-UPD evaluates how models respond to unsolvable problems 🤓 As of now, most VLMs, including GPT-4V and LLaVA-Next-34B, struggle to refuse to answer. Dataset: MM-UPD/MM-UPD. Paper: Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models (2403.20331)
New multimodal dataset by @xai-org @liuhaotian 🤩❤️ xai-org/RealworldQA
With AutoTrain, you can already fine-tune the latest Llama 3 models without writing a single line of code. Here's an example fine-tune of the Llama 3 8B model: abhishek/autotrain-llama3-no-robots