<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="file/style.css" />
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link href="https://fonts.googleapis.com/css2?family=Source+Sans+Pro:wght@400;600;700&display=swap" rel="stylesheet" />
<title>Visual Question Answering (VQA) for Medical Imaging</title>
<style>
* {
box-sizing: border-box;
}
body {
font-family: 'Source Sans Pro', sans-serif;
font-size: 16px;
}
.container {
width: 100%;
margin: 0 auto;
}
.title {
font-size: 24px !important;
font-weight: 600 !important;
letter-spacing: 0em;
text-align: center;
color: #374159 !important;
}
.subtitle {
font-size: 24px !important;
font-style: italic;
font-weight: 400 !important;
letter-spacing: 0em;
text-align: center;
color: #1d652a !important;
padding-bottom: 0.5em;
}
.overview-heading {
font-size: 24px !important;
font-weight: 600 !important;
letter-spacing: 0em;
text-align: left;
}
.overview-content {
font-size: 14px !important;
font-weight: 400 !important;
line-height: 33px !important;
letter-spacing: 0em;
text-align: left;
}
.content-image {
width: 100% !important;
height: auto !important;
}
.vl {
border-left: 5px solid #1d652a;
padding-left: 20px;
color: #1d652a !important;
}
.grid-container {
display: grid;
grid-template-columns: 1fr 2fr;
gap: 20px;
align-items: flex-start;
margin-bottom: 1em;
}
@media screen and (max-width: 768px) {
.container {
width: 90%;
}
.grid-container {
display: block;
}
.overview-heading {
font-size: 18px !important;
}
}
</style>
</head>
<body>
<div class="container">
<h1 class="title">Visual Question Answering (VQA) for Medical Imaging</h1>
<h2 class="subtitle">Kalbe Digital Lab</h2>
<section class="overview">
<div class="grid-container">
<h3 class="overview-heading"><span class="vl">Overview</span></h3>
<div>
<p class="overview-content">
This project addresses the challenge of accurate and efficient medical image analysis in healthcare,
aiming to reduce human error and the workload of radiologists. We develop AI models for
Visual Question Answering (VQA) that help healthcare professionals analyze
radiology images quickly and accurately. To do so, we fine-tune Idefics2-8b, a multimodal model from Hugging Face, on radiology VQA datasets.
</p>
</div>
</div>
<div class="grid-container">
<h3 class="overview-heading"><span class="vl">Dataset</span></h3>
<div>
<p class="overview-content">
We fine-tune the pre-trained model on the following datasets:
</p>
<ul>
<li><a href="https://huggingface.co/datasets/flaviagiammarino/vqa-rad" target="_blank">VQA-RAD dataset</a></li>
<li><a href="https://huggingface.co/datasets/mdwiratathya/SLAKE-vqa-english" target="_blank">SLAKE dataset</a></li>
<li><a href="https://huggingface.co/datasets/mdwiratathya/ROCO-radiology" target="_blank">ROCO dataset</a></li>
</ul>
</div>
</div>
<div class="grid-container">
<h3 class="overview-heading"><span class="vl">Model Architecture</span></h3>
<div>
<p class="overview-content">The model is fine-tuned from Idefics2-8b.</p>
<img class="content-image" src="img/idefics2_architecture.png" alt="model-architecture" />
</div>
</div>
</section>
<h3 class="overview-heading"><span class="vl">Demo</span></h3>
<p class="overview-content">Select or upload an image and enter a text question to see the model's prediction.</p>
</div>
</body>
</html>