---
title: BMI - Biomedical Informatics Lab
emoji: 🐠
colorFrom: blue
colorTo: green
sdk: static
pinned: true
---
<p align="center">
<img src="bmilogo.png" />
</p>
<p align="center">
<img src="bmi_scritta.png" />
</p>
# BMI - Biomedical Informatics Lab "Mario Stefanelli"
## About Us
BMI belongs to the Department of Electrical, Computer, and Biomedical Engineering (Faculty of Engineering) of the **University of Pavia**, Italy.
Established in 1982, it is a leading center for education, research, and innovative IT solutions in healthcare. About 30 people currently work at BMI, focusing their research on:
<div align="center">
<small>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">Biomedical NLP</span>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">Medical Imaging</span>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">Clinical Data Mining</span>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">Biomedical Knowledge Management</span>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">Decision Support Systems</span>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">Telemedicine</span>
<span style="background: rgba(155,155,155,0.7); color: white; padding: 4px 8px; text-align: center; border-radius: 8px;">E-learning</span>
</small>
</div>
## NLP Models
Our research frequently involves Natural Language Processing, including Transformer-based models.
Here we host public weights for our biomedical language models. Several options are available; see the details below.
| Model | Domain | Type | Details |
|------------|---------|-------------------|-------------------------------------------------------------|
| [Igea](https://huggingface.co/bmi-labmedinfo/Igea-7B-v0.1) | Biomedical | CausalLM Pretrain | Small language model further pretrained from [sapienzanlp/Minerva](https://huggingface.co/sapienzanlp/Minerva-1B-base-v1.0) on more than 5 billion words of Italian biomedical text. Available sizes: [350M params](https://huggingface.co/bmi-labmedinfo/Igea-350M-v0.1), [1B params](https://huggingface.co/bmi-labmedinfo/Igea-1B-v0.1), [3B params](https://huggingface.co/bmi-labmedinfo/Igea-3B-v0.1), and [7B params](https://huggingface.co/bmi-labmedinfo/Igea-7B-v0.1). |
| [BioBIT](https://huggingface.co/bmi-labmedinfo/bioBIT) <sup>*</sup>| Biomedical | MaskedLM Pretrain | BERT model further pretrained from [dbmdz/bert-base-italian-xxl-cased](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased) on 28 GB of PubMed abstracts (as in BioBERT) translated from English into Italian using neural machine translation (GNMT). |
| [MedBIT](https://huggingface.co/bmi-labmedinfo/medBIT) <sup>*</sup>| Medical | MaskedLM Pretrain | BERT model further pretrained from [BioBIT](https://huggingface.co/bmi-labmedinfo/bioBIT) on an additional 100 MB of medical textbook data, without any regularization. |
| [MedBIT-R3+](https://huggingface.co/bmi-labmedinfo/medBIT-r3-plus) (recommended) <sup>*</sup>| Medical | MaskedLM Pretrain | BERT model further pretrained from [BioBIT](https://huggingface.co/bmi-labmedinfo/bioBIT) on an additional 200 MB of medical textbook data and web-crawled medical resources in Italian. Regularized with layer-wise learning rate decay (LLRD, 0.95), Mixout (0.9), and warmup (0.02). |
<sup>*</sup> <small>model developed for the [Italian Neuroscience and Rehabilitation Network](https://www.reteneuroscienze.it/en/istituti-nazionali-virtuali/) in partnership with the Neuroinformatics Lab of IRCCS Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy</small>
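As a minimal sketch of how the checkpoints above might be loaded, the snippet below uses the Hugging Face `transformers` Auto classes. The repository IDs are copied from the table; the short aliases, the `CHECKPOINTS` registry, and the `resolve` and `load` helpers are hypothetical names introduced only for illustration and are not part of any BMI package. It assumes `transformers` and a backend such as PyTorch are installed, and that the head type follows the "Type" column.

```python
# Short aliases (hypothetical) -> (repository ID from the table, head type).
CHECKPOINTS = {
    "igea-350m": ("bmi-labmedinfo/Igea-350M-v0.1", "causal"),
    "igea-1b": ("bmi-labmedinfo/Igea-1B-v0.1", "causal"),
    "igea-3b": ("bmi-labmedinfo/Igea-3B-v0.1", "causal"),
    "biobit": ("bmi-labmedinfo/bioBIT", "masked"),
    "medbit": ("bmi-labmedinfo/medBIT", "masked"),
    "medbit-r3-plus": ("bmi-labmedinfo/medBIT-r3-plus", "masked"),
}


def resolve(name):
    """Map a case-insensitive alias to (repo_id, head_type)."""
    key = name.strip().lower()
    if key not in CHECKPOINTS:
        known = ", ".join(sorted(CHECKPOINTS))
        raise KeyError(f"Unknown model alias {name!r}; known aliases: {known}")
    return CHECKPOINTS[key]


def load(name):
    """Download the tokenizer and model for an alias and return both."""
    # Imported lazily so that resolve() works without transformers installed.
    from transformers import (
        AutoModelForCausalLM,
        AutoModelForMaskedLM,
        AutoTokenizer,
    )

    repo_id, head = resolve(name)
    # Igea checkpoints are decoder-only; the BIT family uses a masked-LM head.
    model_cls = AutoModelForCausalLM if head == "causal" else AutoModelForMaskedLM
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = model_cls.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `load("medbit-r3-plus")` would download the recommended masked language model on first use. Keeping the alias lookup separate from the download step lets the mapping be checked without network access.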
## Related Research Papers
* *Buonocore T. M., Rancati S., and Parimbelli E. (2024). Igea: a Decoder-Only Language Model for Biomedical Text Generation in Italian, arXiv. https://arxiv.org/abs/2407.06011*
* *Buonocore T. M., Parimbelli E., Tibollo V., Napolitano C., Priori S., and Bellazzi R. (2023). A Rule-Free Approach for Cardiological Registry Filling from Italian Clinical Notes with Question Answering Transformers, Artificial Intelligence in Medicine: 21st International Conference on Artificial Intelligence in Medicine, AIME 2023. https://doi.org/10.1007/978-3-031-34344-5_19*
* *Crema C., Buonocore T.M., Fostinelli S., Parimbelli E., Verde F., Fundarò C., Manera M., Ramusino M.C., Capelli M., Costa A., Binetti G., Bellazzi R., Redolfi A. (2023). Advancing Italian biomedical information extraction with transformers-based models: Methodological insights and multicenter practical application, Journal of Biomedical Informatics. https://doi.org/10.1016/j.jbi.2023.104557*
* *Buonocore T. M., Crema C., Redolfi A., Bellazzi R., Parimbelli E. (2023). Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models, Journal of Biomedical Informatics. https://doi.org/10.1016/j.jbi.2023.104431*
* *Buonocore T. M., Parimbelli E., Sacchi L., Bellazzi R., Del Campo L., and Quaglini S. (2022). Improving Keyword-Based Topic Classification in Cancer Patient Forums with Multilingual Transformers, Studies in Health Technology and Informatics, 290, 597–601. https://doi.org/10.3233/SHTI220147*