---
license: mit
language:
- fr
pipeline_tag: token-classification
tags:
- biomedical
- clinical
- life sciences
datasets:
- rntc/nuner-pubmed-e3c-french-umls
library_name: gliner
---

<a href="https://camembert-bio-model.fr/">
<img width="300px" src="https://camembert-bio-model.fr/authors/camembert-bio/camembert-bio-ner-logo.png">
</a>

# CamemBERT-bio-gliner-v0.1: Zero-shot French Biomedical Named Entity Recognition

CamemBERT-bio-gliner is a Named Entity Recognition (NER) model that can identify any French biomedical entity type using a BERT-like encoder. It provides a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which are flexible but too large and costly for resource-constrained scenarios.

[CamemBERT-bio](https://huggingface.co/almanach/camembert-bio-base) is used as the backbone.

This model is based on the fantastic work of [Urchade Zaratiana](https://huggingface.co/urchade) on the [GLiNER](https://github.com/urchade/GLiNER) architecture.

## Important

This is v0.1 of the CamemBERT-bio-gliner model, so there may be a few quirks or unexpected predictions. If you notice anything off or have suggestions for improvement, we'd really appreciate hearing from you!

## Installation

To use this model, install the GLiNER Python library (in a notebook, prefix the command with `!`):

```
pip install gliner
```

## Usage

Once you've installed the GLiNER library, import the `GLiNER` class, load this model with `GLiNER.from_pretrained`, and predict entities with `predict_entities`.

```python
from gliner import GLiNER

# Download the model from the Hugging Face Hub and load it
model = GLiNER.from_pretrained("almanach/camembert-bio-gliner-v0.1")

text = """
Mme A.P. âgée de 52 ans, non tabagique, ayant un diabète de type 2 a été hospitalisée pour une pneumopathie infectieuse. Cette patiente présentait depuis 2 ans des infections respiratoires traités en ambulatoire. L’examen physique a trouvé une fièvre à 38ºc et un foyer de râles crépitants de la base pulmonaire droite.
"""

# Entity types to extract, supplied at inference time
labels = ["Âge", "Patient", "Maladie", "Symptômes"]

# threshold is the minimum score to keep; flat_ner=True keeps only non-overlapping spans
entities = model.predict_entities(text, labels, threshold=0.5, flat_ner=True)

for entity in entities:
    print(entity["text"], "=>", entity["label"])
```

```text
Mme A.P. => Patient
52 ans => Âge
pneumopathie infectieuse => Maladie
infections respiratoires => Maladie
fièvre => Symptômes
râles crépitants => Symptômes
```

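For downstream processing it is often handier to group the predicted spans by label than to print them one by one. The sketch below is one possible way to do that and is not part of the GLiNER library: `group_by_label` is a hypothetical helper, and the sample list simply restates the `text` and `label` values from the output above so that the snippet runs without loading the model.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by label, preserving prediction order.

    Hypothetical helper (not a GLiNER function). It assumes each entity
    is a dict with at least the "text" and "label" keys used above.
    """
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Sample data copied from the example output above; in practice this would
# be the list returned by model.predict_entities(...).
entities = [
    {"text": "Mme A.P.", "label": "Patient"},
    {"text": "52 ans", "label": "Âge"},
    {"text": "pneumopathie infectieuse", "label": "Maladie"},
    {"text": "infections respiratoires", "label": "Maladie"},
    {"text": "fièvre", "label": "Symptômes"},
    {"text": "râles crépitants", "label": "Symptômes"},
]

grouped = group_by_label(entities)
for label, texts in grouped.items():
    print(label, "=>", texts)
```

With real predictions, pass the list returned by `predict_entities` directly to the helper; any extra keys in the entity dicts are ignored.
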
## Links

* Model: https://huggingface.co/almanach/camembert-bio-gliner-v0.1
* Backbone model: https://huggingface.co/almanach/camembert-bio-base
* GLiNER library: https://github.com/urchade/GLiNER
* Developed by: [Rian Touchent](https://rian-t.github.io), [Eric Villemonte de La Clergerie](http://pauillac.inria.fr/~clerger/)
* Logo by: [Alix Chagué](https://alix-tz.github.io/), [Rian Touchent](https://rian-t.github.io)
* License: MIT