|
--- |
|
license: apache-2.0 |
|
tags: |
|
- roberta |
|
- sequence-classification |
|
- biomedical |
|
- clinical |
|
- transformers |
|
- pytorch |
|
- nlp |
|
- spanish |
|
- spanish-clinical |
|
base_model: roberta-base
---

# roberta-base-biomedical-clinical

A base-size RoBERTa model fine-tuned for sequence classification on biomedical and clinical text.
|
|
|
# Model description |
|
|
This is a RoBERTa model fine-tuned for sequence classification on biomedical and clinical text. Starting from a general-domain pre-trained checkpoint, it was fine-tuned on a specialized biomedical/clinical corpus to capture domain-specific vocabulary and phrasing, making it suitable for document-level and sentence-level classification in these domains.
|
|
|
# Intended Use |
|
|
This model is intended for sequence classification tasks in the biomedical and clinical domain, such as medical document classification, clinical sentence labeling, or sentiment analysis over health-related text. It is well suited to tasks that require understanding medical terminology in context. Note that token-level tasks such as named entity recognition require a token-classification head rather than the sequence-classification head shipped here. A quick way to try the model is through the `text-classification` pipeline, as shown below.
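This is a minimal sketch: the model identifier is the name used in this card and may need to be replaced with your actual Hub repository id or local path, and the returned labels depend on the fine-tuned label set.

```python
from transformers import pipeline

# Assumption: the fine-tuned checkpoint is available under this name,
# either locally or on the Hugging Face Hub.
classifier = pipeline("text-classification", model="roberta-base-biomedical-clinical")

print(classifier("Patient presents with acute dyspnea and bilateral crackles."))
# e.g. [{'label': 'LABEL_1', 'score': 0.97}] — label names depend on the fine-tuning setup
```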
|
|
|
# Model Details |
|
|
This model follows the standard RoBERTa-base architecture: 12 transformer layers with a hidden size of 768, 12 attention heads per layer, and a feed-forward size of 3072 (roughly 125M parameters). It was fine-tuned on a specialized biomedical and clinical corpus to improve performance on domain-specific tasks. Input sequences are limited to 512 tokens; longer texts must be truncated or split.
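These architecture figures can be read directly from the model configuration; the sketch below assumes the checkpoint resolves under the name used elsewhere in this card.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("roberta-base-biomedical-clinical")

print(config.num_hidden_layers)        # 12 transformer layers
print(config.hidden_size)              # 768
print(config.num_attention_heads)      # 12
print(config.intermediate_size)        # 3072 feed-forward size
print(config.max_position_embeddings)  # 514 (RoBERTa reserves two positions; 512 usable tokens)
```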
|
|
|
# Training data |
|
|
The model was fine-tuned on a biomedical/clinical dataset. This corpus includes medical literature, clinical notes, and other biomedical texts that provide a rich source of domain-specific language. |
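The corpus itself is not distributed with this card. For orientation, a fine-tuning run on a comparable labeled clinical dataset could look like the sketch below; the CSV file names, label count, and hyperparameters are illustrative assumptions, not the settings used to produce this checkpoint.

```python
from datasets import load_dataset
from transformers import (
    RobertaForSequenceClassification,
    RobertaTokenizer,
    Trainer,
    TrainingArguments,
)

# Hypothetical labeled dataset with "text" and "label" columns.
dataset = load_dataset(
    "csv", data_files={"train": "clinical_train.csv", "test": "clinical_test.csv"}
)

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-base-biomedical-clinical",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding during batching
)
trainer.train()
```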
|
|
|
# Evaluation results |
|
|
No evaluation results are reported for this checkpoint yet. When evaluating on your own task, report standard classification metrics such as accuracy and macro-averaged F1 on a held-out biomedical/clinical test set so results are comparable across tasks.
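As a concrete illustration, accuracy and macro-F1 can be computed with scikit-learn; `y_true` and `y_pred` below are placeholder values standing in for your gold labels and model predictions.

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder labels; substitute gold labels and model predictions
# from your own biomedical/clinical test set.
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro-F1:", f1_score(y_true, y_pred, average="macro"))
```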
|
|
|
# Usage instructions |
|
|
To use the model, load it with the `transformers` library:
|
|
|
```python |
|
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification
|
|
|
model_name = "roberta-base-biomedical-clinical" |
|
tokenizer = RobertaTokenizer.from_pretrained(model_name) |
|
model = RobertaForSequenceClassification.from_pretrained(model_name) |
|
|
|
inputs = tokenizer("your input text here", return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**inputs)

predicted_class = outputs.logits.argmax(dim=-1).item()
```
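If the label names were saved with the checkpoint, `model.config.id2label` maps the predicted index back to a human-readable label, e.g. `model.config.id2label[predicted_class]`.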