---
license: apache-2.0
datasets:
- cxllin/medinstruct
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: question-answering
tags:
- medical
---
# cxllin/Llama2-7b-med-v1
## Model Details
### Description
The **cxllin/Llama2-7b-med-v1** model is derived from a Llama 2 7B model and is intended for Natural Language Processing tasks in the medical domain.
#### Development Details
- **Developer**: Collin Heenan
- **Model Architecture**: Transformer
- **Base Model**: [Llama-2-7b](https://huggingface.co/NousResearch/Nous-Hermes-llama-2-7b)
- **Primary Language**: English
- **License**: Apache 2.0
### Model Source Links
- **Repository**: Not Specified
- **Paper**: [Jin, Di, et al. "What Disease does this Patient Have?..."](https://github.com/jind11/MedQA)
### Direct Applications
The model is intended for NLP tasks in the medical domain, such as:
- Medical text generation or summarization.
- Question answering related to medical topics.
### Downstream Applications
Potential downstream applications include:
- Healthcare chatbot development.
- Information extraction from medical documentation.
### Out-of-Scope Uses
- Rendering definitive medical diagnoses or advice.
- Use in critical healthcare systems without rigorous validation.
- Use in any high-stakes or legal context without thorough expert validation.
## Bias, Risks, and Limitations
- **Biases**: The model may reproduce biases present in its training data, which can skew its outputs.
- **Risks**: The model may generate inaccurate or misleading medical information.
- **Limitations**: The model may perform poorly on highly specialized or novel medical topics.
### Recommendations for Use
Users are urged to:
- Confirm outputs via expert medical review, especially in professional contexts.
- Employ the model judiciously, adhering to pertinent legal and ethical guidelines.
- Maintain transparency with end-users regarding the model's capabilities and limitations.
## Getting Started with the Model
Details regarding model deployment and interaction remain to be provided.
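In the absence of official instructions, the following is a minimal sketch of how the checkpoint might be loaded with the `transformers` library named in this card's metadata. It rests on several assumptions: that the repository loads as a causal language model through `AutoModelForCausalLM`, that a plain instruction-style prompt is acceptable (the `build_prompt` template is hypothetical, since this card does not document a prompt format), and that `max_new_tokens=128` is a reasonable default.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in a hypothetical instruction-style template.

    Assumption: the card does not specify a prompt format, so this
    layout is illustrative only.
    """
    return f"### Instruction:\n{question.strip()}\n\n### Response:\n"


def generate_answer(
    question: str,
    model_id: str = "cxllin/Llama2-7b-med-v1",
    max_new_tokens: int = 128,
) -> str:
    """Generate a completion for a medical question.

    Assumption: the checkpoint is compatible with AutoModelForCausalLM.
    """
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
# print(generate_answer("What are common symptoms of iron-deficiency anemia?"))
```

Any output produced this way should be treated as unverified text and reviewed as described under "Recommendations for Use".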
### Training Dataset
- **Dataset Source**: [cxllin/medinstruct](https://huggingface.co/datasets/cxllin/medinstruct)
- **Size**: 10.2k rows
- **Scope**: Medical exam-related question-answering data.
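The dataset listed above can likely be inspected with the Hugging Face `datasets` library. The sketch below assumes that a `train` split exists, which this card does not confirm. Because the column names are not documented here, it also includes a small hypothetical helper, `column_summary`, that reports which fields appear in the rows instead of hard-coding any.

```python
def load_medinstruct(split: str = "train"):
    """Load the dataset from the Hugging Face Hub.

    Assumption: a split named "train" exists; the card does not say.
    """
    # Imported lazily so column_summary works without datasets installed.
    from datasets import load_dataset

    return load_dataset("cxllin/medinstruct", split=split)


def column_summary(rows) -> dict:
    """Count how many rows contain each column name.

    Works on any iterable of dict-like rows, including a loaded dataset.
    """
    counts = {}
    for row in rows:
        for key in row:
            counts[key] = counts.get(key, 0) + 1
    return counts


# Example call (downloads the dataset on first use):
# print(column_summary(load_medinstruct()))
```

Checking the actual field names this way is a sensible first step before writing any prompt-formatting or preprocessing code against the data.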
#### Preprocessing Steps
Details regarding data cleaning, tokenization, and special term handling during training are not specified.
---
@article{jin2020disease,
title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},
author={Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter},
journal={arXiv preprint arXiv:2009.13081},
year={2020}
}