metadata
title: MedKGC
emoji: 🐠
colorFrom: red
colorTo: red
sdk: streamlit
sdk_version: 1.39.0
app_file: app.py
pinned: false
Medical Knowledge Graph Construction (medKGC)
Overview
An automated annotation tool that uses LLMs to help medical annotators annotate input radiology reports.
The tool covers named entity recognition, relation extraction, and named entity normalization. The final results are output as a knowledge graph.
medKGC is a medical text knowledge graph construction and review system. It supports entity recognition, relation extraction, and visualization of medical reports, providing a convenient review interface.
Deployment
Installation
- Create conda environment
conda create -n medkgc python=3.10
conda activate medkgc
- Install dependencies
pip install -r requirements.txt
- Run application
streamlit run app.py
Core Features
1. Data Processing
- Position Conversion: Support word-level and char-level position conversion
- Entity Conversion: Convert between JSON format and Selection objects
- Relation Extraction: Entity ID-based relation mapping and reconstruction
2. Entity Annotation
- Label Types:
- OBS-DP: Observation definitely present (Red)
- ANAT-DP: Anatomy definitely present (Cyan)
- OBS-U: Observation uncertain (Yellow)
- OBS-DA: Observation definitely absent (Gray)
- Interactive Annotation: Support entity selection and annotation
3. Relation Visualization
- Node Merging: Automatically merge entities with the same text
- Color Coding: Different colors for different entity types
- Dynamic Updates: Support real-time graph updates
4. Review Process
- Report Selection: Display pending and reviewed reports separately
- Status Saving: Automatically save review status and modifications
- Batch Processing: Support continuous review of multiple reports
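The node merging and color coding features above can be sketched as follows. This is a minimal illustration, not the app's actual code: it assumes entity records shaped like the Entity Data structure in this README, merges nodes on exact text equality, and keeps the first label seen for a merged node. The function name `build_graph`, the fallback color, and the relation name `located_at` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Colors taken from the label list above.
LABEL_COLORS: Dict[str, str] = {
    "OBS-DP": "red",
    "ANAT-DP": "cyan",
    "OBS-U": "yellow",
    "OBS-DA": "gray",
}
FALLBACK_COLOR = "white"  # assumption: color for labels not listed above


def build_graph(entities_data: dict) -> Tuple[List[dict], List[Tuple[str, str, str]]]:
    """Merge entities with identical text into one node and collect edges."""
    entities = entities_data["entities"]

    # One node per distinct entity text; the first label seen wins (assumption).
    nodes: Dict[str, dict] = {}
    for ent in entities.values():
        text = ent["tokens"]
        if text not in nodes:
            nodes[text] = {
                "id": text,
                "label": ent["label"],
                "color": LABEL_COLORS.get(ent["label"], FALLBACK_COLOR),
            }

    # Edges are keyed by entity text, so duplicates collapse after merging.
    edges: Set[Tuple[str, str, str]] = set()
    for ent in entities.values():
        for rel_type, target_id in ent["relations"]:
            target = entities.get(target_id)
            if target is not None:  # skip relations pointing at unknown IDs
                edges.add((ent["tokens"], rel_type, target["tokens"]))

    return list(nodes.values()), sorted(edges)


example = {
    "entities": {
        "1": {"tokens": "effusion", "label": "OBS-DP", "start_ix": 2, "end_ix": 2,
              "relations": [["located_at", "2"]]},
        "2": {"tokens": "pleural", "label": "ANAT-DP", "start_ix": 1, "end_ix": 1,
              "relations": []},
        "3": {"tokens": "effusion", "label": "OBS-DP", "start_ix": 7, "end_ix": 7,
              "relations": [["located_at", "2"]]},
    }
}
nodes, edges = build_graph(example)
```

Here the two "effusion" entities collapse into a single red node, and their identical relations collapse into a single edge. The resulting node and edge lists could then be handed to whichever graph component the app uses.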
Technical Implementation
Data Structures
- Entity Data
{
  "entities": {
    "1": {
      "tokens": "entity text",
      "label": "entity type",
      "start_ix": "word-level start position",
      "end_ix": "word-level end position",
      "relations": [["relation type", "target entity ID"]]
    }
  }
}
Indices start from 0.
- Selection Object
from dataclasses import dataclass
from typing import List

@dataclass
class Selection:
    start: int  # char-level start position
    end: int  # char-level end position
    text: str  # entity text
    labels: List[str]  # entity type list
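To show how the two structures relate, here is a rough sketch of converting a `Selection` back into an entity record. It rests on several assumptions that this README does not confirm: words are whitespace-separated, `end` is exclusive (Python slice style), `end_ix` is inclusive, and the first entry of `labels` becomes the record's `label`. The helper names `char_to_word_span` and `selection_to_entity` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Selection:
    start: int  # char-level start position
    end: int  # char-level end position (assumed exclusive)
    text: str  # entity text
    labels: List[str]  # entity type list


def char_to_word_span(text: str, start: int, end: int) -> Tuple[int, int]:
    """Return 0-based (start_ix, end_ix) of the words overlapping [start, end)."""
    hits = [
        i
        for i, m in enumerate(re.finditer(r"\S+", text))
        if m.end() > start and m.start() < end
    ]
    if not hits:
        raise ValueError("selection does not cover any word")
    return hits[0], hits[-1]


def selection_to_entity(text: str, sel: Selection) -> dict:
    """Build an entity record in the JSON shape shown above (relations left empty)."""
    start_ix, end_ix = char_to_word_span(text, sel.start, sel.end)
    return {
        "tokens": sel.text,
        "label": sel.labels[0],  # assumption: one label per entity
        "start_ix": start_ix,
        "end_ix": end_ix,
        "relations": [],
    }


report = "Small pleural effusion ."
sel = Selection(start=6, end=22, text="pleural effusion", labels=["OBS-DP"])
entity = selection_to_entity(report, sel)
```

With this input the selection covers the second and third words, so the record gets `start_ix` 1 and `end_ix` 2.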
Core Algorithms
- Position Conversion
def word_to_char_span(text, start_ix, end_ix):
    """Convert word-level position to character-level range"""
- Relation Reconstruction
def find_relations_with_entities(entities, entities_data):
    """Rebuild relations based on entity text matching"""
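The function bodies are not shown here, so the following is only one possible reading of the two signatures, not the project's implementation. It assumes whitespace tokenization, an inclusive `end_ix`, an exclusive character end, that `entities` is a list of `Selection` objects, and that rebuilt entities are numbered "1", "2", ... in list order. The return shape (new ID mapped to a relation list) and the relation names in the example are likewise assumptions.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Selection:
    start: int
    end: int
    text: str
    labels: List[str]


def word_to_char_span(text: str, start_ix: int, end_ix: int) -> Tuple[int, int]:
    """Convert an inclusive word range to a (start, end) character range."""
    spans = [m.span() for m in re.finditer(r"\S+", text)]
    if not 0 <= start_ix <= end_ix < len(spans):
        raise IndexError("word indices out of range")
    return spans[start_ix][0], spans[end_ix][1]


def find_relations_with_entities(
    entities: List[Selection], entities_data: dict
) -> Dict[str, List[List[str]]]:
    """Copy relations from the original data onto entities with matching text."""
    original = entities_data["entities"]

    # New IDs follow list order; the first entity with a given text owns it.
    text_to_new_id: Dict[str, str] = {}
    for i, sel in enumerate(entities, start=1):
        text_to_new_id.setdefault(sel.text, str(i))

    rebuilt: Dict[str, List[List[str]]] = {}
    for i, sel in enumerate(entities, start=1):
        relations: List[List[str]] = []
        for ent in original.values():
            if ent["tokens"] != sel.text:
                continue
            for rel_type, target_id in ent["relations"]:
                target = original.get(target_id)
                if target is None:
                    continue
                new_target = text_to_new_id.get(target["tokens"])
                # Keep the relation only if its target survived the review.
                if new_target is not None and [rel_type, new_target] not in relations:
                    relations.append([rel_type, new_target])
        rebuilt[str(i)] = relations
    return rebuilt


report = "Small pleural effusion ."
start, end = word_to_char_span(report, 1, 2)

original_data = {
    "entities": {
        "1": {"tokens": "effusion", "label": "OBS-DP", "start_ix": 2, "end_ix": 2,
              "relations": [["located_at", "2"]]},
        "2": {"tokens": "pleural", "label": "ANAT-DP", "start_ix": 1, "end_ix": 1,
              "relations": []},
        "3": {"tokens": "Small", "label": "OBS-DP", "start_ix": 0, "end_ix": 0,
              "relations": [["modify", "1"]]},
    }
}
# The reviewer kept two entities and dropped "Small".
kept = [
    Selection(6, 13, "pleural", ["ANAT-DP"]),
    Selection(14, 22, "effusion", ["OBS-DP"]),
]
rebuilt = find_relations_with_entities(kept, original_data)
```

In this example the relation from "effusion" to "pleural" is remapped to the new IDs, while the relation that started at the dropped entity disappears. Matching on text alone is ambiguous when the same text occurs more than once with different relations, so a position-aware match may be preferable in practice.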
TODO
- Add data export functionality
- Named Entity Recognition
- Add an input box
- Call LLMs
- Relation Extraction
- Add relation editing functionality
- Where the data lives
- Read from somewhere, e.g. on Git
- Save to somewhere; saving is a bit awkward (via commit?)
Contributing
Contributions are welcome. You can:
- Submit Issues for bug reports or suggestions
- Submit Pull Requests to improve code
- Improve documentation and comments
License
MIT License