Model Card for ChartPointNet-InstanceSeg
ChartPointNet-InstanceSeg is an instance segmentation model for data points in scientific charts. It uses Mask R-CNN with a Swin Transformer backbone to detect and segment individual data points, with a focus on the dense, small-object layouts common in scientific figures.
Model Details
Model Description
ChartPointNet-InstanceSeg is designed for pixel-precise instance segmentation of data points in scientific charts (e.g., scatter plots). It uses Mask R-CNN with a Swin Transformer backbone, trained on an enhanced COCO-style dataset with instance masks for data points. The model is intended for extracting quantitative data from scientific figures and for downstream chart analysis.
- Developed by: Hansheng Zhu
- Model type: Instance Segmentation
- License: Apache-2.0
- Finetuned from model: openmmlab/mask-rcnn
Model Sources
- Repository: https://github.com/hanszhu/ChartSense
- Paper: https://arxiv.org/abs/2106.01841
Uses
Direct Use
- Instance segmentation of data points in scientific charts
- Automated extraction of quantitative data from figures
- Preprocessing for downstream chart understanding and data mining
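Extracting quantitative data requires one step the model does not perform: mapping pixel positions of segmented points to data values. The following is a minimal sketch of that mapping, assuming linear axes and two known reference ticks per axis. The helper `make_axis_mapper` and all calibration numbers are hypothetical and are not part of the repository.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Build a linear pixel -> data-value mapping from two reference ticks."""
    if pixel_a == pixel_b:
        raise ValueError("reference ticks must be at different pixel positions")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale

# Hypothetical calibration: x tick "0" at pixel column 100, "10" at column 600;
# y tick "0" at pixel row 500, "50" at row 100 (image rows grow downward).
to_x = make_axis_mapper(100, 0.0, 600, 10.0)
to_y = make_axis_mapper(500, 0.0, 100, 50.0)

centroid_px = (350.0, 300.0)  # (column, row) of one segmented data point
data_xy = (to_x(centroid_px[0]), to_y(centroid_px[1]))
print(data_xy)  # (5.0, 25.0) up to floating-point rounding
```

Logarithmic axes would need the values transformed (for example with a base-10 logarithm) before building the mapper.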
Downstream Use
- As a preprocessing step for chart structure parsing or data extraction
- Integration into document parsing, digital library, or accessibility systems
Out-of-Scope Use
- Segmentation of non-data-point elements
- Use on figures outside the supported chart types
- Medical or legal decision making
Bias, Risks, and Limitations
- The model is limited to data point segmentation in scientific charts.
- May not generalize to figures with highly unusual styles or poor image quality.
- Potential dataset bias: Training data is sourced from scientific literature.
Recommendations
Users should verify predictions on out-of-domain data and be aware of the model’s limitations regarding chart style and domain.
How to Get Started with the Model
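from mmdet.apis import inference_detector, init_detector
# Paths to the model config and the trained checkpoint
config_file = 'legend_match_swin/mask_rcnn_swin_datapoint.py'
checkpoint_file = 'chart_datapoint.pth'
# Build the model and load the weights onto the first GPU (pass device='cpu' if no GPU is available)
model = init_detector(config_file, checkpoint_file, device='cuda:0')
# Run inference on a single chart image
result = inference_detector(model, 'example_chart.png')
# result: for Mask R-CNN in MMDetection 2.x, a (bbox_results, segm_results) tuple;
# each element is a per-class list (boxes with scores, and instance masks)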
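In MMDetection 2.x, `inference_detector` on a Mask R-CNN model returns a `(bbox_results, segm_results)` tuple, each a list indexed by class. The sketch below assumes that format and a single data-point class at index 0; it filters detections by score and reduces each mask to a centroid. The helper name `extract_centroids` and the 0.5 threshold are illustrative choices, not part of the repository.

```python
import numpy as np

def extract_centroids(result, score_thr=0.5, class_id=0):
    """Return (x, y, score) per kept instance from an MMDetection 2.x result.

    Assumes `result` is a (bbox_results, segm_results) tuple where
    bbox_results[class_id] is an (N, 5) array [x1, y1, x2, y2, score] and
    segm_results[class_id] is a list of N boolean masks of shape (H, W).
    """
    bbox_results, segm_results = result
    bboxes = np.asarray(bbox_results[class_id]).reshape(-1, 5)
    masks = segm_results[class_id]
    points = []
    for bbox, mask in zip(bboxes, masks):
        score = float(bbox[4])
        if score < score_thr:
            continue  # drop low-confidence detections
        ys, xs = np.nonzero(np.asarray(mask))
        if xs.size == 0:
            continue  # skip empty masks
        points.append((float(xs.mean()), float(ys.mean()), score))
    return points
```

With the model loaded as above, `extract_centroids(result)` would give one pixel-space point per detected marker, which is a convenient starting point for data extraction.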
Training Details
Training Data
- Dataset: Enhanced COCO-style scientific chart dataset with instance masks
- Data point class with pixel-precise segmentation masks
- Images and annotations filtered and preprocessed for optimal Swin Transformer performance
Training Procedure
- Images resized to 1120x672
- Mask R-CNN with Swin Transformer backbone
- Training regime: fp32
- Optimizer: AdamW
- Batch size: 8
- Epochs: 36
- Learning rate: 1e-4
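The hyperparameters above could be expressed in an MMDetection 2.x config roughly as follows. This is a sketch, not the repository's actual config file: `keep_ratio`, the per-GPU batch split, and `workers_per_gpu` are assumptions, and settings the card does not state (weight decay, learning-rate schedule) are omitted.

```python
# Sketch of MMDetection 2.x config fields matching the listed hyperparameters.
# Values marked "assumption" are not stated in this card.
img_scale = (1120, 672)

# Resize step of the training pipeline
train_resize = dict(type='Resize', img_scale=img_scale, keep_ratio=True)  # keep_ratio: assumption

# AdamW at lr 1e-4; weight decay and betas are not stated in the card, so they are omitted
optimizer = dict(type='AdamW', lr=1e-4)
runner = dict(type='EpochBasedRunner', max_epochs=36)

# Total batch size 8; the split across GPUs is an assumption (1 GPU x 8 images)
data = dict(samples_per_gpu=8, workers_per_gpu=2)  # workers_per_gpu: assumption

# fp32 training: no `fp16 = dict(...)` entry is added to the config
```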
Evaluation
Testing Data, Factors & Metrics
- Testing Data: Held-out split from enhanced COCO-style dataset
- Factors: Data point density, image quality
- Metrics: mAP (mean Average Precision), AP50, AP75, per-class AP
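Under the COCO convention, which the metric names suggest, AP50 and AP75 count a prediction as correct when its IoU with a ground-truth instance is at least 0.50 or 0.75, and mAP averages over thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below computes mask IoU on a toy example; it illustrates the matching criterion only and is not the evaluation code used for this model.

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over union of two boolean masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)

# Toy 4x4-pixel marker and a prediction shifted one pixel to the right
gt = np.zeros((8, 8), dtype=bool)
gt[0:4, 0:4] = True
pred = np.zeros((8, 8), dtype=bool)
pred[0:4, 1:5] = True

iou = mask_iou(pred, gt)        # 12 / 20 = 0.6
matches_at_50 = iou >= 0.50     # True
matches_at_75 = iou >= 0.75     # False
```

A one-pixel shift on a 4x4 marker already fails the 0.75 threshold, which is why the stricter metrics are demanding for small data points.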
Results
Category | mAP | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
---|---|---|---|---|---|---|
data-point | 0.485 | 0.687 | 0.581 | 0.487 | 0.05 | nan |
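The mAP_s, mAP_m, and mAP_l columns follow the COCO size buckets, which split instances by pixel area at 32² and 96². A minimal sketch of that bucketing is shown below; the function name is illustrative and the handling of areas exactly on a boundary is simplified. The nan under mAP_l most likely means the test split contains no large instances, which is an inference from COCO-style evaluation conventions rather than something the card states.

```python
def coco_size_bucket(area):
    """Classify an instance by pixel area using the COCO size thresholds."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

# A 10x10-pixel marker covers 100 pixels and falls in the "small" bucket
marker_bucket = coco_size_bucket(10 * 10)
```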
Summary
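On the held-out split, the model reaches 0.485 mAP for the data-point class, with 0.687 at IoU 0.50 and 0.581 at IoU 0.75. Accuracy is concentrated on small objects (mAP_s 0.487), the dominant case for data points in dense charts; medium-sized objects score far lower (mAP_m 0.05), and no value is reported for large objects (mAP_l is nan).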
Environmental Impact
- Hardware Type: NVIDIA V100 GPU
- Hours used: 10
- Cloud Provider: Google Cloud
- Compute Region: us-central1
- Carbon Emitted: ~15 kg CO2eq (estimated)
Technical Specifications
Model Architecture and Objective
- Mask R-CNN with Swin Transformer backbone
- Instance segmentation head for data point class
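In MMDetection 2.x terms, this architecture could be configured roughly as in the fragment below. It is a sketch meant to override a base Mask R-CNN config through `_base_` rather than to stand alone. The Swin-Base dimensions are an assumption based on the base model named at the end of this card, and the FPN settings and `out_indices` are assumptions as well.

```python
# Sketch of an MMDetection 2.x model config: Mask R-CNN with a Swin backbone.
# Swin-Base dimensions are assumed; FPN settings and out_indices are assumptions too.
model = dict(
    type='MaskRCNN',
    backbone=dict(
        type='SwinTransformer',
        embed_dims=128,               # Swin-Base width
        depths=[2, 2, 18, 2],
        num_heads=[4, 8, 16, 32],
        window_size=7,
        out_indices=(0, 1, 2, 3),     # feed all four stages to the neck
    ),
    neck=dict(
        type='FPN',
        in_channels=[128, 256, 512, 1024],  # embed_dims * 2**stage
        out_channels=256,
        num_outs=5,
    ),
    roi_head=dict(
        bbox_head=dict(num_classes=1),  # single data-point class
        mask_head=dict(num_classes=1),
    ),
)
```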
Compute Infrastructure
- Hardware: NVIDIA V100 GPU
- Software: PyTorch 1.13, MMDetection 2.x, Python 3.9
Citation
BibTeX:
@article{DocFigure2021,
title={DocFigure: A Dataset for Scientific Figure Classification},
author={Afzal, S. and others},
journal={arXiv preprint arXiv:2106.01841},
year={2021}
}
APA:
Afzal, S., et al. (2021). DocFigure: A Dataset for Scientific Figure Classification. arXiv preprint arXiv:2106.01841.
Glossary
- Data Point: An individual visual marker representing a value in a scientific chart (e.g., a dot in a scatter plot)
More Information
Model Card Authors
Hansheng Zhu
Model Card Contact
Model tree for hanszhu/ChartPointNet-InstanceSeg
Base model
microsoft/swin-base-patch4-window7-224-in22k