hanszhu committed on
Commit
4061dc6
·
verified ·
1 Parent(s): 6820beb

Update README.md

  1. README.md +161 -3
README.md CHANGED
@@ -1,3 +1,161 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model:
6
+ - openmmlab/mask-rcnn
7
+ - microsoft/swin-base-patch4-window7-224-in22k
8
+ pipeline_tag: image-segmentation
9
+ ---
10
+
11
+ # Model Card for ChartPointNet-InstanceSeg
12
+
13
+ ChartPointNet-InstanceSeg is a high-precision data point instance segmentation model for scientific charts. It uses Mask R-CNN with a Swin Transformer backbone to detect and segment individual data points, especially in dense and small-object scenarios common in scientific figures.
14
+
15
+ ## Model Details
16
+
17
+ ### Model Description
18
+
19
+ ChartPointNet-InstanceSeg is designed for pixel-precise instance segmentation of data points in scientific charts (e.g., scatter plots). It leverages Mask R-CNN with a Swin Transformer backbone, trained on enhanced COCO-style datasets with instance masks for data points. The model is ideal for extracting quantitative data from scientific figures and for downstream chart analysis.
20
+
21
+ - **Developed by:** Hansheng Zhu
22
+ - **Model type:** Instance Segmentation
23
+ - **License:** Apache-2.0
24
+ - **Finetuned from model:** openmmlab/mask-rcnn
25
+
26
+ ### Model Sources
27
+
28
+ - **Repository:** [https://github.com/hanszhu/ChartSense](https://github.com/hanszhu/ChartSense)
29
+ - **Paper:** https://arxiv.org/abs/2106.01841
30
+
31
+ ## Uses
32
+
33
+ ### Direct Use
34
+
35
+ - Instance segmentation of data points in scientific charts
36
+ - Automated extraction of quantitative data from figures
37
+ - Preprocessing for downstream chart understanding and data mining
38
+
39
+ ### Downstream Use
40
+
41
+ - As a preprocessing step for chart structure parsing or data extraction
42
+ - Integration into document parsing, digital library, or accessibility systems
43
+
44
+ ### Out-of-Scope Use
45
+
46
+ - Segmentation of non-data-point elements
47
+ - Use on figures outside the supported chart types
48
+ - Medical or legal decision making
49
+
50
+ ## Bias, Risks, and Limitations
51
+
52
+ - The model is limited to data point segmentation in scientific charts.
53
+ - May not generalize to figures with highly unusual styles or poor image quality.
54
+ - Potential dataset bias: Training data is sourced from scientific literature.
55
+
56
+ ### Recommendations
57
+
58
+ Users should verify predictions on out-of-domain data and be aware of the model’s limitations regarding chart style and domain.
59
+
60
+ ## How to Get Started with the Model
61
+
62
+ ```python
+ from mmdet.apis import inference_detector, init_detector
+
+ # Paths to the MMDetection config and the trained checkpoint
+ config_file = 'legend_match_swin/mask_rcnn_swin_datapoint.py'
+ checkpoint_file = 'chart_datapoint.pth'
+
+ # Build the model and load the weights onto the first GPU
+ model = init_detector(config_file, checkpoint_file, device='cuda:0')
+
+ # In MMDetection 2.x, Mask R-CNN returns a (bbox_results, segm_results) tuple;
+ # each element is a list indexed by class
+ result = inference_detector(model, 'example_chart.png')
+ ```
73
+
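For data extraction, the masks usually need to be reduced to one coordinate per data point. The following is a minimal sketch of that step, assuming the MMDetection 2.x result layout (a `(bbox_results, segm_results)` tuple with per-class lists, where each box row ends in a confidence score). The helper name `datapoint_centroids`, the `score_thr=0.5` threshold, and `class_id=0` are illustrative assumptions, not part of the released code, and the synthetic `fake_result` stands in for real model output.

```python
import numpy as np


def datapoint_centroids(result, score_thr=0.5, class_id=0):
    """Return (x, y, score) in pixel coordinates for each confident mask.

    Assumes the MMDetection 2.x layout: result = (bbox_results, segm_results),
    where bbox_results[class_id] has rows [x1, y1, x2, y2, score] and
    segm_results[class_id] is a list of binary (H, W) masks in the same order.
    """
    bbox_results, segm_results = result
    boxes = bbox_results[class_id]
    masks = segm_results[class_id]
    centroids = []
    for box, mask in zip(boxes, masks):
        score = float(box[4])
        if score < score_thr:
            continue  # drop low-confidence detections (threshold is an assumption)
        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            continue  # skip empty masks
        centroids.append((float(xs.mean()), float(ys.mean()), score))
    return centroids


# Synthetic stand-in for model output: one 3x3 mask with confidence 0.9
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 6:9] = True
fake_result = ([np.array([[6, 2, 9, 5, 0.9]])], [[mask]])
print(datapoint_centroids(fake_result))
```

The centroids are in image pixels; mapping them to data values still requires the axis calibration of the chart, which this sketch does not cover.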
74
+ ## Training Details
75
+
76
+ ### Training Data
77
+
78
+ - **Dataset:** Enhanced COCO-style scientific chart dataset with instance masks
79
+ - Data point class with pixel-precise segmentation masks
80
+ - Images and annotations filtered and preprocessed for optimal Swin Transformer performance
81
+
82
+ ### Training Procedure
83
+
84
+ - Images resized to 1120x672
85
+ - Mask R-CNN with Swin Transformer backbone
86
+ - **Training regime:** fp32
87
+ - **Optimizer:** AdamW
88
+ - **Batch size:** 8
89
+ - **Epochs:** 36
90
+ - **Learning rate:** 1e-4
91
+
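The hyperparameters above could map onto an MMDetection 2.x config roughly as follows. This is a hypothetical fragment, not the released config: only the values named in this card (AdamW, learning rate 1e-4, 36 epochs, batch size 8, 1120x672 input) come from the source, while `keep_ratio=True` and the single-GPU reading of the batch size are assumptions.

```python
# Hypothetical MMDetection 2.x config fragment reconstructed from the card.
# Weight decay and the learning-rate schedule are not stated, so they are omitted.
optimizer = dict(type='AdamW', lr=1e-4)

# 36 training epochs
runner = dict(type='EpochBasedRunner', max_epochs=36)

# Assumes a single GPU, so samples_per_gpu equals the total batch size of 8
data = dict(samples_per_gpu=8)

# Resize step for the training pipeline; keep_ratio=True is an assumption
resize_step = dict(type='Resize', img_scale=(1120, 672), keep_ratio=True)

# fp32 training: no `fp16` entry is added to the config
```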
92
+ ## Evaluation
93
+
94
+ ### Testing Data, Factors & Metrics
95
+
96
+ - **Testing Data:** Held-out split from enhanced COCO-style dataset
97
+ - **Factors:** Data point density, image quality
98
+ - **Metrics:** mAP (mean Average Precision), AP50, AP75, per-class AP
99
+
100
+ ### Results
101
+
102
+ | Category   | mAP   | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
+ |------------|-------|--------|--------|-------|-------|-------|
+ | data-point | 0.485 | 0.687  | 0.581  | 0.487 | 0.05  | nan   |
105
+
106
+ #### Summary
107
+
108
+ The model reaches 0.485 mAP on the data-point class (0.687 mAP_50, 0.581 mAP_75). Accuracy is concentrated on small objects (mAP_s 0.487) and drops to 0.05 on medium objects; no value is reported for large objects (nan).
109
+
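The gap between mAP_50 and mAP_75 reflects how strictly a predicted mask must overlap the ground truth. Assuming COCO-style evaluation, as the column names suggest, a prediction counts as a match at a given threshold only if its mask IoU reaches that threshold. The sketch below computes mask IoU for two synthetic masks; `mask_iou` is an illustrative helper, not a function from the evaluation toolkit.

```python
import numpy as np


def mask_iou(pred, gt):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(pred, gt).sum() / union)


# A 4x4 ground-truth marker and a prediction shifted right by one pixel
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 3:7] = True

iou = mask_iou(pred, gt)  # intersection 12 px, union 20 px
print(iou, iou >= 0.50, iou >= 0.75)
```

For a marker this small, a one-pixel shift already lowers the IoU to 0.6, which passes the 0.50 threshold but fails the 0.75 threshold.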
110
+ ## Environmental Impact
111
+
112
+ - **Hardware Type:** NVIDIA V100 GPU
113
+ - **Hours used:** 10
114
+ - **Cloud Provider:** Google Cloud
115
+ - **Compute Region:** us-central1
116
+ - **Carbon Emitted:** ~15 kg CO2eq (estimated)
117
+
118
+ ## Technical Specifications
119
+
120
+ ### Model Architecture and Objective
121
+
122
+ - Mask R-CNN with Swin Transformer backbone
123
+ - Instance segmentation head for data point class
124
+
125
+ ### Compute Infrastructure
126
+
127
+ - **Hardware:** NVIDIA V100 GPU
128
+ - **Software:** PyTorch 1.13, MMDetection 2.x, Python 3.9
129
+
130
+ ## Citation
131
+
132
+ **BibTeX:**
133
+
134
+ ```bibtex
+ @article{DocFigure2021,
+   title={DocFigure: A Dataset for Scientific Figure Classification},
+   author={Afzal, S. and others},
+   journal={arXiv preprint arXiv:2106.01841},
+   year={2021}
+ }
+ ```
142
+
143
+ **APA:**
144
+
145
+ Afzal, S., et al. (2021). DocFigure: A Dataset for Scientific Figure Classification. arXiv preprint arXiv:2106.01841.
146
+
147
+ ## Glossary
148
+
149
+ - **Data Point:** An individual visual marker representing a value in a scientific chart (e.g., a dot in a scatter plot)
150
+
151
+ ## More Information
152
+
153
+ - [DocFigure Paper](https://arxiv.org/abs/2106.01841)
154
+
155
+ ## Model Card Authors
156
+
157
+ Hansheng Zhu
158
+
159
+ ## Model Card Contact
160
+
161