PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Abstract
Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, our network is able to learn local features with increasing contextual scales. With further observation that point sets are usually sampled with varying densities, which results in greatly decreased performance for networks trained on uniform densities, we propose novel set learning layers to adaptively combine features from multiple scales. Experiments show that our network called PointNet++ is able to learn deep point set features efficiently and robustly. In particular, results significantly better than state-of-the-art have been obtained on challenging benchmarks of 3D point clouds.
Community
Proposes PointNet++: applies PointNet to a hierarchy of point sets (multi-scale) to learn better structural features in point clouds; partitions the set into overlapping regions (based on a distance metric) and extracts local features. Can handle non-uniform sampling better than volumetric grids and geometric graphs. Each set abstraction level takes an N x (d+C) input (N points, d spatial dimensions, C-dim point features) and outputs N1 x (d+C1): the sampling layer selects N1 centroids using iterative farthest point sampling (thus, the receptive field depends on the point cloud shape); the grouping layer takes the point set and centroid coordinates and produces an N1 x K x (d+C) output (K varies across the N1 centroid neighbourhoods); the PointNet layer applies PointNet to each centroid neighbourhood, giving N1 x (d+C1). Points in a local region are translated to the centroid frame (offset by the centroid coordinates).

Two types of feature learning for robustness to non-uniform density: multi-scale grouping, MSG (apply grouping at multiple scales/region sizes and concatenate; one layer/resolution only) and multi-resolution grouping, MRG (get a higher-level feature by grouping lower-resolution point sets, extract features from multiple resolutions and concatenate); random input dropout for MSG. Adds feature propagation for point cloud segmentation: upsampling by interpolation (inverse-distance-weighted average over the k nearest neighbours), concatenation with skip-linked features from the matching set abstraction level, then a PointNet.

Benchmarked classification on MNIST (pixels as point sets): better than plain PointNet but not Network in Network. ModelNet40 (human-made rigid objects): best result (with normals), compared to MVCNN and PointNet; multi-scale and multi-resolution grouping (with dropout) are better than single-scale grouping. Higher ScanNet point cloud segmentation accuracy (compared to PointNet).
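The sampling and grouping steps summarised above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the function names, the `start` index, and the choice to keep the first `max_k` in-radius points (rather than the nearest ones) are assumptions made for this example.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, start=0):
    """Return indices of n_samples points chosen by iterative farthest point sampling.

    points: (N, d) array of coordinates. `start` is an arbitrary seed index (assumption).
    """
    chosen = np.zeros(n_samples, dtype=np.int64)
    chosen[0] = start
    # min_dist[i] = distance from point i to its nearest already-chosen point
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, n_samples):
        chosen[i] = np.argmax(min_dist)  # farthest from the current chosen set
        d = np.linalg.norm(points - points[chosen[i]], axis=1)
        min_dist = np.minimum(min_dist, d)
    return chosen

def ball_query_group(points, features, centroid_idx, radius, max_k):
    """Group neighbours within `radius` of each centroid.

    Returns a list of (K_i, d + C) arrays, one per centroid, with K_i <= max_k.
    Coordinates are offset by the centroid, i.e. expressed in the centroid frame.
    """
    groups = []
    for c in centroid_idx:
        d = np.linalg.norm(points - points[c], axis=1)
        # Keep the first max_k indices inside the ball (not necessarily the nearest)
        idx = np.flatnonzero(d <= radius)[:max_k]
        local_xyz = points[idx] - points[c]
        groups.append(np.concatenate([local_xyz, features[idx]], axis=1))
    return groups

# Example usage on random data (values are illustrative only)
rng = np.random.default_rng(0)
pts = rng.random((256, 3))    # N = 256 points, d = 3
feats = rng.random((256, 4))  # C = 4 features per point
centroids = farthest_point_sampling(pts, 32)
groups = ball_query_group(pts, feats, centroids, radius=0.3, max_k=16)
```

Each element of `groups` would then be fed to a shared PointNet to produce one C1-dimensional feature per centroid; that learned part is omitted here.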
Also tested on non-Euclidean metric spaces (deformable objects/animal classification in SHREC15): uses geodesic distance as the metric, with wave kernel signature (WKS), heat kernel signature (HKS), and multi-scale Gaussian curvature as point features. Feature visualization of patterns learned (first layers) for ModelNet40 classification. Supplementary material has network architecture, more experiment details, and more experiments (semantic part segmentation, kNN vs. ball query, randomness in farthest point sampling, time and space complexity). From Stanford (Leonidas J. Guibas).
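The inverse-distance-weighted interpolation used for upsampling in segmentation (noted earlier in these notes) can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the function name and the `eps` stabiliser are not from the paper, and `k=3`, `p=2` are used here as defaults.

```python
import numpy as np

def interpolate_features(query_xyz, known_xyz, known_feats, k=3, p=2, eps=1e-8):
    """Propagate features from a sparse point set to a denser one.

    query_xyz:   (M, d) coordinates to interpolate at
    known_xyz:   (N, d) coordinates that carry features
    known_feats: (N, C) features at known_xyz
    Returns an (M, C) array: weighted average over the k nearest known points,
    with weights proportional to 1 / distance**p.
    """
    # Pairwise distances between query and known points, shape (M, N)
    d = np.linalg.norm(query_xyz[:, None, :] - known_xyz[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]         # indices of k nearest, (M, k)
    nn_d = np.take_along_axis(d, nn, axis=1)  # their distances, (M, k)
    w = 1.0 / (nn_d ** p + eps)               # eps avoids division by zero (assumption)
    w /= w.sum(axis=1, keepdims=True)         # normalise weights per query point
    return np.einsum("mk,mkc->mc", w, known_feats[nn])

# Example usage: upsample features from 32 points to 256 points
rng = np.random.default_rng(0)
sparse_xyz = rng.random((32, 3))
sparse_feats = rng.random((32, 8))
dense_xyz = rng.random((256, 3))
dense_feats = interpolate_features(dense_xyz, sparse_xyz, sparse_feats)
```

In the full network the interpolated features would be concatenated with the skip-linked features of the corresponding level before a further PointNet; only the interpolation step is shown.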
Links: website, PapersWithCode, GitHub