A collection of papers, libraries, and datasets I have recently read, for anyone interested in 3D detection, shape representation, shape completion, shape reconstruction, 3D scene understanding, and 3D scene reconstruction.
Statistics: 🔥 code is available & stars >= 100 | ⭐ citation >= 50
- [AAAI2020] Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud
- [AAAI2020] ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection
- [Arxiv] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
- [Arxiv] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
- [Arxiv] SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
- [Arxiv] 3DSSD: Point-based 3D Single Stage Object Detector
- [Arxiv] Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation
- [Arxiv] ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
- [Arxiv] A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators
- [Arxiv] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
- [Arxiv] Objects as Points [github] ⭐🔥
- [Arxiv] RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving [github]
- [Arxiv] DSGN: Deep Stereo Geometry Network for 3D Object Detection [github]
- [Arxiv] Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation
- [Arxiv] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
- [Arxiv] Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
- [Arxiv] SESS: Self-Ensembling Semi-Supervised 3D Object Detection
- [NeurIPS2019] PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points
- [NeurIPS2019] Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds
- [ICCV2019] Deep Hough Voting for 3D Object Detection in Point Clouds
- [AAAI2020] JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds
- [ICCV2019] M3D-RPN: Monocular 3D Region Proposal Network for Object Detection [pytorch]
- [ICCV2019] 3D Instance Segmentation via Multi-Task Metric Learning
- [Arxiv] Single-Stage Monocular 3D Object Detection with Virtual Cameras
- [Arxiv] Depth Completion via Deep Basis Fitting
- [Arxiv] Relation Graph Network for 3D Object Detection in Point Clouds
- [CVPR2019] 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans [pytorch] 🔥
- [ICCV2019] Rescan: Inductive Instance Segmentation for Indoor RGBD Scans [C++]
- [ICCV2019] Transferable Semi-Supervised 3D Object Detection From RGB-D Data
- [ICCV2019] STD: Sparse-to-Dense 3D Object Detector for Point Cloud
- [CVPR2019] PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud [pytorch]
- [Arxiv] Fast Point R-CNN
- [Arxiv] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection [pytorch] 🔥
- [Arxiv] Implicit Geometric Regularization for Learning Shapes
- [Arxiv] Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
- [Arxiv] Adversarial Generation of Continuous Implicit Shape Representations [pytorch]
- [Arxiv] A Novel Tree-structured Point Cloud Dataset For Skeletonization Algorithm Evaluation [dataset]
- [CVPRW2019] SkelNetOn 2019: Dataset and Challenge on Deep Learning for Geometric Shape Understanding [project]
- [Arxiv] Skeleton Extraction from 3D Point Clouds by Decomposing the Object into Parts
- [Arxiv] InSphereNet: a Concise Representation and Classification Method for 3D Object
- [Arxiv] Deep Structured Implicit Functions
- [CVIU] 3D articulated skeleton extraction using a single consumer-grade depth camera
- [ICLR2019] Point Cloud GAN [tensorflow]
- [ICCV2019] Learning Shape Templates with Structured Implicit Functions
- [ICCV2019] 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions [pytorch]
- [ICCV2019] Implicit Surface Representations as Layers in Neural Networks
- [CVPR2019] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation [pytorch] 🔥⭐
- [SIGGRAPH2019] StructureNet: Hierarchical Graph Networks for 3D Shape Generation [pytorch]
- [SIGGRAPH Asia2019] LOGAN: Unpaired Shape Transform in Latent Overcomplete Space [tensorflow]
- [TOG] Voxel Cores: Efficient, robust, and provably good approximation of 3D medial axes
- [SIGGRAPH2018] P2P-NET: Bidirectional Point Displacement Net for Shape Transform [tensorflow]
- [ICML2018] Learning Representations and Generative Models for 3D Point Clouds [tensorflow] 🔥⭐
- [NeurIPS2018] Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning [tensorflow] [project page] ⭐🔥
- [AAAI2018] Unsupervised Articulated Skeleton Extraction from Point Set Sequences Captured by a Single Depth Camera
- [3DV2018] Parsing Geometry Using Structure-Aware Shape Templates
- [SIGGRAPH2017] GRASS: Generative Recursive Autoencoders for Shape Structures [pytorch] 🔥
- [TOG] Erosion Thickness on Medial Axes of 3D Shapes
- [Vis Comput] Distance field guided L1-median skeleton extraction
- [CGF] Contracting Medial Surfaces Isotropically for Fast Extraction of Centred Curve Skeletons
- [CGF] Improved Use of LOP for Curve Skeleton Extraction
- [SIGGRAPH Asia2015] Deep Points Consolidation [C++ & Qt]
- [SIGGRAPH2015] Burning The Medial Axis
- [SIGGRAPH2009] Curve Skeleton Extraction from Incomplete Point Cloud [matlab] ⭐
- [TOG] SDM-NET: deep generative network for structured deformable mesh
- [TOG] Robust and Accurate Skeletal Rigging from Mesh Sequences 🔥
- [TOG] L1-medial skeleton of point cloud [C++] 🔥
- [EUROGRAPHICS2016] 3D Skeletons: A State-of-the-Art Report 🔥
- [SGP2012] Mean Curvature Skeletons [C++] 🔥
- [SMIC2010] Point Cloud Skeletons via Laplacian-Based Contraction [Matlab] 🔥
- [Arxiv] Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
- [Arxiv] PF-Net: Point Fractal Network for 3D Point Cloud Completion
- [Arxiv] 3D Gated Recurrent Fusion for Semantic Scene Completion
- [ICCVW2019] EdgeConnect: Structure Guided Image Inpainting using Edge Prediction [pytorch] 🔥⭐
- [ICRA2020] Depth Based Semantic Scene Completion with Position Importance Aware Loss
- [Arxiv] SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
- [Arxiv] PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
- [Arxiv] Unpaired Point Cloud Completion on Real Scans using Adversarial Training [tensorflow]
- [AAAI2020] Morphing and Sampling Network for Dense Point Cloud Completion [pytorch]
- [ICCV2019] ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image [tensorflow]
- [ICCV2019] Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion [Caffe3D]
- [ICCV2019] Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds from Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction
- [Arxiv] EdgeNet: Semantic Scene Completion from RGB-D images
- [CVPR2019] TopNet: Structural Point Cloud Decoder [pytorch & tensorflow]
- [CVPR2019] Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image
- [CVPR2019] Leveraging Shape Completion for 3D Siamese Tracking [pytorch]
- [CVPR2019] RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion [pytorch]
- [3DV2018] PCN: Point Completion Network [tensorflow] 🔥
- [ECCV2018] Efficient Semantic Scene Completion Network with Spatial Group Convolution [pytorch]
- [CVPR2018] ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans [tensorflow] 🔥⭐
- [CVPR2018] Learning 3D Shape Completion from Laser Scan Data with Weak Supervision [torch][torch]
- [IJCV2018] Learning 3D Shape Completion under Weak Supervision [torch][torch]
- [ICCV2017] High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference ⭐
- [ICCV2017] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis [torch] 🔥⭐
- [CVPR2017] Semantic Scene Completion from a Single Depth Image [caffe] 🔥⭐
- [CVPR2016] Structured Prediction of Unobserved Voxels From a Single Depth Image [resource] ⭐
- [Arxiv] Hypernetwork approach to generating point clouds
- [Arxiv] Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
- [Arxiv] Meshlet Priors for 3D Mesh Reconstruction
- [Arxiv] Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction
- [Arxiv] SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization
- [CVPR2019] Occupancy Networks: Learning 3D Reconstruction in Function Space [pytorch] 🔥⭐
- [NeurIPS2019] DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction [tensorflow]
- [NeurIPS2019] Learning to Infer Implicit Surfaces without 3D Supervision
- [CVPR2019] A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images [pytorch & tensorflow]
- [Arxiv] Deep Level Sets: Implicit Surface Representations for 3D Shape Inference
- [CVPR2019] Learning Implicit Fields for Generative Shape Modeling [tensorflow] 🔥
- [ICCV2019] Point-based Multi-view Stereo Network [pytorch] ⭐
- [Arxiv] TSRNet: Scalable 3D Surface Reconstruction Network for Point Clouds using Tangent Convolution
- [Arxiv] DR-KFD: A Differentiable Visual Metric for 3D Shape Reconstruction
- [ICCV2019] GraphX-Convolution for Point Cloud Deformation in 2D-to-3D Conversion
- [ICCV2019] Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation [pytorch]
- [ICCV2019] Few-Shot Generalization for Single-Image 3D Reconstruction via Priors
- [ICCV2019] Deep Mesh Reconstruction from Single RGB Images via Topology Modification Networks
- [AAAI2018] Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction [tensorflow] ⭐🔥
- [NeurIPS2017] MarrNet: 3D Shape Reconstruction via 2.5D Sketches [torch] ⭐🔥
- [CVPR2020] RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds [tensorflow] 🔥
- [CVPR2020] Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only
- [ICRA2020] 3DCFS: Fast and Robust Joint 3D Semantic-Instance Segmentation via Coupled Feature Selection
- [Arxiv] Indoor Scene Recognition in 3D
- [Journal] Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
- [Arxiv] BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
- [Arxiv] 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans
- [Arxiv] Generating 3D People in Scenes without People
- [CVPR2019] Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments
- [ICCV2019] Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense
- [ICCV2019] U4D: Unsupervised 4D Dynamic Scene Understanding
- [ICCV2019] UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images
- [ICCV2019] Habitat: A Platform for Embodied AI Research [habitat-api] [habitat-sim] ⭐
- [ICCV2019] SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [project page] ⭐
- [ICCV2019] Neural Inverse Rendering of an Indoor Scene From a Single Image
- [ICCV2019] SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation [pytorch]
- [ICCV2019] RIO: 3D Object Instance Re-Localization in Changing Indoor Environments [dataset]
- [ICCV2019] CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization
- [NeurIPS2018] Learning to Exploit Stability for 3D Scene Parsing
- [Arxiv] Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
- [Arxiv] Indoor Layout Estimation by 2D LiDAR and Camera Fusion
- [Arxiv] General 3D Room Layout from a Single View by Render-and-Compare
- [ICCV2019] Learning to Reconstruct 3D Manhattan Wireframes from a Single Image [pytorch]
- [CVPR2019] PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image [pytorch] 🔥
- [CVPR2018] Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene [pytorch]
- [ICCV2019] 3D Scene Reconstruction with Multi-layer Depth and Epipolar Transformers
- [ICCVW2019] Silhouette-Assisted 3D Object Instance Reconstruction from a Cluttered Scene
- [ICCV2019] 3D-RelNet: Joint Object and Relation Network for 3D prediction [pytorch]
- [3DV2019] Pano Popups: Indoor 3D Reconstruction with a Plane-Aware Network
- [Arxiv] Defense-PointNet: Protecting PointNet Against Adversarial Attacks
- [Arxiv] FPConv: Learning Local Flattening for Point Convolution [github]
- [Arxiv] PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
- [CVPR2020] D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
- [ICIP2020] Triangle-Net: Towards Robustness in Point Cloud Classification
- [ICRA2020] Robust 6D Object Pose Estimation by Learning RGB-D Features
- [Arxiv] Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation Using Displacement Fields
- [Arxiv] Learning multiview 3D point cloud registration [code]
- [Arxiv] Single Image Depth Estimation Trained via Depth from Defocus Cues [pytorch]
- [Arxiv] DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling
- [Arxiv] Target-less registration of point clouds: A review
- [Arxiv] Quaternion Equivariant Capsule Networks for 3D point clouds
- [Arxiv] Category-Level Articulated Object Pose Estimation
- [Arxiv] A Quantum Computational Approach to Correspondence Problems on Point Sets
- [Arxiv] DeepSFM: Structure From Motion Via Deep Bundle Adjustment
- [Arxiv] P2GNet: Pose-Guided Point Cloud Generating Networks for 6-DoF Object Pose Estimation
- [ICCV2019] Learning Local RGB-to-CAD Correspondences for Object Pose Estimation
- [ICCV2019] Joint Embedding of 3D Scan and CAD Objects [dataset]
- [ICLR2019] BA-Net: Dense Bundle Adjustment Networks [tensorflow]
- [ICCV2019] GP2C: Geometric Projection Parameter Consensus for Joint 3D Pose and Focal Length Estimation in the Wild
- [ICCV2019] Closed-Form Optimal Two-View Triangulation Based on Angular Errors
- [ICCV2019] Polarimetric Relative Pose Estimation
- [ICCV2019] End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans
- [ICCV2019] Deep Non-Rigid Structure from Motion
- [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
- [Arxiv] Deep Interpretable Non-Rigid Structure from Motion [tensorflow]
- [Arxiv] IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks [dataset]
- [CVPR2019] Scan2CAD: Learning CAD Model Alignment in RGB-D Scans [pytorch] 🔥
- [3DV2019] Location Field Descriptors: Single Image 3D Model Retrieval in the Wild
- [CVPR2016] Marr Revisited: 2D-3D Alignment via Surface Normal Prediction [caffe]
- [Arxiv] KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations
- [Arxiv] A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications
- [Arxiv] From Seeing to Moving: A Survey on Learning for Visual Indoor Navigation (VIN)
- [Arxiv] DIODE: A Dense Indoor and Outdoor DEpth Dataset [dataset]
- [Github] Various GANs with PyTorch
- [Arxiv] SemanticPOSS: A Point Cloud Dataset with Large Quantity of Dynamic Instances [dataset]
- [CVM] A Survey on Deep Geometry Learning: From a Representation Perspective
- [Arxiv] A survey on Semi-, Self- and Unsupervised Techniques in Image Classification
- [Arxiv] fastai: A Layered API for Deep Learning
- [Arxiv] AU-AIR: A Multi-modal Unmanned Aerial Vehicle Dataset for Low Altitude Traffic Surveillance [dataset]
- [Arxiv] Virtual KITTI 2 [dataset]
- [Arxiv] Tutorial on Variational Autoencoders
- [Arxiv] Review: deep learning on 3D point clouds
- [Arxiv] Image Segmentation Using Deep Learning: A Survey
- [CVPR2018] Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction
- [Arxiv] Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey
- [Arxiv] MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection
- [Arxiv] Deep Learning for 3D Point Clouds: A Survey
- [Arxiv] A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
- [Arxiv] A Survey on Deep Learning Architectures for Image-based Depth Reconstruction
- [Arxiv] secml: A Python Library for Secure and Explainable Machine Learning
- [Arxiv] Bundle Adjustment Revisited
- [ICCV2019] Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement
- [Arxiv] SIFT Meets CNN: A Decade Survey of Instance Retrieval
- [ICCV2019] Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data [tensorflow]
- [Arxiv] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks [dataset]
- [Arxiv] Imbalance Problems in Object Detection: A Review [repository]
- [IJCV] Deep Learning for Generic Object Detection: A Survey
- [Arxiv] Differentiable Visual Computing (Ph.D. thesis)
- [BMVC2018] InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset [dataset]
- [ICCV2017] The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes [dataset] [script] ⭐
- [Arxiv] SynthCity: A large scale synthetic point cloud [dataset]
- [Github] Mesh Voxelization (SDFs or Occupancy grids)
- [Github] SDFGen (to generate grid-based signed distance field (level set))
- [Github] Blender renderer for python
- [Github] Blender renderer for python
- [Github] Volumetric TSDF Fusion of RGB-D Images in Python
- [Github] Volumetric TSDF Fusion of Multiple Depth Maps
- [Github] PyFusion
- [Github] PyRender
- [Github] PyMCubes
- [Github] Watertight and Simplified Meshes through TSDF Fusion (Python tool for obtaining watertight meshes using TSDF fusion.)
- [Github] Several tools about SDF functions.
- [Github] 3DMatch Toolbox
- [stackoverflow] Computing truncated signed distance function (TSDF) from a point cloud
- [Github] voxblox: A library for flexible voxel-based mapping, mainly focusing on truncated and Euclidean signed distance fields.
- [Github] Discregrid: A static C++ library for the generation of discrete functions on a box-shaped domain. This is especially suited for the generation of signed distance fields.
- [Github] awesome-voxel: Voxel resources for coders
- [Github] gvdb-voxels: Sparse volume compute and rendering on NVIDIA GPUs
- [Github] pyntcloud is a Python library for working with 3D point clouds.
- [Github] Open3D: A Modern Library for 3D Data Processing
- [Github] mesh_to_sdf: Calculate signed distance fields for arbitrary meshes