fg3d-net's Introduction

Fine-Grained 3D Shape Classification with Hierarchical Part-View Attentions

Created by Xinhai Liu, Zhizhong Han, Yu-Shen Liu, Matthias Zwicker.

Figure 1. The framework of the FG3D-Net.

Abstract

Fine-grained 3D shape classification is important for shape understanding and analysis, and it poses a challenging research problem. Due to the lack of fine-grained 3D shape benchmarks, however, research on fine-grained 3D shape classification has rarely been explored. To address this issue, we first introduce a new 3D shape dataset with fine-grained class labels, which consists of three categories including airplane, car and chair. Each category consists of several subcategories at a fine-grained level. According to our experiments under this fine-grained dataset, we find that state-of-the-art methods are significantly limited by the small variance among subcategories in the same category. To resolve this problem, we further propose a novel fine-grained 3D shape classification method named FG3D-Net to capture the fine-grained local details of 3D shapes from multiple rendered views. Specifically, we first train a Region Proposal Network (RPN) to detect the generally semantic parts inside multiple views under the benchmark of generally semantic part detection. Then, we design a hierarchical part-view attention aggregation module to learn a global shape representation by aggregating generally semantic part features, which preserves the local details of 3D shapes. The part-view attention module leverages part-level and view-level attention to increase the discriminability of our features, where the part-level attention highlights the important parts in each view while the view-level attention highlights the discriminative views among all the views of the same object. In addition, we integrate a Recurrent Neural Network (RNN) to capture the spatial relationships among sequential views from different viewpoints. Our results under the fine-grained 3D shape dataset show that our method outperforms other state-of-the-art methods.
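The two attention levels described in the abstract can be illustrated with a minimal sketch. This is not the FG3D-Net implementation: the dot-product scoring against the hypothetical query vectors `part_query` and `view_query` is an assumption made only for illustration, and the RPN and RNN stages are left out entirely. It only shows the idea of pooling part features into view features, then view features into one shape descriptor.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_pool(features, query):
    """Weighted sum of feature vectors; weights = softmax(feature . query).

    The scoring function is an illustrative assumption, not the paper's.
    """
    weights = softmax([dot(f, query) for f in features])
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights

def shape_descriptor(part_features, part_query, view_query):
    """part_features[v][p] is the feature vector of part p in view v."""
    # Part-level attention: one pooled feature per view.
    view_features = [attention_pool(parts, part_query)[0] for parts in part_features]
    # View-level attention: one pooled feature per shape.
    return attention_pool(view_features, view_query)

if __name__ == "__main__":
    parts = [[[1.0, 0.0], [3.0, 0.0]], [[0.0, 2.0], [0.0, 4.0]]]
    descriptor, view_weights = shape_descriptor(parts, [0.0, 0.0], [0.0, 0.0])
    print(descriptor, view_weights)
```

With all-zero queries every part and view receives the same weight, so the descriptor reduces to a plain average; non-zero queries shift the weight toward the parts and views that score higher.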

FG3D Dataset

Table 1. Statistics of the FG3D dataset, which consists of 3 categories and 66 subcategories.

As shown in Table 1, the FG3D dataset consists of three basic categories, Airplane, Car and Chair, which contain 4,173 shapes in 13 subcategories, 8,325 shapes in 20 subcategories, and 13,054 shapes in 33 subcategories, respectively (totals that match the training and testing splits given below). We represent each 3D shape by an Object File Format (.off) file with polygonal surface geometry. One can easily convert the .off files into other shape representations, such as rendered views, voxels and point clouds. All shapes in FG3D are collected from multiple online repositories, including 3D Warehouse, Yobi3D and ShapeNet, which contain massive collections of CAD shapes that are publicly available for research purposes. By collecting 3D shapes over a period of two months, we obtained a collection of more than 20K 3D shapes in three shape categories. We organized these 3D shapes using the WordNet noun “synsets” (synonym sets). WordNet provides a broad and deep taxonomy with over 80K distinct synsets representing distinct noun concepts. This taxonomy has been used by ImageNet and ShapeNet to define their object subcategories. In our dataset, we also apply this taxonomy to the collection of 3D shapes, as shown in Figure 2.
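As a rough illustration of the conversion mentioned above, the sketch below parses the text of an ASCII .off file and samples a point cloud from its surface. It is not part of the FG3D release: the names `parse_off` and `sample_points` are hypothetical, and the code assumes plain ASCII OFF content with three coordinates per vertex and at least one face with non-zero area.

```python
import random

def parse_off(text):
    """Parse ASCII OFF text into (vertices, faces)."""
    lines = [ln.split("#", 1)[0].strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln]
    if not lines or not lines[0].startswith("OFF"):
        raise ValueError("not an OFF file")
    # Some files fuse the header and the counts on one line, e.g. "OFF8 6 12".
    rest = lines[0][3:].strip()
    if rest:
        counts, body = rest.split(), lines[1:]
    else:
        counts, body = lines[1].split(), lines[2:]
    n_verts, n_faces = int(counts[0]), int(counts[1])
    vertices = [tuple(float(x) for x in ln.split()[:3]) for ln in body[:n_verts]]
    faces = []
    for ln in body[n_verts:n_verts + n_faces]:
        parts = ln.split()
        k = int(parts[0])  # number of vertex indices in this face
        faces.append(tuple(int(i) for i in parts[1:1 + k]))
    return vertices, faces

def triangle_area(a, b, c):
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5

def sample_points(vertices, faces, n, seed=0):
    """Sample n surface points, choosing triangles in proportion to area."""
    rng = random.Random(seed)
    # Fan-triangulate polygons so every face becomes one or more triangles.
    tris = [(f[0], f[i], f[i + 1]) for f in faces for i in range(1, len(f) - 1)]
    areas = [triangle_area(*(vertices[i] for i in t)) for t in tris]
    points = []
    for t in rng.choices(tris, weights=areas, k=n):
        a, b, c = (vertices[i] for i in t)
        r1, r2 = rng.random(), rng.random()
        if r1 + r2 > 1.0:  # reflect back into the triangle
            r1, r2 = 1.0 - r1, 1.0 - r2
        points.append(tuple(a[i] + r1 * (b[i] - a[i]) + r2 * (c[i] - a[i])
                            for i in range(3)))
    return points

if __name__ == "__main__":
    square = "OFF\n4 1 0\n0 0 0\n1 0 0\n1 1 0\n0 1 0\n4 0 1 2 3\n"
    verts, faces = parse_off(square)
    print(len(sample_points(verts, faces, 1024)))
```

To use it on a file from the dataset, read the file's contents into a string and pass it to `parse_off`. Rendered views and voxel grids need a renderer or a voxelizer and are outside the scope of this sketch.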

Figure 2. There are three shape categories in our fine-grained dataset including Airplane, Car and Chair.

For evaluation, we split the shapes in each category into training and testing sets. Specifically, the Airplane category contains 3,441 shapes for training and 732 shapes for testing. The Car category contains 7,010 shapes for training and 1,315 shapes for testing. The Chair category contains 11,124 shapes for training and 1,930 shapes for testing.
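The split sizes above can be tallied with a few lines of Python. The numbers are copied from the paragraph; the dictionary layout and the helper name `category_totals` are only illustrative.

```python
# (training, testing) shape counts per category, as stated in the text above.
SPLITS = {
    "Airplane": (3441, 732),
    "Car": (7010, 1315),
    "Chair": (11124, 1930),
}

def category_totals(splits):
    """Return the total number of shapes per category."""
    return {name: train + test for name, (train, test) in splits.items()}

if __name__ == "__main__":
    totals = category_totals(SPLITS)
    for name, total in totals.items():
        print(f"{name}: {total} shapes")
    print("All categories:", sum(totals.values()))
```

Summing the splits gives 4,173 Airplane shapes, 8,325 Car shapes and 13,054 Chair shapes, or 25,552 shapes overall.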

Data Download

We provide a download link for the FG3D dataset on Google Drive, where the 3D shapes are available as meshes (.off) and as multiple rendered views (.png). The files available at the download link are described below.

 Filename                                               
   Airplane_subcategories.txt          # The name of subcategories under the Airplane category.
   Airplane_off.zip                    # The 3D objects (.off) under the Airplane category.
   Airplane_off_train.txt              # The filename of training shapes (.off) under the Airplane category.
   Airplane_off_test.txt               # The filename of testing shapes (.off) under the Airplane category.
   Airplane_view.zip                   # The 2D rendered views (.png) of 3D objects under the Airplane category. (12 views per shape)
   Airplane_view_train.txt             # The filename of training views (.png) under the Airplane category.
   Airplane_view_test.txt              # The filename of testing views (.png) under the Airplane category.

   Car_subcategories.txt               # The name of subcategories under the Car category.
   Car_off.zip                         # The 3D objects (.off) under the Car category.
   Car_off_train.txt                   # The filename of training shapes (.off) under the Car category.       
   Car_off_test.txt                    # The filename of testing shapes (.off) under the Car category.
   Car_view.zip                        # The 2D rendered views (.png) of 3D objects under the Car category. (12 views per shape)
   Car_view_train.txt                  # The filename of training views (.png) under the Car category.
   Car_view_test.txt                   # The filename of testing views (.png) under the Car category.

   Chair_subcategories.txt             # The name of subcategories under the Chair category.
   Chair_off.zip                       # The 3D objects (.off) under the Chair category.
   Chair_off_train.txt                 # The filename of training shapes (.off) under the Chair category.
   Chair_off_test.txt                  # The filename of testing shapes (.off) under the Chair category.
   Chair_view.zip                      # The 2D rendered views (.png) of 3D objects under the Chair category. (12 views per shape)
   Chair_view_train.txt                # The filename of training views (.png) under the Chair category.
   Chair_view_test.txt                 # The filename of testing views (.png) under the Chair category.
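
The naming scheme in the listing above is regular enough to generate programmatically. The following sketch builds the expected file names for each category and reports which ones are absent from a local folder. The helper names `expected_files` and `missing_files` are hypothetical, and the sketch assumes all files were downloaded into a single directory.

```python
from pathlib import Path

CATEGORIES = ["Airplane", "Car", "Chair"]

# Suffixes taken from the file listing above.
SUFFIXES = [
    "subcategories.txt",
    "off.zip",
    "off_train.txt",
    "off_test.txt",
    "view.zip",
    "view_train.txt",
    "view_test.txt",
]

def expected_files(category):
    """Return the seven file names listed for one category."""
    return [f"{category}_{suffix}" for suffix in SUFFIXES]

def missing_files(root):
    """Return the expected file names that do not exist under root."""
    root = Path(root)
    return [name
            for category in CATEGORIES
            for name in expected_files(category)
            if not (root / name).exists()]

if __name__ == "__main__":
    # "." is a placeholder for wherever the download was saved.
    for name in missing_files("."):
        print("missing:", name)
```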

TODOs

  • We will release the code of FG3D-Net in this repository.

Citation

If you find our work useful in your research, please consider citing:

@article{liu2021fine,
	title={Fine-Grained 3D Shape Classification with Hierarchical Part-View Attentions},
	author={Liu, Xinhai and Han, Zhizhong and Liu, Yu-Shen and Zwicker, Matthias},
	journal={IEEE Transactions on Image Processing},
	year={2021}
}

fg3d-net's Issues

Building gpu_nms.so for my local Linux

How do I build gpu_nms.so for my local PC? Could you please assist me? Do I need to do this, or can I use the file you compiled?

System specifications: Ubuntu 20 LTS

Nice Work

Dear Authors,

Thanks for the excellent work.

I am wondering if there is any PyTorch code that can be used as a reference? Or could you please give me any suggestions?

I would be very grateful for your help.

Best Wishes,
Haoran
