masataka46 / multimodaldl

An implementation of 'Multimodal Deep Learning for Robust RGB-D Object Recognition'

License: MIT License

Python 100.00%

multimodaldl's Introduction

Multimodal Deep Learning for Robust RGB-D Object Recognition

Requirements

  • Pillow (Pillow requires an external library that corresponds to the image format)

Description

This is an implementation of 'Multimodal Deep Learning for Robust RGB-D Object Recognition'. It requires training and validation datasets in the following format:

  • Each line contains one training example.
  • Each line consists of two elements separated by space(s).
  • The first element is the path to a 256x256 RGB image.
  • The second element is its ground-truth label, an integer class index starting from 0.

The text format is equivalent to what Caffe uses for ImageDataLayer.
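
As an illustration of the list format described above, here is a minimal sketch of a parser for such a file. The function names `parse_image_list` and `load_image_list` and the example paths are hypothetical and are not taken from this repository; the training scripts may read the file differently.

```python
from pathlib import Path
from typing import List, Tuple


def parse_image_list(text: str) -> List[Tuple[str, int]]:
    """Parse '<path> <label>' lines into (path, label) pairs."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split once from the right on whitespace so the label is the last token,
        # even when several spaces separate the two elements.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected '<path> <label>', got {line!r}"
            )
        path, label = parts
        pairs.append((path, int(label)))
    return pairs


def load_image_list(list_file: str) -> List[Tuple[str, int]]:
    """Read a list file from disk and parse it (hypothetical helper)."""
    return parse_image_list(Path(list_file).read_text())


# Example with made-up paths: one example per line, label is an integer from 0.
example = "images/apple_1.png 0\nimages/banana_3.png   1\n"
print(parse_image_list(example))
```

Splitting from the right keeps the label intact when the separator is more than one space, which the format above allows.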

This example requires a "mean file", which is computed by compute_mean.py.

This example also requires the CaffeNet model 'bvlc_reference_caffenet.caffemodel', hosted at http://dl.caffe.berkeleyvision.org/

Download this model before you start training.

The training procedure is as follows:

  1. Run 'python train_rgb_d.py' with the color data.
  2. Run 'python train_rgb_d.py' with the depth data.
  3. Run 'python train_full.py' with both the color data and the depth data.

multimodaldl's Issues

Some questions about the MultimodalDL code

Hello! I am a graduate student at UESTC in China. I have downloaded your MultimodalDL code and want to run it, but I have run into a problem. When I run the file "train_rgbd.py", it needs to load the caffemodel, and you use the function "serializers.load_npz". Inside "load_npz", the file is loaded with "numpy.load", but "numpy.load" cannot load a caffemodel; it can only load files in formats such as "npz" and "npy". So when I run the code, it throws the error "Failed to interpret file 'bvlc_reference_caffenet.caffemodel' as a pickle". How did you convert the caffemodel to the "npz" or "npy" format? Or did I forget to do something for this code?
I hope you can answer my question. Sincere thanks!
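
One possible way to get an "npz" file from the caffemodel is sketched below. It assumes the code is built on Chainer (the `serializers.load_npz` call mentioned above suggests this) and uses Chainer's `CaffeFunction` loader together with `serializers.save_npz`. The helper names `npz_path_for` and `convert_caffemodel_to_npz` are hypothetical. Whether the parameter names in the saved file match what the training scripts in this repository expect is not confirmed here, so treat this as a starting point rather than the author's procedure.

```python
from pathlib import Path


def npz_path_for(caffemodel_path: str) -> str:
    """Return the output path: the same file name with a .npz suffix."""
    return str(Path(caffemodel_path).with_suffix(".npz"))


def convert_caffemodel_to_npz(caffemodel_path: str) -> str:
    """Load a Caffe model with Chainer and save its parameters as .npz.

    Assumption: Chainer is installed. The imports are done lazily so that
    this file can be imported even when it is not.
    """
    from chainer import serializers
    from chainer.links.caffe import CaffeFunction

    # CaffeFunction parses the Caffe protobuf file and builds Chainer links.
    model = CaffeFunction(caffemodel_path)
    out_path = npz_path_for(caffemodel_path)
    # save_npz writes the link's parameters in NumPy's .npz format.
    serializers.save_npz(out_path, model)
    return out_path


# Usage (requires Chainer and the downloaded model file):
# convert_caffemodel_to_npz("bvlc_reference_caffenet.caffemodel")
```

Parsing the caffemodel can take a while, so converting once and loading the "npz" file afterwards is usually the faster path.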
