active-3d-vision-and-touch's Introduction

Active Vision and Touch

Companion code for E.J. Smith, et al.: Active 3D Shape Reconstruction from Vision and Touch.

This repository contains a simulator for extracting vision and touch signals from the interaction of a robotic hand with objects from a large dataset of 3D shapes, along with a code base and dataset for learning to select touch signals that optimally reconstruct 3D shapes from vision and touch.
The code comes with pre-defined train/valid/test splits over the released dataset, pretrained models, and training and evaluation scripts. This code base uses a subset of the ABC Dataset (released under the MIT License).

If you find this code useful in your research, please consider citing with the following BibTeX entry:

@article{smith2021active,
  title={Active 3D Shape Reconstruction from Vision and Touch},
  author={Smith, Edward J and Meger, David and Pineda, Luis and Calandra, Roberto and Malik, Jitendra and Romero, Adriana and Drozdzal, Michal},
  journal={arXiv preprint arXiv:2107.09584},
  year={2021}
}

Installation

This code uses Python 3.8, PyTorch 1.7.1, and PyTorch3D 0.5.0. Here are the steps I use to install it:

conda create -n pytorch3d python=3.8
conda activate pytorch3d
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=10.2
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler nvidiacub
conda install pytorch3d -c pytorch3d

If you are having trouble setting up this environment, I recommend following the PyTorch3D installation instructions found here.

  • Install dependencies:
$ pip install -r requirements.txt
  • The package is called pterotactyl. To install it, call:
$ python setup.py develop
  • If you are having trouble rendering with pyrender, it is very possible that the update suggested here will solve your issue.
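Once the package is installed, a quick sanity check like the following can confirm that the key dependencies are importable and match the versions listed above (a minimal sketch; run it inside the conda environment created earlier):

import torch
import pytorch3d
import pterotactyl  # importing this confirms `python setup.py develop` worked

print(torch.__version__)          # expected: 1.7.1
print(pytorch3d.__version__)      # expected: 0.5.0
print(torch.cuda.is_available())  # True if the CUDA toolkit was picked up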

Dataset

To download the dataset, call the following; keep in mind it will take some time (~30 minutes) to download and unpack:

$ bash download_data.sh
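As a rough check that the download and unpacking completed, you can count the files that were pulled down. This is only a sketch: it assumes the data lands under pterotactyl/object_data (the location referenced in the issues below); adjust the path if your copy unpacks elsewhere.

import os
import pterotactyl

# assumption: download_data.sh unpacks the shape data under pterotactyl/object_data/
DATA_DIR = os.path.join(os.path.dirname(pterotactyl.__file__), "object_data")
count = sum(len(files) for _, _, files in os.walk(DATA_DIR))
print(f"{count} files found under {DATA_DIR}")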

Pretrained Models

If you wish to download the pretrained models, call the following:

$ bash download_models.sh

Simulator

The simulator renders touch signals and images of an object-hand grasp interaction.

A simple example, performing a grasp action on an object in the simulator:

import os

import numpy as np
from PIL import Image
from IPython.display import display  # this example is meant to be run in a Jupyter notebook

from pterotactyl.simulator.scene import sampler
from pterotactyl.simulator.physics import grasping
from pterotactyl.utility import utils  # assumed location of visualize_depth used below
import pterotactyl.objects as objects
OBJ_LOCATION = os.path.join(os.path.dirname(objects.__file__), "test_objects/0")
batch = [OBJ_LOCATION]
s = sampler.Sampler(grasping.Agnostic_Grasp, bs=1, vision=True, resolution=[256, 256])
s.load_objects(batch, from_dataset=False, scale=2.6)

action = [30]
parameters = [[[.3, .3, .3], [60, 0, 135]]]
signals = s.sample(action, touch=True, touch_point_cloud=False, vision=True, vision_occluded=True, parameters=parameters)

img_vision_grasp = Image.fromarray(signals["vision_occluded"][0])
display(img_vision_grasp)

# tile the four fingers' touch signals (left column) and depth maps (right column) into one image
image = np.zeros((121 * 4, 121 * 2, 3)).astype(np.uint8)
for i in range(4):
    touch = signals["touch_signal"][0][i].data.numpy().astype(np.uint8)
    image[i * 121 : i * 121 + 121, :121] = touch
    depth = utils.visualize_depth(signals["depths"][0][i].data.numpy()).reshape(121, 121, 1)
    image[i * 121 : i * 121 + 121, 121:] = depth
print(' ')
print('     TOUCH         DEPTH')
display(Image.fromarray(image))
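The display calls above assume a Jupyter notebook. When running the same example as a plain script, the rendered outputs can simply be written to disk instead (a short sketch using the variables from the example above; the filenames are arbitrary):

# save the renders instead of displaying them inline
img_vision_grasp.save("vision_occluded_grasp.png")
Image.fromarray(image).save("touch_and_depth.png")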

More extensive examples are provided in the Jupyter notebook at notebook/simulator.ipynb.

Reconstruction Models

We provide a reconstruction model which predicts a mesh from vision and touch inputs, and an autoencoder which embeds these predictions in a small latent space. The training and evaluation code for these models is found at pterotactyl/reconstruction/. The touch folder converts touch signals to mesh charts, the vision folder predicts a full mesh from vision signals and the predicted touch charts, and the autoencoder folder embeds the prediction in a latent space.

Extensive examples for all models are provided in the Jupyter notebooks at notebook/reconstruction.
Pretrained models are provided for all of these reconstruction examples.
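Each pretrained reconstruction model ships with a config and a checkpoint that can be located programmatically. The sketch below shows how the bundled touch-chart model might be found; the call to load_model_config mirrors the usage in the issue tracebacks further down, but its exact module path here is an assumption:

import os

from pterotactyl import pretrained
from pterotactyl.utility import utils  # assumed module path for load_model_config

# locate the bundled pretrained touch-chart model: its training arguments and weight file
TOUCH_LOCATION = os.path.dirname(pretrained.__file__) + "/reconstruction/touch/best/"
touch_args, weight_location = utils.load_model_config(TOUCH_LOCATION)
print(weight_location)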

Policies

We provide baseline, oracle, and data-driven solutions for optimally selecting touch signals for 3D reconstruction. The training and evaluation code for these models is found at pterotactyl/policies/.

A simple example, testing the even baseline:

import os

from pterotactyl.reconstruction import touch
from pterotactyl.reconstruction import vision
from pterotactyl.policies.baselines import even
from pterotactyl import pretrained

TOUCH_LOCATION = os.path.dirname(pretrained.__file__) + '/reconstruction/touch/best/'
VISION_LOCATION = os.path.dirname(pretrained.__file__) + '/reconstruction/vision/v_t_p/'


class Params:  # define training arguments
    def __init__(self):
        self.limit_data = True
        self.env_batch_size = 2
        self.num_actions = 50
        self.seed = 0
        self.budget = 5
        self.number_points = 10000
        self.loss_coeff = 9000
        self.exp_type = "even_v_t_p_example"
        self.finger = False
        self.num_grasps = 5
        self.use_touch = True
        self.use_img = True
        self.touch_location = TOUCH_LOCATION
        self.vision_location = VISION_LOCATION
        self.visualize = True
        self.use_latent = False
        self.use_recon = True
        self.eval = True
        self.pretrained_recon = True


params = Params()
random_test = even.Engine(params)
random_test()
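The other baselines follow the same pattern. For example, the random baseline (pterotactyl.policies.baselines.rand, the module referenced in the issues below) can be run by swapping the engine; a sketch reusing the Params class defined above, with a hypothetical experiment name:

from pterotactyl.policies.baselines import rand

params = Params()
params.exp_type = "rand_v_t_p_example"  # hypothetical experiment name
random_test = rand.Engine(params)
random_test()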

Extensive examples for all policies are provided in the Jupyter notebooks at notebook/policies.
Pretrained models are provided for all policies.

Results

We evaluate our learned policies in terms of percentage Chamfer distance improvement after 5 touches, highlighting the performance of each policy over 5 models in 4 learning settings.

License

See LICENSE for details.


active-3d-vision-and-touch's Issues

Mask embedding computation

Hi, thanks for your help so far! I have noticed that the mask embedding, which is computed at the first deformation loop

mask_features = self.mask_encoder(mask)

is computed again at the third deformation loop:
mask_features = self.mask_encoder(mask)

Because the mask embedding is fixed throughout the entire training process, shouldn't we use the same embedding variable computed at the first deformation loop? I am wondering whether computing it twice results in a suboptimal optimisation of the mask embedding network during backpropagation.

I am aware that this is a minor issue, so I am just asking to make sure that I am not missing an important detail. Thanks!
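For illustration only, the change the issue suggests amounts to encoding the fixed mask once and reusing the cached embedding in the later refinement steps. A self-contained toy sketch of that pattern (all names and shapes here are placeholders, not the repository's actual code):

import torch
import torch.nn as nn

mask_encoder = nn.Linear(50, 16)   # stand-in for the real mask encoder
mask = torch.zeros(1, 50)          # the mask is fixed for the whole forward pass

mask_features = mask_encoder(mask)  # computed once, before the deformation loops
for step in range(3):               # stand-in for the deformation loops
    # ... reuse the cached mask_features here instead of calling mask_encoder(mask) again ...
    pass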

The number of recon_train set objects found is 0

Hi, I am having an issue when I run object_prediction.ipynb, touch_chart_prediction.ipynb, or any other notebook besides simulator.ipynb (which works correctly). Running the first cell in object_prediction.ipynb, I get this output:

The number of recon_train set objects found : 0

and then this error:

ValueError: num_samples should be a positive integer value, but got num_samples=0

The same happens for touch_chart_prediction.ipynb. I downloaded both the data and the pre-trained models using the scripts provided in the repository.

I am running this on macOS M1, python==3.8.13, torch==1.9.0 (pypi).

I am not sure whether I am missing something.

Thank you!

Here is the traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 /Active-3D-Vision-and-Touch/notebook/Reconstruction/touch_chart_prediction.ipynb Cell 4' in <cell line: 21>()
     19 params = Params()
     20 touch_trainer = touch_train.Engine(params)
---> 21 touch_trainer()

File  ~/Active-3D-Vision-and-Touch/pterotactyl/reconstruction/touch/train.py:52, in Engine.__call__(self)
     48 self.optimizer = optim.Adam(params, lr=self.args.lr)
     49 writer = SummaryWriter(
     50     os.path.join("experiments/tensorboard/", self.args.exp_type)
     51 )
---> 52 train_loader, valid_loader = self.get_loaders()
     54 # evaluate
     55 if self.args.eval:

File  ~/Active-3D-Vision-and-Touch/pterotactyl/reconstruction/touch/train.py:77, in Engine.get_loaders(self)
     73 if not self.args.eval:
     74     train_data = data_loaders.mesh_loader_touch(
     75         self.args, set_type="recon_train"
     76     )
---> 77     train_loader = DataLoader(
     78         train_data,
     79         batch_size=self.args.batch_size,
     80         shuffle=True,
     81         num_workers=16,
     82         collate_fn=train_data.collate,
     83     )
     84 # dataloader for evaluation
     85 set_type = "test" if self.args.eval else "valid"

File ~/miniforge3/envs/active_touch/lib/python3.8/site-packages/torch/utils/data/dataloader.py:270, in DataLoader.__init__(self, dataset, batch_size, shuffle, sampler, batch_sampler, num_workers, collate_fn, pin_memory, drop_last, timeout, worker_init_fn, multiprocessing_context, generator, prefetch_factor, persistent_workers)
    266 else:  # map-style
    267     if shuffle:
    268         # Cannot statically verify that dataset is Sized
    269         # Somewhat related: see NOTE [ Lack of Default `__len__` in Python Abstract Base Classes ]
--> 270         sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
    271     else:
    272         sampler = SequentialSampler(dataset)  # type: ignore[arg-type]

File ~/miniforge3/envs/active_touch/lib/python3.8/site-packages/torch/utils/data/sampler.py:102, in RandomSampler.__init__(self, data_source, replacement, num_samples, generator)
     98     raise ValueError("With replacement=False, num_samples should not be specified, "
     99                      "since a random permute will be performed.")
    101 if not isinstance(self.num_samples, int) or self.num_samples <= 0:
--> 102     raise ValueError("num_samples should be a positive integer "
    103                      "value, but got num_samples={}".format(self.num_samples))

ValueError: num_samples should be a positive integer value, but got num_samples=0
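For reference, the failure happens before training starts, when the dataset object reports zero items. A quick way to confirm this outside the notebook is to build the dataset directly and check its length; a sketch only, where the module path for data_loaders is an assumption based on the traceback and params stands for the same arguments object the notebook builds:

from pterotactyl.utility import data_loaders  # assumed module path; the traceback shows only the name

train_data = data_loaders.mesh_loader_touch(params, set_type="recon_train")
print(len(train_data))  # 0 means the downloaded data is not where the loader expects it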

Directory not found error

Thanks for sharing your code.

When trying to run the DDQN notebook in notebook/Policies/DDQN.ipynb, I kept getting the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/private/home/ejsmith/projects/active/ActiveTouch/pterotactyl/reconstruction/touch/experiments/checkpoint/touch_train/touch_27/model'

Here is the traceback for this error:
FileNotFoundError                         Traceback (most recent call last)
<ipython-input> in <module>
     44 params = Params()
     45 ddqn_trainer = ddqn_train.Engine(params)
---> 46 ddqn_trainer()

5 frames
/content/gdrive/MyDrive/Active-3D-Vision-and-Touch/pterotactyl/policies/DDQN/train.py in __call__(self)
     44 def __call__(self):
     45     # initialize the learning environment
---> 46     self.env = environment.ActiveTouch(self.args)
     47     self.replay_memory = replay.ReplayMemory(self.args)
     48     self.policy = ddqn.DDQN(self.args, self.env.mesh_info, self.replay_memory)

/content/gdrive/MyDrive/Active-3D-Vision-and-Touch/pterotactyl/policies/environment.py in __init__(self, args)
     34     )
     35     self.pretrained_recon_models()
---> 36     self.setup_recon()
     37     self.get_loaders()
     38     self.sampler = sampler.Sampler(

/content/gdrive/MyDrive/Active-3D-Vision-and-Touch/pterotactyl/policies/environment.py in setup_recon(self)
    111     touch_args, weights = utils.load_model_config(self.args.touch_location)
    112     self.touch_prediction = touch_model.Encoder().cuda()
--> 113     self.touch_prediction.load_state_dict(torch.load(weights))
    114     self.touch_prediction.eval()
    115

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    697     pickle_load_args['encoding'] = 'utf-8'
    698
--> 699     with _open_file_like(f, 'rb') as opened_file:
    700         if _is_zipfile(opened_file):
    701             # The zipfile reader is going to advance the current file position.

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _open_file_like(name_or_buffer, mode)
    228 def _open_file_like(name_or_buffer, mode):
    229     if _is_path(name_or_buffer):
--> 230         return _open_file(name_or_buffer, mode)
    231     else:
    232         if 'w' in mode:

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name, mode)
    209 class _open_file(_opener):
    210     def __init__(self, name, mode):
--> 211         super(_open_file, self).__init__(open(name, mode))
    212
    213     def __exit__(self, *args):

I tried to understand this error and ran the following code:

args, weight_location = load_model_config(os.path.dirname(pretrained.__file__) + "/reconstruction/touch/best/")
print(weight_location)

The output is the following directory, which seems strange to me:

'/private/home/ejsmith/projects/active/ActiveTouch/pterotactyl/reconstruction/touch/experiments/checkpoint/touch_train/touch_27/model'

Could you please help with this issue?

Thanks in advance!
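For illustration, one way around a checkpoint path baked in for another machine is to override the returned weight location with a local file before loading. This is a sketch only: whether the pretrained directory actually contains a checkpoint named "model", and the module path for load_model_config, are assumptions.

import os

from pterotactyl import pretrained
from pterotactyl.utility import utils  # assumed module path for load_model_config

touch_dir = os.path.dirname(pretrained.__file__) + "/reconstruction/touch/best/"
touch_args, weights = utils.load_model_config(touch_dir)
if not os.path.exists(weights):
    # assumption: the downloaded checkpoint sits next to the config under the name "model"
    weights = os.path.join(touch_dir, "model")
print(weights)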

filepath error

Thank you for sharing your code.
When I run the policies, for example "Random.ipynb", I get the error "Cannot load URDF file."

b3Warning[examples/Importers/ImportURDFDemo/UrdfFindMeshFile.h,102]: /home/sda/zcy/active3d/active_pycharm_data/pterotactyl/object_data/object_info/26766.urdf:17: cannot find '/private/home/ejsmith/projects/active/ActiveTouch/pterotactyl/object_data/object_info/26766.obj' in any directory in urdf path
b3Error[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,121]: Could not parse visual element for Link:
b3Error[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,121]: baseLink
b3Error[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,121]: failed to parse link

error                                     Traceback (most recent call last)
Cell In[1], line 40
     38 params = Params()
     39 random_test = rand.Engine(params)
---> 40 random_test()

File /home/sda/zcy/active3d/active_pycharm_data/pterotactyl/policies/baselines/rand.py:33, in Engine.__call__(self)
     31 # compute accuracy
     32 with torch.no_grad():
---> 33     self.validate(valid_loaders)

File /home/sda/zcy/active3d/active_pycharm_data/pterotactyl/policies/baselines/rand.py:54, in Engine.validate(self, dataloader)
     52 for v, batch in enumerate(tqdm(dataloader)):
     53     names += batch["names"]
---> 54     obs = self.env.reset(batch)
     55     all_done = False
     56     cur_scores = [obs["score"]]

File /home/sda/zcy/active3d/active_pycharm_data/pterotactyl/policies/environment.py:151, in ActiveTouch.reset(self, batch)
    147 self.current_data["batch"] = batch
    148 self.current_data["mask"] = torch.zeros(
    149     [self.args.env_batch_size, self.args.num_actions]
    150 )
--> 151 self.sampler.load_objects(batch["names"], from_dataset=True)
    152 obs = self.compute_obs()
    153 self.current_data["score"] = obs["score"]

File /home/sda/zcy/active3d/active_pycharm_data/pterotactyl/simulator/scene/sampler.py:83, in Sampler.load_objects(self, batch, from_dataset, scale)
     80 verts, faces = utils.get_obj_data(obj_location, scale=scale)
     81 utils.make_urdf(verts, faces, urdf_location)
---> 83 self.pybullet_scenes[i].load_obj(verts, faces, urdf_location)

File /home/sda/zcy/active3d/active_pycharm_data/pterotactyl/simulator/scene/instance.py:90, in Scene.load_obj(self, verts, faces, urdf_location)
     88 faces = utils.add_faces(faces)
     89 # loading into pybullet
---> 90 self.obj = self.pb.loadURDF(
     91     urdf_location, [0, 0, 0], [0, 0, 0, 1], useFixedBase=1
     92 )
     94 # loading into pyrender
     95 mesh = trimesh.Trimesh(vertices=verts, faces=faces, process=False)

error: Cannot load URDF file.

I tried changing "check_point" in the pretrained config.json, but it does not work. Do you think some absolute paths are causing this?
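The b3Warning above suggests the .urdf files shipped with the object data still reference mesh paths from the original author's machine. A possible workaround (a sketch only; the directory layout is inferred from the log above) is to rewrite those references so pybullet resolves each .obj relative to its URDF:

import glob
import os
import re

# assumption: the dataset URDFs and their .obj meshes sit together under
# pterotactyl/object_data/object_info/, as the warning above suggests
URDF_DIR = "pterotactyl/object_data/object_info"

for urdf_path in glob.glob(os.path.join(URDF_DIR, "*.urdf")):
    with open(urdf_path) as f:
        text = f.read()
    # strip any absolute directory prefix so the mesh filename is resolved
    # relative to the URDF's own location
    fixed = re.sub(r'filename="[^"]*/([^"/]+\.obj)"', r'filename="\1"', text)
    if fixed != text:
        with open(urdf_path, "w") as f:
            f.write(fixed)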
