shaohua0116 / multiview2novelview

An official TensorFlow implementation of "Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence" (ECCV 2018) by Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, and Joseph J. Lim

Home Page: https://shaohua0116.github.io/Multiview2Novelview/

License: MIT License

Python 100.00%

computer-vision image-synthesis view-synthesis novel-view-synthesis deep-learning eccv-2018 eccv

multiview2novelview's Introduction

Multi-view to Novel view:
Synthesizing Novel Views with Self-Learned Confidence

Descriptions

This project is a TensorFlow implementation of Multi-view to Novel view: Synthesizing Novel Views with Self-Learned Confidence, published at ECCV 2018. We provide code, datasets, and checkpoints.

In this work, we address the task of multi-view novel view synthesis, where we are interested in synthesizing a target image with an arbitrary camera pose from given source images. An illustration of the task is as follows.

We propose an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision. Specifically, our model consists of a flow prediction module (flow predictor) and a pixel generation module (recurrent pixel generator) to directly leverage information presented in source views as well as hallucinate missing pixels from statistical priors. To merge the predictions produced by the two modules given multi-view source images, we introduce a self-learned confidence aggregation mechanism. An illustration of the proposed framework is as follows.
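
The confidence aggregation described above can be sketched in plain Python for a single pixel. This is a minimal illustration, not the repository's code: it assumes that each prediction comes with a confidence score and that the scores are normalized with a softmax across all predictions before a weighted sum is taken. The function name `aggregate_pixel` and the toy numbers are hypothetical.

```python
import math


def aggregate_pixel(predictions, confidences):
    """Merge several predictions for one pixel using softmax-normalized confidences.

    predictions: one predicted intensity per flow prediction plus the pixel module.
    confidences: one unnormalized confidence score per prediction.
    """
    if len(predictions) != len(confidences):
        raise ValueError("need exactly one confidence per prediction")
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(confidences)
    exps = [math.exp(c - top) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]
    merged = sum(w * p for w, p in zip(weights, predictions))
    return merged, weights


# Three flow-based predictions and one generated pixel (toy values).
merged, weights = aggregate_pixel([0.2, 0.8, 0.5, 0.4], [0.0, 2.0, 0.0, -1.0])
print(merged, weights)
```

Under this assumption, a prediction with a higher confidence score dominates the merged pixel, and equal scores reduce to a plain average.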

We evaluate our model on images rendered from 3D object models (ShapeNet) as well as real and synthesized scenes (KITTI and Synthia). We demonstrate that our model is able to achieve state-of-the-art results as well as progressively improve its predictions when more source images are available.

A simpler novel view synthesis codebase can be found at Novel View Synthesis in TensorFlow, where all the data loaders, as well as training/testing scripts, are well-configured, and you can just play with models.

Prerequisites

Datasets

All datasets are stored as HDF5 files, and the links are as follows. Each data point (HDF5 group) contains an image and its camera pose.

ShapeNet

Download from
- car (150GB)
- chair (14GB)
Put the files in this directory: ./datasets/shapenet.

KITTI

Download from here (4.3GB)
Put the file in this directory: ./datasets/kitti.

Synthia

Download from here (3.3GB)
Put the file in this directory: ./datasets/synthia.

Usage

After downloading the datasets, we can start to train models with the following command:

Train

$ python trainer.py --batch_size 8 --dataset car --num_input 4

Selected arguments (see the trainer.py for more details)
- --prefix: a nickname for the training
- --dataset: choose among car, chair, kitti, and synthia. You can also add your own datasets.
- Checkpoints: specify the path to a pre-trained checkpoint
  - --checkpoint: load all the parameters including the flow and pixel modules and the discriminator.
- Logging
  - --log_step: the frequency of outputting log info (e.g., [train step 681] Loss: 0.51319 (1.896 sec/batch, 16.878 instances/sec))
  - --ckpt_save_step: the frequency of saving a checkpoint
  - --test_sample_step: the frequency of performing testing inference during training (default 100)
  - --write_summary_step: the frequency of writing TensorBoard summaries (default 100)
- Hyperparameters
  - --num_input: the number of source images
  - --batch_size: the mini-batch size (default 8)
  - --max_steps: the max training iterations
- GAN
  - --gan_type: the type of GAN losses such as LS-GAN, WGAN, etc
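
As a rough guide to how these options fit together, here is a minimal `argparse` sketch covering a subset of the flags. It is not a copy of trainer.py: only the defaults stated above (batch size 8, test and summary steps of 100) come from this README, and every other default, as well as the `build_parser` helper, is a placeholder assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring a subset of the flags listed above."""
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("--prefix", type=str, default="default")  # assumed default
    parser.add_argument("--dataset", type=str, default="car",  # assumed default
                        choices=["car", "chair", "kitti", "synthia"])
    parser.add_argument("--checkpoint", type=str, default=None)  # assumed default
    parser.add_argument("--test_sample_step", type=int, default=100)
    parser.add_argument("--write_summary_step", type=int, default=100)
    parser.add_argument("--num_input", type=int, default=2)  # assumed default
    parser.add_argument("--batch_size", type=int, default=8)
    return parser


# Parse the same flags as the training command shown above.
config = build_parser().parse_args(
    ["--batch_size", "8", "--dataset", "car", "--num_input", "4"])
print(config.dataset, config.num_input, config.batch_size)
```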

Interpret TensorBoard

Launch TensorBoard and open the specified port to see the different losses in the Scalars tab and the plotted images in the Images tab. The plotted images can be interpreted as follows.

Test

We can also evaluate trained models or the checkpoints provided by the authors with the following command:

$ python evaler.py --dataset car --data_id_list ./testing_tuple_lists/id_car_random_elevation.txt [--train_dir /path/to/the/training/dir/ OR --checkpoint /path/to/the/trained/model] --loss True --write_summary True --summary_file log_car.txt --plot_image True --output_dir img_car

Selected arguments (see the evaler.py for more details)
- Id list
  - --data_id_list: specify a list of data points that you want to evaluate
- Task
  - --loss: report the loss
  - --write_summary: write the summary of this evaluation as a text file
  - --plot_image: render synthesized images
- Output
  - --quiet: only display the final report
  - --summary_file: the path to the summary file
  - --output_dir: the output dir of plotted images
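
When evaluating several datasets, it can be convenient to assemble the command programmatically. The sketch below only builds the argument list from the flags shown above; the helper name `build_eval_command` is hypothetical, the checkpoint path is the placeholder from the command above, and the command is printed rather than executed.

```python
import shlex


def build_eval_command(dataset, id_list, checkpoint, summary_file, output_dir):
    """Assemble an evaler.py invocation from the flags documented above."""
    return [
        "python", "evaler.py",
        "--dataset", dataset,
        "--data_id_list", id_list,
        "--checkpoint", checkpoint,
        "--loss", "True",
        "--write_summary", "True",
        "--summary_file", summary_file,
        "--plot_image", "True",
        "--output_dir", output_dir,
    ]


cmd = build_eval_command(
    dataset="car",
    id_list="./testing_tuple_lists/id_car_random_elevation.txt",
    checkpoint="/path/to/the/trained/model",
    summary_file="log_car.txt",
    output_dir="img_car",
)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from the repository root.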

Result

ShapeNet Cars

More results for ShapeNet cars (1k randomly sampled results from all 10k testing data)

ShapeNet Chairs

More results for ShapeNet chairs (1k randomly sampled results from all 10k testing data)

Scenes: KITTI and Synthia

Checkpoints

We provide checkpoints and evaluation report files of our models for all experiments.

Related work

[L_1] Multi-view 3D Models from Single Images with a Convolutional Network in CVPR 2016
[Appearance Flow] View Synthesis by Appearance Flow in ECCV 2016
[TVSN] Transformation-Grounded Image Generation Network for Novel 3D View Synthesis in CVPR 2017
Neural scene representation and rendering in Science 2018
Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis in NIPS 2015
DeepStereo: Learning to Predict New Views From the World's Imagery in CVPR 2016
Learning-Based View Synthesis for Light Field Cameras in SIGGRAPH Asia 2016

Cite the paper

If you find this useful, please cite

@inproceedings{sun2018multiview,
  title={Multi-view to Novel View: Synthesizing Novel Views with Self-Learned Confidence},
  author={Sun, Shao-Hua and Huh, Minyoung and Liao, Yuan-Hong and Zhang, Ning and Lim, Joseph J},
  booktitle={European Conference on Computer Vision},
  year={2018},
}

Authors

Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, and Joseph J. Lim

multiview2novelview's Issues

Sharing the scripts to make hdf5 file for custom dataset

Hello sirs,
I am trying to run your code on my own small dataset.
Could you share how you created the datasets, for example for ShapeNet or KITTI?

Performance issue in the definition of build_loss, model.py(P1)

Hello, I found a performance issue in the definition of build_loss in model.py: tf.expand_dims(target_image, axis=-1) is created repeatedly during program execution, which reduces efficiency. I think it should be created once, before the loop.

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.
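
The change proposed here is ordinary loop-invariant hoisting. The sketch below illustrates the idea with a stand-in function instead of TensorFlow: `expand_dims` is a hypothetical placeholder that counts its calls, and the loop bodies are not the actual `build_loss` code. In graph-mode TensorFlow the repeated call would mainly add redundant nodes to the graph, so the practical gain may be small.

```python
call_count = 0


def expand_dims(image):
    """Stand-in for tf.expand_dims(image, axis=-1): wrap each value in a list."""
    global call_count
    call_count += 1
    return [[value] for value in image]


def loss_repeated(target_image, predictions):
    total = 0.0
    for pred in predictions:
        target = expand_dims(target_image)  # recomputed on every iteration
        total += sum(abs(t[0] - p) for t, p in zip(target, pred))
    return total


def loss_hoisted(target_image, predictions):
    target = expand_dims(target_image)  # computed once, before the loop
    total = 0.0
    for pred in predictions:
        total += sum(abs(t[0] - p) for t, p in zip(target, pred))
    return total


target = [0.1, 0.5, 0.9]
preds = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.9], [0.1, 0.6, 0.8]]

call_count = 0
a = loss_repeated(target, preds)
calls_repeated = call_count

call_count = 0
b = loss_hoisted(target, preds)
calls_hoisted = call_count

print(a, b, calls_repeated, calls_hoisted)
```

Both versions return the same loss; only the number of calls to the invariant operation differs.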

about training steps

Dear Author,
Thanks for sharing your interesting work. I have the following questions:

What are the specs of the machine you used? Training has been running on my machine for 3 days and has not finished. For example, it has reached:
[2019-03-24 17:43:27,001] [train step 145431] Loss: 4.35295 Pixel loss: 4.03669 Flow loss: 0.31627 (1.589 sec/batch, 5.036 instances/sec)
and it is still running. What is the total number of training steps?
The input and output of your network are just images, not videos, right?
I tried the following command:
python trainer.py --batch_size 8 --dataset car --num_input 4
but it fails with the following error after reaching train step 4251. Do you have any idea why?

[2019-03-23 05:36:24,763] [train step 4261] Loss: 2.96025 Pixel loss: 2.85450 Flow loss: 0.10575 (1.607 sec/batch, 2.489 instances/sec)
Traceback (most recent call last):
File "trainer.py", line 380, in <module>
main()
File "trainer.py", line 377, in main
trainer.train()
File "trainer.py", line 193, in train
opt_gan=s > gan_start_step, is_train=True)
File "trainer.py", line 209, in run_single_step
batch_chunk = self.session.run(batch)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_0_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 4, current size 3)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_STRING, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

Caused by op u'shuffle_batch', defined at:
File "trainer.py", line 380, in <module>
main()
File "trainer.py", line 374, in main
trainer = Trainer(config, dataset_train, dataset_test)
File "trainer.py", line 48, in __init__
dataset, self.batch_size, is_training=True)
File "/data/ehab/Multiview2NovelviewMaster/input_ops.py", line 76, in create_input_ops
min_after_dequeue=min_capacity,
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 1220, in shuffle_batch
name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 791, in _shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 457, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 1342, in _queue_dequeue_many_v2
timeout_ms=timeout_ms, name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

OutOfRangeError (see above for traceback): RandomShuffleQueue '_0_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 4, current size 3)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_STRING, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

I was able to get some figures from TensorBoard, but how can I obtain the numerical results published in the tables of your paper?
I really appreciate your time and your reply.
Regards
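
For reference, the error message in the traceback describes a queue that was closed while holding fewer elements than one dequeue requested. The sketch below reproduces only that condition with a small stand-in class; `ClosableQueue` and `OutOfRange` are hypothetical names, not TensorFlow APIs, and the sketch does not diagnose why the queue was closed in this particular run.

```python
from collections import deque


class OutOfRange(Exception):
    """Stand-in for tf.errors.OutOfRangeError."""


class ClosableQueue:
    """Toy queue that mimics dequeue_many on a closed, under-filled queue."""

    def __init__(self):
        self._items = deque()
        self._closed = False

    def enqueue(self, item):
        if self._closed:
            raise RuntimeError("queue is closed")
        self._items.append(item)

    def close(self):
        self._closed = True

    def dequeue_many(self, n):
        if len(self._items) < n:
            if self._closed:
                raise OutOfRange(
                    "closed and has insufficient elements "
                    "(requested %d, current size %d)" % (n, len(self._items)))
            raise RuntimeError("a real queue would block here until filled")
        return [self._items.popleft() for _ in range(n)]


queue = ClosableQueue()
for i in range(7):
    queue.enqueue(i)
queue.close()  # e.g. the producer side has stopped supplying data

first = queue.dequeue_many(4)  # succeeds: 7 elements were available
try:
    queue.dequeue_many(4)  # only 3 elements remain
except OutOfRange as err:
    message = str(err)
    print(message)
```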

Implementing the code

Dear Sir,
Thank you for sharing your work.
I tried the following command but got an error while loading the data.
I used:

python trainer.py --batch_size 4 --dataset kitti --num_input 4 --checkpoint ~/Checkpoints_KITTI/model-1

I got the following:

Traceback (most recent call last):
File "trainer.py", line 380, in <module>
main()
File "trainer.py", line 369, in main
dataset.create_default_splits(config.num_input)
File "/data/ehab/Multiview2NovelviewMaster/datasets/kitti.py", line 132, in create_default_splits
bound=bound)
File "/data/ehab/Multiview2NovelviewMaster/datasets/kitti.py", line 35, in __init__
self.data = h5py.File(file, 'r')
File "/usr/lib64/python2.7/site-packages/h5py/_hl/files.py", line 394, in __init__
swmr=swmr)
File "/usr/lib64/python2.7/site-packages/h5py/_hl/files.py", line 170, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 85, in h5py.h5f.open
IOError: Unable to open file (unable to open file: name = './datasets/kitti/data_kitti.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
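
The error reports that ./datasets/kitti/data_kitti.hdf5 does not exist relative to the working directory. A quick pre-flight check such as the following sketch can confirm this before launching training; the helper name `missing_files` is hypothetical, and only the KITTI path is taken from the traceback above.

```python
import os


def missing_files(paths, root="."):
    """Return the subset of paths that are not existing files under root."""
    return [p for p in paths if not os.path.isfile(os.path.join(root, p))]


expected = ["datasets/kitti/data_kitti.hdf5"]  # path taken from the traceback
missing = missing_files(expected)
if missing:
    print("Missing dataset files (run from the repository root?):", missing)
else:
    print("All dataset files found.")
```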

About quantitative indicators and visualization results

Dear Author
Thank you for sharing your code.
I have some questions about quantitative indicators and visualization results.
When I used the KITTI model you provided for the test, I found that the SSIM index was higher than the index shown in your paper. Moreover, when I used the code you provided to train KITTI, the final loss of single picture input would converge to 0.198, but the visual effect of the output picture was not as good as the result you provided. What is the reason?

load kitti checkpoint failed

Hi,
I am trying to train your network using my own dataset. I want to fine-tune the model from your pretrained KITTI checkpoint, but it failed.

training problem

Dear sir,
Thanks for sharing your code!
I ran into some strange problems during training.

The image list is totally different from yours, even though I trained with the code you released and did not change anything.
Waiting for your reply.

Aggregate output

Thanks for sharing your code
I need help finding the line in your code that computes the aggregated output, i.e., the equation on page 7, between equations 5 and 6, that determines the estimated target image.
Thanks

The poses in the KITTI dataset

Hi,

I would like to start by thanking you for sharing your code and trained models. I'm working these days on an undergraduate project with a partner. We are trying to use your model on general data without your supplied poses data using SfM algorithms.

For our project, we tried to use the data from KITTI's official website and couldn't make it work with your trained model. We concluded that the problem is that your pose data differs from the data on KITTI's website.

Can you please explain how you created your pose data from the data supplied on the KITTI website?

Batch Normalization

Hello dear authors,

In your source code, we can set the normalization parameter to 'batch', 'instance' normalization or the default 'None'.

Is there a reason why normalization is off by default? Did you have time to experiment with normalization as well? Did it not contribute to better results?

Any information is greatly appreciated.
Thanks.

Evaluate model from checkpoints

Hello,
I am trying to evaluate model from checkpoints provided by you.
I have referred to descriptions provided by you in Github.
Thus I used
python evaler.py --dataset kitti --data_id_list ./testing_tuple_lists/id_kitti.txt --checkpoint ./kitti_checkpoint/model-1 --loss True --write_summary True --summary_file log_kitti.txt --plot_image True --output_dir img_kitti .
However, I get an error: OutOfRangeError (see above for traceback): Read less bytes than requested

I would appreciate if anyone could help me.

About Flow module

Dear Author
Thank you for sharing your code
I have a question about the flow module, which uses an appearance flow network. In [23], the authors report using a Disocclusion-aware Appearance Flow Network (DOAFN). Did you use DOAFN or just AFN?
Thanks

Supplementary material?

Thanks for the code. I was wondering if you could also post the supplementary material for reference.

The number of source images & tensorboard images

Hello @shaohua0116

Thank you for sharing such good work!

What is the best number of training iterations for each dataset before performing evaluation?

If this question has already been raised or explained and I missed it, then I am sorry for raising it again.
Thanks in advance

something about normalization

I am sorry to bother you, but I would like to consult you about some structural problems. Are you not using the Normalization layer in the whole network or only in the discriminator? I am looking forward to your reply.

Per-pixel confidence loss question

Hello,
In equation (5) of your paper, there is an element-wise square operator like this:

I believe this loss function is defined in this piece of code:
l1_loss += tf.reduce_mean(loss_map * normalized_mask) * current_weight / \
    (int(img.get_shape()[1]) * int(img.get_shape()[2])) * regularizer_weight

I wonder whether it should be like this:
l1_loss += tf.reduce_mean((loss_map**2) * normalized_mask) * current_weight / \
    (int(img.get_shape()[1]) * int(img.get_shape()[2])) * regularizer_weight

Can you verify whether it is correct? If your code is correct, then I am sorry for my question. I was just confused when I saw the loss written like that. Thanks!

Can anyone tell me how much training time is needed after running the trainer.py script? I have an NVIDIA RTX A6000 GPU.
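
A rough estimate can be derived from the throughput that trainer.py logs. The sketch below multiplies a step count by the logged seconds per batch and assumes a constant speed. The numbers come from the log line quoted in an earlier issue (train step 145431 at 1.589 sec/batch), not from an RTX A6000, and the helper `estimate_hours` is hypothetical.

```python
def estimate_hours(num_steps, sec_per_batch):
    """Wall-clock hours needed for num_steps at a constant sec_per_batch."""
    return num_steps * sec_per_batch / 3600.0


hours = estimate_hours(145431, 1.589)
print("%.1f hours (%.1f days)" % (hours, hours / 24.0))
```

To adapt it, substitute the sec/batch value printed on your own machine and the number of steps you plan to train for.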

Foreground Mask