toughstonex / self-supervised-mvs
PyTorch code for "Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation"
Thanks for your excellent work!
After reading your papers, I am very interested in trying this out. I have an RTX 3070, which has 8 GB of video memory in total. I set batch_size to 1, but training still runs out of memory:
RuntimeError: CUDA out of memory. Tried to allocate 480.00 MiB (GPU 0; 7.79 GiB total capacity; 5.06 GiB already allocated; 155.00 MiB free; 5.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
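As the error message itself suggests, one low-effort thing to try before buying hardware is the allocator's max_split_size_mb knob to reduce fragmentation. A minimal sketch (the 128 MB value is an illustrative guess, not a tuned setting):

```python
import os

# Must run before the first `import torch` / CUDA allocation,
# e.g. at the very top of train.py. 128 MB is an illustrative value;
# smaller values trade allocator speed for less fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

Other common levers are lowering the input resolution or the number of depth hypotheses (numdepth) in the training script, both of which shrink the cost volume.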
I don't have the funds for a more advanced card right now. Is there any other way to reduce the amount of video memory needed for training? Or could you please share your trained xxx.ckpt file with me directly? My email address is [email protected].
Anyway, thank you again for your excellent work, and I wish you good health and success!
I have tried to train this network with batchsize==8 for 20 epochs, but I can't understand why the training loss doesn't decline and hovers around 10%. The loss is in losses/unsup_loss and is combined with a pretrained VGG block:
12 * self.reconstr_loss + 6 * self.ssim_loss + 0.05 * self.smooth_loss, i.e. (0.8*x + 0.2*y + 0.05*z) * 15
I can't understand why the loss must be multiplied by 15. I hope somebody can explain this, thank you.
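For reference, a minimal sketch of the quoted weighting (names follow the snippet above; the comment is my reading, not the authors' explanation):

```python
def unsup_loss(reconstr_loss, ssim_loss, smooth_loss):
    """Weighted combination as quoted from losses/unsup_loss.
    Scaling all weights by a common factor k (like the 15 above)
    rescales the loss value and gradient magnitude, similar to
    multiplying the learning rate by k; it does not change the
    relative importance of the three terms."""
    return 12 * reconstr_loss + 6 * ssim_loss + 0.05 * smooth_loss
```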
How can I continue training a model in jdacs-ms after training has been interrupted?
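For anyone else hitting this: a generic PyTorch resume sketch (function and key names are hypothetical, not from the jdacs-ms code) is to save the optimizer state and epoch alongside the weights, and reload all three before continuing:

```python
import torch

def save_checkpoint(model, optimizer, epoch, path):
    # Store everything needed to resume: weights, optimizer state, epoch.
    torch.save({"epoch": epoch,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)

def resume(model, optimizer, path):
    # Reload weights and optimizer state, return the epoch to continue from.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"] + 1
```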
Hi, thanks for your excellent work.
My test results on the Tanks and Temples dataset are not as good as yours. Can you share the test code for the Tanks dataset?
Hi, thanks for the nice project.
I tried to use the inverse warping code from JDACS on textureless areas such as planar walls.
However, the model collapses: the warped image looks perfect while the predicted depth is all blank. I suspect that the inverse warping (bilinear sampling via spatial_transformer) is robust enough to produce a perfect warped image without a meaningful depth when the image is largely textureless and has little parallax (e.g. video-sequence frames). As a result, the learner cannot learn: the perfectly warped image creates the illusion of progress when in practice there is none. Any suggestions or ideas why this is happening?
Any help is much appreciated. Thanks.
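One common mitigation (my suggestion, not something in the JDACS code) is to exclude low-texture pixels from the photometric loss, since there the warp is ambiguous and any depth reproduces the image. A sketch using the local gradient magnitude; the 0.02 cutoff is an arbitrary assumption:

```python
import numpy as np

def texture_mask(img, thresh=0.02):
    """Down-weight photometric loss on low-texture pixels.
    img: HxW grayscale array in [0, 1]. thresh: assumed
    gradient-magnitude cutoff, not a value from the paper."""
    gy, gx = np.gradient(img)
    grad = np.sqrt(gx ** 2 + gy ** 2)
    return (grad > thresh).astype(np.float32)
```

The photometric loss would then be multiplied by this mask (and normalized by the mask sum), so flat walls stop rewarding degenerate depth.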
Thanks for your amazing work. But I have a question about cross-view masking.
According to the provided code, it seems that you only block out some regions on the reference view but do not mask out the corresponding areas in the source views, which is inconsistent with the statement in the paper.
What's more, I think that masking out the corresponding areas in the source views would require the ground-truth depth, which is not allowed in the self-supervised MVS setting.
Is my understanding correct? Looking forward to your reply.
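For what it's worth, reference-view-only masking can be sketched like this (a hypothetical illustration of my reading of the code, not the authors' implementation). Note that no ground-truth depth is needed precisely because no corresponding source-view region is ever computed:

```python
import numpy as np

def mask_reference(img, rng, num_rects=3, max_frac=0.2):
    """Zero out a few random rectangles on the reference image only.
    rng: a numpy Generator; rectangle sizes/positions are random."""
    h, w = img.shape[:2]
    out = img.copy()
    for _ in range(num_rects):
        rh = rng.integers(1, int(h * max_frac) + 1)
        rw = rng.integers(1, int(w * max_frac) + 1)
        y = rng.integers(0, h - rh + 1)
        x = rng.integers(0, w - rw + 1)
        out[y:y + rh, x:x + rw] = 0
    return out
```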
Dear author,
I cannot build fusibile. Could you send me a prebuilt executable?
I really enjoyed the author's paper, but I cannot reproduce the jdacs-ms results: my reproduced model does not reach 0.35. May I ask whether all of the code has been released?
RuntimeError: CUDA out of memory. How can I use two RTX 3060 graphics cards to train in parallel?
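Assuming a standard single-process PyTorch training script, a minimal sketch of splitting each batch across both cards (DataParallel; DistributedDataParallel is the faster, recommended alternative but needs more setup):

```python
import torch

def wrap_multi_gpu(model):
    """Replicate the model across all visible GPUs so each forward
    pass splits the batch between them. Falls back to single
    device when fewer than two GPUs are present."""
    if torch.cuda.device_count() > 1:
        model = torch.nn.DataParallel(model)
    return model.cuda() if torch.cuda.is_available() else model
```

Note that DataParallel splits the batch, so with batch_size 1 it cannot reduce per-GPU memory; the cost volume of a single sample must still fit on one card.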
Thanks for your amazing work!
Are there any available pre-trained models?
Excellent contribution! When will you release the training code?
Looking forward to the open-source release.
Thanks for this great work! I have two quick questions:
BTW, is there any Python implementation of the evaluation code, which is currently implemented with Matlab?
Many thanks.
Thanks for your wonderful work! I have some questions about the loss in the paper.
I trained JDACS with MVSNet according to the provided code, but the performance of the obtained model on the DTU dataset is much lower than reported in the paper: the Acc., Comp., and overall scores I obtained are 0.6295, 0.5531, and 0.5913. I found that the loss weights in the code differ from those given in the paper. With the paper's weights the model doesn't seem to work at all: the thres2mm_error on the test set is about 0.99 after the 1st epoch of training, whereas with the code's default weights it is about 0.55 at the same point. The loss weights therefore seem very important. How should I set them to achieve performance similar to the results in the paper?
Hi, nscale is set to 2 for training cvp-mvsnet, while it is 5 for testing.
Is there a reason for this?
Hi, thanks a lot for your swift response and your reminder helps a lot.
One more thing: I trained on the DTU dataset with augmentation and co-segmentation deactivated. In the training curve, the SSIM loss dominates the standard unsupervised loss under the default weights [12*self.reconstr_loss (photo_loss) + 6*self.ssim_loss + 0.05*self.smooth_loss]. In this case, is it sensible to change the weights, e.g. reduce 6*self.ssim_loss to 1*self.ssim_loss so that it is in a similar range to reconstr_loss?
Also, training does not seem steady; the loss fluctuates a lot. Any clues why this happens? Thanks in advance for your help.
Originally posted by @TWang1017 in #22 (comment)
Hi, sorry to disturb you, but when I reproduce the quantitative performance of my own model trained only with the standard loss, I find a large difference between using the evaluation dataset you provided and the original evaluation dataset from dtu_yao.py.
Specifically, when I use the depth maps from dtu_yao.py (with the test list) to generate point clouds via fusion.py, the number of points is dramatically low (about 1 or 2 million), which makes the Acc. and Comp. far too high (about 3.3). Using your evaluation dataset produces much better results.
I'm wondering where the reason lies.
Could you please send the code of jdacs-ms-v2 to my email?
Hi, thanks for your great work! I'd like to replace cvp-mvsnet with patchmatchnet and train the new model. However, I found the loss can't converge to a low level (it stays around 30). Are there any suggestions for training a new model? The training strategy is the same as in the train.sh you provided (the only change is that scale is set from 2 to 3).
Hi, thanks for the amazing work. I encountered the error below when running the train.py script. Any ideas what happened?
File "c:\Users\xxx\Desktop\jdacs-ms\models\network.py", line 72, in forward
conv6 = conv0 + self.conv6(conv5)
RuntimeError: The size of tensor a (11) must match the size of tensor b (12) at non-singleton dimension 2
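Errors like this at a skip connection (conv0 + self.conv6(conv5)) usually mean the input height or width is not divisible by the network's total downsampling factor, so an upsampled feature map is one pixel off from its skip partner. A hedged sketch of the usual fix (the factor 16 is an assumption; use the actual number of 2x downsamplings in network.py):

```python
import numpy as np

def pad_to_multiple(img, multiple=16):
    """Pad H and W (last two axes) up to the next multiple of
    `multiple` so encoder/decoder feature maps align at every
    skip connection. Pads with zeros on the bottom/right."""
    h, w = img.shape[-2:]
    ph = (multiple - h % multiple) % multiple
    pw = (multiple - w % multiple) % multiple
    pad = [(0, 0)] * (img.ndim - 2) + [(0, ph), (0, pw)]
    return np.pad(img, pad)
```

The predicted depth map can then be cropped back to the original H and W after the forward pass.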
Hi, thanks a lot for the amazing work. I am learning MVS and would like to ask about the difference between homo_warping and inverse_warping in your code.
I understand that homo_warping warps features from a point in one camera to another using the homography
H_i(d) = d * K_i * T_i * T_1^{-1} * K_1^{-1}
Could you please explain inverse_warping and its function in your code?
If inverse_warping is just the reverse of the homography, as I understand it, in what situations do you use homo_warping versus inverse_warping?
Thanks in advance.
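For context while waiting for an answer, the plane-sweep homography for a fronto-parallel plane at depth d can be sketched in full as follows (variable names are mine, not from the repo; R, t is the relative pose mapping reference-frame points to the source camera):

```python
import numpy as np

def plane_sweep_homography(K1, K2, R, t, d):
    """3x3 homography mapping reference pixels to the source view for
    the fronto-parallel plane at depth d in the reference frame.
    K1, K2: 3x3 intrinsics; source_point = R @ ref_point + t."""
    n = np.array([0.0, 0.0, 1.0])  # plane normal in the reference frame
    return K2 @ (R - np.outer(t, n) / d) @ np.linalg.inv(K1)
```

With identity rotation and zero translation this reduces to the identity map, which is a quick sanity check when debugging warping code.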
Thanks for this great work! I have one question.
Why does this code generate a mask? Could you explain the principle?
mask = (x0 >= 0) & (x1 <= max_x) & (y0 >= 0) & (y0 <= max_y)
mask = mask.float()
Many thanks.
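The principle, as I understand it (a hedged NumPy sketch, not the repo's code): x0/x1 and y0/y1 are the floor/ceil neighbour coordinates used by bilinear sampling, and the mask records which sampling locations fell entirely inside the image, so out-of-bounds pixels contribute 0 and are excluded from the photometric loss:

```python
import numpy as np

def inbounds_mask(x, y, w, h):
    """x, y: real-valued sampling coordinates. The mask is 1 only
    where all four bilinear neighbours lie inside the image."""
    x0, y0 = np.floor(x), np.floor(y)
    x1, y1 = x0 + 1, y0 + 1
    max_x, max_y = w - 1, h - 1
    mask = (x0 >= 0) & (x1 <= max_x) & (y0 >= 0) & (y1 <= max_y)
    return mask.astype(np.float32)
```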
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument value or use a ... suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Warning at /home/camellia/anaconda3/envs/JDACS/lib/python3.8/site-packages/cmake/data/share/cmake-3.22/Modules/FindCUDA.cmake:1054 (message):
Expecting to find librt for libcudart_static, but didn't find it.
Call Stack (most recent call first):
CMakeLists.txt:5 (find_package)
-- Could NOT find OpenMP_C (missing: OpenMP_pthread_LIBRARY) (found version "3.1")
-- Could NOT find OpenMP_CXX (missing: OpenMP_pthread_LIBRARY) (found version "3.1")
-- Could NOT find OpenMP (missing: OpenMP_C_FOUND OpenMP_CXX_FOUND)
-- Configuring done
-- Generating done
-- Build files have been written to: /home/camellia/zyf/Self-Supervised-MVS-main/jdacs/fusion/fusibile/fusibile/build
Consolidate compiler generated dependencies of target fusibile
[ 33%] Linking CXX executable fusibile
/usr/bin/ld: warning: //home/camellia/anaconda3/envs/JDACS-MVS/lib/libgomp.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ld: warning: //home/camellia/anaconda3/envs/JDACS-MVS/lib/libgomp.so.1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ld: /usr/local/cuda-10.0/lib64/libcudart_static.a(libcudart_static.a.o): undefined reference to symbol 'shm_unlink@@GLIBC_2.2.5'
//lib/x86_64-linux-gnu/librt.so.1: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/fusibile.dir/build.make:332: fusibile] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/fusibile.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
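The failing step is a link error: libcudart_static needs shm_unlink from librt, but librt is not on the link command line. A possible workaround (untested here, paths assumed to match the log above) is to re-configure with -lrt added to the linker flags:

```shell
# From the fusibile build directory: add librt to the link line,
# then rebuild. Adjust the path to your checkout as needed.
cd Self-Supervised-MVS-main/jdacs/fusion/fusibile/fusibile/build
cmake -DCMAKE_EXE_LINKER_FLAGS="-lrt" ..
make
```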
When I run test.sh with nsrc 3 and nscale 1, the terminal reports an error:
attribute lookup MemoryError on numpy.core._exceptions failed
DefaultCPUAllocator: not enough memory: you tried to allocate 23040000 bytes. Buy new RAM!
I have 12 GB of VRAM and 16 GB of RAM. What can I do?