wz0919 / scalevln
[ICCV 2023 Oral]: Scaling Data Generation in Vision-and-Language Navigation
Hello @wz0919 @YicongHong @jialuli-luka, I want to express my gratitude for your outstanding contributions to the field of VLN.
Could you please release the script for extracting image features? Since you used the additional environments HM3D and Gibson,
it would help to know in detail how the features were extracted.
Thank you so much for your time.
Hey, thank you for the amazing work. Could you please release the script used to extract the features below:
clip_vit-h14_mp3d_hm3d_gibson.hdf5
clip_vit-b16_mp3d_hm3d_gibson.hdf5
Thank you.
Hi, thanks for sharing such a great work!
I am wondering if it is possible to share the weights of the Speaker used to generate navigational instructions in ScaleVLN.
I followed the paper and trained an EnvDrop Speaker on the R2R dataset with the provided CLIP features. However, when I generate instructions for the HM3D environments with the provided features, the results are much worse than the instructions in ScaleVLN.
Do you know what the problem might be, or is there any plan to release your trained speaker?
Thanks!
Hi, thanks for your interesting work!
I just wanted to ask how you installed Matterport3DSimulator. I think the command you gave is for a local installation without Docker (correct me if I am wrong). Is there a convenient way to use the Docker version? I successfully installed Matterport3DSimulator with Docker, but had difficulties installing the dependencies when I attempted a local install, because many of the packages it depends on are out of date.
Thanks for your time.
I am very grateful for your research on VLN. Could you please release or share the code, data, and trained models for other downstream tasks, such as REVERIE, R4R, and R2R-CE? Your work could greatly benefit my ongoing projects. Thank you so much for your time.
Thanks for this outstanding work!
Could you please share your Recovered Images from the new environments Gibson and HM3D?
Hi, @wz0919
Do you have a plan to release the depth images or the depth features?
In Appendix B.3, "Effect of Depth Modality", of the paper, I see you compared RGB data and RGBD data.
Hi, when I upload the R2R test file to the leaderboard, the evaluation fails and displays 'from df045272aeba414dbefac729c49d92f5 to f45a8a43423e45788bde4e50d4ec1e2e but the navigation graph contains no edge between these viewpoints'. Did you run into this problem? Looking forward to your reply.
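For anyone hitting the same message, a check along the following lines can be run locally before uploading. It is only a minimal sketch and not code from the ScaleVLN repository: `build_edges` and `find_missing_edges` are hypothetical helper names, and the connectivity record layout (`image_id`, `included`, `unobstructed`) is assumed to match the per-scan connectivity JSON files shipped with Matterport3DSimulator. It reports every consecutive pair of viewpoints in a trajectory that is not joined by an edge in the navigation graph.

```python
from typing import Dict, List, Set, Tuple


def build_edges(connectivity: List[Dict]) -> Set[frozenset]:
    """Collect undirected edges from one scan's connectivity records.

    Assumed record layout (an assumption, not confirmed by this thread):
    {"image_id": str, "included": bool, "unobstructed": [bool, ...]}
    where unobstructed[j] refers to the j-th record in the same list.
    """
    edges: Set[frozenset] = set()
    for i, node in enumerate(connectivity):
        if not node["included"]:
            continue
        for j, linked in enumerate(node["unobstructed"]):
            if linked and i != j and connectivity[j]["included"]:
                edges.add(frozenset((node["image_id"], connectivity[j]["image_id"])))
    return edges


def find_missing_edges(viewpoints: List[str], edges: Set[frozenset]) -> List[Tuple[str, str]]:
    """Return consecutive viewpoint pairs that have no edge in the graph.

    Repeated viewpoints (staying in place) are not treated as errors.
    """
    missing = []
    for a, b in zip(viewpoints, viewpoints[1:]):
        if a != b and frozenset((a, b)) not in edges:
            missing.append((a, b))
    return missing
```

If the check reports a pair, the submitted trajectory jumps between two viewpoints that are not directly connected. One possible cause, stated here as a guess, is that the agent's predicted path records only the chosen nodes and omits the intermediate viewpoints walked through when backtracking.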