SAM-NeRF: A Simple Baseline for Segmenting Anything in NeRF with Language Prompts.

Demo videos: SAM-with-Language-prompts.mp4 and segment-anything-in-NeRF.mp4

This repository adds language prompt support to SAM by combining it with ClipSeg. Additionally, we offer a simple baseline for connecting SAM with NeRF and a 2D-to-3D SAM feature distillation method. Specifically, this project contains:

An extension of Segment Anything for incorporating language prompts by combining ClipSeg with SAM. The inference of ClipSeg features is fast, taking only 0.04s per image and causing negligible overhead.

An implementation that combines Segment Anything with NeRF, allowing users to lock onto 3D objects across different views and to segment in 3D using language and point prompts.

An implementation of distilling SAM and ClipSeg features into 3D fields. In this pipeline, the image encoders of SAM and ClipSeg are replaced by a volumetric rendering process, significantly reducing the time spent on image encoding. The acceleration mainly comes from the much lower rendering resolution; we believe that reducing the input image size further will give more acceleration and better results. We use patch-based rendering and aggregate neighboring features to make up for the loss of interactions between patches, improving mask quality.

A viewer for visualizing the trained SAM-NeRF. This viewer allows users to lock onto a certain 3D object by clicking or by providing a text prompt. For language prompts, the viewer can also search for objects by text and provide a heatmap indicating pixel-level relevance.
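
One way to picture the hand-off from a text-relevance heatmap to SAM is to turn the heatmap into point prompts. The sketch below is only an illustration under stated assumptions, not this repository's code: the helper name `heatmap_to_point_prompts`, the threshold, and the greedy peak-picking strategy are all hypothetical, and the heatmap is assumed to be at the same resolution as the image passed to SAM.

```python
import numpy as np


def heatmap_to_point_prompts(heatmap, num_points=3, threshold=0.5, min_distance=8):
    """Pick up to `num_points` high-relevance pixels from a 2D heatmap.

    Hypothetical helper: greedily selects the strongest pixels above
    `threshold`, suppressing a square neighbourhood around each pick so
    that the points spread out instead of clustering on one peak.
    Returns (coords, labels): coords is (K, 2) in (x, y) pixel order and
    labels is (K,) of ones, meaning "foreground".
    """
    heat = np.asarray(heatmap, dtype=np.float32).copy()
    if heat.ndim != 2:
        raise ValueError("heatmap must be 2D (H, W)")
    height, width = heat.shape
    coords = []
    for _ in range(num_points):
        flat_index = int(np.argmax(heat))
        y, x = divmod(flat_index, width)
        if heat[y, x] < threshold:
            break  # nothing relevant enough is left
        coords.append((x, y))
        # Suppress the neighbourhood so the next pick lands elsewhere.
        y0, y1 = max(0, y - min_distance), min(height, y + min_distance + 1)
        x0, x1 = max(0, x - min_distance), min(width, x + min_distance + 1)
        heat[y0:y1, x0:x1] = -np.inf
    coords = np.array(coords, dtype=np.float32).reshape(-1, 2)
    labels = np.ones(len(coords), dtype=np.int64)
    return coords, labels
```

The returned arrays follow the layout that `SamPredictor.predict` in the `segment_anything` package accepts for its `point_coords` and `point_labels` arguments (N x 2 coordinates in (x, y) order, with label 1 for foreground), so they could be passed on as point prompts.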

Install

1. Install required packages

git clone https://github.com/WangFeng18/Explore-Sam-in-NeRF.git
# or if ssh is available
git clone git@github.com:WangFeng18/Explore-Sam-in-NeRF.git
cd Explore-Sam-in-NeRF
pip install -r requirements.txt

2. Download pretrained models

Download with script

bash download.sh

Or download manually (equivalent to running the script)

For the pretrained ClipSeg model:

cd samnerf/clipseg
wget https://owncloud.gwdg.de/index.php/s/ioHbRzFx6th32hn/download -O weights.zip
unzip -d weights -j weights.zip

For the pretrained SAM model:

cd samnerf/segment-anything
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
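
After downloading, a quick check that the files landed where the commands above put them can save a failed training run. The following is a small sketch, not part of the repository: the helper `missing_weights` is hypothetical, and the paths assume each `cd` above was run from the repository root.

```python
from pathlib import Path

# Paths taken from the manual download commands above, assuming each
# `cd` is run from the repository root (an assumption, not a guarantee).
EXPECTED_PATHS = (
    "samnerf/clipseg/weights",
    "samnerf/segment-anything/sam_vit_b_01ec64.pth",
    "samnerf/segment-anything/sam_vit_h_4b8939.pth",
)


def missing_weights(repo_root="."):
    """Return the expected weight paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing pretrained weights:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All pretrained weights found.")
```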

3. Host your own viewer

Install viser via

pip install viser

and make sure Node.js and yarn are available on your machine. Refer to this if you are unsure how to install them.

Host on your machine via

cd nerfstudio/viewer/app
yarn
yarn start

After compiling, the viewer is available at <your_machine_ip>:<port>/?websocket_url=ws://localhost:<ws_port>. By default, <port> is set to 4000; you can change it through the PORT variable in nerfstudio/viewer/app/.env.development. <ws_port> is the value passed as --viewer.websocket-port <ws_port> on the command line when you start NeRF training.
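
As a worked example of the address pattern above, the helper below fills in the placeholders. It is a hypothetical convenience function, not something shipped with the viewer, and the `http://` prefix is an assumption (the pattern above gives no scheme).

```python
def viewer_url(machine_ip, ws_port, port=4000):
    """Build the viewer address from the pattern described above.

    port defaults to 4000, the default stated for the viewer;
    the http:// scheme is an assumption.
    """
    return f"http://{machine_ip}:{port}/?websocket_url=ws://localhost:{ws_port}"


# Example with the websocket port used in the training commands (7007)
# and a made-up machine address.
print(viewer_url("192.168.1.10", 7007))
```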

For more complete viewer instructions, check out here 🙉.

NOTE: The viewer is currently a work in progress, and there may be some bugs. Please let us know if you encounter something unexpected. Thanks in advance for your help 🥰.

Getting Started

1. SAM with Language prompts

The notebook samclip.ipynb shows how to use language-promptable SAM. We also provide a Gradio program for interactively segmenting objects with language prompts.

2. Segment Anything in NeRF

# data pre-processing, get the json files for training nerf in nerfstudio
bash samnerf/preprocessing/mipnerf360.sh json

Without 3D feature distillation

python -m samnerf.train samnerf_no_distill --vis viewer+wandb --viewer.websocket-port 7007

With 3D feature distillation. This method distills the features of the SAM encoder into a 3D feature field, so the image encoding process is replaced by volumetric rendering.

# first extract the features of SAM encoder and ClipSeg features
bash samnerf/preprocessing/mipnerf360.sh feature
# training nerf
python -m samnerf.train samnerf_distill --vis viewer+wandb --viewer.websocket-port 7007
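
To make "image encoding replaced by volumetric rendering" more concrete: a feature field can be rendered along a ray with the same compositing weights that NeRF uses for colour, and the result compared against the 2D encoder features extracted in the preprocessing step. The NumPy sketch below assumes the standard NeRF quadrature and a mean-squared-error objective; the function names, shapes, and the choice of loss are illustrative assumptions, not the repository's implementation.

```python
import numpy as np


def render_ray_features(densities, deltas, features):
    """Composite per-sample features along one ray.

    densities: (N,) non-negative volume densities sigma_i
    deltas:    (N,) distances between consecutive samples
    features:  (N, C) feature vectors predicted by the field
    Returns the rendered (C,) feature and the (N,) compositing weights.
    """
    densities = np.asarray(densities, dtype=np.float64)
    deltas = np.asarray(deltas, dtype=np.float64)
    features = np.asarray(features, dtype=np.float64)
    # Opacity of each sample: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance before each sample: T_i = prod_{j<i} (1 - alpha_j)
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = transmittance * alphas
    rendered = weights @ features
    return rendered, weights


def distillation_loss(rendered, target):
    """Mean squared error between rendered and 2D encoder features.

    MSE is an assumed objective for illustration only.
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((rendered - target) ** 2))
```

Because the weights depend only on density, the same weights can composite colour, SAM features, and ClipSeg features in a single pass over the ray samples.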

Acknowledgement

Our code is based on Segment-Anything,

@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}

ClipSeg,

@InProceedings{lueddecke22_cvpr,
    author    = {L\"uddecke, Timo and Ecker, Alexander},
    title     = {Image Segmentation Using Text and Image Prompts},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {7086-7096}
}

nerfstudio,

@article{nerfstudio,
   author = {Tancik, Matthew and Weber, Ethan and Ng, Evonne and Li, Ruilong and Yi,
           Brent and Kerr, Justin and Wang, Terrance and Kristoffersen, Alexander and Austin,
           Jake and Salahi, Kamyar and Ahuja, Abhik and McAllister, David and Kanazawa, Angjoo},
   title = {Nerfstudio: A Modular Framework for Neural Radiance Field Development},
   journal = {arXiv preprint arXiv:2302.04264},
   year = {2023},
}

and LERF,

@article{kerr2023lerf,
  title={LERF: Language Embedded Radiance Fields},
  author={Kerr, Justin and Kim, Chung Min and Goldberg, Ken and Kanazawa, Angjoo and Tancik, Matthew},
  journal={arXiv preprint arXiv:2303.09553},
  year={2023}
}

Citation

If you find this project useful, please consider citing:

@misc{sam-nerf,
    Author = {Feng Wang and Zilong Chen and Huaping Liu},
    Year = {2023},
    Note = {https://github.com/WangFeng18/Explore-Sam-in-NeRF/tree/main},
    Title = {SamNeRF: A Simple Baseline for Segmenting Anything in NeRF with Language Prompts}
}
