We aim to classify the output masks of Segment Anything (SAM) with off-the-shelf CLIP models: the image crop corresponding to each mask is sent to the CLIP model for classification.
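The per-mask cropping step described above can be sketched as follows. This is a minimal NumPy sketch under our own assumptions (`crop_to_mask` is a hypothetical helper, not part of this repo); in the real pipeline each crop would then go through the `clip` package's `preprocess` and `model.encode_image`.

```python
import numpy as np

def crop_to_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Crop `image` (H, W, 3) to the bounding box of a boolean `mask` (H, W),
    zeroing out pixels outside the mask so CLIP sees only the segment."""
    ys, xs = np.where(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    crop = image[y0:y1, x0:x1].copy()
    crop[~mask[y0:y1, x0:x1]] = 0  # blank out background pixels inside the box
    return crop

# toy example: a 4x4 RGB image with a mask covering a 2x2 square
image = np.arange(48).reshape(4, 4, 3)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
crop = crop_to_mask(image, mask)
print(crop.shape)  # (2, 2, 3)
```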
Related projects:
- maxi-w/CLIP-SAM
- Curt-Park/segment-anything-with-clip
- kadirnar/segment-anything-video
- fudan-zvg/Semantic-Segment-Anything
- continue-revolution/sd-webui-segment-anything
- RockeyCoss/Prompt-Segment-Anything
- ttengwang/Caption-Anything
- ngthanhtin/owlvit_segment_anything
- lang-segment-anything
- helblazer811/RefSAM
- Hedlen/awesome-segment-anything
- ziqi-jin/finetune-anything
- ylqi/Count-Anything
- We plan to connect segment-anything with MaskCLIP.
- We plan to finetune on the COCO and LVIS datasets.
Download the sam_vit_h_4b8939.pth model from the SAM repository and put it at ./SAM-CLIP/. Then install the segment-anything and clip packages with the following commands:
```
cd SAM-CLIP; pip install -e .
pip install git+https://github.com/openai/CLIP.git
```
Then run the following script:
```
sh run.sh
```
We feed an example image and a point prompt at (250, 250) to the SAM model. The input image and the three output masks are shown below:
The three masks and their predicted categories are as follows:
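One way the per-mask category can be predicted is by comparing the CLIP image embedding of each crop against the text embeddings of candidate labels and taking the most similar one. Below is a minimal sketch with placeholder (random) features standing in for the outputs of CLIP's `encode_image`/`encode_text`, since the real models require the downloaded checkpoints:

```python
import numpy as np

def predict_category(image_feat, text_feats, labels):
    """Return the label whose text embedding has the highest cosine
    similarity with the mask crop's image embedding."""
    img = image_feat / np.linalg.norm(image_feat)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = txt @ img  # cosine similarity per label
    return labels[int(np.argmax(sims))]

# placeholder 512-d features (CLIP ViT-B/32 embedding size)
rng = np.random.default_rng(0)
labels = ["dog", "cat", "car"]
text_feats = rng.normal(size=(3, 512))
image_feat = text_feats[1] + 0.1 * rng.normal(size=512)  # built to lie near "cat"
print(predict_category(image_feat, text_feats, labels))  # "cat" by construction
```

With the real models, `image_feat` would come from `model.encode_image(preprocess(crop))` and `text_feats` from `model.encode_text(clip.tokenize(labels))`.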
You can change the point location at L273-274 of scripts/amp_points.py:
```python
## input points
input_points_list = [[250, 250]]
label_list = [1]
```
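The labels follow SAM's point-prompt convention (1 = foreground, 0 = background), so multiple points can be supplied as long as the two lists stay aligned. For example, assuming the same list format as above:

```python
# two prompts: a foreground point (label 1) and a background point (label 0)
input_points_list = [[250, 250], [100, 400]]
label_list = [1, 0]
assert len(input_points_list) == len(label_list)
```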