luca-medeiros / lang-segment-anything

SAM with text prompt

License: Apache License 2.0

Python 0.50% Jupyter Notebook 99.47% Dockerfile 0.03%


lang-segment-anything's People

Contributors

bogay, dolhasz, egeozguroglu, healthonrails, kabirsubbiah, kauevestena, luca-medeiros, mistydragon7, mutusfa, rb-synth, rballachay, siddharthksah


lang-segment-anything's Issues

How to fine-tune on a custom dataset?

🚀 Feature

A clear and concise description of the feature proposal.

Motivation & Examples

Tell us why the feature is useful.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

<put sample here>

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

Please read & provide the following

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

  2. Full runnable code or full changes you made:

  3. What exact command you run:

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there are no obvious errors in the "full logs" provided above,
please tell us the expected behavior.

Environment:

  • I'm using the latest version!
  • It's not a user-side mistake!

name '_c' is not defined

Instructions To Reproduce the ๐Ÿ› Bug:

lightning run app app.py

then trigger predict

Get Error:
name '_c' is not defined

Expected behavior:

Environment:

  • Linux, conda
  • Python 3.8.10
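This error usually means GroundingDINO's compiled C++/CUDA extension failed to build or import (its deformable-attention module references a compiled _C extension). A minimal hedged check, assuming the extension is importable as groundingdino._C:

# Hedged check: verify GroundingDINO's compiled extension is present.
# If this raises ImportError, the custom ops were not built; reinstalling
# GroundingDINO with a matching CUDA toolchain (CUDA_HOME set) may help.
try:
    from groundingdino import _C  # noqa: F401
    print("GroundingDINO custom ops available")
except ImportError as exc:
    print("GroundingDINO custom ops missing:", exc)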

Can't share link

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

When I run this project on a server, I want to use the frontend from my own machine's browser.

I can't find how to configure launch(share=True) in lightning.app.
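One hedged workaround, if wiring share through lightning.app proves difficult, is to serve the model with plain Gradio, where share=True is a standard launch() argument. A minimal sketch (the interface components here are illustrative, not the app's actual layout):

import gradio as gr
from PIL import Image
from lang_sam import LangSAM

model = LangSAM()

def segment(image: Image.Image, prompt: str) -> str:
    masks, boxes, phrases, logits = model.predict(image, prompt)
    return f"{len(boxes)} box(es) found"  # placeholder output

demo = gr.Interface(fn=segment, inputs=[gr.Image(type="pil"), gr.Textbox()], outputs="text")
# share=True creates a temporary public gradio.live link;
# server_name="0.0.0.0" also exposes the local port to other machines.
demo.launch(share=True, server_name="0.0.0.0")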

model.predict() error

Hi, thanks for creating this awesome tool!

I have the following error when trying to predict your car example. My env is the following:

python==3.9.16
torch==2.0.0+cu117
torchvision==0.15.1+cu117
numpy==1.24.2
opencv_python==4.7.0.72
Pillow==9.3.0
transformers==4.27.4
lightning==2.0.1

This is the error message I get. Seems like something related to Grounding Dino:

RuntimeError                              Traceback (most recent call last)
Cell In[4], line 1
----> 1 masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)

File /opt/conda/envs/python39/lib/python3.9/site-packages/lang_sam/lang_sam.py:107, in LangSAM.predict(self, image_pil, text_prompt, box_threshold, text_threshold)
    106 def predict(self, image_pil, text_prompt, box_threshold=0.3, text_threshold=0.25):
--> 107     boxes, logits, phrases = self.predict_dino(image_pil, text_prompt, box_threshold, text_threshold)
    108     masks = torch.tensor([])
    109     if len(boxes) > 0:

File /opt/conda/envs/python39/lib/python3.9/site-packages/lang_sam/lang_sam.py:83, in LangSAM.predict_dino(self, image_pil, text_prompt, box_threshold, text_threshold)
     81 def predict_dino(self, image_pil, text_prompt, box_threshold, text_threshold):
     82     image_trans = transform_image(image_pil)
---> 83     boxes, logits, phrases = predict(model=self.groundingdino,
     84                                      image=image_trans,
     85                                      caption=text_prompt,
     86                                      box_threshold=box_threshold,
     87                                      text_threshold=text_threshold,
     88                                      device=self.device)
     89     W, H = image_pil.size
     90     boxes = box_ops.box_cxcywh_to_xyxy(boxes) * torch.Tensor([W, H, W, H])

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/util/inference.py:66, in predict(model, image, caption, box_threshold, text_threshold, device)
     63 image = image.to(device)
     65 with torch.no_grad():
---> 66     outputs = model(image[None], captions=[caption])
     68 prediction_logits = outputs["pred_logits"].cpu().sigmoid()[0]  # prediction_logits.shape = (nq, 256)
     69 prediction_boxes = outputs["pred_boxes"].cpu()[0]  # prediction_boxes.shape = (nq, 4)

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/groundingdino.py:289, in GroundingDINO.forward(self, samples, targets, **kw)
    287 if isinstance(samples, (list, torch.Tensor)):
    288     samples = nested_tensor_from_tensor_list(samples)
--> 289 features, poss = self.backbone(samples)
    291 srcs = []
    292 masks = []

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/backbone/backbone.py:151, in Joiner.forward(self, tensor_list)
    150 def forward(self, tensor_list: NestedTensor):
--> 151     xs = self[0](tensor_list)
    152     out: List[NestedTensor] = []
    153     pos = []

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/backbone/swin_transformer.py:716, in SwinTransformer.forward(self, tensor_list)
    713 x = tensor_list.tensors
    715 """Forward function."""
--> 716 x = self.patch_embed(x)
    718 Wh, Ww = x.size(2), x.size(3)
    719 if self.ape:
    720     # interpolate the position embedding to the corresponding size

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/backbone/swin_transformer.py:491, in PatchEmbed.forward(self, x)
    488 if H % self.patch_size[0] != 0:
    489     x = F.pad(x, (0, 0, 0, self.patch_size[0] - H % self.patch_size[0]))
--> 491 x = self.proj(x)  # B C Wh Ww
    492 if self.norm is not None:
    493     Wh, Ww = x.size(2), x.size(3)

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/conv.py:463, in Conv2d.forward(self, input)
    462 def forward(self, input: Tensor) -> Tensor:
--> 463     return self._conv_forward(input, self.weight, self.bias)

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/conv.py:459, in Conv2d._conv_forward(self, input, weight, bias)
    455 if self.padding_mode != 'zeros':
    456     return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    457                     weight, bias, self.stride,
    458                     _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
    460                 self.padding, self.dilation, self.groups)

RuntimeError: GET was unable to find an engine to execute this computation


ERROR: Could not build wheels for lang-sam, which is required to install pyproject.toml-based projects

Instructions To Reproduce the ๐Ÿ› Bug:

  1. I run the command
pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

but get the error :

Building wheels for collected packages: lang-sam, groundingdino, segment-anything
  Building wheel for lang-sam (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for lang-sam (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [30 lines of output]
      Traceback (most recent call last):
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/api.py", line 56, in build_wheel
          return WheelBuilder.make_in(
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/wheel.py", line 85, in make_in
          wb.build(target_dir=directory)
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/wheel.py", line 121, in build
          self._copy_module(zip_file)
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/wheel.py", line 232, in _copy_module
          to_add = self.find_files_to_add()
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/builder.py", line 198, in find_files_to_add
          if self.is_excluded(
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/builder.py", line 144, in is_excluded
          if exclude_path.as_posix() in self.find_excluded_files(fmt=self.format):
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/builder.py", line 112, in find_excluded_files
          vcs_ignored_files = set(vcs.get_ignored_files())
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/vcs/git.py", line 340, in get_ignored_files
          output = self.run(*args)
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/vcs/git.py", line 372, in run
          subprocess.check_output(
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/subprocess.py", line 415, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/subprocess.py", line 516, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['git', '--git-dir', '14:55:53.637805 git.c:439               trace: built-in: git rev-parse --show-toplevel\n/tmp/pip-req-build-y5hh1atw/.git', '--work-tree', '14:55:53.637805 git.c:439               trace: built-in: git rev-parse --show-toplevel\n/tmp/pip-req-build-y5hh1atw', 'ls-files', '--others', '-i', '--exclude-standard']' returned non-zero exit status 128.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for lang-sam
  Building wheel for groundingdino (setup.py) ... done
  Created wheel for groundingdino: filename=groundingdino-0.1.0-cp38-cp38-linux_x86_64.whl size=3894296 sha256=babeb3fe546ebc022b53ce9e2cd9c2e6e2309792f384e48d5ba7d5bab3138703
  Stored in directory: /tmp/pip-ephem-wheel-cache-f7dkjaf5/wheels/d3/f5/df/db4a813287ee7ae962a814f62e14d1e48173cad8f2f9100e9b
  Building wheel for segment-anything (setup.py) ... done
  Created wheel for segment-anything: filename=segment_anything-1.0-py3-none-any.whl size=36587 sha256=9f826654d1f77bca918a052243539bf96c9467538d418690bf557b29a0c4bcf7
  Stored in directory: /tmp/pip-ephem-wheel-cache-f7dkjaf5/wheels/b0/7e/40/20f0b1e23280cc4a66dc8009c29f42cb4afc1b205bc5814786
Successfully built groundingdino segment-anything
Failed to build lang-sam
ERROR: Could not build wheels for lang-sam, which is required to install pyproject.toml-based projects

Expected behavior:

How can I fix the bug?

Environment:

Python: 3.8.16
torch: '2.0.0+cu117'
torchvision: '0.15.1+cu117'
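A likely root cause is visible in the CalledProcessError above: the --git-dir and --work-tree arguments are polluted with git trace output ("trace: built-in: git rev-parse --show-toplevel"), which suggests GIT_TRACE is enabled in the shell, so poetry parses the trace lines as paths. A hedged workaround is to disable tracing before installing:

unset GIT_TRACE
pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git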

Software conflict

Instructions To Reproduce the ๐Ÿ› Bug:

I tried to run pip install -e ., but there seem to be some conflicts. Please help me.


How to use a custom checkpoint correctly?

I have read this PR, and I wrote this code:

# 'VIT_H SAM Model/sam_vit_h_4b8939.pth' is my directory of model file
sam = LangSAM('VIT_H SAM Model/sam_vit_h_4b8939.pth')

And it crashed with this error:

self = <samgeo.text_sam.LangSAM object at 0x00000210B416E860>
model_type = 'VIT_H SAM Model/sam_vit_h_4b8939.pth'

    def build_sam(self, model_type):
        """Build the SAM model.
    
        Args:
            model_type (str, optional): The model type. It can be one of the following: vit_h, vit_l, vit_b.
                Defaults to 'vit_h'. See https://bit.ly/3VrpxUh for more details.
        """
>       checkpoint_url = SAM_MODELS[model_type]
E       KeyError: 'VIT_H SAM Model/sam_vit_h_4b8939.pth'

And even if I use sam = LangSAM(ckpt_path='VIT_H SAM Model/sam_vit_h_4b8939.pth') or sam = LangSAM('vit_h', 'VIT_H SAM Model/sam_vit_h_4b8939.pth'), it isn't effective either.
So how do I use a custom checkpoint correctly?

How to customize params

Hi,
The current parameters are Box threshold and Text threshold.
I want to add these parameters: points_per_side, pred_iou_thresh, stability_score_thresh, crop_n_layers. Is that possible?
Thank you!
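For reference, those four parameters belong to segment-anything's SamAutomaticMaskGenerator (the grid-prompted "segment everything" path) rather than to LangSAM's text-prompted predict. A minimal sketch of setting them directly, assuming a locally downloaded vit_h checkpoint:

from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Assumption: sam_vit_h_4b8939.pth was downloaded beforehand.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(
    sam,
    points_per_side=32,           # density of the point-prompt grid
    pred_iou_thresh=0.88,         # filter masks by predicted IoU
    stability_score_thresh=0.95,  # filter masks by stability score
    crop_n_layers=0,              # extra crop layers for large images
)
# masks = mask_generator.generate(image_as_uint8_rgb_numpy_array)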

Unable to run

I ran a simple example and got "Error". The terminal doesn't print any error messages.


Any ideas how to debug it? Thanks. I tried to set a breakpoint at predict(), but it isn't triggered when I click "Submit".

No module named 'lang_sam'

Instructions To Reproduce the Issue:

I already installed lang_sam following the installation instructions:

pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

pip also showing package installed

pip list | grep lang-sam
lang-sam                 0.1.0 

But code is giving error:

from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
image_pil = Image.open('./assets/car.jpeg').convert("RGB")
text_prompt = 'wheel'
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)

Error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [1], in <cell line: 2>()
      1 from  PIL  import  Image
----> 2 from lang_sam import LangSAM
      4 model = LangSAM()
      5 image_pil = Image.open('./assets/car.jpeg').convert("RGB")

ModuleNotFoundError: No module named 'lang_sam'
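A common cause of this pattern (pip lists the package but the import fails) is that pip installed into a different interpreter than the one running the notebook. A quick hedged check:

import sys

# The interpreter executing this notebook/script.
print(sys.executable)
# Compare with the interpreter pip uses (run in a shell):  pip -V
# If they differ, install into the right one:
#   python -m pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git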

Wrote a script to generate videos with segmentation

🚀 Feature

Generating a video

Motivation & Examples

I wanted to generate videos out of the set of output frames and wrote a script to do so. It works for me, and I wanted to contribute it to the repo.

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

How to run the model offline?

When I use this as a library, I cannot access the Internet; it keeps failing as it attempts to access huggingface.co. How can I adapt it to run offline?

Use as a library:

from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
Use with custom checkpoint:

First download a model checkpoint.

from PIL import Image
from lang_sam import LangSAM

model = LangSAM("<model_type>", "<path/to/checkpoint>")
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
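For the offline question above: transformers and huggingface_hub honor offline environment variables, so once the BERT files and the GroundingDINO checkpoint are cached locally (from one connected run), a hedged sketch is to force offline mode before constructing LangSAM, and to pass a local SAM checkpoint so nothing is fetched:

import os

# Assumption: the Hugging Face cache (~/.cache/huggingface) was populated
# on a connected machine and copied over.
os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no network calls
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: local cache only

from lang_sam import LangSAM

model = LangSAM("<model_type>", "<path/to/checkpoint>")  # local SAM checkpoint, no download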

ModuleNotFoundError: No module named 'torch'

I tried using pip and cloning the project, and I still have the same error, even though I have torch and torchvision installed.

Python 3.10.9
torch 2.0.1+cu118
torchvision 0.15.2+cu118
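This usually surfaces while pip builds the groundingdino dependency: its setup script imports torch at build time, and pip's isolated build environment does not contain the torch you already installed. A hedged workaround is to install torch first and then disable build isolation:

pip install -e . --no-build-isolation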

Custom Checkpoints

Hi @luca-medeiros,

Great work! I was using the model and thought it would be useful if LangSAM() accepted a SAM ckpt as an argument. This would allow for other checkpoints like HQ_SAM to be included more easily. Also, for people who have downloaded the model and are working in Colab, this would free up memory.

Happy to help!

Thanks,

Kabir

PR proposal dockerfile example

I've got the model running in a Docker container. Should I do a pull request? It might be helpful to others. Should I put it in the readme?

Run in Colab, then URL can't open

When I run: !lightning run app /content/lang-segment-anything/app.py

Your Lightning App is starting. This won't take long.
INFO: Your app has started. View it in your browser: http://127.0.0.1:7501/view
/usr/bin/xdg-open: 869: www-browser: not found
/usr/bin/xdg-open: 869: links2: not found
/usr/bin/xdg-open: 869: elinks: not found
/usr/bin/xdg-open: 869: links: not found
/usr/bin/xdg-open: 869: lynx: not found
/usr/bin/xdg-open: 869: w3m: not found
xdg-open: no method available for opening 'http://127.0.0.1:7501/view'
Downloading (…)ingDINO_SwinB.cfg.py: 100% 1.01k/1.01k [00:00<00:00, 1.03MB/s]
final text_encoder_type: bert-base-uncased
Downloading (…)okenizer_config.json: 100% 28.0/28.0 [00:00<00:00, 18.1kB/s]
Downloading (…)lve/main/config.json: 100% 570/570 [00:00<00:00, 383kB/s]
Downloading (…)solve/main/vocab.txt: 100% 232k/232k [00:00<00:00, 3.86MB/s]
Downloading (…)/main/tokenizer.json: 100% 466k/466k [00:00<00:00, 5.82MB/s]
Downloading model.safetensors: 100% 440M/440M [00:03<00:00, 132MB/s]
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Downloading (…)no_swinb_cogcoor.pth: 100% 938M/938M [00:29<00:00, 31.4MB/s]
    Model loaded from /root/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth
    => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
    Downloading: "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth" to /root/.cache/torch/hub/checkpoints/sam_vit_h_4b8939.pth
    100% 2.39G/2.39G [00:17<00:00, 149MB/s]
    Running on local URL: http://127.0.0.1:38649/

To create a public link, set share=True in launch().

But the URL is invalid.
Please help check; you could try it in your own Colab.

Limited power of currently working text prompt driven segmentation

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

I have images of objects with some thin extrusions (antennas). When I specify in the text prompt that I want the object with its antennas attached, the result mostly prunes out the antennas. It is a sequence of images of the same object for which I need to subtract the background. Also, in the few images where it does capture the antennas, the segmentation boundary is a bit imprecise and a thin sliver of background is visible around the antennas and around other thin-ish extrusions on the object. With some cavities that are naturally part of the object, bits of background visible through the cavity leak in a bit.

I am going to attempt some post-processing cleanup through contour detection to get a crisper, better segmentation. Meanwhile, is there a way to not have it cut out thin extrusions in the first place?
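For the post-processing idea, a minimal OpenCV sketch, assuming mask is one boolean HxW tensor from model.predict(): morphological closing can reconnect thin structures the mask almost covers, and keeping the largest contour drops stray background blobs.

import cv2
import numpy as np

# Assumption: `mask` is a boolean HxW torch tensor from model.predict().
mask_u8 = mask.numpy().astype(np.uint8) * 255

# Close small gaps along thin structures such as antennas.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
closed = cv2.morphologyEx(mask_u8, cv2.MORPH_CLOSE, kernel)

# Keep only the largest contour to suppress background leakage.
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
clean = np.zeros_like(closed)
if contours:
    largest = max(contours, key=cv2.contourArea)
    cv2.drawContours(clean, [largest], -1, 255, thickness=cv2.FILLED)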

Output image does not contain any detections

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

I am running the default app.py using the example images. When I use the same images and text prompts as in the example, there is no output; the result is just the same as the original image.

  2. What exact command you run:

lightning run app app.py

Expected behavior:

The output should be the same as in the documentation.

Environment:

  • I'm using the latest version!

How to use launch(share=True)

Instructions To Reproduce the ๐Ÿ› Bug:

When I run this project on a server, I want to use the frontend in my own machine's browser. I run lightning run app app.py, then open the link in my machine's browser and get an error: 127.0.0.1 refused to connect. Is there a way to handle this issue? I have made sure the service works in the server's own browser.

Custom SAM Model

How can I use a custom SAM model trained on custom medical segmentation data?

Cannot install requirements

pip install -e .

Obtaining file:///Users/rohanrony/Documents/codeEnv/sam/lang-segment-anything
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Collecting groundingdino@ git+ssh://git@github.com/IDEA-Research/GroundingDINO.git
Cloning ssh://****@github.com/IDEA-Research/GroundingDINO.git to /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93
Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/IDEA-Research/GroundingDINO.git' /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/IDEA-Research/GroundingDINO.git' /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/IDEA-Research/GroundingDINO.git' /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
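The failing dependency is declared with an SSH GitHub URL, which requires a configured SSH key. A hedged workaround is to have git rewrite SSH GitHub URLs to HTTPS before installing:

git config --global url."https://github.com/".insteadOf ssh://git@github.com/
pip install -e .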

Infracost addition

🚀 Feature

A clear and concise description of the feature proposal.

Motivation & Examples

Tell us why the feature is useful.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

<put sample here>

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

How to make it run on GPU/CUDA?

When I try to run the model, the entire process takes about 20-30 seconds.

import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

Shows that I am using cuda, but I am expecting it to take less than 10 seconds on my V100 GPU.
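torch.cuda.is_available() only shows that CUDA exists; it does not prove the model weights live on the GPU. A hedged sanity check plus a simple timing (groundingdino is the attribute name visible in the tracebacks above; the first call includes warm-up, so time a second call for a fairer number):

import time
import torch
from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
# Which device do the GroundingDINO weights actually live on?
print(next(model.groundingdino.parameters()).device)

image_pil = Image.open("./assets/car.jpeg").convert("RGB")
start = time.perf_counter()
masks, boxes, phrases, logits = model.predict(image_pil, "wheel")
print(f"predict took {time.perf_counter() - start:.1f}s")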

fine-tune with labelled images

🚀 Feature

Very nice job! If some labelled images corresponding to the text could be fed into the model for fine-tuning, it would be even better. Looking forward to your improved version!

I am unable to import the module now.

Instructions To Reproduce the ๐Ÿ› Bug:

Forked the repo to my own account. Installed the module without errors. Then I ran the following:

from lang_sam import LangSAM

This immediately gave me an error, and the error received seems to have no obvious cause:

Error

ImportError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
1352 try:
-> 1353 return importlib.import_module("." + module_name, self.__name__)
1354 except Exception as e:

32 frames

ImportError: cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)

RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
1353 return importlib.import_module("." + module_name, self.__name__)
1354 except Exception as e:
-> 1355 raise RuntimeError(
1356 f"Failed to import {self.name}.{module_name} because of the following error (look up to see its"
1357 f" traceback):\n{e}"

RuntimeError: Failed to import transformers.models.bert.modeling_bert because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

Please help me resolve this at the earliest.
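This import chain usually means the installed huggingface_hub is older than what the installed transformers expects: split_torch_state_dict_into_shards only exists in newer huggingface_hub releases. A hedged fix is simply to upgrade it:

pip install -U huggingface_hub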

Model loading

Instructions To Reproduce the ๐Ÿ› Bug:

  1. I built the environment the same as in your repo.

  2. After running the running_test.py file, an error occurs as follows:

Traceback (most recent call last):
  File "/home/hs/sunwenhao/lang-segment-anything/lang_sam/lang_sam.py", line 66, in build_sam
    state_dict = torch.hub.load_state_dict_from_url(checkpoint_url)
  File "/home/hs/miniconda3/envs/grasp-env/lib/python3.9/site-packages/torch/hub.py", line 770, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location, weights_only=weights_only)
  File "/home/hs/miniconda3/envs/grasp-env/lib/python3.9/site-packages/torch/serialization.py", line 993, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/home/hs/miniconda3/envs/grasp-env/lib/python3.9/site-packages/torch/serialization.py", line 447, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hs/sunwenhao/lang-segment-anything/running_test.py", line 9, in <module>
    model = LangSAM()
  File "/home/hs/sunwenhao/lang-segment-anything/lang_sam/lang_sam.py", line 56, in __init__
    self.build_sam(ckpt_path)
  File "/home/hs/sunwenhao/lang-segment-anything/lang_sam/lang_sam.py", line 69, in build_sam
    raise ValueError(f"Problem loading SAM please make sure you have the right model type: {self.sam_type} \
ValueError: Problem loading SAM please make sure you have the right model type: vit_h                     and a working checkpoint: https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth. Recommend deleting the checkpoint and re-downloading it.
  3. I tried to download the ckpt files again, but in vain.
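"failed finding central directory" almost always means the cached checkpoint file is truncated or corrupted (typically an interrupted download), matching the error message's own advice to delete and re-download. A hedged sketch for clearing the cached SAM weights so torch.hub fetches a fresh copy:

import os
import torch

# torch.hub caches downloads under <hub_dir>/checkpoints.
ckpt = os.path.join(torch.hub.get_dir(), "checkpoints", "sam_vit_h_4b8939.pth")
if os.path.exists(ckpt):
    print(f"removing {ckpt} ({os.path.getsize(ckpt)} bytes)")
    os.remove(ckpt)
# Re-instantiating LangSAM() will now trigger a clean download.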

TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

Looking in indexes: https://repo.huaweicloud.com/repository/pypi/simple
Collecting git+https://github.com/luca-medeiros/lang-segment-anything.git
Cloning https://github.com/luca-medeiros/lang-segment-anything.git to /tmp/pip-req-build-_5vr1f57
ERROR: Exception:
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
    status = run_func(*args)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 247, in wrapper
    return func(self, options, args)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 400, in run
    requirement_set = resolver.resolve(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 73, in resolve
    collected = self.factory.collect_root_requirements(root_reqs)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 491, in collect_root_requirements
    req = self._make_requirement_from_install_req(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 453, in _make_requirement_from_install_req
    cand = self._make_candidate_from_link(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 206, in _make_candidate_from_link
    self._link_candidate_cache[link] = LinkCandidate(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 297, in __init__
    super().__init__(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 162, in __init__
    self.dist = self._prepare()
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 231, in _prepare
    dist = self._prepare_distribution()
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 308, in _prepare_distribution
    return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 491, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 536, in _prepare_linked_requirement
    local_file = unpack_url(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 155, in unpack_url
    unpack_vcs_link(link, location, verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 78, in unpack_vcs_link
    vcs_backend.unpack(location, url=hide_url(link.url), verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/versioncontrol.py", line 608, in unpack
    self.obtain(location, url=url, verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/versioncontrol.py", line 521, in obtain
    self.fetch_new(dest, url, rev_options, verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 272, in fetch_new
    if self.get_git_version() >= (2, 17):
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 104, in get_git_version
    return tuple(int(c) for c in match.groups())
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 104, in <genexpr>
    return tuple(int(c) for c in match.groups())
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

Adding OWLViT/OWLV2 as options for the visual grounding part

🚀 Feature

Currently, the project uses GroundingDINO as the visual grounding model, which is the best-performing model on some benchmark datasets (see the current benchmarks for zero-shot object detection).
We could give the user the flexibility to choose between different visual grounding models, such as OFA, OWL-ViT, and OWLv2.

Motivation & Examples

Tell us why the feature is useful.
Since this project is about text-guided segmentation, adding the ability to choose the technique for the visual grounding pipeline seems like a natural addition.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

from PIL import Image
from lang_sam import LangSAM

# Initialize and select visual grounding model if desired. Default will be 'groundingdino'. Other options are 'ofa', 'owlvit', and 'owlv2'
model = LangSAM(model='groundingdino')
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
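For reference, OWL-ViT is already available in transformers, so a hedged sketch of what the 'owlvit' path could look like internally (model name and threshold are illustrative):

import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("./assets/car.jpeg").convert("RGB")
inputs = processor(text=[["wheel"]], images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert to absolute xyxy boxes, the same format LangSAM feeds to SAM.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(outputs, threshold=0.1, target_sizes=target_sizes)
boxes = results[0]["boxes"]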

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

KeyError: 'dataset'

When I tried to run segmentation by prompt through the web UI, the console showed the error below.

I use Ubuntu 22.04, with a 16G 3080 GPU and 32G RAM.

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/routing.py", line 192, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/gradio/routes.py", line 271, in api_info
    return gradio.blocks.get_api_info(config, serialize)  # type: ignore
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/gradio/blocks.py", line 504, in get_api_info
    serializer = serializing.COMPONENT_MAPPING[type]()
KeyError: 'dataset'

Mask Shape Error after Changing Box Threshold and Text threshold

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

Mask Shape Error after Changing Box Threshold and Text threshold

Running on local URL:  http://127.0.0.1:58312

To create a public link, set `share=True` in `launch()`.
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/4111474954c6cdc3c501c06c99fa9648e2ac7618/tmp3c_6dacm.png hair
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/4111474954c6cdc3c501c06c99fa9648e2ac7618/tmprfwqx47d.png beard
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/32d00a9ce7d67b7364d9b931dd5b4e404af43296/tmp03k8nv96.png beard
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/32d00a9ce7d67b7364d9b931dd5b4e404af43296/tmpcpbo9923.png hair
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/c2451465d4950e1449122dc8a19dd30d994cf32c/tmped82lrpn.png beard
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/e7fa26ab6688ecaebd018157977b3c5a1b8778fa/tmpp2mr3x0q.png beard
Predicting...  vit_h 0.5 0.35 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/e7fa26ab6688ecaebd018157977b3c5a1b8778fa/tmp3pfcrxgn.png beard
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/pegasus/Documents/lang-segment-anything/app.py", line 66, in predict
    image = draw_image(image_array, masks, boxes, labels)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/utils.py", line 18, in draw_image
    image = draw_segmentation_masks(image, masks=masks, colors=['cyan'] * len(boxes), alpha=alpha)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torchvision/utils.py", line 303, in draw_segmentation_masks
    raise ValueError("masks must be of shape (H, W) or (batch_size, H, W)")
ValueError: masks must be of shape (H, W) or (batch_size, H, W)
Predicting...  vit_h 0.5 0.35 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/e7fa26ab6688ecaebd018157977b3c5a1b8778fa/tmppp8wv2od.png beard
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/pegasus/Documents/lang-segment-anything/app.py", line 66, in predict
    image = draw_image(image_array, masks, boxes, labels)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/utils.py", line 18, in draw_image
    image = draw_segmentation_masks(image, masks=masks, colors=['cyan'] * len(boxes), alpha=alpha)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torchvision/utils.py", line 303, in draw_segmentation_masks
    raise ValueError("masks must be of shape (H, W) or (batch_size, H, W)")
ValueError: masks must be of shape (H, W) or (batch_size, H, W)


  2. Full runnable code or full changes you made:
lightning run app app.py

Changed Box Threshold to 0.5 from default
Changed Text threshold to 0.35 from default

  3. What exact command you run: lightning run app app.py

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there are no obvious errors in the "full logs" provided above,
please tell us the expected behavior.

Environment:

  • I'm using the latest version!
  • It's not a user-side mistake!
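The crash above happens because raising the thresholds filters out every box, so masks stays an empty tensor and torchvision's draw_segmentation_masks rejects its shape. A minimal hedged guard before the draw call in app.py (variable names follow the traceback above):

# Sketch of a guard before draw_image() in app.py.
if len(masks) == 0:
    # Nothing detected at these thresholds; return the unmodified input.
    return image_array
image = draw_image(image_array, masks, boxes, labels)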

Environment conflicts

I followed the instructions and used conda with environment.yml to create the environment and install packages, but numerous package version conflicts appear. Are there any updates for the environment?

Installation stopped working after GroundingDino Update

When trying to install the package following the steps in the documentation, you get conflicting dependencies between lang-sam and GroundingDINO.
The root of the issue seems to be that the requirements of GroundingDINO were recently changed to force an installation of supervision 0.21.0, which is incompatible with pillow<9.4.

The problem is that lang-sam 0.1.0 depends on Pillow==9.3.0.

I tried both the conda and pip installations, and it is the same for both.

Error Running Lightning App: PytorchStreamReader failed reading zip archive

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

Error running lightning app on MacOS 13 on Apple M1 chip

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Full error copied below:

(sam) pegasus@peggy-mbp lang-segment-anything % lightning run app app.py
Your Lightning App is starting. This won't take long.
/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
INFO: Your app has started. View it in your browser: http://127.0.0.1:7501/view
/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
final text_encoder_type: bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Model loaded from /Users/pegasus/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth 
 => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
Process SpawnProcess-2:
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 437, in __call__
    raise e
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 418, in __call__
    self.run_once()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 569, in run_once
    self.work.on_exception(e)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/core/work.py", line 625, in on_exception
    raise exception
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 534, in run_once
    ret = self.run_executor_cls(self.work, work_run, self.delta_queue)(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 367, in __call__
    return self.work_run(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/components/serve/gradio_server.py", line 77, in run
    self._model = self.build_model()
  File "/Users/pegasus/Documents/lang-segment-anything/app.py", line 71, in build_model
    model = LangSAM(sam_type)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/lang_sam.py", line 57, in __init__
    self.build_sam(sam_type)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/lang_sam.py", line 66, in build_sam
    sam = sam_model_registry[sam_type](checkpoint=sam_checkpoint)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/segment_anything/build_sam.py", line 15, in build_sam_vit_h
    return _build_sam(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/segment_anything/build_sam.py", line 105, in _build_sam
    state_dict = torch.load(f)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
INFO: Your Lightning App is being stopped. This won't take long.
INFO: Your Lightning App has been stopped successfully!
  2. Full runnable code or full changes you made:
git clone https://github.com/luca-medeiros/lang-segment-anything && cd lang-segment-anything
pip install torch torchvision
pip install -e .

  3. What exact command you run: lightning run app app.py

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there are no obvious errors in the "full logs" provided above,
please tell us the expected behavior.

Environment:

  • I'm using the latest version!
  • It's not a user-side mistake!

TypeError: predict() got an unexpected keyword argument 'remove_combined'

Hello,

First of all, thanks for the great work. I ran python3 running_test.py and got the following error:
TypeError: predict() got an unexpected keyword argument 'remove_combined'

Could you please suggest a possible solution for this? Thanks in advance.

@healthonrails @kauevestena @mutusfa @siddharthksah @dolhasz

Update: The script works fine without this argument. Could you please let me know how this argument influences the output of the model?

Does not support Python 3.10

  1. Background explanation

During installation I used Python 3.10 and got an error:

ERROR: Package 'lang-sam' requires a different Python: 3.11.5 not in '<3.11,>=3.8'

  2. Full runnable code or full changes you made:

I changed the dependency in the environment.yml file. I used the following

  • python=3.10.12=h955ad1f_0

instead of python 3.8

  3. What exact command you run:

pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

grounding dino ckpt issue

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation
    Running the running_test.py file creates empty masks.

  2. Full runnable code or full changes you made:
    No change

  3. What exact command you run:
    python running_test.py

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

(lsa) ubuntu@ip-172-31-19-53:/mnt/efs-shared/sdasbisw/lang-segment-anything$ python running_test.py
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/torch/functional.py:512: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3587.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
Model loaded from /home/ubuntu/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight', 'bert.embeddings.position_ids'])
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/transformers/modeling_utils.py:907: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/torch/utils/checkpoint.py:464: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
all ok

Issue: Model loaded from /home/ubuntu/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth
=> _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight', 'bert.embeddings.position_ids'])

This causes blank masks to be generated.
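
For what it's worth, an _IncompatibleKeys result that lists only unexpected_keys (label_enc.weight, bert.embeddings.position_ids) is usually benign for GroundingDINO checkpoints, so the blank masks more likely come from the detector returning no boxes. A minimal sketch to localize the failing stage, assuming the LangSAM API shown earlier on this page and the sample image that running_test.py appears to use (the ./assets/car.jpeg path is an assumption):

from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
masks, boxes, phrases, logits = model.predict(image_pil, "wheel")

# Zero boxes means GroundingDINO found nothing (try lowering box_threshold /
# text_threshold); boxes with all-empty masks would instead implicate SAM.
print(len(boxes), phrases, logits)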

Environment:

Env:

packages in environment at /home/ubuntu/anaconda3/envs/lsa:

Name Version Build Channel

_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
addict 2.4.0 pypi_0 pypi
aiofiles 23.2.1 pypi_0 pypi
aiohttp 3.9.5 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
altair 5.3.0 pypi_0 pypi
annotated-types 0.7.0 pypi_0 pypi
anyio 4.4.0 pypi_0 pypi
async-timeout 4.0.3 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
ca-certificates 2024.3.11 h06a4308_0
certifi 2024.6.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
contourpy 1.1.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
dnspython 2.6.1 pypi_0 pypi
email-validator 2.1.1 pypi_0 pypi
exceptiongroup 1.2.1 pypi_0 pypi
fastapi 0.111.0 pypi_0 pypi
fastapi-cli 0.0.4 pypi_0 pypi
ffmpy 0.3.2 pypi_0 pypi
filelock 3.14.0 pypi_0 pypi
fonttools 4.53.0 pypi_0 pypi
frozenlist 1.4.1 pypi_0 pypi
fsspec 2024.6.0 pypi_0 pypi
gradio 3.50.2 pypi_0 pypi
gradio-client 0.6.1 pypi_0 pypi
groundingdino 0.1.0 pypi_0 pypi
h11 0.14.0 pypi_0 pypi
httpcore 1.0.5 pypi_0 pypi
httptools 0.6.1 pypi_0 pypi
httpx 0.27.0 pypi_0 pypi
huggingface-hub 0.16.4 pypi_0 pypi
idna 3.7 pypi_0 pypi
importlib-metadata 7.1.0 pypi_0 pypi
importlib-resources 6.4.0 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
jsonschema 4.22.0 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
lang-sam 0.1.0 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
lightning 2.2.5 pypi_0 pypi
lightning-utilities 0.11.2 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.7.5 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
multidict 6.0.5 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.1 pypi_0 pypi
numpy 1.24.4 pypi_0 pypi
nvidia-cublas-cu12 12.1.3.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.1.105 pypi_0 pypi
nvidia-cudnn-cu12 8.9.2.26 pypi_0 pypi
nvidia-cufft-cu12 11.0.2.54 pypi_0 pypi
nvidia-curand-cu12 10.3.2.106 pypi_0 pypi
nvidia-cusolver-cu12 11.4.5.107 pypi_0 pypi
nvidia-cusparse-cu12 12.1.0.106 pypi_0 pypi
nvidia-nccl-cu12 2.20.5 pypi_0 pypi
nvidia-nvjitlink-cu12 12.5.40 pypi_0 pypi
nvidia-nvtx-cu12 12.1.105 pypi_0 pypi
opencv-python 4.10.0.82 pypi_0 pypi
opencv-python-headless 4.10.0.82 pypi_0 pypi
openssl 3.0.13 h7f8727e_2
orjson 3.10.3 pypi_0 pypi
packaging 24.0 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
pillow 9.3.0 pypi_0 pypi
pip 24.0 py38h06a4308_0
pkgutil-resolve-name 1.3.10 pypi_0 pypi
platformdirs 4.2.2 pypi_0 pypi
pycocotools 2.0.7 pypi_0 pypi
pydantic 2.7.3 pypi_0 pypi
pydantic-core 2.18.4 pypi_0 pypi
pydub 0.25.1 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
python 3.8.19 h955ad1f_0
python-dateutil 2.9.0.post0 pypi_0 pypi
python-dotenv 1.0.1 pypi_0 pypi
python-multipart 0.0.9 pypi_0 pypi
pytorch-lightning 2.2.5 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0
referencing 0.35.1 pypi_0 pypi
regex 2024.5.15 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
rich 13.7.1 pypi_0 pypi
rpds-py 0.18.1 pypi_0 pypi
safetensors 0.4.3 pypi_0 pypi
scipy 1.10.0 pypi_0 pypi
segment-anything 1.0 pypi_0 pypi
semantic-version 2.10.0 pypi_0 pypi
setuptools 69.5.1 py38h06a4308_0
shellingham 1.5.4 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sniffio 1.3.1 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
starlette 0.37.2 pypi_0 pypi
supervision 0.18.0 pypi_0 pypi
sympy 1.12.1 pypi_0 pypi
timm 1.0.3 pypi_0 pypi
tk 8.6.14 h39e8969_0
tokenizers 0.15.2 pypi_0 pypi
tomli 2.0.1 pypi_0 pypi
toolz 0.12.1 pypi_0 pypi
torch 2.3.1 pypi_0 pypi
torchmetrics 1.4.0.post0 pypi_0 pypi
torchvision 0.18.1 pypi_0 pypi
tqdm 4.66.4 pypi_0 pypi
transformers 4.35.2 pypi_0 pypi
triton 2.3.1 pypi_0 pypi
typer 0.12.3 pypi_0 pypi
typing-extensions 4.12.1 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
ujson 5.10.0 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
uvicorn 0.30.1 pypi_0 pypi
uvloop 0.19.0 pypi_0 pypi
watchfiles 0.22.0 pypi_0 pypi
websockets 11.0.3 pypi_0 pypi
wheel 0.43.0 py38h06a4308_0
xz 5.4.6 h5eee18b_1
yapf 0.40.2 pypi_0 pypi
yarl 1.9.4 pypi_0 pypi
zipp 3.19.2 pypi_0 pypi
zlib 1.2.13 h5eee18b_1

  • [x] I'm using the latest version!
  • [x] It's not a user-side mistake!

lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4

Which transformers version does lang-sam actually work with? It gives me this error:

The conflict is caused by:
    The user requested transformers
    lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4
    The user requested transformers<5.0.0,>=4.27.4
    lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4
    The user requested transformers>=4.27.4,<5.0.0
    lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4

This one is a little more explanatory, maybe?

The conflict is caused by:
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.35.2 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.35.1 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.35.0 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.34.0 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.33.1 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.33.0 depends on huggingface-hub>=0.14.0

But I can't get transformers to install alongside lang-sam, even when I don't pin a version at all or when I use exactly the range it asks for.

I had no issues with transformers==4.26.1 for BLIP and the other ComfyUI components that use it, but now I'm stuck trying to add SAM prompt-based masking.
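
The underlying clash is that lang-sam 0.1.0 caps huggingface-hub below 0.14.0 while these gradio releases need at least 0.14.0, so no transformers pin can fix it on its own. A small stdlib-only sketch to print the pins your installed lang-sam actually declares before choosing versions by hand:

from importlib.metadata import requires, version

print("huggingface-hub:", version("huggingface-hub"))
# requires() returns the declared dependency strings of an installed package.
for req in requires("lang-sam") or []:
    if any(name in req for name in ("huggingface", "transformers", "gradio")):
        print(req)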

Speed improvement

🚀 Feature

Is there a way to speed up the segmentation process?

Motivation & Examples

I noticed that my machine was not utilised much (GPU utilization < 20%, RAM and CPU were used to an even lesser degree) but segmenting a single large image took > 90 seconds. That is way too long for segmenting a large number of images.

I would think there is enormous potential for speed-ups; one candidate is sketched below.
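
One low-effort avenue, under two assumptions (inference runs on a CUDA GPU, and fp16 autocast is numerically acceptable for both GroundingDINO and SAM on your images): load the model once, then wrap each prediction in torch.inference_mode() with autocast. Since low GPU utilization can also point at CPU-side preprocessing, profiling first is advisable.

import torch
from PIL import Image
from lang_sam import LangSAM

model = LangSAM()  # load once; reloading weights per image dominates runtime

def fast_predict(path, prompt):
    image = Image.open(path).convert("RGB")
    # inference_mode skips autograd bookkeeping; fp16 autocast reduces both
    # compute and memory traffic on CUDA.
    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.float16):
        return model.predict(image, prompt)

masks, boxes, phrases, logits = fast_predict("./assets/car.jpeg", "wheel")

Batching several images per call could help further, but the predict API above takes one image at a time, so that would need changes inside the library.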

In the lang_sam.py file, the model = build_model(args) call inside the load_model_hf function fails to execute

In the lang_sam.py file, I find that the model = build_model(args) call inside the load_model_hf function fails to execute. Why? Thank you.

# Imports this excerpt needs (huggingface_hub plus the groundingdino package):
from huggingface_hub import hf_hub_download
from groundingdino.models import build_model
from groundingdino.util.slconfig import SLConfig

def load_model_hf(repo_id, filename, ckpt_config_filename, device='cpu'):
    # Fetch the model config from the Hub and build GroundingDINO from it.
    cache_config_file = hf_hub_download(repo_id=repo_id, filename=ckpt_config_filename)
    args = SLConfig.fromfile(cache_config_file)
    model = build_model(args)
    args.device = device
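
A note that may help here: build_model(args) can only run if the groundingdino package imports cleanly in the first place, so this failure usually traces back to the GroundingDINO install rather than to load_model_hf itself. A hypothetical pre-flight check:

# If either import fails, build_model(args) inside load_model_hf cannot work,
# and the fix is reinstalling the groundingdino package, not editing lang_sam.py.
try:
    from groundingdino.models import build_model
    from groundingdino.util.slconfig import SLConfig
except ImportError as exc:
    raise SystemExit(f"groundingdino is not importable: {exc}")

print("groundingdino imports fine; build_model is", build_model)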
