luca-medeiros / lang-segment-anything

SAM with text prompt

License: Apache License 2.0

Python 0.50% Jupyter Notebook 99.47% Dockerfile 0.03%


lang-segment-anything's People

Contributors

bogay, dolhasz, egeozguroglu, healthonrails, kabirsubbiah, kauevestena, luca-medeiros, mistydragon7, mutusfa, rb-synth, rballachay, siddharthksah


lang-segment-anything's Issues

How to fine-tune on a custom dataset?

🚀 Feature

A clear and concise description of the feature proposal.

Motivation & Examples

Tell us why the feature is useful.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

<put sample here>

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

Please read & provide the following

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

  2. Full runnable code or full changes you made:

  3. What exact command you run:

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there are no obvious errors in the "full logs" provided above,
please tell us the expected behavior.

Environment:

  • I'm using the latest version!
  • It's not a user-side mistake!

name '_c' is not defined

Instructions To Reproduce the ๐Ÿ› Bug:

lightning run app app.py

then trigger predict

Get Error:
name '_c' is not defined

Expected behavior:

Environment:

  • Linux, conda
  • Python 3.8.10
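This error usually means GroundingDINO's compiled C++/CUDA extension failed to build or import (its deformable-attention module references a compiled _C extension). A minimal hedged check, assuming the extension is importable as groundingdino._C:

# Hedged check: verify GroundingDINO's compiled extension is present.
# If this raises ImportError, the custom ops were not built; reinstalling
# GroundingDINO with a matching CUDA toolchain (CUDA_HOME set) may help.
try:
    from groundingdino import _C  # noqa: F401
    print("GroundingDINO custom ops available")
except ImportError as exc:
    print("GroundingDINO custom ops missing:", exc)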

Can't share link

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

When I run this project on a server, I want to use the frontend from my own machine's browser.

I can't find how to configure launch(share=True) in lightning.app.
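One hedged workaround, if wiring share through lightning.app proves difficult, is to serve the model with plain Gradio, where share=True is a standard launch() argument. A minimal sketch (the interface components here are illustrative, not the app's actual layout):

import gradio as gr
from PIL import Image
from lang_sam import LangSAM

model = LangSAM()

def segment(image: Image.Image, prompt: str) -> str:
    masks, boxes, phrases, logits = model.predict(image, prompt)
    return f"{len(boxes)} box(es) found"  # placeholder output

demo = gr.Interface(fn=segment, inputs=[gr.Image(type="pil"), gr.Textbox()], outputs="text")
# share=True creates a temporary public gradio.live link;
# server_name="0.0.0.0" also exposes the local port to other machines.
demo.launch(share=True, server_name="0.0.0.0")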

model.predict() error

Hi, thanks for creating this awesome tool!

I have the following error when trying to predict your car example. My env is the following:

python==3.9.16
torch==2.0.0+cu117
torchvision==0.15.1+cu117
numpy==1.24.2
opencv_python==4.7.0.72
Pillow==9.3.0
transformers==4.27.4
lightning==2.0.1

This is the error message I get. Seems like something related to Grounding Dino:

RuntimeError                              Traceback (most recent call last)
Cell In[4], line 1
----> 1 masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)

File /opt/conda/envs/python39/lib/python3.9/site-packages/lang_sam/lang_sam.py:107, in LangSAM.predict(self, image_pil, text_prompt, box_threshold, text_threshold)
    106 def predict(self, image_pil, text_prompt, box_threshold=0.3, text_threshold=0.25):
--> 107     boxes, logits, phrases = self.predict_dino(image_pil, text_prompt, box_threshold, text_threshold)
    108     masks = torch.tensor([])
    109     if len(boxes) > 0:

File /opt/conda/envs/python39/lib/python3.9/site-packages/lang_sam/lang_sam.py:83, in LangSAM.predict_dino(self, image_pil, text_prompt, box_threshold, text_threshold)
     81 def predict_dino(self, image_pil, text_prompt, box_threshold, text_threshold):
     82     image_trans = transform_image(image_pil)
---> 83     boxes, logits, phrases = predict(model=self.groundingdino,
     84                                      image=image_trans,
     85                                      caption=text_prompt,
     86                                      box_threshold=box_threshold,
     87                                      text_threshold=text_threshold,
     88                                      device=self.device)
     89     W, H = image_pil.size
     90     boxes = box_ops.box_cxcywh_to_xyxy(boxes) * torch.Tensor([W, H, W, H])

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/util/inference.py:66, in predict(model, image, caption, box_threshold, text_threshold, device)
     63 image = image.to(device)
     65 with torch.no_grad():
---> 66     outputs = model(image[None], captions=[caption])
     68 prediction_logits = outputs["pred_logits"].cpu().sigmoid()[0]  # prediction_logits.shape = (nq, 256)
     69 prediction_boxes = outputs["pred_boxes"].cpu()[0]  # prediction_boxes.shape = (nq, 4)

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/groundingdino.py:289, in GroundingDINO.forward(self, samples, targets, **kw)
    287 if isinstance(samples, (list, torch.Tensor)):
    288     samples = nested_tensor_from_tensor_list(samples)
--> 289 features, poss = self.backbone(samples)
    291 srcs = []
    292 masks = []

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/backbone/backbone.py:151, in Joiner.forward(self, tensor_list)
    150 def forward(self, tensor_list: NestedTensor):
--> 151     xs = self[0](tensor_list)
    152     out: List[NestedTensor] = []
    153     pos = []

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/backbone/swin_transformer.py:716, in SwinTransformer.forward(self, tensor_list)
    713 x = tensor_list.tensors
    715 """Forward function."""
--> 716 x = self.patch_embed(x)
    718 Wh, Ww = x.size(2), x.size(3)
    719 if self.ape:
    720     # interpolate the position embedding to the corresponding size

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/groundingdino/models/GroundingDINO/backbone/swin_transformer.py:491, in PatchEmbed.forward(self, x)
    488 if H % self.patch_size[0] != 0:
    489     x = F.pad(x, (0, 0, 0, self.patch_size[0] - H % self.patch_size[0]))
--> 491 x = self.proj(x)  # B C Wh Ww
    492 if self.norm is not None:
    493     Wh, Ww = x.size(2), x.size(3)

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/conv.py:463, in Conv2d.forward(self, input)
    462 def forward(self, input: Tensor) -> Tensor:
--> 463     return self._conv_forward(input, self.weight, self.bias)

File /opt/conda/envs/python39/lib/python3.9/site-packages/torch/nn/modules/conv.py:459, in Conv2d._conv_forward(self, input, weight, bias)
    455 if self.padding_mode != 'zeros':
    456     return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    457                     weight, bias, self.stride,
    458                     _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
    460                 self.padding, self.dilation, self.groups)

RuntimeError: GET was unable to find an engine to execute this computation


ERROR: Could not build wheels for lang-sam, which is required to install pyproject.toml-based projects

Instructions To Reproduce the ๐Ÿ› Bug:

  1. I run the command
pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

but get the error :

Building wheels for collected packages: lang-sam, groundingdino, segment-anything
  Building wheel for lang-sam (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for lang-sam (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [30 lines of output]
      Traceback (most recent call last):
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/api.py", line 56, in build_wheel
          return WheelBuilder.make_in(
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/wheel.py", line 85, in make_in
          wb.build(target_dir=directory)
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/wheel.py", line 121, in build
          self._copy_module(zip_file)
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/wheel.py", line 232, in _copy_module
          to_add = self.find_files_to_add()
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/builder.py", line 198, in find_files_to_add
          if self.is_excluded(
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/builder.py", line 144, in is_excluded
          if exclude_path.as_posix() in self.find_excluded_files(fmt=self.format):
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/masonry/builders/builder.py", line 112, in find_excluded_files
          vcs_ignored_files = set(vcs.get_ignored_files())
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/vcs/git.py", line 340, in get_ignored_files
          output = self.run(*args)
        File "/tmp/pip-build-env-w3hevfp7/overlay/lib/python3.8/site-packages/poetry/core/vcs/git.py", line 372, in run
          subprocess.check_output(
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/subprocess.py", line 415, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
        File "/home/vismod/anaconda3/envs/segment/lib/python3.8/subprocess.py", line 516, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['git', '--git-dir', '14:55:53.637805 git.c:439               trace: built-in: git rev-parse --show-toplevel\n/tmp/pip-req-build-y5hh1atw/.git', '--work-tree', '14:55:53.637805 git.c:439               trace: built-in: git rev-parse --show-toplevel\n/tmp/pip-req-build-y5hh1atw', 'ls-files', '--others', '-i', '--exclude-standard']' returned non-zero exit status 128.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for lang-sam
  Building wheel for groundingdino (setup.py) ... done
  Created wheel for groundingdino: filename=groundingdino-0.1.0-cp38-cp38-linux_x86_64.whl size=3894296 sha256=babeb3fe546ebc022b53ce9e2cd9c2e6e2309792f384e48d5ba7d5bab3138703
  Stored in directory: /tmp/pip-ephem-wheel-cache-f7dkjaf5/wheels/d3/f5/df/db4a813287ee7ae962a814f62e14d1e48173cad8f2f9100e9b
  Building wheel for segment-anything (setup.py) ... done
  Created wheel for segment-anything: filename=segment_anything-1.0-py3-none-any.whl size=36587 sha256=9f826654d1f77bca918a052243539bf96c9467538d418690bf557b29a0c4bcf7
  Stored in directory: /tmp/pip-ephem-wheel-cache-f7dkjaf5/wheels/b0/7e/40/20f0b1e23280cc4a66dc8009c29f42cb4afc1b205bc5814786
Successfully built groundingdino segment-anything
Failed to build lang-sam
ERROR: Could not build wheels for lang-sam, which is required to install pyproject.toml-based projects

Expected behavior:

How can I fix the bug?

Environment:

Python: 3.8.16
torch: '2.0.0+cu117'
torchvision: '0.15.1+cu117'
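A likely root cause is visible in the CalledProcessError above: the --git-dir and --work-tree arguments are polluted with git trace output ("trace: built-in: git rev-parse --show-toplevel"), which suggests GIT_TRACE is enabled in the shell, so poetry parses the trace lines as paths. A hedged workaround is to disable tracing before installing:

unset GIT_TRACE
pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git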

Software conflict

Instructions To Reproduce the ๐Ÿ› Bug:

I tried to run pip install -e ., but there seem to be some conflicts. Please help me.


How to use a custom checkpoint correctly?

I have read this PR, and I wrote this code:

# 'VIT_H SAM Model/sam_vit_h_4b8939.pth' is my directory of model file
sam = LangSAM('VIT_H SAM Model/sam_vit_h_4b8939.pth')

And it crashed with this error:

self = <samgeo.text_sam.LangSAM object at 0x00000210B416E860>
model_type = 'VIT_H SAM Model/sam_vit_h_4b8939.pth'

    def build_sam(self, model_type):
        """Build the SAM model.
    
        Args:
            model_type (str, optional): The model type. It can be one of the following: vit_h, vit_l, vit_b.
                Defaults to 'vit_h'. See https://bit.ly/3VrpxUh for more details.
        """
>       checkpoint_url = SAM_MODELS[model_type]
E       KeyError: 'VIT_H SAM Model/sam_vit_h_4b8939.pth'

And even if I use sam = LangSAM(ckpt_path='VIT_H SAM Model/sam_vit_h_4b8939.pth') or sam = LangSAM('vit_h', 'VIT_H SAM Model/sam_vit_h_4b8939.pth'), it isn't effective either.
So how do I use a custom checkpoint correctly?

How to customize params

Hi,
The current parameters are Box threshold and Text threshold.
I want to add these parameters: points_per_side, pred_iou_thresh, stability_score_thresh, crop_n_layers. Is that possible?
Thank you!
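For reference, those four parameters belong to segment-anything's SamAutomaticMaskGenerator (the grid-prompted "segment everything" path) rather than to LangSAM's text-prompted predict. A minimal sketch of setting them directly, assuming a locally downloaded vit_h checkpoint:

from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Assumption: sam_vit_h_4b8939.pth was downloaded beforehand.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(
    sam,
    points_per_side=32,           # density of the point-prompt grid
    pred_iou_thresh=0.88,         # filter masks by predicted IoU
    stability_score_thresh=0.95,  # filter masks by stability score
    crop_n_layers=0,              # extra crop layers for large images
)
# masks = mask_generator.generate(image_as_uint8_rgb_numpy_array)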

Unable to run

I ran a simple example and got "Error". The terminal doesn't print any error messages.


Any ideas how to debug it? Thanks. I tried to set a breakpoint at predict(), but it isn't triggered when I click "Submit".

No module named 'lang_sam'

Instructions To Reproduce the Issue:

I already installed lang_sam following the installation instructions:

pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

pip also showing package installed

pip list | grep lang-sam
lang-sam                 0.1.0 

But code is giving error:

from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
image_pil = Image.open('./assets/car.jpeg').convert("RGB")
text_prompt = 'wheel'
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)

Error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [1], in <cell line: 2>()
      1 from  PIL  import  Image
----> 2 from lang_sam import LangSAM
      4 model = LangSAM()
      5 image_pil = Image.open('./assets/car.jpeg').convert("RGB")

ModuleNotFoundError: No module named 'lang_sam'
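A common cause of this pattern (pip lists the package but the import fails) is that pip installed into a different interpreter than the one running the notebook. A quick hedged check:

import sys

# The interpreter executing this notebook/script.
print(sys.executable)
# Compare with the interpreter pip uses (run in a shell):  pip -V
# If they differ, install into the right one:
#   python -m pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git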

Wrote a script to generate videos with segmentation

🚀 Feature

Generating a video

Motivation & Examples

I wanted to generate videos out of the set of output frames and wrote a script to do so. It works for me, and I wanted to contribute it to the repo.

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

How to run the model offline?

When I use this as a library, I cannot access the Internet; it keeps failing as it attempts to access huggingface.co. How can I adapt it to run offline?

Use as a library:

from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
Use with custom checkpoint:

First download a model checkpoint.

from PIL import Image
from lang_sam import LangSAM

model = LangSAM("<model_type>", "<path/to/checkpoint>")
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
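For the offline question above: transformers and huggingface_hub honor offline environment variables, so once the BERT files and the GroundingDINO checkpoint are cached locally (from one connected run), a hedged sketch is to force offline mode before constructing LangSAM, and to pass a local SAM checkpoint so nothing is fetched:

import os

# Assumption: the Hugging Face cache (~/.cache/huggingface) was populated
# on a connected machine and copied over.
os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no network calls
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: local cache only

from lang_sam import LangSAM

model = LangSAM("<model_type>", "<path/to/checkpoint>")  # local SAM checkpoint, no download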

ModuleNotFoundError: No module named 'torch'

I tried using pip and cloning the project, and I still have the same error, even though I have torch and torchvision installed.

Python 3.10.9
torch 2.0.1+cu118
torchvision 0.15.2+cu118
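This usually surfaces while pip builds the groundingdino dependency: its setup script imports torch at build time, and pip's isolated build environment does not contain the torch you already installed. A hedged workaround is to install torch first and then disable build isolation:

pip install -e . --no-build-isolation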

Custom Checkpoints

Hi @luca-medeiros,

Great work! I was using the model and thought it would be useful if LangSAM() accepted a SAM ckpt as an argument. This would allow for other checkpoints like HQ_SAM to be included more easily. Also, for people who have downloaded the model and are working in Colab, this would free up memory.

Happy to help!

Thanks,

Kabir

PR proposal dockerfile example

I've got the model running in a Docker container. Should I do a pull request? It might be helpful to others. Should I put it in the readme?

Run in Colab, then URL can't open

When I run: !lightning run app /content/lang-segment-anything/app.py

Your Lightning App is starting. This won't take long.
INFO: Your app has started. View it in your browser: http://127.0.0.1:7501/view
/usr/bin/xdg-open: 869: www-browser: not found
/usr/bin/xdg-open: 869: links2: not found
/usr/bin/xdg-open: 869: elinks: not found
/usr/bin/xdg-open: 869: links: not found
/usr/bin/xdg-open: 869: lynx: not found
/usr/bin/xdg-open: 869: w3m: not found
xdg-open: no method available for opening 'http://127.0.0.1:7501/view'
Downloading (…)ingDINO_SwinB.cfg.py: 100% 1.01k/1.01k [00:00<00:00, 1.03MB/s]
final text_encoder_type: bert-base-uncased
Downloading (…)okenizer_config.json: 100% 28.0/28.0 [00:00<00:00, 18.1kB/s]
Downloading (…)lve/main/config.json: 100% 570/570 [00:00<00:00, 383kB/s]
Downloading (…)solve/main/vocab.txt: 100% 232k/232k [00:00<00:00, 3.86MB/s]
Downloading (…)/main/tokenizer.json: 100% 466k/466k [00:00<00:00, 5.82MB/s]
Downloading model.safetensors: 100% 440M/440M [00:03<00:00, 132MB/s]
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Downloading (…)no_swinb_cogcoor.pth: 100% 938M/938M [00:29<00:00, 31.4MB/s]
    Model loaded from /root/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth
    => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
    Downloading: "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth" to /root/.cache/torch/hub/checkpoints/sam_vit_h_4b8939.pth
    100% 2.39G/2.39G [00:17<00:00, 149MB/s]
    Running on local URL: http://127.0.0.1:38649/

To create a public link, set share=True in launch().

But the URL is invalid.
Please help check; you could try it in your own Colab.

Limited power of currently working text prompt driven segmentation

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

I have images of objects with some thin extrusions (antennas). When I specify in the text prompt that I want the object with its antennas attached, the result mostly prunes out the antennas. It is a sequence of images of the same object for which I need to subtract the background. Also, in the few images where it does capture the antennas, the segmentation boundary is a bit imprecise and a thin sliver of background is visible around the antennas and around other thin-ish extrusions on the object. With some cavities that are naturally part of the object, bits of background visible through the cavity leak in a bit.

I am going to attempt some post-processing cleanup through contour detection to get a crisper, better segmentation. Meanwhile, is there a way to not have it cut out thin extrusions in the first place?
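For the post-processing idea, a minimal OpenCV sketch, assuming mask is one boolean HxW tensor from model.predict(): morphological closing can reconnect thin structures the mask almost covers, and keeping the largest contour drops stray background blobs.

import cv2
import numpy as np

# Assumption: `mask` is a boolean HxW torch tensor from model.predict().
mask_u8 = mask.numpy().astype(np.uint8) * 255

# Close small gaps along thin structures such as antennas.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
closed = cv2.morphologyEx(mask_u8, cv2.MORPH_CLOSE, kernel)

# Keep only the largest contour to suppress background leakage.
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
clean = np.zeros_like(closed)
if contours:
    largest = max(contours, key=cv2.contourArea)
    cv2.drawContours(clean, [largest], -1, 255, thickness=cv2.FILLED)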

Output image does not contain any detections

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

I am running the default app.py using the example images. When I use the same images and text prompts as in the example, there is no output; the result is just the same as the original image.

  2. What exact command you run:

lightning run app app.py

Expected behavior:

The output should be the same as in the documentation.

Environment:

  • I'm using the latest version!

How to use launch(share=True)

Instructions To Reproduce the ๐Ÿ› Bug:

When I run this project on a server, I want to use the frontend in my own machine's browser. I run lightning run app app.py, then open the link in my machine's browser and get an error: 127.0.0.1 refused to connect. Is there a way to handle this issue? I have made sure the service works in the server's own browser.

Custom SAM Model

How can I use a custom SAM model trained on custom medical segmentation data?

Cannot install requirements

pip install -e .

Obtaining file:///Users/rohanrony/Documents/codeEnv/sam/lang-segment-anything
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Collecting groundingdino@ git+ssh://git@github.com/IDEA-Research/GroundingDINO.git
Cloning ssh://****@github.com/IDEA-Research/GroundingDINO.git to /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93
Running command git clone --filter=blob:none --quiet 'ssh://****@github.com/IDEA-Research/GroundingDINO.git' /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/IDEA-Research/GroundingDINO.git' /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet 'ssh://****@github.com/IDEA-Research/GroundingDINO.git' /private/var/folders/n2/h8y026vd7b3dnls6bt6xq70r0000gn/T/pip-install-3mgmt814/groundingdino_cf5f9106ff824b6689ce9f7436d0ab93 did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
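The failing dependency is declared with an SSH GitHub URL, which requires a configured SSH key. A hedged workaround is to have git rewrite SSH GitHub URLs to HTTPS before installing:

git config --global url."https://github.com/".insteadOf ssh://git@github.com/
pip install -e .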

Infracost addition

🚀 Feature

A clear and concise description of the feature proposal.

Motivation & Examples

Tell us why the feature is useful.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

<put sample here>

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

How to make it run on GPU/CUDA?

When I try to run the model, the entire process takes about 20-30 seconds.

import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

Shows that I am using cuda, but I am expecting it to take less than 10 seconds on my V100 GPU.
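torch.cuda.is_available() only shows that CUDA exists; it does not prove the model weights live on the GPU. A hedged sanity check plus a simple timing (groundingdino is the attribute name visible in the tracebacks above; the first call includes warm-up, so time a second call for a fairer number):

import time
import torch
from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
# Which device do the GroundingDINO weights actually live on?
print(next(model.groundingdino.parameters()).device)

image_pil = Image.open("./assets/car.jpeg").convert("RGB")
start = time.perf_counter()
masks, boxes, phrases, logits = model.predict(image_pil, "wheel")
print(f"predict took {time.perf_counter() - start:.1f}s")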

fine-tune with labelled images

🚀 Feature

Very nice job! If some labelled images corresponding to the text could be fed into the model for fine-tuning, it would be even better. Looking forward to your improved version!

I am unable to import the module now.

Instructions To Reproduce the ๐Ÿ› Bug:

Forked the repo to my own account. Installed the module without errors. Then I ran the following:

from lang_sam import LangSAM

This immediately gave me an error, and the error received seems to have no obvious cause:

Error

ImportError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
1352 try:
-> 1353 return importlib.import_module("." + module_name, self.__name__)
1354 except Exception as e:

32 frames

ImportError: cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)

RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
1353 return importlib.import_module("." + module_name, self.__name__)
1354 except Exception as e:
-> 1355 raise RuntimeError(
1356 f"Failed to import {self.name}.{module_name} because of the following error (look up to see its"
1357 f" traceback):\n{e}"

RuntimeError: Failed to import transformers.models.bert.modeling_bert because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

Please help me resolve this at the earliest.
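This import chain usually means the installed huggingface_hub is older than what the installed transformers expects: split_torch_state_dict_into_shards only exists in newer huggingface_hub releases. A hedged fix is simply to upgrade it:

pip install -U huggingface_hub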

Model loading

Instructions To Reproduce the ๐Ÿ› Bug:

  1. I built the environment the same as in your repo.

  2. After running the running_test.py file, an error occurs as follows:

Traceback (most recent call last):
  File "/home/hs/sunwenhao/lang-segment-anything/lang_sam/lang_sam.py", line 66, in build_sam
    state_dict = torch.hub.load_state_dict_from_url(checkpoint_url)
  File "/home/hs/miniconda3/envs/grasp-env/lib/python3.9/site-packages/torch/hub.py", line 770, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location, weights_only=weights_only)
  File "/home/hs/miniconda3/envs/grasp-env/lib/python3.9/site-packages/torch/serialization.py", line 993, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/home/hs/miniconda3/envs/grasp-env/lib/python3.9/site-packages/torch/serialization.py", line 447, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/hs/sunwenhao/lang-segment-anything/running_test.py", line 9, in <module>
    model = LangSAM()
  File "/home/hs/sunwenhao/lang-segment-anything/lang_sam/lang_sam.py", line 56, in __init__
    self.build_sam(ckpt_path)
  File "/home/hs/sunwenhao/lang-segment-anything/lang_sam/lang_sam.py", line 69, in build_sam
    raise ValueError(f"Problem loading SAM please make sure you have the right model type: {self.sam_type} \
ValueError: Problem loading SAM please make sure you have the right model type: vit_h                     and a working checkpoint: https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth. Recommend deleting the checkpoint and re-downloading it.
  3. I tried to download the ckpt files again, but in vain.
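"failed finding central directory" almost always means the cached checkpoint file is truncated or corrupted (typically an interrupted download), matching the error message's own advice to delete and re-download. A hedged sketch for clearing the cached SAM weights so torch.hub fetches a fresh copy:

import os
import torch

# torch.hub caches downloads under <hub_dir>/checkpoints.
ckpt = os.path.join(torch.hub.get_dir(), "checkpoints", "sam_vit_h_4b8939.pth")
if os.path.exists(ckpt):
    print(f"removing {ckpt} ({os.path.getsize(ckpt)} bytes)")
    os.remove(ckpt)
# Re-instantiating LangSAM() will now trigger a clean download.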

TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

Looking in indexes: https://repo.huaweicloud.com/repository/pypi/simple
Collecting git+https://github.com/luca-medeiros/lang-segment-anything.git
Cloning https://github.com/luca-medeiros/lang-segment-anything.git to /tmp/pip-req-build-_5vr1f57
ERROR: Exception:
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
    status = run_func(*args)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 247, in wrapper
    return func(self, options, args)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 400, in run
    requirement_set = resolver.resolve(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 73, in resolve
    collected = self.factory.collect_root_requirements(root_reqs)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 491, in collect_root_requirements
    req = self._make_requirement_from_install_req(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 453, in _make_requirement_from_install_req
    cand = self._make_candidate_from_link(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 206, in _make_candidate_from_link
    self._link_candidate_cache[link] = LinkCandidate(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 297, in __init__
    super().__init__(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 162, in __init__
    self.dist = self._prepare()
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 231, in _prepare
    dist = self._prepare_distribution()
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 308, in _prepare_distribution
    return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 491, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 536, in _prepare_linked_requirement
    local_file = unpack_url(
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 155, in unpack_url
    unpack_vcs_link(link, location, verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/operations/prepare.py", line 78, in unpack_vcs_link
    vcs_backend.unpack(location, url=hide_url(link.url), verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/versioncontrol.py", line 608, in unpack
    self.obtain(location, url=url, verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/versioncontrol.py", line 521, in obtain
    self.fetch_new(dest, url, rev_options, verbosity=verbosity)
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 272, in fetch_new
    if self.get_git_version() >= (2, 17):
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 104, in get_git_version
    return tuple(int(c) for c in match.groups())
  File "/root/miniconda3/lib/python3.10/site-packages/pip/_internal/vcs/git.py", line 104, in <genexpr>
    return tuple(int(c) for c in match.groups())
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

Adding OWLViT/OWLV2 as options for the visual grounding part

🚀 Feature

Currently, the project uses GroundingDINO as the visual grounding model, which is the best-performing model on some benchmark datasets (see the current benchmarks for zero-shot object detection).
We could give the user the flexibility to choose between different visual grounding models, such as OFA, OWL-ViT, and OWLv2.

Motivation & Examples

Tell us why the feature is useful.
Since this project is about text-guided segmentation, adding the ability to choose the technique for the visual grounding pipeline seems like a natural addition.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

from PIL import Image
from lang_sam import LangSAM

# Initialize and select visual grounding model if desired. Default will be 'groundingdino'. Other options are 'ofa', 'owlvit', and 'owlv2'
model = LangSAM(model='groundingdino')
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
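For reference, OWL-ViT is already available in transformers, so a hedged sketch of what the 'owlvit' path could look like internally (model name and threshold are illustrative):

import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("./assets/car.jpeg").convert("RGB")
inputs = processor(text=[["wheel"]], images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert to absolute xyxy boxes, the same format LangSAM feeds to SAM.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(outputs, threshold=0.1, target_sizes=target_sizes)
boxes = results[0]["boxes"]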

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

KeyError: 'dataset'

When I tried to run segmentation by prompt through the web UI, the console showed the error below.

I use Ubuntu 22.04, with a 16G 3080 GPU and 32G RAM.

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/fastapi/routing.py", line 192, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/gradio/routes.py", line 271, in api_info
    return gradio.blocks.get_api_info(config, serialize)  # type: ignore
  File "/home/fqye/projects/lang-segment-anything/venv/lib/python3.10/site-packages/gradio/blocks.py", line 504, in get_api_info
    serializer = serializing.COMPONENT_MAPPING[type]()
KeyError: 'dataset'

Mask Shape Error after Changing Box Threshold and Text threshold

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

Mask Shape Error after Changing Box Threshold and Text threshold

Running on local URL:  http://127.0.0.1:58312

To create a public link, set `share=True` in `launch()`.
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/4111474954c6cdc3c501c06c99fa9648e2ac7618/tmp3c_6dacm.png hair
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/4111474954c6cdc3c501c06c99fa9648e2ac7618/tmprfwqx47d.png beard
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/32d00a9ce7d67b7364d9b931dd5b4e404af43296/tmp03k8nv96.png beard
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/32d00a9ce7d67b7364d9b931dd5b4e404af43296/tmpcpbo9923.png hair
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/c2451465d4950e1449122dc8a19dd30d994cf32c/tmped82lrpn.png beard
Predicting...  vit_h 0.3 0.25 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/e7fa26ab6688ecaebd018157977b3c5a1b8778fa/tmpp2mr3x0q.png beard
Predicting...  vit_h 0.5 0.35 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/e7fa26ab6688ecaebd018157977b3c5a1b8778fa/tmp3pfcrxgn.png beard
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/pegasus/Documents/lang-segment-anything/app.py", line 66, in predict
    image = draw_image(image_array, masks, boxes, labels)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/utils.py", line 18, in draw_image
    image = draw_segmentation_masks(image, masks=masks, colors=['cyan'] * len(boxes), alpha=alpha)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torchvision/utils.py", line 303, in draw_segmentation_masks
    raise ValueError("masks must be of shape (H, W) or (batch_size, H, W)")
ValueError: masks must be of shape (H, W) or (batch_size, H, W)
Predicting...  vit_h 0.5 0.35 /private/var/folders/4h/162gwqdn6rz_q7s726gct4_80000gp/T/e7fa26ab6688ecaebd018157977b3c5a1b8778fa/tmppp8wv2od.png beard
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/pegasus/Documents/lang-segment-anything/app.py", line 66, in predict
    image = draw_image(image_array, masks, boxes, labels)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/utils.py", line 18, in draw_image
    image = draw_segmentation_masks(image, masks=masks, colors=['cyan'] * len(boxes), alpha=alpha)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torchvision/utils.py", line 303, in draw_segmentation_masks
    raise ValueError("masks must be of shape (H, W) or (batch_size, H, W)")
ValueError: masks must be of shape (H, W) or (batch_size, H, W)


  2. Full runnable code or full changes you made:
lightning run app app.py

Changed Box Threshold to 0.5 from default
Changed Text threshold to 0.35 from default

  3. What exact command you run: lightning run app app.py

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there are no obvious errors in the "full logs" provided above,
please tell us the expected behavior.

Environment:

  • I'm using the latest version!
  • It's not a user-side mistake!
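The crash above happens because raising the thresholds filters out every box, so masks stays an empty tensor and torchvision's draw_segmentation_masks rejects its shape. A minimal hedged guard before the draw call in app.py (variable names follow the traceback above):

# Sketch of a guard before draw_image() in app.py.
if len(masks) == 0:
    # Nothing detected at these thresholds; return the unmodified input.
    return image_array
image = draw_image(image_array, masks, boxes, labels)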

Environment conflicts

I followed the instructions and used conda with environment.yml to create the environment and install packages, but numerous package version conflicts appear. Are there any updates for the environment?

Installation stopped working after GroundingDino Update

When trying to install the package following the steps in the documentation, you get conflicting dependencies between lang-sam and GroundingDINO.
The root of the issue seems to be that the requirements of GroundingDINO were recently changed to force an installation of supervision 0.21.0, which is incompatible with pillow<9.4.

The problem is that lang-sam 0.1.0 depends on Pillow==9.3.0.

I tried both the conda and pip installations, and it is the same for both.

Error Running Lightning App: PytorchStreamReader failed reading zip archive

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation

Error running lightning app on MacOS 13 on Apple M1 chip

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Full error copied below:

(sam) pegasus@peggy-mbp lang-segment-anything % lightning run app app.py
Your Lightning App is starting. This won't take long.
/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
INFO: Your app has started. View it in your browser: http://127.0.0.1:7501/view
/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/groundingdino/models/GroundingDINO/ms_deform_attn.py:31: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn("Failed to load custom C++ ops. Running on CPU mode Only!")
final text_encoder_type: bert-base-uncased
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Model loaded from /Users/pegasus/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth 
 => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])
Process SpawnProcess-2:
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 437, in __call__
    raise e
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 418, in __call__
    self.run_once()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 569, in run_once
    self.work.on_exception(e)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/core/work.py", line 625, in on_exception
    raise exception
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 534, in run_once
    ret = self.run_executor_cls(self.work, work_run, self.delta_queue)(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/utilities/proxies.py", line 367, in __call__
    return self.work_run(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/lightning/app/components/serve/gradio_server.py", line 77, in run
    self._model = self.build_model()
  File "/Users/pegasus/Documents/lang-segment-anything/app.py", line 71, in build_model
    model = LangSAM(sam_type)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/lang_sam.py", line 57, in __init__
    self.build_sam(sam_type)
  File "/Users/pegasus/Documents/lang-segment-anything/lang_sam/lang_sam.py", line 66, in build_sam
    sam = sam_model_registry[sam_type](checkpoint=sam_checkpoint)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/segment_anything/build_sam.py", line 15, in build_sam_vit_h
    return _build_sam(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/segment_anything/build_sam.py", line 105, in _build_sam
    state_dict = torch.load(f)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/opt/homebrew/Caskroom/miniforge/base/envs/sam/lib/python3.10/site-packages/torch/serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
INFO: Your Lightning App is being stopped. This won't take long.
INFO: Your Lightning App has been stopped successfully!
  2. Full runnable code or full changes you made:
git clone https://github.com/luca-medeiros/lang-segment-anything && cd lang-segment-anything
pip install torch torchvision
pip install -e .

  3. What exact command you run: lightning run app app.py

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

If there are no obvious errors in the "full logs" provided above,
please tell us the expected behavior.

Environment:

  • I'm using the latest version!
  • It's not a user-side mistake!

TypeError: predict() got an unexpected keyword argument 'remove_combined'

Hello,

First of all, thanks for the great work. I ran python3 running_test.py and got the following error:
TypeError: predict() got an unexpected keyword argument 'remove_combined'

Could you please suggest a possible solution for this? Thanks in advance.

@healthonrails @kauevestena @mutusfa @siddharthksah @dolhasz

Update: The script works fine without this argument. Could you please let me know how this argument influences the output of the model?

Does not support Python 3.10

  1. Background explanation

During installation I used Python 3.10 and got an error:

ERROR: Package 'lang-sam' requires a different Python: 3.11.5 not in '<3.11,>=3.8'

  2. Full runnable code or full changes you made:

I changed the dependency in the environment.yml file. I used the following

  • python=3.10.12=h955ad1f_0

instead of python 3.8

  3. What exact command you run:

pip install -U git+https://github.com/luca-medeiros/lang-segment-anything.git

grounding dino ckpt issue

Instructions To Reproduce the ๐Ÿ› Bug:

  1. Background explanation
    Running the running_test.py file creates empty masks.

  2. Full runnable code or full changes you made:
    No change

  3. What exact command you run:
    python running_test.py

  4. Please simplify the steps as much as possible so they do not require additional resources to
    run, such as a private dataset.

Expected behavior:

(lsa) ubuntu@ip-172-31-19-53:/mnt/efs-shared/sdasbisw/lang-segment-anything$ python running_test.py
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/torch/functional.py:512: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3587.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
Model loaded from /home/ubuntu/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight', 'bert.embeddings.position_ids'])
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/transformers/modeling_utils.py:907: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/torch/utils/checkpoint.py:464: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/home/ubuntu/anaconda3/envs/lsa/lib/python3.8/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
all ok

Issue: Model loaded from /home/ubuntu/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/a94c9b567a2a374598f05c584e96798a170c56fb/groundingdino_swinb_cogcoor.pth
=> _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight', 'bert.embeddings.position_ids'])

This causes blank masks to be generated.
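
For what it's worth, an _IncompatibleKeys result that lists only unexpected_keys (label_enc.weight, bert.embeddings.position_ids) is usually benign for GroundingDINO checkpoints, so the blank masks more likely come from the detector returning no boxes. A minimal sketch to localize the failing stage, assuming the LangSAM API shown earlier on this page and the sample image that running_test.py appears to use (the ./assets/car.jpeg path is an assumption):

from PIL import Image
from lang_sam import LangSAM

model = LangSAM()
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
masks, boxes, phrases, logits = model.predict(image_pil, "wheel")

# Zero boxes means GroundingDINO found nothing (try lowering box_threshold /
# text_threshold); boxes with all-empty masks would instead implicate SAM.
print(len(boxes), phrases, logits)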

Environment:

Env:

packages in environment at /home/ubuntu/anaconda3/envs/lsa:

Name Version Build Channel

_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
addict 2.4.0 pypi_0 pypi
aiofiles 23.2.1 pypi_0 pypi
aiohttp 3.9.5 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
altair 5.3.0 pypi_0 pypi
annotated-types 0.7.0 pypi_0 pypi
anyio 4.4.0 pypi_0 pypi
async-timeout 4.0.3 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
ca-certificates 2024.3.11 h06a4308_0
certifi 2024.6.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
contourpy 1.1.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
dnspython 2.6.1 pypi_0 pypi
email-validator 2.1.1 pypi_0 pypi
exceptiongroup 1.2.1 pypi_0 pypi
fastapi 0.111.0 pypi_0 pypi
fastapi-cli 0.0.4 pypi_0 pypi
ffmpy 0.3.2 pypi_0 pypi
filelock 3.14.0 pypi_0 pypi
fonttools 4.53.0 pypi_0 pypi
frozenlist 1.4.1 pypi_0 pypi
fsspec 2024.6.0 pypi_0 pypi
gradio 3.50.2 pypi_0 pypi
gradio-client 0.6.1 pypi_0 pypi
groundingdino 0.1.0 pypi_0 pypi
h11 0.14.0 pypi_0 pypi
httpcore 1.0.5 pypi_0 pypi
httptools 0.6.1 pypi_0 pypi
httpx 0.27.0 pypi_0 pypi
huggingface-hub 0.16.4 pypi_0 pypi
idna 3.7 pypi_0 pypi
importlib-metadata 7.1.0 pypi_0 pypi
importlib-resources 6.4.0 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
jsonschema 4.22.0 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
lang-sam 0.1.0 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
lightning 2.2.5 pypi_0 pypi
lightning-utilities 0.11.2 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.7.5 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
multidict 6.0.5 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.1 pypi_0 pypi
numpy 1.24.4 pypi_0 pypi
nvidia-cublas-cu12 12.1.3.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.1.105 pypi_0 pypi
nvidia-cudnn-cu12 8.9.2.26 pypi_0 pypi
nvidia-cufft-cu12 11.0.2.54 pypi_0 pypi
nvidia-curand-cu12 10.3.2.106 pypi_0 pypi
nvidia-cusolver-cu12 11.4.5.107 pypi_0 pypi
nvidia-cusparse-cu12 12.1.0.106 pypi_0 pypi
nvidia-nccl-cu12 2.20.5 pypi_0 pypi
nvidia-nvjitlink-cu12 12.5.40 pypi_0 pypi
nvidia-nvtx-cu12 12.1.105 pypi_0 pypi
opencv-python 4.10.0.82 pypi_0 pypi
opencv-python-headless 4.10.0.82 pypi_0 pypi
openssl 3.0.13 h7f8727e_2
orjson 3.10.3 pypi_0 pypi
packaging 24.0 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
pillow 9.3.0 pypi_0 pypi
pip 24.0 py38h06a4308_0
pkgutil-resolve-name 1.3.10 pypi_0 pypi
platformdirs 4.2.2 pypi_0 pypi
pycocotools 2.0.7 pypi_0 pypi
pydantic 2.7.3 pypi_0 pypi
pydantic-core 2.18.4 pypi_0 pypi
pydub 0.25.1 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
python 3.8.19 h955ad1f_0
python-dateutil 2.9.0.post0 pypi_0 pypi
python-dotenv 1.0.1 pypi_0 pypi
python-multipart 0.0.9 pypi_0 pypi
pytorch-lightning 2.2.5 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0
referencing 0.35.1 pypi_0 pypi
regex 2024.5.15 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
rich 13.7.1 pypi_0 pypi
rpds-py 0.18.1 pypi_0 pypi
safetensors 0.4.3 pypi_0 pypi
scipy 1.10.0 pypi_0 pypi
segment-anything 1.0 pypi_0 pypi
semantic-version 2.10.0 pypi_0 pypi
setuptools 69.5.1 py38h06a4308_0
shellingham 1.5.4 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sniffio 1.3.1 pypi_0 pypi
sqlite 3.45.3 h5eee18b_0
starlette 0.37.2 pypi_0 pypi
supervision 0.18.0 pypi_0 pypi
sympy 1.12.1 pypi_0 pypi
timm 1.0.3 pypi_0 pypi
tk 8.6.14 h39e8969_0
tokenizers 0.15.2 pypi_0 pypi
tomli 2.0.1 pypi_0 pypi
toolz 0.12.1 pypi_0 pypi
torch 2.3.1 pypi_0 pypi
torchmetrics 1.4.0.post0 pypi_0 pypi
torchvision 0.18.1 pypi_0 pypi
tqdm 4.66.4 pypi_0 pypi
transformers 4.35.2 pypi_0 pypi
triton 2.3.1 pypi_0 pypi
typer 0.12.3 pypi_0 pypi
typing-extensions 4.12.1 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
ujson 5.10.0 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
uvicorn 0.30.1 pypi_0 pypi
uvloop 0.19.0 pypi_0 pypi
watchfiles 0.22.0 pypi_0 pypi
websockets 11.0.3 pypi_0 pypi
wheel 0.43.0 py38h06a4308_0
xz 5.4.6 h5eee18b_1
yapf 0.40.2 pypi_0 pypi
yarl 1.9.4 pypi_0 pypi
zipp 3.19.2 pypi_0 pypi
zlib 1.2.13 h5eee18b_1

  • [x] I'm using the latest version!
  • [x] It's not a user-side mistake!

lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4

Which transformers version does lang-sam actually work with? It gives me this error:

The conflict is caused by:
    The user requested transformers
    lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4
    The user requested transformers<5.0.0,>=4.27.4
    lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4
    The user requested transformers>=4.27.4,<5.0.0
    lang-sam 0.1.0 depends on transformers<5.0.0 and >=4.27.4

This one is a little more explanatory, maybe?

The conflict is caused by:
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.35.2 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.35.1 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.35.0 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.34.0 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.33.1 depends on huggingface-hub>=0.14.0
    transformers 4.27.4 depends on huggingface-hub<1.0 and >=0.11.0
    lang-sam 0.1.0 depends on huggingface-hub<0.14.0 and >=0.13.4
    gradio 3.33.0 depends on huggingface-hub>=0.14.0

But I can't get transformers to install alongside lang-sam, even when I don't pin a version at all or when I use exactly the range it asks for.

I had no issues with transformers==4.26.1 for BLIP and the other ComfyUI components that use it, but now I'm stuck trying to add SAM prompt-based masking.
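
The underlying clash is that lang-sam 0.1.0 caps huggingface-hub below 0.14.0 while these gradio releases need at least 0.14.0, so no transformers pin can fix it on its own. A small stdlib-only sketch to print the pins your installed lang-sam actually declares before choosing versions by hand:

from importlib.metadata import requires, version

print("huggingface-hub:", version("huggingface-hub"))
# requires() returns the declared dependency strings of an installed package.
for req in requires("lang-sam") or []:
    if any(name in req for name in ("huggingface", "transformers", "gradio")):
        print(req)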

Speed improvement

🚀 Feature

Is there a way to speed up the segmentation process?

Motivation & Examples

I noticed that my machine was not utilised much (GPU utilization < 20%, RAM and CPU were used to an even lesser degree) but segmenting a single large image took > 90 seconds. That is way too long for segmenting a large number of images.

I would think there is enormous potential for speed-ups; one candidate is sketched below.
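
One low-effort avenue, under two assumptions (inference runs on a CUDA GPU, and fp16 autocast is numerically acceptable for both GroundingDINO and SAM on your images): load the model once, then wrap each prediction in torch.inference_mode() with autocast. Since low GPU utilization can also point at CPU-side preprocessing, profiling first is advisable.

import torch
from PIL import Image
from lang_sam import LangSAM

model = LangSAM()  # load once; reloading weights per image dominates runtime

def fast_predict(path, prompt):
    image = Image.open(path).convert("RGB")
    # inference_mode skips autograd bookkeeping; fp16 autocast reduces both
    # compute and memory traffic on CUDA.
    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.float16):
        return model.predict(image, prompt)

masks, boxes, phrases, logits = fast_predict("./assets/car.jpeg", "wheel")

Batching several images per call could help further, but the predict API above takes one image at a time, so that would need changes inside the library.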

In the lang_sam.py file, the model = build_model(args) call inside the load_model_hf function fails to execute

In the lang_sam.py file, I find that the model = build_model(args) call inside the load_model_hf function fails to execute. Why? Thank you.

# Imports this excerpt needs (huggingface_hub plus the groundingdino package):
from huggingface_hub import hf_hub_download
from groundingdino.models import build_model
from groundingdino.util.slconfig import SLConfig

def load_model_hf(repo_id, filename, ckpt_config_filename, device='cpu'):
    # Fetch the model config from the Hub and build GroundingDINO from it.
    cache_config_file = hf_hub_download(repo_id=repo_id, filename=ckpt_config_filename)
    args = SLConfig.fromfile(cache_config_file)
    model = build_model(args)
    args.device = device
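
A note that may help here: build_model(args) can only run if the groundingdino package imports cleanly in the first place, so this failure usually traces back to the GroundingDINO install rather than to load_model_hf itself. A hypothetical pre-flight check:

# If either import fails, build_model(args) inside load_model_hf cannot work,
# and the fix is reinstalling the groundingdino package, not editing lang_sam.py.
try:
    from groundingdino.models import build_model
    from groundingdino.util.slconfig import SLConfig
except ImportError as exc:
    raise SystemExit(f"groundingdino is not importable: {exc}")

print("groundingdino imports fine; build_model is", build_model)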
