
Comments (10)

insop commented on May 28, 2024

Maybe this is the main cause, rather than the runtime warning from sqrt:

 ValueError: Expected input batch_size (510) to match target batch_size (512).

from upsnet.

JoyHuYY1412 commented on May 28, 2024


I ran into this error as well, but it does not happen when I use multiple GPUs.


YuwenXiong commented on May 28, 2024

When you change the number of GPUs from 4 to 1, you also need to reduce the learning rate (lr) by 4x and increase the number of iterations by 4x.
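
The scaling rule above can be sketched in a few lines of Python. This is only an illustration: `rescale_schedule` is a hypothetical helper, not part of UPSNet, and it assumes the per-GPU batch size stays the same so that the total batch size scales with the number of GPUs.

```python
def rescale_schedule(lr, max_iteration, decay_iteration, old_gpus, new_gpus):
    """Linearly rescale lr and iteration counts when the GPU count changes.

    Hypothetical helper for illustration; assumes the per-GPU batch size is
    unchanged, so the total batch size is proportional to the GPU count.
    """
    return {
        # Fewer GPUs -> smaller total batch -> proportionally smaller lr.
        "lr": lr * new_gpus / old_gpus,
        # Fewer images per step -> proportionally more steps.
        "max_iteration": max_iteration * old_gpus // new_gpus,
        "decay_iteration": [s * old_gpus // new_gpus for s in decay_iteration],
    }


# Values taken from the 4-GPU config discussed in this thread.
schedule = rescale_schedule(
    lr=0.005,
    max_iteration=360000,
    decay_iteration=[240000, 320000],
    old_gpus=4,
    new_gpus=1,
)
print(schedule)
```

Any other setting that counts iterations (for example `test_iteration`) would need the same factor applied by hand.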


insop commented on May 28, 2024

With the change below, I was able to get training running after my post.

Thank you for confirming it, and for the quick reply.

--- upsnet/experiments/upsnet_resnet50_coco_4gpu.yaml	2019-05-11 15:21:57.000000000 -0700
+++ upsnet/experiments/upsnet_resnet50_coco_1gpu.yaml	2019-05-12 00:37:26.000000000 -0700
@@ -2,7 +2,7 @@
 output_path: "./output/upsnet/coco"
 model_prefix: "upsnet_resnet_50_coco_"
 symbol: resnet_50_upsnet
-gpus: '0,1,2,3'
+gpus: '0'
 dataset:
   num_classes: 81
   num_seg_classes: 133
@@ -32,12 +32,12 @@
   snapshot_step: 2000
   resume: false
   begin_iteration: 0
-  max_iteration: 360000
+  max_iteration: 720000
   decay_iteration:
-  - 240000
-  - 320000
+  - 480000
+  - 640000
   warmup_iteration: 1500
-  lr: 0.005
+  lr: 0.0025
   wd: 0.0001
   momentum: 0.9
   batch_size: 1
@@ -54,7 +54,7 @@
   - 800
   max_size: 1333
   batch_size: 1
-  test_iteration: 360000
+  test_iteration: 720000
   panoptic_stuff_area_limit: 4096
   vis_mask: false


YuwenXiong commented on May 28, 2024

Changing the number of iterations and lr by 2x may not reproduce the result I reported: you changed the batch size from 4 to 1, and batch size, lr, and number of iterations should all be scaled to match.


insop commented on May 28, 2024

Right, thank you. I have updated the config to scale by 4x:

--- upsnet/experiments/upsnet_resnet50_coco_4gpu.yaml	2019-05-11 15:21:57.000000000 -0700
+++ upsnet/experiments/upsnet_resnet50_coco_1gpu.yaml	2019-05-12 05:58:38.000000000 -0700
@@ -2,7 +2,7 @@
 output_path: "./output/upsnet/coco"
 model_prefix: "upsnet_resnet_50_coco_"
 symbol: resnet_50_upsnet
-gpus: '0,1,2,3'
+gpus: '0'
 dataset:
   num_classes: 81
   num_seg_classes: 133
@@ -32,12 +32,12 @@
   snapshot_step: 2000
   resume: false
   begin_iteration: 0
-  max_iteration: 360000
+  max_iteration: 1440000
   decay_iteration:
-  - 240000
-  - 320000
+  - 960000
+  - 1280000
   warmup_iteration: 1500
-  lr: 0.005
+  lr: 0.00125
   wd: 0.0001
   momentum: 0.9
   batch_size: 1
@@ -54,7 +54,7 @@
   - 800
   max_size: 1333
   batch_size: 1
-  test_iteration: 360000
+  test_iteration: 1440000
   panoptic_stuff_area_limit: 4096
   vis_mask: false


andyhahaha commented on May 28, 2024

Hi, I encountered a similar error.
I changed the backbone to PeleeNet and trained with 4 GPUs,
but some elements of feat_id are NaN.

feat_id = np.clip(np.floor(2 + np.log2(np.sqrt(w * h) / 224 + 1e-6)), 0, 3)

This happens because some proposed RoIs have x1 > x2 or y1 > y2, which makes w < 0 or h < 0.
When w * h is negative, np.sqrt returns NaN, and the NaN propagates through np.log2.

I have tried smaller learning rates (0.0025 and 0.00125), but it still happens.
Does anyone know how to solve this problem?
Thanks!
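
A minimal sketch that reproduces the NaN and shows one possible guard. The function names and the `(N, 4)` RoI layout of `x1, y1, x2, y2` are assumptions for illustration, not UPSNet's actual code, and clamping only hides the symptom: it does not explain why the RPN emits inverted boxes in the first place.

```python
import numpy as np


def feat_id_unguarded(rois):
    """Level assignment as quoted above; rois is assumed to be (N, 4): x1, y1, x2, y2."""
    w = rois[:, 2] - rois[:, 0]
    h = rois[:, 3] - rois[:, 1]
    # Silence the RuntimeWarning that np.sqrt raises for negative input.
    with np.errstate(invalid="ignore"):
        return np.clip(np.floor(2 + np.log2(np.sqrt(w * h) / 224 + 1e-6)), 0, 3)


def feat_id_clamped(rois):
    """Same formula, but inverted boxes are clamped to zero width/height first."""
    w = np.maximum(rois[:, 2] - rois[:, 0], 0)
    h = np.maximum(rois[:, 3] - rois[:, 1], 0)
    # With w * h >= 0 the sqrt is defined, and the 1e-6 term keeps log2 finite.
    return np.clip(np.floor(2 + np.log2(np.sqrt(w * h) / 224 + 1e-6)), 0, 3)


rois = np.array(
    [
        [0.0, 0.0, 224.0, 224.0],  # valid box
        [10.0, 10.0, 5.0, 50.0],   # inverted box: x1 > x2, so w < 0
    ]
)
print(feat_id_unguarded(rois))
print(feat_id_clamped(rois))
```

With the clamp, a degenerate box falls into the lowest level instead of producing NaN.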


weixianghong commented on May 28, 2024


I ran into the same issue. After I changed the backbone to ResNeXt-101, the RPN produces proposals with negative width or height, which causes NaN.
Have you solved it? Could you point me in the right direction?


YuwenXiong commented on May 28, 2024

@andyhahaha @weixianghong Please note that we used pretrained weights converted from Caffe, which expect different image preprocessing from torchvision models. Please set use_caffe_model to false if you want to use models with torchvision-style preprocessing.
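
For readers unfamiliar with the difference, here is a rough sketch of the two conventions. It is not UPSNet's actual preprocessing code: the function names are hypothetical, the Caffe-style mean is the BGR pixel mean commonly used with Caffe-converted detection backbones (assumed here), and the torchvision values are the standard ImageNet mean and std from the torchvision documentation.

```python
import numpy as np

# Assumed values, for illustration only.
CAFFE_BGR_MEAN = np.array([102.9801, 115.9465, 122.7717], dtype=np.float32)
TORCHVISION_RGB_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
TORCHVISION_RGB_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_caffe_style(rgb_uint8):
    """BGR channel order, 0-255 range, per-channel mean subtraction only."""
    bgr = rgb_uint8[..., ::-1].astype(np.float32)
    return bgr - CAFFE_BGR_MEAN


def preprocess_torchvision_style(rgb_uint8):
    """RGB channel order, 0-1 range, per-channel mean/std normalization."""
    rgb = rgb_uint8.astype(np.float32) / 255.0
    return (rgb - TORCHVISION_RGB_MEAN) / TORCHVISION_RGB_STD


# A flat gray image lands in very different value ranges under each scheme.
image = np.full((2, 2, 3), 128, dtype=np.uint8)
print(preprocess_caffe_style(image)[0, 0])
print(preprocess_torchvision_style(image)[0, 0])
```

Feeding a backbone inputs in the range it was not trained on can plausibly destabilize the RPN, which would be consistent with the inverted boxes reported above.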


weixianghong commented on May 28, 2024


It works, thank you!

