Nice work porting the model. I found that your evaluation code is wrong. You are evalu


Wrong Evaluation script about pytorch-deeplab-resnet HOT 12 CLOSED

isht7 commented on May 27, 2024

Wrong Evaluation script

from pytorch-deeplab-resnet.

Comments (12)

swamiviv commented on May 27, 2024

Sure. Everything else being the same, this is how each image is processed and evaluated. The only difference I see is that the image and GT are read and written using PIL, to exactly preserve the range and the RGB channel order. Let me know if you find anything strange here.

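        # Fragment from inside the per-image loop. np, os, torch, nn, Variable, Image (PIL), model,
        # vgg_mean, fast_hist, hist, num_classes, im_path and gt_path are defined elsewhere in the script.
        img = np.zeros((513, 513, 3))  # zero-padded 513x513 input canvas
        # Read with PIL (RGB), convert to float32, then reverse the channel axis.
        img_temp = np.asarray(Image.open(os.path.join(im_path, i[:-1] + '.jpg')), dtype=np.float32)[:, :, ::-1]
        img_original = img_temp.copy()  # copy, so the mean subtraction below does not modify it
        img_temp -= vgg_mean
        img[:img_temp.shape[0], :img_temp.shape[1], :] = img_temp

        gt = np.asarray(Image.open(os.path.join(gt_path, i[:-1] + '.png')), dtype=np.uint8)

        # volatile=True disables gradient tracking (pre-0.4 PyTorch API).
        output_list = model(Variable(torch.from_numpy(img[np.newaxis, :].transpose(0, 3, 1, 2)).float(), volatile=True).cuda())
        interp = nn.UpsamplingBilinear2d(size=(513, 513))
        output = interp(output_list[3]).cpu().data[0].numpy()
        # Crop the padding away, move classes to the last axis, and take the per-pixel argmax.
        output = output[:, :img_temp.shape[0], :img_temp.shape[1]]
        output = output.transpose(1, 2, 0)
        output = np.argmax(output, axis=2)
        hist += fast_hist(gt.flatten(), output.flatten(), num_classes)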


isht7 commented on May 27, 2024

Yes, I found out about the mistake in the eval script just recently and I will be fixing it soon. Thank you for pointing this out. The mIoU of 72.1% could be due to issue #4 (which will also be fixed soon). There is one more difference between the Caffe version and the PyTorch version during training: the Caffe version scales images by one of a fixed set of scales (0.5, 0.75, 1, 1.25, 1.5), while the PyTorch version randomly picks a scale between 0.5 and 1.3. Randomly picking a scale between 0.5 and 1.5 does not fit in memory when training on a Titan X.
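The two scale-selection strategies described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `pick_fixed_scale`, `pick_random_scale` and `scaled_size` are hypothetical and are not taken from either code base, and the 321x321 input size is made up for the example.

```python
import random

# Fixed scales listed for the Caffe version in the comment above.
CAFFE_SCALES = (0.5, 0.75, 1.0, 1.25, 1.5)


def pick_fixed_scale(rng):
    """Choose one of the fixed scales (hypothetical helper)."""
    return rng.choice(CAFFE_SCALES)


def pick_random_scale(rng, low=0.5, high=1.3):
    """Draw a scale uniformly from [low, high] (hypothetical helper)."""
    return rng.uniform(low, high)


def scaled_size(height, width, scale):
    """Image size after rescaling, rounded to whole pixels."""
    return int(round(height * scale)), int(round(width * scale))


rng = random.Random(0)  # seeded only to make the sketch repeatable
fixed = pick_fixed_scale(rng)
rand = pick_random_scale(rng)
print(scaled_size(321, 321, fixed), scaled_size(321, 321, rand))
```

At scale 1.5 an image has 2.25 times as many pixels as at scale 1.0, against 1.69 times at scale 1.3, which is consistent with the larger range not fitting on the GPU.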


fanq15 commented on May 27, 2024

Is this bug fixed?


isht7 commented on May 27, 2024

The fixed evaluation script is in the dev branch. The path of the new script is given in the Results section of the README.


isht7 commented on May 27, 2024

The evaluation script has been fixed now. evalpyt2.py is the correct script. The old script, evalpyt.py, is still there to maintain continuity, and the difference between the two is described in the Results section of the README. We get 71.13% mean IOU from the PyTorch-trained model. train_iter_20000.caffemodel gives 74.39%. The converted .pth model also gives 74.39%. The README has also been updated with scripts to verify each of these performance claims. Please note that in the ground truth images, the label 255 is merged with the background during evaluation, because this was also done during training.
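Merging the 255 label into the background, as described above, amounts to a relabelling step on each ground-truth array. A minimal sketch, assuming background is class 0; the helper name `merge_ignore_into_background` and the toy array are hypothetical and are not taken from evalpyt2.py:

```python
import numpy as np

IGNORE_LABEL = 255  # boundary label in the ground-truth PNGs
BACKGROUND = 0      # assumption: background is class 0


def merge_ignore_into_background(gt):
    """Return a copy of gt with every 255 pixel relabelled as background."""
    merged = gt.copy()
    merged[merged == IGNORE_LABEL] = BACKGROUND
    return merged


# Toy 2x3 label map, made up for illustration.
gt = np.array([[0, 15, 255],
               [255, 15, 0]], dtype=np.uint8)
merged = merge_ignore_into_background(gt)
print(merged)
```

After this step the boundary pixels count as background pixels, so any non-background prediction on them is scored as an error.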


isht7 commented on May 27, 2024

@swamiviv you said (and table 4 of the paper also reports) that the .caffemodel gives 76.3% on the val set, but I am getting only 74.39%. Why could this be? Is it because I am merging the boundary (255) label with the background? Are you able to get 76.3% yourself? I am using this script. At the end, I print 2 values; the second one is the one that should be considered.


swamiviv commented on May 27, 2024

Thanks for correcting the eval code! It makes sense that you are getting a lower value, since the evaluation script merges the 255 value with bg. You can just leave the 255 value as is in the labels when evaluating. If you look at the 'fast_hist' function, you will find that pixels with class 255 are ignored automatically. I tried using the .caffemodel in Caffe and got the same result. I haven't tried the modified PyTorch model yet; I will give it a try soon. But if you redo the evaluation as I described here, I am fairly confident you will be able to reproduce those numbers.
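Why leaving 255 in the labels makes those pixels drop out can be seen from a common way of writing such a histogram function. The sketch below assumes the repository's `fast_hist` follows the widely used confusion-matrix pattern; `mean_iou` is a hypothetical helper and the toy arrays are made up.

```python
import numpy as np


def fast_hist(gt, pred, n):
    """Confusion matrix over n classes; gt labels outside [0, n) are skipped."""
    valid = (gt >= 0) & (gt < n)  # a 255 label fails gt < n and is dropped
    idx = n * gt[valid].astype(int) + pred[valid].astype(int)
    return np.bincount(idx, minlength=n ** 2).reshape(n, n)


def mean_iou(hist):
    """Mean per-class intersection over union, skipping absent classes."""
    inter = np.diag(hist)
    union = hist.sum(axis=1) + hist.sum(axis=0) - inter
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = inter / union  # 0/0 gives NaN for classes that never occur
    return np.nanmean(iou)


num_classes = 3
gt = np.array([0, 0, 1, 1, 255, 255], dtype=np.uint8)
pred = np.array([0, 0, 1, 1, 1, 2], dtype=np.uint8)

# Leave 255 as is: the last two pixels do not enter the histogram.
ignored = mean_iou(fast_hist(gt, pred, num_classes))

# Merge 255 into background first: the same two pixels now count as errors.
gt_merged = np.where(gt == 255, 0, gt)
merged = mean_iou(fast_hist(gt_merged, pred, num_classes))
print(ignored, merged)
```

In this toy case the merged score is lower than the ignored score, which is the direction of the gap discussed in this thread.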

On Tue, Jul 18, 2017 at 1:11 PM, Isht Dwivedi wrote:
> @swamiviv you said(and also in table 4 of the paper) that the .caffemodel gives 76.3% on the val set, but I am getting 74.39% only. Why could this be, is this because I am merging the boundary(255) label with the background? Are you able to get 76.3% yourself? I am using this script: https://github.com/isht7/pytorch-deeplab-resnet/blob/development/caffe_evalpyt.py In the end, I print 2 values, the second one is the one which should be considered.

--Swami


isht7 commented on May 27, 2024

Update: after leaving the 255 label as it is (as you suggested), I am now getting 75.54% from train_iter_20000.caffemodel, which is still lower than the figure reported in the paper (76.3%). If you could look into my code to find the possible cause, it would be great.


swamiviv commented on May 27, 2024

I reran my script a few hours ago with the converted PyTorch model (.pth) of the train_iter_20000.caffemodel file, and I could reproduce the exact numbers from the paper (76.42%) over the validation set of 1449 images. I will look into your script soon when I get a chance.

On Wed, Jul 19, 2017 at 5:47 PM, Isht Dwivedi wrote:
> Update: after leaving the 255 label as it is(as suggested by you), I am now getting 75.54% from the train_iter_20000.caffemodel which is still lower than that reported in the paper(76.3%). If you could look into my code (https://github.com/isht7/pytorch-deeplab-resnet/blob/development/caffe_evalpyt.py) to find the possible cause of this, it would be great.

--Swami


isht7 commented on May 27, 2024

If you are short on time, maybe you could share the script with me and I could look for differences?


isht7 commented on May 27, 2024

Even after using the above code, I get exactly the same result as before (75.54% mean IOU)! Why could this be happening?


chenyzh28 commented on May 27, 2024

@isht7, hi, have you solved the problem? Did you get 76.35% as reported in the paper? By the way, how can I convert the gt images and the output images to color images? Or how did you process the gt images at the beginning?
