Hi, does this implementation handle class-imbalanced datasets? Since of the f

Imbalanced Classes about deeplearning-models HOT 11 CLOSED

SURABHI-GUPTA commented on August 15, 2024

Imbalanced Classes

from deeplearning-models.

Comments (11)

rasbt commented on August 15, 2024

Hi there,

good question! We did account for class imbalance in a research project where we used CelebA (https://arxiv.org/abs/1807.11936), but I didn't do anything regarding class imbalance in this repository.

This was because in an earlier project, where we developed a method for "fooling" a gender classifier, I found that the classifier can be fooled more easily if the picture shows a male person. It may be because there are more female pictures in the dataset. Regarding skin color and gender, there was also some difference for males; however, I didn't see the difference they found in the "Gender Shades" (http://gendershades.org/overview.html) paper, which was published a few months afterwards. Anyway, here are some of the results I got about 2 years ago. This is for fooling the gender classifier, so an error of 0.5 is desired.

SURABHI-GUPTA commented on August 15, 2024

Hey, thank you for such a quick response. If possible, can you share the code for addressing class imbalance for a feature, say Double_Chin, in CelebA? It would really help me understand. Thanks and regards, Surabhi Gupta

rasbt commented on August 15, 2024

Sorry, we haven't done that; we only oversampled by gender and skin color.

SURABHI-GUPTA commented on August 15, 2024

Okay. But when I ran your code on an imbalanced feature, it gave 95% accuracy on the test images. What could be the reason for this? Is it because the majority class has far more samples?

rasbt commented on August 15, 2024

It depends. On which labels (imbalanced features) did you do the prediction? And what's their ratio in the dataset?

SURABHI-GUPTA commented on August 15, 2024

I did it over 'Double_Chin', which has 193140 images without Double_Chin (NDC) and 9459 images with Double_Chin (DC).

Number of DC and NDC images in the training dataset: 7571 and 155199
Number of DC and NDC images in the validation dataset: 975 and 18892
Number of DC and NDC images in the testing dataset: 913 and 19049

rasbt commented on August 15, 2024

In that case, it could indeed have learned to always predict the majority label, because

19049/(19049+913) = 0.954
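The arithmetic above can be checked with a short sketch. It uses the test-set counts quoted in this thread and a hypothetical classifier that always predicts the majority label; the function name `accuracy_and_balanced_accuracy` is made up for illustration and is not part of the repository. Balanced accuracy (the mean of the per-class recalls) is included because, unlike plain accuracy, it exposes this failure mode.

```python
def accuracy_and_balanced_accuracy(y_true, y_pred):
    """Return (accuracy, mean per-class recall) for two equal-length label lists."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)

    recalls = []
    for c in sorted(set(y_true)):
        n_c = sum(t == c for t in y_true)
        hits = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        recalls.append(hits / n_c)
    return accuracy, sum(recalls) / len(recalls)


# Test-set counts from the thread: 913 with Double_Chin, 19049 without.
y_true = [1] * 913 + [0] * 19049

# Hypothetical degenerate model: always predicts "without Double_Chin".
y_pred = [0] * len(y_true)

accuracy, balanced_accuracy = accuracy_and_balanced_accuracy(y_true, y_pred)
print(f"accuracy          = {accuracy:.3f}")
print(f"balanced accuracy = {balanced_accuracy:.3f}")
```

If a trained model's balanced accuracy is close to 0.5 while its accuracy is close to 0.954, it is probably behaving like this majority-only predictor.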

SURABHI-GUPTA commented on August 15, 2024

Yes, true. Do you have any suggestions on how I can handle this class imbalance?

rasbt commented on August 15, 2024

The imbalances we were dealing with weren't that extreme so in our case oversampling and/or undersampling both worked fine. Not sure if that's true in your case though.

You can maybe pre-train the network on a different attribute that is less imbalanced. Then, you can undersample your current dataset and use transfer learning to see if you get good results. It's a big imbalance though and getting good results may be tricky.
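As a rough illustration of the oversampling and undersampling mentioned above, here is a minimal pure-Python sketch using the training counts quoted in this thread. It is not the code from the research project; the helper names `inverse_frequency_weights` and `undersample_indices` are hypothetical. In PyTorch, per-sample weights like these could be handed to `torch.utils.data.WeightedRandomSampler`, but that wiring is left out here so the sketch runs without extra dependencies.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-sample weight 1/count(class), so each class has equal total weight."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def undersample_indices(labels, seed=0):
    """Keep a random subset of every class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


# Training counts from the thread: 7571 with Double_Chin, 155199 without.
labels = [1] * 7571 + [0] * 155199

# Oversampling: draw indices with replacement, weighted by inverse class frequency.
weights = inverse_frequency_weights(labels)
rng = random.Random(0)
drawn = rng.choices(range(len(labels)), weights=weights, k=10_000)
minority_share = sum(labels[i] for i in drawn) / len(drawn)
print(f"minority share after weighted sampling: {minority_share:.2f}")

# Undersampling: shrink the majority class down to the minority class size.
balanced = undersample_indices(labels)
print(Counter(labels[i] for i in balanced))
```

With weighted sampling, minority images are repeated many times per epoch, which can encourage overfitting on them; undersampling avoids the repetition but discards most of the majority class.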

SURABHI-GUPTA commented on August 15, 2024

Okay. I will try that. Thank you for your help and time.

SURABHI-GUPTA commented on August 15, 2024

Hi,

To handle the imbalance, I tried duplicating the rows of the minority class in the train.csv file itself. But I am getting an error when the image paths are joined:

Traceback (most recent call last):
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 302, in <module>
    model_conv = train_model(model_conv, optimizer_conv, exp_lr_scheduler, num_epochs=10)
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 207, in train_model
    for inputs, labels in dataloaders[phase]:
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 69, in __getitem__
    img = Image.open(os.path.join(self.img_dir, self.img_names[index]))
  File "/home/dataset/packages/python/3.7/lib/python3.7/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/home/dataset/packages/python/3.7/lib/python3.7/genericpath.py", line 149, in _check_arg_types
    (funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'int64'
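The thread does not show how train.csv was edited, so the cause of this error is not confirmed here. The message only says that `self.img_names[index]` is an integer rather than a file name, which would happen if the column (or index) being read holds numbers instead of the image names. As a hedged sketch, the rows could be duplicated with the standard `csv` module, which leaves every field as a string. The CSV layout, the column names `Filename` and `Double_Chin`, and the directory `img_align_celeba` are assumptions for illustration only.

```python
import csv
import io
import os

# Hypothetical stand-in for train.csv; the real column names may differ.
original_csv = """Filename,Double_Chin
000001.jpg,0
000002.jpg,1
000003.jpg,0
"""


def duplicate_minority_rows(csv_text, label_col, minority_value, copies):
    """Return CSV text in which each minority-class row appears `copies` extra times."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
        if row[label_col] == minority_value:  # csv fields are strings, e.g. "1"
            for _ in range(copies):
                writer.writerow(row)
    return out.getvalue()


augmented_csv = duplicate_minority_rows(original_csv, "Double_Chin", "1", copies=2)
rows = list(csv.DictReader(io.StringIO(augmented_csv)))

# File names stay strings, so os.path.join accepts them.
img_names = [row["Filename"] for row in rows]
paths = [os.path.join("img_align_celeba", name) for name in img_names]
print(paths)
```

If the file is loaded with a library that infers column types, it may be worth printing the type of one entry of `img_names` to confirm that it is a string before building the path.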
