Comments (11)

rasbt commented on August 15, 2024

Hi there,

Good question! We did account for class imbalance in a research project where we used CelebA (https://arxiv.org/abs/1807.11936), but I didn't do anything regarding class imbalance in this repository.

This was because in an earlier project, where we developed a method for "fooling" a gender classifier, I found that the classifier was more easily fooled when the picture showed a male person. That may be because there are more female pictures in the dataset. Regarding skin color and gender, there was also some difference for males; however, I didn't see the difference reported in the "Gender Shades" (http://gendershades.org/overview.html) paper, which was published a few months afterwards. Anyway, here are some of the results I got back then, about 2 years ago. This is for fooling the gender classifier, so an error of 0.5 is desired.

[Image: Screen Shot 2020-01-27 at 10 16 53 AM]

from deeplearning-models.

SURABHI-GUPTA commented on August 15, 2024

rasbt commented on August 15, 2024

Sorry, we haven't done that; we only oversampled by gender and skin color.

SURABHI-GUPTA commented on August 15, 2024

rasbt commented on August 15, 2024

It depends. On which labels (imbalanced features) did you do the prediction? And what's their ratio in the dataset?

SURABHI-GUPTA commented on August 15, 2024

rasbt commented on August 15, 2024

In that case, it could indeed have learned to always predict the majority label, because the majority label alone makes up about 95.4% of the examples:

19049/(19049+913) = 0.954
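
To make that arithmetic concrete, here is a minimal sketch of the majority-class baseline, using the two counts from above. The function name is hypothetical and only serves the illustration; any reported accuracy close to this value suggests the model may not have learned more than "always predict the majority".

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy of a classifier that always predicts the most frequent class."""
    total = sum(class_counts)
    return max(class_counts) / total


# Counts quoted in the comment above: 19049 majority vs. 913 minority examples.
counts = [19049, 913]
baseline = majority_baseline_accuracy(counts)
print(f"{baseline:.3f}")  # 0.954
```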


SURABHI-GUPTA commented on August 15, 2024

rasbt commented on August 15, 2024

The imbalances we were dealing with weren't that extreme, so in our case both oversampling and undersampling worked fine. I'm not sure if that's true in your case, though.
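
For reference, a rough sketch of what oversampling can look like, using only the Python standard library. This is not the code from the research project; the helper names are made up for the example. Each example gets a weight inversely proportional to its class frequency, and indices are then drawn with replacement, so every class is picked about equally often.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """One weight per example: 1 / (number of examples sharing its label)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def oversample_indices(labels, num_samples, seed=0):
    """Draw indices with replacement so that classes are roughly balanced."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Toy data (assumption): 95 majority examples, 5 minority examples.
labels = [0] * 95 + [1] * 5
indices = oversample_indices(labels, num_samples=10_000)
drawn = Counter(labels[i] for i in indices)
print(drawn)  # both classes end up near 5000 draws each
```

In PyTorch, per-sample weights like these could be handed to `torch.utils.data.WeightedRandomSampler` and the sampler passed to the `DataLoader`, instead of duplicating rows in the CSV file.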

You could pre-train the network on a different attribute that is less imbalanced. Then, you can undersample your current dataset and use transfer learning to see if you get good results. It's a big imbalance, though, and getting good results may be tricky.
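
The undersampling step could be sketched as follows. This is a standard-library illustration with a hypothetical helper name, not code from this repository: every class is cut down to the size of the smallest one by random sampling without replacement.

```python
import random
from collections import defaultdict


def undersample_indices(labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


# Toy data (assumption): 95 majority examples, 5 minority examples.
labels = [0] * 95 + [1] * 5
kept = undersample_indices(labels)
print(len(kept))  # 10
```

The price is data: with a split like 19049 vs. 913, a balanced subset keeps only 2 * 913 = 1826 examples, which is one reason pre-training on a better-balanced attribute first may help.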


SURABHI-GUPTA commented on August 15, 2024

SURABHI-GUPTA commented on August 15, 2024

Hi,

To handle the imbalance, I tried duplicating the rows of the minority class in the train.csv file itself, but I am getting an error when the image paths are joined:

Traceback (most recent call last):
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 302, in <module>
    model_conv = train_model(model_conv, optimizer_conv, exp_lr_scheduler, num_epochs=10)
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 207, in train_model
    for inputs, labels in dataloaders[phase]:
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dataset/packages/pytorch/1.0.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/surabhi/celeba-dataset/classify_bl_aug.py", line 69, in __getitem__
    img = Image.open(os.path.join(self.img_dir, self.img_names[index]))
  File "/home/dataset/packages/python/3.7/lib/python3.7/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/home/dataset/packages/python/3.7/lib/python3.7/genericpath.py", line 149, in _check_arg_types
    (funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'int64'
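
The last line of the traceback says that `os.path.join` received an `int64` instead of a string, so `self.img_names[index]` held a number rather than a file name. One possible cause, and this is only a guess without seeing the edited train.csv, is that after duplicating the rows the image names no longer sit where the loading code expects them (for example, an extra integer index column ended up in front). A minimal sketch of a guard that surfaces the problem with a clearer message follows; `build_image_path` is a hypothetical helper and the directory and file name are example values, not taken from the original script.

```python
import os


def build_image_path(img_dir, img_name):
    """Join directory and file name, failing early with a readable message."""
    if not isinstance(img_name, str):
        raise TypeError(
            f"expected a file name string, got "
            f"{type(img_name).__name__}: {img_name!r}"
        )
    return os.path.join(img_dir, img_name)


# Example values (assumptions): a CelebA-style folder and file name.
print(build_image_path("img_align_celeba", "000001.jpg"))
```

Printing the first few entries of `self.img_names` and their type right after the CSV is loaded should show whether they are still file names.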

