
Comments (7)

junsukchoe commented on May 25, 2024

The file names in ImageNetV2 were changed recently. We are looking into this issue; in the meantime, you can ask the ImageNetV2 authors to provide the mapping between the old and current file names. Please refer to modestyachts/ImageNetV2#6 for more details.

from wsolevaluation.

sbelharbi commented on May 25, 2024

Hi @junsukchoe,
The issue is still not solved.
I have contacted the authors.

Is there a way around this?

Meanwhile, could you share the data you have with the old naming, if that is OK with the authors?

Thanks


sbelharbi commented on May 25, 2024

@hyunOO did you find a way to solve this?
thanks


sbelharbi commented on May 25, 2024

Tomorrow I'll try to brute-force the mapping between images based on their size (h x w); hopefully it is unique. I will also check whether the folder names have changed, so that they can help identify samples. I already see samples with the same size in different folders...

There are only 10k images, so it can be done given some time.
If you have the data or the mapping, please post it here.
Thanks


junsukchoe commented on May 25, 2024

Hello,

As a quick solution, I have made a mapping list based on SSIM scores: mapping.txt.
It hasn't been thoroughly verified yet, but when I checked a few samples, the mappings were correct.
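The thread does not show how the SSIM scores were computed, so the following is only a minimal sketch of the idea, not the method used to build mapping.txt. It assumes each image has already been converted to grayscale, resized to a common shape, and flattened to a list of pixel values. It computes a single-window (global) SSIM, whereas library implementations typically average SSIM over sliding windows. The names `global_ssim` and `best_match` and the file names are hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equal-length grayscale pixel lists."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Standard SSIM stabilising constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def best_match(old_pixels, candidates):
    """Return (name, score) of the candidate most similar to old_pixels."""
    scored = ((name, global_ssim(old_pixels, px))
              for name, px in candidates.items())
    return max(scored, key=lambda item: item[1])


# Toy data standing in for flattened grayscale images (hypothetical names).
old_image = [10, 50, 200, 120]
candidates = {
    'new_a.jpeg': [200, 10, 30, 90],
    'new_b.jpeg': [12, 49, 198, 121],
}
name, score = best_match(old_image, candidates)
```

Restricting the candidates to the files in the same class folder keeps the number of comparisons small, since each folder holds only a handful of images.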

I hope this helps until the official mapping is released.

Thanks!


sbelharbi commented on May 25, 2024

Hi,
Thanks for your quick and helpful reply.
I will work with this while waiting for the official mapping.
I did a brute-force mapping inside the same folders using image sizes.
It found only 4556 pairs; the remaining images have duplicate sizes.

All 4556 pairs found match the mapping you provided.

Thanks again.

Here is the output of the script:

100%|██████████████████████████████████████| 1000/1000 [00:06<00:00, 159.20it/s]
BFORCE: found 4556 possibly correct pairs.
BFORCE: found 5444 failed matching due to duplicate sizes.
found 0 failed comparison.

script:

import os
from os.path import join

from PIL import Image
from tqdm import tqdm

SPLIT = 'valid'


def get_ids(img_id_file: str) -> list:
    """Read image ids, one per line, with the 'val2/' prefix removed."""
    image_ids = []
    with open(img_id_file, 'r') as f:
        for line in f:
            image_ids.append(line.strip('\n').replace('val2/', ''))
    return image_ids


def get_image_sizes(path_img_sz: str) -> dict:
    """
    image_sizes.txt has the structure

    <path>,<w>,<h>
    path/to/image1.jpg,500,300
    path/to/image2.jpg,1000,600
    path/to/image3.jpg,500,300
    ...
    """
    image_sizes = {}
    with open(path_img_sz, 'r') as f:
        for line in f.readlines():
            image_id, ws, hs = line.strip('\n').split(',')
            image_id = image_id.replace('val2/', '')
            w, h = int(ws), int(hs)
            image_sizes[image_id] = (w, h)
    return image_sizes


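def compare_bforce_with_mapping(path_provided_map_1: str, bf: dict) -> list:
    """
    Compare the brute-force mapping `bf` (orig: new) with the provided
    mapping file, which holds one '<orig>, <new>' pair per line.
    Returns a description of every pair on which the two disagree.
    """
    mapz = dict()
    with open(path_provided_map_1, 'r') as fin:
        for line in fin:
            org_k, new_k = line.strip('\n').replace(' ', '').split(',')
            assert org_k not in mapz
            mapz[org_k] = new_k

    failed = []
    for k in bf:
        # Raises KeyError if the provided mapping lacks the key `k`.
        if bf[k] != mapz[k]:
            failed.append(f'{k}, {bf[k]}, {mapz[k]}')

    return failed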


if __name__ == '__main__':
    # hard paths.
    vlddir = 'folds/wsol-done-right-splits/ILSVRC/val'

    # original valid data.
    org_img_id_path = join(vlddir, 'image_ids.txt')
    org_img_sz_path = join(vlddir, 'image_sizes.txt')

    org_ids = get_ids(img_id_file=org_img_id_path)
    org_sz = get_image_sizes(path_img_sz=org_img_sz_path)

    # new valid data.
    data_valid = 'wsol-done-right/ILSVRC/val2'

    subfds = [x[0] for x in os.walk(data_valid) if x[0] != data_valid]
    subfds = [x.replace(data_valid + '/', '') for x in subfds]
    subfds.sort(key=int)
    new_ids = []
    new_sz = dict()
    mappings = dict()  # orig: new
    failed_mappings = []

    for fd in tqdm(subfds, ncols=80, total=len(subfds)):
        c_or_ids = [k for k in org_ids if k.startswith(fd + '/')]
        for file in os.listdir(join(data_valid, fd)):
            if file.endswith(".jpeg"):
                pfile = os.path.join(data_valid, fd, file)
                # Close the file handle as soon as the size is read.
                with Image.open(pfile) as image:
                    w, h = image.size
                new_k = f'{fd}/{file}'
                new_ids.append(new_k)

                new_sz[new_k] = (w, h)

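                # Brute force: compare this file's size with every original
                # image in the same class folder.
                matches = [org_sz[k] == new_sz[new_k] for k in c_or_ids]

                if sum(matches) == 1:
                    orig_k = c_or_ids[matches.index(True)]
                    assert orig_k not in mappings
                    mappings[orig_k] = new_k
                else:
                    # Zero or several originals share this size: ambiguous.
                    failed_mappings.append(new_k)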

    with open('bfmapping.txt', 'w') as fout:
        for k in mappings:
            fout.write(f'{k}, {mappings[k]}\n')

    # compare bf results with the provided mapping.
    pathmp = 'mapping.txt'
    failed = compare_bforce_with_mapping(path_provided_map_1=pathmp,
                                         bf=mappings)

    print(f'BFORCE: found {len(mappings)} possibly correct pairs.')
    print(f'BFORCE: found {len(failed_mappings)} failed matching due to '
          f'duplicate sizes.')

    print(f'found {len(failed)} failed comparison.')
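
The per-file inner loop above can also be expressed by grouping ids by size first. The following is a minimal sketch of that alternative, not part of the original script. It is slightly stricter: a pair is accepted only when the size occurs exactly once among both the old and the new files of a folder. The function name `match_by_unique_size` and the file names are hypothetical.

```python
from collections import defaultdict


def match_by_unique_size(old_sizes, new_sizes):
    """Pair old and new ids whose (w, h) occurs exactly once on each side.

    Both arguments map an image id to a (width, height) tuple and are
    assumed to describe images from the same class folder.
    """
    old_by_size = defaultdict(list)
    new_by_size = defaultdict(list)
    for image_id, size in old_sizes.items():
        old_by_size[size].append(image_id)
    for image_id, size in new_sizes.items():
        new_by_size[size].append(image_id)

    pairs, ambiguous = {}, []
    for size, new_ids in new_by_size.items():
        old_ids = old_by_size.get(size, [])
        if len(old_ids) == 1 and len(new_ids) == 1:
            pairs[old_ids[0]] = new_ids[0]
        else:
            # No original, or several files, share this size.
            ambiguous.extend(new_ids)
    return pairs, sorted(ambiguous)


# Toy folder with one unique size and one duplicated size.
old = {'0/a.jpeg': (500, 300), '0/b.jpeg': (640, 480), '0/c.jpeg': (640, 480)}
new = {'0/x.jpeg': (500, 300), '0/y.jpeg': (640, 480), '0/z.jpeg': (640, 480)}
pairs, ambiguous = match_by_unique_size(old, new)
```

Files that end up in `ambiguous` still need a content-based comparison, which is why size alone resolved fewer than half of the images here.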


jason718 commented on May 25, 2024

A PR addressing this has been submitted as #53. It uses the mapping file provided in this thread.
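
For anyone applying the mapping by hand, here is a minimal sketch of how the old image ids could be translated. It is not the code from the PR. It assumes mapping.txt holds one `<old id>, <new id>` pair per line, which is the format the comparison script above parses; the function names and file names are hypothetical.

```python
def parse_mapping(lines):
    """Build an {old_id: new_id} dict from '<old>, <new>' lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        old_id, new_id = (part.strip() for part in line.split(','))
        if old_id in mapping:
            raise ValueError(f'duplicate entry for {old_id}')
        mapping[old_id] = new_id
    return mapping


def translate_ids(image_ids, mapping):
    """Return (translated ids, ids that have no entry in the mapping)."""
    translated, missing = [], []
    for image_id in image_ids:
        if image_id in mapping:
            translated.append(mapping[image_id])
        else:
            missing.append(image_id)
    return translated, missing


# In practice the lines would come from open('mapping.txt').
mapping = parse_mapping([
    '0/old_a.jpeg, 0/new_a.jpeg',
    '0/old_b.jpeg,0/new_b.jpeg',
    '',
])
translated, missing = translate_ids(
    ['0/old_b.jpeg', '0/old_c.jpeg'], mapping)
```

Collecting the unmapped ids instead of raising makes it easy to see how many entries still lack a counterpart.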
