giorking / open-images-dataset

This project forked from cvdfoundation/open-images-dataset

Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.

Home Page: https://github.com/openimages/dataset

Open Images Dataset

Open Images is a dataset of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories. This page provides download instructions and mirror sites for the Open Images Dataset. Please visit the project page for more details on the dataset.

Download Images

Download Images With Bounding Box Annotations

Prerequisite: Gmail or Gmail-associated account

CVDF hosts the image files that have bounding box annotations in the Open Images Dataset V4. The images are split into train (1,743,042), validation (41,620), and test (125,436) sets. The images are rescaled to 1024x768 resolution, with a total size of 561GB. They can be downloaded directly into a local directory or a Google Cloud Storage bucket. Please sign up with your Gmail or Gmail-associated account here to request access. After you submit the request form, we will grant your mail account READ access to the storage bucket:

gs://open-images-dataset

You can download the images to either a storage bucket or a local directory with the following procedure:

  1. Install gsutil.
  2. gcloud auth login [your_mail_account]
  3. gsutil -m rsync -r gs://open-images-dataset/train [target_dir/train] (513GB)
     gsutil -m rsync -r gs://open-images-dataset/validation [target_dir/validation] (12GB)
     gsutil -m rsync -r gs://open-images-dataset/test [target_dir/test] (36GB)

The target_dir can be a local directory or a Google Cloud storage bucket.
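
The three sync commands above can be wrapped in a small script. The following is a minimal sketch, not part of the original instructions: the function name `sync_split` and the `TARGET_DIR` and `DRY_RUN` variables are hypothetical names chosen for illustration, and the script defaults to printing the commands instead of running them. The only real tool invocation it uses is the `gsutil -m rsync -r` form shown in step 3.

```shell
#!/usr/bin/env bash
# Sketch: sync each Open Images split using the gsutil command from step 3.
# sync_split, TARGET_DIR, and DRY_RUN are illustrative names (assumptions),
# not part of the dataset tooling.

BUCKET="gs://open-images-dataset"
TARGET_DIR="${TARGET_DIR:-./open-images}"   # local directory or gs:// path
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print the commands

sync_split() {
  local split="$1"
  local dest="${TARGET_DIR}/${split}"
  if [ "$DRY_RUN" = "1" ]; then
    # Show what would be executed without touching the network.
    echo "gsutil -m rsync -r ${BUCKET}/${split} ${dest}"
  else
    # Create a local destination first so rsync has a directory to write into;
    # bucket destinations are left alone.
    case "$dest" in
      gs://*) ;;
      *) mkdir -p "$dest" ;;
    esac
    gsutil -m rsync -r "${BUCKET}/${split}" "$dest"
  fi
}

for split in train validation test; do
  sync_split "$split"
done
```

Running the script as-is prints the three commands for review. Setting `DRY_RUN=0` executes them, which requires that access to the bucket has already been granted and that `gcloud auth login` has been completed.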

Download Full Dataset With Google Storage Transfer

Prerequisite: Google Cloud Platform account

In this section, we describe the procedure for downloading all images in the Open Images Dataset to a Google Cloud Storage bucket. We recommend using the user interface provided in the Google Cloud Storage console for this task.

Google Cloud Storage provides a "storage transfer" function that transfers online files into a storage bucket. This function can be used to transfer the images from their original URLs into your own storage bucket. CVDF provides tsv files that contain all image URLs in the Open Images Dataset for the transfer. Step-by-step instructions are given in Creating and Managing Transfers with the Console. The whole dataset is around 18TB. Note that you must pay for hosting the dataset on Google Cloud Storage after downloading it. Hosting prices are listed on Google Cloud Storage Pricing.
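
Before starting a transfer, it can help to check how many objects and bytes a URL list describes. The sketch below assumes the tsv files follow the URL-list layout documented for Storage Transfer Service: a first line reading `TsvHttpData-1.0`, followed by tab-separated rows of URL, size in bytes, and base64-encoded MD5. Whether the CVDF files match this layout exactly is an assumption; the function name `summarize_tsv`, the `example.com` URLs, and the placeholder checksums are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: count the objects and total bytes listed in a URL-list tsv file.
# Assumed layout: header "TsvHttpData-1.0", then URL<TAB>size<TAB>base64-MD5.
# summarize_tsv is a hypothetical helper, not part of any Google tool.

summarize_tsv() {
  awk -F '\t' '
    NR == 1 {
      if ($0 != "TsvHttpData-1.0") {
        print "unexpected header: " $0 > "/dev/stderr"
        bad = 1
        exit 1
      }
      next
    }
    NF >= 2 { count++; bytes += $2 }
    END { if (!bad) printf "%d objects, %.0f bytes\n", count, bytes }
  ' "$1"
}

# Demonstration on a tiny made-up file (placeholder URLs and checksums).
sample="$(mktemp)"
printf 'TsvHttpData-1.0\n' > "$sample"
printf 'https://example.com/a.jpg\t1024\tAAAAAAAAAAAAAAAAAAAAAA==\n' >> "$sample"
printf 'https://example.com/b.jpg\t2048\tAAAAAAAAAAAAAAAAAAAAAA==\n' >> "$sample"
summarize_tsv "$sample"   # prints "2 objects, 3072 bytes"
rm -f "$sample"
```

The byte total is printed with `%.0f` rather than `%d` because some awk implementations clamp `%d` to 32-bit integers, which a multi-terabyte total would exceed.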

The tsv files for the train set (partitioned into 10 files):
https://storage.googleapis.com/cvdf-datasets/oid/open-images-dataset-train[0-9].tsv

The tsv file for the validation set:
https://storage.googleapis.com/cvdf-datasets/oid/open-images-dataset-validation.tsv

The tsv file for the test set:
https://storage.googleapis.com/cvdf-datasets/oid/open-images-dataset-test.tsv
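
The `[0-9]` in the train URL is read here as a digit range, giving ten files named `train0` through `train9`; that reading is an assumption based on the "partitioned into 10 files" note. Under that assumption, the full set of twelve URLs can be listed with a short sketch like the following, where `list_tsv_urls` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print every tsv URL, expanding the train[0-9] pattern into ten files.
# list_tsv_urls is an illustrative name; the digit-range reading is an assumption.

BASE_URL="https://storage.googleapis.com/cvdf-datasets/oid"

list_tsv_urls() {
  local i
  for i in 0 1 2 3 4 5 6 7 8 9; do
    echo "${BASE_URL}/open-images-dataset-train${i}.tsv"
  done
  echo "${BASE_URL}/open-images-dataset-validation.tsv"
  echo "${BASE_URL}/open-images-dataset-test.tsv"
}

list_tsv_urls
```

Each printed URL can then be supplied as the URL list for a separate transfer in the console.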

Contributors: tylin, shackenberg
