GithubHelp home page GithubHelp logo

ganggangdong / marveldataset2016 Goto Github PK

View Code? Open in Web Editor NEW

This project forked from avaapm/marveldataset2016

0.0 1.0 0.0 10.25 MB

This repository is to download the MARVEL dataset 2016 for the publication "Gundogdu E., Solmaz B, Yucesoy V., Koc A., Marvel: A Large-Scale Image Dataset for Maritime Vessels, Asian Conference on Computer Vision (ACCV), 2016".

Home Page: https://sites.google.com/site/marveldataset2016/

Python 100.00%

marveldataset2016's Introduction

marveldataset2016

This repository is to download the MARVEL dataset 2016 for the publication "Gundogdu E., Solmaz B, Yucesoy V., Koc A., Marvel: A Large-Scale Image Dataset for Maritime Vessels, Asian Conference on Computer Vision (ACCV), 2016".

Author: Erhan Gundogdu

Last Update: 09/28/2016

CITATION:

If you use MARVEL in your research, please cite:

	@PROCEEDINGS {MARVEL,
	author       = "Gundogdu E., Solmaz B, Yucesoy V., Koc A.",
	title        = "Marvel: A Large-Scale Image Dataset for Maritime Vessels",
	year         = "2016",
	organization = "Asian Conference on Computer Vision (ACCV)"
	}

This document is to explain the use of MARVEL and its metadata and to download it from the web.

(0) Downloading MARVEL dataset: In order to download the MARVEL dataset, 'MARVEL_Download.py' Python script must be run with the appropriate .dat file, i.e. Uncomment the related dat file: 'VesselClassification.dat' for Vessel Classification, 'IMOTrainAndTest.dat' for Vessel Verification/Retrieval/Recognition tasks.

Default values and explanation of the download parameters inside MARVEL_Download.py are:
	NUMBER_OF_WORKERS = 10 ## Number of threads that will be working in parallel
	MAX_NUM_OF_FILES_IN_FOLDER = 5000 ## Maximum number of files in a folder
	IMAGE_HEIGHT = 256 ## image height unless ORIGINAL_SIZE = 1
	IMAGE_WIDTH = 256 ## image width unless ORIGINAL_SIZE = 1
	ORIGINAL_SIZE = 0 # 1 for yes, 0 for no
	JUST_IMAGE = 1 # 1 for yes, 0 for no

The images are downloaded to the location of MARVEL_Download.py with a meta folder structure.
Once the download is finished, the vessel images are distributed in a meta folder structure.
The output filename is FINAL.dat, which has the same structure as the input XXX.dat file except for appended path names
as the last column of each line.

e.g. one line from input IMOTrainAndTest.dat file: '993293,1,1,Container Ship'
and one line from output FINAL.dat file: '993293,1,1,Container Ship,path\to\your\folder\W1_1\993293.jpg'

(1) Vessel Classification: VesselClassification.dat is the corresponding file for vessel classification.

Each line carries the information related to the training and the test images in the following structure:
	vessel ID, set index, class label, class label name
vessel ID: is the ID consistent with the url name in the corresponding website and helps us to download the image.
set index: 1 or 2 (1 for training and 2 for test)
class label: vessel type label enumeration (from 1 to 26)
class label name: the name of the class label given in our paper

(2) IMO Training and Test Sets: IMOTrainAndTest.dat is the corresponding file containing information of IMO train and test sets defined in the paper.

Each line carries the information related to the IMO train and the test images in the following structure:
		vessel ID, set index, class label, class label name, IMO Number
	vessel ID: is the ID consistent with the url name in the corresponding website and helps us to download the image.
	set index: 1 or 2 (1 for training and 2 for test)
	class label: vessel type label enumeration (from 1 to 26)
	class label name: the name of the class label given in our paper
	IMO number: is International Maritime Organization number of the corresponding vessel taken from the website
Note: The IMO Train and test sets contain 400K images in total and Vessel Verification/Retrieval/Recognition set indices
are based on the ordering of these 400K image list.

(3) Vessel Verification: VesselVerificationTest.dat and VesselVerificationTrain.dat are the vessel verification task related .dat files. In each file, every line corresponds to the indices of the pairs of images for positive and negative examples. As stated in the paper, a positive example is a pair of images belonging to the same vessel and a negative example is a pair of images belonging to different vessels. These pairs are randomly sampled from the IMO train and test sets.

Each line has the following structure:
		vessel index 1, vessel index 2, set index
	vessel index 1/2: the index of the vessels within a pos/neg example with respect to the IMO train/test set.
	set index: 0 or 1 (0 for negative examples, 1 for positive examples)

Note: In order to use the VesselVerificationTest.dat and VesselVerificationTrain.dat files, IMOTrainAndTest.dat 
is required, since the indices of this task are given with respect to the 400K IMO training and test sets.

(4) Vessel Retrieval ChiTest.dat, ChiTrain.dat, EucTest.dat, EucTrain.dat are the corresponding meta files for vessel retrieval tasks for the chi square and euclidean distances. In each file, every line corresponds to training and test examples.

Each line has the following structure:
		vessel index, class label, class name
	vessel index: the index of the corresponding example with respect to the IMO train/test set.
	class label: class label of the example out of 109 different vessels (as stated in the paper.)
	class name: name of the vessel type (as stated in the paper)
Note: In order to use the VesselVerificationTest.dat and VesselVerificationTrain.dat files, IMOTrainAndTest.dat 
is required, since the indices of this task are given with respect to the 400K IMO training and test sets.

(5) Vessel Recognition The folder with the name Recognition (please first extract Recognition.zip) contains 29 classes for the purpose of recognizing individual vessel within their corresponding vessel types. This folder contains seperate foldera for different 29 vessel types.

Each vessel type folder, e.g. Bulk Carrier, contain five different folds for training and testing. Each .dat file, i.e.
trainSplit_X, has the following structure:
		vesselindex, IMO number
	vessel index: the index of the corresponding example with respect to the IMO train/test set.
	IMO number: IMO number for the example vessel image (This label is required for training a classifier and can also 
	be obtained from the first column of IMOTrainAndTest.dat file with the help of the vessel index.)
Note: In order to use the VesselVerificationTest.dat and VesselVerificationTrain.dat files, IMOTrainAndTest.dat 
is required, since the indices of this task are given with respect to the 400K IMO training and test sets.

marveldataset2016's People

Contributors

avaapm avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.