Download and Cut Datasets

To download the original ImageNet and Pascal VOC datasets, follow the instructions below for each dataset type. These datasets are large. To save time when loading them into the DL Workbench, you can cut an original dataset down to a smaller subset.

To learn more about dataset types supported by the DL Workbench and their structure, refer to Dataset Types.

ImageNet Dataset

Download ImageNet Dataset

To download images from ImageNet, you need to have an account and agree to their Terms of Access. Follow the steps below:

  1. Go to the ImageNet homepage:
    imagenet_register_01-b.png
  2. If you have an account, click Login. Otherwise, click Signup in the upper-right corner, provide your data, and wait for a confirmation email:
    imagenet_register_01-m-b.png
  3. Once you receive the confirmation email and log in, go to the Download page:
    imagenet_download_00-m-b.png
  4. Select Download Original Images:
    imagenet_download_01-m-b.png
  5. This will redirect you to the Terms of Access page. If you agree to the Terms, continue by clicking Agree and Sign:
    imagenet_terms_of_access_02-m-b.png
  6. Click one of the links in the Download as one tar file section to select it:
    imagenet_download_02-b.png
  7. Save the archive to the directory and under the name shown below:
    C:\Users\Work\imagenet.zip
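
Step 6 refers to a tar file, while the name in step 7 ends in .zip, so it can help to confirm what was actually saved before going further. The following is a minimal sketch that inspects the file contents rather than the extension. The helper name archive_kind is hypothetical and not part of the DL Workbench; it relies only on the Python standard library.

```python
import tarfile
import zipfile
from pathlib import Path


def archive_kind(path):
    """Return 'zip', 'tar', or 'unknown' based on file contents, not the extension.

    Hypothetical helper for illustration; it is not part of the DL Workbench.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(path)
    # Both checks read the file signature, so a misnamed archive is still detected.
    if zipfile.is_zipfile(str(path)):
        return "zip"
    if tarfile.is_tarfile(str(path)):
        # Also true for compressed tar archives such as .tar.gz
        return "tar"
    return "unknown"


# Example call (raw string keeps the Windows backslashes intact):
# archive_kind(r"C:\Users\Work\imagenet.zip")
```

A result of "unknown" usually points to an interrupted or incomplete download.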

Cut ImageNet Dataset

Download the script for cutting datasets. After specifying the parameters, run the following command in a command prompt where Python* is available:

python C:/Users/Downloads/cut_dataset.py ^
--source_archive_dir=C:\Users\Work\imagenet.zip ^
--output_size=10 ^
--output_archive_dir=C:\Users\Work\subsets ^
--dataset_type=imagenet

This command runs the script with the following arguments:

Parameter                                         Explanation
--source_archive_dir=C:\Users\Work\imagenet.zip   Full path to the downloaded archive
--output_size=10                                  Number of images to keep in the smaller dataset
--output_archive_dir=C:\Users\Work\subsets        Full path to the directory where the smaller dataset is saved, excluding the archive name
--dataset_type=imagenet                           Type of the source dataset
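
If you prefer to launch the script from Python instead of typing the command, the four documented parameters can be passed as an argument list, which avoids shell-specific quoting and line-continuation rules. This is a minimal sketch: the function names build_cut_command and run_cut are hypothetical, and the only flags used are the ones listed above.

```python
import subprocess
import sys


def build_cut_command(script, source_archive, output_size, output_dir, dataset_type):
    """Assemble the argument list for cut_dataset.py from the documented flags.

    Hypothetical wrapper for illustration; only the four flags shown in this
    document are used.
    """
    if int(output_size) <= 0:
        raise ValueError("output_size must be a positive integer")
    return [
        sys.executable,  # the Python interpreter running this code
        str(script),
        f"--source_archive_dir={source_archive}",
        f"--output_size={int(output_size)}",
        f"--output_archive_dir={output_dir}",
        f"--dataset_type={dataset_type}",
    ]


def run_cut(script, source_archive, output_size, output_dir, dataset_type):
    """Run the script; raises CalledProcessError on a non-zero exit status."""
    cmd = build_cut_command(script, source_archive, output_size, output_dir, dataset_type)
    return subprocess.run(cmd, check=True)


# Example call (raw strings keep the Windows backslashes intact):
# run_cut("C:/Users/Downloads/cut_dataset.py", r"C:\Users\Work\imagenet.zip",
#         10, r"C:\Users\Work\subsets", "imagenet")
```

Because the arguments are passed as a list, paths that contain spaces need no extra quoting.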

Pascal VOC Dataset

Download Pascal VOC Dataset

To download test data from Pascal VOC, you need to have an account. Follow the steps below:

  1. Go to the PASCAL Visual Object Classes Homepage:
    voc_homepage-b.png
  2. Click PASCAL VOC Evaluation Server under the Pascal VOC data sets heading:
    voc_evaluation_server_01-m-b.png
  3. If you have an account, click Login in the upper-left corner. Otherwise, click Registration, provide your data, and wait for a confirmation email:
    voc_login_register-m-b.png
  4. Click Downloads:
    voc_download_01-m-b.png
  5. Select a dataset:
    voc_download_02-b.png
  6. Save the archive to the directory and under the name shown below:
    C:\Users\Work\voc.tar.gz

Cut Pascal VOC Dataset

Download the script for cutting datasets. After specifying the parameters, run the following command in a command prompt where Python* is available:

python C:/Users/Downloads/cut_dataset.py ^
--source_archive_dir=C:\Users\Work\voc.tar.gz ^
--output_size=10 ^
--output_archive_dir=C:\Users\Work\subsets ^
--dataset_type=voc

This command runs the script with the following arguments:

Parameter                                       Explanation
--source_archive_dir=C:\Users\Work\voc.tar.gz   Full path to the downloaded archive
--output_size=10                                Number of images to keep in the smaller dataset
--output_archive_dir=C:\Users\Work\subsets      Full path to the directory where the smaller dataset is saved, excluding the archive name
--dataset_type=voc                              Type of the source dataset
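
After cutting, you may want to confirm that the smaller archive holds the number of images requested with --output_size. The sketch below counts JPEG entries in a zip or tar archive. The function name count_jpeg_images is hypothetical, and the script's output layout is not described here, so the sketch assumes that dataset images use a .jpg or .jpeg extension (other files, such as annotations or segmentation masks, are ignored).

```python
import tarfile
import zipfile
from pathlib import PurePosixPath

# Assumption: dataset images are stored as JPEG files.
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def count_jpeg_images(archive_path):
    """Count JPEG file entries in a zip or tar archive without extracting it.

    Hypothetical helper for illustration; it is not part of the DL Workbench.
    """
    archive_path = str(archive_path)
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            names = [info.filename for info in zf.infolist() if not info.is_dir()]
    elif tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as tf:
            names = [member.name for member in tf.getmembers() if member.isfile()]
    else:
        raise ValueError(f"not a zip or tar archive: {archive_path}")
    # Compare suffixes case-insensitively so that .JPEG and .jpg both match.
    return sum(1 for name in names if PurePosixPath(name).suffix.lower() in JPEG_SUFFIXES)
```

With the example commands in this document, the expected count is 10 for an archive written to C:\Users\Work\subsets.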