get-preprocessed-dataset-imagenet

Automatically generated README for this automation recipe: get-preprocessed-dataset-imagenet

Category: AI/ML datasets

License: Apache 2.0

  • Notes from the authors, contributors and users: README-extra

  • CM meta description for this script: _cm.json

  • Output cached? True

Reuse this script in your project

Install MLCommons CM automation meta-framework

Pull CM repository with this automation recipe (CM script)

cm pull repo mlcommons@cm4mlops

cmr "get dataset imagenet ILSVRC image-classification preprocessed" --help

Run this script

Run this script via CLI
cm run script --tags=get,dataset,imagenet,ILSVRC,image-classification,preprocessed[,variations] [--input_flags]
Run this script via CLI (alternative)
cmr "get dataset imagenet ILSVRC image-classification preprocessed [variations]" [--input_flags]
Run this script from Python
import cmind

r = cmind.access({'action':'run',
              'automation':'script',
              'tags':'get,dataset,imagenet,ILSVRC,image-classification,preprocessed',
              'out':'con',
              # ...
              # (other input keys for this script)
              # ...
             })

if r['return']>0:
    print (r['error'])
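
The `[,variations]` placeholder in the CLI form is a comma-separated list of underscore-prefixed variation names appended to the base tags. The following is a minimal sketch of assembling that string for the `tags` key of the Python call. The helper `build_tags` is hypothetical and not part of `cmind`.

```python
# Base tags for this script, as shown in the CLI and Python examples above.
BASE_TAGS = "get,dataset,imagenet,ILSVRC,image-classification,preprocessed"


def build_tags(variations=()):
    """Return the base tags with the given variation names appended.

    Hypothetical helper for illustration; it only builds a string.
    """
    for name in variations:
        if not name.startswith("_"):
            raise ValueError(f"variation names start with '_': {name!r}")
    return ",".join([BASE_TAGS, *variations])


# Example: request the NHWC layout and the 500-image subset.
tags = build_tags(["_NHWC", "_500"])
print(tags)
```

The resulting string can be passed as the `'tags'` value in the `cmind.access` call shown above.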
Run this script via Docker (beta)
cm docker script "get dataset imagenet ILSVRC image-classification preprocessed [variations]" [--input_flags]

Variations

  • No group (any combination of variations can be selected)

    • _default
    • _pytorch
      • ENV variables:
        • CM_PREPROCESS_PYTORCH: yes
        • CM_MODEL: resnet50
    • _tflite_tpu
      • ENV variables:
        • CM_MODEL: resnet50
        • CM_PREPROCESS_TFLITE_TPU: yes
  • Group "calibration-option"

    • _mlperf.option1
      • ENV variables:
        • CM_DATASET_CALIBRATION_OPTION: one
    • _mlperf.option2
      • ENV variables:
        • CM_DATASET_CALIBRATION_OPTION: two
  • Group "dataset-type"

    • _calibration
      • ENV variables:
        • CM_DATASET_TYPE: calibration
    • _validation (default)
      • ENV variables:
        • CM_DATASET_TYPE: validation
  • Group "extension"

    • _rgb32
      • ENV variables:
        • CM_DATASET_PREPROCESSED_EXTENSION: rgb32
    • _rgb8
      • ENV variables:
        • CM_DATASET_PREPROCESSED_EXTENSION: rgb8
  • Group "interpolation-method"

    • _inter.area
      • ENV variables:
        • CM_DATASET_INTERPOLATION_METHOD: INTER_AREA
    • _inter.linear
      • ENV variables:
        • CM_DATASET_INTERPOLATION_METHOD: INTER_LINEAR
  • Group "layout"

    • _NCHW (default)
      • ENV variables:
        • CM_DATASET_DATA_LAYOUT: NCHW
    • _NHWC
      • ENV variables:
        • CM_DATASET_DATA_LAYOUT: NHWC
  • Group "model"

    • _for.mobilenet
    • _for.resnet50
      • ENV variables:
        • CM_DATASET_SUBTRACT_MEANS: 1
        • CM_DATASET_GIVEN_CHANNEL_MEANS: 123.68 116.78 103.94
        • CM_DATASET_NORMALIZE_DATA: 0
        • CM_DATASET_INTERPOLATION_METHOD: INTER_AREA
  • Group "precision"

    • _float32
      • ENV variables:
        • CM_DATASET_DATA_TYPE: float32
        • CM_DATASET_QUANTIZE: 0
        • CM_DATASET_CONVERT_TO_UNSIGNED: 0
    • _int8
      • ENV variables:
        • CM_DATASET_DATA_TYPE: int8
        • CM_DATASET_QUANTIZE: 1
        • CM_DATASET_CONVERT_TO_UNSIGNED: 0
    • _uint8
      • ENV variables:
        • CM_DATASET_DATA_TYPE: uint8
        • CM_DATASET_DATA_TYPE_INPUT: float32
        • CM_DATASET_QUANTIZE: 1
        • CM_DATASET_CONVERT_TO_UNSIGNED: 1
  • Group "preprocessing-source"

    • _generic-preprocessor
      • ENV variables:
        • CM_DATASET_REFERENCE_PREPROCESSOR: 0
    • _mlcommons-reference-preprocessor (default)
      • ENV variables:
        • CM_DATASET_REFERENCE_PREPROCESSOR: 1
  • Group "resolution"

    • _resolution.#
      • ENV variables:
        • CM_DATASET_INPUT_SQUARE_SIDE: #
    • _resolution.224 (default)
      • ENV variables:
        • CM_DATASET_INPUT_SQUARE_SIDE: 224
  • Group "size"

    • _1
      • ENV variables:
        • CM_DATASET_SIZE: 1
    • _500
      • ENV variables:
        • CM_DATASET_SIZE: 500
    • _full
      • ENV variables:
        • CM_DATASET_SIZE: 50000
    • _size.#
      • ENV variables:
        • CM_DATASET_SIZE: #
Default variations

_NCHW,_mlcommons-reference-preprocessor,_resolution.224,_validation
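
Each variation listed above contributes a set of ENV variables. The sketch below illustrates one way to think about the result of selecting variations. It assumes that at most one variation is taken from each named group and that a group's default applies when nothing from that group is selected. This is an illustration built on a subset of the table, not the CM resolver itself, and `resolve_env` is a hypothetical name.

```python
# Subset of the variation table above: name -> (group, ENV updates).
VARIATIONS = {
    "_NCHW": ("layout", {"CM_DATASET_DATA_LAYOUT": "NCHW"}),
    "_NHWC": ("layout", {"CM_DATASET_DATA_LAYOUT": "NHWC"}),
    "_validation": ("dataset-type", {"CM_DATASET_TYPE": "validation"}),
    "_calibration": ("dataset-type", {"CM_DATASET_TYPE": "calibration"}),
    "_float32": ("precision", {
        "CM_DATASET_DATA_TYPE": "float32",
        "CM_DATASET_QUANTIZE": "0",
        "CM_DATASET_CONVERT_TO_UNSIGNED": "0",
    }),
    "_uint8": ("precision", {
        "CM_DATASET_DATA_TYPE": "uint8",
        "CM_DATASET_DATA_TYPE_INPUT": "float32",
        "CM_DATASET_QUANTIZE": "1",
        "CM_DATASET_CONVERT_TO_UNSIGNED": "1",
    }),
}

# Subset of the documented default variations.
DEFAULT_VARIATIONS = ("_NCHW", "_validation")


def resolve_env(selected):
    """Merge ENV updates, assuming one variation per group (an assumption)."""
    chosen = {}  # group -> variation name
    for name in selected:
        group, _ = VARIATIONS[name]
        if group in chosen:
            raise ValueError(
                f"{name} conflicts with {chosen[group]} in group {group!r}")
        chosen[group] = name
    # Fill in defaults for groups the caller did not touch.
    for name in DEFAULT_VARIATIONS:
        chosen.setdefault(VARIATIONS[name][0], name)
    env = {}
    for name in chosen.values():
        env.update(VARIATIONS[name][1])
    return env


print(resolve_env(["_NHWC", "_uint8"]))
```

In this sketch, selecting `_NHWC` replaces the default `_NCHW`, while the `dataset-type` group falls back to `_validation`.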

Script flags mapped to environment

  • --dir=value → CM_DATASET_PREPROCESSED_PATH=value
  • --imagenet_path=value → CM_IMAGENET_PATH=value
  • --imagenet_preprocessed_path=value → CM_IMAGENET_PREPROCESSED_PATH=value
  • --threads=value → CM_NUM_PREPROCESS_THREADS=value
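
The mapping above can be read as a simple lookup from flag name to environment key. The following sketch shows that translation for a list of command-line arguments. The function `flags_to_env` and the example path are hypothetical, and the real CM argument parsing may differ.

```python
# Flag name -> environment key, taken from the list above.
FLAG_TO_ENV = {
    "dir": "CM_DATASET_PREPROCESSED_PATH",
    "imagenet_path": "CM_IMAGENET_PATH",
    "imagenet_preprocessed_path": "CM_IMAGENET_PREPROCESSED_PATH",
    "threads": "CM_NUM_PREPROCESS_THREADS",
}


def flags_to_env(argv):
    """Translate `--flag=value` arguments into environment entries.

    Arguments that are not known script flags are ignored in this sketch.
    """
    env = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            continue
        flag, value = arg[2:].split("=", 1)
        if flag in FLAG_TO_ENV:
            env[FLAG_TO_ENV[flag]] = value
    return env


# The path below is a placeholder, not a location the script expects.
print(flags_to_env(["--threads=8", "--imagenet_path=/data/imagenet"]))
```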

Default environment

These keys can be updated via --env.KEY=VALUE, via the env dictionary in @input.json, or via script flags.

  • CM_DATASET_CROP_FACTOR: 87.5
  • CM_DATASET_DATA_TYPE: float32
  • CM_DATASET_DATA_LAYOUT: NCHW
  • CM_DATASET_QUANT_SCALE: 1
  • CM_DATASET_QUANTIZE: 0
  • CM_DATASET_QUANT_OFFSET: 0
  • CM_DATASET_PREPROCESSED_EXTENSION: npy
  • CM_DATASET_CONVERT_TO_UNSIGNED: 0
  • CM_DATASET_REFERENCE_PREPROCESSOR: 1
  • CM_PREPROCESS_VGG: yes
  • CM_MODEL: resnet50
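
To illustrate the `--env.KEY=VALUE` form, the sketch below overlays such arguments on the default environment listed above. It assumes that an override simply replaces the default value for that key. `apply_env_overrides` is a hypothetical helper, not a CM function.

```python
# Default environment, copied from the list above (values kept as strings).
DEFAULT_ENV = {
    "CM_DATASET_CROP_FACTOR": "87.5",
    "CM_DATASET_DATA_TYPE": "float32",
    "CM_DATASET_DATA_LAYOUT": "NCHW",
    "CM_DATASET_QUANT_SCALE": "1",
    "CM_DATASET_QUANTIZE": "0",
    "CM_DATASET_QUANT_OFFSET": "0",
    "CM_DATASET_PREPROCESSED_EXTENSION": "npy",
    "CM_DATASET_CONVERT_TO_UNSIGNED": "0",
    "CM_DATASET_REFERENCE_PREPROCESSOR": "1",
    "CM_PREPROCESS_VGG": "yes",
    "CM_MODEL": "resnet50",
}

PREFIX = "--env."


def apply_env_overrides(argv, defaults=DEFAULT_ENV):
    """Return a copy of the defaults with `--env.KEY=VALUE` overrides applied."""
    env = dict(defaults)
    for arg in argv:
        if arg.startswith(PREFIX) and "=" in arg:
            key, value = arg[len(PREFIX):].split("=", 1)
            env[key] = value
    return env


env = apply_env_overrides(["--env.CM_DATASET_DATA_LAYOUT=NHWC"])
print(env["CM_DATASET_DATA_LAYOUT"])
```

Keys that are not overridden keep their default values, and `DEFAULT_ENV` itself is left unchanged.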

Native script being run


Script output

cmr "get dataset imagenet ILSVRC image-classification preprocessed [variations]" [--input_flags] -j