app-mlperf-inference-mlcommons-python

Automatically generated README for this automation recipe: app-mlperf-inference-mlcommons-python

Category: Modular MLPerf inference benchmark pipeline

License: Apache 2.0

Developers: Arjun Suresh, Thomas Zhu, Grigori Fursin

  • Notes from the authors, contributors and users: README-extra


This portable CM script is being developed by the MLCommons taskforce on automation and reproducibility to modularize the Python reference implementations of the MLPerf inference benchmark using the MLCommons CM automation meta-framework. The goal is to make it easier to run, optimize, and reproduce MLPerf benchmarks across diverse platforms with continuously changing software and hardware.

See the current coverage of different models, devices and backends here.

  • CM meta description for this script: _cm.yaml
  • Output cached? False

Reuse this script in your project

Install MLCommons CM automation meta-framework

Pull CM repository with this automation recipe (CM script)

cm pull repo mlcommons@cm4mlops

cmr "app vision language mlcommons mlperf inference reference ref" --help

Run this script

Run this script via CLI
cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref[,variations] [--input_flags]
Run this script via CLI (alternative)
cmr "app vision language mlcommons mlperf inference reference ref [variations]" [--input_flags]
Run this script from Python
import cmind

r = cmind.access({'action':'run',
              'automation':'script',
              'tags':'app,vision,language,mlcommons,mlperf,inference,reference,ref',
              'out':'con',
              # ... (other input keys for this script) ...
             })

if r['return']>0:
    print(r['error'])
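
CLI input flags map to keys of the same input dictionary; a minimal sketch, assuming the default ResNet50/ONNX Runtime/CPU variations and an illustrative test query count:

import cmind

# Input flags such as --scenario or --test_query_count become
# keys of the input dictionary (values here are illustrative).
r = cmind.access({'action':'run',
              'automation':'script',
              'tags':'app,vision,language,mlcommons,mlperf,inference,reference,ref,_resnet50,_onnxruntime,_cpu',
              'scenario':'Offline',
              'mode':'accuracy',
              'test_query_count':'10',
              'out':'con'})

if r['return']>0:
    print(r['error'])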
Run this script via Docker (beta)
cm docker script "app vision language mlcommons mlperf inference reference ref [variations]" [--input_flags]
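
For example (an illustrative invocation; Docker support is still in beta):

cm docker script "app vision language mlcommons mlperf inference reference ref _resnet50 _onnxruntime _cpu" --test_query_count=10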

Variations

  • No group (any combination of variations can be selected)

    • _3d-unet
      • ENV variables:
        • CM_TMP_IGNORE_MLPERF_QUERY_COUNT: True
        • CM_MLPERF_MODEL_SKIP_BATCHING: True
    • _beam_size.#
      • ENV variables:
        • GPTJ_BEAM_SIZE: #
    • _bert
      • ENV variables:
        • CM_MLPERF_MODEL_SKIP_BATCHING: True
    • _dlrm
      • ENV variables:
        • CM_MLPERF_MODEL_SKIP_BATCHING: True
    • _multistream
      • ENV variables:
        • CM_MLPERF_LOADGEN_SCENARIO: MultiStream
    • _offline
      • ENV variables:
        • CM_MLPERF_LOADGEN_SCENARIO: Offline
    • _r2.1_default
      • ENV variables:
        • CM_RERUN: yes
        • CM_SKIP_SYS_UTILS: yes
        • CM_TEST_QUERY_COUNT: 100
    • _server
      • ENV variables:
        • CM_MLPERF_LOADGEN_SCENARIO: Server
    • _singlestream
      • ENV variables:
        • CM_MLPERF_LOADGEN_SCENARIO: SingleStream
  • Group "batch-size"

    • _batch_size.#
      • ENV variables:
        • CM_MLPERF_LOADGEN_MAX_BATCHSIZE: #
  • Group "device"

    • _cpu (default)
      • ENV variables:
        • CM_MLPERF_DEVICE: cpu
        • CUDA_VISIBLE_DEVICES: ``
        • USE_CUDA: False
        • USE_GPU: False
    • _cuda
      • ENV variables:
        • CM_MLPERF_DEVICE: gpu
        • USE_CUDA: True
        • USE_GPU: True
    • _rocm
      • ENV variables:
        • CM_MLPERF_DEVICE: rocm
        • USE_GPU: True
    • _tpu
      • ENV variables:
        • CM_MLPERF_DEVICE: tpu
  • Group "framework"

    • _deepsparse
      • ENV variables:
        • CM_MLPERF_BACKEND: deepsparse
        • CM_MLPERF_BACKEND_VERSION: <<<CM_DEEPSPARSE_VERSION>>>
    • _ncnn
      • ENV variables:
        • CM_MLPERF_BACKEND: ncnn
        • CM_MLPERF_BACKEND_VERSION: <<<CM_NCNN_VERSION>>>
        • CM_MLPERF_VISION_DATASET_OPTION: imagenet_pytorch
    • _onnxruntime (default)
      • ENV variables:
        • CM_MLPERF_BACKEND: onnxruntime
    • _pytorch
      • ENV variables:
        • CM_MLPERF_BACKEND: pytorch
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TORCH_VERSION>>>
    • _ray
      • ENV variables:
        • CM_MLPERF_BACKEND: ray
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TORCH_VERSION>>>
    • _tf
      • Aliases: _tensorflow
      • ENV variables:
        • CM_MLPERF_BACKEND: tf
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TENSORFLOW_VERSION>>>
    • _tflite
      • ENV variables:
        • CM_MLPERF_BACKEND: tflite
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TFLITE_VERSION>>>
        • CM_MLPERF_VISION_DATASET_OPTION: imagenet_tflite_tpu
    • _tvm-onnx
      • ENV variables:
        • CM_MLPERF_BACKEND: tvm-onnx
        • CM_MLPERF_BACKEND_VERSION: <<<CM_ONNXRUNTIME_VERSION>>>
    • _tvm-pytorch
      • ENV variables:
        • CM_MLPERF_BACKEND: tvm-pytorch
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TORCH_VERSION>>>
        • CM_PREPROCESS_PYTORCH: yes
        • MLPERF_TVM_TORCH_QUANTIZED_ENGINE: qnnpack
    • _tvm-tflite
      • ENV variables:
        • CM_MLPERF_BACKEND: tvm-tflite
        • CM_MLPERF_BACKEND_VERSION: <<<CM_TVM-TFLITE_VERSION>>>
  • Group "implementation"

    • _python (default)
      • ENV variables:
        • CM_MLPERF_PYTHON: yes
        • CM_MLPERF_IMPLEMENTATION: reference
  • Group "models"

    • _3d-unet-99
      • ENV variables:
        • CM_MODEL: 3d-unet-99
    • _3d-unet-99.9
      • ENV variables:
        • CM_MODEL: 3d-unet-99.9
    • _bert-99
      • ENV variables:
        • CM_MODEL: bert-99
    • _bert-99.9
      • ENV variables:
        • CM_MODEL: bert-99.9
    • _dlrm-99
      • ENV variables:
        • CM_MODEL: dlrm-99
    • _dlrm-99.9
      • ENV variables:
        • CM_MODEL: dlrm-99.9
    • _gptj-99
      • ENV variables:
        • CM_MODEL: gptj-99
    • _gptj-99.9
      • ENV variables:
        • CM_MODEL: gptj-99.9
    • _llama2-70b-99
      • ENV variables:
        • CM_MODEL: llama2-70b-99
    • _llama2-70b-99.9
      • ENV variables:
        • CM_MODEL: llama2-70b-99.9
    • _resnet50 (default)
      • ENV variables:
        • CM_MODEL: resnet50
        • CM_MLPERF_USE_MLCOMMONS_RUN_SCRIPT: yes
    • _retinanet
      • ENV variables:
        • CM_MODEL: retinanet
        • CM_MLPERF_USE_MLCOMMONS_RUN_SCRIPT: yes
        • CM_MLPERF_LOADGEN_MAX_BATCHSIZE: 1
    • _rnnt
      • ENV variables:
        • CM_MODEL: rnnt
        • CM_MLPERF_MODEL_SKIP_BATCHING: True
        • CM_TMP_IGNORE_MLPERF_QUERY_COUNT: True
    • _sdxl
      • ENV variables:
        • CM_MODEL: stable-diffusion-xl
        • CM_NUM_THREADS: 1
  • Group "network"

    • _network-lon
      • ENV variables:
        • CM_NETWORK_LOADGEN: lon
        • CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX1: network_loadgen
    • _network-sut
      • ENV variables:
        • CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX1: network_sut
        • CM_NETWORK_LOADGEN: sut
  • Group "precision"

    • _bfloat16
      • ENV variables:
        • CM_MLPERF_QUANTIZATION: False
        • CM_MLPERF_MODEL_PRECISION: bfloat16
    • _float16
      • ENV variables:
        • CM_MLPERF_QUANTIZATION: False
        • CM_MLPERF_MODEL_PRECISION: float16
    • _fp32 (default)
      • ENV variables:
        • CM_MLPERF_QUANTIZATION: False
        • CM_MLPERF_MODEL_PRECISION: float32
    • _int8
      • Aliases: _quantized
      • ENV variables:
        • CM_MLPERF_QUANTIZATION: True
        • CM_MLPERF_MODEL_PRECISION: int8
Default variations

_cpu,_fp32,_onnxruntime,_python,_resnet50
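
Variations from different groups can be combined; for example, a hypothetical BERT-99 run with PyTorch on CUDA (variation names taken from the groups above):

cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref,_bert-99,_pytorch,_cuda,_offline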

Script flags mapped to environment

  • --clean=value → CM_MLPERF_CLEAN_SUBMISSION_DIR=value
  • --count=value → CM_MLPERF_LOADGEN_QUERY_COUNT=value
  • --dataset=value → CM_MLPERF_VISION_DATASET_OPTION=value
  • --dataset_args=value → CM_MLPERF_EXTRA_DATASET_ARGS=value
  • --docker=value → CM_RUN_DOCKER_CONTAINER=value
  • --hw_name=value → CM_HW_NAME=value
  • --imagenet_path=value → IMAGENET_PATH=value
  • --max_amps=value → CM_MLPERF_POWER_MAX_AMPS=value
  • --max_batchsize=value → CM_MLPERF_LOADGEN_MAX_BATCHSIZE=value
  • --max_volts=value → CM_MLPERF_POWER_MAX_VOLTS=value
  • --mode=value → CM_MLPERF_LOADGEN_MODE=value
  • --model=value → CM_MLPERF_CUSTOM_MODEL_PATH=value
  • --multistream_target_latency=value → CM_MLPERF_LOADGEN_MULTISTREAM_TARGET_LATENCY=value
  • --network=value → CM_NETWORK_LOADGEN=value
  • --ntp_server=value → CM_MLPERF_POWER_NTP_SERVER=value
  • --num_threads=value → CM_NUM_THREADS=value
  • --offline_target_qps=value → CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS=value
  • --output_dir=value → OUTPUT_BASE_DIR=value
  • --power=value → CM_MLPERF_POWER=value
  • --power_server=value → CM_MLPERF_POWER_SERVER_ADDRESS=value
  • --regenerate_files=value → CM_REGENERATE_MEASURE_FILES=value
  • --rerun=value → CM_RERUN=value
  • --scenario=value → CM_MLPERF_LOADGEN_SCENARIO=value
  • --server_target_qps=value → CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=value
  • --singlestream_target_latency=value → CM_MLPERF_LOADGEN_SINGLESTREAM_TARGET_LATENCY=value
  • --sut_servers=value → CM_NETWORK_LOADGEN_SUT_SERVERS=value
  • --target_latency=value → CM_MLPERF_LOADGEN_TARGET_LATENCY=value
  • --target_qps=value → CM_MLPERF_LOADGEN_TARGET_QPS=value
  • --test_query_count=value → CM_TEST_QUERY_COUNT=value
  • --threads=value → CM_NUM_THREADS=value
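
Each flag sets the corresponding environment variable before the run; for example, the following hypothetical invocation sets CM_MLPERF_LOADGEN_SCENARIO=Server and CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=100:

cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref --scenario=Server --server_target_qps=100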

Default environment

These keys can be updated via --env.KEY=VALUE, via the env dictionary in @input.json, or via the script flags listed above.

  • CM_MLPERF_LOADGEN_MODE: accuracy
  • CM_MLPERF_LOADGEN_SCENARIO: Offline
  • CM_OUTPUT_FOLDER_NAME: test_results
  • CM_MLPERF_RUN_STYLE: test
  • CM_TEST_QUERY_COUNT: 10
  • CM_MLPERF_QUANTIZATION: False
  • CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX: reference
  • CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX: ``
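
For example, to override the default test query count from the command line (the value is illustrative):

cm run script --tags=app,vision,language,mlcommons,mlperf,inference,reference,ref --env.CM_TEST_QUERY_COUNT=50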

Script output

cmr "app vision language mlcommons mlperf inference reference ref [variations]" [--input_flags] -j