get-preprocessed-dataset-criteo

Automatically generated README for this automation recipe: get-preprocessed-dataset-criteo

Category: AI/ML datasets

License: Apache 2.0

  • Notes from the authors, contributors and users: README-extra

  • CM meta description for this script: _cm.json

  • Output cached? True

Reuse this script in your project

Install MLCommons CM automation meta-framework

Pull CM repository with this automation recipe (CM script)

cm pull repo mlcommons@cm4mlops

cmr "get dataset criteo recommendation dlrm preprocessed" --help

Run this script

Run this script via CLI
cm run script --tags=get,dataset,criteo,recommendation,dlrm,preprocessed[,variations] [--input_flags]
Run this script via CLI (alternative)
cmr "get dataset criteo recommendation dlrm preprocessed [variations]" [--input_flags]
Run this script from Python
import cmind

r = cmind.access({'action':'run',
              'automation':'script',
              'tags':'get,dataset,criteo,recommendation,dlrm,preprocessed',
              'out':'con',
              # ...
              # (other input keys for this script)
              # ...
             })

# A non-zero 'return' code signals failure; 'error' holds the message.
if r['return']>0:
    print(r['error'])
Run this script via Docker (beta)
cm docker script "get dataset criteo recommendation dlrm preprocessed [variations]" [--input_flags]

Variations

  • No group (any combination of variations can be selected)

    • _1
      • ENV variables:
        • CM_DATASET_SIZE: 1
    • _50
      • ENV variables:
        • CM_DATASET_SIZE: 50
    • _fake
      • ENV variables:
        • CM_CRITEO_FAKE: yes
    • _full
    • _validation
  • Group "type"

    • _multihot (default)
      • ENV variables:
        • CM_DATASET_CRITEO_MULTIHOT: yes
Default variations

_multihot
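
Variations are selected by appending their names to the tag list, as in the `[,variations]` placeholder of the CLI synopsis. The sketch below shows one way to assemble that tag string in Python. The helper `build_tags` and the `KNOWN_VARIATIONS` set are hypothetical names introduced for illustration; they are not part of CM, and the set simply copies the variation names listed above.

```python
# Base tags for this script, copied from the CLI synopsis.
BASE_TAGS = "get,dataset,criteo,recommendation,dlrm,preprocessed"

# Variation names as listed in this README (illustrative copy, not read from CM).
KNOWN_VARIATIONS = {"_1", "_50", "_fake", "_full", "_validation", "_multihot"}


def build_tags(*variations):
    """Return the base tags with the given variation names appended.

    Hypothetical helper: rejects names that are not listed in this README.
    """
    unknown = [v for v in variations if v not in KNOWN_VARIATIONS]
    if unknown:
        raise ValueError(f"unknown variation(s): {unknown}")
    return ",".join([BASE_TAGS, *variations])


# Example: select the 50-size variation together with the default type.
tags = build_tags("_50", "_multihot")
print(tags)
```

The resulting string could then be used as the value of the `'tags'` key in the `cmind.access` call, or after `--tags=` on the command line.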

Script flags mapped to environment

  • --dir=value → CM_DATASET_PREPROCESSED_PATH=value
  • --output_dir=value → CM_DATASET_PREPROCESSED_OUTPUT_PATH=value
  • --threads=value → CM_NUM_PREPROCESS_THREADS=value
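
To make the flag-to-environment mapping concrete, here is a minimal sketch that restates the three entries above as a Python dictionary and translates flag values into environment variables. CM performs this mapping itself; the `FLAG_TO_ENV` table and the `flags_to_env` helper are hypothetical names used only for illustration, and the path in the example is a placeholder.

```python
# Flag name -> environment variable, copied from the list above.
FLAG_TO_ENV = {
    "dir": "CM_DATASET_PREPROCESSED_PATH",
    "output_dir": "CM_DATASET_PREPROCESSED_OUTPUT_PATH",
    "threads": "CM_NUM_PREPROCESS_THREADS",
}


def flags_to_env(**flags):
    """Translate script flags into the environment variables they map to.

    Hypothetical helper: values are stringified, unknown flags are rejected.
    """
    env = {}
    for name, value in flags.items():
        if name not in FLAG_TO_ENV:
            raise KeyError(f"flag --{name} is not mapped to an environment variable")
        env[FLAG_TO_ENV[name]] = str(value)
    return env


# Example with placeholder values: a hypothetical dataset path and 8 threads.
env = flags_to_env(dir="/path/to/criteo", threads=8)
for key, value in sorted(env.items()):
    print(f"{key}={value}")
```

For instance, passing `--threads=8` on the command line would correspond to `CM_NUM_PREPROCESS_THREADS=8` in the script's environment.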

Native script being run

No run file exists for Windows


Script output

cmr "get dataset criteo recommendation dlrm preprocessed [variations]" [--input_flags] -j