Model Zoo CLI Overview
Learn how to use the ModelZoo CLI.
Overview
The ModelZoo CLI is a comprehensive command-line interface that serves as a single entry point for all ModelZoo-related tasks. This tool streamlines various machine learning workflows, from data preprocessing to model training and validation.
Commands
Below is a list of commands that can be used with the ModelZoo CLI tool, each with a short description and an example where available.
fit
Trains a model using the specified configuration.
Example: cszoo fit params_model.yaml
validate
Validates a model using the specified configuration.
Example: cszoo validate params_model.yaml
validate_all
Runs upstream and downstream validation
Example: cszoo validate_all params_model.yaml
checkpoint
Checkpoint converter
convert
Converts a checkpoint between Hugging Face and Cerebras formats, or between different Cerebras formats.
Example: cszoo checkpoint convert --model gpt2 --src-fmt cs-auto --tgt-fmt hf --config workdir/params_gpt_tiny.yaml model_dir/checkpoint.mdl
convert-config
Converts a configuration file between Hugging Face and Cerebras formats, or between different Cerebras formats.
Example: cszoo checkpoint convert-config --model gpt2 --src-fmt cs-auto --tgt-fmt hf workdir/params_gpt_tiny.yaml
list
Lists all available checkpoint converters. Can also list all checkpoint converters for a specified model.
Example: cszoo checkpoint list
diff
Compares two checkpoints to identify differences.
Example: cszoo checkpoint diff checkpoint_a.mdl checkpoint_b.mdl
model
Queries information about ModelZoo models.
list
Displays all supported models.
Example: cszoo model list
info
Shows detailed model information.
Example: cszoo model info gpt2
describe
Displays model configuration parameters.
Example: cszoo model describe gpt2
init_checkpoint
Creates an initial model checkpoint.
Example: cszoo model init_checkpoint <model_name>
data_preprocess
Preprocesses data.
list
Shows available preprocessing configurations.
Example: cszoo data_preprocess list
pull
Copies a preprocessing configuration file to a local directory.
Example: cszoo data_preprocess pull summarization_preprocessing -o workdir
run
Executes preprocessing using the specified configuration.
Example: cszoo data_preprocess run --config preprocessing.yaml
data_processor
Queries information about ModelZoo data processors.
list
Shows available data processors.
Example: cszoo data_processor list
info
Displays data processor information.
Example: cszoo data_processor info GptHDF5DataProcessor
describe
Shows processor configuration parameters.
Example: cszoo data_processor describe GptHDF5DataProcessor
benchmark
Benchmarks a specified dataloader.
Example: cszoo data_processor benchmark params.yaml
config
Saves and manages model config files.
pull
Copies a model config file to a local directory.
Example: cszoo config pull gpt2_tiny -o workdir
validate
Validates a specified config file.
Example: cszoo config validate params.yaml
convert_legacy
Upgrades V1 config files to V2 YAML.
Example: cszoo config convert_legacy old_config.yaml
stats
Retrieves relevant statistics for a model using the specified configuration file.
Example: cszoo config stats params.yaml
Example Workflow: Pretraining a model using the ModelZoo CLI
This workflow walks through pretraining a model with the Cerebras ModelZoo CLI: creating a working directory, preprocessing data, running pretraining, and converting the resulting checkpoint.
Prerequisite: Before proceeding with the steps below, ensure that you have completed the ModelZoo setup and installation guide.
Create model directory
Create a directory to store all the files for this pretraining workflow and copy the necessary configuration files.
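The commands for this step are not shown on this page. The following is a minimal sketch, assuming a working directory named workdir and reusing the config names from the config pull and data_preprocess pull examples in the command list; the names are illustrative and may not match the configs intended for this workflow.

```shell
# Assumption: "workdir" is an arbitrary directory name chosen for this walkthrough.
WORKDIR=workdir
mkdir -p "$WORKDIR"

# Pull a model config and a preprocessing config into the working directory.
# Both config names come from the command examples on this page; substitute
# the ones that match your model and dataset.
if command -v cszoo >/dev/null 2>&1; then
  cszoo config pull gpt2_tiny -o "$WORKDIR"
  cszoo data_preprocess pull summarization_preprocessing -o "$WORKDIR"
else
  echo "cszoo not found on PATH; complete the installation first" >&2
fi
```

The guard around cszoo only keeps the snippet from failing on a machine where the CLI is not installed; in a configured environment the two pull commands are the relevant part.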
Preprocess the data
Preprocess the training and validation datasets using the provided configuration files.
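As a rough sketch of this step, the loop below runs data_preprocess run once per dataset split. The two file names are hypothetical placeholders (the actual config names are not given here); only the `--config` flag is taken from the command list.

```shell
# Assumption: these two file names are hypothetical; use the preprocessing
# configs you pulled or wrote for your training and validation data.
for cfg in workdir/train_preprocessing.yaml workdir/valid_preprocessing.yaml; do
  if [ -f "$cfg" ] && command -v cszoo >/dev/null 2>&1; then
    # Run preprocessing for this split using its config file.
    cszoo data_preprocess run --config "$cfg"
  else
    echo "skipping $cfg (config file or cszoo missing)" >&2
  fi
done
```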
Run model
Run the pretraining process using the provided configuration.
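A minimal sketch of this step, assuming the model config lives at workdir/params_gpt_tiny.yaml (the path used in the checkpoint convert example). Validating the config first is optional and is shown here only as a precaution.

```shell
# Assumption: the config path matches the one used in the checkpoint convert
# example on this page; adjust it to the file you actually pulled.
PARAMS=workdir/params_gpt_tiny.yaml

if [ -f "$PARAMS" ] && command -v cszoo >/dev/null 2>&1; then
  # Check the config, and start training only if validation succeeds.
  cszoo config validate "$PARAMS" && cszoo fit "$PARAMS"
else
  echo "skipping: $PARAMS or cszoo not found" >&2
fi
```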
Convert checkpoint to Hugging Face
Convert the trained model checkpoint into a Hugging Face-compatible format.
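This step can be sketched with the checkpoint convert example from the command list. The model name, formats, and paths below are copied from that example and are assumptions about this workflow; point them at your own config and checkpoint.

```shell
# Assumption: paths and model name are taken from the checkpoint convert
# example on this page, not from an actual training run.
CONFIG=workdir/params_gpt_tiny.yaml
CHECKPOINT=model_dir/checkpoint.mdl

if [ -f "$CONFIG" ] && [ -f "$CHECKPOINT" ] && command -v cszoo >/dev/null 2>&1; then
  # cs-auto is the source format and hf the target format, as in the example.
  cszoo checkpoint convert --model gpt2 --src-fmt cs-auto --tgt-fmt hf \
    --config "$CONFIG" "$CHECKPOINT"
else
  echo "skipping: config, checkpoint, or cszoo not found" >&2
fi
```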
Getting Help
For detailed information about any command, append the --help flag to it.
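For instance, the helper below appends `--help` to a few of the subcommands listed on this page. The `show_help` function is a hypothetical convenience wrapper, not part of the CLI; calling `cszoo <subcommand> --help` directly is equivalent.

```shell
# Hypothetical helper: prints help for the given cszoo subcommand, or shows
# the command it would run if cszoo is not installed.
show_help() {
  if command -v cszoo >/dev/null 2>&1; then
    cszoo "$@" --help
  else
    echo "would run: cszoo $* --help"
  fi
}

show_help fit
show_help checkpoint convert
show_help data_preprocess run
```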
CSZoo Assistant
Need help? CSZoo Assistant is an LLM agent you can access from the command line with the assistant subcommand.
Use it to:
- Ask questions: cszoo assistant "what is the checkpoint converter?"
- Perform actions: cszoo assistant "convert my checkpoint from huggingface to cerebras"
CSZoo Assistant will always ask your permission before running a command.
Access to the Cerebras Inference API is required and you’ll need to provide your API key with the following command:
export CEREBRAS_API_KEY=<your api key>
If you don’t have an API key, you will need to obtain one for the Cerebras Inference API first.
CSZoo Assistant is a beta feature and it may make mistakes. Always double-check its reasoning and be aware of the following limitations:
- CSZoo Assistant can currently only access the help manuals found with cszoo ... -h.
- There are currently no advanced context length management mechanisms in place. The assistant will error out if it overflows the context length.