Set Up the Environment

To set up your environment for Model Zoo, follow the instructions in our getting started guide.
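As a rough sketch (the exact package names and versions for your release are listed in the getting started guide, so treat these commands as illustrative rather than authoritative), a typical setup creates a Python virtual environment, clones the Model Zoo repository, and installs its requirements:

# create and activate a virtual environment
python3 -m venv venv_cerebras
source venv_cerebras/bin/activate

# clone the Model Zoo and install its dependencies
git clone https://github.com/Cerebras/modelzoo.git
pip install -r modelzoo/requirements.txt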

Running an Existing Model

To execute an existing model from the Cerebras Model Zoo, follow these steps:

1. Use the CLI to query the registry and display the supported models:

cszoo model list

2. Find the model implementation. For example, to locate the path for GPT-2:

cszoo model describe gpt2 --field path

To view the model's full record, including its path, available configs, and data processors, omit the --field flag:

cszoo model describe gpt2

╒════════════════╤═════════════════════════════════════════════════════════════════════════════════════════╕
│ Name           │ gpt2                                                                                    │
├────────────────┼─────────────────────────────────────────────────────────────────────────────────────────┤
│ Path           │ <modelzoo path>/modelzoo/models/nlp/gpt2                                                │
├────────────────┼─────────────────────────────────────────────────────────────────────────────────────────┤
│ Configs        │ gpt2_medium_reference                                                                   │
│                │ gpt2_small                                                                              │
│                │ gpt2_tiny                                                                               │
│                │ gpt2_medium_lora_a10                                                                    │
│                │ gpt2_small_reference                                                                    │
│                │ gpt2_large_lora                                                                         │
│                │ gpt2_tiny_synthetic                                                                     │
│                │ gpt2_small_bs1024                                                                       │
│                │ gpt2_medium_lora                                                                        │
│                │ gpt2_large_reference                                                                    │
├────────────────┼─────────────────────────────────────────────────────────────────────────────────────────┤
│ Dataprocessors │ Gpt2SyntheticDataProcessor                                                              │
│                │ GptTextDataProcessor                                                                    │
│                │ DummyDataProcessor                                                                      │
│                │ DummyIterableDataProcessor                                                              │
│                │ GptHDF5DataProcessor                                                                    │
│                │ GptHDF5MapDataProcessor                                                                 │
│                │ HuggingFaceDataProcessorEli5                                                            │
│                │ HuggingFaceIterableDataProcessorEli5                                                    │
╘════════════════╧═════════════════════════════════════════════════════════════════════════════════════════╛

3. Determine which YAML file to use for the model’s parameters. The YAML configurations are located in the configs/ directory within the model’s folder. For example, GPT-2’s YAML files can be found at:

<modelzoo path>/modelzoo/models/nlp/gpt2/configs/

4. Execute the run.py script, supplying the appropriate YAML file as an argument.

python /path/to/run.py --params /path/to/config.yaml 
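A fuller launch on a Cerebras Wafer-Scale Cluster usually includes a target device and a few additional arguments. The sketch below assumes the standard Model Zoo launcher flags (--mode, --num_csx, --model_dir); run python run.py --help to confirm the options in your release, and note that on a cluster you may also need --mount_dirs and --python_paths so the appliance can see your code and data.

# example: train GPT-2 on a Cerebras system (CSX) with one of its reference configs
python <modelzoo path>/modelzoo/models/nlp/gpt2/run.py CSX \
    --params <modelzoo path>/modelzoo/models/nlp/gpt2/configs/<config>.yaml \
    --mode train \
    --num_csx 1 \
    --model_dir model_dir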

Editing Configurations

If you want to modify how a run is configured for a specific Model Zoo model, make sure you have first cloned the Model Zoo repository so that you have write access to the YAML files. All reference configuration files in Model Zoo are located in each model’s configs/ directory.
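In practice, it is often easiest to copy a reference YAML and edit the copy rather than changing the original (illustrative paths; adjust them to your checkout):

# copy a reference config, edit it, then pass the copy to run.py
cp <modelzoo path>/modelzoo/models/nlp/gpt2/configs/<config>.yaml my_gpt2_config.yaml
python <modelzoo path>/modelzoo/models/nlp/gpt2/run.py CSX --params my_gpt2_config.yaml --mode train --model_dir model_dir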

Querying Additional Components

To see which data processors are provided in the reference examples, you can use the CLI:

cszoo data_processor list
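Assuming the data_processor group supports the same describe subcommand as the model group (only list is shown above, so treat this as an assumption), you can inspect an individual processor in the same way:

cszoo data_processor describe GptHDF5MapDataProcessor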

Use Config Classes with an Existing Model

Every model in the Model Zoo is paired with a corresponding Config class. When a Config class is associated with a model, the configuration is validated automatically in the backend, so no additional action is required from you. For a deeper understanding of Config classes, see Model Zoo Config Classes.

Register a Component

To register a custom data processor (also known as a dataloader) or model, see Model Zoo Registry.

Evaluating Your Model

To effectively evaluate your model during and after training, follow these guides:

Evaluating During Training

For insights on assessing your model’s performance throughout the training process, visit the run-model/eval guide in the Cerebras Developer Documentation. This resource provides comprehensive information on the steps and settings required to evaluate your model during training on the Wafer-Scale Cluster (WSC).
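As a brief hedged example, evaluation is typically launched with the same run.py script by switching the mode and pointing it at a saved checkpoint (flag names assume the standard Model Zoo launcher; the guide above is authoritative):

# example: evaluate a trained checkpoint on the eval_input dataset
python <modelzoo path>/modelzoo/models/nlp/gpt2/run.py CSX \
    --params <modelzoo path>/modelzoo/models/nlp/gpt2/configs/<config>.yaml \
    --mode eval \
    --checkpoint_path model_dir/checkpoint_<step>.mdl \
    --model_dir model_dir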

Using EleutherAI’s Evaluation Harness

If you’re working with Large Language Models (LLMs) in the Model Zoo, you might want to leverage EleutherAI’s Evaluation Harness (EEH) for a more in-depth evaluation. Our guide on downstream validation using EEH offers detailed instructions on how to prepare your data and set up EEH for evaluating LLMs. This tool provides a structured approach to assessing model performance across various benchmarks and tasks, facilitating a comprehensive evaluation of your LLM.

By following these guidelines, you can gain valuable insights into your model’s effectiveness, helping you make informed decisions for further model refinement and deployment.

Adding a New Dataset

To incorporate a new dataset for your model within the Cerebras Model Zoo, you’ll need to ensure it’s specified correctly in the model’s configuration and, for certain models like language models, converted into the appropriate format.

Specifying the Dataset in the YAML Configuration

1. Locate the YAML file

Identify the YAML configuration file associated with your model.

2. Update data_dir

Within the YAML file, under the train_input or eval_input section (depending on whether the dataset is for training or evaluation), specify the path to your dataset using the data_dir entry.
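For example, a minimal train_input section might look like the sketch below (keys other than data_dir, such as the data processor name and batch size, should follow the model’s reference config):

train_input:
    data_processor: GptHDF5MapDataProcessor
    data_dir: /path/to/your/train/dataset
    # keep the remaining keys (batch_size, shuffle, ...) from the reference config

eval_input:
    data_processor: GptHDF5MapDataProcessor
    data_dir: /path/to/your/eval/dataset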

Preparing Datasets for Language Models

Language models within the Cerebras ecosystem often require datasets in HDF5 format.

If your dataset isn’t already in this format, follow the steps in our Data Preprocessing Quickstart guide, or find more in-depth information in our Data Preprocessing guide.
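As an illustrative sketch only (the script location and flags vary between releases; the preprocessing guides above are authoritative), HDF5 conversion in recent Model Zoo releases is typically driven by a preprocessing script plus a small YAML describing the input data and tokenizer:

python <modelzoo path>/modelzoo/data_preparation/data_preprocessing/preprocess_data.py \
    --config /path/to/preprocess_config.yaml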

Utilizing Checkpoints

The Cerebras Model Zoo includes a “Checkpoint and Config Converter” tool, designed to facilitate the conversion of model implementations between the Model Zoo and other frameworks or repositories. This tool is particularly useful for migrating models into the Model Zoo environment or exporting them for use in different settings. It also lets you convert checkpoints created on previous Cerebras software releases into checkpoints compatible with newer releases.

Checkpoint conversion paths: between Cerebras and Hugging Face formats, and across Cerebras software versions.

To learn more about how to use this tool for converting checkpoints and model configurations, see the Checkpoint and Config Converter documentation. It provides detailed instructions on using the converter, ensuring a smooth transition between different environments.
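As a hedged example of a typical invocation (subcommand names, format identifiers, and the script path may differ between releases; consult the converter documentation for your version), converting a Hugging Face GPT-2 checkpoint to a Cerebras-compatible one might look like:

python <modelzoo path>/modelzoo/tools/convert_checkpoint.py convert \
    /path/to/hf_checkpoint.bin \
    --model gpt2 \
    --src-fmt hf \
    --tgt-fmt cs-<release> \
    --config /path/to/hf_config.json \
    --output-dir converted_checkpoint/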

Saving and Loading Checkpoints

Proper checkpoint management is crucial for efficiently training and evaluating models. For guidelines on saving and loading checkpoints within the Cerebras environment, consult the checkpointing documentation. This section offers comprehensive insights into checkpoint handling, including saving states during training and loading them for resuming training or evaluation.
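For instance, checkpoint cadence is typically set in the runconfig section of the params YAML, and a saved checkpoint can be passed back to run.py when resuming or evaluating (a sketch assuming the standard checkpoint_steps key and --checkpoint_path flag; see the checkpointing documentation for the full set of options):

runconfig:
    checkpoint_steps: 1000   # save a checkpoint every 1000 steps
    max_steps: 10000

To resume training or run evaluation from a saved state, pass the checkpoint to run.py with --checkpoint_path.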

By leveraging these resources, you can effectively manage model checkpoints in the Cerebras Model Zoo, enhancing your model development and experimentation workflows.