This guide uses an example from Spring 2023 with Cerebras-GPT, which uses Cerebras Model Zoo release 1.8 (cs-1.8) as the source format.
Procedure
- To use the conversion tool, activate the Cerebras virtual environment and specify the following flags (an example invocation follows the table):
| Flag | Description |
| --- | --- |
| `--model gpt2` | Model architecture the checkpoint corresponds to |
| `--src-fmt cs-1.8` | Source format of the checkpoint, corresponding to Cerebras Model Zoo (R1.8) |
| `--tgt-fmt hf` | Target format of the checkpoint, corresponding to Hugging Face |
| `--config custom_config_GPT111M.yaml` | YAML configuration file used for training the model |
| `--output-dir hf_dir_train_from_scratch_GPT111M` | Directory containing the output configuration and checkpoint |
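A minimal sketch of the conversion command, assuming the Model Zoo R1.8 converter script at `modelzoo/common/pytorch/model_utils/convert_checkpoint.py` and a checkpoint file named `checkpoint_train_from_scratch_GPT111M.mdl` (script path, argument order, and checkpoint name are illustrative; adjust to your installation):

```bash
# Sketch only: script path and checkpoint filename are illustrative.
source venv_cerebras/bin/activate   # Cerebras virtual environment
python modelzoo/common/pytorch/model_utils/convert_checkpoint.py convert \
    checkpoint_train_from_scratch_GPT111M.mdl \
    --model gpt2 \
    --src-fmt cs-1.8 \
    --tgt-fmt hf \
    --config custom_config_GPT111M.yaml \
    --output-dir hf_dir_train_from_scratch_GPT111M
```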
- Convert the checkpoint obtained from fine-tuning “Cerebras-GPT 111M”, located in the model directory finetune_GPT111M.
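The same command pattern applies to the fine-tuned checkpoint; the checkpoint filename, configuration file, and output directory below are illustrative:

```bash
# Sketch only: filenames and output directory are illustrative.
python modelzoo/common/pytorch/model_utils/convert_checkpoint.py convert \
    finetune_GPT111M/checkpoint_finetune_GPT111M.mdl \
    --model gpt2 --src-fmt cs-1.8 --tgt-fmt hf \
    --config config_finetune_GPT111M.yaml \
    --output-dir hf_dir_finetune_GPT111M
```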
- To facilitate importing the model in Hugging Face, rename the configuration file so that its name includes gpt2.
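For example, a rename along these lines (both filenames are hypothetical; substitute the configuration file the converter actually produced):

```bash
# Hypothetical filenames: use the configuration file written by the converter.
mv hf_dir_train_from_scratch_GPT111M/config_GPT111M.json \
   hf_dir_train_from_scratch_GPT111M/config_gpt2_GPT111M.json
```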
- Create a Python virtual environment to use Hugging Face.
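A minimal sketch, assuming a virtual environment named hf_env and the transformers and torch packages (the environment name is illustrative):

```bash
# Create and activate a fresh environment for Hugging Face, then install dependencies.
python -m venv hf_env
source hf_env/bin/activate
pip install transformers torch
```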
- Once you have set up the virtual environment, you can generate outputs using Hugging Face. The tokenizer is available from Cerebras-GPT-111M on Hugging Face. Here is an example using the model trained from scratch:
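A minimal sketch, assuming the converted checkpoint and configuration live in `hf_dir_train_from_scratch_GPT111M` and the tokenizer is pulled from the published Cerebras-GPT-111M repository on the Hugging Face Hub; the prompt and generation settings are illustrative:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Tokenizer from the Cerebras-GPT-111M model published on the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("cerebras/Cerebras-GPT-111M")

# Converted checkpoint and configuration produced by the conversion tool above.
model = AutoModelForCausalLM.from_pretrained("hf_dir_train_from_scratch_GPT111M")

prompt = "Generative AI is "
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation from the model trained from scratch.
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```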