Model Description

Jais is a family of decoder-only transformer models developed collaboratively by Inception, MBZUAI, and Cerebras. These models are trained from scratch on 395B tokens of high-quality Arabic, English, and code data, with an emphasis on cross-lingual and code-switching capabilities.

Architecturally, Jais models follow the standard transformer decoder design and adopt several enhancements: ALiBi positional encodings (which allow extrapolation to sequence lengths longer than those seen in training), SwiGLU activations in the feed-forward blocks, and maximal update parametrization (muP). The tokenizer is a custom byte-pair-encoding (BPE) vocabulary of 84,992 tokens, built for balanced coverage of Arabic and English.
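To make the two less familiar pieces concrete, the sketch below shows a SwiGLU feed-forward block and an ALiBi attention bias in plain PyTorch. This is a minimal illustration, not the ModelZoo implementation; the names (SwiGLUFeedForward, alibi_bias, d_model, d_ff) are ours, and the slope schedule assumes the head count is a power of two.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUFeedForward(nn.Module):
    """Feed-forward block with the SwiGLU activation (illustrative only)."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)  # gate projection
        self.w_up = nn.Linear(d_model, d_ff, bias=False)    # value projection
        self.w_down = nn.Linear(d_ff, d_model, bias=False)  # back to model width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: SiLU(x @ W_gate) gated elementwise against x @ W_up
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Per-head linear distance penalties added to attention scores (ALiBi)."""
    # Geometric slope schedule from the ALiBi paper; assumes n_heads is a power of two.
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    # distance[i, j] = j - i is <= 0 for past keys under a causal mask,
    # so attention to more distant positions is penalized more strongly.
    positions = torch.arange(seq_len)
    distance = positions[None, :] - positions[:, None]
    return slopes[:, None, None] * distance[None, :, :]  # (n_heads, seq_len, seq_len)
```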

The Jais family includes both base and instruction-tuned variants and is particularly well-suited for tasks involving multilingual reasoning, dialogue, and programming.

Code Structure

The code for this model is located in the /jais directory within ModelZoo. Here’s how it’s organized:

  • /configs: Contains YAML configuration files.
  • model.py: The implementation of the Jais model.

Our implementation of Jais is built on top of our GPT-2 backbone. For more details, see gpt2_model.py.

Available Configurations

Configuration                    Description
params_jais_13b.yaml             13B-parameter base Jais model.
params_jais_13b_chat.yaml        Instruction-tuned variant of the 13B model.
params_jais_30b.yaml             30B-parameter base Jais model.
params_jais_30b_chat.yaml        Instruction-tuned variant of the 30B model.
params_jais_30b_phase1.yaml      30B checkpoint after Phase 1 pretraining.
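As a quick way to see what one of these configurations specifies before launching a run, you can load the YAML and print its top-level sections. The snippet below is a hedged sketch: it assumes only that the files above are standard YAML, and the directory path shown is a placeholder for wherever your ModelZoo checkout lives; the exact keys inside each file may differ across ModelZoo releases.

```python
import yaml  # PyYAML
from pathlib import Path

# Placeholder path to the /configs directory of your local ModelZoo checkout;
# adjust it to your environment.
CONFIG_DIR = Path("modelzoo/jais/configs")


def summarize_config(name: str) -> None:
    """Print the top-level sections of a Jais YAML config (illustrative only)."""
    with (CONFIG_DIR / name).open() as f:
        cfg = yaml.safe_load(f)
    print(f"{name}:")
    for key, value in cfg.items():
        # Show nested sections by their sub-keys, scalars and lists by their value.
        detail = ", ".join(value) if isinstance(value, dict) else value
        print(f"  {key}: {detail}")


summarize_config("params_jais_13b.yaml")
```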

Workflow

For example workflows using language models from the Cerebras Model Zoo, see our tutorials on pretraining and fine-tuning.

For a complete list of Cerebras ModelZoo CLI commands, see the command reference.

References