Unveiling Phixtral: The Next Evolution in Language Models

Explore Phixtral, a Large Language Model built on a Mixture of Experts architecture that scores above its Phi-2 base model on several benchmarks and offers quantized loading and customizable expert configurations. This article covers what Phixtral is and how to install it locally, including steps for Apple Silicon Macs!


Large Language Models (LLMs) have changed how machines understand and generate human language. Phixtral is a recent entry in this field, built on a Mixture of Experts (MoE) architecture that combines several smaller models into one. In this article, we cover what Phixtral is, how it's trained, how it compares with its base model, and how to install it locally.


What is Phixtral?

Phixtral is an LLM built on the MoE framework, an approach that combines multiple smaller models, each acting as an "expert" in a different facet of language. A routing layer decides which experts handle each token, so only part of the network runs for any given input. This combination allows Phixtral to outperform its individual base model on language tasks.

Phixtral Variants: There are different configurations of Phixtral, such as phixtral-4x2_8 and phixtral-2x2_8. The name encodes the MoE configuration: 4x2_8 denotes four expert models of roughly 2.8 billion parameters each, and 2x2_8 denotes two.

Inspiration: Phixtral's architecture draws from the Mixtral-8x7B-v0.1 model, known for handling complex language tasks efficiently by activating only a few experts per token.
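To make the expert idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a toy illustration of the general MoE technique, not Phixtral's actual implementation: the function names (softmax, moe_forward), the toy experts, and the router scores are all made up for this example.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_scores, experts, k=2):
    """Send x through the k highest-scoring experts and blend their outputs."""
    # Indices of the k experts with the highest router score
    top = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)[:k]
    # Normalize the scores of the selected experts only
    weights = softmax([router_scores[i] for i in top])
    out = [0.0] * len(x)
    for weight, idx in zip(weights, top):
        expert_out = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out, top

# Four toy "experts": each just scales its input by a different factor
experts = [
    lambda x: [v * 1.0 for v in x],
    lambda x: [v * 2.0 for v in x],
    lambda x: [v * 3.0 for v in x],
    lambda x: [v * 4.0 for v in x],
]

# Hypothetical router scores for one token; experts 1 and 3 score highest
output, chosen = moe_forward([1.0, 2.0], [0.1, 2.0, -1.0, 2.0], experts, k=2)
print("experts used:", chosen)
print("blended output:", output)
```

In a real model the router scores come from a small learned layer and each expert is a full feed-forward network, but the select-then-blend step follows the same pattern.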

How is Phixtral Trained?

Training an LLM like Phixtral is an extensive process, involving a considerable amount of data and computational power. Let's break down the essentials:

Training Data: Phixtral's expertise is a result of the vast and diverse datasets it's trained on, encompassing a wide range of topics, languages, and writing styles.

Training Methodology: Phixtral utilizes advanced machine learning techniques, with each "expert" trained on specialized datasets to ensure nuanced understanding and generation capabilities.

Computational Resources: The training of Phixtral leverages substantial computational resources, utilizing the latest in GPU and TPU technology to process the extensive training data.

Ethical Training: The training process also involves steps to minimize biases and uphold ethical standards in AI.

Phixtral Benchmarks: Compared

Phixtral has posted strong results across a variety of benchmarks. Its Mixture of Experts (MoE) architecture performs well on standard metrics, and the model adds practical features that set it apart from comparable small models.

Here is a comparative table illustrating how Phixtral stands against other leading LLMs based on performance scores from several AI benchmarks:

Performance Benchmarks:


The above data illustrates that Phixtral models outperform the baseline Phi-2 model, suggesting that the MoE approach can effectively enhance LLM capabilities. The technical innovations of Phixtral include:

Quantized Models: Phixtral can be loaded in quantized form, meaning its weights are stored at reduced numerical precision (for example, 4-bit or 8-bit values instead of 16-bit floats). This can dramatically decrease the model's memory footprint and speed up computation, usually with only a small loss in output quality.

Custom Configurations: Users can tweak the model's setup, such as adjusting the number of experts per token or the number of local experts through a config.json file. This level of customization allows Phixtral to be adapted for various computational environments and use cases.
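As a rough sketch of what that customization could look like, the snippet below edits the two expert settings in a config.json file. The key names num_experts_per_tok and num_local_experts are assumptions based on Mixtral-style configurations, and set_expert_config is a hypothetical helper; check the config.json shipped with the model before relying on them. The demo runs against a throwaway file, not a real model directory.

```python
import json
import tempfile
from pathlib import Path

def set_expert_config(config_path, experts_per_token, local_experts):
    """Update the expert settings in a config.json file and return the new config."""
    if experts_per_token > local_experts:
        raise ValueError("experts per token cannot exceed the number of local experts")
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Assumed key names (Mixtral-style); verify against the real config.json
    config["num_experts_per_tok"] = experts_per_token
    config["num_local_experts"] = local_experts
    path.write_text(json.dumps(config, indent=2))
    return config

# Demonstration on a temporary file with made-up starting values
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"num_experts_per_tok": 2, "num_local_experts": 4}))
    updated = set_expert_config(demo_path, experts_per_token=1, local_experts=2)
    print(updated)
```

Using fewer experts per token lowers the compute spent on each token, at the likely cost of some output quality.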

The practical implications of these advantages are significant. For instance, quantized models make it possible to deploy Phixtral on devices with limited memory, widening the potential for edge computing applications. Moreover, the ability to customize the model's configuration means that its compute cost and behavior can be adjusted for specific tasks, be it conversational AI, content generation, or complex problem-solving, so it fits more easily into user applications.
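The memory savings can be illustrated with a small sketch. The code below shows simple absmax int8 quantization of a few weights and a back-of-the-envelope estimate of weight memory at different precisions. Both are illustrations under stated assumptions: real libraries such as bitsandbytes use more elaborate block-wise schemes, and the 7.8 billion parameter count is an illustrative figure, not a number taken from this article.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single absmax scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]

def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory needed for the weights alone, in gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9

weights = [0.5, -1.27, 0.02, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print("quantized:", quantized)
print("max round-trip error:", max(abs(a - b) for a, b in zip(weights, restored)))

ASSUMED_PARAMS = 7_800_000_000  # illustrative parameter count (assumption)
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(ASSUMED_PARAMS, bits):.1f} GB")
```

The estimate covers weights only; activations and the key-value cache add to the real footprint.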

Installation and Local Deployment

Deploying Phixtral on your local machine takes only a few steps, whether you're on a traditional x86 system or an Apple Silicon Mac. Below is a high-level overview of the process, with sample code snippets to guide you through the setup.


Ensure your machine meets the following requirements:

  • Python 3.8 or later (recent releases of transformers no longer support older versions)
  • Pip (Python package installer)
  • Adequate storage space for the model and its dependencies
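These requirements can be checked with a short script. The sketch below is one way to do it using only the standard library; the thresholds MIN_PYTHON and MIN_FREE_GB are placeholder assumptions, so set them to whatever your installed library versions and chosen model actually need.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # assumption: adjust to the versions your libraries support
MIN_FREE_GB = 20     # assumption: rough headroom for model weights and packages

def check_prerequisites(path=".", min_python=MIN_PYTHON, min_free_gb=MIN_FREE_GB):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ expected, found {found}")
    if shutil.which("pip") is None and shutil.which("pip3") is None:
        problems.append("pip was not found on PATH")
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        problems.append(f"only {free_gb:.1f} GB free, {min_free_gb} GB suggested")
    return problems

issues = check_prerequisites()
for issue in issues:
    print("WARNING:", issue)
if not issues:
    print("All prerequisite checks passed.")
```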

Step-by-Step Installation

Python Environment Setup: Start by creating a virtual environment to keep dependencies organized and project-specific.

python3 -m venv phixtral-env
source phixtral-env/bin/activate

Install Required Packages: Phixtral relies on several Python libraries. Install them using pip.

pip install torch transformers einops accelerate bitsandbytes

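Downloading the Model: For phixtral-4x2_8 or phixtral-2x2_8, load the model and tokenizer with transformers. The weights are downloaded from the Hugging Face Hub the first time you run this.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mlabonne/phixtral-4x2_8"  # or "mlabonne/phixtral-2x2_8" for the other variant
# trust_remote_code=True lets transformers run the custom modeling code shipped with the model
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)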
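If memory is tight, the model can also be loaded with 4-bit quantization. The sketch below wraps this in a hypothetical helper, load_phixtral_4bit, using the BitsAndBytesConfig class from transformers. It assumes a CUDA-capable GPU with bitsandbytes installed, and that the model's custom code works with your installed transformers version; treat it as a starting point rather than a verified recipe.

```python
def load_phixtral_4bit(model_name="mlabonne/phixtral-2x2_8"):
    """Load the model with 4-bit weights. Assumes a CUDA GPU and bitsandbytes."""
    # Heavy imports are kept inside the function so defining it stays cheap
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store weights as 4-bit values
        bnb_4bit_compute_dtype=torch.float16,  # run the matrix math in float16
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=quant_config,
        device_map="auto",        # let accelerate place layers on available devices
        trust_remote_code=True,   # the model repository ships custom modeling code
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    return model, tokenizer

# Usage (downloads several gigabytes on first run):
# model, tokenizer = load_phixtral_4bit()
```

The call is left commented out because it triggers a large download and needs a supported GPU.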

Running the Model: With the model and tokenizer downloaded, you can now run Phixtral to generate text.

inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # cap the length of the completion
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Running on Apple Silicon

If you're using an Apple Silicon Mac, you might need additional steps to ensure compatibility with the ML framework being used. Here's how you can proceed:

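Install Rosetta 2 (optional): Rosetta 2 lets Apple Silicon Macs run x86_64 applications. The arm64-native setup described below does not depend on it, but some x86_64-only tools do. Install it using the following command: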

/usr/sbin/softwareupdate --install-rosetta --agree-to-license

Install Miniforge: It's recommended to use Miniforge to manage conda environments on M1/M2 Macs, which helps in managing packages and environments that are arm64 compatible. Miniforge can be installed from GitHub or using Homebrew.

Create a Conda Environment: Once Miniforge is installed, create a conda environment specifically for Phixtral.

conda create --name phixtral python=3.8
conda activate phixtral

Install Required Packages: Some packages may need to be installed from the conda-forge channel to ensure they are compiled for arm64.

conda install -c conda-forge transformers pytorch einops

Apple-Specific TensorFlow Installation (optional): The Phixtral examples in this article run on PyTorch, so TensorFlow is not required for them. If other parts of your workflow need TensorFlow, you can install Apple's macOS build, which is optimized for their hardware.

pip install --upgrade tensorflow-macos
pip install --upgrade tensorflow-metal  # for GPU acceleration

Verifying Installation: To check that the libraries are installed correctly and that Python is running on the arm64 architecture, you can execute the following commands in your Python environment:

import platform
import torch
import transformers
print(torch.__version__, transformers.__version__, platform.machine())

If this runs without errors, prints the installed versions, and reports arm64, the libraries are set up correctly.
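For a fuller check that does not fail when a package is missing, the sketch below reports installed versions through the standard library's importlib.metadata. The helper name installed_versions and the package list are choices made for this example; adjust the list to match what you installed.

```python
import platform
from importlib import metadata

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

report = installed_versions(["torch", "transformers", "einops", "accelerate"])
# A native Apple Silicon Python reports arm64; under Rosetta it reports x86_64
print("machine:", platform.machine())
for name, version in report.items():
    print(f"{name}: {version if version else 'NOT INSTALLED'}")
```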

Running Phixtral:
Now, you should be able to run the Phixtral model as you would on any other system. Here's a simple example of using the model to generate text on an Apple Silicon Mac:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = 'mlabonne/phixtral-2x2_8'  # Replace with the desired model
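# trust_remote_code=True lets transformers run the custom modeling code shipped with the model
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)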

# Tokenize the input text
inputs = tokenizer("Here is a sentence to complete: ", return_tensors="pt")

# Generate a response
outputs = model.generate(**inputs, max_length=50)

# Decode the generated text
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
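On Apple Silicon, PyTorch can run on the GPU through its Metal (MPS) backend. The sketch below shows one way to pick the best available device; pick_device is a hypothetical helper, and whether Phixtral's custom code runs correctly on MPS is an assumption you should verify on your own machine.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can use."""
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend exists only in newer PyTorch builds, so look it up defensively
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
print("Using device:", device)
```

To use the result, move both the model and the tokenized inputs to the chosen device, for example with model.to(device) and inputs.to(device), before calling generate.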

Troubleshooting: If you encounter any issues, consider checking the following:

  • Ensure you have the latest updates installed for macOS.
  • Check if there are any open issues on the official GitHub repositories that may address your problem.
  • Consult the Hugging Face forums and Apple Developer forums for advice from the community.

Remember, while the steps listed above are generally what's required, it's essential to refer to the official documentation for the most up-to-date and detailed instructions. Additionally, active participation in the community can help to resolve any potential hiccups during installation and deployment.


As we wrap up our exploration of Phixtral, it's clear that this Large Language Model reflects how quickly the field of artificial intelligence is moving. Phixtral's Mixture of Experts architecture, combined with quantized loading, offers a level of customization and efficiency that makes capable language models practical on more modest hardware.

Here are the Phixtral Hugging Face model cards:

mlabonne/phixtral-2x2_8 · Hugging Face
mlabonne/phixtral-4x2_8 · Hugging Face