Mistral API Pricing - How to Use Mixtral (MoE) API for Free!

In this article, we cover Mistral AI's newly released API, which gives access to mistral-medium, the company's most powerful model to date. We look at the pricing of each endpoint and where to register for the Mistral API.

Imagine a world where the tools of creation are at your fingertips, where the barrier between idea and execution is as thin as a whisper. This is the world that Artificial Intelligence as a Service (AIaaS) is crafting: a world where generative AI models are not just tools but collaborators, ready to draft a story, compose code, or conjure up a conversation from the ether.

This isn't the future; it's right now. And it's more than just handy—it's revolutionary, changing the face of tech as we know it. In this fast-paced era of ceaseless innovation, Mistral AI has stepped onto the scene. They're not just another name in the game. They're the artists handing out the paintbrushes, offering a palette of generative models and APIs that empower developers and businesses to bring their own visions to life with the finesse and power of advanced AI.

Yes, Now You Can Use Mixtral API for Free

Ever since the launch of Mixtral 8x7B, the AI world has been riding the hype. Can Mixtral 8x7B, the mysterious open-source model, be the killer in the AI game? Can it challenge GPT-4 for the crown?

This is the beauty of open source: you can explore this wonderful LLM online right now, using Anakin AI's API.

In case you are wondering what Mixtral 8x7B is, you can read our article about Mixtral 8x7B to learn more.

With Anakin AI, you can easily:

  1. Create customized AI apps utilizing APIs for multiple AI models, such as Mixtral. Anakin AI supports Mistral 7B and Mixtral 8x7B, using their intelligent Q&A capabilities to solve your tough questions.
  2. Besides supporting popular LLMs, Anakin AI is also a one-stop shop for image generation models.

Imagine creating a highly customized AI app that combines several of these models, optimizing pipelines, and managing payments for all of these services. Anakin AI has you covered! Everything is neatly placed in one single portal, with no code required!

Interested? Give Anakin AI a try right now and easily build your AI apps using Mixtral 8x7B APIs!

What Models are Available for Mistral's API?


The heart of Mistral AI's offerings lies in its generative endpoints: mistral-tiny, mistral-small, and mistral-medium. Each is designed with a specific performance and price tradeoff in mind, allowing users to select the most suitable option for their needs.

  • Mistral-tiny is the most cost-effective endpoint, serving Mistral 7B Instruct v0.2, excelling in English language tasks with a 7.6 score on MT-Bench.
  • Mistral-small steps up in capabilities, serving the Mixtral 8x7B model proficient in multiple languages including English, French, Italian, German, and Spanish, in addition to code, scoring an 8.3 on MT-Bench.
  • Mistral-medium represents the pinnacle of Mistral AI's current offerings. It operates a prototype model boasting top-tier performance across standard benchmarks and mastering the same languages and coding capabilities as mistral-small, with a superior 8.6 score on MT-Bench. Most importantly, Mistral-medium seems to beat Gemini Pro on most of the available benchmarks, by a considerable margin.

These models are not just competing in the arena of linguistic prowess but also in their alignment with human intentions, a crucial aspect of generative AI's practical application. The benchmarks indicate Mistral AI's competitive edge, particularly when compared with a well-known counterpart, GPT-3.5. The performance metrics showcase Mistral AI's commitment to not only meeting but surpassing industry standards, promising an API that delivers quality, safety, and a range of linguistic abilities.

How Much Does Mistral's API Cost?


Mistral AI has established a clear and flexible pricing strategy for their services, adhering to a pay-as-you-go model. This approach allows users to pay for only what they use, making it an attractive option for businesses and developers of varying scales. It's important to note that the prices listed are exclusive of VAT, which means additional tax costs may apply based on the user's location.

Chat Completions API Pricing

The Chat Completions API is a core offering from Mistral AI, providing different models to cater to varying needs and budgets. The following table details the pricing for each model within this API:

| Model | Price for Input | Price for Output |
|---|---|---|
| Mistral-Tiny | 0.14€ / 1M tokens | 0.42€ / 1M tokens |
| Mistral-Small | 0.6€ / 1M tokens | 1.8€ / 1M tokens |
| Mistral-Medium | 2.5€ / 1M tokens | 7.5€ / 1M tokens |

These models range from "Tiny" to "Medium," offering options for different levels of computational needs and budget constraints. The pricing is competitive, particularly considering the capabilities and potential applications of these models in various AI-driven tasks.
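To make the table concrete, here is a minimal sketch that estimates the cost of a workload from the per-token prices listed above. The function name and the sample workload (2M input tokens, 500k output tokens) are illustrative assumptions, and the result excludes VAT.

```python
# Prices in EUR per 1M tokens (input, output), copied from the table above.
CHAT_PRICES_EUR_PER_M = {
    "mistral-tiny": (0.14, 0.42),
    "mistral-small": (0.6, 1.8),
    "mistral-medium": (2.5, 7.5),
}


def estimate_chat_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in EUR (excluding VAT) for one workload."""
    input_price, output_price = CHAT_PRICES_EUR_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
for name in CHAT_PRICES_EUR_PER_M:
    cost = estimate_chat_cost(name, 2_000_000, 500_000)
    print(f"{name}: {cost:.2f} EUR")
```

Because output tokens cost three times as much as input tokens on every endpoint, workloads that generate long answers are affected more by the choice of model than workloads that mostly read long prompts.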

Embeddings API Pricing

In addition to the Chat Completions API, Mistral AI offers an Embeddings API, which is focused on generating vector embeddings from text. This is particularly useful for tasks involving text similarity, clustering, and other machine learning applications. The pricing for the Embeddings API is as follows:

| Model | Price for Input |
|---|---|
| Mistral-Embed | 0.1€ / 1M tokens |

The Embeddings API is priced affordably, making it accessible for a wide range of applications, from small-scale projects to larger, more complex systems.
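As a rough illustration of the text-similarity use case mentioned above, the sketch below compares two embedding vectors with cosine similarity. The vectors are tiny made-up placeholders, not real mistral-embed output; in practice you would pass in the vectors returned by the Embeddings API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors standing in for real embeddings.
vector_a = [0.1, 0.3, 0.5]
vector_b = [0.2, 0.6, 1.0]   # Same direction as vector_a.
vector_c = [0.5, -0.1, -0.04]

print(cosine_similarity(vector_a, vector_b))
print(cosine_similarity(vector_a, vector_c))
```

Values close to 1 indicate texts with similar meaning, while values near 0 indicate unrelated texts.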

Which Mistral API Shall I Choose?

Here are our suggestions for API tiers, based on the pricing table provided by Mistral AI:

Mistral-tiny: The entry point to Mistral AI's suite and the most cost-effective option, focused exclusively on English.

Mistral-small: Adept in English, French, Italian, German, and Spanish, with a flair for code generation. The best-balanced option between cost and overall performance.

Mistral-medium: The best-performing model available from Mistral AI.

As noted above, Mistral AI bills on a pay-as-you-go basis per token, so your cost scales directly with usage. Other providers in the industry may instead charge per API call or per unit of compute time, or offer subscriptions, which can be harder to budget for when usage fluctuates.
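The suggestions above can be sketched as a small selection helper: given a language and a minimum MT-Bench score, pick the cheapest endpoint that qualifies. The scores, languages, and input prices come from this article; the two-letter language codes and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    mt_bench: float           # MT-Bench score quoted in this article.
    languages: frozenset      # Language codes (an assumed encoding).
    input_price_eur: float    # EUR per 1M input tokens.


MULTILINGUAL = frozenset({"en", "fr", "it", "de", "es"})

ENDPOINTS = [
    Endpoint("mistral-tiny", 7.6, frozenset({"en"}), 0.14),
    Endpoint("mistral-small", 8.3, MULTILINGUAL, 0.6),
    Endpoint("mistral-medium", 8.6, MULTILINGUAL, 2.5),
]


def cheapest_endpoint(language: str, min_mt_bench: float = 0.0):
    """Return the name of the cheapest qualifying endpoint, or None."""
    candidates = [
        e for e in ENDPOINTS
        if language in e.languages and e.mt_bench >= min_mt_bench
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.input_price_eur).name


print(cheapest_endpoint("en"))        # English only, no quality floor.
print(cheapest_endpoint("fr"))        # French rules out mistral-tiny.
print(cheapest_endpoint("en", 8.5))   # A high quality floor.
```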

Other Options to Access Mistral AI's API

The Mixtral API, offered by various providers, presents a diverse range of pricing and performance options. This variety caters to different needs and preferences, making it a versatile choice for organizations and developers. Let's delve deeper into the specifics of each provider to understand their unique offerings and how they fit into different AI strategies.

1. Mistral AI

  • Pricing: 0.6€ / 1M tokens for input and 1.8€ / 1M tokens for output.
  • Key Offering: Mixtral-8x7b-32kseqlen, known for its balance of cost and performance.
  • Unique Feature: It boasts one of the best inference performances in the market, offering up to 100 tokens/s for just 0.0006€/1K tokens, highlighting its efficiency and speed.

2. Abacus AI

  • Pricing: $0.0003 per 1,000 tokens for Mixtral 8x7B. Retrieval is priced at $0.2/GB/day.
  • Key Features: A low price that is hard to beat. According to Abacus.ai, it offers the best bang for the buck, with the most competitive pricing for RAG APIs.
  • For more details, refer to Abacus.ai Documentation.

3. DeepInfra

  • Pricing: $0.27 / 1M tokens, even lower than Abacus AI.
  • Key Features: Besides API, it also features an online portal where you can try out Mixtral-8x7B-Instruct-v0.1.

4. Together AI

  • Pricing: $0.6 / 1M tokens, with output pricing not specified.
  • Key Offering: Mixtral-8x7b-32kseqlen & DiscoLM-mixtral-8x7b-v2 are now available on the Together API.
  • Additional Information: Together AI emphasizes their fast inference performance, and more details can be found on their blog, Together AI Blog on Mixtral.

5. Perplexity AI

  • Pricing: Input at $0.14 / 1M tokens and output at $0.56 / 1M tokens.
  • Key Offering: Mixtral-Instruct, which aligns with the pricing of their 13B Llama 2 endpoint.
  • Incentive for New Users: Perplexity AI offers a starting bonus of $5/month in API credits for new sign-ups, ideal for initial evaluations or small-scale use. Further details can be found at Perplexity AI Documentation.

6. Anyscale Endpoints

  • Pricing: $0.50 / 1M tokens.
  • Key Offering: The official Mixtral 8x7B model, touted to have the best price in the market with an OpenAI compatible API.
  • Upcoming Features: Anyscale is planning to introduce JSON mode and function calling, enhancing the versatility of their API. More information can be found at Anyscale Endpoints Documentation.
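Since the providers above quote prices in different units (per 1K or per 1M tokens), a quick normalization makes them easier to compare. The sketch below uses only the figures listed in this section. It leaves Mistral AI's euro price out of the ranking, because no exchange rate is given here, and it uses input prices where a provider quotes input and output separately.

```python
# (provider, quoted price, tokens covered by that price, currency)
QUOTES = [
    ("Mistral AI (input)", 0.6, 1_000_000, "EUR"),
    ("Abacus AI", 0.0003, 1_000, "USD"),
    ("DeepInfra", 0.27, 1_000_000, "USD"),
    ("Together AI", 0.6, 1_000_000, "USD"),
    ("Perplexity AI (input)", 0.14, 1_000_000, "USD"),
    ("Anyscale Endpoints", 0.50, 1_000_000, "USD"),
]


def per_million(price: float, tokens_per_unit: int) -> float:
    """Convert a quoted price to a price per 1M tokens."""
    return price * 1_000_000 / tokens_per_unit


# Rank only the USD quotes; mixing currencies would need an exchange rate.
usd_quotes = sorted(
    (
        (name, per_million(price, unit))
        for name, price, unit, currency in QUOTES
        if currency == "USD"
    ),
    key=lambda quote: quote[1],
)

for name, price in usd_quotes:
    print(f"{name}: ${price:.2f} / 1M tokens")
```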

Interested in Building AI Apps with No Code?

You should try Anakin AI - a No Code Platform where you can build apps using AI Models such as: GPT-4, Claude-2.1, DALLE-3, Stable Diffusion, and much more!
App Store
Create your own AI app in one minute. Say goodbye to your boring repetitive work.
Free DALL·E 3 AI Image Generator | AI Powered | Anakin.ai
Empower your creativity with the DALL·E AI Image Generator. Generate high-quality images that match your imagination, and fulfill your personalized artistic needs.

Mistral-Medium versus GPT-3.5 Turbo

The AI landscape is witnessing a fascinating competition with the emergence of Mistral-Medium from the European AI company Mistral. Known for its open-source Large Language Model (LLM) Mistral-7B, the company has now introduced Mixtral 8x7B and Mistral-Medium. The latter, an API-only model, seems to be a step above the mixture-of-experts (MoE) model Mixtral and is generating buzz in the open-source community for its potential comparability to OpenAI's GPT-3.5 Turbo.

Trenton Dambrowitz, an active member in AI forums, noted the complexity of these models. He mentioned that Mixtral, Mistral's other variant, operates with 46.7B total parameters, utilizing 12.9B per token. This detail hints at the intricate design of Mistral-Medium, yet its performance and capabilities remain somewhat shrouded, as it is API-exclusive and not widely accessible.
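As a quick piece of arithmetic on the figures quoted above, the snippet below computes what share of Mixtral's parameters is active for each token. It only restates the two numbers from the paragraph and is not a model of how the routing itself works.

```python
# Figures quoted above, in billions of parameters.
TOTAL_PARAMS_B = 46.7
ACTIVE_PARAMS_B = 12.9

# Share of the model's parameters used to process a single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"Active per token: {active_fraction:.1%} of all parameters")
```

In other words, each token is processed with roughly a quarter of the full model, which is why a mixture-of-experts model can be cheaper to run than a dense model of the same total size.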

Cost is another critical factor. Mistral's pricing strategy is aggressive, with the Mistral-Tiny, Small, and Medium endpoints all priced well below GPT-4. Specifically, Mistral-Medium's API costs almost four times less than GPT-4, though it is more expensive than GPT-3.5 Turbo. This pricing positions Mistral as a strong competitor, especially if its performance bridges the gap between GPT-3.5 Turbo and the more heavily moderated GPT-4 API.

The table below summarizes the key aspects of Mistral-Medium versus GPT-3.5 Turbo:

| Feature/Aspect | Mistral-Medium | GPT-3.5 Turbo |
|---|---|---|
| Origin | European AI company Mistral | OpenAI |
| Model Type | API-only, above MoE model | Large Language Model |
| Parameter Count | Not specified for Mistral-Medium | Not officially disclosed (175 billion is the GPT-3 figure) |
| Performance | Comparable to ChatGPT (anecdotal) | Well-established benchmarks |
| Cost for API | $0.0027 / 1k tokens (input), $0.0081 / 1k tokens (output) | Cheaper than Mistral-Medium |
| Accessibility | Limited access, API-exclusive | Broad accessibility |
| Target Audience | Users seeking cost-effective solutions | Users requiring high-quality text processing |

Dambrowitz also raises questions about the relevance of benchmarks and in-context training, underlining the evolving nature of user preferences and technological advancements. As the field progresses, especially into 2024, the dynamics between these competing models promise to shape the future of AI interactions and applications.

How Good are Mistral AI's Models?

Performance Benchmarking for Mistral Models

Evaluating the effectiveness of AI models goes beyond mere output accuracy; it encompasses a spectrum of attributes that collectively define the model's utility and reliability. Multi-aspect scoring is a methodology that assesses generative models across several dimensions critical to user interaction and satisfaction:

  • Helpfulness: Measures the model's ability to provide constructive and practical information.
  • Clarity: Assesses how understandable and clear the model's responses are.
  • Factuality: Gauges the accuracy and truthfulness of the information provided by the model.
  • Depth: Evaluates the extent to which a model can provide detailed and substantial answers.
  • Engagingness: Reflects how compelling and interesting the model's outputs are.
  • Safety: Ensures the model avoids generating harmful or inappropriate content.

The data showcases Mistral AI's models’ performance across these dimensions, suggesting a robust and balanced capability in generating high-quality content. To illustrate, here’s a summarized table of the performance scores from the benchmarks:

| Model/Method | Helpfulness | Clarity | Factuality | Depth | Engagingness | Safety | Average Score |
|---|---|---|---|---|---|---|---|
| Mistral-7b-instruct (SFT) | 4.36 | 4.87 | 4.29 | 3.89 | 4.47 | 4.75 | 4.44 |
| Mistral-7b (URIAL, k=3) | 4.57 | 4.89 | 4.50 | 4.18 | 4.74 | 4.92 | 4.63 |
| Mistral-7b (URIAL, k=8) | 4.52 | 4.90 | 4.46 | 4.05 | 4.78 | 5.00 | 4.62 |
| Llama2-7b-chat (RLHF) | 4.10 | 4.83 | 4.26 | 3.91 | 4.70 | 5.00 | 4.47 |
| GPT-3.5-turbo-0301 | 4.81 | 4.98 | 4.83 | 4.33 | 4.58 | 4.94 | 4.75 |

Note: The scores range from 1 to 5, with higher numbers indicating better performance.
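The Average Score column is simply the mean of the six dimension scores. The short sketch below recomputes it from the rows above and ranks the models, which is a handy sanity check when you add your own rows. The dictionary layout is just one convenient way to hold the data.

```python
# Scores per model in this order:
# Helpfulness, Clarity, Factuality, Depth, Engagingness, Safety.
SCORES = {
    "Mistral-7b-instruct (SFT)": [4.36, 4.87, 4.29, 3.89, 4.47, 4.75],
    "Mistral-7b (URIAL, k=3)": [4.57, 4.89, 4.50, 4.18, 4.74, 4.92],
    "Mistral-7b (URIAL, k=8)": [4.52, 4.90, 4.46, 4.05, 4.78, 5.00],
    "Llama2-7b-chat (RLHF)": [4.10, 4.83, 4.26, 3.91, 4.70, 5.00],
    "GPT-3.5-turbo-0301": [4.81, 4.98, 4.83, 4.33, 4.58, 4.94],
}


def average_score(values):
    """Return the arithmetic mean of a list of dimension scores."""
    return sum(values) / len(values)


# Rank models from the highest to the lowest average.
ranking = sorted(SCORES, key=lambda model: average_score(SCORES[model]), reverse=True)

for model in ranking:
    print(f"{model}: {average_score(SCORES[model]):.3f}")
```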

Why Are Mistral AI's Models So Good?

The distinction in performance can be attributed to the alignment methods compared in the table. Supervised Fine-Tuning (SFT) and URIAL (Untuned LLMs with Restyled In-context ALignment) are pivotal in this respect. SFT fine-tunes the model on curated datasets to enhance its responsiveness to human-like prompts, while URIAL leaves the base model's weights untouched and aligns it purely through a small, fixed set of in-context examples (k=3 or k=8 in the table).

The data illustrates that the base Mistral-7b model with URIAL (k=3 and k=8) outscores the fine-tuned Mistral-7b-instruct on every dimension, with averages of 4.63 and 4.62 versus 4.44. This is especially evident in the Safety category, where the k=8 setting achieves a perfect score of 5.00, compared with 4.75 for the SFT model.

That means alignment not only refines the model's outputs but also lifts its performance across various dimensions, and that Mistral 7B is a strong base to build on, marking it as a formidable contender in the domain of generative AI models.

How to Use Mistral AI's API

With the official Mistral AI API documentation at our disposal, we can dive into concrete examples of how to interact with the API for creating chat completions and embeddings. Here's how you can use the Mistral AI API in your projects, with sample code snippets that follow the official specs.

Step 1. Register an API Key from Mistral AI

First, ensure you've registered for an API key from Mistral AI.

Use this key to authenticate your API requests. The following examples assume you've set up your development environment with the required libraries for making HTTP requests.
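One common way to keep the key out of your source code is to read it from an environment variable. The sketch below builds the authentication headers that way. The variable name `MISTRAL_API_KEY` and the helper name `build_headers` are assumptions for illustration, not something the API requires.

```python
import os


def build_headers(api_key=None):
    """Build request headers, reading the key from the environment if needed.

    MISTRAL_API_KEY is an assumed variable name; use whatever your setup defines.
    """
    key = api_key or os.environ.get("MISTRAL_API_KEY")
    if not key:
        raise RuntimeError("Set MISTRAL_API_KEY or pass api_key explicitly.")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


# Demo with a placeholder key; a real key would come from the environment.
print(build_headers("your_api_key_here")["Authorization"])
```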

Step 2. Create Chats with Mistral AI's API

To generate a chat response, you need to POST a JSON payload to the /v1/chat/completions endpoint with the required parameters.

Python Example:

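import requests

# Replace 'your_api_key_here' with your actual API key.
headers = {
    'Authorization': 'Bearer your_api_key_here',
    'Content-Type': 'application/json',
}

# Request body for the chat completions endpoint.
data = {
    "model": "mistral-tiny",
    "messages": [{"role": "user", "content": "Tell me a joke."}],
    "temperature": 0.7,
    "top_p": 1,
    "max_tokens": 16,      # Caps the reply length; raise it for longer answers.
    "stream": False,
    "safe_prompt": False,  # Named "safe_mode" in early versions of the API.
    "random_seed": None,
}

# Note the /v1 prefix in the endpoint URL.
response = requests.post(
    'https://api.mistral.ai/v1/chat/completions',
    headers=headers,
    json=data,
    timeout=30,
)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses.
print(response.json())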

JavaScript Example (using Node.js):

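const axios = require('axios');

// Replace 'your_api_key_here' with your actual API key.
const headers = {
  'Authorization': 'Bearer your_api_key_here',
  'Content-Type': 'application/json',
};

// Request body for the chat completions endpoint.
const data = {
  model: 'mistral-tiny',
  messages: [{ role: 'user', content: 'Tell me a joke.' }],
  temperature: 0.7,
  top_p: 1,
  max_tokens: 16,      // Caps the reply length; raise it for longer answers.
  stream: false,
  safe_prompt: false,  // Named "safe_mode" in early versions of the API.
  random_seed: null
};

// Note the /v1 prefix in the endpoint URL.
axios.post('https://api.mistral.ai/v1/chat/completions', data, { headers })
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    // Print the API's error body when available, otherwise the error message.
    console.error(error.response ? error.response.data : error.message);
  });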
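Once the request succeeds, you usually want just the generated text rather than the whole JSON body. The sketch below assumes the response follows the familiar chat-completions layout, with a `choices` list whose first item holds a `message` with `content`, plus a `usage` block. The `sample_response` dictionary is a made-up stand-in, so check the field names against the response you actually receive.

```python
def extract_reply(response_json: dict) -> str:
    """Return the assistant's text from a chat completion response body."""
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hypothetical response body used only to demonstrate the helper.
sample_response = {
    "model": "mistral-tiny",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Why did the model cross the road?"},
            "finish_reason": "length",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 16, "total_tokens": 28},
}

print(extract_reply(sample_response))
print("Tokens billed:", sample_response["usage"]["total_tokens"])
```

With the Python example above, you would call `extract_reply(response.json())`.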

Step 3. Create Embeddings with Mistral AI's API

To create embeddings for a list of strings, you need to POST a JSON payload to the /v1/embeddings endpoint.

Python Example:

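# Continuing from the previous Python example (reuses `requests` and `headers`).

data = {
    "model": "mistral-embed",
    "input": ["Hello", "world"],
    "encoding_format": "float"
}

# Note the /v1 prefix in the endpoint URL.
response = requests.post('https://api.mistral.ai/v1/embeddings', headers=headers, json=data, timeout=30)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses.
print(response.json())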

JavaScript Example (using Node.js):

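// Continuing from the previous JavaScript example (reuses `axios` and `headers`).

// Use a new name: `data` is already declared with const in the previous example.
const embeddingData = {
  model: 'mistral-embed',
  input: ['Hello', 'world'],
  encoding_format: 'float'
};

// Note the /v1 prefix in the endpoint URL.
axios.post('https://api.mistral.ai/v1/embeddings', embeddingData, { headers })
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    // Print the API's error body when available, otherwise the error message.
    console.error(error.response ? error.response.data : error.message);
  });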

These examples reflect the required structure and parameters for interacting with the Mistral AI API as per the official documentation. Make sure to handle API responses and potential errors appropriately in your production environment.
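As one possible way to handle transient errors, the sketch below wraps any request function in a simple retry loop with exponential backoff. The helper name, the retry count, and the delays are illustrative assumptions. The demo uses a simulated failing function rather than a real API call, and in production you would likely catch a narrower exception type such as `requests.exceptions.RequestException`.

```python
import time


def call_with_retries(request_fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request_fn(), retrying with exponential backoff on failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception as exc:  # Narrow this to network errors in real code.
            last_error = exc
            if attempt < max_attempts - 1:
                # Wait 1x, 2x, 4x, ... the base delay between attempts.
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Simulated request that fails twice before succeeding.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated network failure")
    return {"status": "ok"}


# Pass a no-op sleep so the demo finishes instantly.
result = call_with_retries(flaky_request, sleep=lambda seconds: None)
print(result, "after", attempts["count"], "attempts")
```

With the earlier Python example, `request_fn` could be a small function that performs the `requests.post(...)` call, runs `raise_for_status()`, and returns `response.json()`.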

Conclusion

Mistral AI's API offers a powerful suite of tools for developers looking to incorporate advanced AI capabilities into their applications. With endpoints for generating chat completions and creating embeddings, the API is versatile and designed to meet the needs of various use cases, from customer service automation to content discovery and beyond.

As Mistral AI continues to evolve and expand its services, it remains committed to providing accessible, top-tier AI technology. With a focus on quality, performance, and user alignment, Mistral AI is not just a service provider but a partner in innovation, helping to shape the future of AI integration across industries. For those ready to embark on this journey, Mistral AI's API is a gateway to unlocking the transformative potential of AI.
