Service tiers on OpenAI providers

sam · March 9, 2026, 5:31am

We just rolled out a change that allows you to pick service tiers on your OpenAI and Azure providers.

OpenAI service tiers let you either get heavy discounts on API usage or pay more for faster requests.

For comparison (as of March 9, 2026), GPT 5.4 pricing is:

$2.50 per million input tokens under standard tier
$1.25 per million input tokens under flex tier
$5.00 per million input tokens under priority tier (which is about 1.5x faster than standard)
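To make the difference concrete, here is a rough sketch of the arithmetic using the prices quoted above. The 40 million token monthly workload is a made-up figure, and the sketch assumes the priority price applies to input tokens like the other two; output-token pricing is not covered.

```python
# Input-token prices quoted above, per million tokens (as of March 9, 2026).
PRICE_PER_MILLION_INPUT = {"standard": 2.50, "flex": 1.25, "priority": 5.00}


def input_cost(tokens: int, tier: str) -> float:
    """Return the input-token cost of `tokens` tokens on the given tier."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT[tier]


# Hypothetical workload: 40 million input tokens per month.
monthly_tokens = 40_000_000
for tier in PRICE_PER_MILLION_INPUT:
    print(f"{tier:>8}: {input_cost(monthly_tokens, tier):.2f}")
```

On these numbers, flex halves the input bill relative to standard, and priority doubles it.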

To set this up, head to your LLM configuration and pick a service tier.

Note that the flex tier, albeit much cheaper, is also less reliable by design.
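If you call the API from your own scripts, one way to live with that trade-off is to try flex first and fall back to the standard tier when flex has no capacity. The sketch below is only an illustration: `send`, `TierUnavailable`, and `send_with_fallback` are hypothetical names, and it assumes the standard tier is requested as "default". The provider setting described in this post does not necessarily work this way.

```python
from typing import Callable, Sequence, Tuple


class TierUnavailable(Exception):
    """Hypothetical error raised by `send` when a tier has no capacity."""


def send_with_fallback(
    send: Callable[[str], str],
    tiers: Sequence[str] = ("flex", "default"),
) -> Tuple[str, str]:
    """Try each tier in order and return (tier_used, response)."""
    last_error = None
    for tier in tiers:
        try:
            return tier, send(tier)
        except TierUnavailable as exc:
            last_error = exc  # remember the failure and try the next tier
    raise RuntimeError("all tiers unavailable") from last_error


# Stand-in for a real API call: flex always fails here, purely for the demo.
def fake_send(tier: str) -> str:
    if tier == "flex":
        raise TierUnavailable("no flex capacity")
    return f"ok via {tier}"


print(send_with_fallback(fake_send))
```

Injecting `send` keeps the fallback logic separate from the HTTP client, so it can be tried out without making real requests.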

Additionally, if you are using OpenAI, be sure to select the Responses endpoint by entering https://api.openai.com/v1/responses as the URL for your service.
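For anyone curious what this looks like at the API level, here is a minimal sketch of a request to that endpoint with a tier set through the `service_tier` field. The model identifier `gpt-5.4` and the placeholder key are assumptions for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_request(api_key: str, model: str, prompt: str,
                  service_tier: str = "flex") -> urllib.request.Request:
    """Build (but do not send) a Responses API request for the given tier."""
    payload = {"model": model, "input": prompt, "service_tier": service_tier}
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "gpt-5.4" is an assumed model identifier; check your provider's model list.
req = build_request("sk-placeholder", "gpt-5.4", "Hello")
print(req.full_url)
print(json.loads(req.data))
```

Passing the result to `urllib.request.urlopen(req)` with a real key would send it.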

This is particularly important on recent reasoning models; without it, you will not benefit properly from caching, which heavily reduces costs.

Enjoy!
