Service tiers on OpenAI providers

We just rolled out a change that allows you to pick service tiers on your OpenAI and Azure providers.

OpenAI's service tiers let you trade price against speed: you can get heavy discounts on API usage, or pay more for faster requests.

For comparison (as of March 9, 2026), GPT 5.4 pricing is:

  • $2.50 per million input tokens under standard tier
  • $1.25 per million input tokens under flex tier
  • $5.00 per million input tokens under priority tier (which is about 1.5x faster than standard)
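To get a feel for what this means on a real workload, here is a rough sketch that prices the same input volume on each tier. It only uses the three input-token prices listed above and assumes they are in US dollars; the function name and the 40 million token figure are made up for illustration, and output tokens are not covered.

```python
# Input-token prices per million tokens, copied from the list above
# (assumed to be US dollars; output-token prices are not covered here).
PRICE_PER_MILLION_INPUT = {
    "flex": 1.25,
    "standard": 2.50,
    "priority": 5.00,
}


def input_cost(tokens: int, tier: str) -> float:
    """Return the input-token cost of `tokens` tokens on the given tier."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT[tier]


# Hypothetical workload: 40 million input tokens per month.
for tier in PRICE_PER_MILLION_INPUT:
    print(f"{tier:>8}: {input_cost(40_000_000, tier):.2f}")
```

On these numbers, flex costs half of standard for the same input volume, and priority costs twice as much.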

To set this up, head to your LLM configuration.

And pick a service tier:

Note that the flex tier, while much cheaper, is also less reliable by design: expect slower responses and occasional failures when capacity is unavailable.
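If you call the API from your own code and want to soften that trade-off, one common pattern is to try the cheap tier first and fall back to the regular one when it fails. The sketch below is only an illustration of that idea, not how our providers behave internally: `TierUnavailable`, `send_with_fallback` and `fake_send` are hypothetical names, and `fake_send` stands in for a real API call.

```python
from typing import Callable, Optional, Sequence


class TierUnavailable(Exception):
    """Hypothetical error: the requested tier could not serve the request."""


def send_with_fallback(
    send: Callable[[str], str],
    tiers: Sequence[str] = ("flex", "default"),
) -> str:
    """Try each tier in order and return the first successful answer."""
    if not tiers:
        raise ValueError("at least one tier is required")
    last_error: Optional[TierUnavailable] = None
    for tier in tiers:
        try:
            return send(tier)
        except TierUnavailable as err:
            last_error = err  # remember the failure and try the next tier
    assert last_error is not None
    raise last_error


def fake_send(tier: str) -> str:
    """Stand-in for a real request: pretends flex capacity is exhausted."""
    if tier == "flex":
        raise TierUnavailable("flex capacity unavailable")
    return f"answered on {tier}"


print(send_with_fallback(fake_send))
```

The fallback keeps most traffic on the cheap tier while still returning an answer when flex is unavailable, at the cost of paying the regular price for those requests.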

Additionally, if you are using OpenAI, be sure to select the Responses endpoint by entering the URL https://api.openai.com/v1/responses as the endpoint for your service.

This is particularly important on recent reasoning models; without it, you will not benefit properly from caching, which heavily reduces costs.
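For the curious, here is roughly what a request to that endpoint with a service tier looks like at the HTTP level. Treat it as a sketch under assumptions: the model id "gpt-5.4" and the API key are placeholders, `build_request` is a hypothetical helper, and the tier values accepted by the API (for example "default" for what this post calls standard) should be checked against the OpenAI documentation. The code only builds the request and does not send it.

```python
import json
import urllib.request

RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_request(
    api_key: str,
    prompt: str,
    service_tier: str = "flex",
    model: str = "gpt-5.4",  # placeholder model id, an assumption
) -> urllib.request.Request:
    """Build (but do not send) a Responses API request with a service tier."""
    body = {
        "model": model,
        "input": prompt,
        "service_tier": service_tier,
    }
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("sk-placeholder", "Say hello.")
print(req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(req)` with a real key would actually send it.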

Enjoy!
