Comparative cost analysis of LLMs

:information_source: To use certain Discourse AI features, users are required to use a third-party Large Language Model (LLM) provider. Please see each AI feature to determine which LLMs are compatible.

:warning: The following guide compares the estimated costs of different LLM providers.

Note that the costs might vary based on multiple factors such as the number of requests, the length of the text, the computational resources used, the models chosen, and so on. For the most up-to-date and accurate pricing, please check with each provider.

  • OpenAI: OpenAI’s pricing varies based on the model and usage. For instance, GPT-4 costs $0.03 per 1K tokens (input) / $0.06 per 1K tokens (output). You can also start experimenting with $5 in free credit that can be used during your first 3 months.
  • Anthropic: Claude-2’s pricing is $11.02 per 1M tokens (input) / $32.68 per 1M tokens (output). Please check their model pricing for additional details.
  • Azure OpenAI: Azure OpenAI pricing is also dependent on the model and capabilities, e.g. GPT-3.5-Turbo pricing is $0.0015 per 1K tokens (input) / $0.002 per 1K tokens (output).
  • AWS Bedrock with Anthropic access: On-demand pricing is $0.00163 for 1K input tokens and $0.00551 for 1K output tokens.
  • HuggingFace Endpoints with Llama2-like model: Hugging Face’s pricing varies based on the usage and the type of subscription. For instance, the Pro subscription starts at $20 per user per month.
  • Run your own OSS Llama2-like model with TGI: The cost depends on factors such as infrastructure, any fine-tuning of the model, and the ongoing work of managing and maintaining it.
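To make the per-token prices above easier to compare, here is a minimal Python sketch that normalizes them to a price per 1K tokens and estimates the cost of a single request. The prices are copied from the list above and will go out of date. The provider keys, the `estimate_cost` helper, and the 1,500/500 token workload are hypothetical names and numbers chosen for illustration, not part of any provider's API.

```python
# Prices in USD, copied from the list above and normalized to "per 1K tokens".
# They are a snapshot from this guide; check each provider for current pricing.
# The dictionary keys are hypothetical labels, not official model identifiers.
PRICES_PER_1K = {
    "openai-gpt-4": (0.03, 0.06),
    "anthropic-claude-2": (11.02 / 1000, 32.68 / 1000),  # listed per 1M tokens
    "azure-gpt-3.5-turbo": (0.0015, 0.002),
    "bedrock-anthropic": (0.00163, 0.00551),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for the given provider."""
    input_price, output_price = PRICES_PER_1K[provider]
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price


if __name__ == "__main__":
    # Hypothetical workload: 1,500 input tokens and 500 output tokens per request.
    for name in PRICES_PER_1K:
        print(f"{name}: ${estimate_cost(name, 1500, 500):.5f} per request")
```

Multiplying the per-request figure by your expected number of requests gives a rough monthly estimate. Subscription-based and self-hosted options are left out because their cost does not scale per token.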

This is definitely not a statistically accurate comparison, but based on my short testing, OpenAI GPT-4 is three times more expensive than GPT-3.5 Turbo when counting API calls and the number of tokens used. And because GPT-4 tokens also cost more, the difference in money is much bigger.

And I got no benefit from GPT-4 compared to 3.5 Turbo.

And as a disclaimer: I used Finnish, so English may be a different thing. Plus, any AI is totally useless for chat in Finnish, but that is a totally different ball game. It does mean that, from my point of view, all chatbots are a pure waste of money for small languages.

The costs here are estimates, and I agree that they can vary quite dramatically based on usage!

It’s important to note that for many basic tasks, the difference between GPT-4 and GPT-3.5 models may not be significant. However, GPT-4 does have some substantial differences in terms of its capabilities, creative understanding, and raw input.

I also agree that for languages that are not popular, there is much to be desired in the model’s abilities.


I think we are talking about the same thing, but to be on the safe side :smirk: : that is an issue for the AI companies, and you, I, or any dev can’t change that fact.

But what I’m after is that we all should keep an eye on how much money we are spending (unless the money comes from some other budget than our own pocket :smirk: ) and try to find a balance between very subjective usefulness and money.

And no, I don’t know what I’m talking about. Mainly because the responses of all chatbots are basically just based on the English buzz of millions of flies (quantity over quality). The situation can change, for better or worse, if we get better tools to teach the AI which sources it can use. Sure, we have them, but using them will cost hugely more than the price of tokens.

And yes, that is a headache for small players.

I’m wondering… is there a chance that we can get a better cost/accuracy balance with more freedom to edit prompts?