DeepSeek provider support? What to do when model provider isn't in "Provider" list?

So DeepSeek has just released its open-source reasoning model “R1” (along with an API), which is on par with OpenAI’s o1 but costs roughly as little as GPT-4o-mini. It’s really quite amazing and useful, especially at that price, but it currently isn’t supported in the LLM setup page. :pleading_face:

1 Like

Try setting it like this
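DeepSeek’s API is OpenAI-compatible, so you can pick the OpenAI provider and just point the URL at DeepSeek. A rough sketch of the relevant LLM settings (field labels approximate; the model IDs `deepseek-reasoner` and `deepseek-chat` are DeepSeek’s published names for R1 and V3):

```text
Name:       DeepSeek R1
Provider:   OpenAI             (DeepSeek's API is OpenAI-compatible)
URL:        https://api.deepseek.com/v1/chat/completions
API key:    <your DeepSeek API key>
Model ID:   deepseek-reasoner  (use deepseek-chat for DeepSeek-V3)
Tokenizer:  OpenAI tokenizer   (a close-enough approximation)
```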

8 Likes

omfg why didn’t I think of that… thanks man :smiling_face_with_tear:

I got an API key and connected as @Falco described, tweaked a prompt, and had amazing responses. Truly great. Unfortunately, I then read their privacy policy and TOS: they use, share, and own everything you do, and they’re based in China. As wonderful as it is, I don’t think I can use it in my community for privacy reasons.

Any suggestions for open-source models that can be used in Discourse with an API key? GPT-4o and mini are both great, but these reasoning models are really enticing.

5 Likes

That’s a good point. Luckily R1 is fully open source, and it’s only a matter of time before someone re-finetunes it to strip out the filters/censorship. Then I suspect it’ll become a major model on various cloud providers (Bedrock, Groq, etc.). For now there is no other model that compares to it besides o1; actually, R1 is even slightly better in some respects according to benchmarks.

3 Likes

@MachineScholar thank you for opening this topic and also for your assessment. I’m a bit overwhelmed by this new AI world. I am supervising an intern who is implementing and analyzing AI costs for us. Can you give a rough layman’s view of the cost difference?

Right now, we’re running these LLMs:

  • Claude 3.5 Haiku
  • Claude 3.5 Sonnet
  • Gemini 1.5 Flash
  • GPT-4 Omni

I’m considering implementing DeepSeek R1 because a different intern was really raving today about how great it is compared to GPT-4o and o1 for specific programming tasks.

Both interns are computer scientists and they’re young, so there’s no shortage of enthusiasm for new technology. :slight_smile:

Also, if my Discourse forum has public data, do I need to be concerned about DeepSeek’s privacy terms of use? I guess I’m thinking, if it’s good and saves me money, why not? But, obviously, I don’t want to damage the community.

Hey! Happy to help out :slight_smile:

As of today, here are the costs per 1 million tokens, from lowest to highest input price:

| Model | Input | Output |
|-------------------|--------|--------|
| Gemini 1.5 Flash | $0.075 | $0.30 |
| DeepSeek-R1 | $0.55 | $2.19 |
| Claude 3.5 Haiku | $0.80 | $4.00 |
| GPT-4o | $2.50 | $10.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |

The token prices here don’t take into consideration prompt caching, which can slash costs. Furthermore, the AI community consistently reports that Claude 3.5 Sonnet produces better code than OpenAI’s models, although I think they often go back and forth in quality.

Nonetheless, DeepSeek-R1 is the clear winner here, as it’s not only the best bang for your buck but the best bang in general. The Chatbot Arena Leaderboard backs it up too, as it’s ranking higher than o1.

Yesterday DeepSeek was under heavy cyberattack, which was likely why their API was nonfunctional, but I just tested it again and it’s working now. I opened a topic about that issue too.

As for privacy, DeepSeek clearly states in their policy that the data is stored in China (completely breaking EU law, for example), and it’s no secret that the CCP has access to all company data in China. But if it’s all public data then who cares, really, since your site could theoretically be scraped/mined anyway.

Luckily this model is fully open source and LLM providers are aware of this. For example, fireworks.ai already provides this model, although they are scalping the price, in my opinion, at $8.00 input / $8.00 output. So the official DeepSeek API is certainly far more economical.


In my community, I use GPT-4o-mini with RAG (it’s forced to read a relevant topic before replying in order to provide a more factual/helpful answer) and strict prompt engineering. It has yet to fail me, and it’s very cheap at $0.15 input / $0.60 output. However, I wouldn’t really trust it for coding; that’s certainly best left to o1-mini or DeepSeek-R1. Usually 1/3 to 1/2 of all the tokens used in my community are cached (which you can see in /admin/plugins/discourse-ai/ai-usage), which additionally slashes my costs, as cached tokens are 50% cheaper.

Thus, if my community uses 2 million input tokens and 100,000 output tokens each day, my approximate costs are:
Daily input cost: ~$0.22
Daily output cost: ~$0.06
…multiplied by 30 days = ~$6.60 input + ~$1.80 output ≈ $8.40 per month.
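The back-of-envelope math above can be sketched in a few lines of Python (assuming GPT-4o-mini pricing and, as noted, roughly half of input tokens hitting the cache at a 50% discount; the exact result lands a few cents above the rounded figures):

```python
# Rough monthly cost model for an LLM-powered forum bot.
# Assumptions (from the discussion above, not authoritative):
#   - GPT-4o-mini pricing: $0.15 per 1M input tokens, $0.60 per 1M output
#   - ~50% of input tokens are cached, and cached tokens cost 50% less
INPUT_PRICE_PER_TOKEN = 0.15 / 1_000_000
OUTPUT_PRICE_PER_TOKEN = 0.60 / 1_000_000
CACHED_FRACTION = 0.5
CACHE_DISCOUNT = 0.5  # cached tokens are billed at half price

daily_input_tokens = 2_000_000
daily_output_tokens = 100_000

# Blend full-price and discounted tokens into one effective input rate.
effective_input_rate = INPUT_PRICE_PER_TOKEN * (
    (1 - CACHED_FRACTION) + CACHED_FRACTION * CACHE_DISCOUNT
)
daily_input_cost = daily_input_tokens * effective_input_rate
daily_output_cost = daily_output_tokens * OUTPUT_PRICE_PER_TOKEN
monthly_cost = 30 * (daily_input_cost + daily_output_cost)

print(f"daily:   ${daily_input_cost:.2f} in + ${daily_output_cost:.2f} out")
print(f"monthly: ${monthly_cost:.2f}")
```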

That’s not even lunch money.

4 Likes

This is just another meta-point, but I tested DeepSeek with a question about exercise and losing weight. I got a really bad answer with tons of hallucination, and that is in line with other experiences I’ve heard about.

So, the price tag is only part of the story. What one gets for that price is an important part too.

1 Like

Ah yes that is a good point indeed. I’ve forgotten about this because I practically never use LLMs without RAG or some other form of information injection whenever I am searching for knowledge/info. R1 really shines for me when it comes to brainstorming ideas with “critical thinking.” This all requires excellent prompt engineering though.

To clarify: R1 was trained with reasoning-oriented reinforcement learning from the start, so its plain internal “information retrieval” might produce hallucinations due to “overthinking.” But I haven’t fully read their research paper yet, so take this with a grain of salt, as it’s just my intuition.

It’s also true that it’s easy to jailbreak R1 :wink:

I got some incoherent responses from it also. I was able to use it intentionally to create a couple of good training examples that I put into a RAG text file for something specific. Definitely not ready for prime time. Hopefully OpenAI releases a more cost-effective reasoning model we could use.

@MachineScholar I want to really thank you for your cost analysis and helping me to understand this. I’m a bit overwhelmed with all the new information myself, but the young computer scientist interns seem to suck up the information like a sponge. They may be thinking 8x faster than me…

I have one intern working on the AI plugin for two different Discourse communities. We’re paying the interns, but they are cheap and they’re certainly enthusiastic. The intern mainly doing the AI work is at a University of California computer science program and I often wonder what the on-campus discussions are like in such a young group where the future is so clearly their future to create.

I also wonder what your own research environment is like? You seem to be deeply involved in the technology. What a great time to be involved. So exciting.

I’ll likely start a new topic on my next question. The intern is implementing Google Custom Search and GitHub Token access for the AI bot. I’m not quite sure what these are. However, I’m hoping that the AI bot can access GitHub repos to look through documentation… I’m not sure what’s possible. I also don’t know if Retrieval-Augmented Generation (RAG) is used in the Discourse AI plugin.

Regarding the efficacy of DeepSeek R1 versus o1, a different intern told me about using it for their CS projects via the web app UI (alongside ChatGPT Plus). So the test was super informal, but that intern’s enthusiasm for DeepSeek was big.

The intern who is actually working on the AI implementation has been much more reserved about the differences between the LLMs. They are primarily providing cost and usage tables, with limited comments thus far on usage differences. We will be making all the LLMs available to the community and asking them to assess. So it’s smart of the intern to keep their opinions quiet for now.

Thank you again for your help on my journey.

1 Like

DeepSeek is hitting the whole AI world hard, businesses and corporations alike.

They do more with less in every respect. You can search for the technical differences; I found info on Reddit using a local client, because I don’t agree with their policies, but you can get there.

I’m impressed by their patience in doing better work without billions from venture funds. OpenAI is too expensive for a lot of countries, and that’s not what the internet or our digital era should be about.

Of course, the CPC is directly involved, but nowadays the broken Western laws and governments are almost the same.

The model is censored like OpenAI’s (Tiananmen Square or Gaza) but performs really well at 1/10 to 1/20 of the previously common cost.

I think that’s good for users and for the technology. The old services need to change their approach or let people choose.

1 Like

I’m very happy that I could help!

Yeah, being in computer science these days means having to adapt and learn extremely rapidly. It’s quite tiring sometimes, though. I imagine innovative campuses in California are at the cutting edge. I’m familiar with many labs at Californian universities where cutting-edge research in intelligence and cognition is being done.

I currently have my own company in which I develop intelligent educational technology, and I also work in a tiny AI lab where we are attempting to build a proto-mind and then find some business use case for it. In the near future I will start my own research lab in my own niche research interest, which is intelligent space exploration systems. The AI world is all quite exciting — that’s true — but a part of me sometimes wishes it would all slow down so I don’t have to keep catching up too haha!

Google Custom Search and GitHub Token access will just let the AI Bot access Google Search and GitHub (for programming stuff), respectively. Also, the Discourse AI Bot does indeed do RAG whenever it reads topics or posts in the forum: it reads the content, then uses that text as extra context to generate a more informed reply.
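If it helps to see the pattern concretely, here’s a toy sketch of the retrieve-then-prompt loop. This is not the actual Discourse AI code; the word-overlap retriever and the sample documents are illustrative stand-ins (a real system would use embeddings):

```python
# Toy sketch of the RAG pattern: retrieve relevant text first, then
# inject it into the prompt as extra context for the model.
def retrieve(query: str, documents: list[str]) -> str:
    """Pick the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda d: len(query_words & set(d.lower().split())))

def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved document so the model answers from real context."""
    context = retrieve(query, documents)
    return (
        "Answer using ONLY the context below.\n"
        f"Context: {context}\n"
        f"Question: {query}"
    )

# Hypothetical forum snippets standing in for retrieved topic text.
docs = [
    "Backups run nightly at 03:00 UTC and are kept for 30 days.",
    "New users are rate-limited to three posts on their first day.",
]
print(build_prompt("How long are backups kept?", docs))
```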

Indeed, it’s good that your interns know how to follow the trends; however, it would also be smart of them to remember that LLMs are always overhyped, because hype is good for the market. The big LLM developers have an incentive to hype it all up. Although, I will admit, these systems are getting more impressive with time.

@oppman Feel free to PM at any point if you ever need anything! We’re all in this together!

2 Likes