Why was external AI chosen over an internal system?


I am reaching out to inquire about the related content feature on Discourse. I’ve noticed it relies on external AI. Why was this approach chosen over developing an internal system based on tags or categories that could directly offer relevant content? I’ve seen an option to suggest topics from the same category but nothing for tags.

Is there an official Discourse plugin or component that provides this functionality? I would like to thank you for your daily work and the constant innovation you bring to the platform.


Note: I am not a Discourse employee and not an OpenAI employee, but a category moderator on the OpenAI forum (creators of ChatGPT and GPT-4), and I have been using many of the Discourse AI features, even during development.

The means you note (tags and categories) are syntactic searches; this feature uses semantic search, which relies on embeddings instead of keywords.
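To illustrate the difference, here is a minimal sketch of embedding-based ranking. The vectors below are made-up toy values; in a real system they would come from an embedding model, and the function names are my own, not anything from the Discourse AI plugin:

```python
import math

# Toy illustration: in a real deployment each vector would be produced by an
# embedding model; here they are hand-written 4-dimensional stand-ins.
EMBEDDINGS = {
    "How do I configure tags?": [0.9, 0.1, 0.0, 0.1],
    "Setting up tag groups":    [0.8, 0.2, 0.1, 0.0],
    "Best pizza recipes":       [0.0, 0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def related_topics(query_vec, k=2):
    # Rank stored topics by how close their embedding is to the query embedding.
    ranked = sorted(
        EMBEDDINGS.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]

query = [0.85, 0.15, 0.05, 0.05]  # pretend embedding of "using tags on my forum"
print(related_topics(query))
```

Note that "Best pizza recipes" shares no keywords with the other titles and scores near zero, while the two tag-related topics score high even without exact word overlap — that is the whole point of semantic over syntactic matching.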

A Google search for "how does semantic search work" turns up many articles; here is one I think many here may like:

Yes, it is part of the Discourse AI plugin, specifically the Semantic Related Topics feature.


On the use of external resources: you can run your LLM locally if you like:

But have you done this for a project?

It requires you to own or lease particularly impressive hardware!

Try the smaller language models (that you might consider hosting) yourself and see how impressed you are:

Your mileage may vary, but IMHO you'd need to be looking at a model of at least 70B parameters, which is going to be quite costly to self-host.

For reference, GPT-3.5 is supposed to be a 175B-parameter model, and GPT-4 has nearly 2 trillion (they say) :sweat_smile:

I wrote this plugin:

And it has an AI tagging feature. In my experience you need GPT-4 Turbo to make it work well (and it really works well then!)

If you intended to self-host something as powerful as those, you'd need very deep pockets.

This is why the use of an external LLM API is still an attractive, pay-as-you-go option: you are only paying for the calls you make, not for expensive infrastructure that sits idle between them.

Of course, if privacy is a sufficiently major concern, that might change the maths.


@EricGT @merefield Thank you for your prompt response and the provided information. I understand and value the innovation AI brings to the Discourse platform. However, I am concerned about the strict data protection requirements in Europe, especially in France with the GDPR. Consulting a lawyer to ensure compliance with our privacy policy could be quite costly.

This is why I was wondering if there is a Discourse plugin available that offers related content functionality without the need for external AI.

Furthermore, I would like to share my personal experience with you: I have long hesitated to launch my Discourse forum, fearing that I might not do things right despite the available guides. The advent of ChatGPT was a game-changer for me. It’s incredible how it has changed my life: with its help, I have been able to undertake projects that I previously wouldn’t have dared to. It’s a revolution that opens up new prospects for me and allows me to move forward confidently.

Thank you again for all the work you do and your ongoing support.


As mentioned, the AI plugins can use external services or you can create your own personal cloud system that performs the same tasks. However, running your own AI service is costly, requires extra maintenance and doesn’t give you comparable results to external services.

This is all a limitation of the AI technology itself, which is so challenging to maintain and run; there is nothing Discourse can do about that. The Discourse plugins are agnostic as to whether you are using an external service or an "internal" one.

Not relevant at all in this context.

But… a lawyer is far cheaper than a self-hosted LLM.