We are happy to announce Discourse AI, a new plugin and our one-stop solution for integrating Artificial Intelligence and Discourse, enabling new features and enhancing existing ones. With this first release, we are shipping 7 different Discourse AI modules to help community managers, members, and moderators with tasks ranging from sentiment analysis to automated proofreading and suggested edits. Read along to find out more about each of these features, as well as what is coming up next on our roadmap!
This is an impressive body of work, @Falco and team. Really excited to see how this all works in practice and its impact on community management overall.
These are the kind of updates that feel like opening a new Christmas present.
We do not (at the time of writing) have a dedicated manager for our community, and tools like this let us continue to scale without a dedicated role.
Not to mention the features like composer helper that just elevate the user experience.
Yes, we are planning on exploring this area. The tricky thing is that we only have a small number of examples we can feed into GPT-4 given the prompt limits; staying within the token limits is really hard. There are quite a few other approaches we can take, though, and we will explore and report back.
Even with very little fine-tuning, GPT-4 does not do a terrible job assessing stuff:
Could you try it with a post that contains a long block of code or syslog output? Those get tagged as spam by Akismet all the time on our site.
Probably, but it would get super expensive to fine-tune a model. Some people get really good results simply by using embeddings; that is probably the next thing to try.
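To make the embeddings idea concrete: embed known spam and ham posts, then compare a new post's embedding to each group, with no fine-tuning involved. A minimal sketch, where the tiny 3-D vectors and the centroid classifier are toy stand-ins (a real setup would get its vectors from an embeddings API):

```python
# Sketch of the embeddings approach to spam classification: compare a
# post's embedding to the centroid of known spam and ham embeddings.
# The vectors below are toy stand-ins, not real model output.
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(post_vec, spam_vecs, ham_vecs):
    spam_sim = cosine(post_vec, centroid(spam_vecs))
    ham_sim = cosine(post_vec, centroid(ham_vecs))
    return "spam" if spam_sim > ham_sim else "ham"

# Toy 3-dimensional embeddings, for illustration only.
spam = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
ham = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
print(classify([0.85, 0.15, 0.05], spam, ham))  # → spam
```

No training run at all here, which is why this route stays cheap: the only per-post cost is one embedding call plus a similarity lookup.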
When I checked, fine-tuning was way cheaper than I expected. It depends a lot on how much training data you plan to use, but if the comparison is with what you can fit in a single GPT-4 prompt, it's cents.
I didn't get to the point of actually using it, so chances are I missed something; please correct me if I'm wrong.
Training can be very, very expensive. In our case, just for OpenAI's recommended minimum, we'd be looking at almost $200,000 to train on a single use case.
Are new users still getting confused by TL1 limits?
If so, I think AI could be a good solution for that: let new users do more, but have the AI pay close attention to them, and put posts in the moderator queue when it's not confident they're ok.
No probs at all @Falco was doing a spike on this today and it looks very promising, even a trivial prompt does surprisingly well. Spam is just sooooo spammy.
Will leave it to Falco to share specifics.
Another interesting approach, which we can possibly combine with the above, is leaning on the vector database. If you post something and its vector is close to 20 other spam posts… well, it is probably spam. This approach also allows fine-tuning.
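That nearest-neighbour idea fits in a few lines. In this sketch the `looks_like_spam` helper, the toy 2-D vectors, and the 80% threshold are all illustrative assumptions; in practice a vector database would do the lookup over real embeddings:

```python
# Sketch of vector-similarity spam detection: embed the new post, find
# its nearest stored vectors, and flag it if most of those neighbours
# were already marked as spam.
from math import dist  # Euclidean distance, Python 3.8+

def looks_like_spam(post_vec, labelled, k=5, spam_ratio=0.8):
    """labelled: list of (embedding, is_spam) pairs for past posts."""
    neighbours = sorted(labelled, key=lambda item: dist(post_vec, item[0]))[:k]
    flagged = sum(1 for _, is_spam in neighbours if is_spam)
    return flagged / len(neighbours) >= spam_ratio

# Toy 2-D embeddings: a spam cluster near the origin, ham near (1, 1).
history = [([0.0, 0.0], True), ([0.1, 0.0], True), ([0.0, 0.1], True),
           ([0.1, 0.1], True), ([1.0, 1.0], False), ([0.9, 1.0], False),
           ([1.0, 0.9], False)]
print(looks_like_spam([0.05, 0.05], history))  # → True
```

The knobs (`k`, `spam_ratio`) are where the fine-tuning mentioned above would happen: tightening or loosening them per community trades false positives against misses without retraining anything.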
To be honest, I kind of see Akismet's future as not that bright. Matt must be stressing out about its long-term prospects.