I may want the service, and likely will, but it's early days for the forum I have in mind, so there's not enough data to chew on yet.
Since you are playing with this technology, can you tell us what role tags play in training the AI? I put a ton of effort into clustering the corpus of one of my forums in order to generate labels that could then be used to categorize and tag topics. While categorization went very well, implementing tags is problematic because of the sheer number of terms involved. There's no practical way to present them all.
I would think that the AI could use those terms to improve its own results.
Discourse AI will now store the embeddings in the same DB instance we use for everything else. That makes it much easier to install and maintain, and we will automatically import the embeddings from the old database when you update. After that you can decommission the old database.
Ah, this explains the issues I now get with my setup:
```
I, [2023-07-18T09:29:11.218667 #1] INFO -- : > cd /var/www/discourse && su discourse -c 'bundle exec rake db:migrate'
------------------------------DISCOURSE AI ERROR----------------------------------
Discourse AI requires the pgvector extension on the PostgreSQL database.
Run a `./launcher rebuild app` to fix it on a standard install.
Alternatively, you can remove Discourse AI to rebuild.
------------------------------DISCOURSE AI ERROR----------------------------------
```
My database is an RDS Aurora Serverless v2 instance and hence cannot use the pgvector extension. Any chance of configuring the old behaviour?
Are you using serverless for the main Discourse DB or only for the embeddings one? Discourse AI now stores the embeddings in the main DB and requires the pgvector extension to be enabled there. It's available on RDS PostgreSQL 13.11 and greater. We don't use Aurora in production, only RDS PostgreSQL, so that's the only setup I can recommend.
RDS is a managed service from AWS, so it can't be packaged in a Docker image.
Discourse AI works with the PostgreSQL version we package in our Docker image, with Amazon RDS, or with any PostgreSQL instance that has the extension installed.
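If you run your own PostgreSQL (RDS or otherwise) and want to confirm the extension is there before rebuilding, something like the sketch below works. The host, user, and database names are placeholders, and on RDS you need a role with rds_superuser to create the extension.

```bash
# Placeholders: adjust host, user, and database names to your own instance.
# On RDS for PostgreSQL 13.11+ the vector extension ships with the engine.
psql "host=your-db.example.com user=discourse dbname=discourse" <<'SQL'
-- Is pgvector available on this server?
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'vector';

-- Enable it in this database (no-op if it is already enabled)
CREATE EXTENSION IF NOT EXISTS vector;
SQL
```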
Do you mean recommending “Related Topics”? In that case, no, not yet. There are no embedding models based on Llama 2 yet.
Worth mentioning that the ones we ship (one open source and one via the OpenAI API) are really good and more than enough to power the Related Topics feature.
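For a rough idea of what the OpenAI-backed option involves, an embedding is just one HTTP call per text. This is a minimal sketch; the model name and example input are my assumptions, not necessarily what the plugin sends.

```bash
# Ad-hoc call to the OpenAI embeddings endpoint (placeholder input;
# the model name is an assumption, not a Discourse AI setting).
curl -s https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-ada-002", "input": "How do I enable Related Topics?"}'
```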
Not at the moment, as that would require me to keep two separate repos: one with the app code and another with the internal tooling to build images and push them to our internal repositories. I really couldn't find the time to set this up properly.
The API code is all visible inside the container image, though. Even if that is not the best way to peruse it, at least it's all there.
Could anyone share the exact minimum and recommended server requirements for a forum with standard traffic? Honestly, I want to give it a try, but I don't know where to start since there is no clear server requirement.
My forum has 200-250 online users and an average of 300 new posts a day, so it isn't too much; that's why I said standard. I understand what you mean, but I plan to rent a new server because the cloud server I am using now does not allow many upgrades. Thanks for your answer.
For example, if you just want to play with embeddings, a $6 droplet running them on the CPU will be enough, and that will give you access to the Similar Topics feature.
Now if you want the AI Helper and AI Bot, you can:

- pay per call on OpenAI, where the cost will depend on your usage.
- run an open-source LLM on a server you own, for privacy. A model like Llama2-70B-Chat will need a server that costs $10k ~ $25k a month, though.
- run an open-source LLM on a pay-per-hour service. You can run a quantized version of Llama 2 in Hugging Face Endpoints for $6.50 an hour, and it will automatically sleep after 15 minutes without requests (a request sketch follows below).
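To give an idea of the pay-per-hour option, once the endpoint is up a request is just an HTTP call. The sketch below assumes a text-generation-inference endpoint; the URL, token, prompt, and parameters are all placeholders.

```bash
# Placeholder URL and token for a dedicated Hugging Face Inference Endpoint
# running a quantized Llama 2 chat model behind text-generation-inference.
curl -s https://your-endpoint.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "Summarize the discussion above in two sentences.", "parameters": {"max_new_tokens": 256, "temperature": 0.7}}'
```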
The MLOps area is moving fast, GPUs are super scarce, and new models launch every day. It's hard to predict; we are all experimenting.