I will be explaining Discourse AI integration to the team next week. We already have some API keys, but they are all for paid accounts.
I am sure I will be asked: are there any features that can be integrated and used for free? This is for a school, and we are sure students will want to use everything, but the costs would be out of this world.
Thanks
Actually, Llama 3 is free, but as best I can tell, it takes a $300,000 computer to run it.
If you have some budget, you could set a monthly cap with whatever service you pay for: when the budget is used up for the month, it's used up. You'd want to set up limits so it wouldn't all get burned in the first week (or day). It would be complicated to set things up so that teachers who wanted to use it as part of a class could count on it being available.
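To make the pacing idea concrete, here is a minimal sketch of a monthly spend cap with a per-day pacing limit. This assumes you track an estimated cost per request yourself; the class and all the numbers are illustrative, not any provider's actual API.

```python
from datetime import date

class MonthlyBudget:
    """Hypothetical monthly spend cap, paced so one day can't burn the whole budget."""

    def __init__(self, monthly_cap_usd, days_in_month=30):
        self.monthly_cap = monthly_cap_usd
        # Pace spending evenly: the daily cap is the monthly cap spread over the month.
        self.daily_cap = monthly_cap_usd / days_in_month
        self.spent_this_month = 0.0
        self.spent_today = 0.0
        self.today = date.today()

    def _roll_day(self):
        # Reset the daily counter when the calendar day changes.
        if date.today() != self.today:
            self.today = date.today()
            self.spent_today = 0.0

    def allow(self, estimated_cost_usd):
        """Return True and record the spend if the request fits both caps."""
        self._roll_day()
        if self.spent_this_month + estimated_cost_usd > self.monthly_cap:
            return False
        if self.spent_today + estimated_cost_usd > self.daily_cap:
            return False
        self.spent_this_month += estimated_cost_usd
        self.spent_today += estimated_cost_usd
        return True

budget = MonthlyBudget(monthly_cap_usd=30.0)  # e.g. a $30/month budget -> $1/day pacing
print(budget.allow(0.50))  # fits both caps
print(budget.allow(5.00))  # within the month, but over the $1/day pacing cap
```

The daily pacing is what lets teachers plan around it: even a burst of student usage can only exhaust that day's slice, not the whole month.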
For what it’s worth, you can run the 70B version of Llama 3 within 48GB of VRAM, which you can source relatively easily from a pair of used Nvidia RTX 3090s on eBay at around $750 each. Building out the rest of a system to support that would land in the ballpark of $3,000, I expect.
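For a rough sense of why 48GB is enough: the back-of-the-envelope math below assumes 4-bit quantization and an arbitrary overhead figure for the KV cache and runtime, so treat it as an estimate, not a measurement.

```python
# Rough VRAM estimate for a 70B-parameter model at 4-bit quantization.
# Illustrative assumptions: real usage also depends on context length,
# KV cache size, and the inference runtime's overhead.
params = 70e9            # 70 billion parameters
bits_per_weight = 4      # e.g. a Q4-style quantization
weights_gb = params * bits_per_weight / 8 / 1e9  # bytes -> GB
overhead_gb = 8          # assumed KV cache + runtime overhead

total_gb = weights_gb + overhead_gb
print(f"weights ~{weights_gb:.0f} GB, total ~{total_gb:.0f} GB")  # ~35 GB + 8 GB = ~43 GB
```

At 4 bits the weights alone are about 35GB, which leaves headroom inside 48GB; at 8-bit the weights alone (~70GB) would not fit.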
Gemini Flash is very cheap, and the DeepSeek API is so cheap it is almost free.
If you need to run locally, you can build a machine for under $1,000 that can run Llama models. I built a 4x P100 machine for $1,000 that has 64GB of VRAM. But a 2x P40 machine with 48GB of VRAM would be enough to run 70B Llama, and it can be built for around $600 if you buy second-hand parts.
For the ultimate in cheap, you could run on a single P40 GPU with AQLM quantization, but this would be quite slow (~1 tok/s).
Interesting that no one is factoring in the electricity costs for all these self-host solutions. I guess that’s one consolidated invoice which isn’t traceable to a specific machine anyway…
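For a rough sense of scale, here is a back-of-the-envelope electricity estimate. Every number is an illustrative assumption (a 2x P40-class box drawing ~500W under load and ~100W idle, at a hypothetical $0.15/kWh), not a measurement:

```python
# Back-of-the-envelope monthly electricity cost for a self-hosted GPU box.
# All figures below are assumptions for illustration only.
LOAD_WATTS = 500          # assumed draw under inference load
IDLE_WATTS = 100          # assumed idle draw
LOAD_HOURS_PER_DAY = 8    # e.g. school-hours usage
IDLE_HOURS_PER_DAY = 16
PRICE_PER_KWH = 0.15      # hypothetical electricity rate, USD

kwh_per_day = (LOAD_WATTS * LOAD_HOURS_PER_DAY
               + IDLE_WATTS * IDLE_HOURS_PER_DAY) / 1000
monthly_cost = kwh_per_day * 30 * PRICE_PER_KWH
print(f"{kwh_per_day:.1f} kWh/day, ~${monthly_cost:.2f}/month")
```

Under these assumptions that works out to about 5.6 kWh/day, or roughly $25/month, which is real money but well below typical paid-API worst cases for a whole school.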
You just touched on a subject that a friend who works for a utility company mentioned the other day as fallout from remote work. AC/heating units are now running round the clock because folks are home using them non-stop. The result has been astronomical utility bills for many.