Seems very good at it:
there go several businesses!
Utterly preposterous imho
Should we reimburse all of humanity for evolving the beautiful languages we have?
But I digress.
I don't disagree with you, but I suspect that many lawsuits are considered utterly preposterous by the defendants but costly nonetheless.
If a human-vetted question and answer (for example a Discourse solved topic) has economic value as training data, it doesn't seem unreasonable to want to get paid for it. There's a need for the data, so it would be kind of a win-win scenario.
There are at least two writing competitions in which the object is to write in the style of some designated author. (Bulwer-Lytton and Hemingway)
But I could see where asking an AI to write a novel in the style of some well-known author might raise some hackles with that author or their heirs; a recognizable style could be considered "intellectual property", or at least some lawyer might be willing to claim so in court.
Has anyone had a lot of buzz from users excited to use Discourse Chatbot within their forums? I have seen all this chatbot stuff and I use ChatGPT, Perplexity, Claude, Bard, etc. every day. But I thought forums were a safe space from all of that. I wrote an article about that yesterday: I Think AI Saturation Will Revive this Old Web Platform (web forums)
I'm really curious whether forum users want chatbots and AI when they visit discussion forums powered by Discourse and others. If this is the case, I will really have to revamp my idea of forums and even consider a plugin like this. This seems like a big project, maybe even a time-consuming one. As always, I appreciate all you guys do. I'm trying to learn about the demand that produced this so that I'm in the loop, as it were.
Iām looking into using it in a technical support forum to help answer easy/repetitive questions quickly when staff is busy and during off hours. I think it will be great in that capacity.
Yes, recently I opened a chat window with Hostinger support. It was an AI chatbot. And the chatbot was so effective it told me about an option for a refund I would never have known about and even sent me a link to the refund policy! lol
It understood what I was asking and didn't ask me if I had already tried 10 basic things. So yes, I can see it being useful for support cases.
Hopefully, that is then saved to the forums, so others can see or even add to the discussion rather than replace it.
Would that also be the case with a knowledgeable support person who had experience using the software they provide support for?
Of course not. There is no such thing as a perfect option for everybody.
GPTs can evolve. But right now they are a low-level option, even for simple math. GPT-3.5 can't get even the basics reliably right. Hallucination is a really big problem where the facts should be right, or at least close to right.
Languages other than English are hard. For a few massive languages it will work well, but for me, and for everyone who speaks a minor one, especially one whose structure doesn't use prepositions, translations will never be top notch.
GPT will first translate the prompt to English, changing it in the process. Then the answer is translated back from English, and GPT does another round of changes and hallucination. The end product is far away from what was asked, and even from what GPT was offering in the beginning.
And because training is based on the idea that a million flies can't be wrong and quantity beats quality, the amount of mis- and disinformation is more than just huge. And even then there will be still more fiction, because of hallucination.
Of course it is not that black and white. I'm using an entry-level solution. But if there is money to spend, one can do one's own training and the playing field changes big time.
Still, I'll make a claim: GPT works best when analyzing or doing something with not too many variations, or when it can create something "new", totally fictive stuff. But in the wide middle ground where a GPT should offer facts and reliable information… not so much.
I'm using GPT-3.5 by OpenAI a lot every day as… search on steroids. And I'm not too happy. I have to check, re-check and rewrite a lot, but I don't deny that GPT still saves me time when creating bulk text.
There was an interesting study on a version of this question published recently:
https://www.nature.com/articles/s41598-024-61221-0
The consequences of generative AI for online knowledge communities
Generative artificial intelligence technologies, especially large language models (LLMs) like ChatGPT, are revolutionizing information acquisition and content production across a variety of domains. These technologies have a significant potential to impact participation and content production in online knowledge communities. We provide initial evidence of this, analyzing data from Stack Overflow and Reddit developer communities between October 2021 and March 2023, documenting ChatGPT's influence on user activity in the former. We observe significant declines in both website visits and question volumes at Stack Overflow, particularly around topics where ChatGPT excels. By contrast, activity in Reddit communities shows no evidence of decline, suggesting the importance of social fabric as a buffer against the community-degrading effects of LLMs. Finally, the decline in participation on Stack Overflow is found to be concentrated among newer users, indicating that more junior, less socially embedded users are particularly likely to exit.
That pretty much describes my own behaviour. I still ask and answer questions on Meta - I've got a social connection here. But for learning about new programming languages and frameworks I rely on a combination of ChatGPT and online documentation.
Possibly the main thing LLMs have going for them is their availability. I'd prefer to get guidance from human experts, but no one has enough time or patience to answer all of my questions at the drop of a hat.
A big downside of learning through LLMs as opposed to learning on a public forum is that the information that is generated is private. It's fairly seldom that learning something via an LLM is just a matter of asking it one question and having it return the correct answer. It's more like ask it a question, try applying the answer, read some documentation to figure out why the answer didn't work, get back to the LLM with a follow-up question… eventually a bit of knowledge is generated.
I don't think anyone wants to read other people's chat logs, but possibly technical forums could promote the idea of people posting knowledge that they've gleaned from LLMs.
Another obvious downside of learning via LLMs is the loss of social connection, human attention as a motivation for learning, job opportunities, etc. That's kind of a big deal from my point of view.
Availability is the main reason weāre building a support bot.
Iām equal parts excited about this tech and slightly worried about the future of web.
Companies including Google (and Bing) are now using data gathered by crawling your site to provide an AI-powered Q&A at the top of their search page.
Not only does this push search results down the page and deemphasize sources, but it also creates another worrying dynamic: this will encourage search providers to seek greater integration with some select big data sources.
e.g. Google is reported to have entered into a deal with Reddit to gain access to their API.
IMHO the upshot of that is it will tend to further promote content on larger platforms and harm smaller sites.
Now there is a fair amount of controversy at the moment regarding the quality of the results Google is getting with its "AI Overview" feature, with some hilarious and not-so-hilarious examples that are arguably quite embarrassing for the company. I'm sure the technology will improve, though.
Perhaps smaller forums are in a better position to optimise their local use of AI as they can specialise. Google is struggling with providing a very generic service.
Time will tell, but the battle to get attention is still very much being fought.
This was one of my ideas. I was thinking of fine-tuning a BERT-like model to automatically classify posts into categories, or automatically add tags. Detecting ātoxicā posts would be another use case.
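For illustration, here is a minimal stand-in for that classifier: a toy keyword scorer with the same input/output shape a fine-tuned BERT-style tagger would have. The tag names and keyword lists are made up, and a real setup would replace the keyword overlap with model logits:

```python
# Toy stand-in for the fine-tuned classifier described above: scores a post
# against per-tag keyword sets and returns every tag that matches.
# The tags and keywords below are hypothetical examples.

TAG_KEYWORDS = {
    "support": {"error", "crash", "install", "upgrade"},
    "feature-request": {"feature", "suggestion", "idea"},
    "needs-review": {"spam", "offensive"},
}

def suggest_tags(post: str, threshold: int = 1) -> list[str]:
    """Return tags whose keyword overlap with the post meets the threshold."""
    words = set(post.lower().split())
    return [
        tag
        for tag, keywords in TAG_KEYWORDS.items()
        if len(words & keywords) >= threshold
    ]
```

A BERT-style model would do the same job with contextual embeddings instead of exact word matches, which is what makes the fine-tuning worthwhile for real forum text.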
For something like Discourse, there's probably more you could do with AI than I have lifetime to do it in. Although, once AI helps with the implementation, maybe it can be done in a lifetime…
Honestly, I think this issue underlies all AI-related topics, and naive guy that I am, I think it can (only) be solved by a community-owned model.
One that is trained on data we willingly provide and regulate, simply by adhering to the provided licenses. An ethically trained model, computed on all our machines.
Peer-to-peer computation has a long tradition; certain scientific fields have done it for a couple of decades now.
IMHO, there is no way around that, or at least a comparable solution, if we want to use AI and not sacrifice our principles long term.
LLM-based moderation will be great: you could ask it to evaluate every post on different arbitrary measures (relevant to the community) and perform actions, apply filters, or offer help.
I see the start of some of this here but not clear on the feature set: Discourse AI Features | Discourse - Civilized Discussion
I believe the feature set you are looking for is Discourse AI - AI triage
We do have some plans to revamp the landing page so the context is even clearer for all AI features
As seen at large on Facebook, Instagram, TikTok, etc.
It really depends on what the goal is - take down offensive content, guide the user etc.
One goal that interests me in particular is using an LLM to analyse questions/problems when they are submitted. The goal is not to answer the question, but rather to help the user express their problem in a more constructive way. Too often we see the first reply is "can you please post your error logs" or "what exactly are you trying to do?". An LLM could catch topics which fall into this category and nudge the user to provide those details, speeding up the whole support process and creating a higher quality topic for future readers.
Early work has been promising, showing about 93-95% accuracy on a dataset of about 60 topics. The inaccuracies aren't even that bad - half of the answers where our assessment disagrees with the LLM's are very dubious to begin with.
My main finding, as obvious as it may be, is: the more you reduce the scope of your query to the LLM, the more accurate the answer will be.
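As a concrete illustration of that finding, the nudge flow can be sketched as one tightly scoped yes/no question per missing detail, rather than one broad "assess this topic" prompt. The check list, prompt wording, and injected `ask_llm` callable below are all hypothetical stand-ins, not the actual implementation:

```python
# Hypothetical sketch of narrow-scope LLM triage: each check becomes its own
# tightly scoped yes/no question, since smaller scope tends to mean a more
# reliable answer.

CHECKS = {
    "error_logs": "Does the post include error logs or a stack trace?",
    "goal": "Does the post state what the user is ultimately trying to do?",
}

def build_triage_prompt(check: str, post: str) -> str:
    # One question per call keeps the scope of the query small.
    return (
        "Answer strictly YES or NO.\n"
        f"Question: {CHECKS[check]}\n"
        f"Post:\n{post}"
    )

def missing_details(post: str, ask_llm) -> list[str]:
    """Return the checks the (injected) LLM answered NO to, so the forum
    can nudge the author to supply those details."""
    return [
        check
        for check in CHECKS
        if ask_llm(build_triage_prompt(check, post)).strip().upper() == "NO"
    ]
```

Passing `ask_llm` in as a callable keeps the sketch independent of any particular model API, and makes the triage logic easy to test with a stub.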