Identify AI hallucinations to search engines

For those of us now living with AI hallucinations in public data, it is clear that identifying the hallucinations is time consuming.

So, in thinking about the problem, I remembered that if one uses Hide Details in a post, there is a good chance a respectable search engine will not index that content. I know this because I use the feature often.

While Hide Details achieves part of the solution, it hides information that should be visible up front. If instead the selected data were tagged with metadata noting that it contains a hallucination, and that metadata could pass all the way from the time of creation through to its use by an API or viewer, it would hopefully solve the problem. Having an icon option like Hide Details, but for hallucinations, would be just as nice.

I know many will think that this is not just a Discourse problem, and I would agree, but Discourse is a big enough entity in the field that it could form a group with other entities and set a standard to address the problem, which will only get worse with time. :slightly_smiling_face:

2 Likes

Is it safe to assume you’re talking about hallucinations as in inaccurate data produced by an AI? Is there any benefit to treating this data differently from misleading human-produced data? Are we at a point where people are treating this data as more authoritative than human posters?

6 Likes

Thank you for taking the time to read this and ask meaningful questions.

Yes!

At present no.

However, if you have access to an attorney in or for the company, then asking them about this might be of benefit. This is one of those areas where governments are looking at creating laws, and while I am not an attorney, bringing a plan to the government before they make the laws is better than having a law created and then having to abide by it.

Yes, many researchers believed citations were real until they could not find them. I don’t collect such examples, but I can keep an eye out for them and note them if you would like.

AI Generated Photograph wins award!

I get where you are going, but that is not the question I would be asking. Perhaps these instead:

Are people aware that AI is now creating human-like text that sounds correct but is not?
Are people aware that AI is creating images that people are convinced were created by another human, or that even appear to have been taken by a camera but are instead AI generated?
Would people such as researchers, teachers, and others with a vested interest in factual information like to know that hallucinations are tagged as such with metadata, or that data came from a source known to generate hallucinations?

A bigger problem is others who will take advantage of people who do not know about such things. If data is marked as a hallucination and is then scrubbed of the meta tag, I would not be surprised if a lawyer could use that to show intent.

Again, thank you for your interest. At present I don’t plan to push this further, as you seem to have the idea, but if you have other questions I don’t mind answering. Personally, I would rather live in a world where we did not need spam filters, virus scanners, and now, it seems, hallucination checks, all because people want to misuse the technology. :slightly_smiling_face:

One of my concerns with AI is that as AI is trained on data that was generated by AI, information will become more and more diluted. To avoid this, it might be in the interest of organizations that are training LLMs on the internet to know how the data they are training on was generated. Some kind of standard could be useful for this. As an example, I’d hate to see LLMs relying on ChatGPT’s response to “What are the best mountain bike trails in Nanaimo for an intermediate level mountain biker?” as authoritative data about mountain biking in Nanaimo.

4 Likes

On that note, it would be far more approachable to describe the data by what it is, rather than enmeshing it in an argument about computer-generated text, especially when discussing it with an audience who may not have the background information.

Embedding your own bias (whether true or not) into the question is somewhat disingenuous.

For example:

It would be convenient to have a standard way to delineate text written by a generative language model (Computer AI).

We already have some methods, e.g.: @discobot roll 4d6

In the absence of an explicit author, proper citation is probably the best approach.

1 Like

:game_die: 4, 6, 4, 5

Yeah the general public are already falling into the trap of using ChatGPT as if it were a search engine that provides reasonably definitive results.

Of course it is not: it is an expert grifter!

‘60 Minutes’ asked ChatGPT for recommendations for books on the subject of inflation’s effect on the economy. It dutifully responded with the titles and authors of six books: none of which exist!

I asked ChatGPT on two separate occasions if anyone had died at the Goodwood Festival of Speed.

It told me both times that this had only happened once, but each time it gave me a different year and name.

This is a huge limitation and definitely a problem.

It’s a great tool but you currently shouldn’t use it for search without appropriate plugins.

3 Likes

When consuming a forum with generated content (either user generated content or AI generated content), the viewer / consumer should be the one to exercise caution.

A forum owner never will, and never should, take responsibility for all generated content that is published on their forum, because taking that responsibility would also imply liability. By flagging specific posts as ‘factually incorrect,’ a forum owner may inadvertently suggest that all unflagged posts are ‘factually correct,’ potentially causing significant issues.

This leads to my point that assessing the factual correctness and practical usability of information in forum posts is the responsibility of the consumer, rather than the publisher.

3 Likes

If this is a response to me then you are not understanding what the feature would do.

The feature would give users the ability to identify portions of any data as possibly containing hallucinations. I am also not asking any entity to take on the responsibility of identifying such content, but rather to give users, when creating content, a way to mark with metadata that it may contain hallucinations. Hence the mention of Hide Details: the user chooses to use it when needed, it applies to as much or as little as the user chooses, and it can even be used multiple times in the same reply.

Aren’t you simply asking for a fact checker?

So regardless of the source (and mainly because you won’t know), you’d want to review a post for falsehoods.

Ironically, I can imagine that the solution might use AI to do some of the natural language processing for the task, but I digress …

The unfortunate and likely complication here is that it will struggle with political bias, or anywhere ideology or dogma might be involved, e.g. in some areas of medicine, where there are no hard and fast facts to rely upon.

But I can clearly see that confirming an obvious fact could be straightforward, e.g. the birth date and location of a famous figure. That surely could be, and should be, automated?

Definitely an interesting area to watch, for sure!

1 Like

This many and varied replies to a feature request was not expected; usually after posting they just sit there and collect dust. The replies are appreciated.

Here is a walkthrough of two scenarios related to this that will hopefully give a better understanding of the feature request.

  1. User identifies hallucinations

A user uses an LLM, e.g. ChatGPT, for information about Gregorian chants. They paste the ChatGPT completion into a Discourse reply. For the parts of the reply/completion that contain hallucinations, the user selects the data, clicks an icon for hallucinations, and the metadata for the section (think HTML span or similar) is updated to show that the span contains a hallucination.

The spans could be as small as a single option on a command line. For this command line generated by ChatGPT

gregorio -i --gregfont="Caeciliae" myfile.gabc

it seems that the --gregfont option is a hallucination, so the section --gregfont="Caeciliae" should be marked as a hallucination.

If one were to inspect the HTML before and after annotating, something like this would be seen

Before

<pre>
   <code class="hljs language-bash">gregorio -i --gregfont=<span class="hljs-string">"Caeciliae"</span> myfile.gabc
   </code>
</pre>

After

<pre>
   <code class="hljs language-bash">gregorio -i <span class="hallucination">--gregfont=<span class="hljs-string">"Caeciliae"</span></span> myfile.gabc
   </code>
</pre>
  2. API consumes data with hallucinations

A user is searching for a command line to create Gregorian chant sheet music, and they adjust the query to exclude hallucinations. As the search engine generates results, it finds a hit for a page with the command

gregorio -i --gregfont="Caeciliae" myfile.gabc

The search engine then checks the command lines on the page and finds the specific one of note. It checks the command line for hallucinations, finds the span element marked as a hallucination, and therefore does not include that result in the search results.
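To make the consumer side concrete, here is a minimal TypeScript sketch of how an indexer or API client could skip the tagged spans. The class="hallucination" convention and the function name are assumptions taken from the scenario above, not an existing standard, and the sketch assumes a DOM environment (a browser, or a shim such as jsdom).

// Minimal sketch: strip spans tagged as hallucinations before indexing a post.
// The class="hallucination" convention is hypothetical, not an existing standard.
function stripHallucinations(postHtml: string): string {
  const doc = new DOMParser().parseFromString(postHtml, "text/html");

  // Remove every element marked as a hallucination, keeping the rest intact.
  doc.querySelectorAll("span.hallucination").forEach((el) => el.remove());

  // Return the cleaned text an indexer would actually store.
  return doc.body.textContent ?? "";
}

// Example with the command line from scenario 1:
const post = '<pre><code>gregorio -i <span class="hallucination">--gregfont="Caeciliae"</span> myfile.gabc</code></pre>';
console.log(stripHallucinations(post));
// Prints "gregorio -i  myfile.gabc": the hallucinated option is excluded.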


Obviously one could create a plugin for tools such as Chrome to add the needed spans, but there also needs to be a standard (think RFC) for the metadata to make it parseable for use with APIs.
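For the tagging side, a rough TypeScript sketch of what such a browser extension or composer button might do: wrap the user's current selection in a span carrying the marker. The hallucination class name is the same assumed convention as above.

// Sketch of the tagging side: wrap whatever the user has selected in a
// <span class="hallucination"> so the marker travels with the HTML.
// Range.surroundContents() throws if the selection only partially covers
// an element, so real code would need to handle that case.
function tagSelectionAsHallucination(): void {
  const selection = window.getSelection();
  if (!selection || selection.rangeCount === 0 || selection.isCollapsed) {
    return; // nothing selected, nothing to tag
  }
  const span = document.createElement("span");
  span.className = "hallucination";
  selection.getRangeAt(0).surroundContents(span);
}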

The scenarios above were tailored for web pages, but something similar should apply for LaTeX, etc.

While the scenarios above only used a scalar to identify a hallucination, the metadata could be more complex: think JSON or an algebraic data type.
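As a purely hypothetical illustration of that richer shape, the metadata could be modelled in TypeScript as a small algebraic data type (a discriminated union) and serialized as JSON into a data attribute; none of these field names come from an existing standard.

// Hypothetical richer metadata for a flagged span, modelled as a
// discriminated union. Field names are illustrative only.
type HallucinationTag =
  | { kind: "suspected"; flaggedBy: "author" | "reader"; note?: string }
  | { kind: "confirmed"; flaggedBy: "author" | "reader"; evidenceUrl: string }
  | { kind: "machine-generated-source"; model: string };

// Serialized as JSON, it could ride along in a data attribute, e.g.
// <span class="hallucination" data-hallucination='{"kind":"suspected","flaggedBy":"author"}'>…</span>
const example: HallucinationTag = { kind: "suspected", flaggedBy: "author" };
console.log(JSON.stringify(example));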



It feels to me that this is a request for a feature which will be useful in some, but not all, communities.

I personally would be happy to see machine-generated text always hidden inside a spoiler-like mechanism: I want to know that this is what it is, and I want to know it before I invest time in reading it.

It would also be an opportunity to put something in the local rules: failing to delineate your machine-generated text is a flaggable offence.

Putting machine-generated text inside a spoiler-like mechanism also allows the poster to add some commentary or description: they might be including the text because it is amusingly wrong, or educationally wrong, or because it is useful.

The issue of future language models being trained partly on the outputs of older language models… that’s probably a big problem, bigger than spoiler tags.

1 Like

That’s all well and good, but surely the first step is to identify it as machine produced? Given you don’t know some users from Adam …

That may be impossible and all you can do is somehow check for accuracy …

1 Like

My thinking is that it’s a moderation problem. Some text smells wrong, enquiries are made, and it’s flagged as “machine generated but not tagged as such”, or flagged as “probable spam”.

I think I gather that some people are trying to make services which can detect machine generated text, but I don’t know how effective they are, or how expensive they will be. In the world of education, it’s about cheating, and there are motivations for detecting it.

1 Like