Summarising topics with an LLM (GPT, BERT, ...)?

Dear discourse team,

Discourse has the ability to summarise long threads. My understanding is that it uses engagement metrics like reactions or number of replies to identify which posts should be kept. In my community (internal communication within a company), this almost never triggers despite threads that are 20 pages long (reply count is not the best trigger for us).

A more powerful approach would be to use an LLM like GPT-3 or BERT to summarise the whole thread on demand (rather than just filtering the posts). Summarisation is a use case where LLMs shine (pretty much the poster child), and I believe this would be far more effective than the current approach, which is based on weak signals (in some communities, people may not use reactions or reply to specific posts).

On the other hand, running GPT to summarise text can be expensive, at up to $0.02 per 1k tokens. This feature would probably make more sense in some private communities than in high-traffic public ones.
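As a rough back-of-the-envelope check (the $0.02/1k-token rate is the one quoted above; the thread size is an assumption for illustration), the per-thread cost can be estimated like this:

```python
# Rough cost estimate for summarising a thread with a paid LLM API.
RATE_PER_1K_TOKENS = 0.02  # USD, the rate mentioned above

def summarisation_cost(total_tokens: int, rate_per_1k: float = RATE_PER_1K_TOKENS) -> float:
    """Return the estimated cost in USD for processing `total_tokens`."""
    return total_tokens / 1000 * rate_per_1k

# A 20-page thread might be on the order of 40,000 tokens of input:
print(round(summarisation_cost(40_000), 2))  # → 0.8
```

So even a very long thread would cost well under a dollar per summary, which matters more at public-forum scale than in a small private community.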

I imagine the feature to work like this:

  • When reading a long thread ("long" defined by a setting) that has never been summarised, offer to summarise it on demand.
  • The summary would be inserted into the thread (e.g. "everything before this point is summarised").
  • It is linked from the top of the thread (e.g. "jump to the summary").
  • Ideally, there is some :+1: :-1: feedback mechanism.
  • After N more replies, offer to publish a new summary and update the link at the top.
  • The plugin would be configured with an OpenAI key (assuming this is the LLM provider used). Eventually, if it proves successful, you would want to offer access to different models.

At least, that’s the theory :slight_smile:. In practice, there are unknowns about how good it would be or how to configure the degree of summarisation. My instinct is that this is extremely high ROI for some communities. The number-one problem I hear about is that people hate catching up on long threads; this would be a massive help.



Our AI team is certainly exploring this space. Summarization is one of the more interesting possibilities.

We will reply here once we have something to show.



import openai
import requests
from bs4 import BeautifulSoup

openai.api_key = "KEY"

# Discourse topic JSON URL goes here
url = ""

response = requests.get(url)
data = response.json()

messages = []
messages.append("Title: " + data["title"])
for post in data["post_stream"]["posts"]:
    # Strip the HTML markup from the "cooked" post body
    soup = BeautifulSoup(post["cooked"], "html.parser")
    messages.append("Post #" + str(post["post_number"]) + " by " + post["username"] + ":\n" + soup.get_text())

text_blob = "\n".join(messages)


max_chunk_len = 4000  # Maximum length of each chunk (in characters, not tokens)

chunks = [text_blob[i:i + max_chunk_len] for i in range(0, len(text_blob), max_chunk_len)]

summaries = []
for chunk in chunks:
    print("processing chunk")
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt="prompt:\n" + chunk + "\nsummary:",
        max_tokens=256,
    )
    summary = response.choices[0].text.strip()
    summaries.append(summary)

final_summary = "\n".join(summaries)

Discourse is exploring the possibility of using an LLM such as GPT-3 or BERT to summarize long threads on demand. This feature would be configured with an openai key and could be especially useful for private communities. The summarization would be inserted in the thread, with a link at the top and a feedback mechanism. The AI team is currently exploring this possibility and will update when they have something to show.

It’s interesting, but reasonably uneven.


So I couldn’t help myself from having a go today:

It’s very experimental and there are very few safeguards for when things go wrong at the moment, so it’s definitely not for professional use(!!)

Note that due to token limits it can currently only take the first ~40 posts into account (based on one of my forums’ activity).

It has a downvoting system. UI preview:

(btw, this example is based on the rather random data in the development fixtures, it performs much better on real data).

Btw, I “spent” $0.33 on calls building this so far :smiley:

($18 is a free allocation)


@Julienlavigne I’ve made some significant improvements to this in terms of stability, flexibility and administrative feedback.

Errors returned by the model are now logged, for example, so you can now easily tell if you’ve breached model limits.

If you (or anyone) wants to give it a try, I might be able to provide a little casual support for a while and may be able to incorporate feedback where reasonable.

The full set of settings is now:


@merefield that sounds awesome!

Sadly, I’m using the business version of Discourse and can’t deploy custom plugins, so I can’t test this easily. I may try to run Discourse locally with a backup and play with it, though. I’ll keep you posted.