Engineering a persona to lean on chat history

Quick question: what would be a “better” structure for our use case?

We have a bunch of exported chat history logs from Slack channels that contain a lot of know-how, reported problems and solutions, etc. Obviously those chats also contain a lot of useless “fluff”, so it would be uneconomical to just dump them into topics/posts and have the AI bot use that.

We have about 10 files, each approx. 1-2 MB in size. In terms of AI persona usage, we will have only about 30 people doing about 10 chats per day (token volume is hard to estimate here).

At this point I am wondering what a somewhat reasonable 80/20 approach would be to make use of those chat logs while keeping things economical. It comes down to three options:

  1. Copy and paste the logs into Discourse topics/posts: quick ’n’ dirty, no custom development required, but it could end up costing a lot in API fees
  2. Pre-process the chat logs into a proper format or structure and upload them to the persona
  3. Some form of hybrid: with each AI bot request, evaluate and save the output as a txt file and then upload it to the persona

Which option do you guys recommend? Or maybe something completely different?

I would recommend the following approach:

  1. Process the 10 files using a “Creative” persona backed by a large-context, large-output LLM like Sonnet 4 with thinking. The goal of this pass is to tidy up the information and prepare it for RAG (a rough sketch follows this list).
  2. Then, using our built-in upload, upload the 10 processed files to a persona so RAG can search through the content.
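
Roughly, step 1 could look like the sketch below, assuming the Anthropic Python SDK; the model id, prompt wording, slice size, and paths are placeholders to adapt, not a prescribed recipe:

```python
# Sketch: tidy raw Slack exports into RAG-friendly text with an LLM.
# Assumes the official `anthropic` Python SDK; model id, prompt and
# directory names are illustrative placeholders.
from pathlib import Path

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TIDY_PROMPT = (
    "You will receive part of a raw Slack chat export. Drop greetings, "
    "emoji noise and off-topic chatter. Keep every problem, its solution "
    "and any how-to knowledge as self-contained entries with short "
    "descriptive headings."
)

# A 1-2 MB export will not fit in one context window, so feed it in
# slices; ~300k characters is a rough, conservative per-call budget.
SLICE = 300_000

for log_file in sorted(Path("slack_exports").glob("*.txt")):
    raw = log_file.read_text(encoding="utf-8")
    tidied = []
    for start in range(0, len(raw), SLICE):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",  # assumption: check your available model ids
            max_tokens=32_000,
            system=TIDY_PROMPT,
            messages=[{"role": "user", "content": raw[start : start + SLICE]}],
        )
        tidied.append(response.content[0].text)
    out = Path("tidied") / log_file.name
    out.parent.mkdir(exist_ok=True)
    out.write_text("\n\n".join(tidied), encoding="utf-8")
```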

Given there is a ton of data here, I recommend against stuffing it all into a system prompt. As a guideline, a system prompt should not be super long; it becomes costly. 10k tokens is workable; 100k tokens is not workable with current frontier LLMs. Every interaction would cost you too much, and the LLM would get extra confused.
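
To get a rough sense of where a prompt lands on that scale, a quick count with something like tiktoken is enough. Its tokenizers are OpenAI’s, so the counts only approximate other vendors’ models, but that is plenty to tell 10k from 100k:

```python
# Sketch: rough token count for a candidate system prompt.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # approximation for non-OpenAI models

prompt = open("candidate_system_prompt.txt", encoding="utf-8").read()
tokens = len(enc.encode(prompt))
print(f"{tokens} tokens")

if tokens > 10_000:
    print("warning: past the ~10k-token comfort zone for a system prompt")
```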

Let us know how you go!

Please share the prompts you used to shrink / tidy the chat logs.

Thx, that helps!

Just to clarify, are all the uploaded files injected into the system prompt? Or are they processed through the configured ai_embeddings_model first and then injected?

I am a bit confused about your recommended 10k token limit, especially given the part below:

Files uploaded to a Discourse AI persona are only limited by your upload size; they can be huge. They are processed via embeddings, and we inject chunks into the prompt per configuration.
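
Conceptually, the retrieval side works like the toy sketch below. The `embed()` here is a crude stand-in for whatever ai_embeddings_model you have configured, and the real pipeline differs in the details:

```python
# Toy sketch of persona-upload retrieval: embed the chunks once, embed
# the query, inject the top-scoring chunks into the prompt.
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: bag-of-words hashing so the sketch runs end to end.
    # A real embedding model produces semantic vectors instead.
    vecs = np.zeros((len(texts), 512))
    for i, text in enumerate(texts):
        for word in text.lower().split():
            vecs[i, hash(word) % 512] += 1.0
    return vecs

def top_chunks(query: str, chunks: list[str], k: int = 5) -> list[str]:
    chunk_vecs = embed(chunks)      # shape (n_chunks, dim)
    query_vec = embed([query])[0]   # shape (dim,)
    # cosine similarity between the query and every chunk
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    best = np.argsort(sims)[::-1][:k]
    return [chunks[i] for i in best]

# the k best chunks are what gets injected alongside the system prompt
```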

What I was talking about was trying to coerce all the information into a single system prompt here:

that is limited…

Ah, that clears it up, thanks!

So basically, my next steps should be to run a few tests with different embedding models and see what token volume I end up injecting into the system prompt, right?

And when creating those txt files with chunks, I should make sure that they stay within a reasonable 10k-ish limit?

The embedding model controls quality, not quantity

You can roll up all your data into a single file; we will chunk it in the background, retrieve the most relevant chunks, and add them to your prompt.

Experimenting here would be about improving results: some cleanups may work better than others, and some embedding models will be smarter at finding the most relevant pieces.
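
If you want to compare cleanups or embedding models side by side, a tiny eval harness goes a long way. A sketch assuming the sentence-transformers library; the model names, file path, and test questions are placeholders you would replace with real examples from your logs:

```python
# Sketch: measure how often each embedding model retrieves the right
# chunk for hand-written test questions (recall@1).
from sentence_transformers import SentenceTransformer, util

# (question, substring the correct chunk must contain) - placeholders
TESTS = [
    ("How do we rotate the staging TLS cert?", "certbot"),
    ("Why does the nightly import job time out?", "batch size"),
]

def recall_at_1(model_name: str, chunks: list[str]) -> float:
    model = SentenceTransformer(model_name)
    chunk_vecs = model.encode(chunks, convert_to_tensor=True)
    hits = 0
    for question, marker in TESTS:
        query_vec = model.encode(question, convert_to_tensor=True)
        best = util.cos_sim(query_vec, chunk_vecs).argmax().item()
        hits += marker in chunks[best]
    return hits / len(TESTS)

chunks = open("tidied/all.txt", encoding="utf-8").read().split("\n\n")
for name in ["all-MiniLM-L6-v2", "BAAI/bge-small-en-v1.5"]:
    print(name, recall_at_1(name, chunks))
```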

Thx sam, I really appreciate it :heart:

If you have any further helpful resources, feel free to share them here. Once I make progress, I will try and publish my experience here on meta. :slight_smile:
