Engineering a persona to lean on chat history

jrgong · July 27, 2025, 4:29pm

Quick question: which structure is “better” for our use case?

We have a collection of exported chat history logs from Slack channels that contain a lot of knowledge, problems and solutions that were mentioned, etc. Obviously these chats also contain a lot of useless “noise”, so it would be uneconomical to simply dump them into topics/posts and have the AI bot use them.

We have about 10 files, each about 1-2MB in size. In terms of AI persona usage, there will only be about 30 people doing about 10 chats a day (hard to estimate the token volume here).

At this point, I'm wondering what a reasonable 80/20 approach would be for using the chat logs while keeping costs somewhat under control. It came down to 2 options:

Copy and paste the logs into Discourse topics/posts: quick and dirty, requires no custom development, may end up with high API costs.
Do some preprocessing of the chat logs, put them into a suitable format or structure, and upload them to the persona.
Or maybe some kind of hybrid: with every request to the AI bot, evaluate the output, save it as a txt file, and then upload it to the persona.

Which option do you recommend? Or maybe something completely different?
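As an illustration of the preprocessing option, here is a minimal sketch of dropping obvious noise before anything is uploaded. It assumes the logs are Slack-style JSON arrays of message objects with `user`, `text`, and an optional `subtype` field; the export format is not stated in this thread, and the names `NOISE_SUBTYPES`, `MIN_WORDS`, `keep_message`, and `clean_export` are hypothetical.

```python
import json

# Assumption: system events are marked with a "subtype" such as these.
NOISE_SUBTYPES = {"channel_join", "channel_leave", "bot_message"}
# Hypothetical threshold for dropping "ok" / "thanks" style replies.
MIN_WORDS = 4


def keep_message(msg: dict) -> bool:
    """Return True when a message looks substantive enough to keep."""
    if msg.get("subtype") in NOISE_SUBTYPES:
        return False
    text = (msg.get("text") or "").strip()
    return len(text.split()) >= MIN_WORDS


def clean_export(raw_json: str) -> str:
    """Turn one exported JSON array of messages into plain 'user: text' lines."""
    messages = json.loads(raw_json)
    kept = [m for m in messages if keep_message(m)]
    return "\n".join(f'{m.get("user", "unknown")}: {m["text"].strip()}' for m in kept)


sample = json.dumps([
    {"type": "message", "subtype": "channel_join", "user": "U1",
     "text": "<@U1> has joined the channel"},
    {"type": "message", "user": "U2", "text": "thanks!"},
    {"type": "message", "user": "U3",
     "text": "Restarting the sidekiq container fixed the stuck uploads"},
])
print(clean_export(sample))
# U3: Restarting the sidekiq container fixed the stuck uploads
```

A word-count cutoff is crude and will discard some short but useful answers, so the threshold is something to tune against your own logs.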

sam · July 29, 2025, 5:36am

I would recommend the following approach:

Process the 10 files using a “creative” persona with a large LLM that has a large context / large output, such as Sonnet 4 thinking. The goal of this processing would be to “tidy up” the information and prepare it for RAG.
Then, using our built-in upload, upload the 10 processed files to the persona, so RAG can search the content.
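A 1-2MB file may not fit into a single LLM request, so the tidy-up pass will probably have to run in batches. The sketch below groups consecutive log lines into batches under a character budget. The `batch_lines` helper and the default `max_chars` value are assumptions for illustration, not part of Discourse AI.

```python
def batch_lines(lines: list[str], max_chars: int = 200_000) -> list[list[str]]:
    """Group consecutive lines into batches whose total size stays within max_chars.

    A single line longer than max_chars still gets a batch of its own.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    size = 0
    for line in lines:
        cost = len(line) + 1  # +1 accounts for the newline
        if current and size + cost > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(line)
        size += cost
    if current:
        batches.append(current)
    return batches


print(batch_lines(["aaaa", "bbbb", "cc"], max_chars=10))
# [['aaaa', 'bbbb'], ['cc']]
```

Keeping lines in their original order means a problem and its solution are more likely to land in the same batch, which should help the LLM summarise them together.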

Given that there is a ton of data here, I recommend avoiding cramming it into the system prompt. As a rule of thumb, a system prompt should not be too long, because it gets expensive. 10k tokens works; 100k tokens does not work with current frontier LLMs. Every interaction would cost you too much and the LLMs would get extra confused.
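To put rough numbers on this, here is a back-of-the-envelope sketch using the figures from the thread (10 files of 1-2MB, 30 people, 10 chats a day). The 4-characters-per-token ratio is a common heuristic for English text and is an assumption here, as is counting the system prompt only once per chat.

```python
CHARS_PER_TOKEN = 4  # rough heuristic, not an exact tokenizer


def estimate_tokens(size_bytes: int) -> int:
    """Approximate token count for a text file of the given size."""
    return size_bytes // CHARS_PER_TOKEN


def daily_prompt_tokens(system_prompt_tokens: int, users: int = 30,
                        chats_per_user: int = 10) -> int:
    """Input tokens spent per day on the system prompt alone (one send per chat)."""
    return system_prompt_tokens * users * chats_per_user


print(estimate_tokens(10 * 1_000_000), estimate_tokens(10 * 2_000_000))
# 2500000 5000000
print(daily_prompt_tokens(10_000))   # 3000000
print(daily_prompt_tokens(100_000))  # 30000000
```

Under these assumptions the raw logs come to millions of tokens, far beyond even a 100k-token system prompt, and multi-turn chats would resend the system prompt more than once.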

jrgong · July 29, 2025, 7:35am

Thx, that helps!

Just to clarify, are all the uploaded files injected into the system prompt? Or are they processed through the configured ai_embeddings_model first and then injected?

I am a bit confused about your recommended 10k token limit, especially with the part below:

sam · July 29, 2025, 8:04am

The files in Discourse AI Persona, upload support are limited only by your upload size, so they can be huge. They are processed via embedding, and we inject chunks into the prompt per configuration.

What I was talking about was trying to coerce all the information into a single system prompt here:

that is limited…

jrgong · July 29, 2025, 8:08am

Ah, that clears it up, thanks!

So basically, my next steps should be running some tests with different embedding models and seeing how many tokens I end up injecting into the system prompt, right?

sam · July 29, 2025, 9:22am

The embedding model controls quality, not quantity

You can roll up all your data into a single file; we will chunk it in the background, retrieve the most relevant chunks, and add them to your prompt.

Experimenting here would be about improving results: some cleanups may work better than others, and some embedding models will be smarter at finding more relevant pieces.
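To make the chunk-and-retrieve idea concrete, here is a toy sketch. It is not how Discourse AI implements it: `embed` is a bag-of-words stand-in for a real embedding model, and `chunk_text`, `cosine`, and `top_chunks` are hypothetical helpers. It only shows why the number of injected chunks is fixed by `k` while the ranking quality depends on the embedding.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


docs = [
    "Uploads were stuck until we restarted the sidekiq container.",
    "The login page broke after the theme update; reverting the theme fixed it.",
    "Lunch is at noon on Fridays.",
]
print(top_chunks("why are uploads stuck", docs, k=1))
# ['Uploads were stuck until we restarted the sidekiq container.']
```

Swapping `embed` for a better model changes which chunks rank highest, not how many are added to the prompt.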

jrgong · July 29, 2025, 10:08am

Thx sam, I really appreciate it

If you have any further helpful resources, feel free to share them here. Once I make progress, I will try and publish my experience here on meta.

jrgong · August 11, 2025, 8:50am

@sam how do you suggest adding version or model numbers to metadata separators?

Your original example:

[[metadata about cats]]
a long story about cats
[[metadata about dogs]]
a long story about dogs

Now if we want to enrich them with version numbers or specific model numbers, do I just use the same format or structure that humans would use when typing them?

E.g.:

[[metadata about cats v1.0]]
a long story about cats
[[metadata about dogs]]
a long story about dogs
[[metadata about cats xxl v2.1]]
a long story about cats
[[metadata about dogs v 1.1beta]]
a long story about dogs

Also, when version numbers are missing in metadata (see metadata about dogs), would that chunk be used in a response for all dog-related requests, no matter which “dog version”?

sam · August 11, 2025, 10:59pm

[quote=“jrgong, post:8, topic:376342”]do I just use the same format or structure that humans would use when typing them?
[/quote]

Yes, that's the right way to go!
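Before uploading, it can help to check that each `[[...]]` separator and its version label end up attached to the text you expect. The sketch below splits a file on separator lines in the format from the example earlier in the thread. The `split_sections` helper is hypothetical and does not reflect how Discourse AI parses metadata internally.

```python
import re

# A separator is a line consisting only of [[...]].
SEPARATOR = re.compile(r"^\[\[(.+?)\]\]\s*$")


def split_sections(text: str) -> list[tuple[str, str]]:
    """Return (metadata, body) pairs, one per [[...]] separator line.

    Text that appears before the first separator is ignored.
    """
    sections: list[tuple[str, str]] = []
    metadata = None
    body: list[str] = []
    for line in text.splitlines():
        match = SEPARATOR.match(line)
        if match:
            if metadata is not None:
                sections.append((metadata, "\n".join(body).strip()))
            metadata, body = match.group(1).strip(), []
        elif metadata is not None:
            body.append(line)
    if metadata is not None:
        sections.append((metadata, "\n".join(body).strip()))
    return sections


sample = """[[metadata about cats v1.0]]
a long story about cats
[[metadata about dogs]]
a long story about dogs
[[metadata about cats xxl v2.1]]
a long story about cats
"""
for meta, text in split_sections(sample):
    print(meta, "->", text)
# metadata about cats v1.0 -> a long story about cats
# metadata about dogs -> a long story about dogs
# metadata about cats xxl v2.1 -> a long story about cats
```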
