HTML comments will also be included in the summary.
Since users viewing the summary will likely not be able to see the comments, I think it would be best to exclude the comments from the summary.
Excuse me. I noticed the comment because the modelβs performance is good.
The post includes an example HTML comment (<!-- Discourse is great! > ) to illustrate the issue. γ―γγ
Marking as #wontfix, as Iβm not convinced this is worth the tradeoff of extra preprocessing contents.
If it bothers you in your instance, you can change the summary agent propent to instruct it to ignore those.
Isnβt this behaviour introducing a hidden channel whereby a malicious post might be able to influence the summary without any visible sign that it is doing so?
That is why LLMs break prompts into system and user, so there is differentiation between safe and unsafe entries.
But yes, that is a possibility, specially among smaller and older models.
But as I understand it none of this is mechanism, itβs (at best) influence. Hence the unceasing jailbreaks. So, it does matter what text is shown to the LLM, if you care about what comes out.