Giant posts break Excel CSV formatting in staff action logs

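Admins can export the staff action logs from /admin/logs/staff_action_logs to review the full history, since the admin UI shows only part of the logs. The details field of some log entries can contain commas, which breaks the CSV format.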

8 Likes

Can commas be escaped in this output @vinothkannans?

4 Likes

No, the commas are not breaking the CSV file. The maximum number of characters allowed in a single cell of an MS Excel sheet is 32,767. If a column exceeds that limit, Excel breaks the CSV layout. In our case, the details column exceeds it for some post_edit staff actions. If I open the same CSV file in Google Sheets, all the columns render correctly.

If needed, I'll unconditionally limit the details column to ~30,000 characters in the export.
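To make the proposed cap concrete, here is a minimal Ruby sketch of truncating a details value before it is written to the export. The helper name `truncate_details`, the constant names, and the truncation marker are assumptions for illustration; this is not the code Discourse uses.

```ruby
# Excel's per-cell limit, as quoted in the post above.
EXCEL_CELL_LIMIT = 32_767

# Cap proposed in the thread (~30,000), which leaves headroom below the limit.
# Hypothetical constant name, chosen for this sketch.
DETAILS_EXPORT_LIMIT = 30_000

# Hypothetical helper: returns the text unchanged when it fits, otherwise
# cuts it so that the result, including the marker, is exactly `limit`
# characters long.
def truncate_details(details, limit: DETAILS_EXPORT_LIMIT, marker: "... [truncated]")
  text = details.to_s
  return text if text.length <= limit

  text[0, limit - marker.length] + marker
end
```

Appending a marker is a design choice so that a reviewer can tell a truncated cell from a short one; whether that is wanted in the real export is an open question.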

8 Likes

30k chars still seems like a lot. Do we need that many characters in a cell? If so, why?

It’s due to giant posts. For example, many of our internal runbooks tend to be quite long. Likely only the post_edit staff action as Vinoth noted would have this many characters.

Also, we are storing both the old and new post raws for post_edit staff actions in discourse/app/services/staff_action_logger.rb at 4383afb769d97bc9724d5448c0583b14c39782d0 · discourse/discourse · GitHub.

We should aggressively truncate this. I don't see the point of creating such bulk in the database over something so trivial.

5 Likes

Okay, I’ll try to improve this by storing only the edit differences in the raw text.
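As a rough illustration of what "storing only the differences" could look like, the sketch below trims the common prefix and suffix of the old and new raw text and keeps only the changed middle. This is a deliberately naive stand-in, not a real diff algorithm and not what Discourse implements; the method name `changed_span` and the returned hash keys are hypothetical.

```ruby
# Hypothetical helper: returns the part of an edit that actually changed,
# found by skipping the characters the two versions share at the start
# and at the end.
def changed_span(old_raw, new_raw)
  max = [old_raw.length, new_raw.length].min

  # Length of the shared prefix.
  prefix = 0
  prefix += 1 while prefix < max && old_raw[prefix] == new_raw[prefix]

  # Length of the shared suffix, never overlapping the prefix.
  suffix = 0
  suffix += 1 while suffix < max - prefix &&
                    old_raw[-1 - suffix] == new_raw[-1 - suffix]

  {
    offset: prefix,
    removed: old_raw[prefix...(old_raw.length - suffix)],
    added: new_raw[prefix...(new_raw.length - suffix)]
  }
end
```

For a single small edit inside a long runbook, this stores a few characters instead of two full copies. Edits scattered across a post would still produce a large span, which is one reason a proper delta is more involved.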

4 Likes

I'd like to get this one closed. Let's do the simple thing first: limit cells to 30k and close it. The delta-related change is very, very complex.

4 Likes

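Reopening this. We need to implement some kind of fix, because giant posts break the formatting and make reviewing the exported logs nearly impossible.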

1 Like

Perhaps the simplest fix is:

> '"' +  'post "body" ' .gsub('"', '""') + '"'
=> "\"post \"\"body\"\" \""

I think this is allowed in CSV and should work.
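The quoting idea above can be checked against Ruby's standard `csv` library. The sketch below assumes a hypothetical `quote_field` helper that applies the same `gsub` and compares the result with `CSV.generate_line(..., force_quotes: true)`; the sample row is invented for illustration.

```ruby
require "csv"

# Hypothetical helper applying the quoting shown above: wrap the value in
# double quotes and double any quote characters inside it.
def quote_field(value)
  '"' + value.to_s.gsub('"', '""') + '"'
end

# Invented sample row with a quote, a comma, and a newline inside fields.
row = ["post_edit", 'post "body", with a comma', "line one\nline two"]

# Build the line by hand, then let the csv library do the same job.
manual_line = row.map { |value| quote_field(value) }.join(",") + "\n"
library_line = CSV.generate_line(row, force_quotes: true)

# Parsing the hand-built line should give back the original fields.
parsed = CSV.parse_line(manual_line)
```

Quoting this way keeps commas, quotes, and newlines inside a single field. It does not shorten the field, so on its own it would not address the 32,767-character cell limit mentioned earlier in the thread.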