Large posts in staff logs break Excel CSV format

Staff can export staff logs from /admin/logs/staff_action_logs to see the full history, as we only display a subset of logs within the admin UI. Certain logs can have commas included in the details, which break formatting of the CSV.

Can commas be escaped in this output @vinothkannans?

No, the commas are not breaking the CSV file. The maximum number of characters allowed in a single cell of an MS Excel sheet is 32,767. If a column exceeds that limit, Excel breaks the formatting of the CSV. In our case, the details column has more characters than that for some post_edit staff actions. If I open the same CSV file in Google Sheets, it renders all the columns correctly.

If needed, I’ll unconditionally limit the details column to ~30,000 characters in the export.
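
As a rough illustration of that cap, here is a minimal sketch. The `truncate_details` helper and both constants are hypothetical names made up for this example, not the actual exporter code; the only hard fact is Excel's 32,767-character cell limit, and the 30,000 cap simply leaves some headroom below it.

```ruby
# Hypothetical sketch, not the actual Discourse exporter code.
EXCEL_CELL_LIMIT = 32_767 # Excel's per-cell character limit
DETAILS_LIMIT = 30_000    # proposed cap, leaving headroom below the limit

# Returns the details text unchanged when it fits, otherwise cuts it
# so the result (including the "..." marker) is exactly `limit` chars.
def truncate_details(details, limit = DETAILS_LIMIT)
  text = details.to_s
  return text if text.length <= limit

  text[0, limit - 3] + "..."
end

# True when a cell would overflow a single Excel cell.
def exceeds_excel_limit?(cell)
  cell.to_s.length > EXCEL_CELL_LIMIT
end
```

Applying `truncate_details` to every details value before writing the row would keep each cell under the Excel limit, at the cost of losing the tail of very long post_edit entries.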

30k chars still seems like a lot. Do we need that many characters in a cell? If so, why?

It’s due to giant posts. For example, many of our internal runbooks tend to be quite long. Likely only the post_edit staff action as Vinoth noted would have this many characters.

Also, we are storing both the old and new post raws for post_edit staff actions in https://github.com/discourse/discourse/blob/4383afb769d97bc9724d5448c0583b14c39782d0/app/services/staff_action_logger.rb#L133.

We should aggressively truncate this. I don’t see the point of creating such bulk in the database over something so trivial.

Okay, I’ll try to improve this by storing only the edit differences in the raw text.

I would like to close this one off. Let’s do the simple thing here, restrict cells to 30k, then close this. The delta stuff is a very very complex change.

Re-opening this. We need to get some sort of fix in place; it’s near impossible to review exported logs with giant posts breaking the formatting.

Perhaps the simplest fix here is:

> '"' + 'post "body" '.gsub('"', '""') + '"'
=> "\"post \"\"body\"\" \""

I think this is allowed in CSV and should work.
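
To spell the idea out a bit more, here is a minimal sketch of quoting every field this way. The `csv_escape` and `csv_row` helpers are hypothetical names for illustration, not what the exporter currently does. Doubling embedded quotes inside a quoted field is the escaping convention described in RFC 4180, and a quoted field may also contain commas and line breaks.

```ruby
# Hypothetical helpers for illustration; not the exporter's actual code.

# Quote one field: wrap it in double quotes and double any embedded quotes.
def csv_escape(field)
  '"' + field.to_s.gsub('"', '""') + '"'
end

# Escape every field and join them into a single CSV record.
def csv_row(fields)
  fields.map { |f| csv_escape(f) }.join(",")
end

# A details value with quotes, a comma, and a newline stays one field.
puts csv_row(["post_edit", "post \"body\", with a comma\nand a newline"])
```

Note that quoting only addresses commas, quotes, and newlines in the details. It would not by itself help with cells longer than Excel's 32,767-character limit mentioned earlier in the thread.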