Large posts in staff logs break Excel CSV format

Staff can export staff logs from /admin/logs/staff_action_logs to see the full history, as we only display a subset of logs within the admin UI. Certain logs can have commas included in the details, which break formatting of the CSV.

Can commas be escaped in this output @vinothkannans?

No, the commas are not breaking the CSV file. The maximum number of characters allowed in a single cell of an MS Excel sheet is 32,767. If a column exceeds that limit, the formatting breaks in Excel. In our case, the details column has more characters than that for some post_edit staff actions. If I open the same CSV file in Google Sheets, it renders all the columns correctly.

If needed, I’ll unconditionally limit the details column to ~30,000 characters in the export.
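
A minimal sketch of that cap, for illustration only. The helper name `truncate_details` and the `DETAILS_LIMIT` constant are hypothetical and not taken from the Discourse codebase; the only figures from this thread are Excel's 32,767-character cell limit and the proposed ~30,000 cap.

```ruby
# Hypothetical constant: the ~30,000 cap proposed above, chosen to stay
# under Excel's 32,767-character per-cell limit.
DETAILS_LIMIT = 30_000

# Hypothetical helper: returns the details text cut to at most `limit`
# characters. nil is treated as an empty string.
def truncate_details(details, limit = DETAILS_LIMIT)
  text = details.to_s
  return text if text.length <= limit

  text[0, limit]
end

puts truncate_details("a" * 40_000).length
puts truncate_details("short entry")
```

Applying the cap only at export time would leave the stored log untouched, so it addresses the Excel rendering problem but not the database size concern.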

30k chars still seems like a lot. Do we need that many characters in a cell? If so, why?

It’s due to giant posts. For example, many of our internal runbooks tend to be quite long. Likely only the post_edit staff action, as Vinoth noted, would have this many characters.

Also, we are storing both the old and new post raws for post_edit staff actions in https://github.com/discourse/discourse/blob/4383afb769d97bc9724d5448c0583b14c39782d0/app/services/staff_action_logger.rb#L133.

We should aggressively truncate this. I don’t see the point of creating such bulk in the database over something so trivial.

Okay, I’ll try to improve this by storing only the edit differences in the raw text.

I would like to close this one off. Let’s do the simple thing here, restrict cells to 30k, then close this. The delta stuff is a very very complex change.

Re-opening this. We need to get some sort of fix in place; it’s nearly impossible to review exported logs with giant posts breaking the formatting.

Perhaps the simplest fix here is:

> '"' + 'post "body" '.gsub('"', '""') + '"'
=> "\"post \"\"body\"\" \""

I think this is allowed in CSV and should work.
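
For comparison, Ruby's standard `csv` library applies the same quote-doubling rule on its own. The sketch below assumes nothing about how the export is actually written: `staff_log_csv_line` is a hypothetical helper and the row values are made up. Note that this covers quoting only, not Excel's per-cell character limit discussed earlier.

```ruby
require "csv"

# Hypothetical helper: builds one CSV line from an array of field values.
# CSV.generate_line wraps fields containing commas or quotes in double
# quotes and doubles any embedded double quotes.
def staff_log_csv_line(fields)
  CSV.generate_line(fields)
end

# Made-up row for illustration.
row = ["post_edit", 'post "body", with a comma']
line = staff_log_csv_line(row)

puts line
# Parsing the line back should give the original fields.
puts CSV.parse_line(line) == row
```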