I’m certainly glad I tested this. From how I understood the documentation newlines needed to be double escaped. Not so. They needed to be only single escapes else they were written as literal text. I am fairly certain there are better ways to know the number of affected rows, but I do not know how to do them. Anyway, it looks good to me. a couple before and afters:
To recap caveats for those that might be interested in this query
- It does not match against the lack of newlines around improperly formatted bbcode tags in raw post content
- It relies on Discourse displaying the bbcode tags in the cooked content to identify problem posts
- it does not take into account any posts where displaying of bbcode tags in the cooked content is desired (eg. in code fences)
- it matches per post, not per bbcode tag. Therefore if a post contains both improper and proper syntax bbcode tags, all will be surrounded by newlines
- it does not create edit revision histories
- posts require a rebake for the changes to be effected in cooked
- it can not be stressed strongly enough, Save a backup before running this query
- the
RETURNING raw
line will show the number of affected rows after outputting all of the raw content. Though acceptable with my small sample set, this may be undesireable for larger UPDATEs
WITH posts_with_bbcode_quotes AS (SELECT posts.id AS targeted_posts
FROM posts
WHERE posts.cooked ~* '\[quote[^]]*]'
ORDER BY posts.id)
UPDATE posts
SET raw = regexp_replace(posts.raw, '\[(/)?quote([^]]*)]', E'\n[\\1quote\\2]\n', 'g')
FROM posts_with_bbcode_quotes
WHERE posts.id IN (posts_with_bbcode_quotes.targeted_posts)
RETURNING raw;