Replace backslash escape only in old posts after import

I’ve imported a forum where we made heavy use of this combination of characters
\%
which is now reduced to the percent sign due to markdown (?) processing
% (I’ve also used the backslash here, but omitted the code ticks)

Is there a way to

  1. exclude the \% combination from being processed in a special way, or
  2. replace \% with \\%, but only in the imported posts?

Thanks for any pointers.

There are ways to globally replace a text string in all posts and then recook all the posts.

I’m aware of rake posts:remap["find","replace"].

In this specific case I guess it would be

rake posts:remap["\\%","\\\\%"]

However, when looking at the code, it seems like regexp is also supported.

So something like

rake posts:remap["(?<=[^\\])\\%","\\\\%","regex"]

should replace all occurrences of \% that are not preceded by another \. But it seems this regexp is then passed along to the database layer, which doesn’t support lookbehinds.
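
For what it's worth, the lookbehind itself behaves as intended in plain Ruby, outside the rake task. Below is a minimal sketch with made-up sample strings. The `(?<!\\)` variant is an alternative that is not used in this thread; it is shown only because the positive lookbehind needs a character in front of `\%` and so cannot match one at the very start of a post.

```ruby
# Positive lookbehind, as in the post: requires a non-backslash character before "\%".
POSITIVE = /(?<=[^\\])\\%/
# Negative lookbehind (alternative, assumption): only requires that no backslash precedes "\%".
NEGATIVE = /(?<!\\)\\%/

# Made-up samples, written in single quotes so the backslashes stay visible.
samples = [
  'about 50\% off',     # one backslash before %: should be rewritten
  'about 50\\\\% off',  # already two backslashes before %: should be left alone
  '\% at the start'     # one backslash before %, at position 0
]

positive_hits = samples.map { |s| s.match?(POSITIVE) }
negative_hits = samples.map { |s| s.match?(NEGATIVE) }

puts positive_hits.inspect
puts negative_hits.inspect
```

The two patterns only disagree on the last sample, so which one fits depends on whether imported posts can begin with `\%`.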

It’s tricky. Thanks for any further ideas.

To fix only the posts from before the forum went live, you could do it from the Rails console: search for the posts by date and loop over them with each.

Can you give an example?

I’ve posted a few examples, but I can’t find them from my phone. I think I posted code for deactivating all users that could serve as an example.
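
A minimal sketch of the "select by date, then loop" idea, runnable outside Rails. `FakePost`, the sample posts, and the `GO_LIVE` date are all hypothetical stand-ins. In the Rails console the selection would presumably go through ActiveRecord instead, as noted in the comment.

```ruby
# Hypothetical stand-in for the Post model so the sketch runs without Rails.
FakePost = Struct.new(:id, :created_at, :raw)

# Hypothetical go-live date; substitute the real cutoff.
GO_LIVE = Time.utc(2020, 1, 1)

posts = [
  FakePost.new(1, Time.utc(2019, 6, 1), 'imported post with 5\% in it'),
  FakePost.new(2, Time.utc(2020, 3, 1), 'newer post with 5\% in it')
]

# In the Rails console the equivalent selection would look roughly like:
#   Post.where("created_at < ?", GO_LIVE).find_each { |post| ... }
imported = posts.select { |post| post.created_at < GO_LIVE }
imported_ids = imported.map(&:id)

imported.each { |post| puts "would fix post #{post.id}" }
```

Printing the matching ids first, before changing anything, is a cheap way to check that the cutoff selects only the imported posts.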

This is powerful stuff :nerd_face: Seems like I’m slowly learning Ruby here.

I’ve thought of something like

imported_posts = Post.where("id < 117108")
imported_posts.find_each do |post|
  raw = post.raw.gsub(/(?<=[^\\])\\%/, "\\\\%")
  post.update_column(:raw, raw)
end

but unfortunately, it doesn’t replace anything in Post.raw. Anything obvious I’m missing?

Try adding

   post.save

at the end of the loop.

No, but apparently I was missing some additional escapes. This is what works now:

raw = post.raw.gsub(/(?<=[^\\])\\\%/, '\\\\\%')

Using '\\\\\%' over "\\\\\%" also makes a difference.
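
The reason, as far as I can tell, is that `gsub` processes backslashes in the replacement string a second time, after Ruby has already parsed the string literal. A small sketch on a made-up sample string; the block form at the end is an alternative not used in this thread, shown because it skips that second round of escaping.

```ruby
original = 'about 50\% off'  # made-up sample: one backslash before the percent sign
pattern  = /(?<=[^\\])\\%/

# Double-quoted "\\\\%" is two backslashes plus "%". gsub reads the pair as
# ONE literal backslash, so the text comes back unchanged.
unchanged = original.gsub(pattern, "\\\\%")

# Single-quoted '\\\\\%' is three backslashes plus "%". gsub turns the first
# two into one backslash and keeps "\%" as it is, giving two backslashes.
fixed = original.gsub(pattern, '\\\\\%')

# Block form: the block's return value is inserted verbatim, with no
# replacement-string escape processing.
fixed_block = original.gsub(pattern) { '\\\\%' }

puts unchanged
puts fixed
puts fixed_block
```

With the block form, the literal only has to survive one level of escaping, which makes the number of backslashes easier to reason about.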

Thanks for initially pointing me to the Rails console, @pfaffman!

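Yes, Ruby treats the two quote styles differently. Inside single quotes only \\ and \' are processed, and every other backslash sequence stays intact. Inside double quotes, escapes are processed before the string goes to the next piece of code, and an unknown escape like \% just loses its backslash, i.e.
'\\\\\%' goes as \\\% while "\\\\\%" goes as \\%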
HTML entities like &amp; are a separate matter: neither quote style decodes them.

Can be a gremlin that’s not so easy to pin down.
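
One way to pin the gremlin down is to count the characters each literal actually produces. A minimal sketch in plain Ruby, using the two literals from this thread:

```ruby
single = '\\\\\%'  # single quotes: \\ collapses to one backslash, "\%" keeps its backslash
double = "\\\\\%"  # double quotes: \\ collapses too, and "\%" becomes a plain "%"

single_backslashes = single.chars.count('\\')
double_backslashes = double.chars.count('\\')

puts "single: #{single.length} chars, #{single_backslashes} backslashes"
puts "double: #{double.length} chars, #{double_backslashes} backslashes"
```

Checking `chars` or `length` in the console like this is usually quicker than reasoning about the escapes by eye.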

Glad the push was all you needed!
