Replace backslash escape only in old posts after import

(Florian) #1

I’ve imported a forum where we made heavy use of the \% combination of characters,
which is now reduced to a plain percent sign due to markdown (?) processing:
% (I’ve also typed the backslash here, but omitted the code ticks)

Is there a way to

  1. exclude the \% combination from being processed in a special way, or
  2. replace the \% combination with \\%, but only in the imported posts?

Thanks for any pointers.

(Stephen Chung) #2

There are ways to globally replace a text string in all posts and then recook all the posts.

(Florian) #3

I’m aware of rake posts:remap["find","replace"].

In this specific case I guess it would be

rake posts:remap["\\%","\\\\%"]

However, when looking at the code, it seems like regexp is also supported.

So something like

rake posts:remap["(?<=[^\\])\\%","\\\\%","regex"]

should replace all occurrences of \% which are not preceded by another \. But it seems this regexp is then passed along to the database layer which doesn’t support lookbehinds.

It’s tricky. Thanks for any further ideas.
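
As a side note, Ruby's own regexp engine does support lookbehind, so the pattern can be tried in plain Ruby before touching any posts. Below is a minimal sketch; `sample` is a made-up string rather than real post content, and `ALT` is a suggested variant that is not from the thread.

```ruby
# Pattern from the post: \% preceded by any character that is not a backslash.
PATTERN = /(?<=[^\\])\\%/

# Variant (an assumption, not from the thread): a negative lookbehind,
# which also matches a \% sitting at the very start of the text.
ALT = /(?<!\\)\\%/

# Hypothetical sample: a lone \% followed by an already doubled \\%.
sample = 'rate is 5\% and escaped 5\\\\%'

matches = sample.scan(PATTERN)
puts matches.length          # number of lone \% occurrences found
puts sample.index(PATTERN)   # offset of the first match
puts '\%'.match?(PATTERN)    # whether a leading \% is matched by the original
puts '\%'.match?(ALT)        # whether a leading \% is matched by the variant
```

The positive lookbehind needs an actual preceding character, so a post that starts with \% would be skipped by the original pattern. Whether that matters depends on the imported content.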

(Jay Pfaffman) #4

To fix only the posts from before the forum went live, you could do it from the Rails console: search by date and loop over the results with each.

(Florian) #5

Can you give an example?

(Jay Pfaffman) #6

I’ve posted a few examples that I can’t find from my phone. I think I posted code about deactivating all users that could be an example.
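
A rough sketch of the "search by date, then each" idea, using plain Ruby stand-ins so it runs outside Discourse. `FakePost`, `GO_LIVE`, and the sample rows are hypothetical. In the Rails console the selection would presumably be an ActiveRecord query along the lines of `Post.where("created_at < ?", cutoff).find_each`, but that part is an assumption and untested here.

```ruby
# Stand-in for a Discourse post with only the fields used here (hypothetical).
FakePost = Struct.new(:id, :created_at, :raw)

# Assumed go-live date of the imported forum (hypothetical value).
GO_LIVE = Time.utc(2018, 1, 1)

posts = [
  FakePost.new(1, Time.utc(2017, 5, 1), 'old post with 5\%'),
  FakePost.new(2, Time.utc(2018, 6, 1), 'new post with 5\%')
]

# Select only posts created before go-live, then loop over them.
imported = posts.select { |post| post.created_at < GO_LIVE }
imported.each do |post|
  # Block form of gsub: the block's return value is inserted as-is,
  # so no second round of backslash processing applies to it.
  post.raw = post.raw.gsub(/(?<=[^\\])\\%/) { '\\\\%' }
end

posts.each { |post| puts "#{post.id}: #{post.raw}" }
```

On a real site the posts would also need to be saved and recooked afterwards, as mentioned earlier in the thread.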

(Florian) #7

This is powerful stuff :nerd_face: Seems like I’m slowly learning Ruby here.

I’ve thought of something like

imported_posts = Post.where("id < 117108")
imported_posts.find_each do |post|
  raw = post.raw.gsub(/(?<=[^\\])\\%/, "\\\\%")
  post.update_column(:raw, raw)
end

but unfortunately, it doesn’t replace anything in Post.raw. Anything obvious I’m missing?

(Jay Pfaffman) #8

Try adding

At the end of the loop

(Florian) #9

No, but apparently I was missing some additional escapes. This is what works now:

raw = post.raw.gsub(/(?<=[^\\])\\\%/, '\\\\\%')

Using '\\\\\%' over "\\\\\%" also makes a difference.

Thanks for initially pointing me to the Rails console, @pfaffman!
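
Why the quoting matters can be checked in plain Ruby. The replacement passes through two layers: the string literal is parsed first, and `gsub` then treats `\\` inside the replacement string as one literal backslash. A small sketch follows; the variable names and the sample input are illustrative only.

```ruby
pattern = /(?<=[^\\])\\\%/
input   = 'costs 5\%'     # one backslash before %

double = "\\\\\%"         # double quotes: \% loses its backslash
single = '\\\\\%'         # single quotes: \% keeps its backslash

puts double.length        # characters left after literal parsing
puts single.length

# gsub folds each \\ in the replacement into a single backslash and
# leaves an unknown sequence such as \% untouched.
with_double = input.gsub(pattern, double)
with_single = input.gsub(pattern, single)

puts with_double == input # true when the replacement changed nothing
puts with_single
```

Passing a block instead, as in `gsub(pattern) { '\\\\%' }`, skips the second layer, since the block's return value is inserted verbatim.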

(Mittineague) #10

Yes, strings inside single quote marks are treated as literal strings, with all escapes intact. Escapes inside double quote marks get processed before going to the next piece of code, e.g.
'\\\\' goes as '\\\\' while "\\\\" goes as '\\'
Similarly, '&amp;' -> '&amp;' vs "&amp;" -> '&'

Can be a gremlin that’s not so easy to pin down.
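
In Ruby specifically, the rule appears to be a little narrower than "all escapes intact": single-quoted literals still collapse `\\` into one backslash (and accept `\'`), while leaving other sequences such as `\%` alone. A quick check using only core String methods (the variable names are illustrative):

```ruby
single_bs = '\\\\'   # single quotes: each \\ pair becomes one backslash
double_bs = "\\\\"   # double quotes: same result for backslash pairs

single_pct = '\%'    # backslash kept, since \% is not an escape here
double_pct = "\%"    # backslash dropped, leaving just %

puts single_bs.length
puts double_bs.length
puts single_bs == double_bs
puts single_pct.length
puts double_pct.length
```

So for runs of backslashes the two quote styles agree in Ruby. They differ at sequences like `\%`, which is exactly the case in this thread.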

(Jay Pfaffman) #11

Glad the push was all you needed!