Fix quotes after phpBB import


#1

When I imported from phpBB, some quotes were not being handled correctly, as discussed here.

How can I fix them after the fact?

They currently look like this:

[quote]some text
[/quote]

and like this:

[quote=“John Doe”]some text
John[/quote]

The quote tags need to be on a new line.

I looked at this, but couldn’t figure out if it could be applied:


Did 1.9 beta 17 rebake change quotes in posts?
(Mittineague) #2

Nothing personal, but if you don’t understand what’s in that topic then you probably shouldn’t experiment with regex. If you get it wrong it could result in anything form ineffective, incomplete or disastrous enough where you would need to restore your backup.

Using the Data Explorer plugin try this query

SELECT COUNT(id) 
FROM posts 
WHERE cooked LIKE '%[quote%' 

and see how many you’ll need to deal with.


#3

I asked because:


(Mittineague) #4

Ah OK, you’re thinking rake task and I was concerned you might be tempted to run a command.

Anyway, what result do you get with the query?


#5

12153 is the count from the query.


(Mittineague) #6

Thanks, I was hoping it would be much less than that so that manually editing them individually wouldn’t be too tedious. But for that many one would need to have exceptional perseverance.

You will definitely want to save a backup before doing anything potentially destructive in the CLI. And it would be best to hone the pattern until the count value is the same before changing any content.

You didn’t show any in your example bbcode, so I didn’t think of it at the time, but you should run that query again using ILIKE (case insensitive) just in case you have any [QUOTE] tags. (hopefully not)

My ActiveRecord foo is less than I’d like it to be, but Inside the rails console try

Post.where("cooked  REGEXP ?", '\[quote').count 

If the results don’t match it will need some refinements.


#7

Thanks!

The rails command generates this error:

[1] pry(main)> Post.where("cooked  REGEXP ?", '\[quote').count
ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR:  syntax error at or near "REGEXP"
LINE 1: ...WHERE ("posts"."deleted_at" IS NULL) AND (cooked  REGEXP '\[...

(Mittineague) #8

I’m guessing and hoping it doesn’t like the backslash escape. Maybe?

Post.where("cooked REGEXP ?", '[quote').count 

#9

That wasn’t it, same error, unfortunately.

Edit: Syntax error ^ points at REGEXP


(Mittineague) #10

Bah! I’ll try one more before I defer to someone more adept than I and start thinking the Postgres console which I’m more comfortable with instead if someone else doesn’t jump in.

Post.where("cooked REGEXP ?", /\[quote/).count 

#11

Sadly, no luck. I do appreciate the efforts, though!

TypeError: can't quote Regexp


(Mittineague) #12

And here I was thinking I was finally getting better with ActiveRecord in the rails console :slightly_frowning_face:

Most likely a simple syntax error or maybe needs to use downcasing instead.

Anyway, if you can get to the pg console try

SELECT COUNT(id) 
FROM posts 
WHERE cooked ILIKE '%[quote%';

#13

pg console: 12152
data explorer: 12148

data explorer with uppercase QUOTE: 2


(Mittineague) #14

Thanks. I’m guessing the other two are Mixed Case.

I’m on my iPad now, but I’ll go to my desktop to look at my notes of queries that I know have worked for me and edit this post ASAP.


#15

You are correct - Quote returns 2 in data explorer.


(Mittineague) #16

Found what I was looking for. POSIX regex is a bit different from LIKE or SIMILAR TO.but it allows for more finely tuned pattern matching. The result should be the same as the 12152 query

SELECT COUNT(DISTINCT(id)) 
FROM posts 
WHERE cooked ~* '\[quote' 

#17

Yes, it’s the same: 12152.


(Mittineague) #18

Great! Now comes the decision making as to what you want to happen. eg.

  • [quote]foobar[/quote] to foobar
  • [quote some_attr="some_val"]foobar[/quote] to foobar
  • [quote]foobar[/quote] to <blockquote>foobar</blockquote>
  • [quote some_attr="some_val"]foobar[/quote] to <blockquote some_attr="some_val">foobar</blockquote>

AFAIK the blockquote tags in Discourse do not have attributes and trying to keep them would likely be a poor idea. Would one of the first three be OK even with losing any attributes there might be?


#19

I’d be perfectly happy with inserting a line break before and after [quote%] and a line break before and after [/quote].

Losing the attribute would be ok.


(Mittineague) #20

Well, except that once the bbcode tags are replaced there won’t be a before and after, that will make the query easier. So like this would be OK, are you sure?

  • [quote]foobar[/quote] to <br><br>foobar<br><br>
  • [quote some_attr="some_val"]foobar[/quote] to <br><br>foobar<br><br>