Replace a string in all posts


(Arpit Jalan) #1

Want to replace a string in all the posts? Let’s do it!

Access your site

First connect to your Droplet via SSH, and enter the Docker container for your Discourse instances:

cd /var/discourse
./launcher enter app

:warning: WARNING

We strongly recommend you take a full backup before proceeding, and make sure your string replacement is specific enough to affect only the places you want it to. If this string replacement goes wrong, every post on your site will look broken!

Remap all posts containing a specific string

Run this command:

rake posts:remap["find","replace"]

Replace find with the string you want to match and replace with the string you want to replace with.

rake posts:remap[":slightly_smiling:",":slight_smile:"]

The above command will replace all occurrences of :slightly_smiling: with :slight_smile:.

Remove/Delete all occurrences of a word/string

rake posts:delete_word["word-to-delete"]

The above command will remove all occurrences of word-to-delete.


Batch processing to search and replace in all posts
Global Replace across database
How to automatically replace a double exclamation point when a message is saved?
Fixing data after import
[Announce] Search & replace / batch process Discourse posts
Remove BBCode and convert to Markdown
CommonMark testing started here!
Remove images from PMs that have been inactive for >1 year
Quote inside quote display issue
Localised inline date/time
Fix quotes after phpBB import
How do I change the legacy CDN URLs of images in posts?
Rebuild HTML and Rebake doesn't work for some images!
(Tobias Eigen) #2

Beyond pure awesomeness. Thank you! :balloon:

Presume this works for @ mentions for when usernames are changed?


(Arpit Jalan) #3

Yes, it does!

rake posts:remap["@old_username","@new_username"]

(Mittineague) #4

Are there implicit leading and trailing word boundaries used in the match?

If not is there an easy way to check for unintended partial matching?

eg.
If one wanted to change “cadab” to “foobar” would “abracadabra” get changed to “abrafoobarra”


(Jeff Atwood) #5

Use the regex replacement method if you want to match at word boundaries.


(Jeff Atwood) #6

To your point though we really should provide a strong warning to take a backup before running any replacements you are not a hundred percent sure of. I will add that to the post.


(Mittineague) #7

Agreed. I know of more than one developer that has had trouble with regex.

Maybe it would be a good idea to not allow the dreaded (.)*


(Arpit Jalan) #8

Regex replacement is not yet supported for posts:remap task because the Regular Expression syntax is different for PostgreSQL query and Ruby.

If we want to support Regex here, I can create a new rake task to accept two Regex argument:

rake posts:remap_regex["pgsql_regex","ruby_regex","replace"]

pgsql_regex will scope to all the posts matching this Regex (and will use POSIX Regular Expressions syntax), ruby_regex will be used to perform Regex match on post’s raw data (will use Ruby Regexp syntax), and will be replaced with replace argument.

Should I go ahead and create this new Rake task?


Fix quotes after phpBB import
(Jeff Atwood) #9

We can defer this work for now. I added a warning, good idea @Mittineague.


(pjv) #10

Is there a way to escape double quotes in the find and replace strings? How would I go about finding and replacing a string that contains double quotes?


(Eli the Bearded) #11

You can use single quotes for the string, if you need just double quotes in there. If you need more than that (both) then it gets more interesting.

(That “rake” command is run at the command line, right? So shell quoting applies? That would mean you need to be careful about commas in the strings, too.)


(pjv) #12
rake aborted!
Don't know how to build task 'posts:remap' (see --tasks)
/usr/local/bin/bundle:23:in `load'
/usr/local/bin/bundle:23:in `<main>'
(See full trace by running task with --trace)

I’m on version 1.6.1.


(cpradio) #13

@pjv, I think you need to be closer to latest to get that functionality. As it was only checked in 5 days ago.


(Jay Pfaffman) #14

I’m adding replace_email_addresses_with_usernames to the mbox import script.

EDIT: The answer to the next question is NO. Don’t try this at home.

Does it make sense to this remap function there? Something like:


  def replace_email_addresses
    puts "", "replacing email addresses with @usernames"
    post = Post.new

    total_count = User.real.count
    progress_count = 0

    User.find_each do |u|
      #puts "#{u.email} --> @#{u.username}"
      DbHelper.remap(u.email, "@" + u.username)
      progress_count += 1
      print_status(progress_count, total_count)
    end
  end

OK. That seems to work! I guess I answered my own question (after 3 edits).

Final edit: By “work” I mean that it does that replacement across every freaking text field in the database, which not only takes forever, but also changes their email address to their username. Oops.


(Jay Pfaffman) #15

So, I’m still new to this codebase, but when I read this (from db_helper.rb):

  REMAP_SQL ||= "
    SELECT table_name, column_name
      FROM information_schema.columns
     WHERE table_schema = 'public'
       AND is_updatable = 'YES'
       AND (data_type LIKE 'char%' OR data_type LIKE 'text%')
  ORDER BY table_name, column_name"

After seeing how cool it was to get the fields to update from the table schema, I thought, “Uh, that’s going to change every char or text column in the database? That seems a little dangerous.”

My code in the post replaces everyone’s email address in the user table with their username (well, so far 10834/12963). @techAPJ, should we exclude the users table, or at least email field from REMAP_SQL?

I guess I should add

    AND column_name != 'email'

or

    AND table_name != users'

before the ORDER BY line? I don’t see anything else in users that looks like it should be replaced.


(Jeff Atwood) #16

Good point @techapj shouldn’t this be scoped to just the correct posts column that contains the posts data?


(Arpit Jalan) #17

You’re looking at the wrong place. The source code for rake posts:remap is here:


(Jay Pfaffman) #18

A little knowledge is a dangerous thing!

Is there a way to call that from an import script,or should I do that a different way there?


(Jay Pfaffman) #19

Looks like I should just copy that code into mbox.rb. Thanks for the pointer, @techAPJ.


(Jay Pfaffman) #20

I’m mostly talking to myself here, but I think that I’ve finally gotten my code working (though it is not the first time that I have thought that and been horribly wrong).

Other people might possible care to be reminded that posts:remap is case sensitive. This is a problem replacing email addresses, which get downcased in Users, but no in the text of messages. People who accept recommendations when doing an @mention are safe; there may be some users who type @username and ignore the recommendations, which could leave some @usernames not replaced. I think.