Reset after running migration script?

I’m using the migration scripts to migrate a Vanilla 3 forum to self-hosted Discourse.

The migration script is working fine:
RAILS_ENV=production ruby script/import_scripts/vanilla.rb /shared/uploads/export.text

The only issue is that once I’ve done the import, I can’t seem to re-import. The import script runs fine a second time, but any changes I made to the import file are not applied. Also, the importer runs about 50x faster on the second run, which makes me suspect it’s not actually importing anything.

Question: is there any way to re-run the import scripts located at
/var/www/discourse/script/import_scripts/
after the first run?

Specifically, as I fix bugs in my import file format, I’d like to be able to re-import to have updates made to posts & discussions only.

So far the only solution I found was to nuke the entire Discourse install and start from scratch, which takes nearly an hour each time.

Any tips?

Here is the relevant code from vanilla.rb:

  def import_posts
    puts "", "importing posts..."

    create_posts(@comments) do |comment|
      next unless t = topic_lookup_from_imported_post_id("discussion#" + comment[:discussion_id])

      {
        id: "comment#" + comment[:comment_id],
        user_id:
          user_id_from_imported_user_id(comment[:insert_user_id]) || Discourse::SYSTEM_USER_ID,
        topic_id: t[:topic_id],
        raw: clean_up(comment[:body]),
        created_at: parse_date(comment[:date_inserted]),
      }
    end
  end

I’m a programmer but not a ruby programmer - is there any way to modify this code to force it to replace the content of a post if I do a re-import?

I found one workaround that’s not half bad - as I improve my parser, which cleans up the import file from Vanilla, I tend to focus on mistakes that occur in specific posts.

So as I improve my parser, I can stop the parser in the debugger (I’m using Xojo, for what it’s worth) and get the Raw Text.

Then, in the live discourse forum, I can simply add a new post, paste the text in, and see how it looks.

This allows me to do a test/debug/change cycle of a few seconds, rather than an hour or so.

My new plan: once I’m happy with my parser cleanup, I’ll nuke the Discourse install and re-install from scratch.

It works that way on purpose. The idea is that you can do an import now and then run another one later with a fresh dump; the second run will be very fast since it imports only the new data.
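To illustrate why the second run is fast and ignores edited rows, here is a minimal sketch of the "skip anything already imported" idea. It is not the actual Discourse importer code: `TinyImporter`, its `import` method, and the in-memory hash are all hypothetical names made up for this example.

```ruby
# Hypothetical, simplified model of an incremental import.
# None of these names come from the Discourse import scripts.
class TinyImporter
  attr_reader :posts

  def initialize
    @posts = {} # import id => raw text stored on the first import
  end

  # Stores each row unless its import id was already seen in an earlier run.
  # Returns [created, skipped].
  def import(rows)
    created = 0
    skipped = 0
    rows.each do |row|
      if @posts.key?(row[:id])
        skipped += 1 # already imported: the edited text is never looked at
        next
      end
      @posts[row[:id]] = row[:raw]
      created += 1
    end
    [created, skipped]
  end
end

importer = TinyImporter.new
importer.import([{ id: "comment#1", raw: "old text" }])

# Second run: comment#1 was fixed in the dump and comment#2 is new.
importer.import([
  { id: "comment#1", raw: "fixed text" },
  { id: "comment#2", raw: "brand new" },
])

puts importer.posts["comment#1"] # prints "old text"
```

In this toy model the only way to get "fixed text" stored is to start from an empty store, which corresponds to resetting the database before re-running the import.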

You need to drop, create, and migrate the database to start over.

If you have lots of users, you could stop the script after the users were imported and make a backup then and restore that backup before you try your fixes.


That makes perfect sense, thank you for explaining. It would be nice if there were a flag one could set to “force overwrite”, but I dug through the code a bit and didn’t see anything obvious.

Is there an easy way to drop, create, and migrate the database? The only solution I’ve found is this set of commands, which is basically the same as starting a fresh docker install:

# WARNING: these commands delete your entire Discourse forum
cd /var/discourse
sudo ./launcher stop app
sudo rm -rf /var/discourse/shared/standalone
sudo ./launcher rebuild app

It only takes about 10 minutes, but then I have to go through the initial setup again, which is a pain.

Now that is a fantastic idea! I could even just make a backup after a fresh install, but before running the import script at all, since re-importing users/topics/posts/comments is pretty quick, and this forum is not live to the public.

sv stop unicorn
rake db:drop db:create db:migrate

You have to set an env variable to drop the database, but the error message will tell you which one.