Migration via snapshot upload

schwa · January 14, 2022, 9:58pm

Having fought successfully with the vBulletin migration script, and watching the migration churn away for 36 hours now with a couple million posts yet to go…

If there is no existing Discourse content or data relationships to preserve, is there any reason why an existing non-Discourse forum could not be migrated by generating a Discourse backup SQL snapshot directly from the source database data?

We’d have to write the script more or less from scratch, but at a high level it would resemble the existing migration scripts. The script would pull the data from the source database, munge it as needed*, and generate flat data dumps for each target Discourse table, which could then be pieced together into the equivalent of a Discourse backup snapshot. In effect, the script output would be injected into the backup snapshot of an empty Discourse instance.

The * above hides a ton of work, but is there any major roadblock I’m overlooking? Since we can reuse all of the existing source data identifiers (topic id, thread id, etc), I don’t think the munging step requires holding any significant amount of state, but maybe I’m wrong about that. It seems like the heavy lifting of the migration logic would be in the database calls to the source database.
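
To make the idea concrete, here is a minimal sketch of the munge-and-dump step, assuming the flat dumps are written in PostgreSQL's COPY text format (tab-separated, `\N` for NULL). Everything in it is illustrative: the source column names (`postid`, `threadid`, `userid`, `pagetext`), the four target columns, and the helper names are assumptions for the example, and a real Discourse `posts` row needs many more columns than shown.

```ruby
# frozen_string_literal: true

require "stringio"

# Escape one value for PostgreSQL's COPY text format:
# NULL becomes \N; backslash, tab, newline and carriage return are escaped.
COPY_ESCAPES = { "\\" => "\\\\", "\t" => "\\t", "\n" => "\\n", "\r" => "\\r" }.freeze

def copy_escape(value)
  return "\\N" if value.nil?

  value.to_s.gsub(/[\\\t\n\r]/) { |ch| COPY_ESCAPES.fetch(ch) }
end

# Build one tab-separated COPY line from a row hash, in column order.
def copy_line(row, columns)
  columns.map { |col| copy_escape(row[col]) }.join("\t")
end

# Hypothetical target columns; a real dump would list every required column.
POST_COLUMNS = %i[id topic_id user_id raw].freeze

# Munge one source row into a target row, reusing the source identifiers
# directly so that no id-mapping state has to be held.
def munge_post(src)
  {
    id: src[:postid],
    topic_id: src[:threadid],
    user_id: src[:userid],
    raw: src[:pagetext],
  }
end

# Stream rows to any IO-like object, one COPY line per row.
def dump_table(rows, columns, io)
  rows.each { |row| io.puts(copy_line(row, columns)) }
end

# Stand-in for rows fetched from the source database (hypothetical data).
SOURCE_POSTS = [
  { postid: 101, threadid: 7, userid: 3, pagetext: "Hello\tworld" },
  { postid: 102, threadid: 7, userid: nil, pagetext: "line one\nline two" },
].freeze

if __FILE__ == $PROGRAM_NAME
  dump_table(SOURCE_POSTS.map { |src| munge_post(src) }, POST_COLUMNS, $stdout)
end
```

Because each row is transformed independently, the script can stream straight from source queries to the dump files. Whether the resulting data can simply be spliced into a backup of an empty instance is exactly the open question above.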

pfaffman · January 14, 2022, 11:26pm

That’s pretty much what the bulk import script does. I believe there is one for vBulletin. You’ll be starting all over and I haven’t yet been successful in running it, but if you have another machine, you might try it there while the current import continues.

codinghorror · January 14, 2022, 11:53pm

@zogstrip and @gerhard where is the bulk importer code these days on GitHub?

pfaffman · January 15, 2022, 2:08am

It’s in the import script directory.

github.com

discourse/discourse/blob/main/script/bulk_import/vbulletin.rb

# frozen_string_literal: true

require_relative "base"
require "set"
require "mysql2"
require "htmlentities"

class BulkImport::VBulletin < BulkImport::Base

  TABLE_PREFIX = "vb_"
  SUSPENDED_TILL ||= Date.new(3000, 1, 1)
  ATTACHMENT_DIR ||= ENV['ATTACHMENT_DIR'] || '/shared/import/data/attachments'
  AVATAR_DIR ||= ENV['AVATAR_DIR'] || '/shared/import/data/customavatars'

  def initialize
    super

    host     = ENV["DB_HOST"] || "localhost"
    username = ENV["DB_USERNAME"] || "root"
    password = ENV["DB_PASSWORD"]

