Merge two Discourse sites into one

If you have two Discourse sites that you wish were one, this guide is for you.

There’s a tool called discourse_merger that can take one Discourse site and merge it into another.

Prereqs

This is not an easy task, and should be treated like any other migration to Discourse. Do not run discourse_merger on a live production site. Perform the merge in a separate environment where you can review the output before moving the result to production.

Copy vs Merge

Almost everything will be copied from one site to the other, but Categories and Users can be merged, which will avoid duplication.

  • Users will be merged if a user on both sites has the same email address.
  • Categories will be merged if they have the same name.
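To get a feel for which accounts would merge before running anything, the user rule above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code discourse_merger runs: the SiteUser struct and plan_user_merge are hypothetical names, and comparing emails case-insensitively after trimming is an assumption about the matching.

```ruby
# Hypothetical in-memory records; the real script works on database rows.
SiteUser = Struct.new(:id, :email)

# Split source users into those that would merge into an existing
# destination user (same email) and those that would be copied as new users.
def plan_user_merge(destination_users, source_users)
  # Assumption: emails are compared case-insensitively after trimming.
  by_email = destination_users.to_h { |u| [u.email.strip.downcase, u] }
  merged = {} # source user id => destination user id
  copied = [] # source users with no match on the destination
  source_users.each do |u|
    match = by_email[u.email.strip.downcase]
    if match
      merged[u.id] = match.id
    else
      copied << u
    end
  end
  { merged: merged, copied: copied }
end

destination = [SiteUser.new(1, "ann@example.com"), SiteUser.new(2, "bob@example.com")]
source = [SiteUser.new(10, "Ann@example.com"), SiteUser.new(11, "cy@example.com")]
plan = plan_user_merge(destination, source)
puts "merged: #{plan[:merged].inspect}"
puts "copied: #{plan[:copied].map(&:email).inspect}"
```

The same shape would apply to categories by swapping the email key for the category name.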

If you want to do any reorganization of your data, do it before merging.

Choose the destination site

Choose which site will be the destination for the data. This is the one that will retain all its styling and settings. The other site will have its users, categories, topics, posts, uploads, etc. copied/merged into the destination site.

How to do it

Take backups of both sites, including files, and copy them to the environment where you’ll perform the merge. The backups may come from different versions of Discourse, and the merge needs both sites at the same version. I would use the most recent version of Discourse while performing the merge.

Restore the destination site to the merger environment. If doing this from the command line:

bundle exec ruby script/discourse restore destination-2018-08-02-134227-v2018xxx.tar.gz

Next we’ll extract the other site.

cd /path/to/data
tar xvzf other-2018-08-02-134227-v2018xxx.tar.gz

The extracted files will include the database dump (dump.sql.gz) and the upload files.

Create a database with the data:

psql
CREATE DATABASE "copyme" ENCODING = 'utf8';
\q
gunzip < /path/to/data/dump.sql.gz | psql -d copyme

If you’re running the import in an official Docker container (recommended), you will need to reset the postgres password to provide it to the script, otherwise you may encounter an error that the postgres user cannot access the database.

To change the password:

sudo -u postgres psql
\password postgres
(enter the new password)
\q

Now it’s time to run the script. Some env variables you’ll set:

DB_NAME: name of database being merged into the destination site.
DB_HOST: (optional) hostname of database being merged. Leave blank if it’s local.
DB_PASS: password for the postgres user to access the database
UPLOADS_PATH: absolute path (site being merged) of the directory containing “original” and “optimized” dirs. e.g. /path/to/data/uploads/default
SOURCE_BASE_URL: base url of the site being merged. e.g. https://meta.discourse.org
SOURCE_CDN: (optional) base url of the CDN of the site being merged.
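A small pre-flight check can catch a missing or malformed variable before a long import starts. The sketch below is not part of discourse_merger; merger_env_problems is a hypothetical helper, and treating every variable not marked optional above as required is an assumption.

```ruby
# Variables the list above does not mark as optional (assumed to be required).
REQUIRED_VARS = %w[DB_NAME DB_PASS UPLOADS_PATH SOURCE_BASE_URL].freeze

# Return a list of human-readable problems found in the given settings hash.
def merger_env_problems(env)
  problems = REQUIRED_VARS
    .select { |key| env[key].to_s.strip.empty? }
    .map { |key| "#{key} is not set" }

  path = env["UPLOADS_PATH"].to_s
  if !path.empty? && !path.start_with?("/")
    problems << "UPLOADS_PATH must be an absolute path"
  end

  url = env["SOURCE_BASE_URL"].to_s
  if !url.empty? && !url.match?(%r{\Ahttps?://})
    problems << "SOURCE_BASE_URL should start with http:// or https://"
  end

  problems
end

# Example settings taken from the values used in this guide.
settings = {
  "DB_NAME" => "copyme",
  "DB_PASS" => "password",
  "UPLOADS_PATH" => "/shared/import/data/uploads/default",
  "SOURCE_BASE_URL" => "http://copy.othersite.com"
}
problems = merger_env_problems(settings)
puts problems.empty? ? "settings look complete" : problems.join("\n")
```

In practice you would pass ENV.to_h instead of the example hash.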

You may need to run a bundle install prior to running the import script to avoid errors. To do so:

su discourse -c 'bundle config set --local with generic_import && bundle install'

On the first run, you might need to install some extra dependencies for the gems required in import.

Once the bundle is complete, run the import.

su discourse -c 'DB_NAME=copyme DB_PASS=password SOURCE_BASE_URL=http://copy.othersite.com UPLOADS_PATH=/shared/import/data/uploads/default bundle exec ruby script/bulk_import/discourse_merger.rb'

When it’s done, review the output in a web browser.

You can use the remap tool to update links from the old forum.

bundle exec ruby script/discourse remap 'copy.othersite.com' 'hot.newsite.com'

Also rebake all posts with uploads:

rake posts:rebake_match["upload:"]

If everything looks good, take a backup of the result and restore it to your production server.

bundle exec ruby script/discourse backup

Last edited by @italo 2024-11-27T00:52:10Z

Last checked by @JammyDodger 2024-05-26T21:20:00Z

45 likes

It seems to work, but when I ran the backup I got

pg_dump: error: query failed: ERROR:  permission denied for table migration_mappings

which is odd.

Edit: solved with:

ALTER USER discourse WITH SUPERUSER;
1 like

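Has anyone used this recently? How well did it work?

Also, does anyone know whether users can automatically be put into a group for each source forum? (So that they get permission to see the topics from the forum they came from.)

It's a bit of a hassle. I think I had to comment out a few things. There were also problems merging images.

I think I would add all users of the new site to a group so that they are already in that group when the merge happens. That is easier than doing it after the merge or as part of the merge.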

2 likes

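The last time I did this, all of the merged site's uploads were missing. I got the list of uploads from tar tf backupfile.tar.gz, put it in allfiles.txt, and copied that to the uploads directory. This script (which will most likely need modification to work for you) created an upload for each file, and rebaking the posts then fixed all (or most?) of the missing images.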

def process_uploads
  # Read the list of file names, one path per line
  filenames = File.readlines('/shared/uploads/allfiles.txt').map(&:strip)
  count = 0

  filenames.each do |filename|
    count += 1
    # Drop a leading "./" and build the path under the originals directory
    full_path = File.join('/shared/uploads/default/original/', filename.sub(%r{\A\./}, ''))

    begin
      # Only process paths that exist and are regular files (not directories)
      if File.file?(full_path)
        File.open(full_path, 'r') do |tempfile|
          # Create an upload owned by the system user (id -1)
          u = UploadCreator.new(tempfile, 'imported', {}).create_for(-1)
          puts "#{count} -- #{u.id}: #{u.url}"
        end
      else
        puts "Warning: path not found or not a regular file: #{full_path}"
      end
    rescue => e
      # Keep going with the next file even if this one fails
      puts "Error processing #{full_path}: #{e.message}"
    end
  end
rescue Errno::ENOENT
  puts "Error: allfiles.txt not found"
rescue => e
  puts "Error reading allfiles.txt: #{e.message}"
end

# Run it (the trailing semicolon keeps the console from echoing the return value)
process_uploads;
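The script above expects allfiles.txt to hold paths relative to the original directory, while tar tf prints paths from the root of the archive. One way to bridge that gap is sketched below. It is an assumption-laden illustration: original_upload_paths is a hypothetical helper, and the uploads/default/original/ prefix is assumed from the paths used in this thread, so check it against your own tar tf output.

```ruby
# Turn `tar tf` output into paths relative to the "original" uploads directory.
def original_upload_paths(tar_listing, marker: "uploads/default/original/")
  tar_listing.each_line.filter_map do |line|
    entry = line.strip.sub(%r{\A\./}, "")
    next if entry.empty? || entry.end_with?("/") # skip blank lines and directories
    idx = entry.index(marker)
    next unless idx # skip anything outside the originals directory
    entry[(idx + marker.length)..]
  end
end

# Made-up listing for illustration; real output comes from `tar tf backupfile.tar.gz`.
listing = <<~LISTING
  ./
  ./dump.sql.gz
  uploads/default/original/
  uploads/default/original/1X/abc.png
  ./uploads/default/optimized/1X/abc_2_100x100.png
  ./uploads/default/original/2X/d/def.jpg
LISTING

paths = original_upload_paths(listing)
puts paths
```

Writing the result out with File.write('/shared/uploads/allfiles.txt', paths.join("\n")) would produce the file the script reads.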

I found the bad posts with:

bad = Post.where("cooked LIKE '%/images/transparent.png%'")

and then marked them as needing a rebake with:

bad.update_all(baked_version: nil)

I ran out of patience, so I used

rake posts:rebake_uncooked_posts

to rebake them.

2 likes

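I wonder whether it would be easier to convert the Discourse forum I want to merge to XenForo (their importers are generally excellent), merge it there with the other forums I want to merge (which will themselves be converted from vBulletin to XenForo), and finally import the newly merged XenForo forum into Discourse.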