Migrating a phpBB3 forum to Discourse

I had this same error. I believe it happens when one gets a stale shared/standalone/import/mysql/imported file. Deleting that file cleared the error.

2 Likes

Migrating passwords apparently didn’t work for me.

I set passwords: true in shared/standalone/import/settings.yml before importing from a phpbb 3.0 dump. I have this in containers/app.yml

hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
          - git clone https://github.com/discoursehosting/discourse-migratepassword.git

and I ran ./launcher rebuild app after the import. The import seems to have worked entirely, except that I cannot log in with my password from phpbb. There weren’t any relevant error messages that I spotted from the import, and the password is 20 characters long, so should clear the minimum length limit.

What should I be looking for to troubleshoot this?

A bit more information: I see import_pass entries in the user_custom_fields table of the database that match password hashes that were in the phpbb database, so that part seems to have worked?

Edit: Oh ho, I figured out what happened. I had completely forgotten that we had switched to LDAP authentication for phpbb… The passwords I imported were totally fine, just 15 years old! So now I need to knock something together to extract the password hashes from LDAP instead… :slight_smile:

1 Like

Hi everyone,

Has anyone done an import with attachments recently?

The users and posts import perfectly, but the attachments don’t get added to the posts.

I still have the BBCode in some posts.

The weird thing is that the contents of the files folder do get transferred to Discourse’s uploads folder.

At first, I thought it was because phpBB 3.0.12 was too old, but I upgraded phpBB to 3.0.14, cleaned Discourse and the uploads folder, and tried the import again. No luck.

I then upgraded phpBB to 3.2.0 on a test server (after running their support toolkit to clean the mods out of the database). Still no luck.

Before investigating my database further, I wanted to be sure that the script works for everyone.

I don’t think I’ve made a mistake in my settings file, but maybe a fresh pair of eyes will help:

# This is an example settings file for the phpBB3 importer.

database:
  type: MySQL # currently only MySQL is supported
  host: localhost
  port: 3306
  username: root
  password:
  schema: phpbb
  table_prefix: phpbb_ # Change this, if your forum is using a different prefix. Usually all table names start with phpbb_
  batch_size: 1000 # Don't change this unless you know what you're doing. The default (1000) should work just fine.

import:
  # Set this if you import multiple phpBB forums into a single Discourse forum.
  #
  # For example, when importing multiple sites, prefix all imported IDs
  # with 'first' to avoid conflicts. Subsequent import runs must have a
  # different 'site_name'.
  #
  # site_name: first
  #
  site_name:

  # Create new categories
  #
  # For example, to create a parent category and a subcategory.
  #
  # new_categories:
  # - forum_id: foo
  #   name: Foo Category
  # - forum_id: bar
  #   name: Bar Category
  #   parent_id: foo
  #
  new_categories: []

  # Category mappings
  #
  # For example, topics from phpBB category 1 and 2 will be imported
  # in the new "Foo Category" category, topics from phpBB category 3
  # will be imported in subcategory "Bar category", topics from phpBB
  # category 4 will be merged into category 5 and category 6 will be
  # skipped.
  #
  # category_mappings:
  #   1: foo
  #   2: foo
  #   3: bar
  #   4: 5
  #   6: SKIP
  #
  category_mappings: {}

  # Tag mappings
  #
  # For example, imported topics from phpBB category 1 will be tagged
  # with 'first-category', etc.
  #
  # tag_mappings:
  #   1:
  #   - first-category
  #   2:
  #   - second-category
  #   3:
  #   - third-category
  #
  tag_mappings:

  # Rank to trust level mapping
  #
  # Map phpBB 3.x rank levels to trust level
  # Users with rank at least 3000 will have TL3, etc.
  #
  # rank_mapping:
  #   trust_level_1: 200
  #   trust_level_2: 1000
  #   trust_level_3: 3000
  #
  rank_mapping:

  # WARNING: Do not activate this option unless you know what you are doing.
  # It will probably break the BBCode to Markdown conversion and slows down your import.
  use_bbcode_to_md: false

  # This is the path to the root directory of your current phpBB installation (or a copy of it).
  # The importer expects to find the /files and /images directories within the base directory.
  # You need to change this to something like /var/www/phpbb if you are not using the Docker based importer.
  # This is only needed if you want to import avatars, attachments or custom smilies.
  phpbb_base_dir: /shared/import/data

  site_prefix:
    # this is needed for rewriting internal links in posts
    original: ***.com   # without http(s)://
    new: https://****.org       # with http:// or https://

  # Enable this, if you want to redirect old forum links to the new locations.
  permalinks:
    categories: true  # redirects   /viewforum.php?f=1            to  /c/category-name
    topics: true      # redirects   /viewtopic.php?f=6&t=43       to  /t/topic-name/81
    posts: false      # redirects   /viewtopic.php?p=2455#p2455   to  /t/topic-name/81/4
    # Append a prefix to each type of link, e.g. 'forum' to redirect /forum/viewtopic.php?f=6&t=43 to /t/topic-name/81
    # Leave it empty if your forum wasn't installed in a subfolder.
    prefix:

  avatars:
    uploaded: true  # import uploaded avatars
    gallery: false   # import the predefined avatars phpBB offers
    remote: false   # WARNING: This can considerably slow down your import. It will try to download remote avatars.

  # When true: Anonymous users are imported as suspended users. They can't login and have no email address.
  # When false: The system user will be used for all anonymous users.
  anonymous_users: true

  # Enable this, if you want to import password hashes in order to use the "migratepassword" plugin.
  # This will allow users to login with their current password.
  # The plugin is available at: https://github.com/discoursehosting/discourse-migratepassword
  passwords: false

  # By default all the following things get imported. You can disable them by setting them to false.
  bookmarks: true
  attachments: true
  private_messages: false
  polls: false

  # When true: each imported user will have the original username from phpBB as its name
  # When false: the name of each imported user will be blank unless the username was changed during import
  username_as_name: false

  # Map Emojis to smilies used in phpBB. Most of the default smilies already have a mapping, but you can override
  # the mappings here, if you don't like some of them.
  # The mapping syntax is: emoji_name: 'smiley_in_phpbb'
  # Or map multiple smilies to one Emoji: emoji_name: ['smiley1', 'smiley2']
  emojis:
    # here are two example mappings...
    smiley: [':D', ':-D', ':grin:']
    heart: ':love:'
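
The comments on phpbb_base_dir say the importer expects /files and /images inside the base directory. A minimal sketch of a pre-flight check for that follows. The helper name missing_import_dirs is hypothetical and not part of the importer, and it only checks that the two directories exist, not what is in them.

```ruby
require "yaml"

# Hypothetical helper (not part of the importer): returns the expected
# phpBB subdirectories that are missing under import.phpbb_base_dir.
def missing_import_dirs(settings)
  base = settings.dig("import", "phpbb_base_dir").to_s
  %w[files images]
    .map { |name| File.join(base, name) }
    .reject { |path| Dir.exist?(path) }
end

# Example usage (the settings path is an assumption):
#   settings = YAML.load_file("settings.yml")
#   puts missing_import_dirs(settings)
```

An empty result only means both directories are present; it says nothing about whether attachments end up embedded in posts.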
1 Like

This specifically happens if import_phpbb3.sh doesn’t find the phpbb_mysql.sql file in the correct path.

1 Like

I presume this applies to phpBB too, as I’m trying to import dumps ranging from v3.2.x to v3.3.3 but missing parent posts are in the thousands. Even with multiple runs and multiple sequential version backups. For debug simplicity the script could output the url for the message ID to the old forum for reference checking. (… viewtopic.php?p=57912)

1 Like

Actually since we’re at this, why not log all failed import rows with their error messages in a dump file to share for analysis… just a thought…
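
A rough sketch of what such a failure log could look like, assuming one JSON line per skipped row plus the old-forum URL for reference checking. Both helper names and the base URL are hypothetical; the import script does not currently provide them.

```ruby
require "json"

# Hypothetical helper: builds the old-forum URL for a phpBB post ID.
def old_post_url(base_url, post_id)
  "#{base_url}/viewtopic.php?p=#{post_id}"
end

# Hypothetical helper: appends one JSON line per failed row to any IO
# object (a File opened in append mode, $stderr, a StringIO, ...).
def log_failed_row(io, base_url, post_id, error)
  row = {
    "post_id" => post_id,
    "url" => old_post_url(base_url, post_id),
    "error" => error
  }
  io.puts(JSON.generate(row))
end
```

Writing JSON lines keeps the file easy to share and to filter later, but any line-oriented format would do.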

At least one occasion is where there is a topic viewtopic.php?f=3&t=1472 but the first post viewtopic.php?p=145185 has been deleted/removed/etc and now the first post for the topic is viewtopic.php?p=145186 which is “a reply”

Maybe, for clarity, the script could state whether the parent post is literally not found in the dump or just hasn’t been imported into the reconstruction yet.
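
The distinction could be sketched roughly like this, assuming the set of post IDs in the dump and the set of IDs already imported are both available. The function name and both inputs are assumptions for illustration, not something the importer exposes.

```ruby
require "set"

# Hypothetical classifier for a "Parent post doesn't exist" warning.
# dump_post_ids:     IDs of all posts present in the phpBB dump (assumed input)
# imported_post_ids: IDs already imported into Discourse (assumed input)
def classify_missing_parent(parent_id, dump_post_ids, imported_post_ids)
  return :imported if imported_post_ids.include?(parent_id)
  return :not_yet_imported if dump_post_ids.include?(parent_id)

  :missing_from_dump
end
```

With the example above, post 145185 would come back as :missing_from_dump, while a parent that simply has not been processed yet would be :not_yet_imported and should resolve on a later run.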

1 Like

I’d still appreciate some feedback on whether anyone has managed to do a full import with attachments recently. I can’t make this work.

Right now, I just don’t know if the problem comes from the script or from the phpBB forum’s database :pray:

1 Like

Did you download the images and put them in the right place?

I haven’t done it lately, but I’d be surprised if it doesn’t work.

1 Like

Yes, I double-checked the folders and the settings file. The weird thing is that the files are getting imported into the default/original folders, but not integrated into the posts.

It was a really old phpBB with some plugins. I did a little cleanup and successfully upgraded from phpBB 3.0.12 to the latest 3.1 and 3.2, testing an import for each version, but it didn’t work. It may be an issue with the database. That’s why it would be helpful to hear from you if you manage a full import next time. If it’s the phpBB database, I’ll do some heavy digging with some help; if it’s the script, I can wait. Thanks for your message!

1 Like

I don’t remember the last time I did a phpBB3 import, I suspect (but have no way of knowing) because the script works so well.

Since they are getting uploaded to Discourse, the issue is likely how they are referred to in the posts in phpBB. Do you see any errors when it runs? They could offer hints. Or maybe the plugins changed how they are stored in the posts and/or database. You’ll probably have to do some digging.

2 Likes

I had a few “missing files” and “bad post time” messages. I had these during my previous imports too, so I don’t think it’s a big issue. I also get a lot of “Parent post doesn’t exist” on the first run, but that was mentioned earlier in this thread, and a second run of the script fixes it.

Other than that, the script ran pretty well, without any major issues.

There was a plugin (mostly an htaccess file apparently; I didn’t run this forum) to arrange the files in subfolders (per month and year), but I put them all in the same folder and the upgrades on a clean version of phpBB worked fine. All the imported files worked on phpBB 3.1 and 3.2.

I’ll dig a little deeper into the database; I might have an SQL file from an old import. I’ll compare the attachments and posts tables on my test server. Maybe there’s something I missed.

4 Likes

You’re correct, there has been a bug in TextProcessor::process_attachments that causes attachments to not get embedded in the post Markdown. I’ve filed a PR.

5 Likes

Great job, after the two passes, the import looks great. Thanks!

4 Likes

Hi, I just imported about 35k posts. During the import I noticed a lot of “Parent post XXXXXX doesn’t exist. Skipping”, and when the process ended, the Discourse forum had all (I think) of the topics, but with no replies at all. In practice, it imported just the topics, not the posts (except for the first post of each topic, with the topic text itself).
Also, no avatars were imported, even though I put them in the correct tree under the “import” folder.

The original forum was on phpBB2. It was imported into phpBB3 without problems and shows all its posts there, although I had previously deleted many old messages. Everything looks fine in phpBB3.

Any suggestions? Is there an import script I can check?

1 Like

It’s been like that for a few weeks/months, but don’t worry: the import will complete after you run the import_phpbb3.sh command again.

3 Likes

Thanks a lot! It worked for the posts, but not for the avatars. I’ll keep looking for a fix.

1 Like

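I have read this topic several times and done a few migrations, but every time I get confused by this plugin and run into similar errors. This is my third migration and it is driving me crazy. I think the way this plugin works can be confusing for users doing a migration.

It should be stated clearly somewhere that the plugin should be activated in Discourse after the migration process. If that information is already written somewhere, I probably missed it, and maybe it should be emphasized.

Please correct me if I’m wrong. :slight_smile:

1 Like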

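I’m reporting a fix for a smiley conversion problem when importing from phpBB 3.0.7.

  • Some smilies were not converted correctly to Discourse
    (but not always, for reasons unknown; the same smilies sometimes show up and sometimes don’t. It looked random at first)

  • In addition, some smilies simply disappeared:
    [images: the same post in phpBB and in Discourse]

The problem is the regular expression used in replace_smilies(text).

The faulty regex:

<!-- s(\S+) --><img src="{SMILIES_PATH}/.+?" alt=".*?" title=".*?" /><!-- s?:\S+ -->

Note that the start of the regex does not expect a : character after the s:

<!-- s

but the end of the regex does expect one:

<!-- s?:

(I also wonder why the end of the regex has a ? to match the s character 0 or 1 times, while the start does not.)

I removed this : from the regex, and both of my smiley problems appear to be completely solved.

On my phpBB forum, many smilies do start with :, for example :mrgreen: and :evil:, but some do not, for example 8-) and ;)
The old regex caused wrong smiley captures. For example, several smilies side by side were captured as one.

The fixed regex:

<!-- s(\S+) --><img src="{SMILIES_PATH}/.+?" alt=".*?" title=".*?" /><!-- s?\S+ -->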
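
To see the difference, here is a small sketch that runs both patterns over two adjacent smilies. The markup built by smiley_markup is an assumption about how phpBB 3.0 stores smilies, modelled on the regex above, and the file names and titles are made up.

```ruby
# Both patterns as Ruby regex literals (slashes and braces escaped).
OLD_SMILEY = /<!-- s(\S+) --><img src="\{SMILIES_PATH\}\/.+?" alt=".*?" title=".*?" \/><!-- s?:\S+ -->/
FIXED_SMILEY = /<!-- s(\S+) --><img src="\{SMILIES_PATH\}\/.+?" alt=".*?" title=".*?" \/><!-- s?\S+ -->/

# Hypothetical helper that builds phpBB-style smiley markup for testing.
def smiley_markup(code, file, title)
  %(<!-- s#{code} --><img src="{SMILIES_PATH}/#{file}" alt="#{code}" title="#{title}" /><!-- s#{code} -->)
end

text = [
  smiley_markup("8-)", "icon_cool.gif", "Cool"),
  smiley_markup(":evil:", "icon_evil.gif", "Evil")
].join(" ")

# The old pattern cannot close on "<!-- s8-) -->" (no colon), so it keeps
# scanning and swallows the next smiley as part of the same match.
old_codes = text.scan(OLD_SMILEY).flatten
fixed_codes = text.scan(FIXED_SMILEY).flatten
```

Here old_codes holds a single capture for both smilies, while fixed_codes holds one capture per smiley. A lone smiley without a colon, such as 8-), is not matched by the old pattern at all, which fits the disappearing smilies described above.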

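I did not fix the code directly in the Discourse repository because I’m not used to git, and I’m not sure whether this would affect imports from other phpBB versions. I don’t want to break anything.


Anyway, if anyone runs into the same problem as I did, this is the solution.

4 Likes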

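Here is another fix that may help anyone in a similar situation to mine when migrating from phpBB 3.0.7.

For some reason, post content on my phpBB forum sometimes has several spaces at the start of a line. I suspect some users “like” to mash the space bar carelessly while writing messages, which doesn’t matter in phpBB because the rendered page ignores the extra spaces:

Original phpBB text content: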

Salut tous  :)😊
  
     Alors voilà, le combi n'a pas roulé beaucoup ces derniers temps cause CT pas OK  😈
mais il a fait ces 2000 kms sans broncher 😉
Maintenant le CT est OK . Merci L'Atelier Du Raz 8-')

    Je dois donc changer le joint-spi au bout de 40 000 kms en 10 ans 🙄
C'est un silicone et j'ai vu qu'il y avait des "doubles lèvres " !?
What's About ?

             Je trouve ça un peu limte 😈
Merci tous, fred

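Page as rendered in the browser:


But when importing phpBB → Discourse, these existing spaces were converted into code blocks:

It should display like this:


I fixed it by adding a regex that removes the whitespace at the start of each line:

 text.gsub!(/^[^\S\r\n]+/, "\n")

I added it before process_smilies(text) in this file: discourse/script/import_scripts/phpbb3/support/text_processor.rb at 973c9bdcd3b61abc13a2353240e6389ab691c248 · discourse/discourse · GitHub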
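
Markdown treats a line indented by four or more spaces as a code block, which is why the stray indentation matters after import. A minimal standalone sketch of the substitution follows. The wrapper strip_leading_whitespace is hypothetical; it defaults to the "\n" replacement used above and accepts an alternative.

```ruby
# Matches a run of whitespace at the start of a line, excluding CR and LF,
# so line breaks themselves are left alone.
LEADING_WHITESPACE = /^[^\S\r\n]+/

# Hypothetical non-mutating wrapper around the gsub! call shown above.
def strip_leading_whitespace(text, replacement = "\n")
  text.gsub(LEADING_WHITESPACE, replacement)
end

sample = "Salut tous\n     Alors voilà\nmais il a fait"
with_newline = strip_leading_whitespace(sample)      # replacement from the post
without_gap = strip_leading_whitespace(sample, "")   # alternative: no extra line
```

With the "\n" replacement each formerly indented line is preceded by an extra line break; passing "" removes the indentation without adding one. Which is preferable depends on how the posts should render.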


Another problem I ran into.
In this code (still in text_processor.rb):

    def clean_bbcodes(text)
      # Many phpbb bbcode tags have a hash attached to them. Examples:
      #   [url=https://google.com:1qh1i7ky]click here[/url:1qh1i7ky]
      #   [quote="cybereality":b0wtlzex]Some text.[/quote:b0wtlzex]
      text.gsub!(/:(?:\w{8})\]/, ']')

In my database these hashes are between 5 and 8 characters long, but the regex only removes hashes that are exactly 8 characters long. As a result, my import kept the shorter hashes instead of removing them.
I fixed this by changing the regex to:

text.gsub!(/:(?:\w{5,8})\]/, ']')
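
A quick side-by-side of the two patterns, as a sketch. The 5-character hash in the sample ([b:1qh1i]) is made up for illustration; only the 8-character examples come from the importer's comments.

```ruby
EXACT_HASH = /:(?:\w{8})\]/      # original: exactly 8 characters
RELAXED_HASH = /:(?:\w{5,8})\]/  # changed: 5 to 8 characters

sample = '[quote="cybereality":b0wtlzex]Some text.[/quote:b0wtlzex] ' \
         '[b:1qh1i]bold[/b:1qh1i]'

exact_result = sample.gsub(EXACT_HASH, "]")
relaxed_result = sample.gsub(RELAXED_HASH, "]")
```

The exact pattern leaves [b:1qh1i] untouched, while the relaxed one strips both hashes. One caveat: the wider range also matches any other ":word]" of 5 to 8 word characters in post text, so it may be worth checking a sample of posts after the import.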

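I had one more small problem, still in the same file. The regex that removes [color] BBCode tags expects a hexadecimal value prefixed with #. But [color] also accepts names such as “red”, “blue”, etc. as values. So I modified the original regex:

      # remove color tags
      text.gsub!(/\[\/?color(=#[a-z0-9]*)?\]/i, "")

by adding a ? after the # to make the # optional.
The fixed code:

      # remove color tags
      text.gsub!(/\[\/?color(=#?[a-z0-9]*)?\]/i, "")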
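
The effect of the optional # can be checked with a short sketch (the sample text is made up):

```ruby
HEX_ONLY_COLOR = /\[\/?color(=#[a-z0-9]*)?\]/i    # original pattern
NAMED_OR_HEX_COLOR = /\[\/?color(=#?[a-z0-9]*)?\]/i  # pattern with optional "#"

sample = "[color=#FF0000]hex[/color] and [color=red]named[/color]"

hex_only_result = sample.gsub(HEX_ONLY_COLOR, "")
named_or_hex_result = sample.gsub(NAMED_OR_HEX_COLOR, "")
```

With the original pattern the opening [color=red] tag survives while its closing tag is removed, leaving a dangling tag in the post. The relaxed pattern removes both.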

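I don’t know whether my problems are common in phpBB imports or very specific to my case. If the latter, I hope my explanations here are not unwelcome or redundant. If they are, please tell me, to spare me the embarrassment. :grinning_face_with_smiling_eyes:


Edit: after a migration, is it possible to mark all existing topics as “read” for every existing user?

The goal is to prevent existing users who click an existing (sometimes old) topic after the migration from landing on the first message of topics they had already read before the migration.

Ideally, when an existing user clicks an existing topic, it would open at the last message rather than the first (counting from the end of the migration, of course).

This is only a small quality-of-life issue (and it will fade naturally within a few weeks as users use the forum and read topics), but it was suggested to me.

7 Likes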

Thanks for sharing these fixes!

I’ve made similar regex tweaks in past migrations, so these will be helpful for future phpBB imports.

This topic may help: How to mark imported posts as read - #2 by stuwest

3 Likes