phpBB conversion and general (Linux / Docker / Ruby / GEMs) help


#1

I’m a Windows guy with little Linux knowledge :blush:

I installed Ubuntu Server 14.04 on VMware and installed the Discourse Docker image as per the instructions, including the email setup. I created an admin account and I can log in to my Discourse install.

I have a separate (Windows) server with a copy of a phpBB database (only the database, no phpBB install) on MySQL. I don’t need the existing attachment or avatar pictures for now.

I was able to go ‘into’ the Docker container and find the pre-installed phpbb3.rb script, but I don’t know how to edit it there or how to overwrite it with an edited copy from my Windows machine (with PuTTY/SSH). Nano (as per the install instructions) is not available inside the container.

I tried to run the phpbb3.rb script with the default values (just to see if I could run it) but it says the required “mysql2” gem cannot be found. I found a command (gem install mysql2) to install the mysql2 gem but that also failed:
ERROR: Error installing mysql2: ERROR: Failed to build gem native extension.

I have no idea what I am doing really :flushed: and would really appreciate some beginner instructions for the problems above.


(Gerhard Schlager) #2

Importing from within the Docker container is kind of complicated right now. But you can give the following a try. It worked last year…


Otherwise I’d recommend installing a development environment and running the importer there.

@sam I was wondering if we could add an import mode to the launcher or somehow configure the container so that imports work out of the box. What I have in mind:

  • install all needed dependencies
  • mount a shared directory on the host where the user can put files that are needed during the import (e.g. attachments)
  • don’t start unicorn or Sidekiq
  • boot directly into the shell, change the user and change into the script/import_scripts directory
  • all the user has to do is run the import script and watch its progress :smile:
  • after the import the user rebuilds the container and starts Discourse normally

Would you accept a PR for that? Do you have any hints on where to start?


(Jens Maier) #3

Looks like the phpbb importer no longer uses the bbcode gem, and modifying the Gemfile is no longer necessary either. Just install the mysql2 gem and its dependencies:

/var/discourse/launcher enter app
apt-get -y install libmysqlclient-dev nano
gem install mysql2

Then edit the importer and run it:

su - discourse
cd /var/www/discourse/script/import_scripts
vi phpbb3.rb # or nano phpbb3.rb
RAILS_ENV=production ruby phpbb3.rb

(Vim is a bit involved if you’re a Linux beginner. You can install nano with apt-get install nano while being root, i.e. before you su - discourse.)


(Kane York) #4

Keep in mind that these changes will be wiped when you rebuild the container, which is actually what you want here. Install extra stuff, run the import, rebuild, the extra stuff is gone.


(Sam Saffron) #5

Sure, but you could probably just use a template for this, no need to muck with launcher, perhaps a template per importer?


#6

I followed these instructions on my original Docker installation and the script now starts without the mysql2 error, but now I get this:
URGENT: FATAL: database "discourse_development" does not exist
Run $ bin/rake db:create db:migrate to create your database
Failed to initialize site default

bin/rake db:create db:migrate doesn’t work

and just rake db:create db:migrate gives

discourse is not in the sudoers file. This incident will be reported. :fearful:


(Jens Maier) #7

Whoops, I forgot that the importer must be run as RAILS_ENV=production ruby phpbb3.rb. :grin:
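
For anyone wondering why the error mentioned discourse_development: a Rails app falls back to the development environment when RAILS_ENV is not set. Here is a rough sketch of that fallback in plain Ruby. The helper name effective_env is made up for illustration and is not part of Rails or Discourse.

```ruby
# Hypothetical helper (not part of Rails or Discourse) that mirrors how a
# Rails app resolves its environment name from the process environment.
def effective_env(env = ENV)
  %w[RAILS_ENV RACK_ENV].each do |key|
    value = env[key]
    # Use the first variable that is set to a non-blank value.
    return value unless value.nil? || value.strip.empty?
  end
  'development' # fallback when neither variable is set
end

puts effective_env({})                               # nothing set
puts effective_env({ 'RAILS_ENV' => 'production' })  # as in the working command
```

With nothing set, the importer ends up looking for the development database, which does not exist in the Docker install.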


#8

It’s running now :smile:

@neil My database is from a multi-host environment so all tables begin with sitename_ instead of phpbb_. I’ve now renamed my tables in the source database, but is there an easier way of changing it in one place in the script? It all looks hard coded as far as I can tell.


(Gerhard Schlager) #9

Great! I’ll look into this as soon as I’ve finished my work on the phpBB importer.

Search & replace in the script would have worked. :wink:
Anyway, that’s one of the things I’ve made configurable in the advanced phpBB importer I’m working on right now. I’ll send a PR as soon as I find the time to finish some refactoring.
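
Roughly what "configurable" could look like, as a sketch only: the constant TABLE_PREFIX and the helpers table and users_query are hypothetical names, not what the importer actually uses. The column names are the standard ones from the phpBB users table.

```ruby
# Hypothetical setting; the stock script hard-codes the phpbb_ prefix.
TABLE_PREFIX = 'sitename_'

# Builds a full table name from the configured prefix.
def table(name, prefix = TABLE_PREFIX)
  "#{prefix}#{name}"
end

# Example query that uses the prefix instead of a hard-coded table name.
def users_query(prefix = TABLE_PREFIX)
  "SELECT user_id, username, user_email FROM #{table('users', prefix)} ORDER BY user_id"
end

puts users_query
```

Changing the prefix in one place then changes every query that goes through the helper.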

It’s still using it if the importer is launched with ruby phpbb3.rb bbcode-to-md


(Jens Maier) #10

Oh, ok, I didn’t notice that this was moved to base.rb. :grin:


#11

Two more questions / observations after a first trial conversion:

  • do the settings in /admin (username length, min post length, min topic length, title prettify etc.) have any influence on the conversion, or are these settings all ignored during the conversion?
  • all my users’ ‘display names’ seem to be derived from the email address and not from the username in the original database.

(Gerhard Schlager) #12

It depends. Most of the settings can have an influence. Only the following settings are temporarily changed during the import:

I know. I’m going to set the full name during the import to a blank string since there is no “display name” in phpBB anyway. You can fix this yourself by adding name: '', (currently after line 76 in phpbb3.rb).


#13

I would like to have the Discourse name (full name) set to the original phpBB username, as a lot of my users use either their full real name or a complex nickname as login name, including middle names, initials, special characters etc.

As the special characters and spaces etc get dropped when converting the phpBB name to the Discourse username I would like to keep the original phpBB username as name.

Another quick question. Can I install Discourse 1.2.1 instead of the latest beta?
I’m following this guide:


(Gerhard Schlager) #14

In that case use the following: name: user['username']
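
To illustrate the idea, here is a small stand-alone sketch. The method names map_user and simplified_username are invented for this example, and the clean-up regex is a simplification, not the rule Discourse really applies to usernames.

```ruby
# Simplified stand-in for username clean-up; Discourse's real rules differ.
def simplified_username(raw)
  raw.gsub(/[^A-Za-z0-9_]/, '')
end

# Hypothetical mapping from a phpBB user row to the attributes passed on
# to the importer. The untouched phpBB username is kept as the full name.
def map_user(row)
  {
    id: row['user_id'],
    email: row['user_email'],
    username: simplified_username(row['username']),
    name: row['username']
  }
end

row = { 'user_id' => 42, 'user_email' => 'jan@example.com', 'username' => 'Jan P. van der Berg' }
p map_user(row)
```

The username loses its spaces and punctuation, while name keeps the original string.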


#15

During the import, do users that have the exact same email address get skipped without warning?
I have some specific accounts (backup administrator, special moderator account, normal user account for an admin) that all share the same email account and these seem to get skipped.
Is there anywhere where I could hack the import script to allow this?
All normal user accounts have a guaranteed unique email address except for the few special cases mentioned.

Edit: I have now fixed this in my source database, but skipping users without any warning seems like a bug to me.
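
For anyone else hitting this, a check along these lines before the import would at least surface the affected accounts. It is only a sketch: duplicate_emails is a made-up helper, and the rows are assumed to be hashes with the phpBB columns user_email and username.

```ruby
# Groups user rows by normalized email and returns only the addresses
# that are shared by more than one account, with the usernames involved.
def duplicate_emails(rows)
  rows
    .group_by { |row| row['user_email'].to_s.strip.downcase }
    .select { |_email, group| group.size > 1 }
    .transform_values { |group| group.map { |row| row['username'] } }
end

rows = [
  { 'username' => 'admin',        'user_email' => 'admin@example.com' },
  { 'username' => 'backup_admin', 'user_email' => 'Admin@example.com' },
  { 'username' => 'regular_user', 'user_email' => 'user@example.com' }
]

duplicate_emails(rows).each do |email, usernames|
  warn "Duplicate email #{email}: #{usernames.join(', ')}"
end
```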


#16

Another issue.
I seem to have an anomaly in my database that causes almost all topics to hit this part in the converter script:

else
  puts "Parent post #{m['first_post_id']} doesn't exist. Skipping #{m["id"]}: #{m["title"][0..40]}"

The last post from the topic gets created as 1st post with no following posts.
If I then run the script again all other posts get added to the topics, but the posts in the topics are out of order with the last post being used as first post and then the following posts.

This happens to all posts in my database that were previously converted from another forum to phpBB. All newer posts that were posted directly in phpBB after the previous conversion seem to be fine.
I can’t figure out what is wrong in my database that is causing this. The old posts are correctly ordered where the 1st post of a topic is always 1st in the database.

Edit: Okay, my database is broken (at least for the previously converted posts). In phpbb_topics the topic_first_post_id is actually set to the last post id for that topic.


(Gerhard Schlager) #17

Yeah, unfortunately conversions are rarely lossless. Even the upgrade from phpBB2 to phpBB3 damaged the private messages. All information about related messages got lost.

You could try the following SQL for updating the topic_first_post_id. Please create a backup before executing it (just in case ;-)).

UPDATE phpbb_topics t
SET t.topic_first_post_id = COALESCE((
    SELECT p.post_id
    FROM phpbb_posts p
    WHERE p.topic_id = CASE WHEN t.topic_moved_id = 0 THEN t.topic_id ELSE t.topic_moved_id END
    ORDER BY p.post_time, p.post_id
    LIMIT 1
), t.topic_first_post_id);
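
The intent of the SQL is: for every topic, take the post with the earliest post_time as the first post. In Ruby terms, with in-memory data, that would be something like the sketch below. The helper first_post_ids and the sample rows are made up for illustration; ties on post_time are broken by the lower post_id, which is an assumption.

```ruby
# Returns { topic_id => post_id of the earliest post in that topic }.
def first_post_ids(posts)
  posts
    .group_by { |post| post[:topic_id] }
    .transform_values do |topic_posts|
      topic_posts.min_by { |post| [post[:post_time], post[:post_id]] }[:post_id]
    end
end

posts = [
  { post_id: 10, topic_id: 1, post_time: 300 },
  { post_id: 11, topic_id: 1, post_time: 100 },
  { post_id: 12, topic_id: 1, post_time: 200 },
  { post_id: 20, topic_id: 2, post_time: 50 }
]

p first_post_ids(posts)
```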

(Gerhard Schlager) #18

Just out of curiosity, from which forum did you convert to phpBB?
And please tell us whether the SQL helped. If it works, I’ll add an optional fix to the importer. I have one for the problem with private messages from phpBB2 as well.


#19

Thank you so much for your help @gerhard

The original forum was a customized Typo3/mm_forum based forum.
I had to convert it from capturing RSS feeds as I was not allowed to have the original database because of local privacy laws (mainly because of the email addresses and passwords stored in the database). I did have permission and support from the original admins though.

I used these two scripts to convert the users and posts:
https://www.phpbb.com/community/viewtopic.php?f=65&t=1494875
https://www.phpbb.com/community/viewtopic.php?f=65&t=2115251
Both scripts have been abandoned by their original developers and have bugs that need to be fixed (that I wasn’t capable of). I fixed some obvious issues with some simple SQL hacks to the phpBB database but apparently the issue I’ve encountered now is also a result of the buggy phpBB conversion script.

Your SQL fix for my specific problem seems to be working beautifully, at least for the old posts in my database. The conversion is still running right now, so I need to see what happens to the newer posts that were already correct in the database.

Edit: the later topics that already have the correct value for topic_first_post_id also display correctly :smiley:


#20

How can I add bbcode-to-md to my Docker?
I tried the instructions at phpBB 3 Importer (old) but I get either a permission error (no access to the gem folder) or errors about an incorrect JSON version (asking for version 1.8.0).