i’m sharing my findings as i make my way through a migration from phpBB3 to discourse.
solutions include:
- postgres source db
- emoji
- tweaks to the importer (fixing quote bug, improved bbcode support (including youtube), attachment comments)
- soft deleted posts being imported as normal posts
i’m super picky, so if you follow along, you should be able to get a pretty good result.
i’m considering moving to discourse from phpbb (installed version history: 3.2.1 - 3.2.8).
problem is, i use postgresql for the db. suggestions? i haven’t tried it but assume it’s not supported yet based on the OP.
did the new import script ever come out? i see that was a bit over a year ago.
is that importer the one that will be deprecated? what’s the status of the ‘new’ one? i’m trying to decide how much effort i put into this—will there be upstream value?
I have not dealt with the postgresql to mysql issue myself, but purely from a perspective of migrating from phpBB to Discourse, my opinion is: do whatever it takes to accomplish the migration.
I’ve migrated two phpBB forums to Discourse, and although there were the usual grumblings from a small number of users that one gets with any change, the benefits of Discourse are well worth it! Not only is Discourse easier to maintain and administer, but the built-in user engagement, image-handling, user-customization, and legibility of Discourse are only a few of the features that are so far ahead of phpBB that there is no comparison. You also get much better support with Discourse.
I’m not an expert, but a quick search makes it look like you could migrate from postgresql to mysql by doing a schema dump, changing the data types in the schema statements to match those used by mysql, using the modified schema to create the tables in a mysql database, and then doing a table-by-table CSV export and import.
Once you have the mysql database, you could use the regular phpBB migration script and have all your attachments.
I didn’t mean to imply that it would be easy, just that it’s worth the effort, even possible to do manually, and there do seem to be plenty of resources out there to accomplish it, both with automated tools and manually.
given that i hate ruby, and the existing importer is going to be deprecated in favor of bulk importer (which is not good enough yet for my purposes), i am going ahead with the postgres → mysql migration strategy.
i’m getting somewhere with mysql workbench’s migration wizard. i’ll write up a little guide if it succeeds, but it looks promising so far.
basically, mysql workbench fails on importing unicode. will try with mariadb tomorrow and see if i can set default db encoding or something before importing.
It would have been great if MySQL Workbench had worked. I saw a bunch of error reports about it when I looked at options, so I didn’t recommend it.
I mentioned the schema dump and CSV export/import method once before, and I’ll only mention it once more with a slight modification to make it much easier and then shut up.
If you want to go a GUI route:
1. Get your hands on a structure-only export of all the tables from a working copy of the same version of phpBB3 you’re using that is running MySQL/MariaDB. It would literally take someone two minutes to create one from phpMyAdmin, which is available on most web hosts that run MySQL/MariaDB. (Alternatively, you might be able to get the CREATE TABLE statements you need from the phpBB3 install scripts if you can’t get someone to do the structure-only export for you. You could even use an inexpensive web hosting account to do a clean install of your version of phpBB3 into a MySQL/MariaDB environment, then delete the data in the tables it creates to end up with an empty MySQL/MariaDB database.)
2. Find access to a web host with MySQL/MariaDB and phpMyAdmin, create the database, and import the structure from the structure-only export you created in step 1 using phpMyAdmin.
3. Create a CSV export for each table from your PostgreSQL database and do a CSV import into the corresponding table of your new MySQL/MariaDB database using phpMyAdmin.
I think that will prevent any issues with charset and encoding, and you won’t have to figure out all the corresponding data types and field sizes that would be required if you tried to manually convert the schema dump from PostgreSQL.
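If you’d rather not click through the export for every table by hand, the PostgreSQL side of step 3 can be scripted. Here is a minimal sketch in Python that only builds the `psql` commands and prints them; it does not run anything. The table list and the database name `phpbb` are placeholders I made up, so adjust them to your install.

```python
# Build one psql "\copy" command per table; nothing is executed here.
# TABLES and DB_NAME are illustrative placeholders, not taken from a real install.
import shlex

TABLES = ["phpbb_users", "phpbb_topics", "phpbb_posts"]
DB_NAME = "phpbb"


def export_command(table: str, db_name: str = DB_NAME) -> str:
    """Return a shell command that dumps `table` to <table>.csv with a header row."""
    copy = f"\\copy {table} TO '{table}.csv' WITH (FORMAT csv, HEADER)"
    return f"psql -d {shlex.quote(db_name)} -c {shlex.quote(copy)}"


if __name__ == "__main__":
    for table in TABLES:
        print(export_command(table))
```

Each printed line can be pasted into a shell on the machine that can reach the PostgreSQL server. The resulting CSV files are what you would then import table by table in phpMyAdmin.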
I can give you a structure-only SQL statement, but the only DB I still have is from phpBB3 3.3.8, so it may add more problems if you’re using v. 3.2.x.
haha. yeah, it took some non-obvious tricks to even get it working as far as i did. i think it still could work, but thanks for giving me another avenue to explore.
phpbb_smilies
├── phpbb_smilies.xlsx # master reference to help you decide
├── import.yml # config file with both options
├── orig_phpbb_smilies.csv # original phpbb data
├── orig_files # original phpbb emoji files
│   ├── icon_arrow.gif
│   └── ...
└── new_files # emoji files renamed to match the `new_shortcode` column in the spreadsheet
    ├── phpbb_arrow.gif
    └── ...
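if you want to redo the renaming yourself instead of using `new_files`, it boils down to something like this. sketch only: it assumes you saved the spreadsheet as a csv with `smiley_url` and `new_shortcode` columns. `smiley_url` is the column name from the phpbb_smilies table, and both it and the file name `mapping.csv` are my assumptions about how you'd export it.

```python
# Copy each original smiley to a file named after its new shortcode.
# Column names "smiley_url" and "new_shortcode" are assumptions about the CSV export.
import csv
import shutil
from pathlib import Path


def rename_smilies(mapping_csv: Path, src_dir: Path, dst_dir: Path) -> list[str]:
    """Copy src_dir/<smiley_url> to dst_dir/<new_shortcode><ext>; return the new file names."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    created = []
    with mapping_csv.open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            source = src_dir / row["smiley_url"]
            if not source.is_file():
                continue  # skip rows whose image is missing
            target = dst_dir / (row["new_shortcode"] + source.suffix)
            shutil.copyfile(source, target)
            created.append(target.name)
    return created
```

called as `rename_smilies(Path("mapping.csv"), Path("orig_files"), Path("new_files"))`, it copies rather than moves, so the original files stay untouched.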
has anyone gotten quote attribution to work correctly?
here’s an example of an original post from phpbb containing a quote (username and content changed, IDs etc are the same)
[quote=someuser post_id=46649 time=1677556325 user_id=48]
foo
[/quote]
bar
here’s how it looks in discourse:
here’s the raw post migrated to discourse:
[quote=", post:37, topic:1893"]
foo
[/quote]
bar
the migrated postid and topicid are correct, but the username is missing. this causes it to be noninteractive. when username is an empty string, the quote doesn’t expand, and you can’t click it to follow the reference to the original post.
can i hope for a better experience without making improvements to the importer? i.e. am i just doing something wrong?
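for reference, here's roughly what i'd expect the conversion to do, sketched in python rather than the importer's ruby. the two lookup dicts are hypothetical stand-ins for whatever the importer uses to map phpbb ids to discourse ones, and the function name is made up.

```python
import re

# phpBB 3.2-style opening tag: [quote=name post_id=... time=... user_id=...]
QUOTE_OPEN = re.compile(
    r"\[quote=(?P<name>[^\]]*?) post_id=(?P<post_id>\d+) time=\d+ user_id=(?P<user_id>\d+)\]"
)


def convert_quotes(raw, post_map, user_map):
    """Rewrite phpBB quote tags as Discourse quotes with full attribution.

    post_map: phpBB post_id -> (discourse post number, discourse topic id)
    user_map: phpBB user_id -> discourse username
    Both dicts are hypothetical stand-ins for the importer's lookups.
    """

    def replace(match):
        post_id = int(match["post_id"])
        user_id = int(match["user_id"])
        # Prefer the migrated username; fall back to the name written in the tag.
        username = user_map.get(user_id, match["name"].strip('"'))
        if post_id not in post_map:
            return f'[quote="{username}"]'
        post_number, topic_id = post_map[post_id]
        return f'[quote="{username}, post:{post_number}, topic:{topic_id}"]'

    return QUOTE_OPEN.sub(replace, raw)
```

with `post_map = {46649: (37, 1893)}` and `user_map = {48: "someuser"}`, the example post above should come out with `[quote="someuser, post:37, topic:1893"]` as its opening tag, i.e. the same thing the importer produced but with the username filled in.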
i haven’t figured out the last problem yet, but here’s another question.
i notice that after the import finishes and you start the app container, my still existing phpbb server gets hammered really hard. i think this is during the sidekiq post processing phase.
phpbb still exists at www.example.com, and discourse is at dc.example.com
i’m trying to understand what’s actually happening, what settings make sense during this test migration, and what settings will make sense for the final migration. and if i need to have phpbb running for that sidekiq post processing. asking because i have no idea what happens in the post processing.
some possibly relevant settings in my current settings.yml:
import:
  # Set this if you import multiple phpBB forums into a single Discourse forum.
  site_name:
  site_prefix:
    # this is needed for rewriting internal links in posts
    original: example.com # without http(s)://
    new: https://dc.example.com # with http:// or https://
if there’s other stuff you need to look at, please lmk.
another thing i’m not clear on is www subdomain. i currently redirect to www with nginx for phpbb. so in the example above, does it matter if i put original: example.com vs original: www.example.com? similar question for new when i do the final migration. my users would actually access discourse from www.example.com, but idk what the best practice is.
i’m rather picky about my forum’s migration, so i am continuing to improve the importer. not sure if i’ll bother with making PRs since the importer is deprecated, and some of my fixes are semi-specific to my forum, but this branch will have all of my fixes combined in case it’s useful to someone:
i fixed the quotes issue, added support for some commonly-added bbcodes, made the youtube link parsing less broken, and support the mentions/simplementions extension. i still have more stuff to improve like adding support for multiple site prefixes (main use case is when you have links on your forum to example.com and www.example.com).
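the multi-prefix idea, sketched in python (again, not the importer's ruby). it only covers recognising a link as internal under any of several old hostnames and swapping the host. the real importer also has to map viewtopic.php ids to discourse topics, which isn't shown here. function names and arguments are made up for illustration.

```python
import re


def build_internal_link_pattern(prefixes):
    """Match http(s)://<prefix> for any of the given old site prefixes."""
    # Longest first so a prefix like "example.com/forum" wins over "example.com".
    alternatives = "|".join(
        re.escape(p) for p in sorted(prefixes, key=len, reverse=True)
    )
    # The lookahead stops "example.com" from matching inside "example.com.evil.org".
    return re.compile(rf"https?://(?:{alternatives})(?=[/?#\s]|$)", re.IGNORECASE)


def rewrite_hosts(text, prefixes, new_base):
    """Replace every old prefix in `text` with new_base (path and query kept as-is)."""
    base = new_base.rstrip("/")
    return build_internal_link_pattern(prefixes).sub(lambda _m: base, text)
```

so `rewrite_hosts(post, ["example.com", "www.example.com"], "https://dc.example.com")` treats links to either old hostname as internal, and leaves links that already point at the new site alone.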
even though i’m supporting some non-vanilla stuff, it shouldn’t be a problem running it on a standard phpBB forum without extensions. i recommend just using mine in any case.
easiest way to use it is to pull my branch somewhere and clobber the import script dir inside of the container with a bind mount.
i.e. pull my changes somewhere:
git clone --filter=blob:none --no-checkout https://github.com/ftc2/discourse.git discourse_dev
cd discourse_dev
git sparse-checkout set --cone
git switch phpbb_import
git sparse-checkout set script/import_scripts
then add this to your import.yml container config:
then rebuild the import container. after you rebuild, you will probably want to do a reset on where you pulled my repo because the build process will overwrite my files, lol.
cd /path/to/discourse_dev
git reset --hard HEAD
chown -R 1000:1000 .