Prune old topics and posts.
Review your categories and determine if all the topics need to be migrated. Do you need all the data, or will the last several years of topics be enough? phpBB will autoprune topics based on age. You can set up/adjust the prune dates for your forums in the phpBB ACP. Go to Forums and select the gear widget for the forum you wish to modify.
Prune old user names.
The Discourse importer will not migrate users who have not posted. Understanding this, you can remove old users who have never posted. Go to the phpBB ACP and select Users and Groups. There is an option to Prune Users. You can set the posts equal to 0 and the last active date. This will allow you to remove all users who have never posted and have not been active on your forums for some time.
Determine the number of anonymous user posts.
The Discourse migration will migrate anonymous users as suspended users or all to a system user. Review how many topics/posts are attributed to an anonymous user.
MySQL will determine the user_id of the anonymous user. For most installations, this is user_id = 1.
SELECT `user_id`, `username` FROM `phpbb_users` where `username_clean` = "anonymous"
To check topics and posts. Update query with your system anonymous user_id
SELECT * FROM `phpbb_posts` where `poster_id` = "1"
SELECT * FROM `phpbb_topics` where `topic_poster` = "1"
Based on these results, you need to determine how you want to configure the system to handle anonymous user post and topics. I only had a few, and I deleted them.
Review for hidden or soft deleted posts.
This was a feature implemented in phpBB 3.1. Validate if you have any hidden or soft-deleted posts. Discourse ignores this field and will import the posts. The posts were deleted by the user/mods for a reason.
SELECT * FROM `phpbb_posts` where `poster_visibility` = "2"
Delete any posts that are listed.
Validate clean emails.
Please run the following MySQL script to validate your emails so they do not have trailing spaces and remove any identified email issues.
SELECT `user_email` FROM `phpbb_users` where CHAR_LENGTH(`user_email`) != CHAR_LENGTH(TRIM(`user_email`)) ORDER BY `user_id` ASC
Validate Duplicate Titles.
The Discourse setting has a flag allowing duplicate titles. The following query will help you understand if you have many duplicate titles. My forum had over 1,000 topics with duplicate titles, so I updated the flag to allow duplicate topic titles.
SELECT `topic_title`, COUNT(*) as count FROM `phpbb_topics` GROUP BY `topic_title` HAVING COUNT(*) > 1 ORDER BY `count` DESC
Find Orphan Topics.
Every topic should have a post, but sometimes things break. This will list all topics without posts.
SELECT * FROM `phpbb_topics` WHERE `topic_id` NOT IN (Select topic_id from phpbb_posts) ORDER BY `topic_approved` DESC
Clean up any orphan topics.
Find Orphan Posts.
Every post should belong to a topic. This will list posts that do not have a valid topic_id.
SELECT * FROM `phpbb_posts` WHERE `topic_id` NOT IN (Select topic_id from phpbb_topics) ORDER BY `post_id` DESC
Clean up any orphan posts.
Validate first topic post is a valid post
The first post of a topic can be deleted, resulting in a topic with an incorrect first post_id. This validates the topic_first_post_id is valid.
SELECT * FROM `phpbb_topics` WHERE `topic_first_post_id` NOT IN (Select post_id from phpbb_posts) ORDER BY `topic_approved` DESC
Clean up first topic post_id
I will add additional items as I find them preparing for an upcoming migration. If you have any recommended checks, please add them below.