Importing from phpBB3

I guess I shouldn’t have hidden this question within the other post.

@gerhard why are different bbcodes handled differently at the moment? Did this happen by mistake, or is it intentional?


What posts-per-minute import rate did you guys get when trying on bigger machines? I am currently running the import on an 8-core Xeon and the machine seems to be quite bored, while the rate doesn’t go significantly above 1000 posts/min.

It’s on a fast NVMe SSD, so I suppose I/O is not the limit. Any ideas on how to improve the speed further? I am trying to find a few percent to reduce the overall import time, maybe by an hour or two (currently at roughly 14 hours).

Hello, I have a small problem… I have imported and converted 100K-ish posts, and then I discovered I had forgotten to set the redirect links properly (in addition to not mapping the smilies to emojis in settings.yml…). Is there a way to re-import everything so it parses the links correctly, without changing anything else I have already set up?
If I rerun the script, will it just think everything has already been imported and do nothing?

From what I have seen this is not possible unfortunately. It will do what you said - ignore already imported posts.

Is there a way to make it temporarily forget the imported posts and reimport them?
(Not permanently, since I will want to add posts later.)

Probably not intentional; the BBCode to Markdown conversion isn’t perfect yet.

That sounds about right, unless you can find a CPU with higher single core speed.

Do you have a backup from the time before you started the import? You could restore it and start the import again. Otherwise you are out of luck and will need to start from scratch.

That is unfortunate. Does this go for the smilies as well?

By “start from scratch”, you mean reinstalling discourse?

I’m sorry to say, but it would be really hard and error-prone to try to correct the smilies after the import. I can’t help you with that if you want to go down that route.

Yes, essentially. You can skip a couple of steps by deleting all the data and rebuilding the container.

./launcher stop app
rm -rf /var/discourse/shared/
./launcher rebuild app

Thank you! :slight_smile: You mean with those two commands the forum will be import ready? No other steps to take? (What about themes, awards, groups and everything else that was set up? Is there a way to backup those (ok for themes I know how to save and import them, but for other settings.)) (ok too many parentheses)

I’m basically just glad I found this out now instead of in the middle of the move when everything is down, so there is time to fix it. But I would like to do as little rework as possible…

Yeah, without a backup you have to redo those things manually.


If it’s somehow not clear, that wipes the entire database and you start from scratch.

There are a couple ways to save and restore settings if you search.

But as suggested, you really want to either do those after the import or do them all and make a backup that you use as a starting place before the import.


Hi Helmi,

A friend who is working on a custom BBCode plugin discovered a bug in the importer that caused color tags to be rendered incorrectly.

In his words:

Color tags are being deliberately stripped from posts. The converter bug is NOT that it’s only removing closing tags. The “bug” is that it’s NOT removing color tags for named colors.

Evidence:
https://github.com/discourse/discourse/blob/master/script/import_scripts/phpbb3/support/text_processor.rb

    def clean_bbcodes(text)
      # Many phpbb bbcode tags have a hash attached to them. Examples:
      #   [url=https://google.com:1qh1i7ky]click here[/url:1qh1i7ky]
      #   [quote="cybereality":b0wtlzex]Some text.[/quote:b0wtlzex]
      text.gsub!(/:(?:\w{8})\]/, ']')

      # remove color tags
      text.gsub!(/\[\/?color(=#[a-z0-9]*)?\]/i, "")
    end

That last part, text.gsub!(/\[\/?color(=#[a-z0-9]*)?\]/i, ""), is a regex that matches color tags with hex codes (the ones starting with #) and closing color tags, but not color tags with named colors (like color=green, etc).
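To see that behavior concretely, here is a small Ruby check that applies the same pattern to two sample posts. The sample strings and the helper name `strip_color_tags` are made up for illustration; only the regex comes from the importer code quoted above.

```ruby
# Same pattern as in the quoted clean_bbcodes method.
COLOR_TAG = /\[\/?color(=#[a-z0-9]*)?\]/i

# Hypothetical helper for this demo; returns a copy with matching tags removed.
def strip_color_tags(text)
  text.gsub(COLOR_TAG, "")
end

hex   = strip_color_tags("[color=#FF0000]red[/color]")
named = strip_color_tags("[color=green]green[/color]")

puts hex    # opening and closing tag are both removed
puts named  # the opening tag survives, only the closing tag is removed
```

The second case is what leaves a dangling opening tag in imported posts.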

Solution: Remove the line.

text.gsub!(/\[\/?color(=#[a-z0-9]*)?\]/i, "")

We don’t actually want it. We want to keep that color data, no matter how we decide to display it. (Well, for our forum that is the solution. For Discourse you will probably want to make it remove all color tags, including the named color ones.)
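If the goal were instead to strip every color tag, named colors included, one possible pattern is sketched below. This is only an assumption about what such a fix could look like, not the change that was made in the importer; `ALL_COLOR_TAGS` and `strip_all_color_tags` are hypothetical names.

```ruby
# Hypothetical pattern: accept any attribute value after "color=",
# so "#fff", "green" and quoted values are all covered.
ALL_COLOR_TAGS = /\[\/?color(=[^\]]*)?\]/i

# Returns a copy of the text without any opening or closing color tags.
def strip_all_color_tags(text)
  text.gsub(ALL_COLOR_TAGS, "")
end

puts strip_all_color_tags("[color=green]a[/color] [color=#fff]b[/color]")
```

Tags that merely start with the same letters, such as a made-up [colorful] tag, are left alone because the pattern requires either `=` or `]` right after `color`.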


Hi Gerhard, so I did as you instructed: first I deleted everything, then redid a lot of work, then made a backup, and then ran an import, only to find halfway through that I had done something wrong. So I quit the import and needed to start from scratch again (since it had already imported data).

So, I deleted everything again.
After I rebuilt the app so I could restore my backup and then do the import, the site refuses to connect.

When I run discourse doctor it says: Discourse version at [domain] NOT FOUND. at localhost not found. No other errors.

What do I need to do?

You didn’t need to rebuild, only to restore the backup.

My guess is that you need to

./launcher start import

Discourse-doctor doesn’t know about the import container.

You might need to rebuild app and import, as they should be the same version.

I have rebuilt import, rebuilt app, and still connection refused. :frowning:

You need to add the database dump and files from phpBB again. You probably deleted them during the reset.


Yes, I have deleted those. But won’t I need to restore my stuff first? I had set up a lot of things before the import so I wouldn’t have to do them again if I had to redo an import. Wouldn’t it now just import into a vanilla Discourse instance?

And then you’ll restore your backup that should be in /var/discourse/shared/backups/default

aaand here I am again, sorry but I now have another issue…

this time it is: smilies! I have put each smiley in the settings.yml like this:

happy: [':D', ':-D']
woo: ':woo:'
etc.

I have put the smiley images in the /var/discourse/shared/standalone/import/images/smilies dir.
When importing I didn’t see any errors about smilies not being found or anything like that.

Still, the smilies haven’t been mapped to emoji, and in the posts they have been converted into images.

what have I done wrong?

thank you once again for your help and insights!

edit: :exploding_head: of course… I have to map them from phpbb3, not phpbb2…
I think that solves it, but I still have to test whether this was the problem.

edit2: I have now done a new import with a new phpbb3 database dump that had the smilies inserted. Still no smilies. They have been converted to images and they are not in the emoji set. What can be the issue?

I have finally managed to get the smilies mapped to emoji.
Since it was a lot of stumbling for me with having over 150 custom smilies which all had different image-names and different smiley codes, here is a quick extended how-to for others like me.

understanding what the importer does with smilies
What I thought would happen is that when you add the smiley codes to the importer settings and put the images in the designated image folder, they would automatically also be added to the emoji set. That doesn’t happen. So you will need to manually upload your smiley images as custom emoji.
When importing them, they need to have the name of the emoji code you will actually be using. So for example, if you had a smiley whose image filename is “cheery_icon0.gif” and that is displayed when users type :cheer:, you will have to rename that image to cheer.gif and upload it to the emojis (in admin cp > customize > emojis).
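With many files, the renaming step can be scripted. Below is a minimal Ruby sketch under stated assumptions: the `RENAMES` table, the function name and the directory arguments are all hypothetical, and it copies rather than renames so the originals stay untouched. Uploading the results in the admin panel is still a manual step.

```ruby
require "fileutils"

# Hypothetical mapping: original phpBB image filename => emoji name to use in Discourse.
RENAMES = {
  "cheery_icon0.gif"  => "cheer",
  "_1partyguyhat.gif" => "party_hat",
}

# Copies each listed image from src_dir to dest_dir under its emoji name,
# keeping the original file extension. Returns the paths that were written.
def copy_with_emoji_names(src_dir, dest_dir, renames)
  FileUtils.mkdir_p(dest_dir)
  written = renames.map do |old_name, emoji_name|
    source = File.join(src_dir, old_name)
    next unless File.exist?(source) # skip images that are not present
    target = File.join(dest_dir, emoji_name + File.extname(old_name))
    FileUtils.cp(source, target)
    target
  end
  written.compact
end
```

Calling `copy_with_emoji_names("smilies", "emoji_upload", RENAMES)` would then leave files such as cheer.gif in the (made-up) emoji_upload folder, ready to be uploaded.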

Now, it gets even more interesting when you have a bunch of smilies that in phpbb displayed with things like

<;-)
%-)
:3 

So for instance I would have a smiley

 code : <:-) and was named "_1partyguyhat.gif"

I first had to decide what the new code would be in Discourse, since one can’t name files ‘<:-)’. Then I renamed the gif to that code, and then added the corresponding mapping to the settings file.

so for me for that particular smiley that was:

party_hat: '<:-)'

What then happens when importing is that all instances in a post where someone has typed <:-) are translated into :party_hat:.
Discourse will then use the party_hat emoji to render the smiley, provided it is available in the emojis.
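Conceptually, the translation works like the Ruby sketch below. This is not the importer's actual code, and the real importer may process posts quite differently; the `SMILEY_TO_EMOJI` table and `replace_smilies` are hypothetical names used only to illustrate the idea of a code-to-emoji mapping.

```ruby
# Hypothetical mapping from phpBB smiley code to Discourse emoji name,
# mirroring the kind of entries that go into settings.yml.
SMILEY_TO_EMOJI = {
  "<:-)" => "party_hat",
  ":D"   => "happy",
  ":-D"  => "happy",
}

# Replaces every known smiley code in the text with its :emoji_name: form.
def replace_smilies(text, mapping)
  # Try longer codes first so a short code cannot match inside a longer one.
  codes = mapping.keys.sort_by { |code| -code.length }
  pattern = Regexp.union(codes) # escapes characters like ( ) and <
  text.gsub(pattern) { |code| ":#{mapping[code]}:" }
end

puts replace_smilies("Party time <:-) see you :D", SMILEY_TO_EMOJI)
```

A code that is missing from the table is simply left in the text, which matches the advice of checking a test post for leftovers.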

tips for when you have 120 smilies to convert :wink:

  • create a test post in your phpbb3 instance with ALL the smilies in it
  • when importing, you can quickly scan that post and see whether any text is left, or with the [edit] option, see whether any have been translated into images instead of just the emoji-code. (Believe me that can happen when you forget a : or mistake a ; for a : )

troubleshooting settings.yml

  • you need to uncomment the emojis line. (I had totally overlooked that)
  • format should be 4 spaces in front of a smiley code
  • you need to add all codes, even when you already had a smiley coded :cheer: and you put in a cheer.gif in the emojis. If you don’t, it will still be translated into an imagefile instead of an emoji-code.
  • if you happen to have a smiley which is coded :yes: or :no:, you need to quote those keys, because otherwise YAML will parse them into the boolean values true or false. In that case you would need to do something like this: "yes": ':yes:' to code that particular emoji.
  • edit: oh and another fun fact I forgot: if you use any - in emojis like ‘party-hat’ , when you upload the image to discourse it will convert that into an underscore so it will be named ‘party_hat’. So don’t use ‘-’ only ‘_’.
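The yes/no pitfall is easy to reproduce with Ruby's built-in YAML parser, assuming that is what reads settings.yml (the importer is a Ruby script). The two one-line documents below are made-up examples, not lines from a real settings file.

```ruby
require "yaml"

# Bare key: YAML treats yes/no as booleans, so the key is no longer a string.
unquoted = YAML.safe_load("yes: ':yes:'")

# Quoted key: stays the string "yes", which is what the emoji mapping needs.
quoted = YAML.safe_load("\"yes\": ':yes:'")

p unquoted.keys
p quoted.keys
```

Running this shows a boolean key for the first document and a string key for the second, which is why the quoted form is the safe one for emoji names like yes and no.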

I hope this helped someone, I know I have spent almost 2 weeks on this before I finally had them all imported right.
