Importing from phpBB3


(James North) #125

Hi Guys,

I have another question resulting from this process.

I have images/attachments from my old forum that are referenced as HTML links, and there are also links to other things (Google Docs/Twitter/YouTube) in the content of the posts.

By default, the import doesn’t render those images or embeds.

If I click ‘edit’, the WYSIWYG preview renders them successfully. If I post that edit (without touching anything), everything renders on that post.

Is there anything database-wide that I might be able to do that would parse those URLs/attachments at all?

Thanks again - really enjoying setting this up!

(Felix Freiberger) #126

Does clicking “Regenerate HTML” in the :wrench:-Menu also fix this for the corresponding post?

(James North) #127

Yes - that works!

I wonder, can that function be run site-wide?

(Felix Freiberger) #128

I have no idea how you got into this situation (maybe someone else can shed some light on this), but to regenerate all posts run

cd /var/discourse
./launcher enter app
rake posts:rebake

and brace for some serious CPU load :fire:.

(James North) #129

Definitely prepared to set things on fire.

Yeah these posts were imported from esoTalk to phpBB3 and they had links generated for that forum, and those links have stuck around in this import as well.

Thanks heaps @fefrei - going to try it out now!

(James North) #130

Worked a treat - thanks again @fefrei.

(Jay Pfaffman) #131

I’ve noticed this on my imports too. I think that it might be the case that if he’d waited Sidekiq would have baked them, but I’m rebaking them on my Dev machine anyway.

(James North) #132

Yeah I wondered about the Sidekiq process - but after having it live for about 5 days I just wanted to smash them.

(James North) #133

Now that I’ve established myself as a thread hog, I have another question :slight_smile:

Those attachments that are jpg/gif/png render in the post and if you click links to them, they show in a new window.

There are also links to attachments from my import that are not images - so doc/pdf/rtf etc. However, links in the post to these files give me the ‘Oops! That page doesn’t exist or is private.’ page in Discourse.

I’ve moved all the attachments into the same folder - /shared/uploads/default/original/attachments

So Discourse is happy to serve me images in that folder, but it won’t serve other file extensions in that folder.

For the record, yes, I’ve enabled heaps of file types to be allowed in settings (not sure that has much of a bearing on this though).

Is there something that Discourse sees differently about hosting images vs documents in that upload folder? Could it be solved?

Thanks again - I hope to be able to keep an eye on this thread and help anyone who has gone down the same path from esoTalk that I did.

(Gerhard Schlager) #134

Why would you do that‽ Don’t do that!

(James North) #135

Haha - certainly not by choice.

The esoTalk migration to phpBB3 meant that attachments became files in an /attachments directory (that’s just how they’re stored in esoTalk).

As a result, there’s just a link to ‘attached files’ at the bottom of posts!

(Jay Pfaffman) #136

I wonder whether you need to enable those file types before you run the importer or whether it is just broken.

(James North) #137

Quite possible - I can’t actually remember whether I enabled the file types before or after I ran the import script (I ran it so many times before I got everything working well).

It just seems a bit odd that links to images work and links to anything else don’t work even though those files are in the same directory with the same permissions.

(Gerhard Schlager) #138

Well, that’s how Discourse works. :wink:
Images are served directly by nginx. Other file types are handled by the UploadsController, which returns a 404 error unless you are trying to download a known upload.

You have a few options for solving this:

  • Tweak the nginx config so that it always serves your old attachments without going through rails.
  • Tweak the phpBB3 importer so that it detects your attachment links as actual attachments.

You don’t have to enable the file types before running the phpBB3 importer. It does that automatically.
But in this case the attachments aren’t even detected by the import script since they are merely links to files.

(James North) #139

Awesome, thanks @gerhard.

So the phpBB3 importer will be telling Discourse that files are ‘known uploads’ if in fact they natively were from phpBB3, but in my case they were just thrown in as links from a previous import.

I wonder, is there a third option that would involve flagging files as ‘known uploads’ in the Discourse db?

(Gerhard Schlager) #140

Sure, with lots of Ruby magic everything is possible.
But I suggest you change the nginx config by adding the config changes to the app.yml. That’s certainly the easiest thing to do. If you need help with that I suggest you create a #support topic or post in #marketplace since this is getting quite off-topic. :wink:

(James North) #141

Yes definitely - thanks heaps for getting me on the path!

(James North) #142

FWIW, I entered the Docker container, went to /etc/nginx/conf.d/discourse.conf and edited this line to include the file types that are in that /uploads folder:

  # this allows us to bypass rails
  location ~* \.(gif|png|jpg|jpeg|bmp|tif|tiff|svg|doc|pdf|docx|rtf|txt)$ {
      try_files $uri =404;

I added doc|pdf|docx|rtf|txt and restarted nginx (in the docker) and restarted the discourse instance and bam. They work.
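To check which links the edited rule covers, here is a minimal sketch that applies the same extension pattern with grep. The helper name `bypasses_rails` and the sample paths are made up for illustration, and grep's extended regex only approximates nginx's matching, which should be close enough for a plain extension list like this one.

```shell
#!/bin/sh
# Extension list copied from the edited location block above.
PATTERN='\.(gif|png|jpg|jpeg|bmp|tif|tiff|svg|doc|pdf|docx|rtf|txt)$'

# bypasses_rails is a hypothetical helper: it prints "nginx" when the path
# matches the pattern case-insensitively (like nginx's ~* modifier),
# and "rails" otherwise.
bypasses_rails() {
  if printf '%s\n' "$1" | grep -Eiq "$PATTERN"; then
    echo "nginx"
  else
    echo "rails"
  fi
}

# Sample paths are assumptions, not real files from the thread.
for path in /uploads/default/original/attachments/report.pdf \
            /uploads/default/original/attachments/archive.zip; do
  printf '%s -> %s\n' "$path" "$(bypasses_rails "$path")"
done
```

Any extension missing from the list still falls through to Rails, so links to those files would keep showing the ‘Oops!’ page.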

(Kane York) #143

You need to add a replace rule in your app.yml that will apply that change every time you rebuild!
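Edits made inside the running container are lost when the container is rebuilt, which is why the change has to be replayed at build time. The exact app.yml rule isn't shown in this thread, so the sketch below only demonstrates the text substitution such a rule would need to perform, using sed on a scratch file. The temp file, the variable names, and the assumption that the stock line lists only the image extensions are all illustrative.

```shell
#!/bin/sh
# Scratch file standing in for /etc/nginx/conf.d/discourse.conf.
# Assumption: the stock line holds only the image extensions from the post above.
conf=$(mktemp)
printf '%s\n' '  location ~* \.(gif|png|jpg|jpeg|bmp|tif|tiff|svg)$ {' > "$conf"

# Append the document extensions to the existing extension list.
old='gif|png|jpg|jpeg|bmp|tif|tiff|svg'
new="$old|doc|pdf|docx|rtf|txt"

# In sed's basic regex syntax an unescaped | is a literal character,
# so the list can be matched as plain text.
result=$(sed "s/$old/$new/" "$conf")

printf '%s\n' "$result"
rm -f "$conf"
```

If the stock line in your Discourse version lists different extensions, the `old` string has to match it exactly or the substitution silently does nothing.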

(Michael Corliss) #144

Does anyone have suggestions for performing this on a small server? I have a 1.9GB phpBB3 backup (1m+ posts), and though I have room for it on the Discourse server, I have less than 1.9GB to spare; when I try to import, I run out of space.

I’m willing to truncate my phpBB3 database (for example, only importing recent posts), but I’m not sure how to do that without breaking things.