Discourse incorrectly detecting uploaded file as an image

I have some engineering users who wish to attach some data files with uncommon file extensions to their posts. They are essentially plain text files but include extended ASCII characters.

I tried to update Discourse’s NGINX config to specify MIME media types for these files but it didn’t work. I posted a topic (How to customize MIME media type emitted for certain attachments?) about this two weeks ago but have not received any answers so far. Even if NGINX is not updated, it will still serve up unknown file types using the fallback MIME type of “application/octet-stream”. I can live with that for now.

However, when users try to upload these data files into a post (either using the “Upload” button or drag-and-drop), they get an error popup from Discourse like this one:

Discourse Error

It appears that when users upload files, Discourse is trying to be smart and detect whether its an image or some other file type. Furthermore, it appears to be making this determination by looking at the file contents (much like the standard Unix command “file” does). I assume this is so Discourse can decide whether to inline it with the post’s content or put it off to the side as an attachment.

In the case of these data files, this check is incorrectly identifying the files as images. Just for yucks, I put some of these data files on an Ubuntu box and checked them with the “file” command, and sure enough, they were identified as “JPEG image data”.

Is there a way for users to upload files without Discourse trying to detect if they are images? IE, “Please upload this as an attachment, no matter what, don’t inline it”?

Alternatively, I could configure Discourse to allow zip files and tell the users to zip up their files before uploading them, but I’d rather not open up the site to random zip file uploads. That seems like a security issue.

Thanks in advance for any help!

Another work around is adoption an extension for those weird files, like .bin, .data or really .anythinggoes, which should be enough for Discourse to keep those files alone.

Thanks for the quick response! However, I tried that and it doesn’t work. Discourse is definitely looking at the file contents, not the extension, to determine if it’s an image or not.

1 Like

Does anyone have any thoughts on this? This seems like a pretty significant bug.

I did some digging around this issue.

In short, your file is being detected as JPEG because it starts with the same signature as this type of file.
Fixing this behavior in Discourse is possible, but it requires a modification. (see at the end)


Here some technical stuff.
This issue starts here:

FastImage library opens the file and determines the type and size.
As you expect, it returns a type of JPEG.

If you look at what is JPEG signature, it looks like this:

JPEG markers

More information: List of file signatures - Wikipedia
JPEG - Wikipedia

It always starts with the following marker bytes: FF D8

Looking at your file sample with a Hex editor, you will see it starts the same.

image

Now, if you look at how FastImage detects a JPEG, you can see here:

However, you can’t extract any image information since not all the required bytes are there.

How do we fix this issue in Discourse?
Looking at FastImage code, there is a valuable option you can pass.

By using this option, any error ( SizeNotFound, ImageFetchFailure, CannotParseImage, UnknownImageType, BadImageURI) will return no image info ; and your file won’t be detected as an image.

@image_info =
begin
   FastImage.new(@file, :raise_on_failure=>true)
rescue StandardError
   nil
end
...
is_image ||= @image_info && FileHelper.is_supported_image?("test.#{@image_info.type}")

Then, it can work now:
image

I can make a PR later. Using this option makes sense here. :+1:

2 Likes

Wow! This ia a phenomenal analysis! Thank you!

Some quick questions:

  1. So with these changes, the file won’t be detected as an image and will be uploaded like a non-image and displayed to the right of the post?

  2. If I understand correctly, you’re suggesting I make these tweaks in my local Discourse instance to try this out and/or use this until it is included in a future Discourse release. But how do I do this? (I am an experienced software developer but have limited experience with Docker and none with Ruby.)

  3. The FastImage call that would need to be tweaked is in models/upload.rb, right?

  1. Yes, that’s right – like in my screenshot above

  2. I’m not suggesting you do the modification. However, if you can’t wait, you could certainly test that change.

  • For a temporary change (gone after rebuild):
cd /var/discourse
./launcher enter app
sed -i "s/FastImage.new(@file)/FastImage.new(@file, :raise_on_failure=>true)/" lib/upload_creator.rb
sed -i "s/FastImage.new(original_path)/FastImage.new(original_path, :raise_on_failure=>true)/" app/models/upload.rb
exit
  • For a persistent change (stay after rebuild):
cd /var/discourse
nano containers/app.yml  (use your preferred editor)

Add the following custom commands at the end (run section):

  - replace:
      filename: "/var/www/discourse/lib/upload_creator.rb"
      from: "FastImage.new(@file)"
      to: "FastImage.new(@file, :raise_on_failure=>true)"
  - replace:
      filename: "/var/www/discourse/app/models/upload.rb"
      from: "FastImage.new(original_path)"
      to: "FastImage.new(original_path, :raise_on_failure=>true)"

Then, rebuild:

./launcher rebuild app
  1. I guess so if you plan to upload files without extensions. I did not check if other occurrences required the same change.
1 Like

@Arkshine – Thank you very much for these details. I was able to test out both fixes separately (each one on a freshly restored VM) and they both worked!

Notes:

  1. For the temporary fix, I needed to run “./launcher restart app” for the changes to take effect.

  2. It looks like “spec/models/optimized_image_spec.rb” also has a reference to FastImage.new(). Does this one also need to be updated like the rest?

Thanks again for your help!

I’m glad it works. :slight_smile:

  1. This is just for testing purposes, so you don’t have to worry about it.

Great! Thanks! Now that I’ve tested this out in my dev env, I’ll deploy this to the test and prod envs.

BTW, if you have time, I’d love to get your thoughts on a related problem (How to customize MIME media type emitted for certain attachments?).

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.