Image Onebox breaks on Wikimedia links

If you link to a Wikimedia Commons file page such as File:Stones members montage2.jpg - Wikimedia Commons, the image onebox is triggered and tries to display the link as an image. It isn’t one; it is just the info page about that image.

There are two possible solutions:

  1. Just ignore links to commons.wikimedia.org in the image onebox
  2. Use the Wikipedia API to get the actual link to the image itself

The latter requires an additional request. The resolved URL looks something like https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Stones_members_montage2.jpg/250px-Stones_members_montage2.jpg. I am not sure whether this should be added to the existing image onebox or whether there should be a separate Wikimedia onebox. I have some crude code to do the conversion from the commons.wikimedia.org URL to the actual image URL here:

https://github.com/phw/discourse-musicbrainz-onebox/blob/master/engine/wikimedia.rb
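
For readers who do not want to open the linked file, here is a minimal sketch of how such a conversion could work. It is not the code from that repository. It assumes the MediaWiki Action API on commons.wikimedia.org with `prop=imageinfo`, and the helper names (`commons_file_title`, `imageinfo_api_url`, `thumb_url_from_response`) are made up for illustration. The actual HTTP request is left out so the sketch runs offline.

```ruby
require "json"
require "uri"

# Hypothetical helper: pull the "File:..." title out of a Commons page URL.
# Returns nil if the URL is not a /wiki/File:... page.
def commons_file_title(page_url)
  match = URI.parse(page_url).path.match(%r{\A/wiki/(File:.+)\z})
  # Protect literal "+" so it is not decoded as a space.
  match && URI.decode_www_form_component(match[1].gsub("+", "%2B"))
end

# Hypothetical helper: build the API request URL for a given file title.
# The width of 250 mirrors the 250px thumbnail mentioned above.
def imageinfo_api_url(title, width: 250)
  query = URI.encode_www_form(
    action: "query",
    titles: title,
    prop: "imageinfo",
    iiprop: "url",
    iiurlwidth: width,
    format: "json"
  )
  "https://commons.wikimedia.org/w/api.php?#{query}"
end

# Hypothetical helper: read the thumbnail URL (or, failing that, the full
# image URL) out of the JSON body returned by the API.
def thumb_url_from_response(body)
  pages = JSON.parse(body).dig("query", "pages") || {}
  info = pages.values.first&.dig("imageinfo", 0)
  info && (info["thumburl"] || info["url"])
end

title = commons_file_title("https://commons.wikimedia.org/wiki/File:Stones_members_montage2.jpg")
puts imageinfo_api_url(title)
```

The response body could then be fetched with something like `Net::HTTP.get(URI(api_url))` and passed to `thumb_url_from_response`; this is the additional request mentioned above.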

Well, they are misleading us by sending back HTML content when the file extension is .jpg.

I’d be happy to merge a PR adding this to the onebox gem :wink:

Great to hear that. But it will take some time on my side; I have to tackle a few other projects first…

I want to solve this. Can you give me a quick “how to”? It isn’t just installing the gem, right? Sorry, I want to learn and contribute.

Awesome, this is actually a great way to get some coding practice :slight_smile: To solve this you will need to fork discourse/onebox on GitHub (a gem for turning URLs into website previews) and add your changes there. The README actually has a pretty nice tutorial on how to add a new onebox provider.

To solve this issue I would actually do it in two steps:

  1. Make sure the existing image onebox located at lib/onebox/engine/image_onebox.rb no longer loads commons.wikimedia.org URLs. That could be done by modifying the regular expression in matches_regexp.
  2. Add a new wikimedia_onebox.rb provider, which first parses the provided URL and then makes a request to the Wikipedia API to get the actual image URL. Take a look at the tutorial in the README and the existing oneboxes in lib/onebox/engine/i. For the Wikimedia part you can reuse some of the code I linked above.
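
To make the two steps a bit more concrete, here is a rough, standalone sketch of the URL matching involved. The patterns are hypothetical and are not the gem's actual matches_regexp; `IMAGE_URL_REGEXP`, `COMMONS_FILE_REGEXP`, and `engine_for` are names made up for this example. The idea is a negative lookahead that keeps commons.wikimedia.org out of the generic image pattern, plus a separate pattern for Commons file pages.

```ruby
# Hypothetical generic image pattern: any http(s) URL ending in a common
# image extension, except URLs hosted on commons.wikimedia.org.
IMAGE_URL_REGEXP = %r{\Ahttps?://(?!commons\.wikimedia\.org/)\S+\.(?:png|jpe?g|gif|bmp|tiff?)(?:\?\S*)?\z}i

# Hypothetical pattern for Commons file info pages.
COMMONS_FILE_REGEXP = %r{\Ahttps?://commons\.wikimedia\.org/wiki/File:\S+\.(?:png|jpe?g|gif|bmp|tiff?|svg)\z}i

# Toy dispatcher: returns which (imaginary) engine would handle the URL,
# or nil if neither pattern matches.
def engine_for(url)
  case url
  when COMMONS_FILE_REGEXP then :wikimedia
  when IMAGE_URL_REGEXP then :image
  end
end

p engine_for("https://commons.wikimedia.org/wiki/File:Stones_members_montage2.jpg")
p engine_for("https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Stones_members_montage2.jpg/250px-Stones_members_montage2.jpg")
```

In the real gem each pattern would live in its own engine class rather than in a shared dispatcher; the sketch only shows that the info page and the resolved upload.wikimedia.org URL can be told apart by the URL alone.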

I have only played around with it a little myself. But you will need some way to test your work, and when I actually began implementing a onebox I wrote myself a helper script in the root directory of the onebox source code which looked something like this:

require_relative "lib/onebox"

# Set this path to your actual template path
Onebox.options = {
  load_paths: [
    File.join(File.dirname(__FILE__), "templates")
  ]
}

# This is the URL to test:
url = "https://commons.wikimedia.org/wiki/File:Stones_members_montage2.jpg"

# Create the onebox for the URL:
preview = Onebox.preview(url)

# This will print the generated onebox HTML:
puts preview

You can also achieve this by raising the priority of the engine.
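
As an illustration of the priority idea, here is a toy model, not the gem's actual matcher. It assumes each engine carries a numeric priority and that lower numbers are tried first; check the onebox source for the real convention. `ToyEngine`, `TOY_ENGINES`, and `pick_engine` are hypothetical names.

```ruby
# Toy stand-in for an engine: a name, a priority, and a URL pattern.
ToyEngine = Struct.new(:name, :priority, :pattern)

TOY_ENGINES = [
  # Generic image engine: matches any URL ending in an image extension,
  # including Commons info pages.
  ToyEngine.new(:image, 100, %r{\Ahttps?://\S+\.(?:png|jpe?g|gif)\z}i),
  # Wikimedia engine with a smaller number, so it is tried first
  # (assumption: lower number means higher priority).
  ToyEngine.new(:wikimedia, 50, %r{\Ahttps?://commons\.wikimedia\.org/wiki/File:}i)
]

# Try engines in priority order and return the name of the first match,
# or nil if no engine matches.
def pick_engine(url, engines = TOY_ENGINES)
  engines.sort_by(&:priority).find { |engine| engine.pattern.match?(url) }&.name
end

p pick_engine("https://commons.wikimedia.org/wiki/File:Stones_members_montage2.jpg")
p pick_engine("https://example.com/picture.png")
```

With this approach the image pattern does not need to change at all: both engines match the Commons URL, and the ordering alone decides which one wins.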

I implemented this in PR #344. I tested using Discourse and it seems to behave as expected.

The pull request was merged.

https://github.com/discourse/onebox/pull/344
