Convert image to text

People post screenshots. Could there be a way to extract the text from an image and add it at the bottom of the post?

1 Like

Sure. Google OCR.

But not by Discourse. And I would guess such functionality isn’t coming any time soon anyway :wink:

1 Like

Suspect you’d have to create a plug-in, either by authoring it yourself or by hiring a freelancer through a marketplace

1 Like

See this plugin

Client (@csmu) never paid me BTW :face_with_symbols_over_mouth:

6 Likes

Hey @michaeld

Quickly skimming this plugin, am I right that the images are sent to Google servers for processing? What was the reasoning for this approach rather than using a Ruby gem to process locally or on the server of the Discourse instance? I’m interested in this topic, but submitting images out of house isn’t an option.

Better performance, ease of maintenance, avoiding version dependencies on local installation.

I understand that this is not always an acceptable approach. A PR is welcome although the user should always be able to avoid a local dependency hell.

1 Like

Interesting. I guess this was mostly focussed on handwriting though, right? If it was simply extracting text from an image, for example an error screenshot, then I guess a local gem might be accurate enough. I played with a Python library for something like this a while ago and got reasonable results. Sometimes it was garbage, but the results would never be read by the community, only by the search engine. If the user noticed something silly they could always modify the hidden text.
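
For anyone curious what the local route might look like, here is a rough sketch in Python rather than a Ruby gem. It is not how the plugin works. It assumes the `pytesseract` and `Pillow` packages plus a local Tesseract install, and the function names and the idea of wrapping the result in a collapsed `[details]` block are my own placeholders for "hidden text the user can still edit".

```python
import re


def ocr_image(path: str) -> str:
    """Run Tesseract locally on one image and return the raw text.

    Assumption: pytesseract, Pillow and the tesseract binary are installed.
    They are imported lazily so the helpers below work without them.
    """
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def clean_ocr_text(raw: str) -> str:
    """Collapse runs of whitespace and drop empty lines from OCR output."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def as_hidden_block(text: str, summary: str = "Text detected in image") -> str:
    """Wrap the text in a collapsed [details] block (hypothetical placement)."""
    return f'[details="{summary}"]\n{text}\n[/details]'
```

Something like `as_hidden_block(clean_ocr_text(ocr_image("screenshot.png")))` would then give a block to append to the post, which the author could correct by hand if the OCR produced something silly.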

I don’t want reasonable results, I want excellent results.

2 Likes

There is no OCR that can offer excellent results. Even reasonable can be hard to achieve, no matter what library is in use.

1 Like

Bear in mind OCR is often working on screen grabs, not on scans or photos. It still won’t be 100%, but it’s a good kind of text to try to recognise.

I note that Mastodon’s Web UI offers an OCR function in the dialogue where you can enter an image description for accessibility reasons. It might be that it runs server-side. Here’s what it looks like, after I clicked on “Detect text from picture”:
image

1 Like

Interesting. Looks like it has similar results to Tesseract. I wonder how the Mastodon tool handles images with graphics as well as text?

A noble goal :heart: Whilst I share the desire for excellent results, I’ll be happy with an 80% improvement :wink:

In the context I have in mind, the goal is to extract things like error messages from screenshots. For example, if a user has an error log in their terminal, the tendency is to just screencap it. Even if the result isn’t perfect, if it extracts about 80% of the text correctly, then someone searching for the error message, or another related piece of text, has a far higher chance of finding the Topic than if it was just the unsearchable image.
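
To make the "80% is still useful" point concrete, here is a small Python sketch. The tokenizer and the all-tokens-must-match rule are simplifying assumptions of mine, not how Discourse search actually works, and the garbled OCR string is made up for illustration.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens, roughly what a search index might see."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def token_recall(truth: str, ocr: str) -> float:
    """Fraction of the true tokens that survived OCR."""
    want = tokens(truth)
    return len(want & tokens(ocr)) / len(want) if want else 1.0


def matches(query: str, ocr: str) -> bool:
    """True if every query token appears in the OCR text."""
    return tokens(query) <= tokens(ocr)


truth = "ModuleNotFoundError: No module named 'requests'"
ocr = "ModuleNotFoundError: No modu1e named 'requests'"  # one misread word

print(token_recall(truth, ocr))                      # 0.8
print(matches("ModuleNotFoundError requests", ocr))  # True
print(matches("module named", ocr))                  # False
```

One word in five is misread here, yet a search for the exception name and the package still finds the post. Only queries that depend on the garbled word miss, and with the bare image every query would miss.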