People post screenshots. Could there be a way to extract the text from an image and add it at the bottom of the post?
Sure. Google OCR.
But not by Discourse. And I would guess such functionality isn't coming any time soon anyway.
Suspect you'd have to create a plug-in, either by authoring it yourself or engaging a freelancer marketplace.
See this plugin
Client (@csmu) never paid me BTW
Hey @michaeld
Quickly skimming this plugin, am I right that the images are sent to Google servers for processing? What was the reasoning for this approach rather than using a Ruby gem to process them locally or on the server of the Discourse instance? I'm interested in this topic, but submitting images out of house isn't an option.
Better performance, easier maintenance, and no version dependencies on the local installation.
I understand that this is not always an acceptable approach. A PR is welcome, although the user should always be able to avoid local dependency hell.
Interesting. I guess this was mostly focused on handwriting though, right? If it was simply extracting text from an image, for example an error screenshot, then I guess a local gem might be accurate enough. I played with a Python library for something like this a while ago and got reasonable results. Sometimes it was garbage, but the results would never be read by the community, only by the search engine. If the user noticed something silly they could always modify the hidden text.
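For illustration, here is a minimal sketch of that kind of local extraction. It assumes the Tesseract binary is installed and uses the pytesseract and Pillow packages, which is just one plausible setup rather than necessarily the library I used back then:

```python
# Minimal local OCR sketch: pull text out of an error screenshot.
# Assumes the Tesseract binary is installed and that the pytesseract
# and Pillow packages are available (one plausible setup, not the
# exact library mentioned above).
from PIL import Image
import pytesseract


def extract_text(image_path: str) -> str:
    """Return whatever text Tesseract can recognise in the image."""
    image = Image.open(image_path)
    # Screen grabs are usually clean, high-contrast text; converting
    # to grayscale is often enough preprocessing.
    return pytesseract.image_to_string(image.convert("L"))


if __name__ == "__main__":
    print(extract_text("error_screenshot.png"))
```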
I don't want reasonable results, I want excellent results.
There is no OCR that can offer excellent results. Even reasonable can be hard to achieve, no matter what library is in use.
Bear in mind OCR is often working on screen grabs, not on scans or photos. It still won't be 100%, but it's a good kind of text to try to recognise.
I note that Mastodon's web UI offers an OCR function in the dialogue where you can enter an image description for accessibility reasons. It might be that it runs server-side. Here's what it looks like after I clicked on "Detect text from picture":
Interesting. Looks like it has similar results to Tesseract. I wonder how the Mastodon tool handles images with graphics as well as text?
A noble goal! Whilst I share the desire for excellent results, I'll be happy with an 80% improvement.
In the context I have in mind, the goal is to extract things like error messages from screenshots. For example, if a user has an error log in their terminal, the tendency is to just screencap it. Even if the result isn't perfect, if it extracts about 80% of the text correctly, then someone searching for the error message, or another related piece of text, has a far higher chance of finding the topic than if it were just an unsearchable image.
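To make that concrete, here is a rough sketch of how the recognised text might be appended to the bottom of a post so the search engine indexes it while keeping it out of readers' way. The collapsed [details] block and the helper name are assumptions for illustration, not how any existing plugin does it:

```python
def append_ocr_text(post_raw: str, ocr_text: str) -> str:
    """Append OCR output to a post's raw markdown inside a collapsed
    [details] block so it gets indexed for search but stays out of the
    way visually. Hypothetical helper, not part of any existing plugin."""
    if not ocr_text.strip():
        # Nothing usable was recognised; leave the post untouched.
        return post_raw
    hidden_block = (
        '\n\n[details="Text extracted from image"]\n'
        f"{ocr_text.strip()}\n"
        "[/details]\n"
    )
    return post_raw + hidden_block
```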