Ai plugin ocr support

ozkn · September 5, 2023, 9:01pm

Can support be added to the Discourse ai plugin to add the text in the images to the post (ocr)? Can google lens api (cloud vision) support be added for this?

Example: GitHub - communiteq/discourse-ocr-uploads

Falco · September 5, 2023, 9:16pm

It’s in our roadmap to leverage a multi-modal LLM to create image descriptions, which should also provide some level of OCR. But for pure OCR maybe give that plugin a try?

ozkn · September 6, 2023, 9:14am

I used this plugin in older versions of Discourse and it worked. But unfortunately it no longer works in the new version of Discourse

pmusaraj · February 21, 2024, 1:37pm

See AI Image Captioning Feature in Discourse AI Plugin, this is now ready and enabled here on meta.

ozkn · February 21, 2024, 7:20pm

Thanks for this feature, I will try it @Falco @pmusaraj

sam · February 21, 2024, 10:42pm

I think we should still keep this open, the captioning feature is adjacent to OCR but not exactly OCR.

OCR for example would allow you to take a photo of your notes and then upload and print them exactly. The AI captioning is much more sophisticated but also does not give you that fidelity of printing an entire page of text.

Not sure when we will have time to work on an OCR, but it does feel a bit different.

sam · March 28, 2024, 10:08am

Now that Anthropic Claude 3 has vision support it does a decent job with ocr jobs, for example

Tris20 · March 28, 2024, 10:46am

Cries in German

On a serious note, I have curiosity about how it would perform on an image like this:

Tesseract gets the following:

MINGW64 ~/Source/Repos/Sut. Driver. Firmware
$ git push
Locking support detected on remote “origin”. Consider enabling it with:
$ git config Ifs ‘1fs.locksverify true
LFS: Access forbidden. Check your access level.
error: failed to push some refs to
MINGW64 ~/Source/Repos/Sut. Driver. Firmware
$ git push
Locking support detected on remote “origin”. Consider enabling it with:
$ git config Ifs. /\fs.locksverify true
Uploading LFS objects: 100% (1/1), 584 KB | 0 B/s, done.
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (5/5), 478 bytes | 478.00 KiB/s, done.
Total 5 (delta 1), reused 0 (delta 0), pack-reused 0 »
remote:
remote: To create a merge request for visit:
remote: 1
remote:
To
2c50e5b. . ba25f3e
L MINGN64 ~/Source/Repos /Sut. Driver. Firmware

(Ok I’m surprised at how reasonable this result is. Tesseract often changes line order and glitches letters with these types of examples.)

Sam would it be possible for you to throw that image in to claude and post the result?

sam · March 28, 2024, 12:02pm

Feel free to try here, Claude creative persona here has vision support, just enabled it

Tris20 · March 28, 2024, 12:15pm

https://meta.discourse.org/t/parsing-complex-json-data-in-tris20-code/301329

This is either a skill issue on my side, or Claude is having little trouble

Tris20 · March 28, 2024, 12:23pm

Skill issue I was using the Forum Helper rather than the Creative personality.

Claude Creative gives us this:

MINGW64 ~/Source/Repos/Sut.Driver.Firmware (kingfisher)
$ git push
Locking support detected on remote “origin”. Consider enabling it with:
$ git config lfs.http://tfs.locksverify true
LFS: Access forbidden. Check your access level.
error: failed to push some refs to ‘http://tfs.lockeed/tfs/HnC/TEC/SUT/Driver%20Firmware/_git/Sut.Driver.Firmware’

MINGW64 ~/Source/Repos/Sut.Driver.Firmware (kingfisher)
$ git push
Locking support detected on remote “origin”. Consider enabling it with:
$ git config lfs.http://tfs.locked/tfs.locksverify true
Uploading LFS objects: 100% (1/1), 584 KB | 0 B/s, done.
Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (5/5), 478 bytes | 478.00 KiB/s, done.
Total 5 (delta 1), reused 0 (delta 0), pack-reused 0
remote: To create a merge request for kingfisher, visit:
remote: http://tfs.locked/tfs/HRC/TEC/SUT/_git/Sut.Driver.Firmware/pullrequest/new?sourceRef=kingfisher&targetRef=develop
remote:
To http://tfs.locked/tfs/HRC/TEC/SUT/_git/Sut.Driver.Firmware
2c50e5b…ba25f3e kingfisher - kingfisher

MINGW64 ~/Source/Repos/Sut.Driver.Firmware (kingfisher)
$

Interesting result. I feel like the two are roughly on a par with each other in this example but with different faults. Claude has halucinated the URL here, making a contextual guess that we are doing a pull request based on the git issue.

Tesseract doesn’t halicunate, and in this case seems to be more correct. The only thing that stands out is 1fs on line 4, and \fs on line 10 instead of lfs.

Claude also makes a similar error throughout, using tfs instead of lfs. It’s surprising because the extrapolation of URL based on context shows it understood the context, but then instead of lfs which is prominent in this problem domain, it created something completely new: tfs

sam · April 2, 2024, 4:32am

What I found absolutely jaw dropping here is that you have an OCR engine you can interact with.

Don’t like the kingfisher hallucination … fine … just ask it not to do that thing.

Really mind bending what you can do with this tech:

AI Conversation with Claude-3-opus

Git Push Errors: Locking and Access Issues

sam: Tell me exactly what this says please [image]

claude_3_opus_bot: The image shows terminal output from running some Git commands to push changes to a remote repository. It starts with trying to push to a remote called “origin”, but locking support is detected on the remote and it says to consider enabling it with “$ git config lfs.locksverify true”. However, there is an error that says “failed to push some refs”. Then it tries pushing again, this time with LFS…

sam: just transcribe it please…

claude_3_opus_bot: MINGW64 ~/Source/Repos/Sut.Driver.Firmware (main) (feature/sam) $ git push Locking support detected on remote “origin”. Consider enabling it with: $ git config lfs.locksverify true LFS: Access forbidden. Check your access level. error: failed to push some refs to ‘https://alm.actia.fr/dap-components/Sut.Driver.Firmware.git’ MINGW64 ~/Source/Repos/Sut.Driver.Firmware (main) (feature/sam) $ g…

sam: That contained some hallucinations, instead of guessing any of the words simply redact areas you are not confident of […] Put it all in a markdown ``` block.

...

Read full transcript

(apologies for broken image in the OP, we are fixing it, the image is)

Topic		Replies	Views
Helper - Auto caption Site Management how-to , ai , ai-captions	8	177	August 8, 2025
Lets see your best AI Image Caption! General ai , ai-helper , ai-captions	38	2178	June 29, 2024
Adding a picture questions feature Feature ai	3	772	January 12, 2024
Convert image to text Feature	10	1402	February 9, 2023
Discourse AI Plugin official , included-in-core , ai	83	35382	August 13, 2025

Ai plugin ocr support

Related topics