AI Image Captioning Feature in Discourse AI Plugin

Falco · February 20, 2024, 5:53pm

We’ve introduced an AI Image Captioning feature to the Discourse AI plugin, enabling automatic caption generation for images in posts. This functionality aims to improve content accessibility and enrich visual elements within your community.

Features and Use

Automatic AI Captions: Upon uploading an image in the editor, you can generate a caption automatically using AI.
Editable Captions: The generated caption can be edited to better suit your content’s context and tone.
Enhanced Accessibility: The feature supports creating more accessible content for users relying on screen readers.

How to Use

Upload an image in the Discourse editor.
Click the “Caption with AI” button near the image.
A generated caption will appear, which you can modify.
Accept the caption to include it in your post.

Feedback

Your feedback is crucial for refining this feature. It’s enabled here on Meta, so please share your experiences, issues, or suggestions here on this topic.

AI Model

This feature supports both the open-source model LLaVa 1.6 or with the OpenAI API.

frold · February 20, 2024, 5:56pm

Funny I used it earliner in this post. I was very impressed. It could read the image and tell what it was about in this post

https://meta.discourse.org/t/discourse-subscriptions/140818/609?u=frold

EricGT · February 20, 2024, 6:10pm

Noted this on the OpenAI forum

Jagster · February 20, 2024, 6:18pm

I don’t know how we get mobile users remember to use that, because they have to jump away from editor.

Is that caption used as alt-text too?

Falco · February 20, 2024, 6:21pm

Yes.

We plan on adding JIT reminders in the near future if the reception is good.

Falco · February 21, 2024, 5:00pm

2 posts were split to a new topic: Support for prompt customization in DiscourseAI

pmusaraj · February 20, 2024, 10:15pm

It can see the plaid shirt, but it can’t detect George Costanza.

Jokes aside, this is great especially for accessibility. In previous A11Y reports, missing alt text on images is one of the main items raised, and previously we’ve written all that off since images are user-uploaded content. This now draws a path forward to much, much better accessibility.

Tris20 · February 21, 2024, 8:23am

In the case of error messages, is there any way to encourage it to caption the main part of the error so the search engine picks up on it?

Some other results

It identifies the third correctly as the IBM EWM tool, but does not recognise 2 as being Rhapsody, and 1 being Vector Davinci. None the less these captions are pretty reasonable.

tpetrov · February 21, 2024, 9:55am

This is an awesome feature!

But it’s very hard to find. The user needs to hover over the image to see the button and then click it (and most people wion’t know about that).
Even though I knew and was looking for the feature, I had the check the video to get that I need to hover.
IMO it should be “in your face” to be used in the beginning. I’d even make it create the captions by default, without the user having to click anything

Falco · February 21, 2024, 5:04pm

We will eventually makes those prompts customizable, so this will then be possible.

As a new feature, our idea is to introduce it in a very unobtrusive way to gather feedback, and then make it easier to find and even automatic.

JammyDodger · March 12, 2024, 9:36am

6 posts were split to a new topic: Issues configuring AI image captions

ecki · March 15, 2024, 12:41pm

Will that send the (Internet) Image link to the AI Service or upload the Image content or run some “hashing” locally in discourse? Is it server-side or javascript (i.e. exposing the client ip to external service).

Falco · March 15, 2024, 1:12pm

It sends a link to the image to the service you selected for the captioning. It happens server-side, as there are credentials involved.

If you want the feature but don’t want to involve third-parties, you can always run LLaVa in your own server.

ecki · March 15, 2024, 3:33pm

agreed, however the quality might suuffer from hardware limitations. Maybe you could share some recommendation in regards to model-sizes and quatisation or minimum vram from your experience. (not sure if they have quantized models at all, their “zoo” seems to have only full models).

Falco · March 15, 2024, 3:46pm

We are running the full model, but the smallest version of it with Mistral 7B. It’s taking 21GB VRAM in our single A100 servers, and it’s ran via ghcr.io/xfalcox/llava:latest container image.

Sadly the ecosystem for multi-modal models ain’t as mature as the text2text ones, so we can’t yet leverage inference servers like vLLM or TGI and are left with those one-off microservices. This may change this year, multimodal is on vLLM roadmap, but until then we can at least test the waters with those services.

seanblue · March 21, 2024, 10:34pm

I have some small UX feedback for this. On small images, the “Capture with AI” button blocks not only the image itself but other text in the post, making it hard to review the post when editing.

Moin · March 21, 2024, 10:55pm

mattdm · April 12, 2024, 1:59pm

I am seeing all generated captions (both here and on my site) start with “The image contains” or “An image of” or similar. This seems unnecessary and redundant. Could the prompt be updated to tell it that it doesn’t need to explain that the image is an image?

sam · April 17, 2024, 3:20am

It is so tricky to hone cause different models have different tolerances, but one plan we have is to allow community owners control over the prompts so they can experiment.

Isambard · June 3, 2024, 5:11pm

@mattdm You can achieve this simply by pre-seeding the generated answer with “An image of”. This way the LLM thinks that it has already generated the introduction and will generate just the remainder.

Topic		Replies	Views
Helper - Auto caption Site Management how-to , ai , ai-captions	7	152	April 13, 2025
A forum forgets automatic AI caption Bug ai , ai-helper , fixed	6	212	August 12, 2024
Issues configuring AI image captions Support ai , ai-helper	21	645	April 12, 2024
Non-AI method of captioning images Support	3	147	June 21, 2024
Lets see your best AI Image Caption! General ai , ai-helper , ai-captions	38	2124	June 29, 2024

AI Image Captioning Feature in Discourse AI Plugin

Features and Use

How to Use

Feedback

AI Model

Related topics