Qwen3-VL-8b Image Recognition Issues and Gemma3-27b Mixed Text+Image Content

Hello, I found the topic Managing Images in AI context and would like to know more about how this works.

Could someone clarify the current logic for understanding images?


  1. I use Qwen3-VL-8b with LM Studio via its OpenAI-compatible API. The hint below says that images are supported by Anthropic, Google and OpenAI models. No chance for Qwen, right?

  2. Qwen3-VL-8b: a new, confusing message appears when the model cannot recognize a picture/document.

In 3.6.0.beta2:

With both vision enabled = true and vision enabled = false, the AI bot handles the image recognition request correctly, without any exception.

In v2025.12.0-latest (which adds a new allowed attachments option):

Now, with vision enabled = true, it returns an error in the dialog:

{"error":"Invalid 'content': 'content' objects must have a 'type' field that is either 'text' or 'image_url'."}
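For reference, here is a minimal sketch of the request shape that error is asking for: every entry in content carries a type of either text or image_url. I am sending it straight to LM Studio's OpenAI-compatible endpoint with the openai Python client; the base URL, API key, and model identifier are assumptions from my local setup, not something the plugin dictates.

```python
# Minimal sketch: a chat completions request whose 'content' objects all
# carry the required 'type' field ('text' or 'image_url').
import base64

from openai import OpenAI

# Base URL, API key, and model name are assumptions for a local LM Studio server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

with open("scan.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

response = client.chat.completions.create(
    model="qwen3-vl-8b",  # assumed model identifier in LM Studio
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the contents of this document."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

My guess is that if the new version builds the content array without that type field (or with a different field name), the endpoint would reject it with exactly this kind of error, but I have not confirmed that.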
  3. Gemma3-27b. Some thoughts on recognizing mixed text+image content. The response currently supports text only. When I ask the model to return the text from the OCR layer of a PDF with separate images, it returns:

There is nothing at this URL; the model made up a fake link.
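Since the model cannot actually fetch anything from a URL, one workaround I am considering is to pull the OCR text layer out of the PDF locally and send it as plain text. A rough sketch, assuming the pypdf package; the file name and model identifier are placeholders:

```python
# Rough sketch: extract the PDF's embedded text layer locally and send it
# as plain text, since the model cannot fetch files from a URL.
from openai import OpenAI
from pypdf import PdfReader

reader = PdfReader("scanned_report.pdf")  # placeholder file name
ocr_text = "\n\n".join(page.extract_text() or "" for page in reader.pages)

# Base URL, API key, and model name are assumptions for a local LM Studio server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
response = client.chat.completions.create(
    model="gemma-3-27b",  # assumed model identifier
    messages=[
        {"role": "user", "content": f"Summarize this document:\n\n{ocr_text}"},
    ],
)
print(response.choices[0].message.content)
```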

Thanks!

LM Studio does not have PDF support in the completions or responses API.

It only supports image/text from what I can tell.
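If you need the PDF content anyway, one client-side workaround is to render each page to an image yourself and send the pages as image_url parts, since image and text are the content types that do go through. A rough sketch, assuming PyMuPDF (the pymupdf package); the path, DPI, and model name are placeholders:

```python
# Sketch: render each PDF page to a PNG with PyMuPDF and send the pages as
# image_url parts, since only image/text content is accepted.
import base64

import fitz  # PyMuPDF
from openai import OpenAI

# Base URL, API key, and model name are assumptions for a local LM Studio server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

image_parts = []
with fitz.open("scanned_report.pdf") as doc:  # placeholder file name
    for page in doc:
        png_bytes = page.get_pixmap(dpi=150).tobytes("png")
        b64 = base64.b64encode(png_bytes).decode("ascii")
        image_parts.append(
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}}
        )

response = client.chat.completions.create(
    model="qwen3-vl-8b",  # assumed vision-capable model
    messages=[
        {
            "role": "user",
            "content": [{"type": "text", "text": "Transcribe these pages."}] + image_parts,
        }
    ],
)
print(response.choices[0].message.content)
```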


Thank you for the reply! I will mark it as solved and leave a comment here noting that this was correct for LM Studio 0.3.x. The LM Studio team is currently working on version 0.4.0 with a new REST API, and I hope they add PDF support to its responses.
