Gemini CLI LLM Call Chain Termination - AI Conversation

sam 2025-12-17

Looking at google-gemini/gemini-cli (an open-source AI agent that brings the power of Gemini directly into your terminal): what trick does it use to terminate a chain of LLM calls once the budget is reached?

GitHub_Helper_bot 2025-12-17

GitHub code search
Searched for ‘budget terminate’ in google-gemini/gemini-cli

GitHub code search
Searched for ‘token limit chain’ in google-gemini/gemini-cli

GitHub search files
Searched for ‘budget, token’ in google-gemini/gemini-cli/main

GitHub file content
Retrieved content of packages/core/src/core/tokenLimits.ts, packages/core/src/agents/types.ts from google-gemini/gemini-cli@main

GitHub code search
Searched for ‘max_turns terminate’ in google-gemini/gemini-cli

GitHub code search
Searched for ‘AgentTerminateMode MAX_TURNS’ in google-gemini/gemini-cli

GitHub file content
Retrieved content of packages/core/src/agents/executor.ts from google-gemini/gemini-cli@main

Let me also check how the client handles budget termination in the main chat loop:

GitHub code search
Searched for ‘token_budget’ in google-gemini/gemini-cli

GitHub code search
Searched for ‘infinite loop MAX_TURNS’ in google-gemini/gemini-cli

GitHub file content
Retrieved content of packages/core/src/core/client.ts#L1-L100 from google-gemini/gemini-cli@main

GitHub file content
Retrieved content of packages/core/src/core/client.ts#L450-L550 from google-gemini/gemini-cli@main

Based on my analysis of the gemini-cli repository, they use multiple strategies to terminate chains of LLM calls, but the key “trick” is a combination of:

1. Turn Counter with Hard Limit (`MAX_TURNS = 100`)

In packages/core/src/core/client.ts:

const MAX_TURNS = 100;

The client bounds turns to prevent infinite loops:

// Ensure turns never exceeds MAX_TURNS to prevent infinite loops
const boundedTurns = Math.min(turns, MAX_TURNS);
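
To make the clamping concrete, here is a minimal sketch of a loop driven by a bounded turn budget. Only `MAX_TURNS` and `boundedTurns` come from the snippet above; `runChain` and `Step` are hypothetical names for illustration, not the repository's API.

```typescript
// Illustrative sketch only: runChain and Step are hypothetical names.
const MAX_TURNS = 100;

// A step returns true when the model wants another turn
// (for example, because it issued more tool calls).
type Step = (turnIndex: number) => boolean;

function runChain(requestedTurns: number, step: Step): number {
  // Clamp the caller's budget so it can never exceed the hard limit.
  const boundedTurns = Math.min(requestedTurns, MAX_TURNS);
  let used = 0;
  while (used < boundedTurns) {
    const wantsMore = step(used);
    used += 1;
    if (!wantsMore) break; // The chain ended on its own.
  }
  return used; // Number of turns actually consumed.
}
```

Even if a caller asks for 1000 turns and the model never stops asking for more, the loop exits after 100 iterations.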

2. Agent-Level Termination Modes

The agent executor (packages/core/src/agents/executor.ts) uses an AgentTerminateMode enum:

export enum AgentTerminateMode {
  ERROR = 'ERROR',
  TIMEOUT = 'TIMEOUT',
  GOAL = 'GOAL',
  MAX_TURNS = 'MAX_TURNS',
  ABORTED = 'ABORTED',
  ERROR_NO_COMPLETE_TASK_CALL = 'ERROR_NO_COMPLETE_TASK_CALL',
}

3. Configurable Limits per Agent

Agents define their own limits via RunConfig:

interface RunConfig {
  max_time_minutes: number;  // Timeout in minutes
  max_turns?: number;        // Max conversational turns
}
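
As a rough sketch of how these two limits could drive a loop, assume each turn reports whether the agent finished. The `runAgent` function, the `Turn` type, and the injected `now` clock are assumptions for illustration; the real executor is asynchronous and more involved. The enum is mirrored as a plain object so the sketch runs standalone.

```typescript
// Illustrative sketch only: runAgent, Turn and the injected clock are hypothetical.
const AgentTerminateMode = {
  TIMEOUT: 'TIMEOUT',
  GOAL: 'GOAL',
  MAX_TURNS: 'MAX_TURNS',
} as const;
type AgentTerminateMode =
  (typeof AgentTerminateMode)[keyof typeof AgentTerminateMode];

interface RunConfig {
  max_time_minutes: number; // Timeout in minutes
  max_turns?: number; // Max conversational turns
}

// One turn of the agent; `completed` means it signalled that it is done.
type Turn = () => { completed: boolean };

function runAgent(
  config: RunConfig,
  turn: Turn,
  now: () => number = Date.now,
): AgentTerminateMode {
  const deadline = now() + config.max_time_minutes * 60 * 1000;
  const maxTurns = config.max_turns ?? Infinity; // No turn cap if unset.
  let turns = 0;
  while (true) {
    // Check both budgets before spending another model call.
    if (now() >= deadline) return AgentTerminateMode.TIMEOUT;
    if (turns >= maxTurns) return AgentTerminateMode.MAX_TURNS;
    turns += 1;
    if (turn().completed) return AgentTerminateMode.GOAL;
  }
}
```

Checking the limits at the top of each iteration means a budget that is already exhausted never triggers another model call.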

4. The Clever Bit: Grace Period Recovery

When a limit is hit, instead of hard-stopping, they give the agent one final chance with a 1-minute grace period:

const GRACE_PERIOD_MS = 60 * 1000; // 1 min

The agent receives a final warning message forcing it to call complete_task immediately:

“You have exceeded the maximum number of turns. You have one final chance to complete the task with a short grace period. You MUST call complete_task immediately with your best answer…”

This lets the model gracefully wrap up rather than just cutting off mid-task.
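
The recovery idea can be sketched roughly as follows. Only `GRACE_PERIOD_MS` and the `complete_task` name come from the source; `attemptRecovery`, `FinalTurn`, the shortened warning text, and the choice to report a successful recovery as `GOAL` are assumptions made for this example.

```typescript
// Illustrative sketch only: attemptRecovery and FinalTurn are hypothetical,
// and treating a successful recovery as GOAL is an assumption.
const GRACE_PERIOD_MS = 60 * 1000; // 1 min

type LimitReason = 'TIMEOUT' | 'MAX_TURNS' | 'ERROR_NO_COMPLETE_TASK_CALL';

interface FinalTurn {
  calledCompleteTask: boolean; // Did the model call complete_task?
  finishedAtMs: number; // Timestamp at which the final turn ended.
}

function attemptRecovery(
  reason: LimitReason,
  startedAtMs: number,
  finalTurn: (warning: string) => FinalTurn,
): 'GOAL' | LimitReason {
  // Paraphrased warning; the real message is longer.
  const warning =
    `Limit reached (${reason}). You MUST call complete_task ` +
    'immediately with your best answer.';
  const result = finalTurn(warning);
  const inTime = result.finishedAtMs - startedAtMs <= GRACE_PERIOD_MS;
  // Recovery succeeds only if the model wrapped up within the grace period;
  // otherwise the original termination reason stands.
  return result.calledCompleteTask && inTime ? 'GOAL' : reason;
}
```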

5. Mandatory Completion Tool

Agents must call a complete_task tool to finish. If they stop calling tools without calling this, they hit ERROR_NO_COMPLETE_TASK_CALL and trigger the recovery flow.

Summary: The main trick is turn counting + timeout with a grace period recovery mechanism that prompts the model to wrap up gracefully. There’s no explicit “token budget” termination - they rely on turn limits, time limits, and forcing the model to self-terminate via the complete_task tool.

sam 2025-12-17

wait explain the complete_task tool?

GitHub_Helper_bot 2025-12-17

Looking at the code in packages/core/src/agents/executor.ts, the complete_task tool is a mandatory synthetic tool that gets injected into every agent’s toolset. It’s the only way an agent can signal it’s done.

How It Works

1. Always Injected

In prepareToolsList(), this tool is always added regardless of what other tools the agent has:

const TASK_COMPLETE_TOOL_NAME = 'complete_task';

// Always inject complete_task.
const completeTool: FunctionDeclaration = {
  name: TASK_COMPLETE_TOOL_NAME,
  description: outputConfig
    ? 'Call this tool to submit your final answer and complete the task. This is the ONLY way to finish.'
    : 'Call this tool to signal that you have completed your task. This is the ONLY way to finish.',
  parameters: {
    type: Type.OBJECT,
    properties: {},
    required: [],
  },
};

2. Can Require Structured Output

If the agent has an outputConfig, the tool’s schema is dynamically modified to require that output:

if (outputConfig) {
  // Convert the Zod schema to JSON Schema and attach it as a required parameter.
  const jsonSchema = zodToJsonSchema(outputConfig.schema);
  completeTool.parameters!.properties![outputConfig.outputName] = jsonSchema as Schema;
  completeTool.parameters!.required!.push(outputConfig.outputName);
}

So if an agent is configured to return a “report”, the tool becomes:

complete_task(report: string)  // required parameter
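
A dependency-free sketch of the same idea is below. `ToolDeclaration`, `OutputConfig` (which here carries a ready-made JSON Schema instead of a Zod schema), and `buildCompleteTool` are local stand-ins assumed for illustration, not the types the repository uses.

```typescript
// Illustrative sketch only: these types are local stand-ins, not library types.
interface ToolDeclaration {
  name: string;
  description: string;
  parameters: {
    type: 'OBJECT';
    properties: Record<string, unknown>;
    required: string[];
  };
}

interface OutputConfig {
  outputName: string; // e.g. "report"
  jsonSchema: Record<string, unknown>; // Schema for the final answer.
}

function buildCompleteTool(outputConfig?: OutputConfig): ToolDeclaration {
  const tool: ToolDeclaration = {
    name: 'complete_task',
    description: outputConfig
      ? 'Call this tool to submit your final answer and complete the task.'
      : 'Call this tool to signal that you have completed your task.',
    parameters: { type: 'OBJECT', properties: {}, required: [] },
  };
  if (outputConfig) {
    // Make the configured output a required argument of complete_task.
    tool.parameters.properties[outputConfig.outputName] = outputConfig.jsonSchema;
    tool.parameters.required.push(outputConfig.outputName);
  }
  return tool;
}
```

Calling `buildCompleteTool({ outputName: 'report', jsonSchema: { type: 'string' } })` yields a declaration whose only required parameter is `report`.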

3. Processed Specially

When the model calls complete_task, it’s handled differently from other tools - it’s not executed, it just signals completion:

if (functionCall.name === TASK_COMPLETE_TOOL_NAME) {
  taskCompleted = true;
  
  // Validate output if required
  if (outputConfig) {
    const validationResult = outputConfig.schema.safeParse(outputValue);
    if (!validationResult.success) {
      taskCompleted = false; // Validation failed, revoke completion
      // ... return error to model
    }
  }
  // ...
}
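
The "revoke completion on invalid output" behaviour can be sketched without Zod. `handleCompleteTask`, `FunctionCall`, and `Validator` are hypothetical names, and the hand-written validator stands in for `outputConfig.schema.safeParse`.

```typescript
// Illustrative sketch only: a hand-rolled validator replaces Zod's safeParse.
interface FunctionCall {
  name: string;
  args: Record<string, unknown>;
}

type Validator = (
  value: unknown,
) => { success: true } | { success: false; error: string };

interface CompletionResult {
  taskCompleted: boolean;
  errorForModel?: string; // Sent back so the model can retry.
}

function handleCompleteTask(
  call: FunctionCall,
  outputName?: string,
  validate?: Validator,
): CompletionResult {
  // Only complete_task can signal completion; other tools run normally.
  if (call.name !== 'complete_task') return { taskCompleted: false };
  if (outputName && validate) {
    const result = validate(call.args[outputName]);
    if (!result.success) {
      // Validation failed: completion is revoked and the error is returned.
      return { taskCompleted: false, errorForModel: result.error };
    }
  }
  return { taskCompleted: true };
}
```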

4. Enforced in System Prompt

The agent’s system prompt hammers home that this is mandatory:

finalPrompt += `
* When you have completed your task, you MUST call the \`complete_task\` tool.
* Do not call any other tools in the same turn as \`complete_task\`.
* This is the ONLY way to complete your mission. If you stop calling tools without calling this, you have failed.`;

5. Failure Mode if Not Called

If the model stops generating tool calls without ever calling complete_task, the executor catches this:

// If the model stops calling tools without calling complete_task, it's an error.
if (functionCalls.length === 0) {
  return {
    status: 'stop',
    terminateReason: AgentTerminateMode.ERROR_NO_COMPLETE_TASK_CALL,
    finalResult: null,
  };
}

This triggers the grace period recovery where the model gets one more chance.
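
Put together, the per-turn decision can be sketched as a small classifier. `classifyTurn` and `TurnOutcome` are hypothetical names for illustration; the real executor also executes the other tool calls and validates output before it reports completion.

```typescript
// Illustrative sketch only: classifyTurn and TurnOutcome are hypothetical.
type TurnOutcome =
  | { status: 'continue' }
  | { status: 'stop'; terminateReason: 'GOAL' | 'ERROR_NO_COMPLETE_TASK_CALL' };

function classifyTurn(functionCallNames: string[]): TurnOutcome {
  // No tool calls at all: the model stopped without signalling completion.
  if (functionCallNames.length === 0) {
    return { status: 'stop', terminateReason: 'ERROR_NO_COMPLETE_TASK_CALL' };
  }
  // complete_task is the only clean way to finish.
  if (functionCallNames.includes('complete_task')) {
    return { status: 'stop', terminateReason: 'GOAL' };
  }
  // Any other tool call means the agent is still working.
  return { status: 'continue' };
}
```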

Why This Design?

It’s a forcing function that:

Prevents ambiguous endings - The model can’t just trail off or say “I’m done” in text
Enables structured output - The completion tool can enforce a schema for the final answer
Creates a clean termination signal - The executor knows definitively when the task is complete vs. when something went wrong
Allows output validation - If the model’s answer doesn’t match the expected schema, it can be sent back to try again

It’s essentially a protocol contract: “You have tools, use them, and when you’re done, call this specific tool with your answer.”
