AI Function Calling & Tool Use Guide 2026 - Master LLM Tool Integration

Function calling — also known as tool use — is the mechanism that transforms large language models from text generators into action-taking systems. Instead of just producing words, a model with function calling can request that your application execute specific operations: look up data, call an API, query a database, or run calculations. This guide covers how function calling works across all major LLM providers in 2026, with practical code examples and production patterns you can deploy today.

What Is Function Calling?

At its core, function calling is a structured way for an LLM to communicate intent to your application. You define a set of tools (functions) with their names, descriptions, and parameter schemas. When the model decides it needs to call a tool, it outputs a structured JSON object specifying which function to call and with what arguments. Your application executes the function and returns the result, which the model uses to continue the conversation.

The flow always involves these steps:

You send a message to the model along with available tool definitions
The model responds with a tool call (or a regular text response)
Your application executes the tool and returns the output
The model processes the tool output and generates a final response

This is not the model executing code itself. The model only decides what to call and with what arguments. Your application is always in control of execution.
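The four steps above can be sketched without any provider SDK. In this minimal sketch, `fake_model` is a hypothetical stand-in for a real LLM call and `get_weather` returns canned data; both are assumptions made for illustration, not part of any provider's API.

```python
import json

def get_weather(location):
    # Hypothetical tool: returns canned data instead of calling a weather API.
    return {"location": location, "temp_c": 18}

TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    # Stand-in for an LLM call. It requests a tool on the first turn and
    # answers in plain text once a tool result is present in the history.
    last = messages[-1]
    if last["role"] == "tool":
        data = json.loads(last["content"])
        return {"type": "text",
                "content": f"It is {data['temp_c']} C in {data['location']}."}
    return {"type": "tool_call", "name": "get_weather",
            "arguments": json.dumps({"location": "Tokyo"})}

def run_turn(user_message):
    messages = [{"role": "user", "content": user_message}]   # step 1: send message
    reply = fake_model(messages)                             # step 2: model responds
    while reply["type"] == "tool_call":
        args = json.loads(reply["arguments"])
        result = TOOLS[reply["name"]](**args)                # step 3: application executes
        messages.append({"role": "tool", "content": json.dumps(result)})
        reply = fake_model(messages)                         # step 4: model uses the output
    return reply["content"]

answer = run_turn("What's the weather in Tokyo?")
print(answer)
```

The model stand-in only ever emits a name and JSON arguments; the lookup in `TOOLS` and the call itself happen in application code, which is where access control and validation belong.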

OpenAI Function Calling

OpenAI's function calling API has evolved significantly. As of 2026, the Responses API (client.responses.create in the Python SDK) is the recommended interface, with support for parallel tool calls, structured outputs, and a tool search feature for managing large tool sets.

Basic Tool Definition

Tools are defined using JSON Schema for their parameters. Here's a simple example:

from openai import OpenAI
import json

client = OpenAI()

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["location"]
        }
    }
]

response = client.responses.create(
    model="gpt-5",
    tools=tools,
    input="What's the weather in Tokyo?"
)

Handling Tool Calls

When the model returns tool calls, you execute them and provide the results back:

input_list = [{"role": "user", "content": "What's the weather in Tokyo?"}]
response = client.responses.create(
    model="gpt-5",
    tools=tools,
    input=input_list
)

input_list += response.output

for item in response.output:
    if item.type == "function_call":
        # Run your own implementation of the tool (get_weather is defined by your application)
        args = json.loads(item.arguments)
        result = get_weather(args["location"], args.get("unit", "celsius"))
        
        # Return the result to the model, serialized as a string
        input_list.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": json.dumps(result)
        })

# Get final response
final_response = client.responses.create(
    model="gpt-5",
    tools=tools,
    input=input_list
)
print(final_response.output_text)

Structured Outputs with Strict Mode

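For more reliable function calling, enable strict mode, which constrains the model's arguments to match your schema exactly. Strict mode requires every property to be listed in required and additionalProperties to be set to false; optional fields are expressed as nullable types:

tools = [
    {
        "type": "function",
        "name": "search_products",
        "description": "Search product catalog",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                # Optional fields are declared nullable rather than left out of "required"
                "category": {"type": ["string", "null"]},
                "max_price": {"type": ["number", "null"]},
                "in_stock_only": {"type": ["boolean", "null"]}
            },
            "required": ["query", "category", "max_price", "in_stock_only"],
            "additionalProperties": False
        },
        "strict": True  # Constrains arguments to the schema
    }
]

With strict mode, the arguments the model produces are valid JSON that conforms to your schema, which removes most defensive parsing. It does not guarantee that the values make sense for your application, so semantic validation is still worthwhile.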
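Existing schemas often need adjusting before they qualify for strict mode, since OpenAI's documentation requires additionalProperties to be false and every property to appear in required. The helper below is a hypothetical sketch (`to_strict_schema` is not part of the OpenAI SDK) that handles only flat object schemas with single string types; nested objects and arrays would need a recursive version.

```python
import copy

def to_strict_schema(schema):
    """Return a copy of a flat object schema adjusted for strict mode."""
    strict = copy.deepcopy(schema)
    props = strict.get("properties", {})
    originally_required = set(strict.get("required", []))
    for name, spec in props.items():
        if name not in originally_required:
            declared = spec.get("type")
            # Optional fields become nullable so the model can pass null for them.
            if isinstance(declared, str) and declared != "null":
                spec["type"] = [declared, "null"]
    strict["required"] = list(props)          # every property must be listed
    strict["additionalProperties"] = False    # no undeclared keys allowed
    return strict

loose = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "max_price": {"type": "number"},
    },
    "required": ["query"],
}
strict_schema = to_strict_schema(loose)
print(strict_schema)
```

Because the helper works on a deep copy, the original schema stays usable for providers that do not support strict mode.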

Tool Search for Large Tool Sets

GPT-5.4 and later models support tool search, which lets you defer rarely-used tools and load them only when needed. This is essential when your application has dozens or hundreds of tools:

tools = [
    # Frequently used tools defined directly
    {
        "type": "function",
        "name": "search_knowledge_base",
        "description": "Search internal knowledge base",
        "parameters": { ... }
    }
]

# Use tool_search for deferred loading
response = client.responses.create(
    model="gpt-5.4",  # tool search requires GPT-5.4 or later
    tools=tools,
    tool_search={
        "enabled": True,
        "max_results": 5
    },
    input="Find our refund policy"
)

Anthropic Tool Use (Claude)

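Anthropic's Claude models support tool use through the tools parameter of the Messages API. The approach is similar, but each tool's schema goes under input_schema rather than parameters, tool requests arrive as tool_use content blocks, and results are returned as tool_result blocks in a user message: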

import anthropic

client = anthropic.Anthropic()

# Define the tools once so the same list can be reused on the follow-up request
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }
]
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

# Collect a tool_result for every tool_use block in the response
tool_results = []
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"Input: {block.input}")

        # Execute your own implementation of the tool
        tool_result = get_weather(block.input["location"])
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(tool_result)
        })

# Continue the conversation, returning all tool results in a single user message
if tool_results:
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
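Because the OpenAI and Anthropic formats carry the same information under different keys, a tool defined once can be translated mechanically. The sketch below assumes the OpenAI-style definition used earlier in this guide (top-level name, description, and parameters); `to_anthropic_tool` is a hypothetical helper, not part of either SDK.

```python
def to_anthropic_tool(openai_tool):
    """Translate an OpenAI-style function tool into Anthropic's tool format."""
    return {
        "name": openai_tool["name"],
        "description": openai_tool.get("description", ""),
        # Anthropic expects the JSON Schema under "input_schema".
        "input_schema": openai_tool["parameters"],
    }

openai_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}
anthropic_tool = to_anthropic_tool(openai_tool)
print(anthropic_tool)
```

Provider-specific keys such as "type" and "strict" are dropped rather than copied, since this sketch does not assume the target API accepts them.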

Claude's Computer Use

Claude also supports computer use, a capability that goes beyond traditional function calling. With computer use, Claude can interact with a computer interface (clicking, typing, and scrolling) through structured tool calls. This is available on Claude 3.5 Sonnet and later models and is particularly useful for UI automation and testing.

Google Gemini Function Calling

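Gemini's function calling follows the same declare, call, and respond pattern as OpenAI's, with function declarations written in an OpenAPI-style schema. Google also offers an OpenAI-compatible endpoint, which eases switching providers. The example below uses the google.generativeai Python SDK, which has its own request and response types: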

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Define functions
get_weather_func = genai.protos.FunctionDeclaration(
    name="get_weather",
    description="Get current weather for a location",
    parameters={
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
    }
)

model = genai.GenerativeModel(
    model_name="gemini-2.5-pro",
    tools=[{"function_declarations": [get_weather_func]}]
)

chat = model.start_chat()
response = chat.send_message("What's the weather in Berlin?")

# Check for function call
if response.candidates[0].content.parts[0].function_call:
    fc = response.candidates[0].content.parts[0].function_call
    print(f"Function: {fc.name}")
    print(f"Args: {dict(fc.args)}")
    
    # Execute and return
    result = get_weather(dict(fc.args)["location"])
    response = chat.send_message(
        genai.protos.Content(
            parts=[genai.protos.Part(
                function_response=genai.protos.FunctionResponse(
                    name="get_weather",
                    response={"result": result}
                )
            )]
        )
    )

Provider Comparison

Feature	OpenAI (GPT-5)	Anthropic (Claude)	Google (Gemini 2.5)
Parallel tool calls	Yes	Yes	Yes
Structured outputs	Strict mode	JSON mode	JSON mode
Tool search / deferred	Yes (GPT-5.4+)	Yes (tool search tool)	No
Max tools per request	128+	128	64
Computer use	Yes (computer use tool, preview)	Yes	Yes (Gemini 2.5 Computer Use model)
Forced tool choice	Yes	Yes	Yes
Streaming tool calls	Yes	Yes	Yes

Parallel Tool Calls

One of the most powerful features in 2026 is parallel tool calling. When a user query requires multiple independent tool calls, the model can request them all at once instead of sequentially. This dramatically reduces latency for complex queries.

# Example: User asks "Compare weather in NYC and London"
# The model may return TWO tool calls in a single response

input_list = [{"role": "user", "content": "Compare weather in NYC and London"}]
response = client.responses.create(
    model="gpt-5",
    tools=tools,
    input=input_list
)
input_list += response.output  # keep the model's function_call items in the history

# Process all tool calls
for item in response.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)
        result = get_weather(args["location"])
        input_list.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": json.dumps(result)
        })

# Send every result back in one follow-up request
final_response = client.responses.create(
    model="gpt-5",
    tools=tools,
    input=input_list
)

Key point: when processing parallel tool calls, you must return all tool results before making the next API call. Don't make separate follow-up calls for each tool result.
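Since parallel calls are independent, the application can also execute them concurrently before returning the batch. The sketch below uses a thread pool from the standard library; the call dictionaries mimic the call_id, name, and arguments fields shown above, and `get_weather` is a hypothetical handler that sleeps to simulate network latency.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor

def get_weather(location):
    time.sleep(0.1)  # simulate a slow upstream API
    return {"location": location, "temp_c": 20}

HANDLERS = {"get_weather": get_weather}

def run_calls_concurrently(calls):
    """Execute every tool call in a thread pool and return all outputs together."""
    def run_one(call):
        args = json.loads(call["arguments"])
        result = HANDLERS[call["name"]](**args)
        return {
            "type": "function_call_output",
            "call_id": call["call_id"],
            "output": json.dumps(result),
        }
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        # pool.map preserves the order of the input calls.
        return list(pool.map(run_one, calls))

calls = [
    {"call_id": "call_1", "name": "get_weather",
     "arguments": json.dumps({"location": "NYC"})},
    {"call_id": "call_2", "name": "get_weather",
     "arguments": json.dumps({"location": "London"})},
]
outputs = run_calls_concurrently(calls)
print(outputs)
```

Threads suit I/O-bound tools such as HTTP or database calls; the whole list of outputs is then appended to the input before the single follow-up request.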

Error Handling Patterns

Function calling in production requires robust error handling. Here are the key patterns:

1. Tool Execution Errors

When a tool execution fails, don't crash — report the error back to the model so it can recover:

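# tool_registry maps tool names to handler functions
def execute_tool_safely(tool_name, args):
    try:
        result = tool_registry[tool_name](**args)
        return {"status": "success", "data": result}
    except Exception as e:
        # Covers unknown tool names (KeyError) as well as failures inside the handler
        return {"status": "error", "message": str(e)}

# Return the error to the model so it can try an alternative approach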
input_list.append({
    "type": "function_call_output",
    "call_id": item.call_id,
    "output": json.dumps(execute_tool_safely(item.name, args))
})

2. Schema Validation

Even with strict mode, validate tool arguments on your side:

from jsonschema import validate, ValidationError

def validate_tool_args(tool_name, args, schema):
    try:
        validate(instance=args, schema=schema)
        return True
    except ValidationError as e:
        print(f"Invalid args for {tool_name}: {e.message}")
        return False

3. Timeout Protection

Tool executions should always have timeouts to prevent hanging:

import asyncio

async def execute_with_timeout(tool_func, args, timeout=10):
    try:
        result = await asyncio.wait_for(
            tool_func(**args),
            timeout=timeout
        )
        return result
    except asyncio.TimeoutError:
        return {"error": f"Tool execution timed out after {timeout}s"}
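A short usage sketch shows both outcomes of the timeout wrapper. The two lookup coroutines are hypothetical tools invented for this example, and the wrapper is repeated so the block runs on its own.

```python
import asyncio

async def execute_with_timeout(tool_func, args, timeout=10):
    try:
        return await asyncio.wait_for(tool_func(**args), timeout=timeout)
    except asyncio.TimeoutError:
        return {"error": f"Tool execution timed out after {timeout}s"}

async def slow_lookup(query):
    await asyncio.sleep(1)  # deliberately longer than the timeout used below
    return {"query": query, "hits": 3}

async def fast_lookup(query):
    return {"query": query, "hits": 3}

async def main():
    slow = await execute_with_timeout(slow_lookup, {"query": "refunds"}, timeout=0.05)
    fast = await execute_with_timeout(fast_lookup, {"query": "refunds"}, timeout=0.05)
    return slow, fast

slow_result, fast_result = asyncio.run(main())
print(slow_result)
print(fast_result)
```

The wrapper expects coroutine functions. A synchronous handler would need wrapping first, for example with asyncio.to_thread, in which case the timeout stops the wait but does not stop the underlying thread.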

Production Patterns

Tool Registry Pattern

Centralize tool definitions and handlers in a registry for clean architecture:

class ToolRegistry:
    def __init__(self):
        self.tools = {}
        self.handlers = {}
    
    def register(self, name, description, parameters, handler):
        self.tools[name] = {
            "type": "function",
            "name": name,
            "description": description,
            "parameters": parameters
        }
        self.handlers[name] = handler
    
    def get_tool_definitions(self):
        return list(self.tools.values())
    
    def execute(self, name, args):
        return self.handlers[name](**args)

# Usage
registry = ToolRegistry()
registry.register(
    name="search_docs",
    description="Search internal documentation",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "default": 5}
        },
        "required": ["query"]
    },
    handler=search_documentation
)

Conversation State Management

In a multi-turn tool use conversation, maintain state carefully:

class ToolConversation:
    def __init__(self, client, model, registry):
        self.client = client
        self.model = model
        self.registry = registry
        self.input_list = []
        self.max_tool_rounds = 5  # Prevent infinite loops
    
    def run(self, user_message):
        # Synchronous on purpose: the client call and ToolRegistry.execute both block
        self.input_list.append({"role": "user", "content": user_message})
        
        for round_num in range(self.max_tool_rounds):
            response = self.client.responses.create(
                model=self.model,
                tools=self.registry.get_tool_definitions(),
                input=self.input_list
            )
            
            self.input_list += response.output
            
            # Check if model wants to call tools
            tool_calls = [item for item in response.output 
                         if item.type == "function_call"]
            
            if not tool_calls:
                return response.output_text
            
            # Execute all tool calls
            for item in tool_calls:
                args = json.loads(item.arguments)
                result = self.registry.execute(item.name, args)
                self.input_list.append({
                    "type": "function_call_output",
                    "call_id": item.call_id,
                    "output": json.dumps(result)
                })
        
        return "Maximum tool call rounds exceeded"
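A round cap stops a runaway loop eventually, but a model that keeps issuing the same call can be caught sooner. The sketch below is one possible guard; `LoopGuard` and its `max_repeats` threshold are hypothetical names chosen for illustration.

```python
import json

class LoopGuard:
    """Counts identical (tool name, arguments) pairs within one conversation."""

    def __init__(self, max_repeats=2):
        self.max_repeats = max_repeats
        self.counts = {}

    def should_stop(self, name, arguments_json):
        # Canonicalize the arguments so key order does not hide a duplicate.
        canonical = json.dumps(json.loads(arguments_json), sort_keys=True)
        key = (name, canonical)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] > self.max_repeats

guard = LoopGuard(max_repeats=2)
first = guard.should_stop("search_docs", '{"query": "refund", "limit": 5}')
second = guard.should_stop("search_docs", '{"limit": 5, "query": "refund"}')
third = guard.should_stop("search_docs", '{"query": "refund", "limit": 5}')
other = guard.should_stop("search_docs", '{"query": "returns", "limit": 5}')
print(first, second, third, other)
```

Inside a tool loop, a guard like this would be consulted before each execution; when it fires, returning an error output to the model is usually gentler than aborting the conversation.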

Structured Outputs Beyond Tool Calling

Function calling isn't just for executing actions — it's also a powerful pattern for extracting structured data from unstructured text. You can define "tools" that the model fills in with extracted information:

extraction_tools = [
    {
        "type": "function",
        "name": "extract_customer_info",
        "description": "Extract customer information from text",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                # Fields that may be absent from the text are nullable
                "email": {"type": ["string", "null"]},
                "phone": {"type": ["string", "null"]},
                "intent": {"type": "string", "enum": ["purchase", "support", "complaint", "inquiry"]}
            },
            # Strict mode requires every property in "required" and no extra keys
            "required": ["name", "email", "phone", "intent"],
            "additionalProperties": False
        },
        "strict": True
    }
]

response = client.responses.create(
    model="gpt-5",
    tools=extraction_tools,
    tool_choice={"type": "function", "name": "extract_customer_info"},
    input="Hi, I'm Sarah Johnson. My email is sarah@example.com. I need help with my recent order."
)

This pattern is extremely useful for form auto-filling, data extraction, and intent classification in production applications.
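Once the forced tool call comes back, its arguments can be loaded into a typed object for the rest of the application. In this sketch, `sample_arguments` is a hand-written stand-in for the arguments string of the returned function call, not real model output, and `CustomerInfo` is a hypothetical container.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomerInfo:
    name: str
    intent: str
    email: Optional[str] = None
    phone: Optional[str] = None

def parse_customer_info(arguments_json):
    """Turn the JSON arguments of an extract_customer_info call into a dataclass."""
    data = json.loads(arguments_json)
    return CustomerInfo(
        name=data["name"],
        intent=data["intent"],
        email=data.get("email"),   # missing or null becomes None
        phone=data.get("phone"),
    )

sample_arguments = (
    '{"name": "Sarah Johnson", "email": "sarah@example.com", '
    '"phone": null, "intent": "support"}'
)
info = parse_customer_info(sample_arguments)
print(info)
```

Using .get for the optional fields lets the same parser handle both omitted keys and explicit nulls.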

MCP vs Function Calling

The Model Context Protocol (MCP) is an emerging standard that builds on the function calling concept but adds a standardized protocol layer. Here's how they relate:

Function calling is an API-level feature — you define tools per request
MCP is a protocol standard — tools are hosted as MCP servers and discovered dynamically
MCP servers expose tools that any MCP-compatible client can use
Think of function calling as the mechanism, MCP as the interoperability standard

For most applications in 2026, direct function calling is sufficient and simpler. MCP becomes valuable when you need tool interoperability across different applications and platforms.
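The relationship is visible in the data itself: an MCP server describes each tool with a name, a description, and an inputSchema, which maps directly onto a function calling definition. The converter below is a hypothetical helper (not part of any MCP or OpenAI SDK), and the sample descriptor is written by hand rather than fetched from a server.

```python
def mcp_tool_to_function_tool(mcp_tool):
    """Map an MCP tool descriptor onto an OpenAI-style function tool definition."""
    return {
        "type": "function",
        "name": mcp_tool["name"],
        "description": mcp_tool.get("description", ""),
        # MCP calls the JSON Schema "inputSchema"; function calling calls it "parameters".
        "parameters": mcp_tool["inputSchema"],
    }

# Hand-written example of what an MCP server might list for one tool.
mcp_descriptor = {
    "name": "search_docs",
    "description": "Search internal documentation",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}
function_tool = mcp_tool_to_function_tool(mcp_descriptor)
print(function_tool)
```

An MCP client performs a translation of this kind after discovering tools, then forwards the model's tool calls back to the server that owns them.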

Best Practices

Write clear tool descriptions — The model relies on descriptions to decide when and how to use a tool. Be specific about what the tool does and when it should be used.
Use strict mode when available — It guarantees schema compliance and eliminates parsing errors.
Keep tool sets focused — Don't overload the model with too many tools. If you have more than 20, use tool search or group them by context.
Always set max rounds — Prevent infinite tool call loops by capping the number of rounds.
Validate all inputs — Never trust model-generated arguments blindly. Validate before execution.
Log tool calls — Track which tools are called, how often, and with what arguments. This data is invaluable for debugging and optimization.
Return errors to the model — When a tool fails, inform the model so it can try a different approach rather than leaving the user hanging.
Design idempotent tools — Tool execution may be retried. Design tools so that calling them twice with the same arguments produces the same result.
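The logging practice can be sketched as a thin wrapper around handler execution. The names here (`logged_call`, the "tool_calls" logger, and the in-memory `ListHandler`) are assumptions for illustration; a real deployment would route records to its own log sink and redact sensitive arguments.

```python
import json
import logging
import time

logger = logging.getLogger("tool_calls")
logger.setLevel(logging.INFO)

class ListHandler(logging.Handler):
    """Keeps formatted messages in memory; a stand-in for a real log sink."""
    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())

def logged_call(name, handler, args):
    """Run a tool handler and log its name, outcome, duration, and arguments."""
    start = time.perf_counter()
    status = "error"
    try:
        result = handler(**args)
        status = "ok"
        return result
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("tool=%s status=%s ms=%.1f args=%s",
                    name, status, elapsed_ms, json.dumps(args, sort_keys=True))

sink = ListHandler()
logger.addHandler(sink)

def add(a, b):
    return a + b

total = logged_call("add", add, {"a": 1, "b": 2})
print(sink.lines[0])
```

The finally block ensures a record is written even when the handler raises, so failed calls show up in the same log as successful ones.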

Common Pitfalls

Pitfall	Solution
Vague tool descriptions leading to wrong tool selection	Write detailed descriptions with examples of when to use the tool
Tool calling infinite loop	Set max rounds and detect repeated identical calls
Missing required parameters in tool call arguments	Use strict mode and validate server-side
Slow tool execution blocking the conversation	Add timeouts and async execution
Model hallucinating tool calls that don't exist	Use tool_choice to constrain, and validate call names
Tool results too large for context window	Summarize or truncate results before returning
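Two of these pitfalls, oversized results and calls to tools that do not exist, can be handled with small guards in the dispatch path. The helpers below are hypothetical sketches; the 2000-character default is an arbitrary illustration, and character counts are only a rough proxy for tokens.

```python
import json

def truncate_output(result, max_chars=2000):
    """Serialize a tool result and cut it down if it exceeds max_chars."""
    text = result if isinstance(result, str) else json.dumps(result)
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return text[:max_chars] + f"... [truncated {omitted} characters]"

def resolve_handler(name, handlers):
    """Return (handler, None) for a known tool, or (None, error_json) otherwise."""
    handler = handlers.get(name)
    if handler is None:
        known = ", ".join(sorted(handlers))
        error = {"status": "error",
                 "message": f"Unknown tool '{name}'. Available tools: {known}"}
        return None, json.dumps(error)
    return handler, None

handlers = {"get_weather": lambda location: {"location": location}}
short = truncate_output("abc", max_chars=4)
long = truncate_output("a" * 10, max_chars=4)
handler, error = resolve_handler("get_wether", handlers)
print(short, long, error)
```

Returning the list of available tools in the error output gives the model enough information to correct a misspelled or invented name on its next turn.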

Conclusion

Function calling is the bridge between LLMs and the real world. In 2026, all major providers — OpenAI, Anthropic, and Google — support robust tool use with parallel calls, structured outputs, and streaming. The key to success is treating function calling as a system design problem, not just an API call. Invest in clear tool descriptions, robust error handling, and a clean architecture like the tool registry pattern. Your users will benefit from more capable, reliable AI applications.