AI Structured Outputs & JSON Mode Guide 2026
Get reliable, schema-compliant JSON from any LLM. OpenAI Structured Outputs, Anthropic JSON mode, Google controlled generation, and production patterns.
One of the biggest headaches in building LLM applications is parsing the model's output. You ask for JSON, and sometimes you get it — wrapped in markdown code blocks, with missing fields, or with hallucinated values that don't match your schema. Structured Outputs is the solution: it guarantees the model always returns valid JSON that conforms to your specified schema. This guide covers every provider's approach to structured outputs in 2026, with practical code examples and production patterns.
The Problem: Why Structured Outputs Matter
Without structured outputs, getting reliable JSON from an LLM is a fragile process:
- The model wraps JSON in ```json ... ``` markdown blocks
- Required fields are sometimes omitted
- Enum values are hallucinated (e.g., "priority": "urgent" when only "high", "medium", "low" are valid)
- Nested objects have inconsistent structures
- The model adds conversational text before or after the JSON
In production, even a 1% JSON parsing failure rate means hundreds of broken requests per day at scale. Structured Outputs eliminates this class of errors entirely.
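To make the fragility concrete, here is a minimal sketch of the defensive parsing an application tends to need without structured outputs. The function name `parse_llm_json`, the required fields, and the allowed priority values are illustrative assumptions, not part of any provider SDK.

```python
import json
import re

# Hypothetical expectations for a ticket-triage response.
REQUIRED_FIELDS = {"title", "priority"}
ALLOWED_PRIORITIES = {"high", "medium", "low"}
FENCE = "`" * 3  # a markdown code fence, built here to keep this example readable

def parse_llm_json(raw: str) -> dict:
    """Best-effort parse of free-form model output into a checked dict."""
    # Unwrap a markdown code fence if the model added one.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, raw, re.DOTALL)
    text = fenced.group(1) if fenced else raw
    # Drop conversational text before the first "{" and after the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(text[start:end + 1])
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    if data["priority"] not in ALLOWED_PRIORITIES:
        raise ValueError(f"invalid priority: {data['priority']!r}")
    return data
```

Every branch here is a failure mode that schema enforcement on the provider side is meant to remove.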
OpenAI Structured Outputs
OpenAI offers the most robust structured outputs implementation, available through two mechanisms: function calling with strict mode, and a JSON Schema response format (text_format in the Responses API, response_format with json_schema in Chat Completions).
Method 1: Structured Outputs via Response Format
Use this when you want the model's response to follow a schema — not to call a function:
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class MovieReview(BaseModel):
    title: str
    year: int
    rating: float
    genre: list[str]
    summary: str
    recommended: bool

response = client.responses.parse(
    model="gpt-5",
    input=[
        {"role": "system", "content": "Extract movie review information."},
        {"role": "user", "content": "I watched Inception (2010). Mind-blowing film! 9/10."}
    ],
    text_format=MovieReview,
)

review = response.output_parsed
print(review.title)        # "Inception"
print(review.year)         # 2010
print(review.rating)       # 9.0
print(review.recommended)  # True
The Pydantic model is automatically converted to a JSON Schema, and the model is guaranteed to produce output matching that schema. If it can't (e.g., due to a safety refusal), you get an explicit refusal object instead of broken JSON.
Method 2: Direct JSON Schema (without Pydantic)
import json

response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "List 3 programming languages"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "language_list",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "languages": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "paradigm": {"type": "string", "enum": ["imperative", "functional", "logic", "object-oriented"]},
                                "year_created": {"type": "integer"}
                            },
                            "required": ["name", "paradigm", "year_created"],
                            "additionalProperties": False
                        }
                    }
                },
                "required": ["languages"],
                "additionalProperties": False
            }
        }
    }
)

result = json.loads(response.choices[0].message.content)
print(result["languages"][0]["name"])  # e.g. "Python"
Strict Mode Requirements
When using strict mode, your JSON Schema must follow these rules:
- All fields must be listed in required
- additionalProperties must be false on all objects (False in a Python dict)
- Nested objects must also follow these rules recursively
- Optional fields stay in required and use a union type that includes null
# Making a field optional with strict mode
"optional_field": {
"type": ["string", "null"],
"description": "This field is optional"
}
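The rules above are mechanical enough to apply in code. The following is a minimal sketch, assuming schemas are plain Python dicts; `make_strict` is a hypothetical helper, not an SDK function. It marks every property as required, disallows extra keys, and widens previously optional fields to accept null.

```python
import copy

def make_strict(schema: dict) -> dict:
    """Return a copy of `schema` with the strict-mode rules applied recursively."""
    schema = copy.deepcopy(schema)

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                originally_required = set(node.get("required", []))
                for name, prop in node["properties"].items():
                    # Fields that were optional become nullable instead.
                    if (name not in originally_required
                            and isinstance(prop, dict)
                            and isinstance(prop.get("type"), str)):
                        prop["type"] = [prop["type"], "null"]
                # Every property must be listed in `required`.
                node["required"] = list(node["properties"])
                # No undeclared keys are allowed.
                node["additionalProperties"] = False
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(schema)
    return schema
```

The helper only covers the rules listed in this section; keywords a provider does not support would still need to be removed by hand.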
Handling Refusals
When the model refuses to answer (safety filter), you get a structured refusal instead of broken output:
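# MySchema is a placeholder for any Pydantic model
response = client.responses.parse(
    model="gpt-5",
    input="How to hack a server",
    text_format=MySchema,
)

if response.output_parsed is None:
    # Nothing was parsed: look for a refusal part inside the message items
    for item in response.output:
        if item.type != "message":
            continue
        for part in item.content:
            if part.type == "refusal":
                print(f"Refused: {part.refusal}")
else:
    # Normal structured output
    data = response.output_parsed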
Anthropic JSON Mode
Anthropic's approach to structured output is through tool use or explicit JSON prompting. Claude doesn't have native JSON Schema enforcement like OpenAI, but you can achieve reliable results with the right patterns:
Approach 1: Tool Use for Structured Extraction
Define a tool with your desired schema and force the model to use it:
import anthropic
import json

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tool_choice={"type": "tool", "name": "extract_info"},
    tools=[
        {
            "name": "extract_info",
            "description": "Extract structured information from text",
            "input_schema": {
                "type": "object",
                "properties": {
                    "entities": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "type": {"type": "string", "enum": ["person", "organization", "location"]},
                                "mentioned_in_context": {"type": "string"}
                            },
                            "required": ["name", "type"]
                        }
                    },
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "summary": {"type": "string"}
                },
                "required": ["entities", "sentiment", "summary"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "Apple announced Tim Cook will visit their new Austin campus next week."}
    ]
)

for block in response.content:
    if block.type == "tool_use":
        data = block.input  # Already a dict!
        print(json.dumps(data, indent=2))
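This is the most reliable way to get structured output from Claude: tool_choice forces the model to call the tool, and the tool input generally follows your schema. Because the schema is not strictly enforced, validate the returned input before using it.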
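Since the comparison table in this guide lists Anthropic's enum validation as best-effort, a lightweight check of the tool input can catch the occasional miss. The sketch below is a hypothetical helper (`check_against_schema`) that covers only the `required` and `enum` keywords; a full validator such as the `jsonschema` package handles far more.

```python
def check_against_schema(value, schema: dict, path: str = "$") -> list[str]:
    """Collect violations of `required` and `enum` constraints, recursively."""
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in {schema['enum']}")
    if schema.get("type") == "object" and isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: missing required field")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                errors.extend(check_against_schema(value[name], sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(value, list):
        for i, item in enumerate(value):
            errors.extend(check_against_schema(item, schema.get("items", {}), f"{path}[{i}]"))
    return errors
```

An empty list means the input passed both checks; otherwise each entry names the path that failed, which is convenient for logging or for a retry prompt.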
Approach 2: JSON Mode with Prefill
Use the assistant prefill trick to force JSON output:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """Extract the following as JSON:
{"name": string, "age": number, "occupation": string}
Text: Sarah is a 32-year-old software engineer."""
        },
        {
            "role": "assistant",
            "content": "{"  # Prefill forces JSON output
        }
    ]
)

# The response continues from "{"
result = json.loads("{" + response.content[0].text)
print(result["name"])  # "Sarah"
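With prefill, the model can still append text after the closing brace, which makes a plain `json.loads` fail. A minimal sketch of a more tolerant reassembly step, using the standard library's `JSONDecoder.raw_decode`; the helper name `parse_prefilled` is an assumption made for illustration.

```python
import json

def parse_prefilled(continuation: str, prefill: str = "{") -> dict:
    """Rebuild the JSON object from a prefilled response, ignoring trailing text."""
    text = prefill + continuation
    # raw_decode parses one JSON value from the start of the string and reports
    # where it ended, so anything after the closing brace is simply not read.
    obj, _end = json.JSONDecoder().raw_decode(text)
    return obj
```

For example, `parse_prefilled(response.content[0].text)` would replace the string concatenation in the snippet above.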
Google Gemini Controlled Generation
Gemini supports structured output through response schema configuration:
import json

import google.generativeai as genai
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool
    category: str

# Using Pydantic model directly
model = genai.GenerativeModel('gemini-2.5-pro')
response = model.generate_content(
    "Extract product info: The Widget Pro costs $29.99 and is currently in stock in the electronics category.",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=Product
    )
)

product = json.loads(response.text)
print(product["name"])   # "Widget Pro"
print(product["price"])  # 29.99
Gemini with Raw JSON Schema
response = model.generate_content(
    "List 3 fruits with their color and taste",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "color": {"type": "string"},
                    "taste": {"type": "string", "enum": ["sweet", "sour", "bitter", "umami"]}
                },
                "required": ["name", "color", "taste"]
            }
        }
    )
)
Provider Comparison
| Feature | OpenAI | Anthropic | Google Gemini |
|---|---|---|---|
| Schema enforcement | Guaranteed (strict mode) | Via tool use | Guaranteed (response_schema) |
| Pydantic support | Native SDK | No | Native SDK |
| Enum validation | Guaranteed | Best-effort | Guaranteed |
| Explicit refusals | Yes | No | No |
| Streaming support | Yes | Via tool streaming | Limited |
| Optional fields | Via nullable | Naturally optional | Via optional |
| Max schema complexity | Large | Large | Moderate |
Production Patterns
1. Multi-Provider Abstraction
If you need structured outputs across multiple providers, abstract the interface:
import asyncio
from abc import ABC, abstractmethod
from typing import Type

import anthropic
from openai import AsyncOpenAI
from pydantic import BaseModel

class StructuredOutputProvider(ABC):
    @abstractmethod
    async def extract(self, text: str, schema: Type[BaseModel], prompt: str) -> BaseModel:
        pass

class OpenAIProvider(StructuredOutputProvider):
    def __init__(self, model="gpt-5"):
        # Async client, so extract() does not block the event loop
        self.client = AsyncOpenAI()
        self.model = model

    async def extract(self, text, schema, prompt):
        response = await self.client.responses.parse(
            model=self.model,
            input=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": text}
            ],
            text_format=schema,
        )
        return response.output_parsed

class AnthropicProvider(StructuredOutputProvider):
    def __init__(self, model="claude-sonnet-4-20250514"):
        self.client = anthropic.AsyncAnthropic()
        self.model = model

    async def extract(self, text, schema, prompt):
        # Convert Pydantic schema to tool schema
        json_schema = schema.model_json_schema()
        response = await self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            tool_choice={"type": "tool", "name": "extract"},
            tools=[{
                "name": "extract",
                "description": prompt,
                "input_schema": json_schema
            }],
            messages=[{"role": "user", "content": text}]
        )
        for block in response.content:
            if block.type == "tool_use":
                return schema(**block.input)
        raise ValueError("Model did not return a tool_use block")

# Usage (StockInfo is an illustrative schema)
class StockInfo(BaseModel):
    ticker: str
    change_percent: float
    price: float

async def main():
    provider = OpenAIProvider()
    result = await provider.extract(
        "Apple stock rose 5% to $198",
        StockInfo,
        "Extract stock information"
    )
    print(result)

asyncio.run(main())
2. Fallback Chain
If strict mode fails (e.g., schema too complex), fall back to JSON mode:
import json

def get_structured_output(client, model, messages, schema):
    # Try structured outputs first
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            response_format={
                "type": "json_schema",
                "json_schema": {
                    "name": "response",
                    "strict": True,
                    "schema": schema
                }
            }
        )
        return json.loads(response.choices[0].message.content)
    except Exception as e:
        print(f"Structured output failed: {e}")

    # Fallback to JSON mode. It only guarantees syntactically valid JSON, so the
    # schema goes into the prompt; the messages must also mention "JSON".
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": f"Reply with a JSON object matching this schema: {json.dumps(schema)}"}
            ] + messages,
            response_format={"type": "json_object"}
        )
        return json.loads(response.choices[0].message.content)
    except Exception as e:
        print(f"JSON mode also failed: {e}")
        return None
3. Validation Layer
Even with guaranteed structured outputs, add a validation layer for business logic:
from pydantic import BaseModel, field_validator

class OrderExtraction(BaseModel):
    items: list[str]
    total: float
    currency: str = "USD"

    @field_validator('total')
    @classmethod
    def total_must_be_positive(cls, v):
        if v <= 0:
            raise ValueError('Total must be positive')
        return v

    @field_validator('items')
    @classmethod
    def must_have_items(cls, v):
        if not v:
            raise ValueError('At least one item required')
        return v

# Use with structured output. The validators run while the SDK parses the
# response into OrderExtraction, so the call itself belongs inside the try block.
try:
    response = client.responses.parse(
        model="gpt-5",
        input="I'd like to order 2 laptops",
        text_format=OrderExtraction,
    )
    order = response.output_parsed
except ValueError as e:
    print(f"Business validation failed: {e}")
Advanced Schema Patterns
Recursive Schemas
Nested and recursive structures work with OpenAI Structured Outputs:
class Comment(BaseModel):
    author: str
    text: str
    replies: list["Comment"] = []

# This creates a tree structure
# Nesting depth is still subject to the provider's schema limits
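Because a recursive schema can in principle nest arbitrarily deep, it can be worth checking the depth of what comes back before processing it. A minimal sketch over the parsed dict form of a `Comment`; the helper names and the default limit of 5 are arbitrary assumptions, not provider limits.

```python
def reply_depth(comment: dict) -> int:
    """Depth of a comment tree: 1 for a comment with no replies."""
    return 1 + max((reply_depth(r) for r in comment.get("replies", [])), default=0)

def check_depth(comment: dict, limit: int = 5) -> None:
    """Raise if the thread nests deeper than `limit` levels (limit is illustrative)."""
    depth = reply_depth(comment)
    if depth > limit:
        raise ValueError(f"comment tree is {depth} levels deep, limit is {limit}")
```

With a Pydantic model, `comment.model_dump()` produces the dict these helpers expect.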
Union Types (Discriminated Unions)
from typing import Union, Literal

class TextContent(BaseModel):
    type: Literal["text"]
    body: str

class ImageContent(BaseModel):
    type: Literal["image"]
    url: str
    alt_text: str

class CodeContent(BaseModel):
    type: Literal["code"]
    language: str
    code: str

ContentBlock = Union[TextContent, ImageContent, CodeContent]

class Article(BaseModel):
    title: str
    blocks: list[ContentBlock]
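The point of the `type` discriminator is that downstream code can branch on it without guessing. A minimal sketch of such a consumer, working on the plain dict form of an `Article`; the function names and the rendering format are illustrative assumptions.

```python
def render_block(block: dict) -> str:
    """Render one content block based on its `type` discriminator."""
    kind = block.get("type")
    if kind == "text":
        return block["body"]
    if kind == "image":
        return f"[image: {block['alt_text']}] ({block['url']})"
    if kind == "code":
        return f"[{block['language']}]\n{block['code']}"
    raise ValueError(f"unknown block type: {kind!r}")

def render_article(article: dict) -> str:
    """Join the rendered blocks of an article with blank lines."""
    return "\n\n".join(render_block(b) for b in article["blocks"])
```

An unknown `type` raises instead of being skipped, so a schema change that adds a new block kind is noticed immediately.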
Extraction Chains
For complex documents, break extraction into multiple passes:
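class Relationship(BaseModel):
    # A small model instead of tuple[str, str, str]: fixed-length tuples become
    # prefixItems in JSON Schema, which strict mode may not accept
    subject: str
    predicate: str
    obj: str  # object of the (subject, predicate, object) triple

class EntityExtraction(BaseModel):
    entities: list[str]
    relationships: list[Relationship]

class FactCheck(BaseModel):
    claims: list[str]
    verifiable: list[bool]

# document is the source text (a str), loaded elsewhere

# Pass 1: Extract entities
entities = client.responses.parse(
    model="gpt-5",
    input=document,
    text_format=EntityExtraction,
)

# Pass 2: Fact-check claims using extracted entities
facts = client.responses.parse(
    model="gpt-5",
    input=f"Check these entities: {entities.output_parsed.entities}\n\nDocument: {document}",
    text_format=FactCheck,
)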
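The `FactCheck` schema returns `claims` and `verifiable` as two parallel lists, and no schema constraint ties their lengths together. A minimal sketch of a post-check that pairs them up; `pair_claims` is a hypothetical helper.

```python
def pair_claims(claims: list[str], verifiable: list[bool]) -> list[tuple[str, bool]]:
    """Zip parallel lists into (claim, verifiable) pairs, rejecting a length mismatch."""
    if len(claims) != len(verifiable):
        raise ValueError(
            f"got {len(claims)} claims but {len(verifiable)} verifiable flags"
        )
    return list(zip(claims, verifiable))
```

An alternative design is a list of objects with one `claim` and one `verifiable` field each, which removes the mismatch case at the schema level.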
Common Pitfalls
| Pitfall | Solution |
|---|---|
| Forgetting additionalProperties: False | Required for strict mode on OpenAI. Add to every object in schema. |
| Not handling refusals | Always check if output_parsed is None before using it. |
| Too many enum values | Keep enums under ~20 values. More causes unreliable selection. |
| Schema too complex | Break into multiple extraction calls. Simpler schemas are more reliable. |
| Using JSON mode when Structured Outputs is available | Always prefer Structured Outputs — it guarantees schema compliance. |
| Not using Pydantic for Python | Pydantic gives you type safety, validation, and automatic schema generation. |
Performance Considerations
- Latency: Structured Outputs adds minimal latency (~50-100ms) compared to unstructured output
- Tokens: Schema definitions count as input tokens. Complex schemas can add 500-2000 tokens to each request
- Cost: The schema tokens add to your input costs. For high-volume applications, consider caching the schema portion with prompt caching
- Streaming: Structured Outputs with streaming works on OpenAI — the JSON builds incrementally. Parse after the stream completes.
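The token and streaming points above lend themselves to a quick back-of-the-envelope sketch. Both helpers are hypothetical. The estimate assumes roughly four characters per token, a rule of thumb rather than a real tokenizer count, and the price is whatever your provider charges per million input tokens.

```python
import json
import math

def estimate_schema_overhead(schema, requests_per_day, usd_per_million_input_tokens,
                             chars_per_token=4.0):
    """Return (approx_tokens_per_request, approx_usd_per_day) for sending `schema`."""
    compact = json.dumps(schema, separators=(",", ":"))
    # Rough heuristic only; use the provider's tokenizer for an exact count.
    tokens = math.ceil(len(compact) / chars_per_token)
    daily_usd = tokens * requests_per_day * usd_per_million_input_tokens / 1_000_000
    return tokens, daily_usd

def parse_after_stream(chunks):
    """Accumulate streamed text fragments and parse only once the stream ends."""
    return json.loads("".join(chunks))
```

Partial JSON is usually not valid JSON, which is why `parse_after_stream` joins every fragment before calling `json.loads`.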
Conclusion
Structured Outputs transforms LLM integration from fragile regex-parsing to reliable schema-driven data extraction. If you're using OpenAI or Google Gemini, use their native Structured Outputs features — they guarantee schema compliance. For Anthropic, the tool-use pattern achieves similar reliability. In all cases, add a Pydantic validation layer for business logic that goes beyond schema validation. The result: zero JSON parsing errors in production, simpler code, and happier users.