Function Calling and Tool Use Architecture Design

Core Concepts

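Function calling (also called tool use) is the key mechanism that lets a large language model (LLM) interact with the outside world. The model does not execute functions itself. Instead, it emits a structured function-call request; the application layer executes the call and returns the result to the model.

This seemingly simple mechanism changes the architecture of AI applications: the LLM goes from being a "text generator" to a "decision engine" that can plan, call tools, process results, and iterate.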
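
The request/execute/return cycle described above can be sketched without any provider SDK. In this minimal sketch, the JSON string stands in for what a model might emit, and `TOOLS`, `get_weather`, and `handle_model_output` are hypothetical names chosen for illustration; real providers wrap the same idea in their own message formats.

```python
import json

def get_weather(city: str, unit: str = "celsius") -> dict:
    """Hypothetical tool: returns canned data instead of calling a weather API."""
    return {"city": city, "temperature": 22, "unit": unit}

# The application, not the model, owns the mapping from tool names to code.
TOOLS = {"get_weather": get_weather}

def handle_model_output(raw: str) -> str:
    """Execute a structured call emitted by the model and serialize the result."""
    call = json.loads(raw)              # the model only produced this text
    func = TOOLS[call["name"]]          # the application picks the real function
    result = func(**call["arguments"])  # the application executes it
    return json.dumps(result)           # this string is sent back to the model

# Stand-in for a model response; a real one would come from a provider API.
model_output = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'
tool_result = handle_model_output(model_output)
print(tool_result)
```

The model never touches `get_weather` directly; it only names the tool and supplies arguments, which is why validation and permission checks belong in the application layer.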

Comparison of Mainstream Platform Implementations

OpenAI Function Calling

import json

from openai import OpenAI

client = OpenAI()

# Define the tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'Beijing', 'Shanghai'",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["city"],
            },
        },
    },
]

# Keep the conversation in a list so tool results can be appended to it
messages = [{"role": "user", "content": "What's the weather like in Beijing today?"}]

# Send the request
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto",   # auto / none / required / a specific function
)

message = response.choices[0].message

# Check whether the model requested any tool calls
if message.tool_calls:
    # Append the assistant message once, before any tool results
    messages.append(message)

    for tool_call in message.tool_calls:
        # Parse the function name and arguments
        func_name = tool_call.function.name
        func_args = json.loads(tool_call.function.arguments)

        # Run the function (execute_function is the application's own dispatcher)
        result = execute_function(func_name, func_args)

        # Pass the result back to the model
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result, ensure_ascii=False),
        })

    # The model produces its final reply from the tool results
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )

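OpenAI highlights:

  • Parallel tool calls (a single response can contain multiple tool_calls)
  • tool_choice gives precise control over calling behavior
  • Structured Outputs mode (strict: true) guarantees that tool arguments conform to the supplied JSON Schema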

Anthropic Tool Use

import json

import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a given city. Use this tool when the user asks about the weather.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "default": "celsius",
                },
            },
            "required": ["city"],
        },
    },
]

# Keep the conversation in a list so tool results can be appended to it
messages = [{"role": "user", "content": "What's the weather like in Beijing today?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

# Anthropic uses a content-block model:
# a single message can mix text and tool_use blocks
tool_results = []
for block in response.content:
    if block.type == "tool_use":
        # execute_function is the application's own dispatcher
        result = execute_function(block.name, block.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": json.dumps(result, ensure_ascii=False),
        })

if tool_results:
    # Echo the assistant turn once, then return every result in one user turn
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})

    final = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

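Anthropic highlights:

  • Content-block model: text and tool calls can be mixed in the same message
  • Emphasizes the quality of tool descriptions, which raises the bar for prompt engineering
  • Supports tool_choice values {"type": "any"} and {"type": "tool", "name": "xxx"}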

Google Gemini Function Calling

import google.generativeai as genai

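# Define the tool as a plain Python function; the SDK derives the schema from its signature
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Get the current weather for a given city.

    Args:
        city: City name, e.g. 'Beijing', 'Shanghai'
        unit: Temperature unit, celsius or fahrenheit
    """
    return {"temperature": 22, "condition": "sunny", "unit": unit}

model = genai.GenerativeModel(
    "gemini-2.0-flash",
    tools=[get_weather],
)

# Gemini supports automatic execution of tool calls
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("What's the weather like in Beijing today?")
# The model calls get_weather automatically and returns the final answer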
print(response.text)

Google highlights:

  • Accepts Python functions directly and extracts the schema automatically
  • enable_automatic_function_calling runs the whole call loop automatically
  • Offers Google Search and Code Execution as built-in tools

Three-Platform Comparison

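Feature | OpenAI | Anthropic | Google
Schema format | JSON Schema | JSON Schema | JSON Schema / Python functions
Parallel calls | Supported | Supported | Supported
Streaming tool calls | Supported | Supported | Supported
Force a specific tool | tool_choice: {"type": "function", "function": {"name": "xxx"}} | tool_choice: {"type": "tool", "name": "xxx"} | function_calling_config: ANY (with allowed_function_names)
Disable tool calls | tool_choice: "none" | tool_choice: {"type": "none"} | function_calling_config: NONE
Structured-output guarantee | Structured Outputs | No native guarantee | No native guarantee
Automatic execution loop | None (implement manually) | None (implement manually) | Supported
Maximum number of tools | 128 | About 50-100 | 128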
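
The "force a specific tool" row is where the three request shapes differ most. The sketch below builds each provider's payload from one tool name. `forced_tool_choice` is a hypothetical helper, the OpenAI shape assumes the Chat Completions API, and the Google shape assumes the dictionary form of `tool_config`; each SDK's current reference should be checked before relying on it.

```python
def forced_tool_choice(provider: str, tool_name: str) -> dict:
    """Return request keyword arguments that force a call to `tool_name`."""
    if provider == "openai":
        return {"tool_choice": {"type": "function", "function": {"name": tool_name}}}
    if provider == "anthropic":
        return {"tool_choice": {"type": "tool", "name": tool_name}}
    if provider == "google":
        # Assumed dictionary form; ANY plus an allow-list narrows the call to one function.
        return {"tool_config": {"function_calling_config": {
            "mode": "ANY",
            "allowed_function_names": [tool_name],
        }}}
    raise ValueError(f"Unknown provider: {provider}")

openai_kwargs = forced_tool_choice("openai", "get_weather")
anthropic_kwargs = forced_tool_choice("anthropic", "get_weather")
google_kwargs = forced_tool_choice("google", "get_weather")
```

Keeping this mapping in one place makes it easier to run the same tool definitions against more than one provider.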

Schema Design Best Practices

1. Descriptions Are Key

{
    "name": "search_products",
    "description": "Search the product catalog. Use when the user wants to find, browse, or compare products. Do not use for order status or user account lookups.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search keywords. Accepts product names, categories, and brands. Examples: 'iPhone 16', 'running shoes', 'Sony headphones'"
            },
            "category": {
                "type": "string",
                "enum": ["electronics", "clothing", "food", "home"],
                "description": "Product category filter. Use only when the user explicitly mentions a category"
            },
            "price_range": {
                "type": "object",
                "properties": {
                    "min": {"type": "number", "description": "Minimum price (CNY)"},
                    "max": {"type": "number", "description": "Maximum price (CNY)"}
                },
                "description": "Price range filter. Use only when the user mentions a budget or price range"
            },
            "sort_by": {
                "type": "string",
                "enum": ["relevance", "price_asc", "price_desc", "rating", "newest"],
                "default": "relevance",
                "description": "Sort order. Defaults to relevance"
            }
        },
        "required": ["query"]
    }
}

2. Preventing Parameter Hallucination

# Use enum to constrain the allowed values
"status": {
    "type": "string",
    "enum": ["pending", "processing", "completed", "failed"],
    "description": "Order status filter"
}

# Use pattern to constrain the format
"date": {
    "type": "string",
    "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
    "description": "Date in YYYY-MM-DD format"
}

# Use minimum/maximum to constrain numeric ranges
"limit": {
    "type": "integer",
    "minimum": 1,
    "maximum": 100,
    "default": 10,
    "description": "Number of results to return"
}
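
Schema constraints steer the model but do not replace validation in the application, because a provider may not enforce every keyword. The sketch below re-checks model-produced arguments against the three constraint kinds shown above. `validate_args` is a hypothetical, deliberately small checker that covers only `enum`, `pattern`, `minimum`, and `maximum`; a full JSON Schema validator library would be the usual choice in production.

```python
import re

SCHEMA = {
    "status": {"type": "string", "enum": ["pending", "processing", "completed", "failed"]},
    "date": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
    "limit": {"type": "integer", "minimum": 1, "maximum": 100},
}

def validate_args(args: dict, properties: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"{name}: unknown parameter")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}: must be one of {spec['enum']}")
        if "pattern" in spec:
            # JSON Schema patterns are unanchored, hence re.search.
            if not isinstance(value, str) or not re.search(spec["pattern"], value):
                problems.append(f"{name}: does not match {spec['pattern']}")
        if isinstance(value, (int, float)):
            if "minimum" in spec and value < spec["minimum"]:
                problems.append(f"{name}: below minimum {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                problems.append(f"{name}: above maximum {spec['maximum']}")
    return problems

good = validate_args({"status": "completed", "date": "2025-01-31", "limit": 10}, SCHEMA)
bad = validate_args({"status": "shipped", "date": "31/01/2025", "limit": 500}, SCHEMA)
```

Returning the problem list to the model as a structured tool error gives it a chance to correct the arguments on the next turn.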

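3. Tool Orchestration Patterns

# Pattern 1: sequential calls
tools = [
    search_database,     # 1. Search first
    analyze_results,     # 2. Then analyze
    generate_report,     # 3. Finally generate the report
]

# Pattern 2: conditional branching
tools = [
    check_inventory,     # Check stock
    place_order,         # In stock -> place the order
    notify_restock,      # Out of stock -> send a restock notification
]

# Pattern 3: iterative loop
tools = [
    web_search,          # Search
    read_page,           # Read the page
    # The model judges whether it has enough information; if not, it searches again
    summarize,           # Summarize
]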
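
In all three patterns the list only declares which tools are available; the order of calls comes from the model's decisions at run time. The sketch below illustrates the iterative pattern with a scripted `decide` function standing in for the model. Every name here (`run_research_loop`, `decide`, and the stub tools) is hypothetical and exists only to make the control flow concrete.

```python
def web_search(query: str) -> str:
    """Stub tool: returns a canned string instead of searching the web."""
    return f"results for {query}"

def summarize(notes: list[str]) -> str:
    """Stub tool: reports how many notes were collected."""
    return f"summary of {len(notes)} notes"

def decide(notes: list[str]) -> dict:
    """Scripted stand-in for the model: keep searching until two notes exist."""
    if len(notes) < 2:
        return {"tool": "web_search", "args": {"query": f"topic part {len(notes) + 1}"}}
    return {"tool": "summarize", "args": {}}

def run_research_loop(max_steps: int = 5) -> str:
    """Call tools until the decision function asks for a summary or steps run out."""
    notes: list[str] = []
    for _ in range(max_steps):
        action = decide(notes)
        if action["tool"] == "summarize":
            break
        notes.append(web_search(**action["args"]))
    return summarize(notes)

report = run_research_loop()
```

The `max_steps` cap plays the same role as an iteration limit in a real agent loop: it bounds cost when the decision step never concludes that it has enough information.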

Error Handling Architecture

import json
import traceback
from enum import Enum

# Placeholders: fill the registry with the application's tools; enable DEBUG only in development
TOOL_REGISTRY = {}
DEBUG = False

class ToolError(Enum):
    INVALID_PARAMS = "invalid_params"
    NOT_FOUND = "not_found"
    PERMISSION_DENIED = "permission_denied"
    RATE_LIMITED = "rate_limited"
    INTERNAL_ERROR = "internal_error"
    TIMEOUT = "timeout"

def execute_tool_safely(func_name: str, func_args: dict) -> dict:
    """Execute a tool call safely and return structured error information."""
    try:
        # Reject tools that are not registered
        if func_name not in TOOL_REGISTRY:
            return {
                "error": ToolError.NOT_FOUND.value,
                "message": f"Unknown tool: {func_name}",
                "available_tools": list(TOOL_REGISTRY.keys()),
            }

        func = TOOL_REGISTRY[func_name]

        # Execute. No timeout is enforced here; a tool that can hang should
        # raise TimeoutError itself so the handler below can report it.
        result = func(**func_args)
        return {"status": "success", "data": result}

    except (ValueError, TypeError) as e:
        # TypeError covers missing or unexpected keyword arguments
        return {
            "error": ToolError.INVALID_PARAMS.value,
            "message": str(e),
            "hint": "Check the parameter format and value range",
        }
    except PermissionError:
        return {
            "error": ToolError.PERMISSION_DENIED.value,
            "message": "The current user is not allowed to perform this operation",
        }
    except TimeoutError:
        return {
            "error": ToolError.TIMEOUT.value,
            "message": "The operation timed out; please retry later",
        }
    except Exception:
        return {
            "error": ToolError.INTERNAL_ERROR.value,
            "message": "Internal error; please contact an administrator",
            "debug": traceback.format_exc() if DEBUG else None,
        }
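
The handler above can report a timeout but does not impose one. One way to add a limit around an arbitrary synchronous tool is to run it in a worker thread and bound the wait. This is a sketch under assumptions: `call_with_timeout` and `slow_tool` are hypothetical names, and a timed-out thread keeps running in the background because Python threads cannot be killed, so the approach suits I/O-bound tools that finish eventually.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def call_with_timeout(func, kwargs: dict, timeout_s: float) -> dict:
    """Run func(**kwargs) in a worker thread and give up after timeout_s seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, **kwargs)
    try:
        return {"status": "success", "data": future.result(timeout=timeout_s)}
    except FutureTimeout:
        return {"error": "timeout", "message": f"No result within {timeout_s}s"}
    finally:
        # Do not block on the worker; it may still be running after a timeout.
        pool.shutdown(wait=False)

def slow_tool(delay: float) -> str:
    """Hypothetical tool that simulates a slow upstream service."""
    time.sleep(delay)
    return "done"

fast = call_with_timeout(slow_tool, {"delay": 0.0}, timeout_s=1.0)
slow = call_with_timeout(slow_tool, {"delay": 0.5}, timeout_s=0.05)
```

For tools that call HTTP services, passing a timeout to the HTTP client itself is usually the more reliable option, since it actually stops the work.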

Retry and Fallback Strategy

async def tool_call_with_retry(
    client,
    messages,
    tools,
    max_iterations: int = 10,
    max_retries_per_tool: int = 2,
):
    """Tool-calling loop with a per-tool retry limit and a cap on iterations.

    Expects an async client (for example openai.AsyncOpenAI).
    """
    tool_failures = {}
    iteration = 0

    while iteration < max_iterations:
        iteration += 1

        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
        )

        message = response.choices[0].message
        messages.append(message)

        if not message.tool_calls:
            return message.content  # The model decided not to call any more tools

        for tool_call in message.tool_calls:
            func_name = tool_call.function.name

            # Refuse further attempts once a tool has failed too many times
            failures = tool_failures.get(func_name, 0)
            if failures > max_retries_per_tool:
                result = {
                    "error": "max_retries_exceeded",
                    "message": f"{func_name} has been retried {max_retries_per_tool} times; "
                               "try another approach or confirm the details with the user",
                }
            else:
                try:
                    args = json.loads(tool_call.function.arguments)
                    result = execute_tool_safely(func_name, args)
                except json.JSONDecodeError as e:
                    result = {"error": "invalid_params", "message": f"Malformed JSON arguments: {e}"}
                # Only failed calls count toward the retry limit
                if "error" in result:
                    tool_failures[func_name] = failures + 1

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result, ensure_ascii=False),
            })

    return "Reached the maximum number of tool-call iterations; please simplify the request"

Security

Permission Control

from dataclasses import dataclass, field

@dataclass
class ToolPermission:
    """Tool permission definition."""
    tool_name: str
    allowed_roles: list[str] = field(default_factory=lambda: ["admin"])
    requires_confirmation: bool = False
    max_calls_per_session: int = 100
    sensitive_params: list[str] = field(default_factory=list)

TOOL_PERMISSIONS = {
    "search_products": ToolPermission(
        tool_name="search_products",
        allowed_roles=["user", "admin"],
        requires_confirmation=False,
    ),
    "place_order": ToolPermission(
        tool_name="place_order",
        allowed_roles=["user", "admin"],
        requires_confirmation=True,   # Placing an order requires confirmation
        sensitive_params=["payment_method"],
    ),
    "delete_user": ToolPermission(
        tool_name="delete_user",
        allowed_roles=["admin"],
        requires_confirmation=True,
        max_calls_per_session=5,
    ),
}

def check_permission(tool_name: str, user_role: str, session) -> tuple[bool, str]:
    """Check whether a tool call is permitted."""
    perm = TOOL_PERMISSIONS.get(tool_name)
    if not perm:
        return False, "Unregistered tool"

    if user_role not in perm.allowed_roles:
        return False, f"Role {user_role} is not allowed to use {tool_name}"

    # session is assumed to track how many times each tool has been called
    call_count = session.get_tool_call_count(tool_name)
    if call_count >= perm.max_calls_per_session:
        return False, f"{tool_name} has reached its per-session call limit ({perm.max_calls_per_session})"

    return True, "ok"
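
`check_permission` covers roles and call limits, but nothing in it acts on `requires_confirmation`. A minimal sketch of one way to enforce that flag follows. The names `CONFIRM_REQUIRED`, `gate_tool_call`, and the `confirm` callback are hypothetical; in a real application the callback would prompt the user through the UI rather than return a fixed answer.

```python
from typing import Callable

# Hypothetical flag table mirroring requires_confirmation in the permission definitions.
CONFIRM_REQUIRED = {"search_products": False, "place_order": True, "delete_user": True}

def gate_tool_call(
    tool_name: str,
    args: dict,
    execute: Callable[[str, dict], dict],
    confirm: Callable[[str, dict], bool],
) -> dict:
    """Run the tool only if no confirmation is needed or the user approves."""
    # Unknown tools default to requiring confirmation (fail closed).
    if CONFIRM_REQUIRED.get(tool_name, True) and not confirm(tool_name, args):
        return {
            "error": "confirmation_declined",
            "message": f"The user did not approve {tool_name}",
        }
    return execute(tool_name, args)

def fake_execute(name: str, args: dict) -> dict:
    """Stub dispatcher used only for this demonstration."""
    return {"status": "success", "tool": name}

approved = gate_tool_call("place_order", {"sku": "A1"}, fake_execute, lambda n, a: True)
declined = gate_tool_call("place_order", {"sku": "A1"}, fake_execute, lambda n, a: False)
no_prompt = gate_tool_call("search_products", {"query": "x"}, fake_execute, lambda n, a: False)
```

Returning a structured "declined" result, rather than raising, lets the model explain to the user why the action did not happen.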

Parameter Injection Protection

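import re

def sanitize_tool_args(func_name: str, args: dict) -> dict:
    """Screen tool arguments for common injection patterns.

    This is a heuristic denylist and can reject legitimate text; it does not
    replace parameterized queries or avoiding shell invocation.
    """
    sanitized = {}
    for key, value in args.items():
        if isinstance(value, str):
            # Guard against SQL injection (\b avoids matching inside words such as "backdrop")
            if re.search(r"(;|--|\b(?:DROP|DELETE|UPDATE|INSERT)\b)\s", value, re.IGNORECASE):
                raise ValueError(f"Suspicious SQL pattern in parameter '{key}'")

            # Guard against command injection
            if re.search(r"[;&|`$(){}]", value):
                raise ValueError(f"Suspicious shell pattern in parameter '{key}'")

            # Length limit
            if len(value) > 10000:
                raise ValueError(f"Parameter '{key}' exceeds maximum length")

        sanitized[key] = value

    return sanitized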
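
Pattern screening of this kind is best treated as a second line of defense. For SQL in particular, binding model-supplied values as query parameters keeps them out of the statement text altogether. The sketch below uses the standard-library `sqlite3` module with an in-memory database; the `orders` table and the `find_orders` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("pending",), ("completed",)])

def find_orders(status: str) -> list[tuple]:
    """Look up orders by status; the value is bound, never interpolated."""
    cursor = conn.execute("SELECT id, status FROM orders WHERE status = ?", (status,))
    return cursor.fetchall()

normal = find_orders("pending")
# A classic injection payload is treated as an ordinary string and matches nothing.
attack = find_orders("pending' OR '1'='1")
```

Because the payload is compared as data, no denylist is needed for this query, and legitimate values containing quotes or semicolons are handled correctly.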

Performance Optimization

Parallel Tool Calls

import asyncio
import json

async def execute_parallel_tool_calls(tool_calls):
    """Execute multiple tool calls concurrently."""
    tasks = []
    for tc in tool_calls:
        func_name = tc.function.name
        func_args = json.loads(tc.function.arguments)
        # execute_tool_async is the application's own async dispatcher
        tasks.append(execute_tool_async(func_name, func_args))

    # return_exceptions=True returns exceptions as results instead of raising the first one
    results = await asyncio.gather(*tasks, return_exceptions=True)

    tool_results = []
    for tc, result in zip(tool_calls, results):
        if isinstance(result, Exception):
            content = json.dumps({"error": str(result)}, ensure_ascii=False)
        else:
            content = json.dumps(result, ensure_ascii=False)

        tool_results.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": content,
        })

    return tool_results
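
The function above relies on an async dispatcher that the article does not define. If the underlying tools are ordinary blocking functions, one possible shape for it is to hand each call to a worker thread with `asyncio.to_thread` (Python 3.9+). In this sketch `SYNC_TOOLS` and both stub tools are hypothetical, and the values they return are canned.

```python
import asyncio
import time

def get_weather(city: str) -> dict:
    time.sleep(0.2)  # simulate a blocking network call
    return {"city": city, "temperature": 22}

def get_exchange_rate(pair: str) -> dict:
    time.sleep(0.2)  # simulate a blocking network call
    return {"pair": pair, "rate": 1.0}

SYNC_TOOLS = {"get_weather": get_weather, "get_exchange_rate": get_exchange_rate}

async def execute_tool_async(func_name: str, func_args: dict) -> dict:
    """Run a blocking tool in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(SYNC_TOOLS[func_name], **func_args)

async def main() -> tuple[list, float]:
    start = time.perf_counter()
    results = await asyncio.gather(
        execute_tool_async("get_weather", {"city": "Beijing"}),
        execute_tool_async("get_exchange_rate", {"pair": "USD/CNY"}),
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(f"two tools finished in {elapsed:.2f}s")
```

Since the two calls overlap, the elapsed time should be close to that of the slower call rather than the sum of both. Tools that are already coroutines can be awaited directly without the thread hop.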

Tool Result Caching

import hashlib
import json
import time

class ToolCache:
    """In-memory cache for tool results with a time-to-live."""

    def __init__(self, ttl_seconds=300):
        self.cache = {}
        self.ttl = ttl_seconds

    def cache_key(self, func_name: str, args: dict) -> str:
        # sort_keys makes the key independent of argument order
        args_str = json.dumps(args, sort_keys=True)
        return hashlib.md5(f"{func_name}:{args_str}".encode()).hexdigest()

    def get(self, func_name: str, args: dict):
        key = self.cache_key(func_name, args)
        entry = self.cache.get(key)
        if entry and (time.time() - entry["timestamp"]) < self.ttl:
            return entry["result"]
        return None

    def set(self, func_name: str, args: dict, result):
        key = self.cache_key(func_name, args)
        self.cache[key] = {
            "result": result,
            "timestamp": time.time(),
        }

# Tools whose results are safe to cache
CACHEABLE_TOOLS = {"search_products", "get_weather", "get_exchange_rate"}

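Summary

Key points for the architecture of function calling:

  1. Schema design: write thorough descriptions and constrain parameters to reduce model hallucination
  2. Error handling: structured error messages help the model understand failures and recover
  3. Security: permission control, parameter injection protection, and call-rate limits
  4. Performance: parallel execution, result caching, and a lean tool set
  5. Observability: log the inputs, outputs, and latency of every tool call to support debugging and optimization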
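
The observability point can be made concrete with a thin wrapper around whatever dispatcher the application uses. This is a minimal sketch assuming the standard `logging` module; `observed_call` and the `flaky` stub are hypothetical names.

```python
import json
import logging
import time

logger = logging.getLogger("tool_calls")

def observed_call(func_name: str, args: dict, execute) -> dict:
    """Execute a tool and log its input, output, and latency as one JSON record."""
    start = time.perf_counter()
    try:
        result = execute(func_name, args)
    except Exception as exc:
        result = {"error": "internal_error", "message": str(exc)}
    record = {
        "tool": func_name,
        "args": args,
        "result": result,
        "latency_ms": round((time.perf_counter() - start) * 1000, 2),
    }
    # default=str keeps logging from failing on values JSON cannot encode
    logger.info(json.dumps(record, ensure_ascii=False, default=str))
    return result

def flaky(name: str, args: dict) -> dict:
    """Stub dispatcher that always fails, to show the error path."""
    raise RuntimeError("upstream unavailable")

ok = observed_call("echo", {"x": 1}, lambda name, args: {"echo": args})
failed = observed_call("flaky", {}, flaky)
```

Arguments marked as sensitive, such as payment details, would need to be redacted before they reach the log.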

Maurice | maurice_wen@proton.me