How to Build a Complete AI Agent System Locally: A Guide to Connecting Ollama, MCP, and Skills

Source: IPIPP.com  Author: 陈平安  Date: 05-22


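Running a complete AI Agent system locally: how Ollama, MCP, and Skills actually fit together

Many developers now want a complete AI Agent system that runs on their own machine and does not depend on a remote LLM service, which keeps data local and leaves room for customization. In such a setup, Ollama provides local model inference, MCP (Model Context Protocol) standardizes how the model side communicates with external resources, and a Skill is a concrete capability the Agent can call. Connected together, the three cover the full path from a user's question to a completed task. The sections below walk through building and wiring the system step by step.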

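1. Core Components

Before wiring anything together, pin down what each of the three components is for, so the concepts do not get mixed up later:

Ollama: a lightweight tool for running large models locally. It can pull and run open models such as Llama 3 and Mistral, needs no complicated environment setup, and exposes an HTTP API that other programs can call.
MCP: the Model Context Protocol. It defines a standard format for interaction between a model-driven application and external tools and data sources. Reading local files, querying a database, or running a custom function can all sit behind an MCP-compliant interface, so nothing has to be adapted again for each model.
Skill: one unit of Agent capability. "Look up the local weather", "read a local document", and "run a script" are each an independent Skill. A Skill that is exposed through the interface MCP requires can be discovered and called by the Agent.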
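
To make the MCP item above more concrete: MCP messages are JSON-RPC 2.0 objects. The sketch below builds the two requests an Agent sends most often, `tools/list` and `tools/call`. The helper name `make_request` is made up for illustration and is not part of any library, and `get_current_time` is only an example tool name.

```python
import json

def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request of the kind MCP uses on the wire."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or {},
    }

# Ask the server which tools (Skills) it exposes
list_request = make_request(1, "tools/list")

# Call one tool by name with an empty argument object
call_request = make_request(
    2, "tools/call", {"name": "get_current_time", "arguments": {}}
)

# Over the stdio transport each message travels as a single line of JSON
wire_line = json.dumps(call_request)
print(wire_line)
```

The `id` field is what lets a client match a response to the request that caused it.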

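2. Environment Setup and Starting the Base Services

Start by getting the Ollama service running, since it is the inference core of the whole system. The steps below install and start Ollama, assuming Linux or macOS (on Windows, adjust the commands accordingly):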

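# Install Ollama with the official script (the script targets Linux;
# on macOS or Windows the installer from https://ollama.com is the usual route)
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama service; it listens on port 11434 by default
# (skip this step if Ollama is already running as a background service)
ollama serve

# In a new terminal, pull the model. The Agent below relies on Ollama tool calling,
# which llama3:8b does not support, so a tool-capable model such as llama3.1:8b is used
ollama pull llama3.1:8b

# Check that the model responds by sending a simple prompt
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Hello, please introduce yourself"
}'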
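
By default `/api/generate` streams its answer as one JSON object per line: each line carries a text fragment in its `response` field, and the last line has `done` set to true. A minimal sketch for stitching such lines back together, assuming that line format; `join_stream` and the sample lines are hand-written for illustration:

```python
import json

def join_stream(lines):
    """Concatenate the `response` fragments of a streamed /api/generate reply."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the reply
    return "".join(parts)

# Hand-written sample in the same shape as the streamed output
sample = [
    '{"response": "Hello", "done": false}',
    '{"response": ", world", "done": false}',
    '{"response": "", "done": true}',
]
print(join_stream(sample))
```

Adding `"stream": false` to the request body makes Ollama return a single JSON object instead, which is what the Agent code later in this article does.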

If the request returns text generated by the model, the Ollama service is up and working. Next we need a simple MCP Server to host the Skills that follow. Here is a minimal MCP Server example in Python:

# Dependencies: install the MCP Python SDK first with `pip install mcp`
import asyncio
import datetime

import mcp.types as types
from mcp.server import Server
from mcp.server.stdio import stdio_server

# Create the MCP Server instance
app = Server("local-agent-mcp-server")

# Declare one simple Skill as an MCP tool: return the current system time.
# Typed Tool objects are used instead of plain dicts so the SDK can read their attributes
@app.list_tools()
async def list_tools() -> list[types.Tool]:
    return [
        types.Tool(
            name="get_current_time",
            description="Get the current local system time, formatted as YYYY-MM-DD HH:MM:SS",
            inputSchema={
                "type": "object",
                "properties": {},
                "required": [],
            },
        )
    ]

# Handle tool calls
@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "get_current_time":
        current_time = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        return [types.TextContent(type="text", text=f"Current system time: {current_time}")]
    return [types.TextContent(type="text", text="No matching Skill found")]

# Start the MCP Server, using standard input and output as the transport
async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

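3. Wiring Up the Agent's Core Logic

Now we write the Agent main program, which connects Ollama's inference with the Skills exposed by the MCP Server. The core logic is: the user sends a question -> the Agent calls Ollama, and the model decides whether a Skill is needed -> if so, the Agent calls that Skill over MCP -> the Skill's result is appended to the context, and Ollama is called again to produce the final reply.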
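
The control flow just described can be sketched on its own, with stand-ins in place of Ollama and the MCP Server. Everything here (`run_agent_turn`, `fake_model`, `fake_skill`) is hypothetical scaffolding meant only to show the order of the steps:

```python
def run_agent_turn(user_input, ask_model, call_skill):
    """One turn: ask the model, run any requested Skills, then ask again."""
    messages = [{"role": "user", "content": user_input}]
    reply = ask_model(messages)
    tool_calls = reply.get("tool_calls") or []
    if not tool_calls:
        # No Skill needed: the first reply is already the answer
        return reply.get("content", "")
    messages.append(reply)
    for call in tool_calls:
        name = call["function"]["name"]
        args = call["function"].get("arguments", {})
        # Feed each Skill result back into the context as a tool message
        messages.append({"role": "tool", "content": call_skill(name, args)})
    return ask_model(messages).get("content", "")

# Stand-ins so the sketch runs without Ollama or an MCP Server
def fake_model(messages):
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": "It is " + last["content"]}
    return {
        "role": "assistant",
        "content": "",
        "tool_calls": [{"function": {"name": "get_current_time", "arguments": {}}}],
    }

def fake_skill(name, arguments):
    return "12:00:00"

print(run_agent_turn("What time is it?", fake_model, fake_skill))
```

Passing the model and the Skill caller in as functions keeps the loop independent of any particular model server or transport.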

Below is a complete example of the Agent, including the code that talks to Ollama and to the MCP Server:

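import json
import queue
import subprocess
import sys
import threading

import requests

# Ollama chat endpoint on the default local port
OLLAMA_URL = "http://127.0.0.1:11434/api/chat"
# Model to use. It must already be pulled with `ollama pull` and must support
# tool calling (llama3:8b does not; llama3.1:8b does)
MODEL_NAME = "llama3.1:8b"

class MCPClient:
    """Minimal MCP client that talks to an MCP Server over stdio."""
    def __init__(self, server_script_path):
        self.server_script_path = server_script_path
        self.server_process = None
        self.response_queue = queue.Queue()
        self._next_id = 0
        self._start_server()

    def _start_server(self):
        """Start the MCP Server subprocess and complete the MCP handshake."""
        self.server_process = subprocess.Popen(
            [sys.executable, self.server_script_path],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=None,  # server logs go to the terminal; an unread pipe could fill up and block
            text=True,
            encoding="utf-8",
        )
        # Background thread that collects messages coming back from the server
        threading.Thread(target=self._read_responses, daemon=True).start()
        # MCP requires an initialize request followed by an initialized
        # notification before any other request is accepted
        self._request("initialize", {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "local-agent", "version": "0.1.0"},
        })
        self._send({"jsonrpc": "2.0", "method": "notifications/initialized"})

    def _read_responses(self):
        """Read newline-delimited JSON messages from the server's stdout."""
        while True:
            line = self.server_process.stdout.readline()
            if not line:
                break
            try:
                self.response_queue.put(json.loads(line))
            except json.JSONDecodeError:
                pass  # ignore anything that is not a JSON message

    def _send(self, message):
        """Write one JSON-RPC message as a single line."""
        self.server_process.stdin.write(json.dumps(message) + "\n")
        self.server_process.stdin.flush()

    def _request(self, method, params):
        """Send a request and wait for the response with the matching id."""
        self._next_id += 1
        request_id = self._next_id
        self._send({"jsonrpc": "2.0", "id": request_id, "method": method, "params": params})
        while True:
            response = self.response_queue.get(timeout=30)
            if "method" in response or response.get("id") != request_id:
                continue  # skip notifications and server-initiated messages
            if "error" in response:
                raise RuntimeError(f"MCP error: {response['error']}")
            return response.get("result", {})

    def list_tools(self):
        """Return the list of Skills (tools) the MCP Server exposes."""
        return self._request("tools/list", {}).get("tools", [])

    def call_tool(self, tool_name, arguments=None):
        """Call the named Skill and return its content items."""
        params = {"name": tool_name, "arguments": arguments or {}}
        return self._request("tools/call", params).get("content", [])

    def close(self):
        """Stop the MCP Server process."""
        if self.server_process:
            self.server_process.terminate()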

class LocalAgent:
    """Core class of the local AI Agent."""
    def __init__(self, mcp_server_script):
        self.mcp_client = MCPClient(mcp_server_script)
        self.tools = self.mcp_client.list_tools()
        # Reshape each Skill definition into the format Ollama expects for tool calling
        self.ollama_tools = [
            {
                "type": "function",
                "function": {
                    "name": tool["name"],
                    "description": tool.get("description", ""),
                    "parameters": tool["inputSchema"]
                }
            } for tool in self.tools
        ]
        self.chat_history = []

    def chat(self, user_input):
        """Handle one user message and return the final reply."""
        self.chat_history.append({"role": "user", "content": user_input})
        # First call to Ollama: the model decides whether a Skill is needed
        payload = {
            "model": MODEL_NAME,
            "messages": self.chat_history,
            "tools": self.ollama_tools,
            "stream": False
        }
        response = requests.post(OLLAMA_URL, json=payload, timeout=120)
        response.raise_for_status()  # surface errors such as a model without tool support
        assistant_message = response.json().get("message", {})

        tool_calls = assistant_message.get("tool_calls")
        if tool_calls:
            # Keep the model's tool-call message in the context
            self.chat_history.append(assistant_message)
            # Run every requested Skill through the MCP client, not only the first one
            for tool_call in tool_calls:
                tool_name = tool_call["function"]["name"]
                tool_args = tool_call["function"].get("arguments") or {}
                tool_result = self.mcp_client.call_tool(tool_name, tool_args)
                self.chat_history.append({
                    "role": "tool",
                    "content": json.dumps(tool_result, ensure_ascii=False),
                    "tool_call_id": tool_call.get("id", "")
                })
            # Second call to Ollama: generate the final reply from the tool results
            payload["messages"] = self.chat_history
            final_response = requests.post(OLLAMA_URL, json=payload, timeout=120)
            final_response.raise_for_status()
            final_message = final_response.json().get("message", {}).get("content", "")
            self.chat_history.append({"role": "assistant", "content": final_message})
            return final_message
        else:
            # No Skill needed: return the model's reply directly
            content = assistant_message.get("content", "")
            self.chat_history.append({"role": "assistant", "content": content})
            return content

    def close(self):
        self.mcp_client.close()

if __name__ == "__main__":
    # Path to the MCP Server script; replace it with your own file path
    mcp_script_path = "mcp_server.py"
    agent = LocalAgent(mcp_script_path)
    print("Local AI Agent started. Type quit to exit.")
    while True:
        user_input = input("User: ")
        if user_input.lower() == "quit":
            break
        reply = agent.chat(user_input)
        print(f"Agent: {reply}")
    agent.close()

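4. Running, Testing, and Extending

Save the code above, with the MCP Server as mcp_server.py and the Agent main program as local_agent.py in the same directory. Make sure the Ollama service is running, then start the main program:

# Install the dependencies
pip install requests mcp

# Run the Agent main program
python local_agent.py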

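You can now type a question such as "What time is it now?". The model should recognize that the get_current_time Skill is needed, the Agent calls it over MCP, and the returned time is appended to the context so that Ollama can produce a natural-sounding reply.

To add more Skills, define new tools in the MCP Server following the MCP conventions and implement their handling logic. The Agent main program needs no code changes and picks up the new Skills after a restart, because it reads the tool list from the server at startup. For example, you could add a "read local file" Skill: define its name, description, and parameter schema, implement the file reading, and the Agent gains the ability to read local documents.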
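
As an illustration of the "read local file" Skill mentioned above, here is a minimal sketch of its definition and handler, written as plain Python so it runs without the MCP SDK. The names `READ_FILE_TOOL`, `read_local_file`, and the `max_chars` limit are assumptions made for this example; the fields mirror the name, description, and inputSchema used for get_current_time.

```python
import tempfile
from pathlib import Path

# Tool definition: name, description, and a JSON Schema for the arguments
READ_FILE_TOOL = {
    "name": "read_local_file",
    "description": "Read a UTF-8 text file and return its content",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path of the file to read"}
        },
        "required": ["path"],
    },
}

def read_local_file(arguments, max_chars=4000):
    """Return the file content as MCP-style text content, cut to max_chars."""
    path = Path(arguments["path"])
    if not path.is_file():
        return [{"type": "text", "text": f"File not found: {path}"}]
    text = path.read_text(encoding="utf-8", errors="replace")
    return [{"type": "text", "text": text[:max_chars]}]

# Try the handler against a temporary file
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "note.txt"
    sample.write_text("hello from a local file", encoding="utf-8")
    found = read_local_file({"path": str(sample)})
    missing = read_local_file({"path": str(sample) + ".missing"})

print(found[0]["text"])
```

In the server, the definition would be added to what list_tools returns and the handler would be called from call_tool when the name matches. A real Skill would likely also restrict which directories may be read.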

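The heart of the whole flow is MCP acting as a middle layer that hides the adaptation differences between Skills and models. Ollama only has to return tool-call requests in a standard format, and the Agent only has to forward those requests and assemble the context. Because each of the three parts has a clear responsibility, the system is easy to extend and maintain.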
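
One way to see how thin the middle layer is: the only model-specific glue between an MCP tool definition and Ollama's tool-calling format is a small reshape. The function below is a sketch of that step; `mcp_tool_to_ollama` is a name chosen for this example, and the fallback schema for tools without an inputSchema is an assumption.

```python
def mcp_tool_to_ollama(tool):
    """Reshape one MCP tool definition into an Ollama function-tool entry."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Fall back to an empty object schema if the tool declares none
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }

# Example input in the shape returned by a tools/list call
mcp_tool = {
    "name": "get_current_time",
    "description": "Get the current local system time",
    "inputSchema": {"type": "object", "properties": {}, "required": []},
}

ollama_tool = mcp_tool_to_ollama(mcp_tool)
print(ollama_tool["function"]["name"])
```

Because the reshape does not depend on what the tool does, new Skills flow through it unchanged.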

Tags: Local AI Agent, Ollama, MCP protocol, Skill development, LLM deployment
Last modified: 2026-05-22 05:26:54
