Three Mainstream Ways to Deploy PyTorch Models on Windows

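Abstract: This article covers three mainstream ways to deploy a PyTorch model on Windows. Option 1 exports the model to TorchScript for high-performance inference, callable from both Python and C++. Option 2 wraps the model in a FastAPI web API, suited to backend callers. Option 3 packages the application as a desktop .exe with PyInstaller so it can be handed to end users. Each option comes with step-by-step instructions covering model export, loading and inference, service deployment, and GUI integration, so you can pick the one that fits your scenario (high-performance inference, web service, or desktop application).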

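There are three mainstream ways to deploy a PyTorch model on Windows. Which one to choose depends on your needs: high-performance inference, a web service API, or a desktop application.

The detailed steps for the three most common options follow: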

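Option 1: TorchScript (official and native; suited to C++ callers or high-performance Python services)

When to use: you need to drop the dependency on the Python interpreter (C++ deployment), or you want Python inference that can be faster than calling the eager model directly. The actual gain depends on the model, so measure it.

Step 1: Export the model to TorchScript

In your training code, or in a separate script, convert the trained model to TorchScript.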

import torch
import torchvision.models as models

# 1. Load the trained model and switch it to evaluation mode
model = models.resnet18(weights="IMAGENET1K_V1")  # example model
model.eval()

# 2. Create an example input (used for tracing)
# Assumes batch_size=1, 3 channels, a 224x224 image
example_input = torch.rand(1, 3, 224, 224)

# 3. Tracing: suited to models with simple control flow
traced_script_module = torch.jit.trace(model, example_input)

# Alternatively, scripting: suited to models with complex control flow (if/for)
# traced_script_module = torch.jit.script(model)

# 4. Save the model
traced_script_module.save("resnet18_windows.pt")
print("Model exported to resnet18_windows.pt")
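
Before shipping the exported file, it may be worth checking that the traced module reproduces the eager model's output. The sketch below is illustrative only: a tiny stand-in network (TinyNet, a hypothetical name) replaces ResNet-18 so the check runs without downloading weights, and the tolerance atol=1e-5 is an arbitrary assumption.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model, small enough to run anywhere."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.fc = nn.Linear(4, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.fc(x)


def outputs_match(model, example_input, atol=1e-5):
    """Trace the model, then compare traced and eager output on a fresh input."""
    model.eval()
    traced = torch.jit.trace(model, example_input)
    fresh_input = torch.rand_like(example_input)
    with torch.no_grad():
        expected = model(fresh_input)
        actual = traced(fresh_input)
    return torch.allclose(expected, actual, atol=atol)


torch.manual_seed(0)
match = outputs_match(TinyNet(), torch.rand(1, 3, 8, 8))
print("traced output matches eager output:", match)
```

For the real export, the same helper could be called with the ResNet-18 model and the 1x3x224x224 example input before saving.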

Step 2: Deploy on Windows (loading from Python)

Create a standalone inference script, inference.py, that depends only on the exported .pt file and not on the training code.

import torch
import torchvision.transforms as transforms
from PIL import Image

# 1. Load the TorchScript model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load("resnet18_windows.pt")
model.to(device)
model.eval()

# 2. Preprocess the image
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# 3. Run inference
img = Image.open("test_image.jpg").convert("RGB")
input_tensor = transform(img).unsqueeze(0).to(device)

with torch.no_grad():
    output = model(input_tensor)
    print("Prediction:", output.argmax(dim=1).item())
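
The script above prints only the index of the highest-scoring class. If class probabilities or a top-k list are wanted, the raw scores can be passed through a softmax first. The sketch below does this in plain Python so that it runs without a model; the helper names and the example scores are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most likely (class_id, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


example_logits = [2.0, 0.5, -1.0, 3.0]  # made-up scores for four classes
best = top_k(example_logits, k=2)
for class_id, prob in best:
    print(f"class {class_id}: {prob:.3f}")
```

With a real model output, output[0].tolist() yields such a list of scores; torch.softmax(output, dim=1) followed by torch.topk gives the same result directly on tensors.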

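Step 3 (optional): C++ deployment (no Python at all)

If you need maximum performance, or need to integrate the model into an existing C++ Windows application:

Download LibTorch: on the PyTorch website choose "LibTorch" as the package, "Windows" as the operating system, "C++" as the language, and "CUDA" (if you have an NVIDIA GPU) or "CPU" as the compute platform.
Configure Visual Studio:
- Create a new C++ project.
- In the property pages, point Include Directories and Library Directories at the include and lib folders of the extracted LibTorch directory.
- Link the relevant libraries, such as c10.lib and torch_cpu.lib, plus torch_cuda.lib for a CUDA build.
Write the C++ code: call torch::jit::load("model.pt") to load the model, then run inference.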

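Option 2: A web API service (the most common choice; suited to backend services)

When to use: the model needs to be called over HTTP (from a web front end or a mobile app, for example), using FastAPI or Flask.

Step 1: Install dependencies

Open Windows PowerShell or cmd: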

pip install fastapi uvicorn[standard] pillow python-multipart
# If torch is not installed yet
pip install torch torchvision

Step 2: Create main.py

from fastapi import FastAPI, File, UploadFile
import torch
import torchvision.transforms as transforms
from PIL import Image
import io

app = FastAPI()

# Load the model once at startup (avoids reloading it on every request)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.jit.load("resnet18_windows.pt")  # the model exported in Option 1
model.to(device)
model.eval()

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@app.post("/predict/")
async def predict(file: UploadFile = File(...)):
    # Read the uploaded image (named img so it does not shadow PIL's Image)
    image_data = await file.read()
    img = Image.open(io.BytesIO(image_data)).convert("RGB")

    # Preprocess
    input_tensor = transform(img).unsqueeze(0).to(device)

    # Run inference
    with torch.no_grad():
        output = model(input_tensor)
        prediction = output.argmax(dim=1).item()

    return {"filename": file.filename, "class_id": prediction}

# Start command for development: uvicorn main:app --reload --host 0.0.0.0 --port 8000

Step 3: Run the service

Run in a terminal:

uvicorn main:app --host 0.0.0.0 --port 8000

You can now open http://localhost:8000/docs to see the Swagger UI and upload a test image.
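
Besides the Swagger UI, the endpoint can be called from a script. The following is a minimal client sketch that uses only the standard library. It assumes the service from main.py is listening on localhost:8000 and that the upload field is named file, as in the endpoint signature; the helper names build_multipart and predict are hypothetical.

```python
import json
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body; returns (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, url="http://localhost:8000/predict/"):
    """POST an image to the /predict/ endpoint and return the decoded JSON reply."""
    with open(image_path, "rb") as fh:
        payload = fh.read()
    body, content_type = build_multipart("file", image_path, payload)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Offline check of the encoder only; predict() needs the server to be running.
body, header = build_multipart("file", "test_image.jpg", b"\xff\xd8fake")
print(header.split(";")[0], len(body), "bytes")
```

With the server running, predict("test_image.jpg") should return a dictionary with the filename and class_id keys produced by the endpoint.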

Step 4: Start on boot (as a Windows service)

To keep the API running in the background:

Use NSSM (the Non-Sucking Service Manager).
Download nssm.exe.
Run nssm install pytorchservice on the command line.
In the GUI that opens:
- Path: the path to your Python executable (for example c:\users\yourname\venv\scripts\python.exe).
- Arguments: -m uvicorn main:app --host 0.0.0.0 --port 8000.
- Startup directory: the folder that contains your code.
Click "Install service", then start the service from the Windows Services manager.

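Option 3: Package as a desktop .exe (suited to delivery to end users)

When to use: the program goes to ordinary users who have no Python environment, and it needs a graphical user interface (GUI).

Step 1: Install PyInstaller and a GUI library

pip install pyinstaller pyside6  # or use the built-in tkinter instead of PySide6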

Step 2: Write the GUI inference script, app.py

import sys
from pathlib import Path

import torch
import torchvision.transforms as transforms
from PIL import Image
from PySide6.QtWidgets import (
    QApplication, QMainWindow, QPushButton, QLabel,
    QFileDialog, QVBoxLayout, QWidget,
)


def resource_path(name):
    # A PyInstaller bundle unpacks --add-data files under sys._MEIPASS;
    # when run as a plain script, look next to this file instead.
    base = getattr(sys, "_MEIPASS", Path(__file__).resolve().parent)
    return str(Path(base) / name)


# Load the model once (global)
model = torch.jit.load(resource_path("resnet18_windows.pt"))
model.eval()
# Same preprocessing as in Option 1
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("PyTorch Windows Demo")

        layout = QVBoxLayout()
        self.label = QLabel("Please choose an image")
        self.btn = QPushButton("Select image and predict")
        self.btn.clicked.connect(self.predict)

        layout.addWidget(self.label)
        layout.addWidget(self.btn)

        container = QWidget()
        container.setLayout(layout)
        self.setCentralWidget(container)

    def predict(self):
        file_path, _ = QFileDialog.getOpenFileName(self, "Select image", "", "Images (*.png *.jpg)")
        if file_path:
            img = Image.open(file_path).convert("RGB")
            input_tensor = transform(img).unsqueeze(0)
            with torch.no_grad():
                out = model(input_tensor)
                res = out.argmax(dim=1).item()
            self.label.setText(f"Predicted class id: {res}")


if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = MainWindow()
    window.show()
    sys.exit(app.exec())

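Step 3: Build the .exe

PyTorch is large, so packaging needs a few specific options.

pyinstaller --noconfirm --onefile --windowed --add-data "resnet18_windows.pt;." app.py

Note: on Windows, --add-data separates source and destination with a semicolon (;); on Linux/macOS it uses a colon (:).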
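
If the build command is generated by a script that has to run on more than one platform, the separator can be derived instead of hard-coded. The sketch below assumes the separator follows Python's os.pathsep, which matches the semicolon/colon rule stated above; add_data_arg is a hypothetical helper name.

```python
import os


def add_data_arg(source, dest=".", sep=os.pathsep):
    """Build the value for PyInstaller's --add-data option: SOURCE<sep>DEST."""
    return f"{source}{sep}{dest}"


# os.pathsep is ";" on Windows and ":" on Linux/macOS
print("--add-data", add_data_arg("resnet18_windows.pt"))
```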

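Important notes:

The first build can be very large (several hundred MB, or even more than 1 GB) because it bundles the entire PyTorch library.
If you hit out-of-memory errors, try --exclude-module to leave out libraries you do not need, or try UPX compression (which sometimes makes PyTorch crash, so test the result).
The generated .exe is in the dist folder and can be sent directly to other Windows machines, which do not need Python installed.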

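Windows-specific deployment notes

Paths:
- Windows paths use backslashes (\), but in Python strings it is better to use forward slashes (/) or a raw string such as r"c:\path".
- Use os.path.join or pathlib to build paths so that they stay portable.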
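
The path advice above can be illustrated with pathlib. This is a small sketch with a hypothetical folder name; PureWindowsPath is used so that the demo behaves the same on any operating system.

```python
from pathlib import Path, PureWindowsPath

# PureWindowsPath only manipulates strings, so this runs on any OS.
model_dir = PureWindowsPath(r"C:\models")  # hypothetical folder, raw string
model_file = model_dir / "resnet18_windows.pt"

print(model_file)             # backslash form
print(model_file.as_posix())  # forward-slash form


def model_path(base_dir, name="resnet18_windows.pt"):
    """Join a directory and a file name without hand-written separators."""
    return Path(base_dir) / name


print(model_path("models"))
```

The value returned by model_path can be passed to torch.jit.load after converting it with str().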

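CUDA driver:
- For GPU deployment, the target machine must have an NVIDIA driver recent enough for the CUDA version your PyTorch build uses.
- The target machine does not need the CUDA Toolkit (the CUDA runtime libraries and cuDNN ship inside the PyTorch wheel or LibTorch); a sufficiently recent GPU driver is enough.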
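
A quick way to confirm on the target machine that the driver and the PyTorch build work together is to ask PyTorch itself. The following is a minimal sketch; describe_device is a hypothetical helper, and the import is guarded so that the check also reports a missing installation.

```python
def describe_device():
    """Summarise whether this machine can run PyTorch on a GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}: no usable CUDA device, falling back to CPU"
    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__} (CUDA {torch.version.cuda}) on {name}"


report = describe_device()
print(report)
```

If the report shows a CPU fallback on a machine that has an NVIDIA GPU, an outdated driver or a CPU-only PyTorch build are the usual suspects.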

Firewall:
- For a web API deployment, Windows Defender Firewall may block port 8000. Allow the program through the firewall when prompted on first run.

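Performance tuning (Windows-specific):
- Consider setting the thread count before inference: torch.set_num_threads(1). On Windows, multithreading sometimes slows inference down because of context switching, especially on CPU.
- Setting torch.backends.cudnn.benchmark = True (NVIDIA GPUs only) can speed up inference when the input shape is fixed.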
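
Whether these settings help depends on the model and the machine, so it is safer to measure than to assume. The sketch below is a generic timing helper built on the standard library only; median_latency is a hypothetical name, and the workload is a stand-in so that the snippet runs without a model.

```python
import statistics
import time


def median_latency(fn, warmup=3, runs=20):
    """Median wall-clock time of fn() in milliseconds, after a short warm-up."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs without a model
latency_ms = median_latency(lambda: sum(i * i for i in range(10_000)))
print(f"median latency: {latency_ms:.3f} ms")
```

For a real comparison, call torch.set_num_threads(n) for a few values of n and pass lambda: model(input_tensor) as fn each time. For GPU timing, the callable should also call torch.cuda.synchronize() so that queued kernels are included in the measurement.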

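Summary and recommendations

Need	Recommended option	Difficulty	Performance
Internal microservice / API	Option 2 (FastAPI + TorchScript)	⭐⭐	⭐⭐⭐⭐
Integration into C++ software	Option 1 (LibTorch C++)	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
Tool for non-technical users	Option 3 (PyInstaller .exe)	⭐⭐⭐	⭐⭐⭐
Quick prototype validation	Run the Python script directly	⭐	⭐⭐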