0. Series Loop (Follow along without public source code)

End-to-end pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pipeline (parse → analyze → match → report) → tools/pdf_exporter PDF.
This article: 15/17 · Deployment Ring · Docker

Stage | User Visible | Code Entry | Article
Create session | Welcome message | POST /api/sessions | 09
Multi-turn conversation | SSE streaming | chat/stream → run_guide_single_turn | 06, 14
Information sufficient | Start analysis | _run_analysis_background | 05, 07
Resume parsing | Progress 30% | run_resume_parser | 12
Profile/RIASEC | Progress 50% | run_profile_analyzer | 03, 13
Career matching | Progress 70% | run_career_matcher | 02
Report | Progress 90% | run_reporter | 11
Download PDF | File | GET …/report/pdf | 11, 15

Item | Description
Before reading this | Article 11 (fonts), Article 17 (env)
After reading this | Understand how the frontend and backend are packed into the same image, based on the Dockerfile stage descriptions
Next ring | Article 16: Entry middleware

Full series loop index: SERIES-LOOP.md

1. What Problem to Solve

iCan is not a pure API service: the Vue frontend must be built into a static directory, the backend FastAPI + LangGraph needs to connect to MySQL/Chroma, Embedding relies on sentence-transformers (which indirectly pulls PyTorch), and Reporter also uses ReportLab/matplotlib to generate Chinese PDFs.

On a macOS development machine everything works, but moving to a Linux container commonly surfaces these issues:

  • pip automatically installs the CUDA build of PyTorch, which bloats the image and triggers unnecessary downloads on GPU-less servers;
  • Chinese characters in PDFs and radar charts render as squares (the container lacks CJK fonts);
  • COPY frontend/ brings the host's node_modules into the image, causing Vite build anomalies;
  • package renames in the GTK/Pango stack on newer Debian releases cause apt-get install to fail.

The project root’s Dockerfile + docker-compose.yml records the actual solutions to these problems.


2. Implementation Location

File | Responsibility
Dockerfile | Two stages: Node builds frontend → Python runtime image
docker-compose.yml | Ports, environment variable defaults, model and Chroma volume mounts
frontend/vite.config.js | outDir: '../static', base: '/static/'
tools/pdf_exporter.py | ReportLab / matplotlib Chinese font detection chain
config.py | Fields corresponding to container environment variables like EMBEDDING_MODEL_PATH, CHROMA_PERSIST_DIR

[Figure: Docker multi-stage build]


3. Multi-stage Dockerfile Structure

Stage 1: Frontend Build (node:18-alpine)

FROM node:18-alpine AS frontend-builder
WORKDIR /build
COPY frontend/package.json frontend/package-lock.json ./
RUN npm install --registry=https://registry.npmmirror.com
COPY frontend/ ./
RUN rm -rf node_modules && npm install --registry=https://registry.npmmirror.com
RUN npx vite build && ls -la /static/

frontend/vite.config.js writes the build artifacts to the static/ directory in the repository root. Inside the container the project root is /build, so outDir '../static' resolves to /build/../static, i.e. /static/:

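import { defineConfig } from 'vite'

export default defineConfig({
  // Asset URLs in the bundle are prefixed with /static/
  base: '/static/',
  build: {
    // One level above the frontend project root
    outDir: '../static',
    emptyOutDir: true,
  },
})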

FastAPI mounts static resources from ./static/; base: '/static/' ensures the packaged JS/CSS paths align with the backend routes.
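To see where the bundle lands inside the build stage, the outDir can be resolved by hand against the stage's WORKDIR. A minimal Python sketch, assuming Vite resolves a relative outDir against the project root (here /build); the helper name resolve_out_dir is made up for illustration:

```python
import posixpath


def resolve_out_dir(project_root: str, out_dir: str) -> str:
    """Resolve a relative outDir against the project root, POSIX style."""
    return posixpath.normpath(posixpath.join(project_root, out_dir))


container_path = resolve_out_dir("/build", "../static")
print(container_path)  # /static
```

This matches the Dockerfile, which lists /static/ after the build and copies from /static/ in the runtime stage.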

Stage 2: Python Runtime (python:3.10-slim)

FROM python:3.10-slim
WORKDIR /app
# apt: Pango/Cairo/GDK + fonts-noto-cjk
COPY requirements.txt .
RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu ... && \
pip install --no-cache-dir -r requirements.txt ... && \
pip install --no-cache-dir matplotlib ...
COPY src/ ./src/
COPY --from=frontend-builder /static/ ./static/
WORKDIR /app/src
CMD ["uvicorn", "ican.main:app", "--host", "0.0.0.0", "--port", "8000"]

Note that the final WORKDIR is /app/src, so uvicorn ican.main:app uses src as the Python path root, consistent with local cd src && uvicorn.
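The effect of the working directory on a target such as ican.main:app can be reproduced without uvicorn. The sketch below assumes that the application directory (by default the current working directory) is placed on sys.path before the module part of the target is imported; load_target, make_demo_tree and the ican_demo package are hypothetical names used only for this demonstration:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_target(target: str, app_dir: str):
    """Import 'package.module:attr' with app_dir placed first on sys.path."""
    module_name, _, attr = target.partition(":")
    sys.path.insert(0, app_dir)
    try:
        module = importlib.import_module(module_name)
    finally:
        sys.path.remove(app_dir)
    return getattr(module, attr)


def make_demo_tree(root: Path) -> None:
    """Create <root>/ican_demo/main.py exposing a placeholder 'app' object."""
    pkg = root / "ican_demo"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "main.py").write_text("app = 'placeholder app'\n")


with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(Path(tmp))
    importlib.invalidate_caches()
    loaded = load_target("ican_demo.main:app", tmp)
    print(loaded)
```

With the working directory set to /app rather than /app/src, the same target would only resolve if the package were importable some other way, for example through an installed distribution or PYTHONPATH.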


4. docker-compose and Environment Overrides

docker-compose.yml does not hardcode DeepSeek; it defaults to the host’s Ollama:

environment:
  - LLM_API_KEY=${LLM_API_KEY:-ollama}
  - LLM_BASE_URL=${LLM_BASE_URL:-http://host.docker.internal:11434/v1}
  - LLM_MODEL_CHAT=${LLM_MODEL_CHAT:-qwen3.5:9b}
  - LLM_MODEL_LIGHT=${LLM_MODEL_LIGHT:-qwen3.5:9b}
  - DB_URL=${DB_URL:-mysql+pymysql://root:ican2026@mysql:3306/ican?charset=utf8mb4}
  - EMBEDDING_MODEL_PATH=${EMBEDDING_MODEL_PATH:-/app/models/bge-m3}
  - CHROMA_PERSIST_DIR=${CHROMA_PERSIST_DIR:-/app/chroma_data}

The Embedding model is mounted via a read-only volume: /ican/iCan/llm_models/bge-m3:/app/models/bge-m3:ro. Chroma data is persisted with a named volume chroma-data. This corresponds one-to-one with the EMBEDDING_MODEL_PATH and CHROMA_PERSIST_DIR fields in config.py (see Article 17).
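The ${NAME:-default} form used in the compose file falls back to the default when the variable is unset or empty on the host. A small Python model of that rule, as a sketch only: interpolate_default is a hypothetical helper, and the host value deepseek-chat is an invented example rather than something taken from the project:

```python
def interpolate_default(name: str, default: str, env: dict) -> str:
    """Mimic ${NAME:-default}: fall back when NAME is unset or empty."""
    value = env.get(name)
    return value if value else default


# Hypothetical host environment: one override, one empty value, one unset.
host_env = {"LLM_MODEL_CHAT": "deepseek-chat", "LLM_API_KEY": ""}

print(interpolate_default("LLM_MODEL_CHAT", "qwen3.5:9b", host_env))
print(interpolate_default("LLM_API_KEY", "ollama", host_env))
print(interpolate_default("CHROMA_PERSIST_DIR", "/app/chroma_data", host_env))
```

An exported but empty LLM_API_KEY therefore still yields ollama, which is easy to miss when debugging a host .env file.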


5. Pitfall 1: PyTorch CUDA Version Gets Indirectly Installed

Phenomenon

requirements.txt contains dependencies like sentence-transformers. If you directly run pip install -r requirements.txt, pip may pull the CUDA version of torch (700MB+), which is entirely unnecessary on CPU servers.

Project Solution

In the Dockerfile, first install the CPU version of torch, then install the remaining dependencies:

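# torch comes only from the PyTorch CPU index: -i is the short form of --index-url,
# so adding the mirror with -i on the same command would replace the CPU index
RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu && \
    pip install --no-cache-dir -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple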

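Because torch is already installed when requirements.txt is processed, pip treats the torch requirement of sentence-transformers as satisfied (as long as the installed version meets the declared constraint) and does not fetch the GPU build. matplotlib is installed in a separate step so that all PDF chart dependencies are present in the slim image.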
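Whether the CPU build actually ended up in the image can be checked from Python after the build. The sketch below rests on two assumptions: wheels from the PyTorch CPU index on Linux carry a +cpu local version tag, and torch.version.cuda is None on builds without CUDA. looks_cpu_only and torch_report are hypothetical helper names:

```python
import importlib.util


def looks_cpu_only(version: str) -> bool:
    """True if the version string carries the '+cpu' local tag."""
    return version.partition("+")[2] == "cpu"


def torch_report() -> str:
    """Describe the installed torch build, if torch is installed at all."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    return f"torch {torch.__version__}, CUDA runtime: {torch.version.cuda}"


if __name__ == "__main__":
    print(torch_report())
```

A report that shows a CUDA runtime version instead of None suggests the GPU wheel was pulled in after all, typically because a later install step replaced the preinstalled package.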


6. Pitfall 2: Chinese PDF/Chart Garbled in Container

Phenomenon

ReportLab’s default Helvetica does not contain Chinese characters; matplotlib radar charts and bar chart labels appear as tofu squares.

Image Layer

The Dockerfile installs fonts-noto-cjk and the necessary Pango/Cairo stack for PDF rendering:

apt-get install -y --no-install-recommends \
libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf-2.0-0 \
libcairo2 libffi-dev fonts-noto-cjk

Code Layer

In tools/pdf_exporter.py, when registering the ReportLab font in _build_pdf, it probes paths in order, with Linux Docker paths listed first:

cn_font = "Helvetica"
for fp in [
    "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
    "/usr/share/fonts/truetype/noto/NotoSansCJK-Regular.ttc",
    "/usr/share/fonts/opentype/noto/NotoSansCJKsc-Regular.otf",
    "/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc",
    # ...
    "/System/Library/Fonts/PingFang.ttc",  # macOS dev machine
]:
    if os.path.exists(fp):
        pdfmetrics.registerFont(TTFont("CNFont", fp))
        cn_font = "CNFont"
        break

For matplotlib (_generate_radar_chart, _generate_bar_chart), set:

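import matplotlib.pyplot as plt

# Font families are tried in order; the first one installed is used
plt.rcParams["font.sans-serif"] = [
    "Noto Sans CJK SC", "WenQuanYi Zen Hei", "WenQuanYi Micro Hei",
    "PingFang SC", "Heiti SC", "STHeiti", "SimHei",
    "Microsoft YaHei", "Arial Unicode MS",
]
# Render the minus sign with an ASCII hyphen so CJK fonts do not show a box
plt.rcParams["axes.unicode_minus"] = False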

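The first font in the list, "Noto Sans CJK SC", is the font family provided by the fonts-noto-cjk package installed via apt. Neither layer is sufficient alone: code that lists only macOS font paths works on a Mac but falls back to Helvetica in Docker even when the fonts are installed, and Linux paths in the code are useless if the image ships no CJK fonts. Both the image layer and the code layer must be addressed.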
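A quick way to confirm which font file the probing order would select in a given environment is to run the same first-match search on its own. This is a sketch, not the project's code: first_existing is a hypothetical helper, CANDIDATES is a shortened copy of the path list, and the exists parameter is there only so the logic can be exercised without touching the filesystem:

```python
import os
from typing import Callable, Iterable, Optional


def first_existing(
    paths: Iterable[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the first path for which exists(path) is true, else None."""
    for path in paths:
        if exists(path):
            return path
    return None


CANDIDATES = [
    "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
    "/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc",
    "/System/Library/Fonts/PingFang.ttc",
]

found = first_existing(CANDIDATES)
print(found or "no CJK font found; PDF text would fall back to Helvetica")
```

Running it once on the development machine and once inside the container makes a missing font package visible before a PDF is ever generated.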


7. Pitfall 3: COPY frontend Overwrites node_modules

Phenomenon

Running npm install first and then COPY frontend/ ./ copies the developer's local (possibly macOS) node_modules into the image. This overwrites the freshly installed Linux dependencies and leads to Vite/esbuild permission errors or platform binary mismatches.

Project Solution

COPY frontend/ ./
RUN rm -rf node_modules && npm install --registry=https://registry.npmmirror.com

An equivalent approach is to exclude frontend/node_modules in .dockerignore to prevent COPY from bringing it in. The current Dockerfile chooses to reinstall after COPY, which is explicit and reproducible.
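If the .dockerignore route is taken, a pre-build check can confirm that the exclusion is really present. The helper below is a deliberately simple sketch with a hypothetical name (ignores_node_modules): it looks only at the last path component of each rule and does not implement .dockerignore glob or negation semantics:

```python
def ignores_node_modules(dockerignore_text: str) -> bool:
    """Return True if some non-comment rule names a node_modules directory.

    Simplistic on purpose: negation rules ("!...") are skipped and only the
    last path component of each rule is compared.
    """
    for raw in dockerignore_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        if line.rstrip("/").split("/")[-1] == "node_modules":
            return True
    return False


sample = "# build context exclusions\n.git\nfrontend/node_modules\n"
print(ignores_node_modules(sample))
```

Excluding the directory also shrinks the build context sent to the Docker daemon, which the reinstall-after-COPY approach does not.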


8. Pitfall 4: Debian Package Name Changes

Phenomenon

Old documentation often references libgdk-pixbuf2.0-0, but on newer Debian (e.g., trixie-based slim base images), this package is missing.

Project Solution

Use the new package name libgdk-pixbuf-2.0-0 (see line 20 of the Dockerfile). PDF tooling built on the Pango/Cairo stack, such as WeasyPrint, can depend on GDK-Pixbuf as a native library. If the package is missing, PDF generation may fail during import or rendering, and the error message may not name the package, so the rename is worth noting separately.
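Because the failure may not name the missing package, it can help to check for the native libraries directly. A minimal sketch using ctypes.util.find_library; the library names in NATIVE_LIBS are assumptions derived from the apt packages above, and missing_libs is a hypothetical helper:

```python
from ctypes.util import find_library
from typing import Callable, Iterable, List, Optional

# Linker names assumed to correspond to the apt packages in the Dockerfile.
NATIVE_LIBS = ["pango-1.0", "pangocairo-1.0", "cairo", "gdk_pixbuf-2.0"]


def missing_libs(
    names: Iterable[str],
    finder: Callable[[str], Optional[str]] = find_library,
) -> List[str]:
    """Return the names for which the finder cannot locate a shared library."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    absent = missing_libs(NATIVE_LIBS)
    print("missing native libraries:", absent or "none")
```

Run inside the built image, an entry such as gdk_pixbuf-2.0 in the output points back at the apt package list rather than at the Python code.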


9. Pitfall 5: Compose Default DSN and Missing MySQL Service

In docker-compose.yml, the DB_URL default points to mysql:3306, but the compose file defines only the app service and no mysql container. Without additional orchestration, the database connection fails when the container starts. You need to add a MySQL service, change DB_URL to a DSN reachable from inside the container, or rely on SQLite (the default in config.py) during development.

Similarly, the Embedding volume source path is hardcoded as /ican/iCan/llm_models/bge-m3. Before deploying on a different machine, this must be changed to the actual host path; otherwise, EMBEDDING_MODEL_PATH points to an empty directory, and vector retrieval will fail at runtime.
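Both conditions can be checked before starting the stack. The following sketch uses hypothetical helpers (db_endpoint, embedding_dir_ready): the first extracts the host and port a DSN points at, the second tests that a model directory exists and is not empty:

```python
import os
from urllib.parse import urlsplit


def db_endpoint(dsn: str):
    """Return (hostname, port) from a SQLAlchemy-style DSN."""
    parts = urlsplit(dsn)
    return parts.hostname, parts.port


def embedding_dir_ready(path: str) -> bool:
    """True if path is an existing directory containing at least one entry."""
    return os.path.isdir(path) and bool(os.listdir(path))


host, port = db_endpoint(
    "mysql+pymysql://root:ican2026@mysql:3306/ican?charset=utf8mb4"
)
print(host, port)  # mysql 3306
```

A hostname of mysql only resolves if a service with that name exists on the compose network, and a bind mount whose source path is missing on the host typically shows up in the container as an empty directory, which embedding_dir_ready reports as False.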


10. Accelerating Builds in China

In the same Dockerfile, three mirror sources are used:

Layer | Approach
apt | sed replaces deb.debian.org → mirrors.aliyun.com (handles both debian.sources and sources.list)
pip | -i https://pypi.tuna.tsinghua.edu.cn/simple
npm | --registry=https://registry.npmmirror.com

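PyTorch's CPU wheels still come from the official index, --index-url https://download.pytorch.org/whl/cpu. Note that -i is pip's short form of --index-url, so when both appear on one command only the last value takes effect. To keep the CPU index in force, give the torch install that index alone, or add the Tsinghua mirror with --extra-index-url.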
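Since -i is pip's short spelling of --index-url, the two flags fill the same single-valued slot, while --extra-index-url accumulates. The sketch below is an argparse model of that option layout, not pip's own parser; build_parser and the destination names are chosen for illustration:

```python
import argparse

CPU_INDEX = "https://download.pytorch.org/whl/cpu"
MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"


def build_parser() -> argparse.ArgumentParser:
    """Model two index options: one single-valued, one repeatable."""
    parser = argparse.ArgumentParser(prog="pip-install-model", add_help=False)
    parser.add_argument("-i", "--index-url", dest="index_url")
    parser.add_argument(
        "--extra-index-url", dest="extra_index_urls", action="append", default=[]
    )
    return parser


# Both spellings on one command line: the later value replaces the earlier one.
both = build_parser().parse_args(["--index-url", CPU_INDEX, "-i", MIRROR])
print(both.index_url)

# Primary index plus an additional one: both are retained.
mixed = build_parser().parse_args(
    ["--index-url", CPU_INDEX, "--extra-index-url", MIRROR]
)
print(mixed.index_url, mixed.extra_index_urls)
```

When an extra index is added, pip considers candidates from every configured index without priority, so pinning the torch version is advisable in that setup.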


11. Runtime Directory and Data Volumes

Pre-create directories inside the image:

RUN mkdir -p /app/chroma_data /app/uploads /app/logs /app/models

The compose file binds ./logs to /app/logs and uses a named volume for Chroma. Embedding weights are not baked into the image because of their size; they are mounted from a host path. Before deployment, ensure the volume's source path exists on the host and that EMBEDDING_MODEL_PATH matches the mount target inside the container.


12. Summary

  1. Two-stage build: Alpine Node compiles frontend → slim Python runs API, static files via COPY --from=frontend-builder /static/.
  2. CPU torch first, then requirements, controlling image size and satisfying the Embedding stack.
  3. fonts-noto-cjk + tools/pdf_exporter.py font chain for stable Chinese in PDF and matplotlib.
  4. Reinstall node_modules after COPY frontend, avoiding cross-platform node_modules contamination.
  5. Use libgdk-pixbuf-2.0-0 as package name; compose overrides LLM/DB via environment variables, defaulting to Ollama instead of hardcoding a cloud model.
  6. Check MySQL and Embedding mount paths; compose defaults won’t automatically spin up a database or download models.

Next article: FastAPI middleware and rate limiting (Article 16).


Appendix: Key Source Code (Line-by-Line Comments)

The following code is excerpted from the iCan implementation. Each line has a comment above it, so you can follow along even without the public repository.
Generated by: python3 bin/build-ican-annotated-snippets.py

Dockerfile Multi-stage (Excerpt)

# ========== Dockerfile multi-stage (excerpt) ==========
# Source file: iCan/Dockerfile, lines 1-43

# L1: Stage 1 base image: Node on Alpine, named frontend-builder
FROM node:18-alpine AS frontend-builder

# L3: Set the working directory inside the container
WORKDIR /build
# L4: Copy the package manifests first so the dependency layer can be cached
COPY frontend/package.json frontend/package-lock.json ./
# L5: Install frontend dependencies from the npmmirror registry
RUN npm install --registry=https://registry.npmmirror.com
# L6: Copy the rest of the frontend sources (may include the host's node_modules)
COPY frontend/ ./
# L7: Discard any copied node_modules and reinstall for Linux
RUN rm -rf node_modules && npm install --registry=https://registry.npmmirror.com
# L8: Build with Vite (outDir ../static resolves to /static), then list the output
RUN npx vite build && ls -la /static/

# L10: Stage 2 base image: slim Python runtime
FROM python:3.10-slim

# L12: Set the working directory inside the container
WORKDIR /app

# L14: Point apt at the Aliyun mirror (debian.sources format)
RUN sed -i 's/deb.debian.org/mirrors.aliyun.com/g' /etc/apt/sources.list.d/debian.sources 2>/dev/null; \
# L15: Same replacement for the legacy sources.list format
    sed -i 's/deb.debian.org/mirrors.aliyun.com/g' /etc/apt/sources.list 2>/dev/null; \
# L16: Refresh the package index and install without recommended extras
    apt-get update && apt-get install -y --no-install-recommends \
# L17: Compiler toolchain for Python packages with native extensions
    build-essential \
# L18: Pango text layout library
    libpango-1.0-0 \
# L19: Cairo backend for Pango
    libpangocairo-1.0-0 \
# L20: GDK-Pixbuf under its current Debian package name
    libgdk-pixbuf-2.0-0 \
# L21: libffi development files
    libffi-dev \
# L22: Cairo 2D graphics library
    libcairo2 \
# L23: Noto CJK fonts for Chinese text
    fonts-noto-cjk \
# L24: Remove the apt lists to keep the layer small
    && rm -rf /var/lib/apt/lists/*

# L26: Copy the Python dependency list
COPY requirements.txt .

# L28: Install CPU-only torch from the PyTorch index alone (-i is the short form of --index-url, so a mirror passed with -i here would replace the CPU index)
RUN pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu && \
# L29: Install the remaining requirements from the Tsinghua mirror
    pip install --no-cache-dir -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple && \
# L30: Install matplotlib explicitly for the PDF charts
    pip install --no-cache-dir matplotlib -i https://pypi.tuna.tsinghua.edu.cn/simple

# L32: Copy the project metadata
COPY pyproject.toml .
# L33: Copy the backend sources
COPY src/ ./src/

# L35: Copy the built frontend from stage 1 into /app/static
COPY --from=frontend-builder /static/ ./static/

# L37: Pre-create the runtime directories used for data, uploads, logs and models
RUN mkdir -p /app/chroma_data /app/uploads /app/logs /app/models

# L39: Declare the port the container listens on
EXPOSE 8000

# L41: Switch the working directory so ican.main is importable
WORKDIR /app/src

# L43: Container start command (production entry point)
CMD ["uvicorn", "ican.main:app", "--host", "0.0.0.0", "--port", "8000"]

PDF Font Registration

# ========== PDF font registration ==========
# Source file: tools/pdf_exporter.py, lines 283-330

# Module-level imports the excerpt relies on (outside the quoted line range)
import io
import os

# L283: Build the PDF from the Markdown report and return it as bytes
def _build_pdf(report_md: str, title: str, show_charts: bool) -> bytes:
    # L284: Page size constant
    from reportlab.lib.pagesizes import A4
    # L285: Platypus layout building blocks
    from reportlab.platypus import (
        # L286: Document template, text, spacing, image and table flowables
        SimpleDocTemplate, Paragraph, Spacer, Image, Table, TableStyle,
        # L287: Wrapper that keeps a group of flowables on one page
        KeepTogether,
    # L288: End of the import list
    )
    # L289: Paragraph style class
    from reportlab.lib.styles import ParagraphStyle
    # L290: Centimeter unit for margins
    from reportlab.lib.units import cm
    # L291: Color helpers
    from reportlab.lib import colors
    # L292: Font registry
    from reportlab.pdfbase import pdfmetrics
    # L293: TrueType font loader
    from reportlab.pdfbase.ttfonts import TTFont
    # L294: Text alignment constants
    from reportlab.lib.enums import TA_CENTER, TA_JUSTIFY, TA_LEFT

    # L296: In-memory buffer that receives the PDF bytes
    buf = io.BytesIO()
    # L297: Create the document template
    doc = SimpleDocTemplate(
        # L298: Write into the in-memory buffer
        buf,
        # L299: A4 pages
        pagesize=A4,
        # L300: Left margin of 2 cm
        leftMargin=2 * cm,
        # L301: Right margin of 2 cm
        rightMargin=2 * cm,
        # L302: Top margin of 2 cm
        topMargin=2 * cm,
        # L303: Bottom margin of 2 cm
        bottomMargin=2 * cm,
    # L304: End of the template arguments
    )

    # L306: Default font, used if no CJK font can be registered
    cn_font = "Helvetica"
    # L307: Probe candidate font files in priority order
    for fp in [
        # L308: Noto Sans CJK collection (OpenType directory)
        "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
        # L309: Noto Sans CJK collection (TrueType directory)
        "/usr/share/fonts/truetype/noto/NotoSansCJK-Regular.ttc",
        # L310: Noto Sans CJK SC as a single OTF file
        "/usr/share/fonts/opentype/noto/NotoSansCJKsc-Regular.otf",
        # L311: WenQuanYi Zen Hei
        "/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc",
        # L312: WenQuanYi Micro Hei
        "/usr/share/fonts/truetype/wqy/wqy-microhei.ttc",
        # L313: Droid Sans Fallback
        "/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf",
        # L314: macOS PingFang
        "/System/Library/Fonts/PingFang.ttc",
        # L315: macOS STHeiti Light
        "/System/Library/Fonts/STHeiti Light.ttc",
        # L316: macOS Hiragino Sans GB
        "/System/Library/Fonts/Hiragino Sans GB.ttc",
        # L317: macOS Arial Unicode
        "/Library/Fonts/Arial Unicode.ttf",
    # L318: End of the candidate list
    ]:
        # L319: Only try files that exist on this machine
        if os.path.exists(fp):
            # L320: Registration can fail for unsupported font files
            try:
                # L321: Register the file under the name CNFont
                pdfmetrics.registerFont(TTFont("CNFont", fp))
                # L322: Use the registered font from here on
                cn_font = "CNFont"
                # L323: Stop at the first font that registers successfully
                break
            # L324: A font that fails to load is skipped; the next candidate is tried
            except Exception:
                # L325: Ignore the error and continue probing
                pass

    # L327: Accent color (teal)
    accent_c = colors.HexColor("#0d9488")
    # L328: Body text color
    text_c = colors.HexColor("#1f2937")
    # L329: Secondary gray
    gray_c = colors.HexColor("#6b7280")
    # L330: Border color
    border_c = colors.HexColor("#e5e7eb")

Series Navigation

Article | Topic
1 | System Overview
2 | Five Agent Collaboration
3 | Holland RIASEC
4–7 | State · Routing · Nesting · Fault Tolerance
8–11 | LLM Layer · SSE/WS · DB Migration · PDF
12–14 | JSON Prompt · RIASEC Prompt · Guide Prompt
15–17 | Docker · Middleware · Configuration
