0. Series Loop (Read Along Without Public Source Code)
End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (Parsing → Analysis → Matching → Report) → tools/pdf_exporter PDF.
This Post: Post 1/17 · Overview · 17-Post Map
| Phase | User Sees | Code Entry | Corresponding Post |
|---|---|---|---|
| Create Session | Welcome Message | POST /api/sessions | 09 |
| Multi-turn Chat | SSE Streaming | chat/stream → run_guide_single_turn | 06, 14 |
| Info Sufficient | Start Analysis | _run_analysis_background | 05, 07 |
| Resume Parsing | Progress 30% | run_resume_parser | 12 |
| Profile/RIASEC | Progress 50% | run_profile_analyzer | 03, 13 |
| Career Matching | Progress 70% | run_career_matcher | 02 |
| Report | Progress 90% | run_reporter | 11 |
| Download PDF | File | GET …/report/pdf | 11, 15 |
| Item | Description |
|---|---|
| Before reading this post | None (this is the recommended starting point) |
| After reading this post | You can describe the FastAPI interface layer, the five LangGraph nodes, and the two execution paths (HTTP vs CLI) |
| Next step | Post 02: how the five node functions are chained |
Series full loop index: SERIES-LOOP.md
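The analysis phases listed in the first table above (resume parsing through report) run one after another, each with a progress value. A minimal sketch of that sequential chaining follows. The stage names and percentages come from the table; the stage bodies, the `on_progress` callback, and the choice to report progress when a stage starts are assumptions made for illustration, not the project's actual code.

```python
import asyncio
from typing import Awaitable, Callable

Stage = Callable[[dict], Awaitable[dict]]


# Stub stages: the real functions call LLMs; these only pass state along.
async def run_resume_parser(state: dict) -> dict:
    return {**state, "profile": {"raw_len": len(state["resume_text"])}}


async def run_profile_analyzer(state: dict) -> dict:
    return {**state, "riasec": "placeholder"}


async def run_career_matcher(state: dict) -> dict:
    return {**state, "matches": []}


async def run_reporter(state: dict) -> dict:
    return {**state, "report_md": "# Report"}


# Progress percentages as listed in the phase table.
STAGES: list[tuple[int, Stage]] = [
    (30, run_resume_parser),
    (50, run_profile_analyzer),
    (70, run_career_matcher),
    (90, run_reporter),
]


async def run_analysis_pipeline(
    state: dict, on_progress: Callable[[int, str], None]
) -> dict:
    """Run the four stages in order, reporting progress before each one."""
    for percent, stage in STAGES:
        on_progress(percent, stage.__name__)
        state = await stage(state)
    return state


seen: list[tuple[int, str]] = []
result = asyncio.run(
    run_analysis_pipeline({"resume_text": "..."}, lambda p, n: seen.append((p, n)))
)
```

Keeping the stages in a plain ordered list makes the progress mapping explicit and lets a single stage be swapped for a stub in tests.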
1. What Problem Does It Solve
Common career planning products fall into two extremes:
- Pure questionnaire: you answer a 120-question Holland test and get a code, with no context about your resume or concerns.
- Pure chat: one big prompt handles everything, packing parsing, assessment, matching, and report writing into a single call, which leads to unstable output.
iCan’s approach is: Multi-turn dialogue to collect context → Structured profile → Agent-based analysis and matching → Generate downloadable PDF report.
The backend uses FastAPI (main.py mounts routes like api/routes/chat.py) to provide API and SSE/WebSocket, and uses LangGraph StateGraph (create_workflow() in workflow.py) to orchestrate the 5-stage pipeline.
The sections below describe the architecture and technology choices; key logic is accompanied by code snippets for better understanding.
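To make the SSE side concrete, here is a small sketch of how streamed tokens can be framed as Server-Sent Events. The event names (`token`, `done`), the payload shape, and `fake_token_stream` are hypothetical stand-ins; only the frame layout (an `event:` line, a `data:` line, then a blank line) follows the SSE wire format.

```python
import asyncio
import json
from typing import AsyncIterator


def sse_frame(event: str, payload: dict) -> str:
    """Format one SSE frame: event line, data line, terminating blank line."""
    data = json.dumps(payload, ensure_ascii=False)  # single line, safe for data:
    return f"event: {event}\ndata: {data}\n\n"


async def fake_token_stream() -> AsyncIterator[str]:
    # Stand-in for streamed LLM output.
    for token in ["Hello", ", ", "world"]:
        await asyncio.sleep(0)
        yield token


async def chat_event_stream() -> AsyncIterator[str]:
    """Yield one frame per token, then a final frame marking completion."""
    async for token in fake_token_stream():
        yield sse_frame("token", {"text": token})
    yield sse_frame("done", {})


async def collect() -> list[str]:
    return [frame async for frame in chat_event_stream()]


frames = asyncio.run(collect())
```

In a FastAPI route, a generator like `chat_event_stream()` would typically be wrapped in `StreamingResponse(..., media_type="text/event-stream")`.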
2. Technology Choices (Why Not “One Super Prompt”)
| Layer | Choice | Reason in This Project |
|---|---|---|
| Web | FastAPI | LLM calls are IO-intensive; async routes + SSE streaming provide better response experience |
| Orchestration | LangGraph | Guide needs cyclic questioning, the next four stages need linear execution; graph structure is easier to express than LCEL chains |
| LLM | ChatOpenAI-compatible API | The same invoke_llm wrapper is used throughout; switching to DeepSeek / Ollama only changes base_url and model in .env |
| Report | ReportLab + matplotlib | Controllable Chinese PDF, radar/bar charts, no browser printing required |
| Frontend | Vue 3 + Vite | Chat page + report generation progress + PDF download |
Why not CrewAI / AutoGen: The current flow is a fixed graph (a conditional Guide loop followed by four sequential stages). LangGraph’s add_conditional_edges is sufficient, and the TypedDict state is clear.
3. System Architecture
3.1 Overview (draw.io)
3.2 Data Flow (Mermaid)
Top-level state type: iCanWorkflowState (core/state.py). Guide has its own inner GuideState and subgraph create_guide_graph(), detailed in Post 6.
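As a rough illustration of what a reducer field on the top-level state means, the sketch below declares a cut-down state and merges a node's partial update by hand. Only `workflow_messages: Annotated[list[str], operator.add]` is taken from this post (see the pitfall notes); `current_stage` and `merge_update` are hypothetical, and the merge helper merely imitates what a reducer-aware graph does. It is not LangGraph's implementation.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class iCanWorkflowState(TypedDict, total=False):
    # Reducer field: lists returned by each node are concatenated.
    workflow_messages: Annotated[list[str], operator.add]
    # Hypothetical plain field: without a reducer, the latest write wins.
    current_stage: str


def merge_update(state: dict, update: dict) -> dict:
    """Apply one node's partial update, honoring Annotated reducers."""
    hints = get_type_hints(iCanWorkflowState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints.get(key), "__metadata__", ())
        reducer = metadata[0] if metadata else None
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"workflow_messages": ["guide: done"], "current_stage": "guide"}
state = merge_update(
    state,
    {"workflow_messages": ["resume_parser: ok"], "current_stage": "resume_parser"},
)
```

After the merge, `workflow_messages` holds both entries in order, while `current_stage` is simply overwritten.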
4. Top-Level Workflow Code (Core 30 Lines)
Implementation location: workflow.py → create_workflow().
route_after_guide reads fields like needs_more_info returned by Guide to decide whether to continue the conversation or proceed to the parsing pipeline (with a loop limit, see Post 07).
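A routing function of this kind can be sketched as a pure function over the state. In the sketch below, `needs_more_info` is the field named above; `guide_turns`, `MAX_GUIDE_TURNS`, and the returned node names are assumptions for illustration, and the real loop limit is described in Post 07.

```python
from typing import TypedDict

MAX_GUIDE_TURNS = 8  # hypothetical cap, not the project's actual value


class GuideResult(TypedDict, total=False):
    needs_more_info: bool  # field named in this post
    guide_turns: int       # hypothetical loop counter


def route_after_guide(state: GuideResult) -> str:
    """Return the next node name: loop back to Guide or start parsing."""
    turns = state.get("guide_turns", 0)
    if state.get("needs_more_info", False) and turns < MAX_GUIDE_TURNS:
        return "guide"
    # Either the information is sufficient or the loop limit was reached.
    return "resume_parser"
```

With LangGraph, a function like this is what gets passed to `add_conditional_edges` for the Guide node; because it is pure, it can be unit-tested without building the graph.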
5. Five Agent Responsibilities
| Node | Module | Input → Output | Description |
|---|---|---|---|
| guide_node | agents/guide.py | Dialogue history → collected_info, sufficiency flag | Inner 5 steps: welcome → … → check_sufficiency |
| resume_parser_node | agents/resume_parser.py | Raw text → Structured JSON profile | Uses get_light_model() + JSON fallback parsing |
| profile_analyzer_node | agents/profile_analyzer.py | Profile → Abilities/Personality/Values + RIASEC | Multi-node subgraph, Post 3 covers Holland scoring |
| career_matcher_node | agents/career_matcher.py | Profile → Three-level path recommendations | Vertical / Horizontal / Transition three tiers |
| reporter_node | agents/reporter.py | Analysis results → Markdown report | Then goes through tools/pdf_exporter.py to produce PDF |
Note: In implementation, these are async functions + sub-StateGraphs, not five Python classes.
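The "JSON fallback parsing" mentioned for the resume parser can take several forms. One common pattern, shown here as an assumption rather than the project's actual logic, is to try strict parsing first, then a fenced code block, then the outermost brace span. The function name `parse_json_with_fallback` is hypothetical.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_with_fallback(text: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object recoverable from LLM output, else None."""
    candidates = [text.strip()]

    # Fallback 1: content of a fenced block, with or without a json tag.
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        candidates.append(fenced.group(1).strip())

    # Fallback 2: everything between the first "{" and the last "}".
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start : end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

Returning `None` instead of raising leaves the caller free to decide whether to retry the model call or degrade to a simpler profile.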
6. LLM Integration (Don’t Hardcode “Only DeepSeek”)
llm/providers.py provides a unified wrapper:
- get_chat_model() / get_light_model() → ChatOpenAI(api_key, base_url, model=...)
- invoke_llm / invoke_llm_with_json → Normal text and JSON mode
Configuration from environment variables (config.py + pydantic-settings):
| Scenario | Typical Config |
|---|---|
| Local dev .env | LLM_MODEL_CHAT=deepseek-v4-flash, LLM_BASE_URL=https://api.deepseek.com/v1 |
| Docker + Ollama | qwen3.5:9b + http://host.docker.internal:11434/v1 (Qwen3 requires /no_think injection) |
| Code defaults | Falls back to gpt-4o when env not set (deployment must explicitly write .env) |
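The fallback behavior in the table can be illustrated with a few lines of plain Python. The project itself uses pydantic-settings in config.py; the dataclass below is a dependency-free stand-in. `LLM_MODEL_CHAT` and `LLM_BASE_URL` are the variable names from this post, while `LLM_API_KEY` and the exact default URL string are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LLMSettings:
    model_chat: str
    base_url: str
    api_key: str


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read LLM settings from the environment, falling back to code defaults."""
    return LLMSettings(
        model_chat=env.get("LLM_MODEL_CHAT", "gpt-4o"),
        # Assumed spelling of the OpenAI default endpoint.
        base_url=env.get("LLM_BASE_URL", "https://api.openai.com/v1"),
        # Hypothetical variable name for the key.
        api_key=env.get("LLM_API_KEY", ""),
    )


defaults = load_llm_settings({})
local_dev = load_llm_settings(
    {
        "LLM_MODEL_CHAT": "deepseek-v4-flash",
        "LLM_BASE_URL": "https://api.deepseek.com/v1",
    }
)
```

Passing the environment as a mapping keeps the loader testable and makes the consequence of a missing .env obvious: the defaults point at a different provider.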
Post 8 covers dual-model strategy; currently ResumeParser uses light model, Reporter still uses chat model (inconsistent with some earlier docs; code is the authority).
7. Deployment Highlights (Docker)
Multi-stage Dockerfile: Node builds frontend static assets → Python image installs dependencies → uvicorn ican.main:app.
Pitfalls are mostly in Post 15; here are three key points:
- Install CPU PyTorch first, to avoid sentence-transformers pulling in a huge CUDA package.
- Install fonts-noto-cjk in the image, otherwise Chinese characters in the PDF render as boxes.
- COPY dist separately from the frontend build; do not include the host's node_modules in the image.
8. Directory Structure (Easy to Navigate by Post)
9. Pitfall Records (Compare with Source Code)
- Default model in code is not DeepSeek. In config.py, the default LLM_MODEL_CHAT is gpt-4o, and LLM_BASE_URL defaults to OpenAI’s official URL. Using deepseek-v4-flash locally is a .env deployment configuration, not hardcoded; a missing env var will cause a connection to the wrong endpoint.
- Online conversations and create_workflow() are not the same path. SSE multi-turn chat goes through run_guide_chat() in workflow.py (internally calling run_guide_single_turn in agents/guide.py); once information is sufficient, api/routes/chat.py triggers run_analysis_pipeline(), which directly chains resume_parser → profile_analyzer → career_matcher → reporter, rather than calling ainvoke on the full top-level graph each time. The full run_workflow() + create_workflow() path is only used for CLI or one-shot runs.
- PlannerState is defined but not integrated. PlannerState exists in core/state.py, but the top-level workflow.py has no Planner node; action planning content is currently merged into the report section of agents/reporter.py. Do not write “six Agents” in articles.
- Top-level state uses workflow_messages for accumulation, not messages. The Reducer field in iCanWorkflowState is workflow_messages: Annotated[list[str], operator.add]; messages only appears in the inner GuideState. Mixing them up leads to wrong log fields during debugging.
- Dual-model division is code-authoritative. Comments in llm/providers.py once said Reporter could use a mini model, but agents/reporter.py currently still calls get_chat_model() for chapter generation; only agents/resume_parser.py definitely uses get_light_model().
- Fallback path when Ollama is unavailable. run_analysis_pipeline() first calls check_ollama_available(); on failure, it falls back to _regex_quick_profile + _generate_fallback_report in workflow.py. Report quality drops significantly, so LLM availability must be ensured at the ops level.
10. Series Guide
| Post | Topic |
|---|---|
| 1 | System Overview (This post) |
| 2 | Five Agent Collaboration and iCanWorkflowState |
| 3 | Holland RIASEC + OpenAI Compatible API Deployment Example |
| 4–7 | LangGraph State, Routing, Nesting, Fault Tolerance |
| 8–11 | FastAPI Integration, SSE/WS, DB, PDF |
| 12–14 | Prompt and Stable JSON Output |
| 15–17 | Docker, Middleware, Configuration Management |
11. Summary
iCan’s core is not “switching to a larger model”, but rather using LangGraph to split uncertain LLM steps into testable nodes, using FastAPI to handle streaming interactions, and ReportLab to deliver persistent PDFs.
Next post (LangGraph Multi-Agent Orchestration): how the five subgraphs connect, and how route_after_guide works with the Guide inner loop.