LangGraph Double-Layer StateGraph Nesting: Architectural Practice for Reducing Agent Coupling

0. Series Loop (Follow Along Without the Public Source Code)

End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (Parse → Analyze → Match → Report) → tools/pdf_exporter PDF.
This Article: # 6/17 · Guide Loop · Inner-Outer Two-Layer Graph

Stage	User Visible	Code Entry	Article
Create Session	Welcome Message	POST /api/sessions	09
Multi-turn Dialogue	SSE Streaming	chat/stream → run_guide_single_turn	06, 14
Information Sufficient	Start Analysis	_run_analysis_background	05, 07
Resume Parsing	Progress 30%	run_resume_parser	12
Profile/RIASEC	Progress 50%	run_profile_analyzer	03, 13
Career Matching	Progress 70%	run_career_matcher	02
Report	Progress 90%	run_reporter	11
Download PDF	File	GET …/report/pdf	11, 15

	Description
Before reading this	Article 05: the outer-layer guide_node
After reading this	You can draw the inner 5-node subgraph and explain why the API goes through run_guide_single_turn
Next loop	Article 14: Stage Prompts (Article 7)

Full series loop index: SERIES-LOOP.md

1. What Problem to Solve

The iCan top-level workflow has 5 Agent nodes (Guide → ResumeParser → ProfileAnalyzer → CareerMatcher → Reporter). If the 5 dialogue stages inside Guide (Welcome, Need Assessment, Basic Collection, Deep Mining, Sufficiency Check) are also flattened into the same StateGraph, it would cause:

State field explosion: Guide’s collected_info and current_stage end up in the same TypedDict as the top-level structured_profile and final_report.
Guide changes ripple globally: adjusting the routing logic of check_sufficiency might accidentally affect the top-level route_after_guide.
High testing cost: verifying “back to dig_deeper when info is insufficient” would mean running the graph together with all four analysis stages instead of testing Guide in isolation.

The actual approach is outer 5 nodes + inner Guide subgraph: in the outer workflow.py, guide_node only does state mapping; in the inner agents/guide.py, create_guide_graph() encapsulates 5 function nodes and the conditional edges.

2. Implementation Location: Two-Layer State + Two-Layer Graph

Layer	File	State Type	Entry
Outer	`workflow.py`	`iCanWorkflowState`	`create_workflow()` → `guide_node`
Inner	`agents/guide.py`	`GuideState`	`create_guide_graph()` → `run_guide_agent()`

Two sets of TypedDict in core/state.py separate responsibilities:

# core/state.py — Guide inner layer
import operator
from typing import Annotated, Any, TypedDict

class GuideState(TypedDict, total=False):
    conversation_history: list[dict[str, str]]
    collected_info: dict[str, Any]
    is_info_sufficient: bool
    messages: Annotated[list[str], operator.add]  # reducer merges AI replies
    current_stage: str
    # ...

# core/state.py — top-level aggregation
class iCanWorkflowState(TypedDict, total=False):
    session_id: str
    conversation_history: list[dict[str, str]]
    needs_more_info: bool
    structured_profile: dict[str, Any]
    final_report: str
    # ...

The outer layer only cares about needs_more_info, conversation_history, raw_input; the inner layer holds current_stage, missing_fields, emotion_state.
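
What the Annotated[list[str], operator.add] annotation changes about merging can be illustrated without LangGraph itself. The sketch below is a simplified stand-in for the framework's merge step, not its actual implementation; merge_update and REDUCERS are hypothetical names introduced only for this illustration.

```python
import operator
from typing import Any, Callable

# Hypothetical reducer table: fields listed here are combined with the old
# value; every other field is simply overwritten by the update.
REDUCERS: dict[str, Callable[[Any, Any], Any]] = {"messages": operator.add}


def merge_update(state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Return a new state with `update` merged in, honoring REDUCERS."""
    merged = dict(state)
    for key, value in update.items():
        if key in REDUCERS and key in merged:
            merged[key] = REDUCERS[key](merged[key], value)  # list + list
        else:
            merged[key] = value  # plain overwrite
    return merged


state = {"messages": ["Welcome!"], "current_stage": "greeting"}
state = merge_update(
    state, {"messages": ["What do you do?"], "current_stage": "basic"}
)
print(state["messages"])       # both replies are kept
print(state["current_stage"])  # overwritten by the latest update
```

Fields with a reducer accumulate across nodes, while fields without one keep only the last value written. That is why messages grows as the inner nodes run, whereas current_stage always reflects the most recent node.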

Figure: Double-layer StateGraph nesting

3. Outer `guide_node`: Facade, Not a Guide Class

There is no GuideAgent class in workflow.py, only the async function guide_node. It does three things in order: extract fields from the outer state, call the inner agent, and write the result back:

# workflow.py — guide_node (excerpt)
async def guide_node(state: iCanWorkflowState) -> dict:
    conversation_history = list(state.get("conversation_history", []))
    raw_input = state.get("raw_input", "")
    if raw_input:
        conversation_history.append({"role": "user", "content": raw_input})

    guide_state: GuideState = create_initial_guide_state()
    guide_state["conversation_history"] = conversation_history

    guide_result = await run_guide_agent(guide_state)

    guide_messages = guide_result.get("messages", [])
    latest_reply = guide_messages[-1] if guide_messages else ""
    updated_history = list(conversation_history)
    if latest_reply:
        updated_history.append({"role": "assistant", "content": latest_reply})

    is_sufficient = guide_result.get("is_info_sufficient", False)
    result = {
        "conversation_history": updated_history,
        "current_agent": "guide",
        "needs_more_info": not is_sufficient,
    }
    collected_info = guide_result.get("collected_info", {})
    if collected_info.get("collected_raw"):
        result["raw_input"] = collected_info["collected_raw"]
    return result

The outer layer does not know inner node names such as welcome or dig_deeper; it only reads is_info_sufficient and messages[-1], plus collected_info["collected_raw"] when present.
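
Because the facade is only field mapping, its write-back half can be checked without an LLM or a graph. The sketch below pulls that half into a pure function; map_guide_result is a hypothetical helper name, not a function from the iCan code, and the sample result dict is made up for the test.

```python
from typing import Any


def map_guide_result(
    conversation_history: list[dict[str, str]],
    guide_result: dict[str, Any],
) -> dict[str, Any]:
    """Hypothetical helper mirroring the write-back half of guide_node."""
    messages = guide_result.get("messages", [])
    latest_reply = messages[-1] if messages else ""
    updated_history = list(conversation_history)  # never mutate the input
    if latest_reply:
        updated_history.append({"role": "assistant", "content": latest_reply})
    update: dict[str, Any] = {
        "conversation_history": updated_history,
        "current_agent": "guide",
        "needs_more_info": not guide_result.get("is_info_sufficient", False),
    }
    raw = guide_result.get("collected_info", {}).get("collected_raw")
    if raw:
        update["raw_input"] = raw
    return update


history = [{"role": "user", "content": "I am a backend engineer."}]
fake_result = {
    "messages": ["Hello!", "Which industry are you in?"],
    "is_info_sufficient": False,
    "collected_info": {},
}
update = map_guide_result(history, fake_result)
print(update["needs_more_info"], len(update["conversation_history"]))
```

With a stubbed inner result, the outer contract (three or four keys in the returned dict) can be asserted in a plain unit test.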

The outer loop is controlled by route_after_guide:

# workflow.py — route_after_guide (excerpt)
def route_after_guide(state: iCanWorkflowState) -> str:
    if not state.get("needs_more_info", True):
        return "resume_parser_node"
    user_msg_count = len([m for m in state.get("conversation_history", [])
                          if m.get("role") == "user"])
    if user_msg_count >= 3:
        return "resume_parser_node"  # force into analysis
    return "guide_node"
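
Since route_after_guide only inspects two state fields, it can be exercised in isolation. The snippet below copies the rule from the excerpt, typed against a plain dict so it runs without the project; user_turns is a hypothetical test helper.

```python
from typing import Any


def route_after_guide(state: dict[str, Any]) -> str:
    """Copy of the routing rule from the excerpt, using a plain dict."""
    if not state.get("needs_more_info", True):
        return "resume_parser_node"
    user_msg_count = len(
        [m for m in state.get("conversation_history", []) if m.get("role") == "user"]
    )
    if user_msg_count >= 3:
        return "resume_parser_node"  # forced exit after three user turns
    return "guide_node"


def user_turns(n: int) -> list[dict[str, str]]:
    """Hypothetical helper: n user messages, each followed by a reply."""
    history: list[dict[str, str]] = []
    for i in range(n):
        history.append({"role": "user", "content": f"message {i}"})
        history.append({"role": "assistant", "content": f"reply {i}"})
    return history


print(route_after_guide({"needs_more_info": False}))  # resume_parser_node
print(route_after_guide({"needs_more_info": True, "conversation_history": user_turns(2)}))  # guide_node
print(route_after_guide({"needs_more_info": True, "conversation_history": user_turns(3)}))  # resume_parser_node
print(route_after_guide({}))  # guide_node
```

Note the default: a state without needs_more_info is treated as still needing information, so an empty state loops back to guide_node.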

4. Inner `create_guide_graph()`: Five Nodes + Conditional Loop

The inner graph is built in agents/guide.py, with all nodes being async functions, not class methods:

# agents/guide.py — create_guide_graph (excerpt)
def create_guide_graph() -> StateGraph:
    graph = StateGraph(GuideState)
    graph.add_node("welcome", welcome)
    graph.add_node("assess_need", assess_need)
    graph.add_node("collect_basic_info", collect_basic_info)
    graph.add_node("dig_deeper", dig_deeper)
    graph.add_node("check_sufficiency", check_sufficiency)

    graph.set_entry_point("welcome")
    graph.add_edge("welcome", "assess_need")
    graph.add_edge("assess_need", "collect_basic_info")
    graph.add_edge("collect_basic_info", "dig_deeper")
    graph.add_edge("dig_deeper", "check_sufficiency")
    graph.add_conditional_edges(
        "check_sufficiency",
        should_continue,
        {"dig_deeper": "dig_deeper", "handoff": END},
    )
    return graph.compile()

run_guide_agent builds the graph through create_guide_graph() on every call and then runs ainvoke with recursion_limit=15:

async def run_guide_agent(state: GuideState) -> dict:
    graph = create_guide_graph()
    result = await graph.ainvoke(state, config={"recursion_limit": 15})
    return result

The inner loop is decided by should_continue: when is_info_sufficient is True it returns handoff (mapped to END); otherwise it routes back to dig_deeper. As a safety valve, it also forces handoff once loop_count >= 8, where loop_count is estimated from the length of the messages list.
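
The routing rule just described can be sketched as follows. This is a reconstruction from the description, not the project's source: in particular, how loop_count is derived from messages is an assumption here (the sketch simply uses the list length), and MAX_LOOP_COUNT is a hypothetical constant name.

```python
from typing import Any

MAX_LOOP_COUNT = 8  # threshold quoted in the text


def should_continue(state: dict[str, Any]) -> str:
    """Sketch of the inner router; returns a key of the edge mapping."""
    if state.get("is_info_sufficient", False):
        return "handoff"
    # Assumption: the loop count is approximated by the number of messages.
    loop_count = len(state.get("messages", []))
    if loop_count >= MAX_LOOP_COUNT:
        return "handoff"  # forced exit, even though info is still insufficient
    return "dig_deeper"


print(should_continue({"is_info_sufficient": True}))
print(should_continue({"is_info_sufficient": False, "messages": ["q"] * 3}))
print(should_continue({"is_info_sufficient": False, "messages": ["q"] * 8}))
```

The return values must match the keys of the mapping passed to add_conditional_edges ("dig_deeper" and "handoff"); a router that returns any other string has no edge to follow.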

Each inner node calls the LLM via get_chat_model() + invoke_llm() (see Article 8). On an exception, the node returns a fixed fallback phrase and does not retry the model.
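
A minimal sketch of that "fixed phrase, no retry" behavior, using only asyncio. The names reply_or_fallback and FALLBACK_REPLY, the fallback wording, and the two fake model functions are all hypothetical; the real nodes call invoke_llm(), which is not reproduced here.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical canned phrase; the project's actual wording is not shown.
FALLBACK_REPLY = "Sorry, something went wrong. Could you tell me more about your background?"


async def reply_or_fallback(call_llm: Callable[[], Awaitable[str]]) -> str:
    """Call the model once; on any error return the canned phrase, no retry."""
    try:
        return await call_llm()
    except Exception:
        return FALLBACK_REPLY


async def working_llm() -> str:
    return "What industry do you work in?"


async def failing_llm() -> str:
    raise TimeoutError("simulated LLM timeout")


print(asyncio.run(reply_or_fallback(working_llm)))
print(asyncio.run(reply_or_fallback(failing_llm)))
```

The trade-off is that a transient failure surfaces to the user as a generic reply, but the graph keeps moving and never blocks on repeated model calls.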

5. Difference from API Path: Subgraph Not Used by All Entries

This is key to understanding the nested architecture: The HTTP dialogue API does NOT go through the inner 5-node graph by default.

Entry	Call Chain	Uses `create_guide_graph`?
Top-level `run_workflow()`	`guide_node` → `run_guide_agent`	Yes
`POST /api/sessions/.../chat`	`run_guide_chat` → `run_guide_single_turn`	No (single-turn LLM)
`POST .../chat/stream`	Direct `model.astream` + keyword sufficiency check	No

run_guide_chat in workflow.py explicitly uses single-turn mode:

async def run_guide_chat(conversation_history: list, user_message: str) -> dict:
    from ican.agents.guide import run_guide_single_turn
    result = await run_guide_single_turn(conversation_history, user_message)
    # update history, return reply / is_info_sufficient

Thus: the nested subgraph serves the batch-style top-level workflow, while online per-turn chat uses run_guide_single_turn or SSE streaming. The two paths also judge sufficiency differently: the inner check_sufficiency node asks the LLM to decide sufficient/insufficient, whereas the API path relies on a keyword-based check.
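
The keyword-based check on the API path can be sketched as a pure function. The thresholds (6 keywords, or 4 keywords plus 50 characters) follow the appendix listing; the keyword list is shortened for illustration, and the function name is_info_sufficient is a hypothetical wrapper.

```python
# Shortened keyword list for illustration; the real list is longer.
KEYWORDS = ["year", "industry", "position", "skill", "experience", "company", "major", "degree"]


def is_info_sufficient(user_texts: list[str]) -> tuple[bool, list[str]]:
    """Return (sufficient?, matched keywords) for all user-side text."""
    all_text = " ".join(user_texts)
    found = [kw for kw in KEYWORDS if kw in all_text]  # substring match
    sufficient = len(found) >= 6 or (len(found) >= 4 and len(all_text) >= 50)
    return sufficient, found


short = is_info_sufficient(["I want a new job."])
rich = is_info_sufficient([
    "I have 5 years of experience in the finance industry.",
    "My major was statistics and my degree is a master.",
])
print(short)  # (False, [])
print(rich)   # (True, ['year', 'industry', 'experience', 'major', 'degree'])
```

Matching is by substring, so "year" also matches "years". This is cheap and deterministic but can be triggered by keywords used in an unrelated sense, which is one reason its verdict may differ from the LLM-based judgment in the subgraph.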

6. Position in the Pipeline

Complete top-level edges (create_workflow):

guide_node → route_after_guide
    ├─ needs_more_info → guide_node (loop)
    └─ sufficient / forced exit → resume_parser_node → profile_analyzer_node
        → career_matcher_node → reporter_node → END

A single ainvoke of the inner graph runs welcome → … → check_sufficiency in sequence and, if needed, loops between dig_deeper and check_sufficiency. Every time the outer guide_node is scheduled, it calls create_initial_guide_state() and starts again from welcome. When the full workflow runs in one go with no real-time user input, this repeats the welcome message on every outer iteration. This is a design trade-off, not a limitation of the LangGraph framework.

The other four Agents (resume_parser, profile_analyzer, etc.) follow the same pattern as Guide: an outer node function plus an inner run_* subgraph or pipeline, differing only in the number of inner nodes. The only top-level graph is the one built by create_workflow() in workflow.py.

7. Pitfalls

① Comment says “loop at most 2 times”, code doesn’t use 2
The comment in should_continue says “loop at most 2 times”, but the code actually uses loop_count >= 8; the outer route_after_guide uses user_msg_count >= 3 to force into analysis. When documenting or changing requirements, rely on grep results, not docstrings.

② run_guide_agent recompiles the graph each time
create_guide_graph() calls graph.compile() each time run_guide_agent is called, without module-level caching. For frequent Guide calls, caching the compiled graph is possible but not implemented in the current MVP.
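
One possible shape for such a cache, shown with a stand-in builder so the snippet runs without LangGraph. This is not part of the current code: get_guide_graph is a hypothetical accessor, and create_guide_graph here is a dummy that only counts how often it runs.

```python
from functools import lru_cache

build_calls = 0


def create_guide_graph() -> object:
    """Stand-in for the real builder; counts how often it runs."""
    global build_calls
    build_calls += 1
    return object()  # placeholder for the compiled graph


@lru_cache(maxsize=1)
def get_guide_graph() -> object:
    """Build the graph on first use and reuse the same instance afterwards."""
    return create_guide_graph()


first = get_guide_graph()
second = get_guide_graph()
print(first is second, build_calls)  # True 1
```

This only pays off if the compiled graph carries no per-request data, which should be verified before adopting it; each call would still pass its own fresh GuideState to ainvoke.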

③ Outer guide_node always starts from welcome
create_initial_guide_state() resets current_stage to "greeting", and the graph's entry point is fixed to welcome, so every invocation starts from the welcome node. If the outer route_after_guide returns to guide_node multiple times, the welcome node runs again each time. This matters when running the workflow in batch; the online API is unaffected because it uses run_guide_single_turn.

④ Dual track: messages reducer and conversation_history
Inner AI replies go into GuideState.messages (Annotated add); outer persistence uses conversation_history (list of role/content dicts). guide_node only maps messages[-1] into history; intermediate multi-message outputs from inner nodes are not fully carried to the outer layer.
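
To make the loss concrete, the sketch below compares the current "last reply only" mapping with a hypothetical alternative that carries every inner reply outward. replies_to_history is an illustrative name and the sample messages are invented; the project does not do this today.

```python
def replies_to_history(messages: list[str]) -> list[dict[str, str]]:
    """Hypothetical alternative: keep every non-empty inner reply."""
    return [{"role": "assistant", "content": m} for m in messages if m]


# Invented sample of what GuideState.messages might hold after one run.
inner_messages = [
    "Welcome!",
    "What is your current role?",
    "Tell me about a recent project.",
]

last_only = [{"role": "assistant", "content": inner_messages[-1]}]
all_replies = replies_to_history(inner_messages)
print(len(last_only), len(all_replies))  # 1 3
```

Whether carrying all replies is desirable depends on the consumer: the outer history feeds later prompts, so appending several consecutive assistant turns may or may not be what downstream agents expect.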

8. Summary

Nested structure: outer iCanWorkflowState + guide_node, inner GuideState + create_guide_graph(), implemented as function nodes rather than an Agent class.
The outer facade only does field mapping; the inner 5 nodes + should_continue handle dialogue stages and dig_deeper loop.
API chat goes through run_guide_single_turn, NOT the inner subgraph; the subgraph is mainly used for the run_workflow / guide_node path.
Each of the two layers has its own exit conditions (inner: loop_count/LLM sufficiency, outer: needs_more_info/user turns). When debugging, clarify which layer is looping.
To modify Guide behavior, first confirm whether the change is in the subgraph nodes or in the single-turn/API streaming path.

Next article: LangGraph error handling and fault tolerance (workflow.py node excepts, run_analysis_pipeline degradation).

Appendix: Key Source Code (Line-by-Line Annotations)

The following code is extracted from the iCan implementation. Each line is preceded by an explanatory comment, so you can follow along even without a public repository.
Generation command: python3 bin/build-ican-annotated-snippets.py

create_guide_graph

# ========== create_guide_graph ==========
# Source file: agents/guide.py   Lines 375-431

# L375: Synchronous factory function create_guide_graph: builds and compiles the Guide subgraph
def create_guide_graph() -> StateGraph:
# L377: [Doc] Create StateGraph for dialogue guidance Agent.
# L379: [Doc] Function description:
# L380: [Doc] Build the LangGraph workflow graph for the dialogue guidance Agent, defining nodes and edges.
# L381: [Doc] The workflow executes in the following order:
# L382: [Doc] welcome -> assess_need -> collect_basic_info -> dig_deeper -> check_sufficiency
# L383: [Doc] check_sufficiency uses conditional routing:
# L384: [Doc] - Insufficient info -> dig_deeper (loop continues dialogue)
# L385: [Doc] - Sufficient info -> END (workflow ends)
# L387: [Doc] Input parameters:
# L388: [Doc] None
# L390: [Doc] Output parameters:
# L391: [Doc] StateGraph: compiled LangGraph StateGraph instance, ready for invoke.
# (Lines 376-392 are function/module docstring, converted to comments for readability)
# L393: Start try block; the except below logs the error and re-raises
    try:
# L394: Log that graph construction is starting
        logger.info("[create_guide_graph] Starting to create StateGraph for dialogue guidance Agent")

# L396: Create LangGraph state graph; the TypedDict in parentheses defines fields shared/passed among nodes
        graph = StateGraph(GuideState)

# L398: Add nodes
# L399: Register node "welcome", backed by the async function welcome
        graph.add_node("welcome", welcome)
# L400: Register node "assess_need", backed by the async function assess_need
        graph.add_node("assess_need", assess_need)
# L401: Register node "collect_basic_info", backed by the async function collect_basic_info
        graph.add_node("collect_basic_info", collect_basic_info)
# L402: Register node "dig_deeper", backed by the async function dig_deeper
        graph.add_node("dig_deeper", dig_deeper)
# L403: Register node "check_sufficiency", backed by the async function check_sufficiency
        graph.add_node("check_sufficiency", check_sufficiency)

# L405: Set entry point
# L406: Set graph entry: the first node executed during ainvoke
        graph.set_entry_point("welcome")

# L408: Define sequential edges
# L409: Unconditional edge: welcome always hands over to assess_need
        graph.add_edge("welcome", "assess_need")
# L410: Unconditional edge: assess_need always hands over to collect_basic_info
        graph.add_edge("assess_need", "collect_basic_info")
# L411: Unconditional edge: collect_basic_info always hands over to dig_deeper
        graph.add_edge("collect_basic_info", "dig_deeper")
# L412: Unconditional edge: dig_deeper always hands over to check_sufficiency
        graph.add_edge("dig_deeper", "check_sufficiency")

# L414: Define conditional edge: route after check_sufficiency based on sufficiency judgment
# L415: Conditional edge: the routing function's return value selects the next node
        graph.add_conditional_edges(
# L416: Source node whose result is routed
            "check_sufficiency",
# L417: Routing function; returns "dig_deeper" or "handoff"
            should_continue,
# L418: Mapping from routing result to target node
            {
# L419: Information insufficient: loop back to dig_deeper
                "dig_deeper": "dig_deeper",
# L420: Information sufficient or loop cap reached: end the subgraph
                "handoff": END,
# L421: End of mapping
            },
# L422: End of add_conditional_edges call
        )

# L424: Compile StateGraph to obtain ainvoke-able Runnable
        compiled_graph = graph.compile()
# L425: Log that the graph was built and compiled
        logger.info("[create_guide_graph] StateGraph created and compiled successfully")
# L426: Return the compiled graph to the caller (this function is a factory, not a node)
        return compiled_graph

# L428: Catch any exception raised while building the graph
    except Exception as e:
# L429: Log the failure together with its traceback
        logger.error("[create_guide_graph] Exception while creating StateGraph: %s", e, exc_info=True)
# L430: Re-raise the exception so the caller handles it
        raise

Outer guide_node facade

# ========== Outer guide_node facade ==========
# Source file: workflow.py   Lines 35-105

# L35: Async function guide_node: can be awaited, suitable for IO-bound LLM/DB calls
async def guide_node(state: iCanWorkflowState) -> dict:
# L37: [Doc] Dialogue guidance node: call GuideAgent for multi-turn information collection
# L39: [Doc] Function description:
# L40: [Doc] Pass conversation history and user information from top-level workflow state to GuideAgent,
# L41: [Doc] call run_guide_agent to execute multi-turn dialogue guidance, collect user basic information,
# L42: [Doc] career confusion, expectations, and other key information. Update workflow state based on dialogue result.
# L44: [Doc] Input:
# L45: [Doc] state (iCanWorkflowState): Top-level workflow state, containing conversation_history, raw_input, etc.
# L47: [Doc] Output:
# L48: [Doc] dict: State update dictionary, containing updates for conversation_history, current_agent, needs_more_info, etc.
# (Lines 36-49 are function/module docstring, converted to comments for readability)
# L50: Start try block (the matching except lies outside this excerpt)
    try:
# L51: Log the node's inputs for debugging
        logger.info(
# L52: Log format string
            "[guide_node] Starting execution, input: session_id=%s, conversation_history length=%d, raw_input length=%d",
# L53: Session identifier
            state.get("session_id"),
# L54: Number of entries in the conversation history
            len(state.get("conversation_history", [])),
# L55: Length of the raw user input
            len(state.get("raw_input", "")),
# L56: End of logger.info call
        )

# L58: Build input state for GuideAgent
# L59: Copy the history so the outer state is not mutated in place
        conversation_history = list(state.get("conversation_history", []))
# L60: Latest raw user input, empty string if absent
        raw_input = state.get("raw_input", "")

# L62: Append user's latest message to conversation history
# L63: Only append when there is new input
        if raw_input:
# L64: Add the input as a user turn {role, content}
            conversation_history.append({"role": "user", "content": raw_input})

# L66: Create a fresh inner GuideState on every call
        guide_state: GuideState = create_initial_guide_state()
# L67: Hand the conversation history to the inner state
        guide_state["conversation_history"] = conversation_history

# L69: Call GuideAgent
# L70: Run Guide inner full subgraph (used by CLI/top-level guide_node)
        guide_result = await run_guide_agent(guide_state)

# L72: Extract GuideAgent's reply
# L73: All AI replies accumulated by the inner graph
        guide_messages = guide_result.get("messages", [])
# L74: Only the last reply is carried to the outer layer
        latest_reply = guide_messages[-1] if guide_messages else ""

# L76: Update conversation history (add AI reply)
# L77: Copy the history before appending
        updated_history = list(conversation_history)
# L78: Skip the append when the inner graph produced no reply
        if latest_reply:
# L79: Add the reply as an assistant turn
            updated_history.append({"role": "assistant", "content": latest_reply})

# L81: Determine if information is sufficient
# L82: Whether Guide judges user info sufficient to enter analysis stage
        is_sufficient = guide_result.get("is_info_sufficient", False)

# L84: Collected information
# L85: Information gathered by the inner graph, empty dict if absent
        collected_info = guide_result.get("collected_info", {})

# L87: Build the state update returned to the outer graph
        result = {
# L88: Updated multi-turn history, elements {role, content}
            "conversation_history": updated_history,
# L89: Mark guide as the agent that just ran
            "current_agent": "guide",
# L90: Whether to continue Guide loop; False means can proceed to resume_parser
            "needs_more_info": not is_sufficient,
# L91: End of result dict
        }

# L93: Store collected raw information for subsequent ResumeParser usage
# L94: Conditional branch
        if collected_info:
# L95: Assignment: update local variable or state field
            raw_collected = collected_info.get("collected_raw", "")
# L96: Conditional branch
            if raw_collected:
# L97: Assignment: update local variable or state field
                result["raw_input"] = raw_collected

# L99: Log the node's outputs for debugging
        logger.info(
# L100: Log format string
            "[guide_node] Execution complete, output: is_sufficient=%s, needs_more_info=%s, conversation_history length=%d",
# L101: Sufficiency flag from the inner graph
            is_sufficient,
# L102: Value written to needs_more_info
            not is_sufficient,
# L103: Length of the updated history
            len(updated_history),
# L104: End of logger.info call
        )
# L105: Return this node's fields to be merged into state (LangGraph will merge)
        return result
# (The excerpt ends at source line 105; the except block that closes this try is not shown.)

run_guide_single_turn (API actual path)

# ========== run_guide_single_turn (API actual path) ==========
# Source file: agents/guide.py   Lines 465-520

# L465: Async function run_guide_single_turn: can be awaited, suitable for IO-bound LLM/DB calls
async def run_guide_single_turn(conversation_history: list, user_message: str) -> dict:
# L467: [Doc] Single-turn dialogue mode: directly call LLM for one round of guidance, without StateGraph loop.
# L469: [Doc] Function description:
# L470: [Doc] Based on existing conversation history and user's new message, call LLM to generate a reply.
# L471: [Doc] Does not use the internal StateGraph loop mechanism, suitable for per-turn interaction with the user.
# L473: [Doc] Input parameters:
# L474: [Doc] conversation_history (list): existing conversation history
# L475: [Doc] user_message (str): user's latest message
# L477: [Doc] Output parameters:
# L478: [Doc] dict: Contains reply (AI reply), is_info_sufficient (whether info is sufficient), collected_info (collected info)
# (Lines 466-479 are function/module docstring, converted to comments for readability)
# L480: Start try block; except handles fallback
    try:
# L481: API single-turn Guide: does not run inner 5-node subgraph, single LLM reply
        logger.info("[run_guide_single_turn] Starting execution, user message length=%d, history length=%d", len(user_message), len(conversation_history))

# L483: Start the prompt with the system message
        messages = [
# L484: System prompt for the Guide role
            {"role": "system", "content": GUIDE_SYSTEM_PROMPT},
# L485: End of initial message list
        ]
# L486: Replay the existing history, elements {role, content}
        for msg in conversation_history:
# L487: Append each historical message unchanged
            messages.append(msg)
# L488: Append the user's new message last
        messages.append({"role": "user", "content": user_message})

# L490: Get the dialogue LLM instance (config from settings.LLM_MODEL_CHAT)
        model = get_chat_model()
# L491: Call LLM to return plain text, with 60s timeout and Qwen3 /no_think injection
        reply = await invoke_llm(model, messages)

# L493: Start accumulating all user-side text with the new message
        all_user_text = user_message
# L494: Walk the history, elements {role, content}
        for msg in conversation_history:
# L495: Keep only user turns
            if msg.get("role") == "user":
# L496: Concatenate the user's earlier text
                all_user_text += " " + msg.get("content", "")

# L498: Keywords whose presence suggests the user has shared career details
        optional_keywords = ["year", "industry", "position", "job title", "skill", "experience", "company", "major", "degree", "direction", "expectation", "confusion",
# L499: (keyword list continued)
                             "work", "development", "engineer", "manager", "operations", "product", "design", "data", "architecture", "management",
# L500: (keyword list continued)
                             "experience", "project", "responsible", "participated", "university", "undergraduate", "master", "PhD"]
# L501: Keywords that occur in the user's text (substring match)
        found_keywords = [kw for kw in optional_keywords if kw in all_user_text]

# L503: Sufficiency rule: either branch below is enough
        is_sufficient = (
# L504: At least 6 keywords found
            (len(found_keywords) >= 6) or
# L505: Or at least 4 keywords and at least 50 characters of user text
            (len(found_keywords) >= 4 and len(all_user_text) >= 50)
# L506: End of sufficiency expression
        )

# L508: Pass the concatenated user text on as the raw collected input
        collected_info = {"collected_raw": all_user_text}

# L510: Log the sufficiency result and the matched keywords
        logger.info("[run_guide_single_turn] Execution complete, is_sufficient=%s, found_keywords=%s", is_sufficient, found_keywords)

# L512: Return a plain result dict to the caller (this function is not a graph node)
        return {
# L513: AI reply text, empty string if the model returned nothing
            "reply": reply or "",
# L514: Whether Guide judges user info sufficient to enter analysis stage
            "is_info_sufficient": is_sufficient,
# L515: Collected information
            "collected_info": collected_info,
# L516: End of result dict
        }

# L518: Catch any exception so the API request still gets a response
    except Exception as e:
# L519: Log the failure together with its traceback
        logger.error("[run_guide_single_turn] Single-turn dialogue exception: %s", e, exc_info=True)
# L520: Return a fallback dict (the excerpt is truncated here; its fields are not shown)
        return {

Article	Topic
1	System Overview
2	Five-Agent Collaboration
3	Holland RIASEC
4–7	State · Routing · Nesting · Fault Tolerance
8–11	LLM Layer · SSE/WS · DB Migration · PDF
12–14	JSON Prompt · RIASEC Prompt · Guide Prompt
15–17	Docker · Middleware · Configuration
