0. Series Loop (Readable Without Public Source Code)

End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi‑turn SSE → run_analysis_pipeline (Parse → Analyze → Match → Report) → tools/pdf_exporter PDF.
This Article: 4/17 · Data Loop · State TypedDict

| Stage | User Visible | Code Entry | Corresponding Article |
|---|---|---|---|
| Create Session | Welcome Message | POST /api/sessions | 09 |
| Multi‑turn Dialogue | SSE Streaming | chat/stream → run_guide_single_turn | 06, 14 |
| Information Sufficient | Start Analysis | _run_analysis_background | 05, 07 |
| Resume Parsing | Progress 30% | run_resume_parser | 12 |
| Profile/RIASEC | Progress 50% | run_profile_analyzer | 03, 13 |
| Career Matching | Progress 70% | run_career_matcher | 02 |
| Report | Progress 90% | run_reporter | 11 |
| Download | PDF File | GET …/report/pdf | 11, 15 |

|  | Description |
|---|---|
| Before This Article | Return values of each node in Article 02 |
| After This Article | Distinguish outer iCanWorkflowState from inner GuideState |
| Next Loop | Article 05: Routing with needs_more_info (Article 5) |

Full Series Loop Index: SERIES-LOOP.md

1. LangGraph’s State Passing Mechanism

LangGraph’s core concept is state‑driven. Each node receives a TypedDict defined in core/state.py, processes it, returns partial fields to update, and LangGraph automatically merges them into the global state.

Node A receives state → process → returns {"field_a": "value_a"}

LangGraph auto‑merges into global state

Node B receives updated state → process → returns {"field_b": "value_b"}
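
The flow above can be imitated in plain Python. The sketch below is not LangGraph itself: run_nodes, node_a, and node_b are hypothetical names, and the merge is a naive dictionary overwrite, which covers only the default (non-Reducer) case.

```python
from typing import Any, Callable

State = dict[str, Any]
Node = Callable[[State], State]


def node_a(state: State) -> State:
    # A node returns only the fields it wants to update.
    return {"field_a": "value_a"}


def node_b(state: State) -> State:
    # Node B can already read what node A wrote.
    return {"field_b": f"saw {state['field_a']}"}


def run_nodes(initial: State, nodes: list[Node]) -> State:
    """Run nodes in order, merging each partial update into the state."""
    state = dict(initial)
    for node in nodes:
        update = node(state)
        state = {**state, **update}  # naive merge: new values overwrite old ones
    return state


final_state = run_nodes({}, [node_a, node_b])
print(final_state)  # {'field_a': 'value_a', 'field_b': 'saw value_a'}
```

Each node sees the merged result of everything before it, yet never has to return the whole state.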

The key question of this mechanism is: How is merging done?

Outer and Inner State Layers

2. TypedDict Defines Agent State

Implementation location: core/state.py. LangGraph uses Python’s TypedDict to define state structures:

# core/state.py — iCanWorkflowState excerpt
from typing import Annotated, Any
from typing_extensions import TypedDict
import operator

class iCanWorkflowState(TypedDict, total=False):
    session_id: str
    user_id: str
    conversation_history: list[dict[str, str]]
    raw_input: str
    structured_profile: dict[str, Any]
    personal_profile: dict[str, Any]
    career_matches: list[dict[str, Any]]
    final_report: str
    current_agent: str
    needs_more_info: bool
    workflow_messages: Annotated[list[str], operator.add]

total=False means all fields are optional (nodes can return only the fields they need to update).
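
One way to see what total=False changes is to inspect the class at runtime. The sketch below uses a cut-down, hypothetical DemoState rather than the real iCanWorkflowState, and the standard-library typing.TypedDict (Python 3.9+) instead of typing_extensions.

```python
import operator
from typing import Annotated, TypedDict


class DemoState(TypedDict, total=False):
    session_id: str
    current_agent: str
    workflow_messages: Annotated[list[str], operator.add]


# With total=False no key is required, so a partial update is a valid DemoState.
partial_update: DemoState = {"current_agent": "guide"}

required = DemoState.__required_keys__
optional = DemoState.__optional_keys__

print(sorted(required))  # []
print(sorted(optional))  # ['current_agent', 'session_id', 'workflow_messages']
```

These attributes only describe what a static type checker will accept; TypedDict performs no validation when the dict is built.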

3. Annotated[list, operator.add] — Reducer Explained

Default Behavior: Overwrite

Without a Reducer, LangGraph's default is that the new value overwrites the old one:

# Node A returns
{"current_agent": "guide"}

# Node B returns
{"current_agent": "resume_parser"}

# Final state
{"current_agent": "resume_parser"} # ← B’s value overwrote A’s value

This is correct for fields like current_agent or needs_more_info where only the latest value matters.

Reducer Behavior: Accumulate

# core/state.py — Inner GuideState
messages: Annotated[list[str], operator.add]

# core/state.py — Outer iCanWorkflowState
workflow_messages: Annotated[list[str], operator.add]

Annotated[list[str], operator.add] tells LangGraph: merge this field using operator.add (list concatenation).

# Node A returns
{"messages": ["Hello!"]}

# Node B returns
{"messages": ["Can you tell me about your concerns?"]}

# Final state
{"messages": ["Hello!", "Can you tell me about your concerns?"]} # ← accumulated
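
The two merge rules can be simulated with the standard library alone. This is a simplified sketch and not LangGraph's internal code: DemoState, find_reducers, and apply_update are hypothetical names, and the only assumption is that a callable found in the Annotated metadata acts as the Reducer.

```python
import operator
from typing import Annotated, Any, TypedDict, get_args, get_origin, get_type_hints


class DemoState(TypedDict, total=False):
    current_agent: str                             # no Reducer: overwrite
    messages: Annotated[list[str], operator.add]   # Reducer: concatenate


def find_reducers(schema: type) -> dict[str, Any]:
    """Collect the callable metadata of every Annotated field."""
    reducers: dict[str, Any] = {}
    for name, hint in get_type_hints(schema, include_extras=True).items():
        if get_origin(hint) is Annotated:
            for meta in get_args(hint)[1:]:
                if callable(meta):
                    reducers[name] = meta
    return reducers


def apply_update(state: dict, update: dict, reducers: dict) -> dict:
    """Merge a partial update: reduce where a Reducer exists, else overwrite."""
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged


reducers = find_reducers(DemoState)
state = apply_update({}, {"current_agent": "guide", "messages": ["Hello!"]}, reducers)
state = apply_update(
    state,
    {"current_agent": "resume_parser", "messages": ["Can you tell me about your concerns?"]},
    reducers,
)
print(state["current_agent"])  # resume_parser
print(state["messages"])       # ['Hello!', 'Can you tell me about your concerns?']
```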

Why the Guide Inner messages Must Use a Reducer

In the multi‑turn subgraph of agents/guide.py, each of welcome / assess_need / collect_basic_info / dig_deeper returns {"messages": [reply]}. Without a Reducer:

welcome output:   messages = ["Hi! I am XiaoC"]
assess_need output: messages = ["Can you be more specific about your concerns?"]

Without Reducer → final state is only ["Can you be more specific about your concerns?"] ← first one lost!
With Reducer → final state is ["Hi! I am XiaoC", "Can you be more specific about your concerns?"] ✅ all preserved

4. How to Choose Overwrite vs Accumulate

Selection Principle

| Field Characteristic | Usage | Example |
|---|---|---|
| Only the latest value | Direct assignment (overwrite) | current_agent, needs_more_info |
| History needed | Annotated + operator.add | GuideState.messages, workflow_messages |
| Incrementally filled dict | Direct assignment (overwrites entire dict) | structured_profile, personal_profile |
| Accumulating list | Annotated + operator.add | messages (inner), workflow_messages (outer) |

Common Mistakes

Mistake 1: Adding a Reducer to conversation_history because it is a list

# ❌ Wrong: conversation_history does not need a Reducer
conversation_history: Annotated[list[dict], operator.add]

# ✅ Correct: node manually manages the entire list
conversation_history: list[dict[str, str]]

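Reason: conversation_history holds messages from both user and assistant, and the node controls their order (user first, then assistant) by rebuilding and returning the full list. With a Reducer, LangGraph would concatenate that returned list onto the existing one, duplicating earlier turns and breaking the conversation structure.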
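
The risk can be reproduced without LangGraph. In this sketch, which assumes a node that returns the complete history the way guide_node does, operator.add stands in for the Reducer; the sample messages are made up.

```python
import operator

existing_history = [{"role": "user", "content": "Hi"}]

# The node copies the history, appends the reply, and returns the whole list.
node_return = list(existing_history) + [{"role": "assistant", "content": "Hello!"}]

# No Reducer: the returned list simply replaces the old one.
overwritten = node_return

# operator.add as Reducer: the returned list is appended to the old one,
# so the earlier user turn appears twice.
concatenated = operator.add(existing_history, node_return)

print(len(overwritten))   # 2
print(len(concatenated))  # 3
```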

Mistake 2: dict type with Reducer

# ❌ Wrong: dict cannot use operator.add
collected_info: Annotated[dict, operator.add]

# ✅ Correct: direct overwrite
collected_info: dict[str, Any]
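
The failure is a plain Python one: dict does not support the + operator. The sketch below shows the error and then the overwrite pattern, using a hypothetical node named collect_major and made-up values.

```python
import operator
from typing import Any

old_info = {"name": "Alice"}
new_info = {"major": "CS"}

try:
    operator.add(old_info, new_info)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # unsupported operand type(s) for +: 'dict' and 'dict'


def collect_major(state: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical node: copy the dict, extend it, return it whole."""
    collected = dict(state.get("collected_info", {}))
    collected["major"] = "CS"
    return {"collected_info": collected}


update = collect_major({"collected_info": old_info})
print(update)  # {'collected_info': {'name': 'Alice', 'major': 'CS'}}
```

Copying before modifying keeps the node from mutating the dict it received.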

5. Layered State Design

The iCan project adopts outer + inner layering in core/state.py: the top‑level iCanWorkflowState and per‑agent states like GuideState / ProfileAnalysisState / CareerMatchState / ReporterState, etc.

Note: PlannerState is also defined in core/state.py, but workflow.py has not yet connected the Planner node. Do not draw a sixth agent in the state transition diagram.

Outer State (core/state.py, iCanWorkflowState)

class iCanWorkflowState(TypedDict, total=False):
    session_id: str
    conversation_history: list[dict]  # full conversation history (manually appended by nodes, no Reducer)
    structured_profile: dict  # resume_parser_node output
    personal_profile: dict  # profile_analyzer_node output
    career_matches: list[dict]  # career_matcher_node output
    needs_more_info: bool  # route_after_guide routing flag
    workflow_messages: Annotated[list[str], operator.add]

Inner State (core/state.py, GuideState)

class GuideState(TypedDict, total=False):
    conversation_history: list[dict]
    collected_info: dict
    messages: Annotated[list, operator.add]  # accumulated AI responses
    current_stage: str
    is_info_sufficient: bool
    emotion_state: str  # only inner, not leaked to outer

Why Layering

  1. Responsibility Isolation: Guide’s internal fields (e.g., emotion_state, missing_fields) are defined in GuideState and never enter iCanWorkflowState.
  2. Independent Testing: run_guide_agent(guide_state) / run_profile_analyzer(analyzer_state) can be unit‑tested without the top‑level graph.
  3. Data Transformation: The *_node functions in workflow.py manually handle outer‑inner mapping.

Data Transformation Example (compare with workflow.py)

# workflow.py — guide_node
async def guide_node(state: iCanWorkflowState) -> dict:
    conversation_history = list(state.get("conversation_history", []))
    if state.get("raw_input"):
        conversation_history.append({"role": "user", "content": state["raw_input"]})

    guide_state = create_initial_guide_state()  # core/state.py
    guide_state["conversation_history"] = conversation_history
    guide_result = await run_guide_agent(guide_state)  # agents/guide.py

    updated_history = list(conversation_history)
    if guide_result.get("messages"):
        updated_history.append({"role": "assistant", "content": guide_result["messages"][-1]})

    return {
        "conversation_history": updated_history,
        "needs_more_info": not guide_result.get("is_info_sufficient", False),
        "current_agent": "guide",
    }

Similarly, profile_analyzer_node: takes structured_profile from the outer state, constructs a ProfileAnalysisState, calls run_profile_analyzer(), and assembles the result into personal_profile.
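
As a rough illustration of that mapping, the sketch below stubs out run_profile_analyzer. The stub's return values, the keys placed in the inner state, and the current_agent value are assumptions made for illustration; only the function names and the ability_model / riasec_scores fields come from this series.

```python
import asyncio
from typing import Any


async def run_profile_analyzer(analyzer_state: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for the real agent; returns made-up values."""
    return {
        "ability_model": {"communication": 0.7},
        "riasec_scores": {"R": 0.2, "I": 0.8},
        "analysis_messages": ["profile parsed"],  # inner-only process log
    }


async def profile_analyzer_node(state: dict[str, Any]) -> dict[str, Any]:
    # Outer -> inner: pass along only what the analyzer needs.
    analyzer_state = {"structured_profile": state.get("structured_profile", {})}
    result = await run_profile_analyzer(analyzer_state)

    # Inner -> outer: pick structured fields explicitly, never the whole result.
    personal_profile = {
        "ability_model": result.get("ability_model", {}),
        "riasec_scores": result.get("riasec_scores", {}),
    }
    return {"personal_profile": personal_profile, "current_agent": "profile_analyzer"}


outer_update = asyncio.run(
    profile_analyzer_node({"structured_profile": {"education": "BSc"}})
)
print(sorted(outer_update))  # ['current_agent', 'personal_profile']
```

Because the node names every outgoing key, analysis_messages stays inside the inner state.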

6. Other Reducer Usages

Beyond operator.add, you can use other reducers:

from typing import Annotated
import operator

# Set union (deduplication)
tags: Annotated[set[str], operator.or_]

# Always take the latest (equivalent to default behavior)
latest_value: Annotated[str, lambda old, new: new]

# Custom merge logic (keep the maximum)
max_score: Annotated[float, lambda old, new: max(old or 0, new)]
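
Each of these Reducers is an ordinary two-argument function of the form (old, new) -> merged, so its behavior can be checked by calling it directly. The names and sample values below are illustrative only.

```python
import operator

take_latest = lambda old, new: new
keep_max = lambda old, new: max(old or 0, new)

merged_tags = operator.or_({"python", "sql"}, {"sql", "vue"})
latest = take_latest("draft", "final")
first_score = keep_max(None, 0.4)  # no previous value yet, so 0 is the baseline
best_score = keep_max(0.9, 0.4)    # a lower new value is ignored

print(sorted(merged_tags))  # ['python', 'sql', 'vue']
print(latest)               # final
print(first_score)          # 0.4
print(best_score)           # 0.9
```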

7. Pitfall Records

  1. Misusing messages on the outer state: messages (with its Reducer) exists only on GuideState; the top‑level accumulating field is workflow_messages. Audit scripts and log checks should both follow core/state.py.
  2. conversation_history must not have a Reducer: The user/assistant order is maintained by workflow.py’s guide_node manually appending; automatic merging would break the dialogue structure.
  3. ProfileAnalysisState.analysis_messages: Inner analysis process messages may accumulate, but the outer state takes only structured fields (ability_model, riasec_scores, etc.). Do not return an inner result (such as guide_result) wholesale.
  4. Initial needs_more_info: create_initial_workflow_state() defaults to False; it becomes True/False only after the first entry into guide_node. Pay attention to the initial value when writing tests.

8. Preventing State Pollution

Problem Scenario

If internal state of the Guide Agent (e.g., emotion_state) accidentally appears in the outer state, downstream nodes might incorrectly read it.

Prevention Measures

  1. Type Validation: TypedDict declares the fields of each state, so a static type checker (e.g., mypy or pyright) flags undeclared keys. TypedDict itself performs no runtime validation.
  2. Manual Mapping: Node functions return only the fields that need updating, not irrelevant ones.
  3. Separate State Types: Each Agent has its own TypedDict, so field names are checked statically per state rather than against one shared structure.
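
Because a TypedDict is a plain dict at runtime, manual mapping can be backed by a small guard. The helper below, only_declared, is a hypothetical addition and not part of the project; OuterState is a cut-down stand-in for iCanWorkflowState.

```python
from typing import Any, TypedDict, get_type_hints


class OuterState(TypedDict, total=False):
    needs_more_info: bool
    current_agent: str


def only_declared(update: dict[str, Any], schema: type) -> dict[str, Any]:
    """Keep only the keys that the given TypedDict declares."""
    allowed = set(get_type_hints(schema))
    return {key: value for key, value in update.items() if key in allowed}


# An inner result that carries a Guide-only field.
inner_result = {"needs_more_info": False, "emotion_state": "anxious"}
safe_update = only_declared(inner_result, OuterState)
print(safe_update)  # {'needs_more_info': False}
```

Explicit mapping in each *_node function remains the primary defense; a filter like this would only catch slips.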

9. Summary

Key takeaways for LangGraph state management in iCan:

  • All TypedDicts are centralized in core/state.py.
  • Default overwrite for single‑value fields like current_agent, needs_more_info.
  • Annotated + Reducer for GuideState.messages and iCanWorkflowState.workflow_messages.
  • Layering + manual mapping: the *_node functions in workflow.py are the sole conversion layer between outer and inner states.
  • PlannerState is defined but not yet connected; when extending, do not confuse it with the existing five‑stage pipeline.

Next Article: Conditional routing in workflow.py — how route_after_guide and should_continue inside Guide’s inner state work together.

