Prompt Design for AI Agents: How to Make an LLM Reliably Output Structured JSON

0. Series Loop (Follow Along Without Public Source Code)

End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (Parse → Analyze → Match → Report) → tools/pdf_exporter PDF.
This Article: 12/17 · Structure Loop · JSON

| Stage | User Visible | Code Entry | Corresponding Article |
|---|---|---|---|
| Create Session | Welcome Message | POST /api/sessions | 09 |
| Multi-turn Dialogue | SSE Streaming | chat/stream → run_guide_single_turn | 06, 14 |
| Information Sufficient | Start Analysis | _run_analysis_background | 05, 07 |
| Resume Parsing | Progress 30% | run_resume_parser | 12 |
| Profile/RIASEC | Progress 50% | run_profile_analyzer | 03, 13 |
| Career Matching | Progress 70% | run_career_matcher | 02 |
| Report | Progress 90% | run_reporter | 11 |
| Download PDF | File | GET …/report/pdf | 11, 15 |

| Reading Order | Description |
|---|---|
| Before reading this | Article 08: invoke_llm_with_json |
| After reading this | Manually work through `parse_json_from_text` four-layer strategy |
| Next loop | Article 03/13: Business JSON schema (Article 13) |

Full Series Loop Index: SERIES-LOOP.md

1. What Problem to Solve

In the iCan main flow, resume_parser_node needs to convert the natural language resume collected during the Guide phase into a structured_profile for profile_analyzer_node to consume. The input is unstructured text, and the output must be a dict with a fixed schema.

Common failure modes during actual integration:

The model wraps the JSON inside a ```json code block, or mixes it directly with explanatory text;
Some local Ollama models do not support response_format={"type": "json_object"}, so the JSON-mode call throws an error;
Minor JSON syntax issues (trailing commas, single quotes) make json.loads fail outright;
The LLM returns an empty dict on both attempts, breaking the entire parsing pipeline.

iCan’s strategy is a Prompt-enforced schema + JSON mode at the call layer + four-layer text extraction + regex fallback, rather than expecting the model to “get it perfect in one shot.”
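To make the failure modes above concrete, here is a minimal sketch that feeds a few hypothetical raw responses to a strict json.loads. The sample strings are invented for illustration and are not taken from iCan logs.

```python
import json

# Hypothetical raw responses, one per failure mode listed above.
SAMPLES = {
    "fenced": "Here you go:\n[fence]json\n{\"name\": \"Li\"}\n[fence]".replace("[fence]", "`" * 3),
    "chatty": 'Sure, here is the result: {"name": "Li"}',
    "trailing_comma": '{"name": "Li",}',
    "single_quotes": "{'name': 'Li'}",
    "clean": '{"name": "Li"}',
}


def strict_parse_ok(text: str) -> bool:
    """Return True only if json.loads accepts the text unchanged."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {label: strict_parse_ok(text) for label, text in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label:15s} json.loads ok={ok}")
```

Only the clean sample passes. The fenced and chatty samples are what the text-extraction layers are for; the trailing-comma and single-quote samples would still fail after extraction, which is where retry and the regex fallback come in.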

2. Implementation Location

| Module | Responsibility |
|---|---|
| `llm/prompts.py` | `RESUME_PARSER_SYSTEM_PROMPT`: Complete JSON example + field rules |
| `llm/providers.py` | `invoke_llm_with_json`: `response_format` first, fallback on failure |
| `llm/parsers.py` | `parse_json_from_text` four-layer extraction; `validate_structured_profile` validation |
| `agents/resume_parser.py` | Assemble messages, select `get_light_model()`, retry and `_regex_extract_profile` fallback |

Subgraph order (create_resume_parser_graph): load_input → extract_information → build_profile → validate_profile.
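The node order can be pictured with a plain-Python runner. This is only a sketch: it does not use LangGraph, the node bodies are hypothetical stand-ins, and run_nodes is an invented helper that imitates "each node returns a partial dict that is merged into the state".

```python
from typing import Callable

State = dict
Node = Callable[[State], dict]


def load_input(state: State) -> dict:
    # Normalize the raw Guide text into document_content.
    return {"document_content": state.get("raw_input", "").strip()}


def extract_information(state: State) -> dict:
    # Stand-in for the LLM call: pretend nothing could be extracted.
    return {"parsed_sections": {}}


def build_profile(state: State) -> dict:
    sections = state.get("parsed_sections", {})
    profile = {
        "basic_info": sections.get("basic_info", {}),
        "work_experience": sections.get("work_experience", []),
    }
    return {"structured_profile": profile}


def validate_profile(state: State) -> dict:
    profile = state["structured_profile"]
    errors = [] if profile["work_experience"] else ["work_experience is empty"]
    return {"parse_errors": errors}


def run_nodes(state: State, nodes: list[Node]) -> State:
    """Apply nodes in order, merging each partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


final = run_nodes(
    {"raw_input": "  I studied CS.  "},
    [load_input, extract_information, build_profile, validate_profile],
)
print(final["parse_errors"])
```

The point of the ordering is that validation always runs last, so an empty extraction surfaces as parse_errors instead of silently reaching downstream agents.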

[Figure: JSON four-layer fallback parsing]

3. Prompt Design: ResumeParser’s Schema Contract

The Prompt is defined in RESUME_PARSER_SYSTEM_PROMPT in llm/prompts.py. The core is not a single sentence “please output JSON,” but four things clearly stated at once:

Complete Example: Shows all fields: basic_info, work_experience, skill_set, certifications, career_progression, parsing_confidence;
Missing Strategy: “If not mentioned, use null, do not fabricate”;
Inference Annotation: parsing_confidence.inferred_fields lists inferred fields;
Chinese and Format: Differentiate technical/soft skills, integrate and deduplicate across multi-turn dialogue.

The Prompt embeds a complete example wrapped in ```json—this aligns with the regex r"```json\s*([\s\S]*?)\s*```" in llm/parsers.py strategy 1: if the model follows the Prompt and outputs a code block, the parser hits it on the first layer.
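The alignment between the Prompt example and strategy 1 can be checked mechanically. The sketch below uses a heavily shortened, hypothetical stand-in for RESUME_PARSER_SYSTEM_PROMPT (the real text is not reproduced here). The top-level keys and the sub-fields named in this article come from the source; the remaining sub-fields (name, company, title, years, overall) are assumptions added for illustration.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the listing itself stays fence-safe

# Illustrative example profile; several sub-fields are assumptions.
EXAMPLE_PROFILE = {
    "basic_info": {"name": None, "education": "Bachelor", "major": "Computer Science"},
    "work_experience": [{"company": "ACME", "title": "Backend Engineer", "years": 2}],
    "skill_set": {"technical_skills": ["Python"], "soft_skills": ["communication"]},
    "certifications": [],
    "career_progression": {"total_years": 2},
    "parsing_confidence": {"overall": 0.8, "inferred_fields": ["career_progression.total_years"]},
}

# Hypothetical prompt: rules first, then one complete example in a json-labelled fence.
PROMPT = "\n".join([
    "You are a resume parser. Output only JSON, with no explanatory text.",
    "Rules:",
    "1. If a field is not mentioned, use null. Do not fabricate.",
    "2. List every inferred field in parsing_confidence.inferred_fields.",
    "3. Separate technical skills from soft skills; merge and deduplicate across turns.",
    "Example output:",
    f"{FENCE}json",
    json.dumps(EXAMPLE_PROFILE, indent=2),
    FENCE,
])

# Same shape as the strategy-1 pattern quoted above.
STRATEGY_1 = re.compile(FENCE + r"json\s*([\s\S]*?)\s*" + FENCE)

match = STRATEGY_1.search(PROMPT)
recovered = json.loads(match.group(1))
print(recovered == EXAMPLE_PROFILE)
```

If a model copies the example's format faithfully, its reply has the same shape as the fenced part of this prompt, so the first parser layer is the one that hits.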

extract_information in agents/resume_parser.py combines the system prompt and user’s original text into messages:

messages = [
    {"role": "system", "content": RESUME_PARSER_SYSTEM_PROMPT},
    {"role": "user", "content": f"Please extract structured personal information from the following text:\n\n{document_content}"},
]
model = get_light_model()
parsed_data = await invoke_llm_with_json(model, messages)

Model Selection: Resume parsing uses get_light_model() (code defaults to LLM_MODEL_LIGHT=gpt-4o-mini), not the chat model. In .env, it’s common to change to DeepSeek or Ollama qwen3.5:9b inside Docker—switching models does not affect Prompt/schema, but affects JSON mode compatibility (see pitfalls).

4. Call Layer: `invoke_llm_with_json` Dual Channel

In llm/providers.py, JSON invocation is not a simple ainvoke, but a three-level fallback:

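try:
    # Channel 1: ask the provider for JSON mode.
    json_model = model.bind(response_format={"type": "json_object"})
    response = await asyncio.wait_for(json_model.ainvoke(processed, **kwargs), timeout=60)
    raw_content = response.content
except TimeoutError:
    # Simplified here: the full source (see appendix) converts asyncio.TimeoutError
    # into TimeoutError and re-raises it, so a timeout never triggers the text-mode fallback.
    raise
except Exception as bind_err:
    # Channel 2: JSON mode is unsupported or the call failed, so retry as a plain text call.
    logger.warning("response_format JSON mode not supported, falling back to text mode: %s", bind_err)
    response = await asyncio.wait_for(model.ainvoke(processed, **kwargs), timeout=60)
    raw_content = response.content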

try:
    result = json.loads(raw_content)
except (json.JSONDecodeError, TypeError):
    result = parse_json_from_text(raw_content)  # llm/parsers.py
    if not result:
        raise ValueError(f"Cannot extract valid JSON from LLM response, original content: {raw_content[:300]}")

The flow can be summarized as:

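bind(json_object) + ainvoke
        ↓ unsupported or call error (timeouts are re-raised, not retried here)
normal ainvoke (text mode)
        ↓ either channel yields raw_content
json.loads(raw_content)
        ↓ JSONDecodeError / TypeError
parse_json_from_text (four layers)
        ↓ empty dict
ValueError / upstream retry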
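The channel switch can be exercised without any real provider. In this sketch FakeModel and FakeResponse are hypothetical stand-ins (not LangChain classes), and call_with_json_fallback is an invented name that mirrors only the first half of the flow: JSON mode first, text mode when JSON mode raises.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResponse:
    content: str


class FakeModel:
    """Hypothetical chat-model stand-in with optional JSON-mode support."""

    def __init__(self, reply: str, supports_json_mode: bool):
        self.reply = reply
        self.supports_json_mode = supports_json_mode

    def bind(self, **kwargs):
        # Assumption: an unsupported backend fails as soon as JSON mode is requested.
        if "response_format" in kwargs and not self.supports_json_mode:
            raise ValueError("response_format is not supported by this backend")
        return self

    async def ainvoke(self, messages):
        return FakeResponse(self.reply)


async def call_with_json_fallback(model, messages, timeout: float = 60.0):
    """Return (channel, raw_content); timeouts propagate instead of falling back."""
    try:
        json_model = model.bind(response_format={"type": "json_object"})
        response = await asyncio.wait_for(json_model.ainvoke(messages), timeout)
        return "json_mode", response.content
    except asyncio.TimeoutError:
        raise TimeoutError("AI model response timed out")
    except Exception:
        response = await asyncio.wait_for(model.ainvoke(messages), timeout)
        return "text_mode", response.content


msgs = [{"role": "user", "content": "hi"}]
cloud = FakeModel('{"ok": true}', supports_json_mode=True)
local = FakeModel('Result: {"ok": true}', supports_json_mode=False)

cloud_result = asyncio.run(call_with_json_fallback(cloud, msgs))
local_result = asyncio.run(call_with_json_fallback(local, msgs))
print(cloud_result[0], local_result[0])
```

Note that the text-mode reply still carries a "Result:" prefix, which is exactly why the second half of the flow (json.loads, then parse_json_from_text) exists.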

Additionally, when LLM_BASE_URL contains 11434 and the model name contains qwen3, _inject_no_think prepends /no_think before the system message to avoid Qwen3 thinking blocks contaminating the JSON—an extra layer of JSON stability on local Ollama.
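A minimal sketch of the idea behind _inject_no_think follows. The real helper takes only the message list and reads its configuration elsewhere; the explicit base_url and model_name parameters, the function name inject_no_think, and the exact placement of the marker are assumptions made to keep the example self-contained.

```python
def inject_no_think(messages: list[dict], base_url: str, model_name: str) -> list[dict]:
    """Prefix the system message with /no_think for Qwen3 served by local Ollama."""
    # 11434 is Ollama's default port; both conditions must hold.
    if "11434" not in base_url or "qwen3" not in model_name.lower():
        return messages

    patched = [dict(m) for m in messages]  # shallow copies, the input is not mutated
    for message in patched:
        if message.get("role") == "system":
            message["content"] = "/no_think\n" + message["content"]
            return patched

    # No system message present: add one that carries only the marker.
    return [{"role": "system", "content": "/no_think"}] + patched


msgs = [
    {"role": "system", "content": "Output only JSON."},
    {"role": "user", "content": "I studied CS and worked two years as a backend engineer."},
]
local = inject_no_think(msgs, "http://localhost:11434/v1", "qwen3.5:9b")
cloud = inject_no_think(msgs, "https://api.openai.com/v1", "gpt-4o-mini")
print(local[0]["content"].splitlines()[0])
```

Returning a copy rather than editing in place keeps the original messages reusable for a second attempt against a different model.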

5. Four-Layer Fallback Parser: `parse_json_from_text`

parse_json_from_text in llm/parsers.py is the last safety net, trying the following in order:

| Strategy | Regex/Logic | Typical Scenario |
|---|---|---|
| 1 | r"```json\s*([\s\S]*?)\s*```" | ChatGPT-style output |
| 2 | Normal ``` ... ```, content starts with `{` or `[` | Code block without json label |
| 3 | `r"\{[\s\S]*\}"` greedy match of the outermost braces | “Okay, the result is: {…}” |
| 4 | `json.loads(text.strip())` | Pure JSON response |
| Fallback | Return `{}` | Completely unparseable |

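Unlike general tutorials, the implementation never raises to the caller, but the layers are not fully independent either. A layer is skipped only when its pattern does not match. If the pattern of Strategy 1, 2, or 3 matches but json.loads then fails on the extracted text, the JSONDecodeError goes straight to the outermost except, which returns {} without trying the remaining layers; only Strategy 4 wraps its own json.loads. Either way, the caller must check for an empty dict: invoke_llm_with_json raises ValueError in that case, and extract_information then retries or falls back to regex.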

In the source code, each layer has logger.info annotating the strategy number (Strategy 1–4). During debugging, you can check the logs to see which layer was reached.
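For experimenting without the logs, the sketch below re-creates the four layers in compact form and returns the strategy label alongside the result. parse_with_trace is a hypothetical helper, not part of iCan; it follows the control flow of the annotated source in the appendix, including the early exit when a matched candidate fails to parse.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the listing itself stays fence-safe
JSON_BLOCK = re.compile(FENCE + r"json\s*([\s\S]*?)\s*" + FENCE)
ANY_BLOCK = re.compile(FENCE + r"\s*([\s\S]*?)\s*" + FENCE)
BRACES = re.compile(r"\{[\s\S]*\}")


def parse_with_trace(text: str):
    """Return (result, label) where label names the layer that produced the result."""
    if not text or not text.strip():
        return {}, "empty input"
    try:
        match = JSON_BLOCK.search(text)
        if match:
            return json.loads(match.group(1).strip()), "strategy 1"

        match = ANY_BLOCK.search(text)
        if match:
            inner = match.group(1).strip()
            if inner.startswith(("{", "[")):
                return json.loads(inner), "strategy 2"

        match = BRACES.search(text)
        if match:
            return json.loads(match.group(0)), "strategy 3"

        return json.loads(text.strip()), "strategy 4"
    except json.JSONDecodeError:
        # A matched-but-invalid candidate ends the attempt; later layers are not tried.
        return {}, "failed"


cases = [
    FENCE + 'json\n{"a": 1}\n' + FENCE,
    FENCE + '\n{"a": 1}\n' + FENCE,
    'Okay, the result is: {"a": 1}',
    "[1, 2]",
    'Use {x} here: {"a": 1}',
]
for case in cases:
    print(parse_with_trace(case))
```

The last case shows the early exit: the greedy brace match grabs an invalid span, json.loads fails, and the function returns an empty dict even though a valid object is present in the text.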

6. Agent Retry and Regex Fallback

extract_information in agents/resume_parser.py has business retry on top of the LLM layer:

for attempt in range(2):
    try:
        model = get_light_model()
        parsed_data = await invoke_llm_with_json(model, messages)
        if parsed_data and len(parsed_data) > 0:
            break
        logger.warning("[extract_information] Attempt %d returned empty data, retrying", attempt + 1)
    except TimeoutError as te:
        ...
    except Exception as e:
        ...

if not parsed_data or len(parsed_data) == 0:
    parsed_data = _regex_extract_profile(document_content)

_regex_extract_profile uses regex to extract name, education, work experience, etc.—field names are not identical to the Prompt schema (e.g., it produces skills instead of skill_set.technical_skills). build_profile fills missing keys with default empty structures—intentionally “something is better than nothing,” but validate_profile will likely still report missing required fields.
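The schema mismatch is easy to reproduce. The patterns and default structure below are hypothetical (the real _regex_extract_profile and build_profile are not shown in this article); the sketch only illustrates why a flat skills key from the fallback leaves skill_set.technical_skills empty after defaults are filled in.

```python
import copy
import re


def regex_extract_profile(text: str) -> dict:
    """Hypothetical regex fallback that returns flat, non-schema keys."""
    profile = {}
    name = re.search(r"(?i:my name is|i am|i'm)\s+([A-Z][a-z]+)", text)
    if name:
        profile["name"] = name.group(1)
    education = re.search(r"\b(Bachelor|Master|PhD|Doctorate)\b", text, re.IGNORECASE)
    if education:
        profile["education"] = education.group(1)
    skills = re.search(r"skills?\s*(?:include|are|:)\s*([^.\n]+)", text, re.IGNORECASE)
    if skills:
        parts = re.split(r",|\band\b", skills.group(1))
        profile["skills"] = [part.strip() for part in parts if part.strip()]
    return profile


# Assumed default skeleton, keyed by the Prompt schema's top-level fields.
DEFAULTS = {
    "basic_info": {},
    "work_experience": [],
    "skill_set": {"technical_skills": [], "soft_skills": []},
    "certifications": [],
    "career_progression": {},
}


def build_profile(parsed: dict) -> dict:
    """Keep schema keys from parsed, fill every missing key with an empty default."""
    return {key: parsed.get(key, copy.deepcopy(default)) for key, default in DEFAULTS.items()}


text = "My name is Wei. I hold a Master degree. My skills include Python, SQL and Docker."
flat = regex_extract_profile(text)
built = build_profile(flat)
print(flat)
print(built["skill_set"])
```

The fallback did find three skills, yet the built profile carries none of them, so the validation step is right to flag it.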

7. Quality Closure After Parsing

The LLM’s self-evaluated parsing_confidence is extracted into confidence_scores in build_profile; validate_profile calls validate_structured_profile from llm/parsers.py for code-side validation. Required fields include:

basic_info.education, basic_info.major
Non-empty work_experience list
skill_set.technical_skills, skill_set.soft_skills
career_progression.total_years

Missing items are written into parse_errors, and validation_passed is written into confidence_scores. The confidence from the Prompt and the Python validation are complementary: the former reflects the model’s self-assessment, the latter ensures downstream Agents do not receive a “skeleton profile.”
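A code-side check over the required fields listed above might look like the following. This is a sketch, not the actual validate_structured_profile: the function names, the return shape, the error wording, and the rule that empty strings and empty lists count as missing are all assumptions.

```python
REQUIRED_PATHS = [
    "basic_info.education",
    "basic_info.major",
    "work_experience",
    "skill_set.technical_skills",
    "skill_set.soft_skills",
    "career_progression.total_years",
]


def get_path(data: dict, dotted: str):
    """Walk a dotted path through nested dicts; return None if any step is absent."""
    current = data
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def validate_profile_sketch(profile: dict) -> dict:
    """Report which required paths are absent or empty (0 is treated as present)."""
    missing = [path for path in REQUIRED_PATHS if get_path(profile, path) in (None, "", [], {})]
    return {
        "validation_passed": not missing,
        "parse_errors": [f"missing required field: {path}" for path in missing],
    }


skeleton = {
    "basic_info": {"education": "Master"},
    "work_experience": [],
    "skill_set": {"technical_skills": ["Python"], "soft_skills": []},
    "career_progression": {"total_years": 0},
}
report = validate_profile_sketch(skeleton)
print(report["validation_passed"], report["parse_errors"])
```

A model can report high parsing_confidence for this skeleton and still fail here, which is the complementarity described above.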

8. Position in the Pipeline

In the top-level workflow.py: when guide_node has enough information, it enters resume_parser_node, which outputs structured_profile written to iCanWorkflowState, then passes it to profile_analyzer_node.

Data flow:

Guide conversation text (raw_input)
    → run_resume_parser
    → invoke_llm_with_json + parse_json_from_text
    → structured_profile + confidence_scores + parse_errors
    → ProfileAnalyzer

The same invoke_llm_with_json + parse_json_from_text is also reused by other nodes needing JSON, such as ProfileAnalyzer, CareerMatcher, etc. (see Article 8 LLM layer); ResumeParser is the call point with the most complex schema and longest fallback chain.

9. Pitfalls and Boundaries

Pitfall 1: response_format is not a universal capability. Some Ollama models fail on bind and fall back to text mode, relying more on the JSON example in the Prompt and parse_json_from_text. When integrating with Docker default qwen3.5:9b, check logs for the “falling back to text mode” warning.

Pitfall 2: Strategy 3 greedy match may cut incorrectly. \{[\s\S]*\} goes from the first { to the last }. If the model embeds other curly braces before or after the JSON, the whole parse may fail and fall into {}. The Prompt requirement “output only JSON” is still necessary; the parser cannot replace Prompt constraints.
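The sketch below reproduces Pitfall 2 and shows one possible mitigation. greedy_extract mimics strategy 3 in isolation; scan_extract is an alternative that is not part of iCan, built on the standard library's json.JSONDecoder.raw_decode, which parses one JSON value starting at a given index and ignores whatever follows it.

```python
import json
import re


def greedy_extract(text: str) -> dict:
    """Strategy-3 style: first '{' to last '}', then a single json.loads."""
    match = re.search(r"\{[\s\S]*\}", text)
    if not match:
        return {}
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return {}


def scan_extract(text: str) -> dict:
    """Alternative sketch: try to decode an object at every '{' until one succeeds."""
    decoder = json.JSONDecoder()
    position = text.find("{")
    while position != -1:
        try:
            candidate, _end = decoder.raw_decode(text, position)
            if isinstance(candidate, dict):
                return candidate
        except json.JSONDecodeError:
            pass  # not valid JSON at this brace, move on to the next one
        position = text.find("{", position + 1)
    return {}


noisy = 'Fill the {placeholder} first. Result: {"name": "Li", "tags": {"a": 1}} Done.'
print(greedy_extract(noisy))
print(scan_extract(noisy))
```

The scanning variant returns the first decodable object, which may be the wrong one if the model emits several, so it complements rather than replaces the "output only JSON" constraint in the Prompt.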

Pitfall 3: Regex fallback and schema misalignment. _regex_extract_profile produces fields like skills, which are not automatically mapped to skill_set.technical_skills. Downstream validation failure is expected behavior—guide the user to supplement information or retry the LLM, rather than treating the fallback as a successful parse.

Pitfall 4: Empty dict and retry. extract_information attempts at most 2 times; if invoke_llm_with_json returns an empty dict (without throwing an exception), it logs a warning and retries. TimeoutError is caught separately and does not block indefinitely.
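The retry behaviour in Pitfall 4 can be simulated with stubbed calls. extract_with_retry and make_scripted_invoke are hypothetical helpers; the loop follows the same shape as extract_information (a bounded number of attempts, empty dict treated as failure, exceptions swallowed per attempt, regex fallback last) but calls no real model.

```python
import asyncio


async def extract_with_retry(invoke, fallback, text: str, attempts: int = 2):
    """Return (data, source); source tells which path produced the data."""
    for attempt in range(attempts):
        try:
            parsed = await invoke(text)
            if parsed:
                return parsed, f"llm attempt {attempt + 1}"
            # Empty dict without an exception: treat as a failed attempt and retry.
        except Exception:
            # Timeouts and parse errors both consume one attempt.
            continue
    return fallback(text), "regex fallback"


def make_scripted_invoke(replies: list):
    """Build an async stub that returns or raises the scripted replies in order."""
    queue = list(replies)

    async def invoke(text: str) -> dict:
        reply = queue.pop(0)
        if isinstance(reply, Exception):
            raise reply
        return reply

    return invoke


def stub_fallback(text: str) -> dict:
    return {"skills": []}


second_try = asyncio.run(
    extract_with_retry(make_scripted_invoke([{}, {"basic_info": {"major": "CS"}}]), stub_fallback, "resume")
)
exhausted = asyncio.run(
    extract_with_retry(
        make_scripted_invoke([TimeoutError("slow"), ValueError("no JSON")]), stub_fallback, "resume"
    )
)
print(second_try[1], "|", exhausted[1])
```

Returning the source label next to the data makes it harder to mistake a regex fallback for a successful LLM parse further down the pipeline.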

10. Summary

The Prompt locks the schema with a complete JSON example + null/inference rules, defined in llm/prompts.py.
invoke_llm_with_json in llm/providers.py first tries json_object mode, then normal call, then json.loads → parse_json_from_text.
parse_json_from_text in llm/parsers.py is a four-layer fallback; it returns {} on failure, so callers must handle empty results.
extract_information in agents/resume_parser.py uses get_light_model() with up to 2 attempts + _regex_extract_profile as the final fallback.
validate_structured_profile validates required fields with code rules, parallel to parsing_confidence self-evaluation.

Next article: RIASEC assessment Prompt engineering (Article 13).

Appendix: Key Source Code (Line-by-Line Annotations)

The following code is from the iCan implementation. An annotation (translated from the original Chinese) sits above each line, so you can follow along without the public repository.
Generation command: python3 bin/build-ican-annotated-snippets.py

parse_json_from_text Four-Layer Strategy

# ========== parse_json_from_text Four-Layer Strategy ==========
# Source file: llm/parsers.py  Lines 19-92

# L19: Synchronous function parse_json_from_text: extracts a JSON dict from raw LLM text
def parse_json_from_text(text: str) -> dict:
# L21: [Doc] Extract JSON from LLM response text.
# L23: [Doc] Function description:
# L24: [Doc] Extracts and parses JSON content from text returned by LLM. Supports the following formats:
# L25: [Doc] 1. Markdown code block wrapped JSON (```json ... ```)
# L26: [Doc] 2. Normal code block wrapped JSON (``` ... ```)
# L27: [Doc] 3. JSON embedded directly in text (starting with {, ending with })
# L28: [Doc] Returns an empty dictionary on parsing failure.
# L30: [Doc] Input description:
# L31: [Doc] text (str): Raw text from LLM response
# L33: [Doc] Output description:
# L34: [Doc] dict: Parsed JSON dictionary, returns empty dict {} on failure
# (Lines L20-35 are function/module docstrings, converted to comments for readability)
# L36: Start try block, subsequent except handles fallback
    try:
# L37: Extract JSON from LLM text (four-layer regex/parse strategy)
        logger.info(f"[parse_json_from_text] Starting execution, input: text length={len(text)}")
# L38: Extract JSON from LLM text (four-layer regex/parse strategy)
        logger.debug(f"[parse_json_from_text] Text preview: {text[:300]}")

# L40: Conditional branch
        if not text or not text.strip():
# L41: Extract JSON from LLM text (four-layer regex/parse strategy)
            logger.warning("[parse_json_from_text] Input text is empty, returning empty dict")
# L42: Return an empty dict; there is nothing to parse
            return {}

# L44: Strategy 1: Try to extract from ```json ... ``` code block
# L45: Assignment: update local variable or state field
        json_code_block_pattern = r"```json\s*([\s\S]*?)\s*```"
# L46: Assignment: update local variable or state field
        match = re.search(json_code_block_pattern, text)
# L47: Conditional branch
        if match:
# L48: Assignment: update local variable or state field
            json_str = match.group(1).strip()
# L49: Extract JSON from LLM text (four-layer regex/parse strategy)
            logger.debug(f"[parse_json_from_text] Extracted content from json code block, length: {len(json_str)}")
# L50: Parse LLM returned string into Python dict
            result = json.loads(json_str)
# L51: Extract JSON from LLM text (four-layer regex/parse strategy)
            logger.info(f"[parse_json_from_text] Execution complete (Strategy 1: json code block), returned field count: {len(result)}")
# L52: Return the parsed result (a plain function return, not a LangGraph state update)
            return result

# L54: Strategy 2: Try to extract from normal ``` ... ``` code block
# L55: Assignment: update local variable or state field
        code_block_pattern = r"```\s*([\s\S]*?)\s*```"
# L56: Assignment: update local variable or state field
        match = re.search(code_block_pattern, text)
# L57: Conditional branch
        if match:
# L58: Assignment: update local variable or state field
            inner = match.group(1).strip()
# L59: Try to determine if it is JSON (starts with { or [)
# L60: Conditional branch
            if inner.startswith("{") or inner.startswith("["):
# L61: Extract JSON from LLM text (four-layer regex/parse strategy)
                logger.debug(f"[parse_json_from_text] Extracted JSON content from normal code block, length: {len(inner)}")
# L62: Parse LLM returned string into Python dict
                result = json.loads(inner)
# L63: Extract JSON from LLM text (four-layer regex/parse strategy)
                logger.info(f"[parse_json_from_text] Execution complete (Strategy 2: normal code block), returned field count: {len(result)}")
# L64: Return the parsed result
                return result

# L66: Strategy 3: Try to find JSON directly in the text (find outermost { })
# L67: Assignment: update local variable or state field
        brace_pattern = r"\{[\s\S]*\}"
# L68: Assignment: update local variable or state field
        match = re.search(brace_pattern, text)
# L69: Conditional branch
        if match:
# L70: Assignment: update local variable or state field
            json_str = match.group(0)
# L71: Extract JSON from LLM text (four-layer regex/parse strategy)
            logger.debug(f"[parse_json_from_text] Extracted JSON content directly from text, length: {len(json_str)}")
# L72: Parse LLM returned string into Python dict
            result = json.loads(json_str)
# L73: Extract JSON from LLM text (four-layer regex/parse strategy)
            logger.info(f"[parse_json_from_text] Execution complete (Strategy 3: direct extraction), returned field count: {len(result)}")
# L74: Return the parsed result
            return result

# L76: Strategy 4: Try to parse the entire text directly
# L77: Start try block, subsequent except handles fallback
        try:
# L78: Parse LLM returned string into Python dict
            result = json.loads(text.strip())
# L79: Extract JSON from LLM text (four-layer regex/parse strategy)
            logger.info(f"[parse_json_from_text] Execution complete (Strategy 4: direct parse), returned field count: {len(result)}")
# L80: Return the parsed result
            return result
# L81: Catch exception to avoid crashing the entire graph/request
        except json.JSONDecodeError:
# L82: Execute statement (details in business description above)
            pass

# L84: Extract JSON from LLM text (four-layer regex/parse strategy)
        logger.warning("[parse_json_from_text] Could not extract valid JSON from text, returning empty dict")
# L85: Return an empty dict; the caller must check for it
        return {}

# L87: Catch exception to avoid crashing the entire graph/request
    except json.JSONDecodeError as e:
# L88: Extract JSON from LLM text (four-layer regex/parse strategy)
        logger.error(f"[parse_json_from_text] JSON parse failed, exception: {e}", exc_info=True)
# L89: Return an empty dict when a matched candidate is not valid JSON
        return {}
# L90: Catch exception to avoid crashing the entire graph/request
    except Exception as e:
# L91: Extract JSON from LLM text (four-layer regex/parse strategy)
        logger.error(f"[parse_json_from_text] Exception during JSON extraction: {e}", exc_info=True)
# L92: Return an empty dict on any other exception
        return {}

invoke_llm_with_json

# ========== invoke_llm_with_json ==========
# Source file: llm/providers.py  Lines 208-278

# L208: Asynchronous function invoke_llm_with_json: can be awaited, suitable for IO-type LLM/DB calls
async def invoke_llm_with_json(model: ChatOpenAI, messages: list, **kwargs) -> dict:
# L210: [Doc] Call LLM and parse JSON output.
# L212: [Doc] Function description:
# L213: [Doc] Uses the specified ChatOpenAI model instance, asynchronously calls the LLM with the message list,
# L214: [Doc] requires the model to reply in JSON format, and automatically parses the response content into a Python dictionary.
# L215: [Doc] Suitable for scenarios requiring structured data output, such as resume parsing, career matching results, etc.
# L216: [Doc] Prefers to use response_format JSON mode, falls back to text parsing if not supported.
# L218: [Doc] Input description:
# L219: [Doc] model (ChatOpenAI): Configured ChatOpenAI model instance
# L220: [Doc] messages (list): Message list, format [{"role": "system/user/assistant", "content": "..."}]
# L221: [Doc] **kwargs: Extra parameters, such as temperature, max_tokens, etc. to override default configuration
# L223: [Doc] Output description:
# L224: [Doc] dict: Parsed JSON dictionary data
# (Lines L209-225 are function/module docstrings, converted to comments for readability)
# L226: Import dependency module
    import json

# L228: Import dependency module
    from ican.llm.parsers import parse_json_from_text

# L230: Start try block, subsequent except handles fallback
    try:
# L231: Log for online debugging of node input/output
        logger.info(
# L232: Call LLM and parse JSON; internal JSON mode → text fallback chain
            f"[invoke_llm_with_json] Starting execution, input: model={model.model_name},"
# L233: Execute statement (details in business description above)
            f"message count: {len(messages)}, kwargs: {kwargs}"
# L234: Execute statement (details in business description above)
        )
# L235: Call LLM and parse JSON; internal JSON mode → text fallback chain
        logger.debug(f"[invoke_llm_with_json] Message details: {messages}")

# L237: Assignment: update local variable or state field
        processed = _inject_no_think(messages)
# L238: Assignment: update local variable or state field
        raw_content = None

# L240: Import dependency module
        import asyncio as _asyncio

# L242: Start try block, subsequent except handles fallback
        try:
# L243: Try OpenAI JSON mode, if not supported go to except fallback
            json_model = model.bind(response_format={"type": "json_object"})
# L244: Start try block, subsequent except handles fallback
            try:
# L245: Hard timeout wrapper, prevent LLM from hanging
                response = await _asyncio.wait_for(json_model.ainvoke(processed, **kwargs), timeout=60)
# L246: Catch exception to avoid crashing the entire graph/request
            except _asyncio.TimeoutError:
# L247: Raise exception up, handled by caller or LangGraph
                raise TimeoutError("AI model response timed out, please retry later")
# L248: Assignment: update local variable or state field
            raw_content = response.content
# L249: Catch exception to avoid crashing the entire graph/request
        except TimeoutError:
# L250: Raise exception up, handled by caller or LangGraph
            raise
# L251: Catch exception to avoid crashing the entire graph/request
        except Exception as bind_err:
# L252: Log for online debugging of node input/output
            logger.warning(
# L253: Call LLM and parse JSON; internal JSON mode → text fallback chain
                f"[invoke_llm_with_json] response_format JSON mode not supported, falling back to text mode: {bind_err}"
# L254: Execute statement (details in business description above)
            )
# L255: Start try block, subsequent except handles fallback
            try:
# L256: Hard timeout wrapper, prevent LLM from hanging
                response = await _asyncio.wait_for(model.ainvoke(processed, **kwargs), timeout=60)
# L257: Catch exception to avoid crashing the entire graph/request
            except _asyncio.TimeoutError:
# L258: Raise exception up, handled by caller or LangGraph
                raise TimeoutError("AI model response timed out, please retry later")
# L259: Assignment: update local variable or state field
            raw_content = response.content

# L261: Log for online debugging of node input/output
        logger.debug(
# L262: Call LLM and parse JSON; internal JSON mode → text fallback chain
            f"[invoke_llm_with_json] Raw response length: {len(raw_content) if raw_content else 0}"
# L263: Execute statement (details in business description above)
        )

# L265: Start try block, subsequent except handles fallback
        try:
# L266: Parse LLM returned string into Python dict
            result = json.loads(raw_content)
# L267: Catch exception to avoid crashing the entire graph/request
        except (json.JSONDecodeError, TypeError):
# L268: Call LLM and parse JSON; internal JSON mode → text fallback chain
            logger.info("[invoke_llm_with_json] Direct JSON parse failed, trying parse_json_from_text extraction")
# L269: Extract JSON from LLM text (four-layer regex/parse strategy)
            result = parse_json_from_text(raw_content)
# L270: Conditional branch
            if not result:
# L271: Raise exception up, handled by caller or LangGraph
                raise ValueError(f"Cannot extract valid JSON from LLM response, original content: {raw_content[:300]}")

# L273: Log for online debugging of node input/output
        logger.info(
# L274: Call LLM and parse JSON; internal JSON mode → text fallback chain
            f"[invoke_llm_with_json] Execution complete, returned JSON field count: {len(result)}"
# L275: Execute statement (details in business description above)
        )
# L276: Call LLM and parse JSON; internal JSON mode → text fallback chain
        logger.debug(f"[invoke_llm_with_json] Returned JSON preview: {str(result)[:300]}")

# L278: Return the parsed dict to the caller
        return result
# (The excerpt ends at line 278; the except clause that closes the outer try opened at L230 lies outside this range and is not shown.)

extract_information

# ========== extract_information ==========
# Source file: agents/resume_parser.py  Lines 153-225

# L153: Asynchronous function extract_information: can be awaited, suitable for IO-type LLM/DB calls
async def extract_information(state: ResumeParserState) -> dict:
# L155: [Doc] Use LLM to extract structured information.
# L157: [Doc] Function description:
# L158: [Doc] Sends the user's text content to the LLM, following the format requirements defined in
# L159: [Doc] RESUME_PARSER_SYSTEM_PROMPT, extracting structured personal information including basic info,
# L160: [Doc] work experience, skill set, certifications, and career development path.
# L162: [Doc] Input description:
# L163: [Doc] state (ResumeParserState): Resume parsing state object, must contain document_content.
# L165: [Doc] Output description:
# L166: [Doc] dict: State update dictionary, containing parsed_sections (structured data from LLM).
# (Lines L154-167 are function/module docstrings, converted to comments for readability)
# L168: Start try block, subsequent except handles fallback
    try:
# L169: Log for online debugging of node input/output
        logger.info("[extract_information] Starting execution, input: state=%s", {k: str(v)[:100] for k, v in state.items()})
# L170: Assignment: update local variable or state field
        document_content = state.get("document_content", "")

# L172: Conditional branch
        if not document_content or not document_content.strip():
# L173: Log for online debugging of node input/output
            logger.warning("[extract_information] Document content is empty, skipping extraction")
# L174: Return fields to be merged into state (LangGraph will merge)
            return {
# L175: Execute statement (details in business description above)
                "parsed_sections": {},
# L176: Execute statement (details in business description above)
                "parse_errors": ["Document content is empty, cannot extract information"],
# L177: Execute statement (details in business description above)
            }

# L179: Assignment: update local variable or state field
        messages = [
# L180: Execute statement (details in business description above)
            {"role": "system", "content": RESUME_PARSER_SYSTEM_PROMPT},
# L181: Execute statement (details in business description above)
            {"role": "user", "content": f"Please extract structured personal information from the following text:\n\n{document_content}"},
# L182: Execute statement (details in business description above)
        ]

# L184: Log for online debugging of node input/output
        logger.info("[extract_information] Calling LLM to extract structured information, document length: %d", len(document_content))

# L186: Assignment: update local variable or state field
        parsed_data = {}
# L187: Assignment: update local variable or state field
        last_err = None
# L188: Loop
        for attempt in range(2):
# L189: Start try block, subsequent except handles fallback
            try:
# L190: Get light model instance (mainly used for resume_parser structured JSON)
                model = get_light_model()
# L191: Call LLM and parse JSON; internal JSON mode → text fallback chain
                parsed_data = await invoke_llm_with_json(model, messages)
# L192: Conditional branch
                if parsed_data and len(parsed_data) > 0:
# L193: Execute statement (details in business description above)
                    break
# L194: Log for online debugging of node input/output
                logger.warning("[extract_information] Attempt %d returned empty data, retrying", attempt + 1)
# L195: Catch exception to avoid crashing the entire graph/request
            except TimeoutError as te:
# L196: Assignment: update local variable or state field
                last_err = te
# L197: Log for online debugging of node input/output
                logger.warning("[extract_information] LLM call timeout on attempt %d: %s", attempt + 1, te)
# L198: Catch exception to avoid crashing the entire graph/request
            except Exception as e:
# L199: Assignment: update local variable or state field
                last_err = e
# L200: Log for online debugging of node input/output
                logger.warning("[extract_information] LLM call exception on attempt %d: %s", attempt + 1, e)

# L202: Conditional branch
        if not parsed_data or len(parsed_data) == 0:
# L203: Log for online debugging of node input/output
            logger.warning("[extract_information] LLM extraction failed, using regex fallback")
# L204: Assignment: update local variable or state field
            parsed_data = _regex_extract_profile(document_content)

# L206: Log for online debugging of node input/output
        logger.info("[extract_information] Structured data field count: %d", len(parsed_data))
# L207: Log for online debugging of node input/output
        logger.debug("[extract_information] Structured data preview: %s", json.dumps(parsed_data, ensure_ascii=False)[:500])

# L209: Assignment: update local variable or state field
        result = {
# L210: Execute statement (details in business description above)
            "parsed_sections": parsed_data,
# L211: Execute statement (details in business description above)
        }
# L212: Log for online debugging of node input/output
        logger.info("[extract_information] Execution complete, output: parsed_sections field count=%d", len(parsed_data))
# L213: Return fields to be merged into state (LangGraph will merge)
        return result

# L215: Catch exception to avoid crashing the entire graph/request
    except Exception as e:
# L216: Log for online debugging of node input/output
        logger.error("[extract_information] Exception extracting structured information with LLM: %s", e, exc_info=True)
# L217: Assignment: update local variable or state field
        fallback = _regex_extract_profile(state.get("document_content", ""))
# L218: Conditional branch
        if fallback:
# L219: Log for online debugging of node input/output
            logger.info("[extract_information] Extracted %d fields using regex fallback", len(fallback))
# L220: Return fields to be merged into state (LangGraph will merge)
            return {"parsed_sections": fallback}
# L221: Return fields to be merged into state (LangGraph will merge)
        return {
# L222: Execute statement (details in business description above)
            "parsed_sections": {},
# L223: Execute statement (details in business description above)
            "parse_errors": [f"LLM extraction exception: {str(e)}"],
# L224: Execute statement (details in business description above)
        }

| Article | Topic |
|---|---|
| 1 | System Overview |
| 2 | Five Agent Collaboration |
| 3 | Holland RIASEC |
| 4–7 | State · Routing · Nesting · Fault Tolerance |
| 8–11 | LLM Layer · SSE/WS · DB Migration · PDF |
| 12–14 | JSON Prompt · RIASEC Prompt · Guide Prompt |
| 15–17 | Docker · Middleware · Configuration |
