0. Series Closed Loop (Read Along Even Without Public Source Code)
End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (Parse→Analyze→Match→Report) → tools/pdf_exporter PDF.
This Article: 3/17 · Analysis Loop · Holland RIASEC
| Phase | Visible to User | Code Entry | Corresponding Article |
|---|---|---|---|
| Create Session | Welcome Message | POST /api/sessions | 09 |
| Multi-turn Dialogue | SSE Streaming | chat/stream → run_guide_single_turn | 06, 14 |
| Information Sufficient | Start Analysis | _run_analysis_background | 05, 07 |
| Resume Parsing | Progress 30% | run_resume_parser | 12 |
| Profile/RIASEC | Progress 50% | run_profile_analyzer | 03, 13 |
| Career Matching | Progress 70% | run_career_matcher | 02 |
| Report | Progress 90% | run_reporter | 11 |
| Download PDF | File | GET …/report/pdf | 11, 15 |
| Description | |
|---|---|
| Before reading | Article 02: profile_analyzer_node |
| After reading | Understand how analyze_riasec produces six-dimension scores |
| Next up | Article 13: RIASEC Dedicated Prompt (Article 4) |
Full Series Closed Loop Index: SERIES-LOOP.md
1. Introduction to the Holland RIASEC Model
The Holland Theory of Vocational Personalities was proposed by American psychologist John Holland in 1959. It divides people’s vocational interests into six dimensions:
| Dimension | English | Type | Typical Careers |
|---|---|---|---|
| R | Realistic | Realistic | Engineer, Technician, Architect |
| I | Investigative | Investigative | Scientist, Data Analyst, Doctor |
| A | Artistic | Artistic | Designer, Writer, Musician |
| S | Social | Social | Teacher, Counselor, Social Worker |
| E | Enterprising | Enterprising | Entrepreneur, Sales Manager, Lawyer |
| C | Conventional | Conventional | Accountant, Administrator, Auditor |
Holland Code: Take the 2-3 highest-scoring dimensions to form a code (e.g., IAS, RCE), each code corresponds to a set of matching careers.
Why this model was chosen:
- Widely recognized in psychology with over 60 years of empirical research support
- O*NET (Occupational Information Network) official classification foundation
- Clear six-dimensional structure, suitable for LLM quantitative evaluation
- No need for users to answer 120 questions; can be inferred through dialogue
2. LLM Integration: OpenAI Compatible Interface (DeepSeek as Common Deployment Configuration)
In implementation, there is no DeepSeek-specific SDK. Everything is integrated through langchain_openai.ChatOpenAI (llm/providers.py), using LLM_BASE_URL + LLM_MODEL_CHAT / LLM_MODEL_LIGHT to switch providers—DeepSeek, official OpenAI, and local Ollama all use the same invoke_llm / invoke_llm_with_json.
Comparison with Other Deployment Options
| Option | JSON Mode | Domestic Access | Note |
|---|---|---|---|
| DeepSeek API (e.g., deepseek-v4-flash) | ✅ | ✅ Direct | Common .env configuration in project |
| GPT-4o etc. (Official OpenAI) | ✅ | Depends on network | Code default when config.py has no env |
| Ollama + Qwen3 | ✅ | Local | Docker can point to 11434 by default, requires /no_think |
.env example (deployment configuration, not hardcoded in framework):
1 | |
Dual Model Strategy (Based on Current Code)
llm/providers.py provides get_chat_model() and get_light_model(); defaults come from config.py (when no env is configured, it’s gpt-4o):
| Caller | Current Implementation |
|---|---|
| Guide / ProfileAnalyzer / CareerMatcher | get_chat_model() |
ResumeParser (agents/resume_parser.py) |
get_light_model() |
Reporter (agents/reporter.py) |
get_chat_model() (if docs say light, follow the code) |
Switch LLM_MODEL_CHAT / LLM_MODEL_LIGHT via .env without changing business code. Writing deepseek-v4-flash in .env is just a deployment example, not the code default.
3. RIASEC’s Location in agents/profile_analyzer.py
Holland scoring is not an independent service; it’s a node in the ProfileAnalyzer subgraph. create_profile_analyzer_graph() defines a sequential chain:
1 | |
analyze_riasec reads the structured profile from upstream agents/resume_parser.py via ProfileAnalysisState.structured_profile, truncates the JSON to about 2000 characters, then calls the LLM:
1 | |
The top-level workflow.py‘s profile_analyzer_node writes riasec_scores into personal_profile for use by career_matcher_node and tools/pdf_exporter.py chart generation.
4. How the LLM Performs Quantitative Evaluation
Traditional Questionnaire vs LLM Evaluation
Traditional Holland assessments require users to answer 120+ questions, taking 20-30 minutes. In contrast, the LLM evaluation approach is:
- Users describe their experiences, preferences, and confusions in dialogue
- The LLM infers the tendency of the six dimensions from this natural language
- Provides a score from 0-10, along with reasoning
Prompt Design
The rules are concentrated in the sixth section of PROFILE_ANALYZER_SYSTEM_PROMPT in llm/prompts.py; the analyze_riasec node further constrains the JSON shape in the user message:
1 | |
Output format:
1 | |
Scoring Basis Design
The key is that the Prompt requires the LLM to explain the reason for each score:
1 | |
This is more valuable than simply outputting scores—users can see “why I got a 9 for I”.
5. Connecting RIASEC Scores to Career Matching
How Holland Code Enters CareerMatcher
The project does not hardcode a HOLLAND_CAREER_MAP dictionary. generate_candidate_paths in agents/career_matcher.py serializes the full personal_profile (including riasec_scores) and sends it along with CAREER_MATCHER_SYSTEM_PROMPT from llm/prompts.py to the LLM. The Prompt requires combining the Holland code to explain the three-level paths:
1 | |
The Prompt requires the LLM to reference holland_code (e.g., IEA) in the recommendation text, explaining which directions align with high-score dimensions and which are stretch directions—this is more flexible than maintaining a static mapping table, but also more dependent on Prompt constraints and JSON parsing stability (see Article 12).
6. Visualization: tools/pdf_exporter.py Bar Chart
When embedding charts in the PDF report, it reads the six keys R–C from personal_profile.riasec_scores; keys must be consistent with the output of analyze_riasec (single uppercase letter).
1 | |
Chinese Font Handling
In Docker deployments, matplotlib does not support Chinese by default. Solution:
- Install system font package:
apt-get install fonts-noto-cjk - Specify font priority in the code:
1 | |
7. Evaluation of Assessment Effectiveness
Comparison with Traditional Questionnaires
| Dimension | Traditional Questionnaire | LLM Assessment |
|---|---|---|
| User Time | 20-30 minutes | 5-minute conversation |
| Number of Questions | 120+ | Natural conversation |
| Scoring Accuracy | Standardized scale | Semantic inference based |
| Adaptability | Fixed questions | Dynamic follow-up |
| User Experience | Dull | Like chatting with a friend |
Limitations
- Accuracy depends on input quality: The more detailed the user’s description, the more accurate the assessment
- No standardized scale: Lacks extensive validity verification like the SDS (Self-Directed Search) scale
- Cultural differences: Holland model is based on the U.S. workplace; the Chinese context requires weight adjustments
8. Pitfall Records
- Don’t hardcode DeepSeek as default:
config.pydefaults togpt-4o; DeepSeek is a deployment choice viaLLM_BASE_URL=https://api.deepseek.com/v1in.env. holland_codenot written intoriasec_scores:analyze_riaseconly writes the six R–C floats into state;holland_codeis in the LLM JSON but not persisted toriasec_scores. If the PDF needs to display the code, it must be supplemented from the raw JSON or via post-processing.- JSON parsing failure returns all zeros: When
analyze_riasecencounters an exception, it returns six-dimension 0.0; downstream Matcher will still proceed—requires judging based on Guide dialogue quality and Parser confidence scores (confidence_scores). - Ollama Qwen3 requires
/no_think:_inject_no_thinkinllm/providers.pyautomatically injects this when detecting Ollama + qwen3; otherwise, the RIASEC JSON may be polluted by thinking blocks.
9. Summary
The core idea of implementing the Holland assessment with an LLM is: Write the psychological scale into llm/prompts.py, and have analyze_riasec in agents/profile_analyzer.py infer dimensional scores from the structured profile.
Key design points:
- OpenAI compatible interface + environment variable to switch models (DeepSeek / GPT / Ollama all usable)
- Prompt requires scores + reasoning, ensuring interpretability
tools/pdf_exporter.py‘s matplotlib bar chart embedded in PDF- Install
fonts-noto-cjkin Docker environment to solve Chinese display issues
Next Article: Layered usage of TypedDict + Annotated Reducer in
core/state.py.