0. Series Loop (Follow Along Without Public Source Code)

End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (Parsing → Analysis → Matching → Report) → tools/pdf_exporter PDF.
This Article: #17/17 · Configuration Loop · settings

Stage | User-Visible | Code Entry | Corresponding Article
Create Session | Welcome message | POST /api/sessions | 09
Multi-turn Dialogue | SSE streaming | chat/stream → run_guide_single_turn | 06, 14
Information Sufficient | Start analysis | _run_analysis_background | 05, 07
Resume Parsing | Progress 30% | run_resume_parser | 12
Profile/RIASEC | Progress 50% | run_profile_analyzer | 03, 13
Career Matching | Progress 70% | run_career_matcher | 02
Report | Progress 90% | run_reporter | 11
Download | PDF file | GET …/report/pdf | 11, 15

Item | Description
Before Reading | Article 08/15: environment variables
After Reading | Can explain how .env overrides Settings defaults
Next Loop | Back to Article 01: Loop Review

Full series loop index: SERIES-LOOP.md

1. What Problem to Solve

When iCan's configuration is scattered across multiple places, it quickly spirals out of control: LLM keys, the MySQL URL, the embedding model path, the Chroma directory, and the chat round limit may come from a local .env, the Docker environment, and CI-injected variables all at once. The full LLM_API_KEY must also stay out of the logs.

The project consolidates every item into a single Settings class in config.py and exposes a module-level settings singleton, which llm/providers.py and the agents read; Docker Compose feeds it through environment variables. This article covers only iCan's implementation and is not a generic pydantic-settings tutorial.


2. Implementation Location

Module | Relationship with Config
config.py | _ENV_FILE location, dotenv preload, Settings, get_settings(), settings singleton
llm/providers.py | from ican.config import settings; reads LLM_* to create ChatOpenAI
agents/resume_parser.py | Indirectly reads LLM_MODEL_LIGHT via get_light_model()
.env.example | The repository's recommended local/DeepSeek example values
docker-compose.yml | Container environment variable overrides (default Ollama + MySQL DSN)

[Figure: pydantic-settings config loading]


3. _ENV_FILE Lookup: Two-level Path

config.py computes the absolute path of .env before the Settings class is defined:

import os

_ENV_FILE = os.path.normpath(os.path.join(
    os.path.abspath(os.path.dirname(__file__)), "..", "..", ".env"))
if not os.path.exists(_ENV_FILE):
    _ENV_FILE = os.path.normpath(os.path.join(
        os.path.abspath(os.path.dirname(__file__)), "..", "..", "..", ".env"))

Explanation (__file__ resolves to src/ican/config.py):

  1. Preferred: .env in the project root (two levels up, the iCan/ repository root);
  2. Fallback: one more level up (for checkout layouts where src is nested one directory deeper).

If neither path exists, _ENV_FILE still points to the last candidate. pydantic-settings does not raise when the file is missing, so fields fall back to their code defaults.
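The two-level lookup can be tried in isolation. The sketch below is a minimal reimplementation under stated assumptions: find_env_file and demo are hypothetical helper names, and a temporary directory stands in for the repository checkout. It is not the project's code.

```python
import os
import tempfile


def find_env_file(config_file: str) -> str:
    """Return the .env path two levels above config_file, else three levels above."""
    base = os.path.abspath(os.path.dirname(config_file))
    candidate = os.path.normpath(os.path.join(base, "..", "..", ".env"))
    if not os.path.exists(candidate):
        # Fallback candidate; it is returned even if it does not exist either.
        candidate = os.path.normpath(os.path.join(base, "..", "..", "..", ".env"))
    return candidate


def demo() -> tuple[str, str]:
    """Resolve the path before and after creating repo/.env in a temp layout."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)
        config_file = os.path.join(root, "repo", "src", "ican", "config.py")
        os.makedirs(os.path.dirname(config_file))
        before = os.path.relpath(find_env_file(config_file), root)
        open(os.path.join(root, "repo", ".env"), "w").close()
        after = os.path.relpath(find_env_file(config_file), root)
        return before, after
```

Before repo/.env exists, the function returns the fallback path one level higher; once the file is created, the preferred path wins.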

Settings.model_config explicitly binds this path:

model_config = SettingsConfigDict(
    env_file=_ENV_FILE,
    env_file_encoding="utf-8",
    case_sensitive=True,
    extra="ignore",
)

case_sensitive=True means environment variable names must match field names exactly (LLM_API_KEY, not llm_api_key). extra="ignore" allows extra keys in .env without ValidationError.


4. dotenv Preload: Write to os.environ Before pydantic

Before the Settings class definition, if .env is found, it is manually injected into os.environ using python-dotenv:

if os.path.exists(_ENV_FILE):
    try:
        from dotenv import dotenv_values
        for k, v in dotenv_values(_ENV_FILE).items():
            if v is not None and k not in os.environ:
                os.environ[k] = v
    except Exception:
        pass

Design intent:

  • Does not overwrite existing environment variables (k not in os.environ), so values injected by Docker/K8s take precedence over the .env file;
  • Makes the values visible in os.environ before pydantic parses them, which covers libraries that read os.environ directly and never look at env_file;
  • except Exception: pass silently skips the preload if python-dotenv is missing or the file is corrupted, so the app can still start from pure environment variables.
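The non-overwriting rule is easy to check with plain dictionaries. The following is a minimal sketch, not the project's code: parse_env_text is a hypothetical, deliberately tiny KEY=VALUE parser standing in for dotenv_values, and fake_environ is a dict standing in for os.environ so the example has no side effects.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Tiny KEY=VALUE parser; a stand-in for dotenv_values, not a replacement."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def preload(file_values: dict[str, str], environ: dict[str, str]) -> list[str]:
    """Copy file values into environ without overwriting; return the keys added."""
    added = []
    for key, value in file_values.items():
        if value is not None and key not in environ:
            environ[key] = value
            added.append(key)
    return added


# The container already injected LLM_API_KEY; the file tries to set it too.
fake_environ = {"LLM_API_KEY": "from-docker"}
file_values = parse_env_text("LLM_API_KEY=from-dotenv\nMAX_CHAT_ROUNDS=20\n")
added = preload(file_values, fake_environ)
```

Only MAX_CHAT_ROUNDS is added; the injected LLM_API_KEY keeps its value.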

Effective priority (consistent with comments):

Process environment variables > .env file (via preload + pydantic env_file) > Settings field defaults

5. Settings Fields: Code Defaults vs .env Example

Code defaults (active when no env variable set):

Field | Default | Usage
LLM_MODEL_CHAT | gpt-4o | Guide / Analysis / Matching / Reporter
LLM_MODEL_LIGHT | gpt-4o-mini | ResumeParser etc.
LLM_BASE_URL | https://api.openai.com/v1 | OpenAI-compatible root path
LLM_API_KEY | "" | Empty string; must be filled in locally
LLM_MAX_TOKENS | 4096 | Max generation tokens (compose overrides it with 8192)
DB_URL | sqlite:///./ican.db | Development SQLite
DEBUG | False | Production mode
MAX_CHAT_ROUNDS | 15 | Guide loop limit
EMBEDDING_MODEL_PATH | "" | Empty means an external mount or local configuration is required
CHROMA_PERSIST_DIR | "" | Empty means the project-relative path logic is used

.env.example shows common DeepSeek deployment examples, not code defaults:

LLM_BASE_URL=https://api.deepseek.com/v1
LLM_MODEL_CHAT=deepseek-v4-flash
LLM_MODEL_LIGHT=deepseek-v4-flash

Saying “defaults to DeepSeek” in documentation would conflict with source code; config.py is the authority, .env / compose are override layers.

docker-compose.yml carries a third set of values (Ollama qwen3.5:9b, LLM_API_KEY=ollama) suited to offline demos. It does not contradict the cloud API values in .env.example: both are environment overrides, and neither changes the defaults in the Settings class.
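The relationship between class defaults and override layers can be modeled with collections.ChainMap. This is only an illustration of the lookup order described above, not how pydantic-settings is implemented; effective is a hypothetical helper, and the container scenario (only the chat model injected) is invented for the example.

```python
from collections import ChainMap

# Class defaults as listed in the table above (subset).
CODE_DEFAULTS = {
    "LLM_BASE_URL": "https://api.openai.com/v1",
    "LLM_MODEL_CHAT": "gpt-4o",
    "LLM_MODEL_LIGHT": "gpt-4o-mini",
}

# Values from the .env.example excerpt above.
DOTENV_EXAMPLE = {
    "LLM_BASE_URL": "https://api.deepseek.com/v1",
    "LLM_MODEL_CHAT": "deepseek-v4-flash",
    "LLM_MODEL_LIGHT": "deepseek-v4-flash",
}


def effective(process_env: dict, dotenv: dict) -> dict:
    """Resolve each field: process env first, then .env, then class defaults."""
    layers = ChainMap(process_env, dotenv, CODE_DEFAULTS)
    return {name: layers[name] for name in CODE_DEFAULTS}


bare = effective({}, {})                   # nothing set: class defaults
local = effective({}, DOTENV_EXAMPLE)      # local .env copied from the example
# Hypothetical container: the environment injects only the chat model.
container = effective({"LLM_MODEL_CHAT": "qwen3.5:9b"}, DOTENV_EXAMPLE)
```

The class defaults never change; each deployment only stacks a different override layer on top of them.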


6. Convenience Properties and llm_config_dict Redaction

Several @property helpers on Settings are used by the business code:

  • is_debug / is_production: derived from DEBUG;
  • log_level_value: maps the LOG_LEVEL string to the matching logging integer (for example logging.DEBUG);
  • app_info: a dictionary with the app name, version, and debug switch.
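A level-name mapping of the kind log_level_value performs might look like the sketch below. The source does not show the property body, so the accepted names and the fallback to INFO for unknown values are assumptions made for illustration.

```python
import logging

# Level names accepted by this sketch (assumed; the real property may accept more).
_LEVELS = {
    "CRITICAL": logging.CRITICAL,
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def log_level_value(level_name: str, default: int = logging.INFO) -> int:
    """Map a level name such as "debug" to its logging integer."""
    return _LEVELS.get(level_name.strip().upper(), default)
```

Normalizing the case here means LOG_LEVEL=debug and LOG_LEVEL=DEBUG behave the same, even though the variable name itself is case-sensitive.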

llm_config_dict aggregates the LLM-related fields:

result = {
    "api_key": self.LLM_API_KEY,
    "base_url": self.LLM_BASE_URL,
    "model_chat": self.LLM_MODEL_CHAT,
    "model_light": self.LLM_MODEL_LIGHT,
    "temperature": self.LLM_TEMPERATURE,
    "max_tokens": self.LLM_MAX_TOKENS,
}
safe_result = {
    **result,
    "api_key": "***" + result["api_key"][-4:] if len(result["api_key"]) > 4 else "***",
}
logger.info(f"[llm_config_dict] execution completed, returned (redacted): {safe_result}")
return result  # Note: returns result with the plaintext key

Key points:

  • The log uses safe_result, which exposes only the last four characters of the key;
  • The return value is still the full result (with the plaintext api_key); callers must redact it themselves if they print it later;
  • A key of four characters or fewer is logged as *** so that short keys do not leak.
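The masking rule can be factored into a small function and checked directly. mask_api_key and redacted_copy are hypothetical names; the expression is the same one used in the excerpt above.

```python
def mask_api_key(key: str) -> str:
    """Keep only the last four characters; hide short keys entirely."""
    return "***" + key[-4:] if len(key) > 4 else "***"


def redacted_copy(config: dict) -> dict:
    """Return a copy that is safe to log; the input dict is left untouched."""
    return {**config, "api_key": mask_api_key(config["api_key"])}


config = {"api_key": "sk-abcdef1234", "base_url": "https://api.openai.com/v1"}
safe = redacted_copy(config)
```

Because redacted_copy builds a new dict, the original still holds the plaintext key, which mirrors the log-versus-return-value split described in the key points.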

llm/providers.py does not go through llm_config_dict; it reads settings.LLM_API_KEY and the other fields directly to construct ChatOpenAI. Configuration still has a single entry point, but log redaction only happens when the llm_config_dict property is accessed.


7. Module-level Singleton: get_settings() and settings

def get_settings() -> Settings:
    settings = Settings()
    logger.info(
        f"[get_settings] execution completed, app: {settings.APP_NAME} v{settings.APP_VERSION}, "
        f"debug mode: {settings.DEBUG}, database: {settings.DB_URL}"
    )
    return settings

try:
    settings = get_settings()
except Exception:
    settings = Settings()  # Fallback: construct Settings() directly

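This runs once at module import, and the result is shared process-wide via from ican.config import settings. If FastAPI needs dependency injection, get_settings() can be wrapped, but the current codebase generally imports settings directly. The fallback Settings() call reads the same environment and .env as get_settings(), so it recovers from a failure in the logging step but not from a validation error, which would be raised again.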

Note: get_settings() creates a new Settings() instance each call; only the module-level settings is a singleton. Business code should import settings, not call get_settings() repeatedly unless for test isolation.
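If a cached factory is ever wanted (for example for FastAPI dependency injection), functools.lru_cache is one common option. The sketch below is an alternative pattern, not what the codebase does today; DemoSettings is a hypothetical stand-in for Settings so the example runs without pydantic.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class DemoSettings:
    """Hypothetical stand-in for Settings."""
    APP_NAME: str = "iCan"
    DEBUG: bool = False


def build_settings() -> DemoSettings:
    """Uncached factory: a new object on every call, like get_settings()."""
    return DemoSettings()


@lru_cache(maxsize=1)
def get_cached_settings() -> DemoSettings:
    """Cached factory: the first call builds the object, later calls reuse it."""
    return build_settings()
```

In tests, calling get_cached_settings.cache_clear() after changing environment variables forces a rebuild, which gives the isolation mentioned above without touching the module-level singleton.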

The startup log prints DB_URL in plaintext (including any password). Unlike the API key, the database connection string is currently not redacted, so production deployments should control this via LOG_LEVEL or an external log filter.
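One way to close this gap would be to mask the password before logging. The helper below is a hypothetical sketch built on urllib.parse from the standard library; it is not part of the project, and the example DSN is invented.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_db_url(url: str) -> str:
    """Replace the password in a connection URL with ***; leave other URLs as-is."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    # Split on the last "@" so a password containing "@" is still fully hidden.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user = userinfo.split(":", 1)[0]
    netloc = f"{user}:***@{hostport}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

A URL without credentials, such as the SQLite default, passes through unchanged, so the helper could wrap every DB_URL log call without special cases.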


8. How to Switch Environments

Scenario | Approach
Local development | Copy .env.example to .env, then fill in the DeepSeek/OpenAI values
Docker | The compose environment block overrides; the host .env can be passed through via ${VAR}
CI-only secrets | No .env; the pipeline injects LLM_API_KEY etc. as environment variables
SQLite → MySQL | Change DB_URL; the compose default is already a MySQL DSN

In compose, EMBEDDING_MODEL_PATH and CHROMA_PERSIST_DIR default to /app/models/bge-m3 and /app/chroma_data, matching the volume mounts from article 15. The empty-string defaults in code mean the path must be specified externally (the Chroma comment says the default may fall back to chroma_data under the project root).

Type validation is done by pydantic: the string DEBUG=false becomes False, while DEBUG=abc raises a ValidationError at startup. The same applies to LLM_TEMPERATURE and LLM_MAX_TOKENS.
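The coercion behavior can be approximated without pydantic to see why DEBUG=abc fails. parse_bool below is a simplified stand-in: its accepted-value sets are an assumption for illustration, and pydantic's own list and error type (ValidationError rather than ValueError) are defined by pydantic.

```python
# Accepted spellings in this sketch only; pydantic's own list may differ.
TRUE_STRINGS = {"1", "true", "t", "yes", "y", "on"}
FALSE_STRINGS = {"0", "false", "f", "no", "n", "off"}


def parse_bool(raw: str) -> bool:
    """Convert an environment string to bool, or raise ValueError."""
    text = raw.strip().lower()
    if text in TRUE_STRINGS:
        return True
    if text in FALSE_STRINGS:
        return False
    raise ValueError(f"not a valid boolean: {raw!r}")


def parse_numeric(raw: str, target: type):
    """Convert an environment string to int or float; bad input raises ValueError."""
    return target(raw.strip())
```

The point of failing at startup is that a misspelled value surfaces immediately instead of silently running with an unintended setting.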


9. Position in Pipeline

.env / Environment variables
→ dotenv preload (top of config.py)
→ Settings() validation
→ settings singleton
→ llm/providers.py (ChatOpenAI)
→ agents/resume_parser.py / agents/* / api/routes/*
→ docker-compose environment overrides

Changing a model does not require changing Agent code: modifying LLM_BASE_URL and LLM_MODEL_* is enough, consistent with the "OpenAI-compatible interface + environment variable switching" approach from article 8. check_ollama_available in llm/providers.py also reads settings.LLM_BASE_URL and settings.LLM_MODEL_CHAT for its health check.


10. Pitfalls and Edge Cases

Pitfall 1: Mistaking .env.example for the runtime default. The source code defaults are gpt-4o and the OpenAI URL; DeepSeek is only an example file. When writing documentation or taking screenshots, distinguish between "class defaults" and "deployment examples".

Pitfall 2: llm_config_dict redaction only protects the logger. The returned dict still contains the plaintext key, so serializing it into an API response would leak the key. Do not reuse the logging redaction for external interfaces; design a separate DTO instead.
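A separate DTO could simply have no key field at all, so there is nothing to forget to mask. PublicLLMConfig and to_public are hypothetical names for illustration; the input keys match the llm_config_dict result shown earlier in the article.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PublicLLMConfig:
    """Hypothetical response DTO: deliberately has no api_key field."""
    base_url: str
    model_chat: str
    model_light: str
    has_api_key: bool


def to_public(config: dict) -> PublicLLMConfig:
    """Copy only the non-secret fields out of an llm_config_dict-style dict."""
    return PublicLLMConfig(
        base_url=config["base_url"],
        model_chat=config["model_chat"],
        model_light=config["model_light"],
        has_api_key=bool(config.get("api_key")),
    )


internal = {
    "api_key": "sk-abcdef1234",
    "base_url": "https://api.openai.com/v1",
    "model_chat": "gpt-4o",
    "model_light": "gpt-4o-mini",
}
public = asdict(to_public(internal))
```

An allow-list like this fails safe: a new secret field added to the internal dict later does not reach the response unless someone adds it to the DTO on purpose.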

Pitfall 3: The _ENV_FILE path is independent of cwd. The path is computed from the location of config.py, so starting uvicorn from /app/src still finds the repository-root .env (if it is mounted into the container). If .env is not COPY-ed into the image, only the code defaults plus whatever compose injects are in effect.

Pitfall 4: The dotenv preload fails silently. except Exception: pass swallows dotenv failures, so when troubleshooting a variable that is "not taking effect", check os.environ, the compose file, and the case-sensitive spelling of the name.
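A small diagnostic can make that troubleshooting less of a guessing game. explain_source is a hypothetical helper, not something in the project; it takes plain dicts for the three layers and reports which one supplies a field, flagging a likely case mismatch.

```python
def explain_source(name: str, process_env: dict, dotenv_values: dict, defaults: dict) -> str:
    """Say which layer supplies `name`, and flag a likely case mismatch."""
    if name in process_env:
        return "process environment"
    if name in dotenv_values:
        return ".env file"
    # With case_sensitive=True, LLM_API_KEY and llm_api_key are different names.
    for key in list(process_env) + list(dotenv_values):
        if key.lower() == name.lower():
            return f"class default (found {key!r}, check the spelling)"
    if name in defaults:
        return "class default"
    return "unknown field"


defaults = {"LLM_API_KEY": "", "DEBUG": False}
report = explain_source("LLM_API_KEY", {"llm_api_key": "sk-x"}, {}, defaults)
```

Here report points at the lowercase llm_api_key, which is exactly the kind of mistake the silent preload hides.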

Pitfall 5: Only one of the two model fields is configured. .env.example sets the same DeepSeek model for both chat and light, but agents/resume_parser.py still goes through LLM_MODEL_LIGHT. Changing only LLM_MODEL_CHAT leaves the light path on a different model.
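A startup check for this case is short enough to sketch. check_model_pair is a hypothetical helper; it only inspects which of the two variables were overridden and does not validate the model names themselves.

```python
def check_model_pair(env: dict) -> list[str]:
    """Warn when exactly one of LLM_MODEL_CHAT / LLM_MODEL_LIGHT is overridden."""
    chat_set = "LLM_MODEL_CHAT" in env
    light_set = "LLM_MODEL_LIGHT" in env
    warnings = []
    if chat_set != light_set:
        missing = "LLM_MODEL_LIGHT" if chat_set else "LLM_MODEL_CHAT"
        warnings.append(f"{missing} is not overridden and will use its class default")
    return warnings
```

Passing a dict built from os.environ plus the parsed .env values would cover both override layers.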


11. Summary

  1. The two-level _ENV_FILE lookup plus SettingsConfigDict(env_file=...) binds configuration to a single .env path.
  2. The dotenv preload runs before the Settings class is defined and does not overwrite existing environment variables, which suits Docker injection.
  3. Code defaults are OpenAI-based; DeepSeek/Ollama are applied via .env or compose, not hardcoded in the Settings class.
  4. llm_config_dict logs only the last four characters of the key, but the returned dict still contains the plaintext, so callers should handle it with care.
  5. Use the module-level settings singleton; get_settings() mainly serves the startup log and the import-time fallback.

Articles 16 (middleware) and 15 (Docker) in this series both pull environment variables from this module as the source.


Appendix: Key Source Code (Line-by-line Comments)

The following code is excerpted from the iCan implementation with a comment above each line (translated from the original Chinese), so it can be read without access to the repository.
Generation command: python3 bin/build-ican-annotated-snippets.py

Settings LLM/DB Fields

# ========== Settings LLM/DB fields ==========
# Source file: config.py Lines 32-88

# L32: Define the Settings class on top of pydantic-settings' BaseSettings
class Settings(BaseSettings):
    # L34: [Document] iCan project global configuration class.
    # L36: [Document] Function description:
    # L37: [Document] Centralized management of all configuration items; supports reading from the .env file and environment variables.
    # L38: [Document] Uses pydantic-settings for type validation.
    # L40: [Document] Input parameter description:
    # L41: [Document] None (read automatically from environment variables/.env file via BaseSettings)
    # L43: [Document] Output parameter description:
    # L44: [Document] Settings instance, accessible via attributes
    # (L33-45 are function/module docstrings, converted to comments for readability)

    # L47: ------------------------------------------------------------------
    # L48: Pydantic Settings configuration
    # L49: ------------------------------------------------------------------
    # L50: Loading rules for this settings class
    model_config = SettingsConfigDict(
        # L51: Absolute .env path resolved at the top of config.py
        env_file=_ENV_FILE,
        # L52: Read the file as UTF-8
        env_file_encoding="utf-8",
        # L53: Variable names must match field names exactly
        case_sensitive=True,
        # L54: Ignore unknown keys instead of raising ValidationError
        extra="ignore",
    # L55: End of the SettingsConfigDict call
    )

    # L57: ------------------------------------------------------------------
    # L58: Application basic configuration
    # L59: ------------------------------------------------------------------
    # L60: Field with its code default
    APP_NAME: str = "iCan"
    # L61: [Document] Application name

    # L63: Field with its code default
    APP_VERSION: str = "0.1.0"
    # L64: [Document] Application version

    # L66: Field with its code default
    DEBUG: bool = False
    # L67: [Document] Debug mode switch

    # L69: ------------------------------------------------------------------
    # L70: LLM Large Language Model configuration
    # L71: ------------------------------------------------------------------
    # L72: Field with its code default (empty key)
    LLM_API_KEY: str = ""
    # L73: [Document] LLM API key, read from .env file

    # L75: Field with its code default
    LLM_BASE_URL: str = "https://api.openai.com/v1"
    # L76: [Document] LLM API base URL

    # L78: Field with its code default
    LLM_MODEL_CHAT: str = "gpt-4o"
    # L79: [Document] Model used for conversation guide, analysis, and matching

    # L81: Field with its code default
    LLM_MODEL_LIGHT: str = "gpt-4o-mini"
    # L82: [Document] Lightweight model used for parsing and report formatting

    # L84: Field with its code default
    LLM_TEMPERATURE: float = 0.7
    # L85: [Document] LLM generation temperature parameter

    # L87: Field with its code default
    LLM_MAX_TOKENS: int = 4096
    # L88: [Document] LLM maximum generated token count

llm_config_dict Redaction

# ========== llm_config_dict redaction ==========
# Source file: config.py Lines 179-215

# L179: Decorator: expose the method as a read-only attribute
@property
# L180: Property getter llm_config_dict
def llm_config_dict(self) -> dict:
    # L182: [Document] Get LLM-related configuration as a dictionary.
    # L184: [Document] Function description:
    # L185: [Document] Assembles LLM-related config items into a dictionary, convenient for passing to the LLM client.
    # L187: [Document] Input parameter description:
    # L188: [Document] None
    # L190: [Document] Output parameter description:
    # L191: [Document] dict: Contains api_key, base_url, model_chat, model_light,
    # L192: [Document] temperature, max_tokens
    # (L181-193 are function/module docstrings, converted to comments for readability)
    # L194: Start try block; the except below handles the fallback
    try:
        # L195: Import the logger factory lazily
        from ican.core.logger import get_logger

        # L197: Create a module logger
        logger = get_logger(__name__)
        # L198: Log entry for debugging
        logger.info(f"[llm_config_dict] start execution, input: None")

        # L200: Build the full dictionary (api_key in plaintext)
        result = {
            # L201: Plaintext API key
            "api_key": self.LLM_API_KEY,
            # L202: Base URL
            "base_url": self.LLM_BASE_URL,
            # L203: Chat model name
            "model_chat": self.LLM_MODEL_CHAT,
            # L204: Light model name
            "model_light": self.LLM_MODEL_LIGHT,
            # L205: Temperature
            "temperature": self.LLM_TEMPERATURE,
            # L206: Max tokens
            "max_tokens": self.LLM_MAX_TOKENS,
        # L207: End of the dictionary literal
        }

        # L209: Log after redaction
        # L210: Copy of result with the key masked to its last four characters
        safe_result = {**result, "api_key": "***" + result["api_key"][-4:] if len(result["api_key"]) > 4 else "***"}
        # L211: Log the redacted copy only
        logger.info(f"[llm_config_dict] execution completed, returned (redacted): {safe_result}")
        # L212: Return the full dictionary (api_key still in plaintext)
        return result
    # L213: Catch any exception so that property access never raises
    except Exception as e:
        # L214: Report the error on stdout
        print(f"[llm_config_dict] exception getting LLM config dict: {e}")
        # L215: Return an empty dictionary on failure
        return {}

get_settings Singleton

# ========== get_settings singleton ==========
# Source file: config.py Lines 287-326

# L287: Factory function get_settings
def get_settings() -> Settings:
    # L289: [Document] Get global Settings singleton instance.
    # L291: [Document] Function description:
    # L292: [Document] Creates and returns a Settings configuration instance. This function can serve as a factory method for dependency injection,
    # L293: [Document] ensuring the entire application uses the same configuration.
    # L295: [Document] Input parameter description:
    # L296: [Document] None
    # L298: [Document] Output parameter description:
    # L299: [Document] Settings: global configuration instance
    # (L288-300 are function/module docstrings, converted to comments for readability)
    # L301: Start try block; the except below logs and re-raises
    try:
        # L302: Import the logger factory lazily
        from ican.core.logger import get_logger

        # L304: Create a module logger
        logger = get_logger(__name__)
        # L305: Log entry for debugging
        logger.info(f"[get_settings] start execution, input: None")

        # L307: Build a new Settings instance (reads environment and .env)
        settings = Settings()

        # L309: Startup log (note: DB_URL is printed unredacted)
        logger.info(
            # L310: App name and version
            f"[get_settings] execution completed, app: {settings.APP_NAME} v{settings.APP_VERSION}, "
            # L311: Debug flag and database URL
            f"debug mode: {settings.DEBUG}, database: {settings.DB_URL}"
        # L312: End of the logger.info call
        )
        # L313: Return the new instance
        return settings
    # L314: Catch any exception to report it before re-raising
    except Exception as e:
        # L315: Report the error on stdout
        print(f"[get_settings] exception creating Settings instance: {e}")
        # L316: Re-raise the exception to the caller
        raise


# L319: ----------------------------------------------------------------------
# L320: Global configuration instance (module-level singleton)
# L321: ----------------------------------------------------------------------
# L322: Start try block; the except below handles the fallback
try:
    # L323: Create the module-level singleton
    settings = get_settings()
# L324: Catch any exception raised by get_settings()
except Exception:
    # L325: If creation fails (e.g., missing .env file), use defaults
    # L326: Construct Settings() directly as a fallback
    settings = Settings()

Series Navigation

Article | Topic
1 | System Overview
2 | Five-Agent Collaboration
3 | Holland RIASEC
4–7 | State · Routing · Nesting · Fault Tolerance
8–11 | LLM Layer · SSE/WS · DB Migration · PDF
12–14 | JSON Prompt · RIASEC Prompt · Guide Prompt
15–17 | Docker · Middleware · Config
