pydantic-settings Configuration Management: Best Practices for API Key Masking and Multi-Environment Configuration

0. Series Loop (Follow Along Without Public Source Code)

End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (Parsing → Analysis → Matching → Report) → tools/pdf_exporter PDF.
This Article: #17/17 · Configuration Loop · settings

Stage	User Visible	Code Entry	Corresponding Article
Create Session	Welcome Message	POST /api/sessions	09
Multi-turn Dialogue	SSE Streaming	chat/stream → run_guide_single_turn	06, 14
Information Sufficient	Start Analysis	_run_analysis_background	05, 07
Resume Parsing	Progress 30%	run_resume_parser	12
Profile/RIASEC	Progress 50%	run_profile_analyzer	03, 13
Career Matching	Progress 70%	run_career_matcher	02
Report	Progress 90%	run_reporter	11
Download PDF	File	GET …/report/pdf	11, 15

	Description
Before Reading	Article 08/15 Environment Variables
After Reading	You can explain how `.env` overrides the Settings defaults
Next Loop	Back to Article 01: Loop Review

Full series loop index: SERIES-LOOP.md

1. What Problem to Solve

iCan's configuration is spread across several places and quickly spirals out of control: LLM keys, the MySQL URL, the embedding model path, the Chroma directory, and the chat round limit can come from a local .env, the Docker environment, and CI-injected variables all at once. The full LLM_API_KEY must also stay out of the logs.

The project consolidates every item into a single Settings class in config.py, exposed as a module-level settings singleton that llm/providers.py and the Agents read directly; Docker compose feeds the same fields through environment variables. This article covers only iCan's implementation; it is not a generic pydantic-settings tutorial.

2. Implementation Location

Module	Relationship with Config
`config.py`	`_ENV_FILE` location, dotenv preload, `Settings`, `get_settings()`, `settings` singleton
`llm/providers.py`	`from ican.config import settings`, reads `LLM_*` to create `ChatOpenAI`
`agents/resume_parser.py`	Indirectly reads `LLM_MODEL_LIGHT` via `get_light_model()`
`.env.example`	Repository’s recommended local/DeepSeek example values
`docker-compose.yml`	Container environment variable overrides (default Ollama + MySQL DSN)

pydantic-settings config loading

3. `_ENV_FILE` Lookup: Two-level Path

config.py computes the absolute path of .env before the Settings class is defined:

_ENV_FILE = os.path.normpath(os.path.join(
    os.path.abspath(os.path.dirname(__file__)), "..", "..", ".env"))
if not os.path.exists(_ENV_FILE):
    _ENV_FILE = os.path.normpath(os.path.join(
        os.path.abspath(os.path.dirname(__file__)), "..", "..", "..", ".env"))

Explanation (__file__ is src/ican/config.py):

Preferred: .env at the project root, two levels up in the iCan/ repository root;
Fallback: one more level up, for checkout layouts where src is nested one directory deeper.

If neither path exists, _ENV_FILE still points to the last candidate path. pydantic-settings does not raise when the env file is missing; fields then take their values from process environment variables, or from the code defaults if none are set.
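The two-level lookup can be reproduced in isolation. The sketch below is a minimal illustration, not iCan's code: resolve_env_file is a hypothetical helper name, and the temporary outer/iCan/src/ican layout only mimics the directory structure described above.

```python
import os
import tempfile


def resolve_env_file(config_dir: str) -> str:
    """Return <config_dir>/../../.env if it exists, else the path one level higher."""
    primary = os.path.normpath(os.path.join(config_dir, "..", "..", ".env"))
    if os.path.exists(primary):
        return primary
    # Fallback candidate; it is returned even when it does not exist either.
    return os.path.normpath(os.path.join(config_dir, "..", "..", "..", ".env"))


with tempfile.TemporaryDirectory() as tmp:
    repo_root = os.path.join(os.path.realpath(tmp), "outer", "iCan")
    config_dir = os.path.join(repo_root, "src", "ican")  # stands in for dirname(__file__)
    os.makedirs(config_dir)

    # Case 1: no .env anywhere, so the last candidate is returned unchanged.
    without_env = os.path.relpath(resolve_env_file(config_dir), repo_root)

    # Case 2: a .env at the repository root takes precedence.
    open(os.path.join(repo_root, ".env"), "w").close()
    with_env = os.path.relpath(resolve_env_file(config_dir), repo_root)

print(without_env)  # relative to the repo root: one directory above it
print(with_env)     # relative to the repo root: the root itself
```

Because both candidates are derived from the directory of config.py, the result does not depend on the working directory the process was started from.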

Settings.model_config explicitly binds this path:

model_config = SettingsConfigDict(
    env_file=_ENV_FILE,
    env_file_encoding="utf-8",
    case_sensitive=True,
    extra="ignore",
)

case_sensitive=True means environment variable names must match field names exactly (LLM_API_KEY, not llm_api_key). extra="ignore" allows extra keys in .env without ValidationError.

4. dotenv Preload: Write to `os.environ` Before pydantic

Before the Settings class definition, if .env is found, it is manually injected into os.environ using python-dotenv:

if os.path.exists(_ENV_FILE):
    try:
        from dotenv import dotenv_values
        for k, v in dotenv_values(_ENV_FILE).items():
            if v is not None and k not in os.environ:
                os.environ[k] = v
    except Exception:
        pass

Design intent:

Existing environment variables are never overwritten (k not in os.environ), so values injected by Docker/K8s take precedence over the .env file;
os.environ is populated before pydantic parses anything, which covers libraries that read only os.environ and never look at env_file;
except Exception: pass silently skips the preload if python-dotenv is missing or the file is corrupted, so the app can still start from pure environment variables.

Effective priority (consistent with the source comments):

`Process environment variables > .env file (via preload + pydantic env_file) > Settings field defaults`
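The priority chain can be checked without touching the real process environment. The following minimal sketch applies the same "only fill in missing keys" rule to plain dictionaries. preload_env and resolve are hypothetical names, the sample values are illustrative, and the None entry stands for a key that the .env parser reports without a value.

```python
from typing import Mapping, MutableMapping, Optional


def preload_env(
    environ: MutableMapping[str, str],
    file_values: Mapping[str, Optional[str]],
) -> None:
    """Copy .env values into environ, skipping keys that are already set."""
    for key, value in file_values.items():
        if value is not None and key not in environ:
            environ[key] = value


def resolve(name: str, environ: Mapping[str, str], default: str) -> str:
    """Return the environment value if present, otherwise the code default."""
    return environ.get(name, default)


# Pretend the container runtime already injected one variable.
environ = {"LLM_MODEL_CHAT": "qwen3.5:9b"}
file_values = {
    "LLM_MODEL_CHAT": "deepseek-v4-flash",
    "LLM_BASE_URL": "https://api.deepseek.com/v1",
    "NO_VALUE": None,
}
preload_env(environ, file_values)

chat = resolve("LLM_MODEL_CHAT", environ, "gpt-4o")                   # injected value wins
base = resolve("LLM_BASE_URL", environ, "https://api.openai.com/v1")  # taken from .env
light = resolve("LLM_MODEL_LIGHT", environ, "gpt-4o-mini")            # code default
print(chat, base, light)
```

Each of the three results comes from a different layer, which is the ordering stated above.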

5. Settings Fields: Code Defaults vs .env Example

Code defaults (used when no environment variable is set):

Field	Default	Usage
`LLM_MODEL_CHAT`	`gpt-4o`	Guide / Analysis / Matching / Reporter
`LLM_MODEL_LIGHT`	`gpt-4o-mini`	ResumeParser etc.
`LLM_BASE_URL`	`https://api.openai.com/v1`	OpenAI-compatible root path
`LLM_API_KEY`	`""`	Empty string, must fill locally
`LLM_MAX_TOKENS`	`4096`	Max generation tokens (compose default 8192 overrides)
`DB_URL`	`sqlite:///./ican.db`	Development SQLite
`DEBUG`	`False`	Production mode
`MAX_CHAT_ROUNDS`	`15`	Guide loop limit
`EMBEDDING_MODEL_PATH`	`""`	Empty means external mount or local configuration required
`CHROMA_PERSIST_DIR`	`""`	Empty means using project relative path logic

.env.example shows common DeepSeek deployment examples, not code defaults:

LLM_BASE_URL=https://api.deepseek.com/v1
LLM_MODEL_CHAT=deepseek-v4-flash
LLM_MODEL_LIGHT=deepseek-v4-flash

Documentation that says "defaults to DeepSeek" would contradict the source code: config.py is the authority, while .env and compose are override layers.

docker-compose.yml supplies a third set of values (Ollama qwen3.5:9b, LLM_API_KEY=ollama) suited to offline demos. This does not contradict the cloud API values in .env.example: both are environment overrides, and neither changes the Settings class defaults.

6. Convenience Properties and `llm_config_dict` Redaction

Several @property accessors on Settings are used by business code:

is_debug / is_production: derived from DEBUG;
log_level_value: maps the LOG_LEVEL string to the corresponding logging integer (logging.DEBUG and so on);
app_info: a dictionary with the app name, version, and debug switch.

LLM-related aggregation in llm_config_dict:

result = {
    "api_key": self.LLM_API_KEY,
    "base_url": self.LLM_BASE_URL,
    "model_chat": self.LLM_MODEL_CHAT,
    "model_light": self.LLM_MODEL_LIGHT,
    "temperature": self.LLM_TEMPERATURE,
    "max_tokens": self.LLM_MAX_TOKENS,
}
safe_result = {
    **result,
    "api_key": "***" + result["api_key"][-4:] if len(result["api_key"]) > 4 else "***",
}
logger.info(f"[llm_config_dict] execution completed, returned (redacted): {safe_result}")
return result  # Note: returns result with plaintext key

Key points:

The log uses safe_result, which exposes only the last four characters of the key;
The return value is still the full result (with the plaintext api_key); callers must redact it themselves before printing or serializing it;
A key of four characters or fewer is logged as *** so that a short key is not leaked in full.

llm/providers.py does not go through llm_config_dict; it reads settings.LLM_API_KEY and the other fields directly to construct ChatOpenAI. Configuration still has a single entry point, but log redaction only happens when the llm_config_dict property is accessed.
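The masking rule is easier to reuse and test when it is pulled out of the property. The sketch below shows one way to do that; mask_secret and redacted_copy are hypothetical helper names, and the key is a made-up demo value.

```python
from typing import Any, Dict, Tuple


def mask_secret(value: str) -> str:
    """Show only the last four characters; hide values of four characters or fewer."""
    if len(value) > 4:
        return "***" + value[-4:]
    return "***"


def redacted_copy(
    config: Dict[str, Any],
    secret_keys: Tuple[str, ...] = ("api_key",),
) -> Dict[str, Any]:
    """Return a shallow copy that is safe to log; the input dict is not modified."""
    return {
        key: mask_secret(str(val)) if key in secret_keys else val
        for key, val in config.items()
    }


config = {"api_key": "sk-demo-12345678", "base_url": "https://api.openai.com/v1"}
safe = redacted_copy(config)
print(safe["api_key"])    # ***5678
print(config["api_key"])  # the original dict still holds the plaintext key
```

Returning a copy keeps the same split as llm_config_dict: one object for logging, one for the LLM client.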

7. Module-level Singleton: `get_settings()` and `settings`

def get_settings() -> Settings:
    settings = Settings()
    logger.info(
        f"[get_settings] execution completed, app: {settings.APP_NAME} v{settings.APP_VERSION}, "
        f"debug mode: {settings.DEBUG}, database: {settings.DB_URL}"
    )
    return settings

try:
    settings = get_settings()
except Exception:
    settings = Settings()  # Fallback to pure defaults on creation failure

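This runs once at module import, and the instance is shared process-wide via from ican.config import settings. If FastAPI needs dependency injection, get_settings() can be wrapped, but the current codebase generally imports settings directly. The fallback Settings() re-reads the same environment and .env file, so it recovers from failures outside the model (for example the logger import inside get_settings()) but raises the same ValidationError again if a configured value is invalid.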

Note: get_settings() creates a new Settings() instance on every call; only the module-level settings is a singleton. Business code should import settings rather than call get_settings() repeatedly, except where tests need isolation.
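If the factory itself should hand out one shared instance (for example when it is wrapped for dependency injection), a common option is to cache it with functools.lru_cache. The sketch below illustrates that option and is not what iCan currently does; DemoSettings is a stand-in dataclass so the example runs without pydantic-settings.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class DemoSettings:
    """Stand-in for the real Settings class (assumption for this sketch)."""
    APP_NAME: str = "iCan"
    DEBUG: bool = False


def get_settings_uncached() -> DemoSettings:
    # Mirrors the behavior described above: a new instance on every call.
    return DemoSettings()


@lru_cache(maxsize=1)
def get_settings_cached() -> DemoSettings:
    # The first call builds the instance; later calls return the same object.
    return DemoSettings()


a, b = get_settings_uncached(), get_settings_uncached()
c, d = get_settings_cached(), get_settings_cached()
print(a is b)  # False
print(c is d)  # True

# Tests can reset the cache to get a fresh instance.
get_settings_cached.cache_clear()
e = get_settings_cached()
print(e is c)  # False
```

cache_clear() gives tests the isolation that calling get_settings() directly provides today.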

The startup log prints DB_URL in plaintext, password included. Unlike the API key, the database connection string is currently not redacted, so production deployments should limit exposure through LOG_LEVEL or an external log filter.
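One possible mitigation is to mask the password before the connection string reaches the logger. The following sketch assumes a URL-style DSN; mask_db_url is a hypothetical helper and the MySQL DSN is an invented example, not a value from the iCan repository.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_db_url(url: str) -> str:
    """Replace the password part of a URL-style DSN with ***."""
    parts = urlsplit(url)
    userinfo, at, hostinfo = parts.netloc.rpartition("@")
    if not at or ":" not in userinfo:
        # No credentials, or a user name without a password: nothing to hide.
        return url
    user = userinfo.split(":", 1)[0]
    return urlunsplit(parts._replace(netloc=f"{user}:***@{hostinfo}"))


masked_mysql = mask_db_url("mysql+pymysql://ican:s3cret@db:3306/ican")
masked_sqlite = mask_db_url("sqlite:///./ican.db")
print(masked_mysql)   # mysql+pymysql://ican:***@db:3306/ican
print(masked_sqlite)  # sqlite:///./ican.db
```

The SQLite default passes through unchanged because it carries no credentials.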

8. How to Switch Environments

Scenario	Approach
Local development	Copy `.env.example` → `.env`, fill in DeepSeek/OpenAI
Docker	compose `environment` block overrides; host `.env` can be passed via `${VAR}`
CI-only secrets	No `.env`, pipeline injects `LLM_API_KEY` etc. as env variables
SQLite → MySQL	Change `DB_URL`, compose default is already MySQL DSN

In compose, EMBEDDING_MODEL_PATH and CHROMA_PERSIST_DIR default to /app/models/bge-m3 and /app/chroma_data, matching the volume mounts from Article 15. The empty-string defaults in code mean "the path must be specified externally" (the Chroma comment notes that the default may fall back to chroma_data under the project root).

Type validation is done by pydantic: the string DEBUG=false becomes False, while DEBUG=abc raises a ValidationError at startup. The same applies to LLM_TEMPERATURE and LLM_MAX_TOKENS.

9. Position in Pipeline

.env / Environment variables
    → dotenv preload (top of config.py)
    → Settings() validation
    → settings singleton
    → llm/providers.py (ChatOpenAI)
    → agents/resume_parser.py / agents/* / api/routes/*
    → docker-compose environment overrides

Switching models does not require changes to Agent code; modifying LLM_BASE_URL and LLM_MODEL_* is enough, in line with the "OpenAI-compatible interface + environment variable switching" approach from Article 08. check_ollama_available in llm/providers.py also reads settings.LLM_BASE_URL and settings.LLM_MODEL_CHAT for its health check.

10. Pitfalls and Edge Cases

Pitfall 1: Mistaking .env.example for the runtime default. The source code defaults are gpt-4o and the OpenAI URL; DeepSeek appears only in the example file. When writing documentation or taking screenshots, distinguish "class defaults" from "deployment examples".

Pitfall 2: llm_config_dict redaction only protects the logger. The returned dict still contains the plaintext key, so serializing it into an API response would leak it. External interfaces should not reuse this dict with the redaction bolted on; they need a separate DTO that never carries the key.

Pitfall 3: The _ENV_FILE path is independent of cwd. config.py computes an absolute path from its own location, so starting uvicorn from /app/src still finds the .env at the repository root (if it is mounted into the container). But if .env is neither COPY-ed into the image nor mounted, only the code defaults plus whatever compose injects are in effect.

Pitfall 4: The dotenv preload fails silently. except Exception: pass swallows dotenv failures, so when troubleshooting "variable not taking effect", check os.environ, the compose file, and the case-sensitive spelling of the variable name.

Pitfall 5: Only one of the two model fields is configured. .env.example sets the same DeepSeek model for both chat and light, but agents/resume_parser.py still reads LLM_MODEL_LIGHT. Changing LLM_MODEL_CHAT alone leaves the light path on its previous value, so the two model paths end up pointing at different models.
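Pitfall 5 can be caught with a small startup check. The sketch below is one hypothetical way to do it and is not part of iCan: half_configured_model is an invented name, and the function inspects a plain mapping rather than the real os.environ so that it stays testable.

```python
from typing import Mapping, Optional

MODEL_KEYS = ("LLM_MODEL_CHAT", "LLM_MODEL_LIGHT")


def half_configured_model(environ: Mapping[str, str]) -> Optional[str]:
    """Return a warning if exactly one of the two model variables is set."""
    overridden = [key for key in MODEL_KEYS if environ.get(key)]
    if len(overridden) != 1:
        return None  # both set or both unset: the two paths are consistent
    missing = next(key for key in MODEL_KEYS if key not in overridden)
    return f"{overridden[0]} is overridden but {missing} is not"


warning = half_configured_model({"LLM_MODEL_CHAT": "deepseek-v4-flash"})
both_set = half_configured_model(
    {"LLM_MODEL_CHAT": "deepseek-v4-flash", "LLM_MODEL_LIGHT": "deepseek-v4-flash"}
)
print(warning)   # LLM_MODEL_CHAT is overridden but LLM_MODEL_LIGHT is not
print(both_set)  # None
```

In practice such a check would run once after the dotenv preload, with os.environ passed in as the mapping.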

11. Summary

The two-level _ENV_FILE lookup plus SettingsConfigDict(env_file=...) binds Settings to a single .env path.
The dotenv preload runs before the Settings class is defined and never overwrites existing environment variables, which suits Docker injection.
Code defaults are OpenAI-based; DeepSeek and Ollama are applied through .env or compose overrides, not hardcoded in the Settings class.
llm_config_dict logs only the last four characters of the key, but the returned dict still contains the plaintext, so callers should handle it with care.
Use the module-level settings singleton; get_settings() mainly serves the startup log and the exception fallback.

Articles 15 (Docker) and 16 (middleware) in this series both treat this module as the source of their environment variables.

Appendix: Key Source Code (Line-by-line Comments)

The following code is excerpted from the iCan implementation, with a comment above each line (translated from the original Chinese), so it can be read without access to the repository.
Generation command: python3 bin/build-ican-annotated-snippets.py

Settings LLM/DB Fields

# ========== Settings LLM/DB fields ==========
# Source file: config.py   Lines 32-88

# L32: Define the Settings class (a pydantic-settings BaseSettings subclass)
class Settings(BaseSettings):
# L34: [Document] iCan project global configuration class.
# L36: [Document] Function description:
# L37: [Document] Centralized management of all configuration items, supports reading from .env file and environment variables.
# L38: [Document] Uses pydantic-settings for type validation and verification.
# L40: [Document] Input parameter description:
# L41: [Document] None (automatically read from environment variables/.env file via BaseSettings)
# L43: [Document] Output parameter description:
# L44: [Document] Settings instance, accessible via attributes
# (L33-45 are function/module docstrings, converted to comments for readability)

# L47: ------------------------------------------------------------------
# L48: Pydantic Settings configuration
# L49: ------------------------------------------------------------------
# L50: Assignment: update local variable or state field
    model_config = SettingsConfigDict(
# L51: Assignment: update local variable or state field
        env_file=_ENV_FILE,
# L52: Assignment: update local variable or state field
        env_file_encoding="utf-8",
# L53: Assignment: update local variable or state field
        case_sensitive=True,
# L54: Assignment: update local variable or state field
        extra="ignore",
# L55: Execute this statement (details see business description above)
    )

# L57: ------------------------------------------------------------------
# L58: Application basic configuration
# L59: ------------------------------------------------------------------
# L60: Assignment: update local variable or state field
    APP_NAME: str = "iCan"
# L61: [Document] Application name

# L63: Assignment: update local variable or state field
    APP_VERSION: str = "0.1.0"
# L64: [Document] Application version

# L66: Assignment: update local variable or state field
    DEBUG: bool = False
# L67: [Document] Debug mode switch

# L69: ------------------------------------------------------------------
# L70: LLM Large Language Model configuration
# L71: ------------------------------------------------------------------
# L72: Assignment: update local variable or state field
    LLM_API_KEY: str = ""
# L73: [Document] LLM API key, read from .env file

# L75: Assignment: update local variable or state field
    LLM_BASE_URL: str = "https://api.openai.com/v1"
# L76: [Document] LLM API base URL

# L78: Assignment: update local variable or state field
    LLM_MODEL_CHAT: str = "gpt-4o"
# L79: [Document] Model used for conversation guide, analysis, and matching

# L81: Assignment: update local variable or state field
    LLM_MODEL_LIGHT: str = "gpt-4o-mini"
# L82: [Document] Lightweight model used for parsing and report formatting

# L84: Assignment: update local variable or state field
    LLM_TEMPERATURE: float = 0.7
# L85: [Document] LLM generation temperature parameter

# L87: Assignment: update local variable or state field
    LLM_MAX_TOKENS: int = 4096
# L88: [Document] LLM maximum generated token count

llm_config_dict Redaction

# ========== llm_config_dict redaction ==========
# Source file: config.py   Lines 179-215

# L179: Decorator
    @property
# L180: Property llm_config_dict: returns the LLM settings as a dict
    def llm_config_dict(self) -> dict:
# L182: [Document] Get LLM-related configuration as a dictionary.
# L184: [Document] Function description:
# L185: [Document] Assembles LLM-related config items into a dictionary, convenient for passing to LLM client.
# L187: [Document] Input parameter description:
# L188: [Document] None
# L190: [Document] Output parameter description:
# L191: [Document] dict: Contains api_key, base_url, model_chat, model_light,
# L192: [Document] temperature, max_tokens
# (L181-193 are function/module docstrings, converted to comments for readability)
# L194: Start try block, subsequent except handles fallback
        try:
# L195: Import dependency module
            from ican.core.logger import get_logger

# L197: Assignment: update local variable or state field
            logger = get_logger(__name__)
# L198: Log for online debugging of node input/output
            logger.info(f"[llm_config_dict] start execution, input: None")

# L200: Assignment: update local variable or state field
            result = {
# L201: Execute statement (details see business description above)
                "api_key": self.LLM_API_KEY,
# L202: Execute statement (details see business description above)
                "base_url": self.LLM_BASE_URL,
# L203: Execute statement (details see business description above)
                "model_chat": self.LLM_MODEL_CHAT,
# L204: Execute statement (details see business description above)
                "model_light": self.LLM_MODEL_LIGHT,
# L205: Execute statement (details see business description above)
                "temperature": self.LLM_TEMPERATURE,
# L206: Execute statement (details see business description above)
                "max_tokens": self.LLM_MAX_TOKENS,
# L207: Execute statement (details see business description above)
            }

# L209: Log after redaction
# L210: Assignment: update local variable or state field
            safe_result = {**result, "api_key": "***" + result["api_key"][-4:] if len(result["api_key"]) > 4 else "***"}
# L211: Log for online debugging of node input/output
            logger.info(f"[llm_config_dict] execution completed, returned (redacted): {safe_result}")
# L212: Return the unredacted dict (api_key is still plaintext)
            return result
# L213: Catch any exception so that reading the property never raises
        except Exception as e:
# L214: Execute statement (details see business description above)
            print(f"[llm_config_dict] exception getting LLM config dict: {e}")
# L215: Return an empty dict on failure
            return {}

get_settings Singleton

# ========== get_settings singleton ==========
# Source file: config.py   Lines 287-326

# L287: Factory function get_settings: builds a new Settings instance on each call
def get_settings() -> Settings:
# L289: [Document] Get global Settings singleton instance.
# L291: [Document] Function description:
# L292: [Document] Creates and returns a Settings configuration instance. This function can serve as a factory method for dependency injection,
# L293: [Document] ensuring the entire application uses the same configuration.
# L295: [Document] Input parameter description:
# L296: [Document] None
# L298: [Document] Output parameter description:
# L299: [Document] Settings: global configuration instance
# (L288-300 are function/module docstrings, converted to comments for readability)
# L301: Start try block, subsequent except handles fallback
    try:
# L302: Import dependency module
        from ican.core.logger import get_logger

# L304: Assignment: update local variable or state field
        logger = get_logger(__name__)
# L305: Log for online debugging of node input/output
        logger.info(f"[get_settings] start execution, input: None")

# L307: Assignment: update local variable or state field
        settings = Settings()

# L309: Log for online debugging of node input/output
        logger.info(
# L310: Execute statement (details see business description above)
            f"[get_settings] execution completed, app: {settings.APP_NAME} v{settings.APP_VERSION}, "
# L311: Execute statement (details see business description above)
            f"debug mode: {settings.DEBUG}, database: {settings.DB_URL}"
# L312: Execute statement (details see business description above)
        )
# L313: Return the newly created Settings instance
        return settings
# L314: Catch the exception to print it before re-raising
    except Exception as e:
# L315: Execute statement (details see business description above)
        print(f"[get_settings] exception creating Settings instance: {e}")
# L316: Re-raise; the module-level try/except below handles it
        raise


# L319: ----------------------------------------------------------------------
# L320: Global configuration instance (module-level singleton)
# L321: ----------------------------------------------------------------------
# L322: Start try block, subsequent except handles fallback
try:
# L323: Assignment: update local variable or state field
    settings = get_settings()
# L324: Catch a failure raised by get_settings()
except Exception:
# L325: If creation fails (e.g., missing .env file), use defaults
# L326: Assignment: update local variable or state field
    settings = Settings()

Article	Topic
1	System Overview
2	Five-Agent Collaboration
3	Holland RIASEC
4–7	State · Routing · Nesting · Fault Tolerance
8–11	LLM Layer · SSE/WS · DB Migration · PDF
12–14	JSON Prompt · RIASEC Prompt · Guide Prompt
15–17	Docker · Middleware · Config
