cfgnotes

2026-05-28

Qwen3.5 Pitfalls: Thinking Chain and flash-attn

0. Series Closure This Article Position Upstream This Article Output Downstream Article 9/10 Article 08 Inference Verification Qwen3.5 Specific Issue...

📖 2 min read

Read more →

2026-05-28

vLLM Deployment: Dynamic Mounting of LoRA and OpenAI-Compatible API

0. Series Closure This Article Upstream Output Downstream Part 10/10 Training + Validation Production-ready HTTP API Business App / Mini Program...

📖 3 min read

Read more →

2026-05-27

LoRA Effect Validation: The Principle of verify_lora.py and Practical Tests on Mac

0. Series Loop Position Upstream Output Downstream Post 8/10 Post 07: Loss Convergence Qualitative conclusion: Whether it’s “gentler, less preachy” Po...

📖 2 min read

Read more →

2026-05-27

Understanding the LoRA Training Curve: 750-Step Real Trading Review

0. Series Closure Position in Series Upstream This Article Output Downstream Article 7/10 Article 06: Training Complete Metric interpretation, determi...

📖 2 min read

Read more →

2026-05-27

Single Card SFT Practice (Part 2): SFTTrainer and Training Loop

0. Series Overview This Article Upstream Output Downstream Article 6/10 Article 05 – Dataset Ready final_lora/, 15 checkpoints Article 07 – Reading Lo...

📖 2 min read

Read more →

2026-05-27

Single-Card SFT Practical (Part 1): Tokenizer, Base Model, and LoRA Configuration

0. Series Completion Position in Series Upstream This Output Downstream Article 5/10 Article 04: Environment Ready Tokenizer, Base Model, LoraConfig,...

📖 2 min read

Read more →

2026-05-26

Qwen3.5-4B LoRA Fine-tuning Environment Setup and Model Preparation

0. Series Closing the Loop This Article’s Position Upstream This Article’s Deliverable Downstream Article 4/10 Article 03: Principles Runnable GPU env...

📖 2 min read

Read more →

2026-05-26

LoRA Principle: Training Only 0.25% of Parameters

0. Series Closed Loop Position in Series Upstream This Post’s Output Downstream Post 3/10 Post 02: Data into Model Understand r/alpha/target...

📖 2 min read

Read more →

2026-05-26

Training Set Design: 1000 JSONL and Elderly Psychological Model

0. Series Closure Position in Series Upstream This Article’s Output Downstream Article 2/10 Article 01: Scenario Definition Trainable JSONL Specificat...

📖 2 min read

Read more →

2026-05-26

pydantic-settings Configuration Management: Best Practices for API Key Masking and Multi-Environment Configuration

0. Series Loop (Follow Along Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_p...

📖 5 min read

Read more →

2026-05-26

Why Use LoRA for Elderly Emotional Companionship AI

0. Series Overview Position in Series Upstream Output Downstream Article 1/10 Business scenario definition Technical selection conclusion, project bou...

📖 2 min read

Read more →

2026-05-26

Intelligent Dialogue Engine in Practice — Multi-turn Context Management and Smart Chart Recommendation

🎯 Core Questions of This Chapter: How to build a truly “intelligent” conversational system? Problems with ordinary chatbots: ❌ No memory: Each conversation roun...

📖 6 min read

Read more →

2026-05-26

Implementing a Frontend Drag-and-Drop Interaction System —— Drag API + Grid + mousedown

🎯 Core Questions of This Chapter: How to achieve a smooth and intuitive drag-and-drop interaction experience? Challenge Pain Points of Traditional Solutions O...

📖 7 min read

Read more →

2026-05-26

Data Dashboard Two-Stage Separation Architecture: Design Time vs Runtime Decoupling

🎯 Core Question of This Chapter: What is the core contradiction of Dashboards? Dimension Problems with Traditional Approaches Our Solution Performance LLM...

📖 6 min read

Read more →

2026-05-26

Production Deployment and Performance Optimization — Asynchronous Architecture, Caching Strategy, and Monitoring

🎯 Core Chapter Questions: How to safely and efficiently deploy a development environment application to production? Challenge Traditional Pain Points Our Solu...

📖 9 min read

Read more →

2026-05-25

FastAPI Middleware Design: Custom RateLimitMiddleware and CORS Configuration Practice

0. Series Loop (Follow Along Even Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analy...

📖 4 min read

Read more →

2026-05-25

Full Record of Pitfalls in Docker Deployment of AI Projects: PyTorch CPU Version + Chinese Fonts + Frontend Build

0. Series Loop (Follow along without public source code) End-to-end pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_p...

📖 8 min read

Read more →

2026-05-25

Prompt Strategy for Conversational AI Agent: 5-Stage Progressive Information Mining

0. Series Loop (Follow Without Publishing Source) End-to-end pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pipeline...

📖 7 min read

Read more →

2026-05-25

NL→SQL Translation Engine — Semantic Model Injection and Security Verification

🎯 Core Questions of This Chapter: How to make LLM generate accurate, safe, and executable SQL? This is the core challenge of the entire system: ❌ The LLM doesn’...

📖 7 min read

Read more →

2026-05-25

RIASEC Holland Assessment Prompt Engineering: Let AI Scientifically Assess Career Interests

0. Series Loop (Follow Along Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_p...

📖 8 min read

Read more →

2026-05-24

Prompt Design for AI Agent: How to Make LLM Stably Output Structured JSON

0. Series Loop (Follow Along Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_p...

📖 8 min read

Read more →

2026-05-24

Python PDF Generation in Practice: ReportLab + matplotlib for Professional Chinese Reports

0. Series Loop (Follow along without open-source code) End-to-end pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pip...

📖 9 min read

Read more →

2026-05-24

SQLAlchemy Automatic Migration Solution: Technique to Add Fields Automatically Without Alembic

0. Series Loop (Read Along Without Public Source Code) End-to-End Pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pip...

📖 5 min read

Read more →

2026-05-24

FastAPI SSE Streaming Output + WebSocket Progress Push: Real-Time Communication Solution for AI Applications

0. Series Loop (Follow Along Without Public Source Code) End-to-End Pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_p...

📖 7 min read

Read more →

2026-05-24

Metadata Intelligent Management System —— LLM-driven Semantic Model Construction

🎯 Core Questions of This Chapter: How does an LLM understand your database structure? This is the core challenge of all NL→SQL systems: ❌ The LLM doesn’t know w...

📖 10 min read

Read more →

2026-05-23

FastAPI + LangChain Integration Best Practices: Unified LLM Call Interface Design

0. Series Loop (Readalong Without Open Source) End-to-End Pipeline: Vue frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pipeline (P...

📖 5 min read

Read more →

2026-05-23

LangGraph Error Handling and Fault Tolerance Design: 5 Strategies to Prevent AI Agent System Crashes

0. Series Loop (Follow Along Without Open Source Code) End-to-End Chain: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pipeli...

📖 10 min read

Read more →

2026-05-23

LangGraph Double-Layer StateGraph Nesting: Architectural Practice for Reducing Agent Coupling

0. Series Loop (Read Along Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pip...

📖 8 min read

Read more →

2026-05-23

LLM Unified Gateway Design — LiteLLM Abstraction, Prompt Engineering, and Response Cleaning

🎯 Core Chapter Questions: In AI application development, how to elegantly call LLM APIs? Hard-coding with requests.post()? Or writing a set of calling logic for...

📖 6 min read

Read more →

2026-05-23

LangGraph Conditional Routing in Practice: Implementing Agent Loop Dialogue and Forced Exit Mechanism

0. Series Loop (Read Along Without Public Source Code) End-to-End Pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pip...

📖 6 min read

Read more →

2026-05-22

LangGraph State Management Detailed: Correct Usage of TypedDict + Annotated Reducer

0. Series Loop (Readable Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi‑turn SSE → run_analysis_pipel...

📖 3 min read

Read more →

2026-05-22

Holland RIASEC + OpenAI-Compatible API: AI Career Assessment Engineering Implementation

0. Series Closed Loop (Read Along Even Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_...

📖 3 min read

Read more →

2026-05-22

LangGraph Multi-Agent Orchestration Practice: How 5 Agents Collaborate to Complete Career Planning Analysis

0. Series Loop (Follow Along Without Public Source Code) End-to-end pipeline: Vue frontend → api/routes/chat.py → Guide multi-turn SSE → run_analysis_pipeline (p...

📖 4 min read

Read more →

2026-05-22

NLP MySQL Intelligent Data Analysis Platform — Project Overview and Technology Selection

📖 Introduction: Why This Project? In daily enterprise operations, data querying is one of the most frequent needs, but the reality is harsh: Business users: Wa...

📖 3 min read

Read more →

2026-05-22

Building an AI Agent System from Scratch: FastAPI + LangGraph Smart Career Planning Platform

0. Series Loop (Read Along Without Public Source Code) End-to-End Pipeline: Vue Frontend → api/routes/chat.py → Guide Multi-turn SSE → run_analysis_pip...

📖 3 min read

Read more →

2026-05-21

Bid Farewell to Retrieval Hallucinations! Step-by-Step Guide to Building an Enterprise-Grade RAG Data Pipeline (with One-Click Docker Deployment)

Pain Point: Using Generic Embedding Models for RAG, Getting Irrelevant Answers in Vertical Domains? Have you ever encountered this: Using OpenAI’s text-embeddin...

📖 6 min read

Read more →

2026-05-21

Docker Containerized Deployment: Solving Timezone Mismatch and Log Mounting Issues

1. Introduction: When deploying with containers, it is common to encounter issues where the time inside the container differs from the host (typically by 8 hours)...

📖 3 min read

Read more →

2026-05-21

JWT Authentication: AccessToken has expired, how should RefreshToken be used?

1. Introduction: In JWT (JSON Web Token) based authentication systems, AccessToken is typically set with a short lifetime (15 minutes to 1 hour) to reduce securit...

📖 4 min read

Read more →

2026-05-21

Hand-build a Simplest Permission Center: RBAC Model Design and Implementation

1. Introduction: In a microservices and front-end/back-end separation architecture, permission management is one of the fundamental pieces of system security...

📖 5 min read

Read more →

2026-05-21

Introduction to Hybrid Development: A Step-by-Step Guide to Writing a JSBridge

1. Introduction: Hybrid mobile development has become the mainstream choice for balancing multi-platform development efficiency with native experience. While WebVi...

📖 4 min read

Read more →

2026-05-20

Frontend Solution for Embedding Agent on Web

1. Introduction: AI Agents are evolving from standalone applications to embedded interactive experiences on the web. The core challenge for frontend teams is: how...

📖 4 min read

Read more →

2026-05-20

Agent Request Timeout and Slow Response Optimization

1. Introduction: This article introduces the issues of request timeout and slow responses in Agent systems caused by network fluctuations, long tool execution tim...

📖 4 min read

Read more →

2026-05-20

Agent Integration with Local Knowledge Base for RAG Combined Use

1. Introduction: Traditional RAG lacks proactive reasoning and multi-step retrieval capabilities in complex Q&A scenarios. This article introduces how to inte...

📖 4 min read

Read more →

2026-05-20

Agent Task Auto-Decomposition Logic Development

1. Introduction: As large model-driven agents are gradually deployed in enterprise production environments, a common problem emerges: With the same underlying lar...

📖 4 min read

Read more →

2026-05-20

PDF extraction always missing tables? PyMuPDF + PaddleOCR-VL hybrid solution in practice (with MLX acceleration)

Pain point: Why is your table always empty when extracting PDF with pdfplumber? Let’s start with a real scenario: You have a 50-page PDF product technical specif...

📖 6 min read

Read more →

2026-05-19

Implementation of A2A Multi-Agent Inter-Communication and Invocation

1. Introduction: As AI Agent application scenarios evolve from single-point tasks to complex workflows, multi-agent cross-platform collaboration has become a nece...

📖 5 min read

Read more →

2026-05-19

Rapid Adaptation of MCP Protocol Integration with Agent

Quick Adaptation for MCP Protocol Access in Agents. 1. Introduction: In LLM application development, a common problem is the need to write adaptation code for each...

📖 4 min read

Read more →

2026-05-19

Avoiding Pitfalls in Function Call Debugging

1. Introduction: Function Call is a core capability that enables large language models to interact with external systems, allowing the model to trigger the execut...

📖 3 min read

Read more →

2026-05-19

Agent Custom Tool Development and Integration Methods

Developing and Integrating Custom Tools for Agents. 1. Introduction: Agents extend their capabilities through tool calls (Function Call), which is currently the mai...

📖 4 min read

Read more →

2026-05-19

Agent Development and Deployment: Building a Production-Grade Intelligent System from Scratch

1. Introduction: Easy to Develop Agents, Hard to Deploy Them. This article will focus on best practices for Agent development and deployment, systematically break...

📖 7 min read

Read more →

2026-05-19

How to Chunk RAG Without Losing Context? 5 Strategies from Beginner to Production-Grade (With Decision Tree for Selection)

Pain Point: Should chunk_size be 500 or 1000? Why is the effect still poor after adjusting N times? Chunking is the most underestimated step in a RAG system. Man...

📖 5 min read

Read more →

2026-05-18

Agent Long-Term Persistent Memory Practical Setup

Practical Guide to Building Long-Term Persistent Memory for Agents - Internal Knowledge Base Document. 1. Introduction: In real-world agent deployments, stateless d...

📖 4 min read

Read more →

2026-05-18

Agent Short-term Session Memory Implementation Plan

1. Introduction: The token limit of large model context windows (typically 4K–128K tokens) determines the amount of information that can be carried in a single co...

📖 5 min read

Read more →

2026-05-18

Agent Character Persona and Task Logic Design Method

1. Introduction: As LLM-driven agents are gradually deployed in enterprise production environments, a common issue has emerged: given the same underlying model, w...

📖 4 min read

Read more →

2026-05-18

BGE-M3 Local Fine-Tuning Practice: From Scratch to Production-Level Deployment (with Complete Code)

📊 Table of Contents Why Fine-tune BGE-M3? Core Capabilities of BGE-M3 Fine-tuning Environment Setup Guide Data Preparation: Building a High-Quality Training Se...

📖 11 min read

Read more →

2026-05-18

Differences in Selection Between Lightweight Agent and Enterprise Heavy-Duty Agent

Introduction: Under the dual trends of edge computing and enterprise digitalization, the technical selection of Agents involves a trade-off between efficiency and...

📖 4 min read

Read more →

2026-05-17

Core Essential Capabilities of Autonomous Decision-Making Agents

1. Introduction: Current AI systems in production environments often face scenarios that are dynamic, complex, and require independent operation. This demands tha...

📖 4 min read

Read more →

2026-05-17

Difference between Agent and ordinary large model chatbot

The Essential Difference Between Agent and Ordinary Large Model Conversational Bot. Introduction: With the widespread application of Large Language Models (LLMs), c...

📖 3 min read

Read more →

2026-05-17

Milvus Production Environment Collection Design + HNSW Tuning Practical Guide

📊 Table of Contents Why Meticulously Design a Milvus Collection? Collection Schema Best Practices Deep Dive into HNSW Index Principles Complete HNSW Para...

📖 11 min read

Read more →

2026-05-17

A Plain Explanation of What Is an AI Intelligent Agent

Understanding AI Agent: A Simple Explanation. Introduction: Large Language Models (LLMs) have demonstrated impressive capabilities — they can answer questions, writ...

📖 3 min read

Read more →

2026-05-16

RAG Implementation: Production Environment Deployment and Performance Monitoring Practice

1. Introduction: Moving a RAG system from prototype to production typically encounters three core challenges: uncontrollable response latency, fluctuating retriev...

📖 4 min read

Read more →

2026-05-16

Table 4-level Vectorization Scheme: Let RAG System Truly Understand Structured Data

📊 Table of Contents Why Tables are a Nightmare for RAG Systems? Core Idea of 4-Level Table Vectorization Detailed Design and Scenarios of 4 Levels Complete Cod...

📖 11 min read

Read more →

2026-05-15

RRF Multi-channel Fusion Ranking: The Secret Weapon That Improves RAG Retrieval Accuracy by 30%+

📊 Table of Contents Why Do We Need Multi-Channel Retrieval Fusion? Deep Dive into RRF Algorithm Core Principles Multi-Channel Retrieval Architecture in RAG Sys...

📖 9 min read

Read more →

2026-05-14

MySQL+Milvus+MinIO Three-Storage Dual-Write Architecture: Building an Enterprise-Level RAG Data Foundation

📊 Table of Contents Why Does a RAG System Need a Triple‑Storage Architecture? Triple‑Storage Responsibility Division and Design Philosophy Dual‑Write Consisten...

📖 9 min read

Read more →

2026-05-12

RAG Evaluation: End-to-End Metric Design and Effect Evaluation Framework

1. Introduction: Why RAG Evaluation Can’t Rely on “Feelings” Alone? Imagine this scenario: You painstakingly build a RAG system. The knowledge base contains tens...

📖 4 min read

Read more →

2026-05-09

RAG Advanced: Agentic RAG — Dynamic Tool Calling and Iterative Optimization

1. Introduction: The Limitations of Traditional RAG and the Value Proposition of Agentic RAG. Do you still remember the excitement of deploying your first RAG sys...

📖 7 min read

Read more →

2026-05-05

RAG Advanced: Multimodal RAG — Image-Text Mixed Retrieval and Generation

1. Introduction: The “Blindness” Dilemma of Traditional RAG and the Multimodal Breakthrough. Have you ever encountered this scenario: you feed a PDF report full o...

📖 6 min read

Read more →

2026-05-02

RAG Online Part: Generation Optimization — Self-RAG and Adaptive Retrieval

1. Introduction: The Bottleneck in Generation – Why Self-RAG and Adaptive Retrieval? RAG (Retrieval-Augmented Generation) has a common pain point: all queries ar...

📖 7 min read

Read more →

2026-04-28

RAG Online Part: Retrieval Optimization — Multi-Channel Recall and Result Fusion

1. Introduction: Single Path Not Enough – How Multi-Path Recall Solves the “Lopsided” Retrieval Problem? After deploying an RAG (Retrieval-Augmented Generation)...

📖 6 min read

Read more →

2026-04-25

RAG Online Part: Retrieval Optimization — HyDE and Query Expansion Techniques

Introduction: Why is your RAG retrieval always inaccurate? Starting with HyDE. Imagine this scenario: You’ve built an enterprise knowledge base RAG (Retrieval-Aug...

📖 5 min read

Read more →

2026-04-21

RAG Offline Part: Embedding Model Selection and Domain Adaptation Fine-tuning

1. Introduction: Why Is the Embedding Model the “Invisible Bottleneck” of RAG? Have you ever encountered this scenario? Your RAG (Retrieval-Augmented Generation)...

📖 4 min read

Read more →

2026-04-18

RAG Offline Part: Metadata Enhancement and Knowledge Graph Fusion Preprocessing

1. Introduction: Why Does RAG Offline Preprocessing Need Metadata Enhancement and Knowledge Graphs? Hi, I’m a tech blogger. Today, let’s talk about a critical ye...

📖 8 min read

Read more →

2026-04-15

RAG Offline Part: Multi-Source Heterogeneous Data Cleaning and Deduplication Strategy

Introduction: When “Multi-Source Data” Becomes the Nightmare and Turning Point of RAG. Imagine this scenario: you’re developing a smart Q&A system for e-comme...

📖 7 min read

Read more →
