This series collects practical engineering practices for RAG data pipelines, organized as “Overview → Ingestion → Chunking → Vectorization → Retrieval → Fusion → Foundation”. The code and architecture are meant to serve as a direct reference when choosing a production stack.
Relationship with the In-Site Theory Series
| Track | Description | Entry |
|---|---|---|
| This Series (Engineering Practice) | 8 articles on how to do it: Docker, extraction, chunking, Milvus, RRF, etc. | RAG Pipeline Series in the top navigation · directory further down this page |
| RAG Full-Link Theory Series | 11 articles on why it works: cleaning, metadata, Embedding, multi-path recall, Self-RAG, evaluation, and deployment | RAG Series |
Each practical article ends with an “In-Site Theory Extension” that links to the theory article covering that chapter’s technical point, so you can trace the engineering practice back to its methodology.
Recommended Reading Order
- Start with Chapter 1 to get an end-to-end view of the pipeline (one-click Docker deployment)
- Then choose by role: data engineers focus on Chapters 2–3, algorithm engineers on Chapters 4–7, and architects on Chapter 8
- When you hit a conceptual gap, follow the theory link at the end of the article to the RAG Theory Series