Advanced Training
RAG & Knowledge Systems
Build retrieval-augmented generation pipelines that give LLMs access to your data.
Vector databases, embedding strategies, GraphRAG, and knowledge graphs: create
AI systems that know what you know.
Course Overview
RAG transforms LLMs from general-purpose chatbots into domain experts. By retrieving
relevant context before generation, you reduce hallucinations and ground responses
in your actual data. This course covers the full RAG stack, from basic retrieval to
advanced graph-based approaches.
1
Embedding Fundamentals
How text becomes vectors. Embedding models, dimensionality, and semantic similarity.
OpenAI Ada
Cohere
Sentence Transformers
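Semantic similarity between embeddings is usually measured with cosine similarity. A minimal stdlib sketch, using made-up 3-dimensional vectors (real models emit hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" - invented for illustration
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
invoice = [0.0, 0.1, 0.95]

print(cosine_similarity(cat, kitten))   # high: semantically close
print(cosine_similarity(cat, invoice))  # low: unrelated
```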
2
Vector Databases
Storing and querying embeddings at scale. Index types, filtering, and hybrid search.
Pinecone
ChromaDB
Weaviate
3
Chunking Strategies
Document preprocessing, chunk sizes, overlap, and semantic chunking approaches.
Recursive
Semantic
Agentic
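The simplest of these strategies, fixed-size chunking with overlap, fits in a few lines; the chunk and overlap sizes below are illustrative:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size chunking with overlap, so content split at a chunk
    boundary still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "word " * 300  # ~1500 characters of toy content
chunks = chunk_text(doc, chunk_size=500, overlap=50)
print(len(chunks), len(chunks[0]))  # 4 500
```

Each chunk's first 50 characters repeat the previous chunk's last 50, which is what preserves sentences that straddle a boundary.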
4
Retrieval Strategies
Beyond basic similarity search: reranking, query expansion, and multi-query retrieval.
Cohere Rerank
HyDE
Multi-Query
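One common way to merge the ranked lists that multi-query retrieval produces is reciprocal rank fusion. This sketch assumes each retriever run returns an ordered list of document IDs; the IDs and the conventional k=60 constant are illustrative:

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked doc-id lists from several query variants.
    Each doc scores sum(1 / (k + rank)); documents ranked well by
    multiple variants rise to the top."""
    scores: dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Top-3 results for the original query plus two generated variants
runs = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
print(reciprocal_rank_fusion(runs))  # doc_b first: it ranked high in all runs
```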
5
GraphRAG
Knowledge graphs for RAG. Entity extraction, relationship mapping, and graph traversal.
Neo4j
LlamaIndex
Microsoft GraphRAG
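At its core, a GraphRAG retriever walks outward from entities mentioned in the query and hands the resulting subgraph to the LLM as context. A toy in-memory triple store sketches the traversal; the entity and relation names are invented for illustration:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Minimal in-memory triple store: (subject, relation, object)."""
    def __init__(self):
        self.edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.edges[subject].append((relation, obj))

    def neighborhood(self, entity: str, hops: int = 2) -> list[tuple[str, str, str]]:
        """Breadth-first walk out to `hops` hops from the seed entity."""
        triples, frontier, seen = [], [entity], {entity}
        for _ in range(hops):
            next_frontier = []
            for node in frontier:
                for relation, obj in self.edges[node]:
                    triples.append((node, relation, obj))
                    if obj not in seen:
                        seen.add(obj)
                        next_frontier.append(obj)
            frontier = next_frontier
        return triples

kg = KnowledgeGraph()
kg.add("Acme Corp", "acquired", "WidgetCo")
kg.add("WidgetCo", "founded_by", "Ada Smith")
kg.add("Ada Smith", "lives_in", "Berlin")

# Two hops from "Acme Corp" reaches the acquisition and the founder,
# but not Ada Smith's outgoing edges
print(kg.neighborhood("Acme Corp", hops=2))
```

Production systems do the same thing against Neo4j or a similar store, with entity extraction supplying the seed nodes.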
6
Evaluation & Optimization
Measuring RAG quality. Retrieval metrics, generation quality, and continuous improvement.
RAGAS
Faithfulness
Relevance
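Two standard retrieval metrics, recall@k and mean reciprocal rank, are simple to compute by hand; the document IDs below are illustrative:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)

def mrr(retrieved: list[str], relevant: set[str]) -> float:
    """Reciprocal rank of the first relevant document (0.0 if none found)."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(retrieved, relevant, k=3))  # 0.5 - only d1 made the top 3
print(mrr(retrieved, relevant))               # 0.5 - first hit at rank 2
```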
RAG Pipeline Architecture
┌──────────────────────────────────────────────────────────────────┐
│                        INGESTION PIPELINE                        │
├──────────────────────────────────────────────────────────────────┤
│  Documents → Chunker → Embedder → Vector DB                      │
│  [PDF/MD]    [Split]   [Ada-3]    [Pinecone]                     │
│  [HTML]      [Overlap] [Cohere]   [ChromaDB]                     │
│  [JSON]      [Semantic]           [Weaviate]                     │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│                        RETRIEVAL PIPELINE                        │
├──────────────────────────────────────────────────────────────────┤
│  Query → Query Expansion → Retrieval → Reranking → Context       │
│  [User]  [HyDE/MQ]         [Top-K]     [Cohere]    [Prompt]      │
│          [Decompose]       [Hybrid]    [CrossEnc]                │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│                       GENERATION PIPELINE                        │
├──────────────────────────────────────────────────────────────────┤
│  Context + Query → Prompt Template → LLM → Response + Citations  │
│  [Merged]          [System]          [Claude]  [Grounded]        │
│                    [Few-shot]        [GPT-4]   [Sourced]         │
└──────────────────────────────────────────────────────────────────┘
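Collapsed into a toy example, the three stages look like this. The corpus, its hand-written 3-d "embeddings", and the prompt template are all stand-ins for a real embedder, vector store, and LLM call:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Ingestion result: chunks with pre-computed toy vectors
corpus = [
    ("Refunds are issued within 14 days.", [0.9, 0.1, 0.1]),
    ("Our office is in Lisbon.",           [0.1, 0.9, 0.1]),
    ("Shipping takes 3-5 business days.",  [0.2, 0.1, 0.9]),
]

def answer(query: str, query_vec: list[float], top_k: int = 2) -> str:
    # Retrieval: rank chunks by similarity, keep top-k
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    context = "\n".join(text for text, _ in ranked[:top_k])
    # Generation: fill the prompt template (a real LLM call would go here)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(answer("How long do refunds take?", [0.95, 0.05, 0.15]))
```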
Structural Patterns for RAG
RAG systems benefit from structural patterns that manage complexity and enable
flexible component swapping.
Composite
Build document trees where folders contain documents contain chunks. Process entire hierarchies uniformly with recursive operations.
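A sketch of that idea, with hypothetical Document and Folder node types sharing one chunks() interface:

```python
from abc import ABC, abstractmethod

class DocNode(ABC):
    """Composite: folders and documents share one interface."""
    @abstractmethod
    def chunks(self) -> list[str]: ...

class Document(DocNode):
    def __init__(self, text: str, chunk_size: int = 20):
        self.text, self.chunk_size = text, chunk_size

    def chunks(self) -> list[str]:
        return [self.text[i:i + self.chunk_size]
                for i in range(0, len(self.text), self.chunk_size)]

class Folder(DocNode):
    def __init__(self, children: list[DocNode]):
        self.children = children

    def chunks(self) -> list[str]:
        # Recursive: a folder's chunks are all of its children's chunks
        return [c for child in self.children for c in child.chunks()]

tree = Folder([Document("a" * 30), Folder([Document("b" * 45)])])
print(len(tree.chunks()))  # 5: two chunks from the first doc, three from the nested one
```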
Flyweight
Share embedding model instances and vector DB connections across retrievers. Avoid loading the same expensive resources multiple times.
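One way to sketch this is with functools.lru_cache as the flyweight factory; the Embedder class and model name here are placeholders for a real, expensive model load:

```python
from functools import lru_cache

class Embedder:
    """Stands in for an expensive resource (model weights, GPU memory, etc.)."""
    def __init__(self, model_name: str):
        self.model_name = model_name

@lru_cache(maxsize=None)
def get_embedder(model_name: str) -> Embedder:
    """Flyweight factory: one shared Embedder per model name."""
    return Embedder(model_name)

a = get_embedder("all-MiniLM-L6-v2")
b = get_embedder("all-MiniLM-L6-v2")
print(a is b)  # True - both retrievers share the same instance
```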
Bridge
Separate retrieval abstraction from implementation. Switch between Pinecone, ChromaDB, or Weaviate without changing retriever logic.
Template Method
Define the RAG pipeline skeleton: chunk → embed → store → retrieve → generate. Let subclasses customize specific steps.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Document:
    """Minimal retrieved-document record."""
    content: str

class RAGPipeline(ABC):
    """Template for RAG pipelines - subclasses customize steps"""

    async def process_query(self, query: str) -> str:
        """Template method - defines the algorithm structure"""
        expanded = await self._expand_query(query)
        docs = await self._retrieve(expanded)
        ranked = await self._rerank(query, docs)
        context = self._build_context(ranked)
        response = await self._generate(query, context)
        return response

    async def _expand_query(self, query: str) -> list[str]:
        """Default: no expansion. Override for HyDE, multi-query."""
        return [query]

    @abstractmethod
    async def _retrieve(self, queries: list[str]) -> list[Document]: ...

    async def _rerank(self, query: str, docs: list[Document]) -> list[Document]:
        """Default: no reranking. Override to add Cohere rerank."""
        return docs

    def _build_context(self, docs: list[Document]) -> str:
        return "\n\n".join(d.content for d in docs)

    @abstractmethod
    async def _generate(self, query: str, context: str) -> str: ...
class VectorStore(ABC):
    @abstractmethod
    async def search(self, query: str, k: int) -> list: ...

class PineconeStore(VectorStore):
    def __init__(self, embedder, index):
        self.embedder = embedder  # async embedding client
        self.index = index        # Pinecone index handle

    async def search(self, query: str, k: int) -> list:
        embedding = await self.embedder.embed(query)
        return self.index.query(embedding, top_k=k)

class ChromaStore(VectorStore):
    def __init__(self, collection):
        self.collection = collection  # ChromaDB collection (embeds internally)

    async def search(self, query: str, k: int) -> list:
        return self.collection.query(query_texts=[query], n_results=k)
Iterator Pattern for Document Processing
Memory Efficiency
When processing large document collections, use iterators to avoid loading
everything into memory. Process documents one at a time or in batches.
from pathlib import Path
from typing import Iterator

class ChunkIterator:
    """Iterator pattern for memory-efficient document processing"""

    def __init__(self, documents: list[Path], chunk_size: int = 500):
        self.documents = documents
        self.chunk_size = chunk_size
        self.doc_index = 0
        self.chunk_buffer: list[str] = []

    def __iter__(self) -> Iterator[str]:
        return self

    def __next__(self) -> str:
        # Refill the buffer from the next document whenever it runs dry
        while not self.chunk_buffer:
            if self.doc_index >= len(self.documents):
                raise StopIteration
            doc = self._load_document(self.documents[self.doc_index])
            self.chunk_buffer = self._chunk_document(doc)
            self.doc_index += 1
        return self.chunk_buffer.pop(0)

    def _load_document(self, path: Path) -> str:
        return path.read_text()

    def _chunk_document(self, text: str) -> list[str]:
        return [text[i:i + self.chunk_size]
                for i in range(0, len(text), self.chunk_size)]

async def ingest_documents(paths: list[Path], vector_store, embedder):
    # Only one document's chunks are held in memory at a time
    for chunk in ChunkIterator(paths):
        embedding = await embedder.embed(chunk)
        await vector_store.upsert(chunk, embedding)
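When an embedding API accepts batches, the same streaming approach still works: group the iterator's output into fixed-size batches with a small stdlib helper (the batch size here is illustrative):

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

def batched(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Group any iterator's output into fixed-size batches without
    materializing the whole stream - handy for batch embedding calls."""
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch

chunks = (f"chunk-{i}" for i in range(7))    # stands in for ChunkIterator output
print([len(b) for b in batched(chunks, 3)])  # [3, 3, 1]
```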
Vector Database Comparison
| Database | Best For | Scaling | Setup |
|----------|----------|---------|-------|
| Pinecone | Production, managed service | Automatic, serverless | pip install pinecone-client |
| ChromaDB | Local development, prototyping | Single machine | pip install chromadb |
| Weaviate | Hybrid search, GraphQL API | Kubernetes, cloud | Docker or Weaviate Cloud |
| Qdrant | Advanced filtering, Rust performance | Cluster mode | pip install qdrant-client |
| pgvector | Existing Postgres infrastructure | Postgres scaling | Postgres extension |
Hands-On Projects
- Build a basic RAG system with ChromaDB and sentence-transformers
- Implement the Template Method pattern for a configurable RAG pipeline
- Create a Bridge pattern to swap between Pinecone and ChromaDB
- Build a document iterator that processes 10GB of PDFs efficiently
- Implement HyDE (Hypothetical Document Embeddings) for query expansion
- Add Cohere reranking to improve retrieval quality
- Build a GraphRAG system with Neo4j and entity extraction
- Evaluate your RAG system with RAGAS metrics
Ready to Build Knowledge Systems?
Give your AI applications access to enterprise knowledge. Continue with
Fine-Tuning & Customization to create domain-specific models.
Enroll Now