1. Introduction
- Lesson Material
- Thoughts on Software Architecture
- What is a Large Language Model?
- Understanding Inference
- Large Language Models Come in Many Sizes and Flavors
- Retrieval vs Generative Models
- Retrieval-based Models
- Generative Models
- Hybrid Models
- Tokenization: Breaking Text into Pieces
- Context Size: How Much Information Can a Language Model Use During Inference?
- What is Context Size?
- Why is Context Size Important?
- Examples of Language Models with Different Context Sizes
- Choosing the Right Context Size
- Finding Needles in Haystacks
- Modalities: Beyond Text
- What are Modalities?
- Multimodal Language Models
- Benefits and Applications of Multimodal Models
- Provider Ecosystems
- OpenAI
- Anthropic
- Meta
- Cohere
- Ollama
- Multi-Model Platforms
- Choosing an LLM Provider
- OpenRouter
- Thinking About Performance
- Experimenting With Different LLM Models
- Compound AI Systems
- Deployment Patterns for Compound AI Systems
- Question and Answer
- Multi-Agent/Agentic Problem Solvers
- Conversational AI
- CoPilots
- Roles in Compound AI Systems
- Generator
- Retriever
- Ranker
- Classifier
- Tools & Agents
- Exercise
- Exercise 1
- Quiz
- Quiz 1
- Part I: Fundamental Approaches & Techniques
2. Narrow The Path
- Lesson Material
- Latent Space: Incomprehensibly Vast
- How The Path Gets “Narrowed”
- Turning Down The Temperature
- Hyperparameters: Knobs and Dials of Inference
- Raw Versus Instruct-Tuned Models
- Raw Models: The Unfiltered Canvas
- Instruct-Tuned Models: The Guided Experience
- Choosing the Right Kind of Model for Your Project
- Prompt Engineering
- The Building Blocks of Effective Prompts
- The Art and Science of Prompt Design
- Prompt Engineering Techniques and Best Practices
- Zero-Shot Learning: When No Examples Are Needed
- One-Shot Learning: When a Single Example Can Make a Difference
- Few-Shot Learning: When Multiple Examples Can Improve Performance
- Example: Prompts Can Be Much More Complex Than You Imagine
- Experimentation and Iteration
- The Art of Vagueness
- Why Anthropomorphism Dominates Prompt Engineering
- Separating Instructions from Data: A Crucial Principle
- Prompt Distillation
- How It Works
- Initial Prompt Generation
- Prompt Refinement
- Prompt Compression
- System Directive and Context Integration
- Final Prompt Assembly
- Key Benefits
- What about fine-tuning?
- Exercise
- Exercise 2
- Quiz
- Quiz 2
3. Retrieval Augmented Generation (RAG)
- Lesson Material
- What is Retrieval Augmented Generation?
- How Does RAG Work?
- Why Use RAG in Your Applications?
- Implementing RAG in Your Application
- Preparation of Knowledge Sources (Chunking)
- Proposition Chunking
- Implementation Notes
- Quality Check
- Benefits of Proposition-Based Retrieval
- Real-World Examples of RAG
- Case Study: RAG in a Tax Preparation Application Without Embeddings
- Intelligent Query Optimization (IQO)
- Reranking
- RAG Assessment (RAGAs)
- Faithfulness
- Answer Relevance
- Context Precision
- Context Relevancy
- Context Recall
- Context Entities Recall
- Answer Semantic Similarity (ANSS)
- Answer Correctness
- Aspect Critique
- Challenges and Future Outlook
- Semantic Chunking: Enhancing Retrieval with Context-Aware Segmentation
- Hierarchical Indexing: Structuring Data for Improved Retrieval
- Self-RAG: A Self-Reflective Enhancement
- HyDE: Hypothetical Document Embeddings
- What is Contrastive Learning?
- Exercise
- Exercise 3
- Quiz
- Quiz 3
4. Multitude of Workers
- Lesson Material
- AI Workers As Independent Reusable Components
- Account Management
- E-commerce Applications
- Product Recommendations
- Fraud Detection
- Customer Sentiment Analysis
- Healthcare Applications
- Patient Intake
- Patient Risk Assessment
- AI Worker as a Process Manager
- Store Your Trigger Messages
- Integrating AI Workers Into Your Application Architecture
- Designing Clear Interfaces and Communication Protocols
- Handling Data Flow and Synchronization
- Managing the Lifecycle of AI Workers
- Composability and Orchestration of AI Workers
- Chaining AI Workers for Multi-Step Workflows
- Parallel Processing for Independent AI Workers
- Ensemble Techniques for Improved Accuracy
- Dynamic Selection and Invocation of AI Workers
- Combining Traditional NLP with LLMs
- Exercise
- Exercise 4
- Quiz
- Quiz 4
5. Tool Use
- Lesson Material
- What is Tool Use?
- The Potential of Tool Use
- The Tool Use Workflow
- Include function definitions in your request context
- Dynamic Tool Selection
- Forced (aka Explicit) Tool Selection
- Tool Choice Parameter
- Forcing a Function To Get Structured Output
- Execution of Function(s)
- Optional Continuation of the Original Prompt
- Best Practices for Tool Use
- Descriptive Definitions
- Processing of Tool Results
- Error Handling
- Iterative Refinement
- Composing and Chaining Tools
- Future Directions
- Exercise
- Exercise 5
- Quiz
- Quiz 5
6. Stream Processing
- Lesson Material
- Implementing a ReplyStream
- The “Conversation Loop”
- Auto Continuation
- Conclusion
- Exercise
- Exercise 6
- Quiz
- Quiz 6
7. Self Healing Data
- Lesson Material
- Practical Case Study: Fixing Broken JSON
- Considerations and Counterindications
- Data Criticality
- Error Severity
- Domain Complexity
- Explainability and Transparency
- Unintended Consequences
- Exercise
- Exercise 7
- Quiz
- Quiz 7
8. Contextual Content Generation
- Lesson Material
- Personalization
- Productivity
- Rapid Iteration and Experimentation
- Scalability and Efficiency
- AI-Powered Localization
- The Importance of User Testing and Feedback
- Exercise
- Exercise 8
- Quiz
- Quiz 8
9. Generative UI
- Lesson Material
- Generating Copy for User Interfaces
- Personalized Forms
- Contextual Field Suggestions
- Adaptive Field Ordering
- Personalized Microcopy
- Personalized Validation
- Progressive Disclosure
- Context-Aware Explanatory Text
- Defining Generative UI
- Example
- The Shift to Outcome-Oriented Design
- Challenges and Considerations
- Future Outlook and Opportunities
- Exercise
- Exercise 9
- Quiz
- Quiz 9
10. Intelligent Workflow Orchestration
- Lesson Material
- Business Need
- Key Benefits
- Key Patterns
- Dynamic Task Routing
- Contextual Decision Making
- Adaptive Workflow Composition
- Exception Handling and Recovery
- Implementing Intelligent Workflow Orchestration in Practice
- Intelligent Order Processor
- Intelligent Content Moderator
- Predictive Task Scheduling in a Customer Support System
- Exception Handling and Recovery in a Data Processing Pipeline
- Monitoring and Logging
- Monitoring Workflow Progress and Performance
- Logging Key Events and Decisions
- Benefits of Monitoring and Logging
- Considerations and Best Practices
- Scalability and Performance Considerations
- Handling High Volumes of Concurrent Workflows
- Optimizing Performance of AI-Powered Components
- Monitoring and Profiling Performance
- Scaling Strategies
- Performance Optimization Techniques
- Testing and Validation of Workflows
- Unit Testing Workflow Components
- Integration Testing Workflow Interactions
- Testing AI Decision Points
- End-to-End Testing
- Continuous Integration and Deployment
- Exercise
- Exercise 10
- Quiz
- Quiz 10
- Part II: The Patterns
11. Prompt Engineering
- Lesson Material
- Chain of Thought
- How It Works
- Examples
- Content Generation
- Structured Entity Creation
- LLM Agent Guidance
- Benefits and Considerations
- Mode Switch
- How It Works
- When to Use It
- Example
- Role Assignment
- How It Works
- When to Use It
- Examples
- Prompt Object
- How It Works
- Prompt Template
- How It Works
- Benefits and Considerations
- When to Use It
- Example
- Structured IO
- How It Works
- Scaling Structured IO
- Benefits and Considerations
- Prompt Chaining
- How It Works
- When to Use It
- Example: Olympia’s Onboarding
- Prompt Rewriter
- How It Works
- Example
- Response Fencing
- How It Works
- Benefits and Considerations
- Error Handling
- Query Analyzer
- How It Works
- Implementation
- Part-of-Speech (POS) Tagging and Named Entity Recognition (NER)
- Intent Classification
- Keyword Extraction
- Benefits
- Query Rewriter
- How It Works
- Example
- Benefits
- Ventriloquist
- How It Works
- When to Use It
- Example
- Exercise
- Exercise 11
- Quiz
- Quiz 11
12. Discrete Components
- Lesson Material
- Predicate
- How It Works
- When to Use It
- Example
- API Facade
- How It Works
- Key Benefits
- When to Use It
- Example
- Authentication and Authorization
- Request Handling
- Response Formatting
- Error Handling and Edge Cases
- Scalability and Performance Considerations
- Comparison with Other Design Patterns
- Result Interpreter
- How It Works
- When to Use It
- Example
- Virtual Machine
- How It Works
- When to Use It
- Example
- Behind The Magic
- Specification and Testing
- Specifying the Behavior
- Writing Test Cases
- Example: Testing the Translator Component
- Replay of HTTP Interactions
- Exercise
- Exercise 12
- Quiz
- Quiz 12
13. Human In The Loop (HITL)
- Lesson Material
- High-Level Patterns
- Hybrid Intelligence
- Adaptive Response
- Human-AI Role Switching
- Escalation
- How It Works
- Key Benefits
- Real-World Application: Healthcare
- Feedback Loop
- How It Works
- Applications and Examples
- Advanced Techniques in Human Feedback Integration
- Passive Information Radiation
- How It Works
- Contextual Information Display
- Proactive Notifications
- Explanatory Insights
- Interactive Exploration
- Key Benefits
- Applications and Examples
- Collaborative Decision Making (CDM)
- How It Works
- Example
- Continuous Learning
- How It Works
- Applications and Examples
- Example
- Ethical Considerations
- Role of HITL in Mitigating AI Risks
- Technological Advancements and Future Outlook
- Challenges and Limitations of HITL Systems
- Exercise
- Exercise 13
- Quiz
- Quiz 13
14. Intelligent Error Handling
- Lesson Material
- Traditional Error Handling Approaches
- Contextual Error Diagnosis
- How It Works
- Prompt Engineering for Contextual Error Diagnosis
- Retrieval-Augmented Generation for Contextual Error Diagnosis
- Intelligent Error Reporting
- Predictive Error Prevention
- How It Works
- Smart Error Recovery
- How It Works
- Personalized Error Communication
- How It Works
- Adaptive Error Handling Workflow
- How It Works
- Exercise
- Exercise 14
- Quiz
- Quiz 14
15. Quality Control
- Lesson Material
- Eval
- Problem
- Solution
- How It Works
- Example
- Considerations
- Understanding Golden References
- How Reference-Free Evals Work
- Guardrail
- Problem
- Solution
- How It Works
- Example
- Considerations
- Guardrails and Evals: Two Sides of the Same Coin
- The Interchangeability of Guardrails and Reference-Free Evals
- Implementing Dual-Purpose Guardrails and Evals
- Exercise
- Exercise 15
- Quiz
- Quiz 15
- Part III: Glossary
- A
- B
- C
- D
- E
- F
- G
- H
- I
- J
- K
- L
- M
- N
- O
- P
- Q
- R
- S
- T
- U
- V
- W
- Z
