Course Outline
Foundations of Agentic Systems in Production
- Agentic architectures: loops, tools, memory, and orchestration layers.
- The lifecycle of agents: development, deployment, and continuous operation.
- Challenges associated with managing agents at production scale.
Infrastructure and Deployment Models
- Deploying agents within containerized and cloud environments.
- Scaling patterns: horizontal vs. vertical scaling, concurrency, and throttling.
- Multi-agent orchestration and workload balancing.
Monitoring and Observability
- Key metrics: latency, success rate, memory usage, and agent call depth.
- Tracing agent activity and call graphs.
- Instrumenting observability using Prometheus, OpenTelemetry, and Grafana.
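As a rough illustration of the key metrics listed above (latency, success rate, and agent call depth), the following is a minimal in-memory sketch using only the Python standard library. The `AgentMetrics` class and `timed_call` helper are hypothetical names introduced for this example; they are not part of Prometheus, OpenTelemetry, or Grafana, and a real deployment would export these values through one of those tools instead.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass
class AgentMetrics:
    """In-memory counters for one agent (hypothetical helper, not a library API)."""

    latencies: list = field(default_factory=list)
    successes: int = 0
    failures: int = 0
    max_call_depth: int = 0

    def record(self, latency_s, ok, call_depth):
        """Store one observation: latency in seconds, outcome, and call depth."""
        self.latencies.append(latency_s)
        if ok:
            self.successes += 1
        else:
            self.failures += 1
        self.max_call_depth = max(self.max_call_depth, call_depth)

    def success_rate(self):
        """Fraction of recorded calls that succeeded (0.0 when nothing is recorded)."""
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

    def p95_latency(self):
        """95th percentile latency using the nearest-rank method."""
        if not self.latencies:
            return 0.0
        ordered = sorted(self.latencies)
        rank = max(1, math.ceil(0.95 * len(ordered)))
        return ordered[rank - 1]


def timed_call(metrics, fn, call_depth=1):
    """Run fn(), recording its latency and outcome; exceptions are re-raised."""
    start = time.perf_counter()
    try:
        result = fn()
    except Exception:
        metrics.record(time.perf_counter() - start, False, call_depth)
        raise
    metrics.record(time.perf_counter() - start, True, call_depth)
    return result
```

Wrapping each tool or model call in `timed_call` keeps the measurement logic in one place, which makes it easier to swap the in-memory store for a real exporter later.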
Logging, Auditing, and Compliance
- Centralized logging and structured event collection.
- Compliance and auditability within agentic workflows.
- Designing audit trails and replay mechanisms for debugging.
Performance Tuning and Resource Optimization
- Reducing inference overhead and optimizing agent orchestration cycles.
- Model caching and lightweight embeddings for faster retrieval.
- Load testing and stress scenarios for AI pipelines.
Cost Control and Governance
- Understanding cost drivers for agents: API calls, memory, compute, and external integrations.
- Tracking agent-level costs and implementing chargeback models.
- Automation policies to prevent agent sprawl and idle resource consumption.
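To make agent-level cost tracking and chargeback more concrete, here is a small sketch that records usage per cost driver and rolls it up by owning team. Everything in it is an assumption for illustration: the `CostLedger` class, the driver names, and the unit prices are hypothetical and do not reflect any real provider's pricing.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical unit prices in USD; real prices depend on the provider.
PRICES = {
    "api_call": Decimal("0.002"),
    "compute_second": Decimal("0.0001"),
    "memory_gb_hour": Decimal("0.005"),
}


class CostLedger:
    """Tracks usage per agent and cost driver, then charges it back to teams."""

    def __init__(self, prices):
        self.prices = prices
        self._usage = defaultdict(lambda: defaultdict(Decimal))
        self._owner = {}

    def register(self, agent_id, team):
        """Declare which team pays for an agent."""
        self._owner[agent_id] = team

    def record(self, agent_id, driver, quantity):
        """Add usage for one cost driver; unknown drivers are rejected."""
        if driver not in self.prices:
            raise KeyError(f"unknown cost driver: {driver}")
        self._usage[agent_id][driver] += Decimal(str(quantity))

    def agent_cost(self, agent_id):
        """Total cost for one agent across all drivers."""
        usage = self._usage.get(agent_id, {})
        return sum((self.prices[d] * q for d, q in usage.items()), Decimal("0"))

    def chargeback(self):
        """Total cost per team; agents without an owner go to 'unassigned'."""
        totals = defaultdict(Decimal)
        for agent_id in self._usage:
            team = self._owner.get(agent_id, "unassigned")
            totals[team] += self.agent_cost(agent_id)
        return dict(totals)
```

Grouping unowned agents under "unassigned" is one simple way to surface agent sprawl: any cost in that bucket points to agents nobody has claimed.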
CI/CD and Rollout Strategies for Agents
- Integrating agent pipelines into CI/CD systems.
- Testing, versioning, and rollback strategies for iterative agent updates.
- Progressive rollouts and safe deployment mechanisms.
Failure Recovery and Reliability Engineering
- Designing for fault tolerance and graceful degradation.
- Retry, timeout, and circuit breaker patterns for agent reliability.
- Incident response and post-mortem frameworks for AI operations.
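The retry and circuit breaker patterns named above can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not a production implementation: `CircuitBreaker`, `CircuitOpenError`, and `retry` are hypothetical names, the breaker is not thread-safe, and per-call timeouts are assumed to be enforced inside the wrapped function.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Opens after consecutive failures and allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; call skipped")
            # Cooldown elapsed: fall through and allow one trial call (half-open).
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def retry(fn, attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Call fn() up to `attempts` times with exponential backoff between tries."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # Retrying an open circuit would only add load.
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
```

A typical combination would be `retry(lambda: breaker.call(invoke_agent))`, where `invoke_agent` stands in for whatever function performs the agent or tool call. The injectable `clock` and `sleep` parameters exist mainly so the behaviour can be tested without real waiting.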
Capstone Project
- Build and deploy an agentic AI system with comprehensive monitoring and cost tracking.
- Simulate load, measure performance, and optimize resource usage.
- Present the final architecture and monitoring dashboard to peers.
Summary and Next Steps
Requirements
- Solid understanding of MLOps and production machine learning systems.
- Experience with containerized deployments (Docker/Kubernetes).
- Familiarity with cloud cost optimization and observability tools.
Target Audience
- MLOps engineers.
- Site Reliability Engineers (SREs).
- Engineering managers overseeing AI infrastructure.
21 Hours
Testimonials (3)
The trainer is patient and very helpful. He knows the topic well.
CLIFFORD TABARES - Universal Leaf Philippines, Inc.
Course - Agentic AI for Business Automation: Use Cases & Integration
Good mix of knowledge and practice
Ion Mironescu - Facultatea S.A.I.A.P.M.
Course - Agentic AI for Enterprise Applications
The mix of theory and practice and of high level and low level perspectives