SOA-C02
Complete Study Guide
Master deployment, management, and operations of secure, highly available, fault-tolerant workloads on AWS. All 6 exam domains covered — including the unique hands-on lab component.
Domain Breakdown
The SOA-C02 exam covers six domains. Unlike other associate exams, it includes an exam lab component where you perform hands-on tasks in the AWS console — a unique feature of this certification.
| Domain | Topic | Weight | Key Services |
|---|---|---|---|
| 1 | Monitoring, Logging, and Remediation | 20% | CloudWatch, Config, CloudTrail, Systems Manager |
| 2 | Reliability and Business Continuity | 16% | Auto Scaling, ELB, Route 53, RDS Multi-AZ, Backup |
| 3 | Deployment, Provisioning, and Automation | 18% | CloudFormation, SSM, Beanstalk, EC2 Image Builder |
| 4 | Security and Compliance | 16% | IAM, KMS, Inspector, GuardDuty, Shield, WAF |
| 5 | Networking and Content Delivery | 18% | VPC, CloudFront, Route 53, Transit Gateway, Direct Connect |
| 6 | Cost and Performance Optimization | 12% | Cost Explorer, Trusted Advisor, Compute Optimizer |
Monitoring, Logging & Remediation
The largest domain. Focus on CloudWatch in depth — metrics, alarms, logs, dashboards, and anomaly detection. Understand AWS Config for compliance and CloudTrail for auditing.
CloudWatch · CloudTrail · AWS Config · EventBridge · Systems Manager OpsCenter
CloudWatch Metrics
- Default EC2 metrics: CPU, Network, Disk I/O — not memory or disk space
- Detailed monitoring: 1-minute granularity (default: 5 min)
- Custom metrics via PutMetricData API; use a custom Namespace
- High-resolution custom metrics: 1-second granularity
- Metric retention: 15 months (granularity degrades over time)
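Because memory is not a default EC2 metric, it has to be published as a custom metric. The sketch below only builds the keyword arguments for a PutMetricData call; the namespace `Custom/System`, the metric name `MemoryUtilization`, and the instance ID are hypothetical values chosen for illustration, and the memory reading is passed in rather than measured.

```python
from datetime import datetime, timezone


def build_memory_metric(instance_id: str, used_percent: float,
                        high_resolution: bool = False) -> dict:
    """Build keyword arguments for a CloudWatch PutMetricData call."""
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError("used_percent must be between 0 and 100")
    return {
        "Namespace": "Custom/System",  # hypothetical custom namespace
        "MetricData": [{
            "MetricName": "MemoryUtilization",  # hypothetical metric name
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "Timestamp": datetime.now(timezone.utc),
            "Value": used_percent,
            "Unit": "Percent",
            # 1 = high-resolution (1-second), 60 = standard resolution
            "StorageResolution": 1 if high_resolution else 60,
        }],
    }


params = build_memory_metric("i-0123456789abcdef0", 72.5, high_resolution=True)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("cloudwatch").put_metric_data(**params)
```

In practice the CloudWatch agent usually publishes memory and disk metrics, so a hand-rolled call like this is mostly useful for application-level metrics.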
CloudWatch Alarms
- States: OK, ALARM, INSUFFICIENT_DATA
- INSUFFICIENT_DATA ≠ error — just no data in evaluation period
- Actions: SNS notification, EC2 action, ASG scaling, Systems Manager OpsCenter
- Composite Alarms: AND/OR logic across multiple alarms — reduces alarm noise
- Alarm Math: use metric math expressions for complex conditions
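The three alarm states can be illustrated with a simplified model of a "greater than threshold" alarm. This is a rough sketch, not CloudWatch's actual evaluation logic: it assumes missing datapoints are simply skipped, whereas a real alarm has configurable missing-data treatment. The function and parameter names are made up for this example.

```python
from typing import Optional, Sequence


def alarm_state(datapoints: Sequence[Optional[float]], threshold: float,
                evaluation_periods: int, datapoints_to_alarm: int) -> str:
    """Simplified model of a GreaterThanThreshold alarm.

    `datapoints` is ordered oldest to newest; None marks a period with no data.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    window = list(datapoints)[-evaluation_periods:]
    present = [d for d in window if d is not None]
    if not present:
        # No data at all in the evaluation window: not an error, just unknown.
        return "INSUFFICIENT_DATA"
    breaching = sum(1 for d in present if d > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


print(alarm_state([40, 85, 90, 95], threshold=80,
                  evaluation_periods=3, datapoints_to_alarm=2))
print(alarm_state([None, None, None], threshold=80,
                  evaluation_periods=3, datapoints_to_alarm=2))
```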
CloudWatch Logs
- Log Groups → Log Streams → Log Events
- Retention: 1 day to never (set per log group)
- Metric Filters: create CloudWatch metrics from log patterns
- Log Insights: SQL-like queries across log groups
- Subscription Filters: real-time stream to Lambda/Kinesis/Firehose
- Export to S3 via CreateExportTask
AWS CloudTrail
- Records all API calls across your account — "who did what"
- Management Events: default on. Data Events: S3/Lambda (must enable)
- Insights Events: detect unusual API activity patterns
- Trail logs stored in S3 (encrypted by default with SSE-S3)
- Enable log file validation (SHA-256 hash) for integrity
- Multi-region trail: default for trails created in the console; trails created via CLI/API are single-region unless multi-region is specified
AWS Config
- Records configuration changes to AWS resources over time
- Config Rules: evaluate compliance (managed or custom Lambda)
- Remediation: auto-remediate non-compliant resources via SSM Automation
- Aggregator: multi-account, multi-region compliance view
- Conformance Packs: bundle of Config rules + remediation actions
- Config ≠ CloudTrail: Config tracks state, CloudTrail tracks API calls
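A custom Lambda-backed Config rule boils down to mapping a configuration item to a compliance result. The sketch below shows that mapping for EBS volume encryption; it assumes the configuration item exposes an `encrypted` flag under `configuration`, and it leaves out the actual Lambda handler. A managed rule already covers this check, so treat it purely as an illustration of the shape.

```python
import json


def evaluate_volume_encryption(configuration_item: dict) -> str:
    """Return a Config compliance type for a single configuration item."""
    if configuration_item.get("resourceType") != "AWS::EC2::Volume":
        return "NOT_APPLICABLE"
    if configuration_item.get("configurationItemStatus") == "ResourceDeleted":
        return "NOT_APPLICABLE"
    encrypted = configuration_item.get("configuration", {}).get("encrypted", False)
    return "COMPLIANT" if encrypted else "NON_COMPLIANT"


def build_evaluation(event: dict) -> dict:
    """Build one entry for PutEvaluations from a Config rule invocation event."""
    item = json.loads(event["invokingEvent"])["configurationItem"]
    return {
        "ComplianceResourceType": item["resourceType"],
        "ComplianceResourceId": item["resourceId"],
        "ComplianceType": evaluate_volume_encryption(item),
        "OrderingTimestamp": item["configurationItemCaptureTime"],
    }

# Inside a Lambda handler, the result could be reported with:
#   boto3.client("config").put_evaluations(
#       Evaluations=[build_evaluation(event)], ResultToken=event["resultToken"])
```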
Systems Manager (Monitoring)
- OpsCenter: operational work items (OpsItems) from CloudWatch, Config
- Explorer: aggregated view of OpsData across accounts/regions
- Incident Manager: automated runbooks during incidents
- SSM Agent: must be installed on EC2 (pre-installed on Amazon Linux 2+)
- Instance Profile needs AmazonSSMManagedInstanceCore policy
/opt/aws/amazon-cloudwatch-agent/ config. Use SSM Parameter Store to centrally manage the agent config file across a fleet of instances.
| Service | What It Tracks | Key Use Case | Exam Tip |
|---|---|---|---|
| CloudWatch Metrics | Performance data over time | Auto Scaling triggers, alarms | Memory not in default metrics |
| CloudWatch Logs | Application/system log data | Debug, pattern matching, alerting | Metric Filters create custom metrics |
| CloudTrail | API call history (who/what/when) | Security audit, compliance | Data events off by default |
| AWS Config | Resource configuration state | Compliance, drift detection | Auto-remediation via SSM |
| EventBridge | Events from AWS services | Trigger Lambda, Step Functions | Schedule: rate() or cron() |
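The `rate()` and `cron()` schedule syntax in the EventBridge row has two common pitfalls: `rate()` uses a singular unit when the value is 1, and `cron()` takes six fields with a `?` in either day-of-month or day-of-week. The helpers below are a rough sketch with made-up names; `looks_like_cron` only checks the overall shape, not every field.

```python
def rate_expression(value: int, unit: str) -> str:
    """Build a rate() expression; `unit` is the singular 'minute', 'hour' or 'day'."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour' or 'day'")
    suffix = "" if value == 1 else "s"  # rate(1 minute) but rate(5 minutes)
    return f"rate({value} {unit}{suffix})"


def looks_like_cron(expression: str) -> bool:
    """Shape check only: six fields, with '?' in day-of-month or day-of-week."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        return False
    fields = expression[5:-1].split()
    if len(fields) != 6:
        return False
    day_of_month, day_of_week = fields[2], fields[4]
    return "?" in (day_of_month, day_of_week)


print(rate_expression(1, "minute"))
print(rate_expression(5, "minute"))
print(looks_like_cron("cron(0 12 * * ? *)"))  # 12:00 UTC every day
```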
Reliability & Business Continuity
Design for high availability and fault tolerance. Focus on Auto Scaling groups, Elastic Load Balancing, Route 53 failover, RDS Multi-AZ, and disaster recovery strategies.
Auto Scaling · ELB · Route 53 · RDS Multi-AZ · S3 Resilience · AWS Backup · DR Strategies
EC2 Auto Scaling
- Launch Template (preferred) or Launch Configuration
- Scaling Policies: Target Tracking (simplest), Step Scaling, Scheduled
- Target Tracking: maintain metric (e.g. CPU=50%) automatically
- Cooldown period: default 300s — prevents rapid scale in/out cycles
- Lifecycle Hooks: pause instances during launch/terminate for custom actions
- Warm Pool: pre-warmed instances in stopped state for fast scale-out
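A target tracking policy such as "keep average CPU at 50%" can be expressed as a small request payload. The sketch below builds arguments in the shape expected by the Auto Scaling PutScalingPolicy API; the group name `web-asg` and the policy naming scheme are hypothetical.

```python
def target_tracking_policy(asg_name: str, target_cpu: float) -> dict:
    """Build keyword arguments for an Auto Scaling PutScalingPolicy call."""
    if not 0.0 < target_cpu <= 100.0:
        raise ValueError("target_cpu must be a percentage between 0 and 100")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
            "DisableScaleIn": False,  # allow the policy to scale in as well as out
        },
    }


policy = target_tracking_policy("web-asg", 50.0)
# With boto3 available: boto3.client("autoscaling").put_scaling_policy(**policy)
```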
Elastic Load Balancing
- ALB: Layer 7, path/host routing, WebSocket, Lambda targets
- NLB: Layer 4, ultra-low latency, static IP, TCP/UDP/TLS
- CLB: legacy, avoid for new deployments
- GWLB: Layer 3/4, bump-in-the-wire for security appliances
- Connection Draining / Deregistration Delay: 300s default
- Cross-Zone Load Balancing: distribute across all AZs evenly
Route 53 Health Checks & Routing
- Endpoint health checks: HTTP/HTTPS/TCP probes every 30s (fast: 10s)
- Failover routing: primary/secondary; secondary activates when primary unhealthy
- Calculated health checks: combine multiple health checks with AND/OR
- CloudWatch Alarm-based health checks: for private endpoints
- Routing policies: Simple, Weighted, Latency, Failover, Geolocation, Multi-value
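Calculated health checks and failover routing can be modeled in a few lines. In this simplified sketch a calculated check is healthy when at least `health_threshold` child checks are healthy, so AND corresponds to a threshold equal to the number of children and OR to a threshold of 1. The function names and endpoint labels are illustrative only, and the failover helper ignores the case where both endpoints are unhealthy.

```python
from typing import Sequence


def calculated_health(child_statuses: Sequence[bool], health_threshold: int) -> bool:
    """Healthy when at least `health_threshold` child checks report healthy."""
    return sum(1 for status in child_statuses if status) >= health_threshold


def resolve_failover(primary_healthy: bool, primary: str, secondary: str) -> str:
    """Active-passive failover: answer with the secondary only if the primary fails."""
    return primary if primary_healthy else secondary


children = [True, True, False]
and_result = calculated_health(children, health_threshold=len(children))  # AND
or_result = calculated_health(children, health_threshold=1)               # OR
print(resolve_failover(and_result, "primary-alb", "secondary-alb"))
print(resolve_failover(or_result, "primary-alb", "secondary-alb"))
```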
RDS High Availability
- Multi-AZ: synchronous replication, automatic failover (~1-2 min), no read traffic to standby
- Read Replicas: asynchronous, for read scaling, can promote to standalone
- Multi-AZ DB Cluster: up to 2 readable standby instances
- Automated backups: 1–35 days retention, point-in-time restore
- Snapshots: manual, retained until deleted, can copy cross-region
S3 Resilience
- 11 nines durability (99.999999999%) — data stored across ≥3 AZs
- Versioning: retain all versions, delete markers instead of deleting
- MFA Delete: require MFA to delete versions or change versioning state
- Cross-Region Replication (CRR): async replication to another region
- Same-Region Replication (SRR): replicate within region for log aggregation
AWS Backup
- Centralized backup service for EC2, EBS, RDS, DynamoDB, EFS, S3
- Backup Plans: define schedules, lifecycle rules, retention
- Backup Vault Lock: WORM — prevent backup deletion (Compliance/Governance mode)
- Cross-Region and cross-account backup copies
- On-demand backups alongside scheduled plans
| DR Strategy | RTO | RPO | Cost | Description |
|---|---|---|---|---|
| Backup & Restore | Hours | Hours | $ | Restore from backups/snapshots in new region |
| Pilot Light | ~10 min | Minutes | $$ | Core services running; scale out on failover |
| Warm Standby | Minutes | Seconds | $$$ | Scaled-down full copy; scale up on failover |
| Multi-Site Active/Active | ~0 | ~0 | $$$$ | Full capacity in multiple regions simultaneously |
Deployment, Provisioning & Automation
Automate infrastructure provisioning and application deployments using CloudFormation, Systems Manager, and Elastic Beanstalk. Understand AMI management and IaC best practices.
CloudFormation · SSM · Elastic Beanstalk · EC2 Image Builder · Service Catalog
CloudFormation
- Stacks: deploy/update/delete infrastructure as a unit
- Change Sets: preview changes before applying; recommended (not mandatory) for production updates
- Drift Detection: find manual changes made outside CloudFormation
- StackSets: deploy across multiple accounts + regions from one template
- Nested Stacks: reuse common patterns as child stacks
- DeletionPolicy: Retain or Snapshot for stateful resources
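As an illustration of DeletionPolicy, the sketch below assembles a minimal template as a Python dictionary and prints it as JSON. The logical IDs `LogBucket` and `DataVolume` and the volume size are hypothetical; the point is only where the attribute sits (next to `Type`, not inside `Properties`).

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "LogBucket": {  # hypothetical logical ID
            "Type": "AWS::S3::Bucket",
            # Keep the bucket and its objects when the stack is deleted.
            "DeletionPolicy": "Retain",
        },
        "DataVolume": {  # hypothetical logical ID
            "Type": "AWS::EC2::Volume",
            # Take a final snapshot before the volume is deleted.
            "DeletionPolicy": "Snapshot",
            "Properties": {
                "Size": 20,
                "AvailabilityZone": {"Fn::Select": [0, {"Fn::GetAZs": ""}]},
            },
        },
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```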
AWS Systems Manager
- Run Command: execute scripts on fleets without SSH
- Patch Manager: define patch baselines, patch groups, maintenance windows
- Session Manager: browser-based SSH/RDP without bastion hosts
- Automation: run runbooks (SSM Documents) for automated remediation
- Parameter Store: config values + secrets (SecureString via KMS)
- Inventory: collect metadata from managed instances
Elastic Beanstalk
- PaaS: handles provisioning, load balancing, scaling, monitoring
- Deploy policies: All at Once, Rolling, Rolling with Additional Batch, Immutable, Traffic Splitting; Blue/Green is done separately by swapping environment URLs
- Immutable: new ASG created — safest, instant rollback
- .ebextensions/: YAML/JSON config files for customization
- Procfile, Buildfile: custom build/run commands
- Supported: Java, Node.js, PHP, Python, Ruby, Go, Docker, .NET
EC2 Image Builder
- Automates creation, testing, and distribution of AMIs (and container images)
- Components: base OS → build components → test components → distribute
- Image Pipeline: scheduled or manual builds
- Image Recipes: define OS + software + hardening steps
- Integrates with SSM Parameter Store to share latest AMI IDs
- Cross-region, cross-account AMI distribution
CloudFormation Helper Scripts
- cfn-init: read and execute metadata from AWS::CloudFormation::Init
- cfn-signal: signal CloudFormation when instance is ready
- cfn-hup: detect metadata changes and run updates on running instances
- CreationPolicy + WaitCondition: wait for cfn-signal before marking stack success
- UserData vs cfn-init: UserData runs once at launch; cfn-init supports updates
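The way cfn-init, cfn-signal, and CreationPolicy fit together can be sketched as a template built in Python. This assumes an Amazon Linux style AMI where the helper scripts live under `/opt/aws/bin`; the logical ID `WebServer`, the instance type, and the 10-minute timeout are hypothetical, and the AMI is left as a parameter rather than guessed.

```python
import json

# UserData runs once at launch: apply the Init metadata, then report the result.
user_data = "\n".join([
    "#!/bin/bash",
    "/opt/aws/bin/cfn-init -v --stack ${AWS::StackName} "
    "--resource WebServer --region ${AWS::Region}",
    # $? passes cfn-init's exit code, so a failed setup fails the stack.
    "/opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} "
    "--resource WebServer --region ${AWS::Region}",
])

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {"ImageId": {"Type": "AWS::EC2::Image::Id"}},
    "Resources": {
        "WebServer": {  # hypothetical logical ID, referenced by --resource above
            "Type": "AWS::EC2::Instance",
            # Stack creation waits for one success signal, up to 10 minutes.
            "CreationPolicy": {"ResourceSignal": {"Count": 1, "Timeout": "PT10M"}},
            "Metadata": {
                "AWS::CloudFormation::Init": {
                    "config": {"packages": {"yum": {"httpd": []}}},
                },
            },
            "Properties": {
                "ImageId": {"Ref": "ImageId"},
                "InstanceType": "t3.micro",
                "UserData": {"Fn::Base64": {"Fn::Sub": user_data}},
            },
        },
    },
}

template_body = json.dumps(template, indent=2)
```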
Service Catalog & OpsWorks
- Service Catalog: portfolio of approved CloudFormation templates for self-service
- Launch Constraints: role used to launch products (not user's own permissions)
- OpsWorks: Chef/Puppet managed configuration (legacy service)
- OpsWorks Stacks: Layer-based EC2 management with lifecycle events
- For new workloads prefer SSM over OpsWorks
2. Create Patch Group: tag instances (e.g., Patch Group = Production), associate with baseline.
3. Create Maintenance Window: define when patching runs (schedule + duration + stop time).
4. Register targets + tasks: register patch group to window, add AWS-RunPatchBaseline document.
5. Monitor with Patch Compliance: view patch status in SSM Compliance dashboard.
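The maintenance window steps above could be sketched as three request payloads for the Systems Manager API. The window name, schedule, IDs, and concurrency values below are hypothetical placeholders; in a real workflow the window ID and window target ID come back from the earlier calls.

```python
def maintenance_window_params(name: str, schedule: str) -> dict:
    """Arguments for CreateMaintenanceWindow (duration and cutoff are in hours)."""
    return {
        "Name": name,
        "Schedule": schedule,
        "Duration": 3,   # window length
        "Cutoff": 1,     # stop starting new tasks 1 hour before the window ends
        "AllowUnassociatedTargets": False,
    }


def patch_group_target(window_id: str, patch_group: str) -> dict:
    """Arguments for RegisterTargetWithMaintenanceWindow, selecting by tag."""
    return {
        "WindowId": window_id,
        "ResourceType": "INSTANCE",
        "Targets": [{"Key": "tag:Patch Group", "Values": [patch_group]}],
    }


def patch_task(window_id: str, window_target_id: str) -> dict:
    """Arguments for RegisterTaskWithMaintenanceWindow running AWS-RunPatchBaseline."""
    return {
        "WindowId": window_id,
        "Targets": [{"Key": "WindowTargetIds", "Values": [window_target_id]}],
        "TaskArn": "AWS-RunPatchBaseline",
        "TaskType": "RUN_COMMAND",
        "MaxConcurrency": "10%",  # hypothetical rollout limits
        "MaxErrors": "5%",
        "TaskInvocationParameters": {
            "RunCommand": {"Parameters": {"Operation": ["Install"]}},
        },
    }


window = maintenance_window_params("prod-patching", "cron(0 2 ? * SUN *)")
target = patch_group_target("mw-EXAMPLE", "Production")   # placeholder window ID
task = patch_task("mw-EXAMPLE", "target-EXAMPLE")         # placeholder target ID
```

Each dictionary is meant to be passed as keyword arguments to the matching boto3 `ssm` client method.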
Security & Compliance
Apply security controls, manage access, and ensure compliance. Focus on IAM, Organizations/SCPs, KMS, and the suite of AWS security services: Inspector, GuardDuty, Security Hub, Macie, Shield, and WAF.
IAM · KMS · Organizations/SCPs · GuardDuty · Inspector · Macie · Shield · WAF · Security Hub
AWS Organizations & SCPs
- Organizations: manage multiple AWS accounts centrally
- Service Control Policies (SCPs): restrict maximum permissions for OUs/accounts
- SCPs do not grant permissions — they set maximum allowed boundaries
- Management account: SCPs don't apply to management account itself
- Consolidated Billing: single payment method, volume discounts
- Tag Policies: enforce consistent tagging across the organization
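The rule that SCPs set a ceiling but never grant anything can be modeled as an intersection: an action is usable only if both the identity's IAM policy and the SCP allow it. The sketch below is a deliberately simplified model that ignores explicit Deny statements, conditions, and resources; the function name and action lists are illustrative.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_allowed(action: str, iam_allow: Iterable[str], scp_allow: Iterable[str]) -> bool:
    """True only when the action matches an allow pattern in BOTH policies."""
    def matches(patterns: Iterable[str]) -> bool:
        return any(fnmatchcase(action, pattern) for pattern in patterns)

    return matches(iam_allow) and matches(scp_allow)


scp = ["s3:*", "ec2:Describe*"]  # ceiling set at the OU or account level

print(is_allowed("s3:GetObject", ["s3:Get*"], scp))   # allowed by both
print(is_allowed("ec2:RunInstances", ["*"], scp))     # admin in IAM, blocked by SCP
print(is_allowed("s3:GetObject", [], scp))            # SCP alone grants nothing
```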
KMS & Encryption
- AWS Managed Keys: free, auto-rotated annually, can't disable
- Customer Managed Keys (CMK): full control, manual or auto rotation
- KMS direct encrypt limit: 4 KB — use envelope encryption for larger data
- Envelope: GenerateDataKey → encrypt locally → store encrypted DEK
- All KMS API calls logged in CloudTrail
- Multi-Region Keys: replicate key material across regions
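The envelope encryption flow can be walked through end to end with an in-memory stand-in for KMS. Everything here is hypothetical scaffolding: `FakeKms` only mimics the shape of the GenerateDataKey and Decrypt responses, and `_toy_cipher` is an insecure placeholder for a real cipher such as AES-GCM. Only the sequence of steps is the point.

```python
import hashlib
import secrets


def _toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


class FakeKms:
    """Hypothetical stand-in for KMS; the master key never leaves this object."""

    def __init__(self) -> None:
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self) -> dict:
        plaintext = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        blob = nonce + _toy_cipher(self._master_key, nonce, plaintext)
        return {"Plaintext": plaintext, "CiphertextBlob": blob}

    def decrypt(self, ciphertext_blob: bytes) -> dict:
        nonce, wrapped = ciphertext_blob[:16], ciphertext_blob[16:]
        return {"Plaintext": _toy_cipher(self._master_key, nonce, wrapped)}


def envelope_encrypt(kms: FakeKms, data: bytes) -> dict:
    key = kms.generate_data_key()                            # 1. get a data key
    nonce = secrets.token_bytes(16)
    ciphertext = _toy_cipher(key["Plaintext"], nonce, data)  # 2. encrypt locally
    # 3. store only the encrypted data key; the plaintext key is discarded
    return {"encrypted_key": key["CiphertextBlob"], "nonce": nonce,
            "ciphertext": ciphertext}


def envelope_decrypt(kms: FakeKms, envelope: dict) -> bytes:
    data_key = kms.decrypt(envelope["encrypted_key"])["Plaintext"]
    return _toy_cipher(data_key, envelope["nonce"], envelope["ciphertext"])


kms = FakeKms()
payload = b"x" * 10_000  # larger than the 4 KB direct-encrypt limit
envelope = envelope_encrypt(kms, payload)
restored = envelope_decrypt(kms, envelope)
```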
Amazon Inspector
- Automated vulnerability scanning for EC2 and ECR container images
- Scans OS packages and application packages for CVEs
- EC2: uses SSM Agent — no additional agent needed
- Findings exported to Security Hub and EventBridge
- Inspector v2 (current): continuous, event-driven scanning
Amazon GuardDuty
- Threat detection using ML on VPC Flow Logs, CloudTrail, DNS Logs
- Detects: unauthorized access, compromised instances, cryptomining
- 30-day free trial; enabled per region
- Multi-account: designate administrator account via Organizations
- Findings sent to EventBridge for automated remediation
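Automated remediation usually starts with an EventBridge rule that matches GuardDuty findings. The sketch below builds such an event pattern, filtering on a minimum numeric severity; the cutoff of 7 (the start of the "High" range) is an illustrative choice, and the helper name is made up.

```python
import json


def guardduty_pattern(min_severity: float) -> dict:
    """EventBridge event pattern matching GuardDuty findings at or above a severity."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"severity": [{"numeric": [">=", min_severity]}]},
    }


event_pattern = json.dumps(guardduty_pattern(7))
print(event_pattern)
# With boto3 available, the pattern could be attached to a rule, e.g.:
#   boto3.client("events").put_rule(Name="guardduty-high", EventPattern=event_pattern)
```

The rule's target (a Lambda function, SNS topic, or SSM Automation runbook) then carries out the response.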
Shield & WAF
- Shield Standard: free, automatic protection against L3/L4 DDoS
- Shield Advanced: $3000/month, L7 DDoS, cost protection, Shield Response Team (SRT, formerly DRT) access
- WAF: Web Application Firewall — attach to CloudFront, ALB, API GW
- WAF Rules: IP match, geo match, rate-based, managed rule groups (OWASP)
- WAF ACLs: regional (ALB) or global (CloudFront)
Security Hub & Macie
- Security Hub: aggregates findings from GuardDuty, Inspector, Macie, Config
- Checks against security standards: CIS Benchmarks, AWS Foundational Security
- Requires AWS Config enabled in all member accounts
- Macie: ML-powered sensitive data discovery in S3
- Detects PII, financial data, credentials in S3 objects
- Macie findings sent to Security Hub and EventBridge
Networking & Content Delivery
Design and troubleshoot networking architectures. Master VPC components, connectivity options (VPN, Direct Connect, Transit Gateway), and content delivery with CloudFront and Route 53.
VPC · Security Groups · NACLs · NAT · Transit Gateway · Direct Connect · CloudFront · Route 53
VPC Fundamentals
- Subnets: public (IGW route) vs private (no IGW route)
- Route Tables: one per subnet; most specific route wins
- Internet Gateway (IGW): enables internet access for public subnets
- NAT Gateway: allows private subnet outbound internet (managed by AWS)
- NAT Instance: self-managed EC2; must disable source/dest check
- Egress-Only IGW: IPv6 only — one-way outbound for private subnets
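"Most specific route wins" is longest-prefix matching. The sketch below applies it to a small route table using the standard `ipaddress` module; the CIDR blocks and target labels are hypothetical, and only IPv4 is considered.

```python
import ipaddress


def select_route(routes: dict, destination: str) -> str:
    """Return the target of the matching route with the longest prefix."""
    dest = ipaddress.ip_address(destination)
    candidates = []
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            candidates.append((network.prefixlen, target))
    if not candidates:
        raise LookupError(f"no route to {destination}")
    return max(candidates)[1]  # highest prefix length = most specific


route_table = {
    "10.0.0.0/16": "local",
    "10.1.0.0/16": "pcx-peering",   # hypothetical peering connection
    "0.0.0.0/0": "nat-gateway",     # default route for a private subnet
}

print(select_route(route_table, "10.0.5.9"))
print(select_route(route_table, "10.1.2.3"))
print(select_route(route_table, "8.8.8.8"))
```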
VPC Security
- Security Groups: stateful, instance-level, allow rules only
- NACLs: stateless, subnet-level, allow and deny rules, numbered priority
- NACLs: remember to allow ephemeral ports (1024–65535) for responses
- VPC Flow Logs: capture IP traffic metadata, including the ACCEPT or REJECT action for each flow
- Flow logs to S3, CloudWatch Logs, or Kinesis Data Firehose
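NACL rules are evaluated in ascending rule-number order, the first match decides, and anything unmatched hits the implicit deny. The sketch below models that for inbound traffic only; the `Rule` type and the sample rules are made up for illustration, and protocol matching is left out.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Rule(NamedTuple):
    number: int
    cidr: str
    port_from: int
    port_to: int
    action: str  # "allow" or "deny"


def evaluate_nacl(rules: Iterable[Rule], source_ip: str, port: int) -> str:
    """Return the action of the lowest-numbered matching rule, else implicit deny."""
    src = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = src in ipaddress.ip_network(rule.cidr)
        if in_cidr and rule.port_from <= port <= rule.port_to:
            return rule.action
    return "deny"  # the final "*" rule


inbound = [
    Rule(100, "203.0.113.0/24", 0, 65535, "deny"),   # block one range entirely
    Rule(200, "0.0.0.0/0", 443, 443, "allow"),       # HTTPS from anywhere
    Rule(300, "0.0.0.0/0", 1024, 65535, "allow"),    # ephemeral ports for responses
]

print(evaluate_nacl(inbound, "203.0.113.7", 443))
print(evaluate_nacl(inbound, "198.51.100.1", 443))
print(evaluate_nacl(inbound, "198.51.100.1", 22))
```

Because NACLs are stateless, rule 300 is what lets return traffic for outbound connections back in; a security group would track that automatically.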
VPC Connectivity
- VPC Peering: direct connection between two VPCs (non-transitive)
- Transit Gateway: hub-and-spoke model — scales to thousands of VPCs
- VPC Endpoints — Gateway: S3/DynamoDB (free). Interface: others (cost)
- PrivateLink: expose service privately to other VPCs via endpoint
- VPN CloudHub: hub-and-spoke VPN across multiple customer locations
Direct Connect & VPN
- Site-to-Site VPN: encrypted tunnel over internet; quick setup
- Direct Connect (DX): dedicated private connection; consistent performance; 1–100 Gbps
- DX lead time: weeks to months for physical provisioning
- DX + VPN: use VPN as backup to Direct Connect for redundancy
- Direct Connect Gateway: access multiple VPCs/regions from single DX connection
CloudFront
- CDN: 400+ edge locations globally; serves cached content
- Origins: S3, ALB, EC2, API GW, Custom HTTP
- OAC (Origin Access Control): restrict S3 to CloudFront only (replaces OAI)
- Cache Behaviors: route by path pattern to different origins
- Signed URLs/Cookies: restrict access to authenticated users
- Lambda@Edge / CloudFront Functions: run logic at edge locations
Route 53 Advanced
- Alias records: free, point to AWS resources (ELB, CloudFront, S3 website)
- CNAME: can't use at zone apex (root domain)
- Routing: Weighted (A/B), Latency (closest region), Geolocation, Geoproximity
- Failover: active-passive using health checks
- Private Hosted Zone: DNS for VPC resources; must enable enableDnsHostnames + enableDnsSupport
Connect two VPCs (same owner, few): VPC Peering
Connect many VPCs: Transit Gateway (replaces complex peering meshes)
Access AWS services privately: VPC Endpoints (Gateway for S3/DDB, Interface for others)
On-premises to AWS (fast setup): Site-to-Site VPN
On-premises to AWS (dedicated, high bandwidth): Direct Connect
Both DX + VPN: VPN as backup for Direct Connect
Cost & Performance Optimization
Use AWS cost management tools, right-sizing recommendations, and purchasing options to reduce costs while maintaining performance. Understand Trusted Advisor and Compute Optimizer.
Cost Explorer · Budgets · CUR · Trusted Advisor · Compute Optimizer · Purchasing Options
Cost Management Tools
- Cost Explorer: visualize, understand, forecast AWS costs — 12 months history
- AWS Budgets: set alerts when costs/usage exceed thresholds
- Budget types: Cost, Usage, Savings Plans, Reservation
- Cost Allocation Tags: tag resources to track costs by team/project/env
- Activate tags in Billing console before they appear in reports
- Cost and Usage Report (CUR): most detailed billing data → S3 → Athena/QuickSight
Trusted Advisor
- 5 pillars: Cost Optimization, Performance, Security, Fault Tolerance, Service Limits
- Free tier: 7 core checks (security + service limits only)
- Business/Enterprise Support: all checks + API access + CloudWatch integration
- Key checks: MFA on root, S3 public access, underutilized EC2, ELB with no backends
- Service Quotas: view and request limit increases (also via Trusted Advisor)
Compute Optimizer
- ML-based right-sizing recommendations for EC2, EBS, Lambda, ECS on Fargate, Auto Scaling
- Requires at least 30 hours of CloudWatch metric data for recommendations; analyzes the last 14 days by default
- Identifies over-provisioned and under-provisioned resources
- Enhanced Infrastructure Metrics (paid): 3 months of data for better accuracy
- Export findings to S3 for analysis
Purchasing Options
- On-Demand: no commitment, highest cost, pay per second
- Reserved Instances: 1 or 3 year, up to 72% discount
- Savings Plans: flexible RI (Compute or EC2), hourly commitment
- Spot Instances: up to 90% discount, can be interrupted (2 min warning)
- Dedicated Hosts: physical server, BYOL, compliance requirements
- Compute Savings Plans > EC2 Reserved for flexibility across families/regions
S3 Storage Classes
- S3 Standard: frequent access, milliseconds retrieval
- S3 Standard-IA: infrequent access, retrieval fee, min 30-day storage
- S3 One Zone-IA: single AZ, 20% cheaper than Standard-IA
- S3 Glacier Instant: archive, milliseconds retrieval, min 90 days
- S3 Glacier Flexible: minutes to hours retrieval, min 90 days
- S3 Glacier Deep Archive: 12-hour retrieval, cheapest, min 180 days
- S3 Intelligent-Tiering: auto-moves objects between tiers, no retrieval fee
Performance Optimization
- EC2 Enhanced Networking: ENA (up to 100 Gbps), SR-IOV for high bandwidth
- EBS Volumes: gp3 preferred over gp2 (cheaper, independent IOPS/throughput)
- EBS-Optimized: dedicated bandwidth between EC2 and EBS
- Placement Groups: Cluster (low latency), Spread (fault tolerance), Partition (HDFS/Kafka)
- ElastiCache: Redis for sessions/leaderboards; Memcached for simple caching
Practice Questions
Test your SOA-C02 readiness across all six domains. 720+ to pass.