Cheap Compute Is Not Cost Optimization, Architecture Is
- Pairoj Ruamviboonsuk

The Scenario
A fast-scaling SaaS company reviews its cloud bill.
Kubernetes clusters are running across regions. Autoscaling is enabled. Traffic is healthy.
But monthly infrastructure costs continue rising.
The team notices something obvious:
Spot instances (AWS), preemptible VMs (GCP), and spot VMs (Azure) offer up to 90% savings compared to on-demand pricing.
The discount is real.
The risk is real too.
Instances can be terminated with little notice.
So the question becomes: Can we capture the discount without sacrificing reliability?
Why On-Demand Worked
On-demand infrastructure is predictable.
No sudden termination. No surprise evictions. No capacity volatility.
It feels safe. But safety without optimization compounds cost.
As clusters scale, over-provisioning becomes invisible.
Idle capacity hides inside auto-scaled environments.
Replication multiplies inefficiency.
The system works. But it is not economically intentional.
Where Constraint Emerges
Spot capacity introduces a structural trade-off.
Cloud providers can terminate:
Spot instances (AWS)
Preemptible VMs (GCP)
Spot VMs (Azure)
Notice is often minimal: two minutes on AWS, and roughly 30 seconds on GCP and Azure.
If architecture does not account for interruption:
Stateful services fail
Critical workloads collapse
Single-instance apps go offline
Customer trust is affected
Cheap compute without architectural discipline becomes fragility.
Cost optimization must be engineered.
The Architectural Principle
Interruptible capacity is not unreliable. Undesigned systems are unreliable.
The goal is not to run everything on spot. The goal is to design a fault-tolerant architecture that can absorb interruption without impact.
Cost efficiency is not a pricing decision.
It is an architectural property.
The Design Discipline
Optimizing Kubernetes infrastructure costs requires structured separation of reliability tiers.
1. Separate Node Pools by Reliability
Create distinct node groups:
On-demand nodes for critical services
Spot/preemptible nodes for fault-tolerant workloads
This enforces architectural intent. Critical system pods do not compete with interruptible capacity.
2. Target the Right Workloads
Spot capacity is appropriate for:
Stateless services
Batch processing
CI/CD pipelines
Horizontally replicated services
It is not suitable for:
Stateful, single-instance workloads
Mission-critical services without redundancy
Architecture determines placement — not cost pressure.
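The placement rules above can be written down as a small check. This is only a sketch: the `Workload` fields and the `spot_eligible` function are hypothetical names chosen for illustration, and a real classification would likely weigh more signals than these three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload; the fields are illustrative."""
    name: str
    stateful: bool
    replicas: int
    mission_critical: bool = False


def spot_eligible(w: Workload) -> bool:
    """Return True if the workload fits the spot criteria listed above."""
    # Stateful, single-instance workloads stay on on-demand capacity.
    if w.stateful and w.replicas <= 1:
        return False
    # Mission-critical services need redundancy before they can absorb interruption.
    if w.mission_critical and w.replicas < 2:
        return False
    # Stateless services and horizontally replicated services qualify.
    return (not w.stateful) or w.replicas >= 2


if __name__ == "__main__":
    for w in (
        Workload("ci-runner", stateful=False, replicas=1),
        Workload("postgres-primary", stateful=True, replicas=1),
        Workload("api", stateful=False, replicas=3, mission_critical=True),
    ):
        print(w.name, "spot" if spot_eligible(w) else "on-demand")
```

The point of encoding the rule is that placement follows from declared properties of the workload, not from whoever is looking at the bill that week.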
3. Enforce Placement with Taints and Tolerations
Apply Kubernetes taints to spot nodes.
Use tolerations only on workloads designed for interruption.
This prevents accidental scheduling of sensitive workloads onto unstable capacity.
Guardrails preserve discipline.
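As a rough illustration, the taint and the matching toleration can be generated as plain dictionaries and serialized to JSON, which `kubectl apply` accepts alongside YAML. The key `capacity-type` and value `spot` are assumptions made for this sketch; managed node groups usually expose their own provider-specific labels, and the node is assumed to carry a label with the same key and value.

```python
import json

# Hypothetical taint/label key and value, chosen for this example only.
SPOT_KEY = "capacity-type"
SPOT_VALUE = "spot"


def spot_taint() -> dict:
    """Taint for spot nodes: pods without a matching toleration are not scheduled."""
    return {"key": SPOT_KEY, "value": SPOT_VALUE, "effect": "NoSchedule"}


def allow_on_spot(pod_spec: dict) -> dict:
    """Return a copy of a pod spec that tolerates the spot taint and selects spot nodes."""
    spec = dict(pod_spec)
    toleration = {
        "key": SPOT_KEY,
        "operator": "Equal",
        "value": SPOT_VALUE,
        "effect": "NoSchedule",
    }
    spec["tolerations"] = list(pod_spec.get("tolerations", [])) + [toleration]
    # A toleration only permits scheduling on tainted nodes; the nodeSelector
    # is what steers the pod toward them (assumes nodes carry the same label).
    spec["nodeSelector"] = {**pod_spec.get("nodeSelector", {}), SPOT_KEY: SPOT_VALUE}
    return spec


if __name__ == "__main__":
    batch = {"containers": [{"name": "worker", "image": "example/worker:1.0"}]}
    print(json.dumps(allow_on_spot(batch), indent=2))
```

Workloads that never pass through `allow_on_spot` carry no toleration, so the scheduler keeps them off the tainted pool by default.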
4. Automate with Intelligent Autoscaling
Cost optimization only works when scaling is dynamic.
Use:
Horizontal Pod Autoscaler (HPA)
Vertical Pod Autoscaler (VPA)
Node autoscaling via Cluster Autoscaler or Karpenter
A smart node autoscaler can:
Prioritize spot capacity first
Fall back to on-demand when unavailable
Karpenter, for example, provisions instance types that fit the pending pods and can fall back to on-demand when spot capacity is unavailable.
Automation is not convenience.
It is economic control.
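Two of the decisions above reduce to simple arithmetic. The first function follows the replica formula given in the Kubernetes HPA documentation, ignoring its tolerance band and min/max bounds. The second is a deliberately simplified model of "spot first, on-demand as fallback"; it is not how Cluster Autoscaler or Karpenter are implemented, and the function names are invented for this sketch.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """HPA-style scaling: ceil(current_replicas * current_metric / target_metric)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


def plan_capacity(nodes_needed: int, spot_available: int) -> dict:
    """Fill demand from spot first, then fall back to on-demand for the remainder."""
    if nodes_needed < 0 or spot_available < 0:
        raise ValueError("counts must be non-negative")
    spot = min(nodes_needed, spot_available)
    return {"spot": spot, "on_demand": nodes_needed - spot}


if __name__ == "__main__":
    # 4 replicas at 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # 10 nodes needed while only 6 spot nodes can be obtained.
    print(plan_capacity(10, 6))
```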
5. Handle Interruption Gracefully
Spot termination notices must trigger safe draining and rescheduling.
Tools like kube-spot-termination-notice-handler or the AWS Node Termination Handler watch for the notice and drain the node, so pods are rescheduled elsewhere before the instance is terminated.
Interruption becomes a controlled event — not an outage.
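To make the mechanics concrete, here is a minimal sketch of what such a handler does with a notice. On AWS, the instance metadata path `spot/instance-action` returns a small JSON document once an interruption is scheduled; fetching it (including the IMDSv2 token exchange) is left out here, and the function names and the 90-second grace period are assumptions for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def seconds_until_interruption(body: str, now: datetime) -> Optional[float]:
    """Parse a spot instance-action document and return the seconds remaining."""
    doc = json.loads(body)
    if doc.get("action") not in ("terminate", "stop", "hibernate"):
        return None
    # The notice carries a UTC timestamp such as 2024-01-01T12:02:00Z.
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return (when - now).total_seconds()


def drain_command(node_name: str, grace_seconds: int = 90) -> list:
    """Build a kubectl drain invocation sized to fit inside the notice window."""
    return [
        "kubectl", "drain", node_name,
        "--ignore-daemonsets",
        "--delete-emptydir-data",
        f"--grace-period={grace_seconds}",
    ]


if __name__ == "__main__":
    notice = '{"action": "terminate", "time": "2024-01-01T12:02:00Z"}'
    now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(seconds_until_interruption(notice, now))
    print(" ".join(drain_command("spot-node-1")))
```

The grace period has to be shorter than the notice window, which is why workloads with long shutdown sequences are poor candidates for spot.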
6. Distribute Risk with Topology Spread Constraints
Spread replicas across:
Availability zones
Nodes
Instance types
This prevents a single interruption event from cascading into service degradation.
Resilience is distribution by design.
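A sketch of the three spread dimensions as `topologySpreadConstraints` entries, built as dictionaries that could be placed under a pod template's `spec`. The topology keys are the well-known Kubernetes node labels; the `app` label, the `maxSkew` of 1, and the choice to make only the zone constraint hard are assumptions for this example.

```python
import json


def spread_constraints(app_label: str) -> list:
    """Build topologySpreadConstraints across zones, nodes, and instance types."""
    selector = {"matchLabels": {"app": app_label}}
    return [
        {
            # Hard requirement: keep replicas balanced across availability zones.
            "maxSkew": 1,
            "topologyKey": "topology.kubernetes.io/zone",
            "whenUnsatisfiable": "DoNotSchedule",
            "labelSelector": selector,
        },
        {
            # Soft preference: avoid stacking replicas on one node.
            "maxSkew": 1,
            "topologyKey": "kubernetes.io/hostname",
            "whenUnsatisfiable": "ScheduleAnyway",
            "labelSelector": selector,
        },
        {
            # Soft preference: avoid depending on a single instance type's spot pool.
            "maxSkew": 1,
            "topologyKey": "node.kubernetes.io/instance-type",
            "whenUnsatisfiable": "ScheduleAnyway",
            "labelSelector": selector,
        },
    ]


if __name__ == "__main__":
    print(json.dumps({"topologySpreadConstraints": spread_constraints("api")}, indent=2))
```

Spreading across instance types matters for spot in particular, because reclamation tends to hit one capacity pool at a time.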
7. Monitor and Right-Size Continuously
Autoscaling only works when resource requests and limits are accurate.
Over-requesting resources defeats optimization.
Under-requesting creates instability.
Continuous monitoring ensures:
Accurate scaling triggers
Cost visibility
Prevention of over-provisioning
Cost control requires measurement discipline.
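One way to make that measurement actionable is to compare observed usage with the configured request. The sketch below assumes a 95th-percentile CPU figure is already available from monitoring; the thresholds of 50% and 90% are arbitrary illustrative values, not recommendations.

```python
def utilization(requested_millicores: float, used_p95_millicores: float) -> float:
    """Fraction of the CPU request that is actually used at the 95th percentile."""
    if requested_millicores <= 0:
        raise ValueError("requested_millicores must be positive")
    return used_p95_millicores / requested_millicores


def classify_request(requested_millicores: float, used_p95_millicores: float,
                     low: float = 0.5, high: float = 0.9) -> str:
    """Label a request as over-requested, under-requested, or ok."""
    ratio = utilization(requested_millicores, used_p95_millicores)
    if ratio < low:
        return "over-requested"   # paying for idle capacity
    if ratio > high:
        return "under-requested"  # little headroom; risk of throttling
    return "ok"


if __name__ == "__main__":
    print(classify_request(1000, 200))
    print(classify_request(500, 480))
    print(classify_request(1000, 700))
```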
The Multi-Layer Outcomes
When spot capacity is architected correctly:
Technical
Fault-tolerant workload design
Automated scaling behavior
Graceful interruption handling
Operational
No service disruption during spot termination
Clear workload classification
Controlled scaling events
Commercial
Up to 90% savings versus on-demand for eligible workloads
Reduced over-provisioning
Improved infrastructure ROI
Strategic
Economic resilience during traffic spikes
Flexibility in multi-cloud strategy
Freedom to scale without runaway cost
Cost becomes intentional rather than reactive.
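Because the discount applies only to eligible workloads, the saving on the whole bill is the spot discount multiplied by the share of spend that can move to spot. A small worked sketch, with made-up figures and hypothetical function names:

```python
def blended_savings(eligible_fraction: float, spot_discount: float) -> float:
    """Overall saving as a fraction of the on-demand bill."""
    for value in (eligible_fraction, spot_discount):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must be between 0 and 1")
    return eligible_fraction * spot_discount


def monthly_cost(on_demand_cost: float, eligible_fraction: float, spot_discount: float) -> float:
    """Monthly bill after moving the eligible share of compute to spot."""
    return on_demand_cost * (1.0 - blended_savings(eligible_fraction, spot_discount))


if __name__ == "__main__":
    # Illustrative only: half of a 100,000 bill is eligible, at a 90% discount.
    print(f"{blended_savings(0.5, 0.9):.0%} saved")
    print(f"{monthly_cost(100_000, 0.5, 0.9):,.0f} per month")
```

The eligible fraction is the lever that architecture controls, which is why workload classification matters more than the headline discount.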
Executive Translation
In boardrooms, this conversation is not about HPA or Karpenter.
It is about unit economics.
Can we scale without letting infrastructure cost grow linearly?
Spot capacity alone does not solve this.
Architecture does.
The Architectural Close
Cheap compute is easy to buy.
Resilient cheap compute is designed.
Spot instances are not risky.
Undifferentiated infrastructure is risky.
Cost efficiency in Kubernetes is not a pricing trick.
It is architecture applied to economics. And when economics are engineered, scale becomes sustainable.


