Validation & Benchmarks

StornX was designed against real production patterns and validated with end-to-end load, stress, and availability tests on representative microservice applications. This section summarises the methodology and the headline findings. The goal is to give you confidence that the controller behaves the way the rest of this documentation claims, not to drown you in numbers.

If you want the raw data, every plot in this section is reproducible from the workloads, dashboards, and Helm values shipped in perf-tests/.

What was tested

Two reference microservice applications, chosen because they represent the two most common shapes of real workloads:

Application	Why it was chosen
Google Online Boutique	10 services, classic e-commerce graph, light-to-moderate inter-service chatter.
OpenTelemetry Demo	20+ services, heavier graph, mixed sync/async traffic, native observability.

Where the tests ran

All tests ran on a production-like, multi-AZ EKS cluster.

Test infrastructure on AWS EKS

AWS EKS (eu-central-1), Kubernetes 1.33
3 availability zones, mix of m5.large / m5.xlarge worker nodes
Istio 1.24.x with the Prometheus / Grafana / Kiali addon stack
Kube-NetLag DaemonSet for node-to-node latency
Load generated with k6 (rich scenarios in perf-tests/k6/)

What was compared

Three configurations of the same cluster, all other variables held constant:

Configuration	Autoscaling	Placement	Traffic routing
Baseline	Kubernetes HPA (CPU)	Default scheduler	Istio random load-balancing
OptTraffic	Kubernetes HPA (CPU)	Default scheduler	Istio locality routing only
StornX	OptiScaler	OptiScaler placement	OptiBalancer adaptive weights

The goal is not to declare a single winner on a single metric, but to show how StornX shifts the cost / latency / availability frontier.

What was measured

Dimension	Metric	Why it matters
Latency	P95 of end-to-end response time across requests	User experience
Throughput	Successful RPS sustained at the target load	Application capacity
Cost	Cross-AZ data-transfer bytes, replica-hours	Cloud bill
Reliability	Error rate during chaos (zone degradation, Pod kills)	Production-grade resilience
Resources	CPU and memory utilisation per replica	Right-sizing / waste
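
As a rough illustration of how the latency, throughput, and reliability metrics above can be derived from raw request samples, here is a minimal Python sketch. The `Sample` type and the function names are hypothetical and are not part of StornX or of perf-tests/. The percentile uses the nearest-rank method, so load-testing tools that interpolate between samples may report a slightly different P95.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    latency_ms: float  # end-to-end response time of one request
    ok: bool           # True if the request succeeded


def p95(latencies):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies)
    # Integer form of ceil(0.95 * n), giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def error_rate(samples):
    """Fraction of requests that failed."""
    failed = sum(1 for s in samples if not s.ok)
    return failed / len(samples)


def throughput_rps(samples, duration_s):
    """Successful requests per second over the test window."""
    succeeded = sum(1 for s in samples if s.ok)
    return succeeded / duration_s


if __name__ == "__main__":
    # Made-up inputs for illustration only, not measured results.
    demo = [Sample(float(i), i % 50 != 0) for i in range(1, 101)]
    print(p95([s.latency_ms for s in demo]))
    print(error_rate(demo))
    print(throughput_rps(demo, 10.0))
```

Counting only successful requests in the throughput figure matches the "successful RPS" definition in the table: a configuration that answers quickly with errors should not look faster.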

Headline takeaways

The detailed plots are split across the next pages. The pattern is consistent:

Lower P95 under load. Once the minimum zone count is satisfied, StornX trades a small amount of fault-tolerance "spread" for substantial co-location gains.
Lower cost. StornX sends fewer cross-AZ bytes and, in load tests, also uses fewer replica-hours, because OptiBalancer keeps the existing replicas working at a steady utilisation instead of forcing the HPA to over-scale.
Better availability during simulated zone degradation. Traffic shifts gradually toward healthy zones, and errors stay near zero where the baseline produces visible error spikes.
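
The two cost metrics behind the "lower cost" takeaway above can be computed along these lines. This is a sketch under stated assumptions: replica counts are sampled as (timestamp, count) pairs and each count holds until the next sample, and traffic is available as byte totals per (source zone, destination zone) pair. The function names and input shapes are hypothetical and do not describe how the StornX dashboards compute these values.

```python
def replica_hours(samples):
    """Integrate replica count over time.

    samples: list of (timestamp_seconds, replica_count), sorted by time.
    Each count is assumed to hold until the next sample (step function).
    """
    total_replica_seconds = 0.0
    for (t0, count), (t1, _) in zip(samples, samples[1:]):
        total_replica_seconds += count * (t1 - t0)
    return total_replica_seconds / 3600.0


def cross_az_fraction(bytes_by_zone_pair):
    """Share of bytes whose source and destination zones differ."""
    total = sum(bytes_by_zone_pair.values())
    if total == 0:
        return 0.0
    cross = sum(
        b for (src, dst), b in bytes_by_zone_pair.items() if src != dst
    )
    return cross / total


if __name__ == "__main__":
    # Made-up inputs for illustration only, not measured results.
    counts = [(0, 3), (1800, 5), (3600, 5)]
    print(replica_hours(counts))
    traffic = {("a", "a"): 750, ("a", "b"): 250}
    print(cross_az_fraction(traffic))
```

Comparing replica-hours rather than peak replica count rewards a configuration that scales up briefly and then settles, which is the behaviour the takeaway describes.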

Continue with:

Load tests - sustained, realistic traffic
Stress tests - saturating each service in turn
Availability tests - simulated zone failures
Side-by-side comparison - the summary table teams usually want
