$ Case Studies
Real-world infrastructure challenges from engagements at global organisations. Each post covers the problem, approach, and outcomes.
Cutting CI Costs by 85% with AWS Spot and GitHub App Autoscaling
How I replaced a fleet of always-on CI runners with an event-driven autoscaling architecture using a custom GitHub App, AWS API Gateway, Lambda, and Spot Instances — reducing monthly CI spend by 85%.
Building Firm-Wide Observability with Datadog
How I designed and rolled out a centralised observability platform across a quantitative trading firm, unifying logs, metrics, and traces from Kubernetes, bare-metal servers, and legacy systems into a single Datadog instance.
Multi-Datacentre Disaster Recovery with HashiStack
How I designed and implemented a multi-datacentre disaster recovery architecture using HashiCorp Nomad, Consul, and Vault to ensure continuity for trading operations across geographically separated sites.
Modernising 2,500 Nomad Jobs into 300 IaC Files
How I consolidated 2,500 manually managed Nomad job specifications into 300 Terraform files using Terragrunt and the Nomad provider, cutting deployment time by 80% across five environments and multiple data centres.
GitOps at Scale: Migrating 100+ Banking Applications
How I led the migration of over 100 banking applications from TeamCity to GitLab with Vault-integrated pipelines, standardised Kubernetes deployments via Helm, and reusable Ansible Galaxy roles across Credit Suisse's global infrastructure.
Eliminating a 500-Job Queue Bottleneck in CI/CD
How we diagnosed and resolved a critical CI/CD pipeline bottleneck that was blocking Open Banking delivery at one of the UK's largest retail banks, reducing a queue of 500 to 700 pending jobs to near-zero through Zalenium, automated webhooks, and infrastructure modernisation.
Scaling a Social Platform to 80,000 Concurrent Users
How I took on the sole infrastructure lead role at vVoosh and scaled a social entertainment platform to sustain 80,000 concurrent users in a two-hour performance test, using multi-account AWS with Terraform, Outlyer and ELK monitoring, and Jenkins CI/CD.
Zero-Downtime Releases for a £4 Billion Grocery Platform
How we engineered zero-downtime weekly deployments for Tesco's online grocery platform, which serves millions of customers across mobile and web and generates £4 billion in annual revenue, using AWS EC2, ECS, Lambda, and Direct Connect.
Automating Infrastructure for the 100,000 Genomes Project
How we built automated, reproducible infrastructure for Genomics England's landmark 100,000 Genomes Project: managing 300+ VMs, including HPC clusters, across VMware and AWS, achieving 10%+ cloud cost savings in the first week, and migrating to the Atlassian suite.
Transforming Software Delivery for the Brazilian Federal Government
How we introduced Agile XP practices, CI/CD, and modern deployment architecture to SERPRO's 70-person team, transforming delivery for the Brazilian Presidency's project monitoring system and the national anti-drug programme.