How I Modernized OpenShift Infrastructure to Scale an App
🌍 Situation
The Highway Maintenance Contractor Reporting (HMCR) application is a core platform used by contractors across British Columbia to report on road maintenance activities—everything from pothole fixes to rockfall clearing and winter salting.
When I joined the project, the application was functional but fragile.
- The app worked… most of the time.
- The infrastructure? Cobwebbed YAML, manual patches, and tribal knowledge.
- OpenShift deployments were automated via GitHub Actions, but only skin-deep.
- CI/CD existed, but we were far from modern DevOps.
Contractor adoption increased and the volume of maintenance reports skyrocketed. A steady flow of new hires to the project also grew the internal user base. Both drove up the load on the application. Hence:
We needed HMCR to keep up with modern standards and scale
🎯 Task
My task was threefold:
- Stabilize and understand the existing OpenShift-based deployment pipeline—without breaking production.
- Modernize our infrastructure to support scalability, automation, and observability across multiple environments (`dev`, `test`, `uat`, `prod`).
- Scale the application to meet a growing number of users and the additional load they bring.
🔧 Action
🔍 Step 1: Understand the Processes
I started by dissecting how the HMCR app was deployed:
- Traced GitHub Actions workflows that built Docker images and pushed to OpenShift.
- Analyzed the wiring between `DeploymentConfig`, `BuildConfig`, and `ImageStream` objects.
- Explored route configs, internal TLS, and how our build triggers were set up.
The app was running in OpenShift, but it was held together with duct tape.
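To illustrate the legacy wiring: a `BuildConfig` built the image into an `ImageStream`, and an image-change trigger on the `DeploymentConfig` rolled out new pods whenever a new tag landed. A simplified sketch (names are illustrative):

```yaml
# Legacy OpenShift wiring (simplified): the DeploymentConfig redeploys
# automatically whenever the ImageStream tag it watches is updated.
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: hmcr-api
spec:
  replicas: 2
  triggers:
    - type: ImageChange
      imageChangeParams:
        automatic: true
        containerNames:
          - api
        from:
          kind: ImageStreamTag
          name: hmcr-api:latest
```

Implicit, cluster-side triggers like this are exactly the kind of hidden coupling that made the pipeline hard to reason about.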
🔁 Step 2: Migrate from DeploymentConfig to Kubernetes Deployments
OpenShift's `DeploymentConfig` had limitations and was on the deprecation path. We needed better compatibility with Kubernetes-native tooling and Helm charts. So, I:
- Rewrote all deployment manifests to use `apps/v1` `Deployment` objects
- Replaced `BuildConfig` builds with external image builds via GitHub Actions
- Updated GitHub workflows to push directly to OpenShift's internal image registry
This allowed us to ditch OpenShift-specific legacy components and move toward portable, K8s-native deployment practices.
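For illustration, here is roughly what the migrated manifests looked like (a minimal sketch; the namespace, labels, and resource values are representative, not our exact production config):

```yaml
# Kubernetes-native Deployment replacing the old DeploymentConfig.
# The image is built externally in GitHub Actions and pushed to the
# cluster's internal registry; the hmcr-prod namespace is illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hmcr-api
  labels:
    app: hmcr-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hmcr-api
  template:
    metadata:
      labels:
        app: hmcr-api
    spec:
      containers:
        - name: api
          image: image-registry.openshift-image-registry.svc:5000/hmcr-prod/hmcr-api:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi
```

One detail that pays off later: the `Utilization` targets in Step 4's HPA are computed against these resource requests, so every container had to declare them.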
📊 Step 3: Monitor and Observe
I added monitoring and alerting layers:
- Integrated Prometheus + Grafana for pod metrics
- Enabled EFK (Elasticsearch-Fluentd-Kibana) stack for application logs
- Added proactive alerts for CPU/memory thresholds and failing pods
I also cleaned up old image streams and automated stale-object pruning via `oc` and GitHub Actions workflows.
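As a rough sketch, the proactive alerts were `PrometheusRule` objects along these lines (thresholds, alert names, and the namespace are illustrative, not our exact rules):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hmcr-alerts
spec:
  groups:
    - name: hmcr.rules
      rules:
        # Fires when a container keeps restarting (likely crash loop).
        - alert: HmcrPodRestarting
          expr: increase(kube_pod_container_status_restarts_total{namespace="hmcr-prod"}[15m]) > 3
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is restarting repeatedly"
        # Fires when a container's working set nears its memory limit.
        - alert: HmcrHighMemory
          expr: |
            max by (namespace, pod, container) (container_memory_working_set_bytes{namespace="hmcr-prod", container!="", container!="POD"})
              /
            max by (namespace, pod, container) (kube_pod_container_resource_limits{namespace="hmcr-prod", resource="memory"})
              > 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Container {{ $labels.container }} is above 90% of its memory limit"
```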
⚖️ Step 4: Implement Autoscaling with HPA
With growing traffic and more contractors using HMCR, I set up a Horizontal Pod Autoscaler (HPA) for our API and background process servers. This was the initial setup:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hmcr-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hmcr-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```
Here, `http_requests_per_second` is a custom metric from the Prometheus Adapter.
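On the adapter side, a rule like the following (a sketch, assuming the app exposes a counter named `http_requests_total` with `namespace` and `pod` labels) turns the raw counter into the per-pod rate the HPA consumes:

```yaml
# Prometheus Adapter rule config (sketch). Derives http_requests_per_second
# from the http_requests_total counter as a 2-minute per-pod rate.
rules:
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^http_requests_total$"
      as: "http_requests_per_second"
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```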
HPA automatically scaled the pods from 2 up to 8 within seconds during peak usage (e.g., after storms or annual reporting periods), keeping per-pod load stable and scaling back down during quiet hours.
In the interest of reading time, I'll skip the part where I set up session affinity and ensured application statelessness with Redis.
📦 Step 5: Helm All The Things
Manually managing templates across multiple environments was error-prone and time-consuming. An OpenShift application is an interconnected web of interdependent Kubernetes objects, and Helm solves this brilliantly by letting those objects ship together as a single package. I created Helm charts to:
- Template our deployments
- Inject environment-specific values via `values.yaml`
- DRY up configuration for routes, secrets, and storage
Now we could roll out changes across `dev`, `test`, and `prod` using a single, repeatable structure, with a single command:
```bash
helm upgrade --install hmcr-api ./charts/hmcr-api -f hmcr-prod.yaml
```
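The per-environment values files stayed small. A representative sketch (keys and values here are hypothetical, not our actual chart interface):

```yaml
# hmcr-prod.yaml: environment overrides layered onto the chart defaults.
replicaCount: 3
image:
  repository: image-registry.openshift-image-registry.svc:5000/hmcr-prod/hmcr-api
  tag: "stable"            # illustrative tag
route:
  host: hmcr.example.com   # hypothetical hostname
resources:
  requests:
    cpu: 500m
    memory: 512Mi
```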
🐘 Step 6: Automate PostgreSQL Upgrades
I created an automated, namespace-scoped PostgreSQL upgrade script that:
- Backed up existing PVCs
- Restored data post-upgrade
- Provisioned the necessary objects
We now had safe, repeatable, zero-downtime upgrades across all environments.
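In spirit, the flow looked like the condensed sketch below. The real script worked at the PVC level and orchestrated the cutover to avoid downtime; this version uses a logical dump/restore to stay short, and names like the `postgres` labels, the `hmcr` database, and the `hmcr-db` chart are hypothetical:

```bash
#!/usr/bin/env bash
set -euo pipefail
NS="$1"   # target namespace, e.g. hmcr-dev

# 1. Take a logical backup of the running database.
POD=$(oc -n "$NS" get pods -l app=postgres -o jsonpath='{.items[0].metadata.name}')
oc -n "$NS" exec "$POD" -- pg_dump -U postgres -Fc hmcr > "hmcr-${NS}.dump"

# 2. Stop the old version and roll out the new one (fresh PVC and
#    supporting objects are provisioned by the Helm chart).
oc -n "$NS" scale deployment/postgres --replicas=0
helm upgrade --install hmcr-db ./charts/hmcr-db -n "$NS" --set postgres.version=15

# 3. Wait for the upgraded instance, then restore the data into it.
oc -n "$NS" rollout status deployment/postgres
POD=$(oc -n "$NS" get pods -l app=postgres -o jsonpath='{.items[0].metadata.name}')
oc -n "$NS" exec -i "$POD" -- pg_restore -U postgres -d hmcr --clean --if-exists < "hmcr-${NS}.dump"
```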
✅ Result
By the end of this effort:
- HMCR now reliably scales to meet the growing number of users and the load they generate.
- We eliminated config drift between environments and packaged interdependent K8s objects with Helm.
- PostgreSQL upgrades were hands-off and namespace-safe.
- Developers could focus on features, not firefighting YAML bugs.