Your delivery problem is your stack, not your team

Written by CDS Marketing | 148 May 2026

Your delivery problem probably isn’t your team. Most technical directors we speak to have already ruled out the obvious explanations: the team is capable, the process is broadly sensible, and the problem still doesn’t go away.

What we find, consistently, is that the constraint isn’t capability; it’s the stack. More specifically, it’s the proportion of engineering effort that goes on keeping the platform running rather than building on it. That proportion is often much higher than anyone has explicitly agreed to. In one recent survey, 51% of tech leaders said security was the biggest software development challenge for 2025, which is a useful reminder that delivery friction is now as much a security and complexity problem as it is a talent problem.

The three types of slow

It’s worth being precise about what “slow” actually means here, because it tends to show up in three different ways that don’t always get separated out.

Slow to ship: The time between code being ready and that code reaching users is longer than it should be. This isn’t because testing or review are inherently bad, but because the deployment pipeline touches too many systems, too many of which still require manual steps. Harness’s 2025 research found that 50% of deployments still rely on manual steps, and 64% of infrastructure code changes do too. When delivery still depends on handoffs and human coordination, speed becomes a structural issue, not a motivational one.

Slow to recover: When something fails in production and the fix is known within minutes, the hours it takes to get that fix to users usually have nothing to do with diagnosis. They’re a product of the remediation path running through the same heavyweight process as any other release. Harness also found that 44% of organisations rely on manual rollbacks and 43% make rollback decisions based on subjective evidence rather than data, which helps explain why recovery often drags on after the root cause is already understood.

Slow to iterate safely: This is the least visible of the three and usually the most consequential. Teams working on expensive, complex infrastructure become conservative over time:

Preview environments cost money or take time to spin up
Rollback is painful so changes get batched
Security review is a separate gate, so it gets front-loaded
The ability to course-correct late disappears

In the same Harness report, only 34% of engineering teams can quickly spin up pre-built developer environments in the cloud, while 67% cannot build and test dev environments within 15 minutes. That makes experimentation feel expensive, even when the change itself is small.

This last one rarely surfaces in retrospectives because it looks like prudent engineering rather than a structural constraint. But its effect on the organisation’s ability to move is compounding.
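To make the "data rather than subjective evidence" point concrete, here is a minimal sketch of a rollback check driven by error rates. The `ReleaseMetrics` type, the `should_roll_back` function, and all three thresholds are hypothetical and chosen purely for illustration; they are not taken from the Harness report or from any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseMetrics:
    """Request and error counts for one release over the same time window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(
    baseline: ReleaseMetrics,
    candidate: ReleaseMetrics,
    min_requests: int = 500,     # assumed: below this there is too little traffic to judge
    min_increase: float = 0.01,  # assumed: ignore rises under one percentage point
    max_ratio: float = 2.0,      # assumed: roll back once the error rate at least doubles
) -> bool:
    """Return True when the new release is measurably worse than the old one."""
    if candidate.requests < min_requests:
        return False
    if candidate.error_rate - baseline.error_rate < min_increase:
        return False
    if baseline.error_rate == 0.0:
        return True
    return candidate.error_rate / baseline.error_rate >= max_ratio


baseline = ReleaseMetrics(requests=10_000, errors=50)   # 0.5% errors
candidate = ReleaseMetrics(requests=1_000, errors=30)   # 3.0% errors
print(should_roll_back(baseline, candidate))  # True
```

The specific numbers matter less than the fact that the rule is written down, reviewed, and applied the same way every time.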

Where the complexity lives

A useful question to ask honestly: what proportion of your engineering capacity goes on work that doesn’t differentiate your product or service?

Provisioning, patching, orchestration, cache invalidation, capacity planning, load balancer configuration: none of it is wrong and all of it is necessary, but none of it delivers anything directly to your users. It’s overhead. And in most organisations running traditional cloud infrastructure, that overhead is a larger share of engineering time than anyone consciously signed up for.

The harder problem is that this overhead scales with the organisation. More services means more infrastructure to manage, which means more people needed to manage it, more surface area for things to go wrong, and more coordination required before anything can change. Harness’s report shows how quickly the drag appears in practice: 61% of engineering leaders say code reviews take over a day, and 35% of teams do not consistently follow branching strategies across QA, dev, and infrastructure repos. The complexity tax gets heavier as you grow, which is the opposite of what you need.
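One rough way to answer the proportion question is to tag logged engineering hours by category and compute the overhead share. The sketch below assumes a hypothetical list of `(category, hours)` entries and treats the activities named above as overhead; the sample figures are invented for illustration, not measured data.

```python
from typing import List, Tuple

# Categories treated as undifferentiated overhead (taken from the list above).
OVERHEAD = frozenset({
    "provisioning",
    "patching",
    "orchestration",
    "cache invalidation",
    "capacity planning",
    "load balancer configuration",
})


def overhead_share(entries: List[Tuple[str, float]]) -> float:
    """Fraction of logged hours spent on overhead categories."""
    total = sum(hours for _, hours in entries)
    overhead = sum(hours for category, hours in entries if category in OVERHEAD)
    return overhead / total if total else 0.0


# Invented example: one sprint's hours for a small team.
sprint = [
    ("feature work", 60),
    ("bug fixes", 10),
    ("patching", 15),
    ("provisioning", 10),
    ("capacity planning", 5),
]
print(f"{overhead_share(sprint):.0%}")  # 30%
```

Tracking this figure over several sprints shows whether the overhead share is growing with the number of services, which is the pattern described above.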

What a different model changes

The shift to a serverless edge architecture, where compute, storage, security, and routing all live in a single globally distributed platform, removes most of that overhead at the infrastructure layer. Not by giving you better tools to manage the complexity, but by removing a lot of the complexity itself.

For the three failure modes above, the effect is meaningful. Build and deployment pipelines still exist, and they should, but the propagation step that follows them collapses dramatically. A deployment that previously required infrastructure changes to ripple across regions now reaches users in seconds. Recovery from incidents follows the same logic. And iteration becomes genuinely lower-cost when preview environments are disposable, rollback for stateless workloads is straightforward, and security configuration can be versioned alongside application code.

The security side deserves more than a passing mention here. When WAF rules, rate limits, access policies, and TLS configuration live in the same repository and ship through the same pipeline as application code, an entire category of cross-team coordination disappears. Security stops being a handoff and becomes part of the delivery system by being reviewed, versioned, and deployed with the same rigour as everything else. This is a structural change to where risk actually accumulates.
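As an illustration of what "reviewed, versioned, and deployed" can mean in practice, here is a minimal sketch of a security policy expressed as data, with a check that could run in the same pipeline as the application's tests. The `SecurityPolicy` and `RoutePolicy` types and the two rules are hypothetical; they do not represent any vendor's actual configuration format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RoutePolicy:
    path: str
    rate_limit_per_minute: Optional[int] = None  # None means no limit is configured
    requires_auth: bool = True


@dataclass(frozen=True)
class SecurityPolicy:
    min_tls_version: str
    routes: Tuple[RoutePolicy, ...]


ALLOWED_TLS = ("1.2", "1.3")  # assumed house rule for this sketch


def validate(policy: SecurityPolicy) -> List[str]:
    """Return a list of human-readable problems; an empty list means the policy passes."""
    problems = []
    if policy.min_tls_version not in ALLOWED_TLS:
        problems.append(f"min_tls_version {policy.min_tls_version} is not allowed")
    for route in policy.routes:
        if route.rate_limit_per_minute is None:
            problems.append(f"{route.path}: no rate limit defined")
    return problems


policy = SecurityPolicy(
    min_tls_version="1.2",
    routes=(
        RoutePolicy("/api/orders", rate_limit_per_minute=120),
        RoutePolicy("/api/login"),  # rate limit forgotten
    ),
)
print(validate(policy))  # ['/api/login: no rate limit defined']
```

Because the policy lives next to the code, a missing rate limit fails the build in the same pull request that introduced the route, rather than surfacing in a later review.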

OpenSSF’s 2024 survey found that 28% of software development professionals were not familiar with secure software development practices, 50% identified lack of training as a major challenge, and 69% rely on on-the-job experience to learn secure software development. That means security often enters the process as a gate, not a design principle, which slows teams down precisely when they most need to move quickly and safely.

The “stateless” qualifier on rollback matters. Rollback is simple when your application is stateless. Serverless edge shifts the stateful problem rather than solving it; different storage mechanisms carry different consistency guarantees and rollback semantics, and that needs explicit design attention regardless of which platform you’re building on.

The trade-offs worth knowing about

A shift to serverless edge isn’t without its own operational considerations, and it’s worth being clear-eyed about them.

Vendor dependency is real. Moving your compute, routing, and security onto a single platform gives you significant leverage but also significant concentration. That’s a risk management conversation worth having explicitly, not something to discover later.

Long-running processes don’t fit this model well. Serverless edge is excellent for request-response workloads, but if your architecture includes jobs that need to run for minutes rather than milliseconds, you’ll need a different answer for those parts of the stack.

The skills shift is also non-trivial for teams coming from traditional infrastructure backgrounds. OpenSSF found that 75% of developers with less than one year of experience reported a lack of familiarity with secure software development practices, which is a reminder that the learning curve is not just technical; it’s organisational. Underestimating that curve is one of the more common ways migrations take longer than expected.

None of these are reasons to avoid the model, but they are reasons to go in with a clear picture of where it fits well and where it needs supplementing.

The question worth sitting with

For most technical directors considering this shift, the real question is whether the organisation is ready for it, and what the genuine blockers are.

In our experience, they’re rarely purely technical. The harder conversations tend to be about platform risk concentration, vendor dependency, and whether the team’s skills map well onto this model. Those are legitimate concerns and they deserve straight answers.

What’s worth challenging is the assumption that the current setup is the lower-risk option. Distributed infrastructure managed separately from application code, owned by different teams, with security configuration that quietly drifts from the codebase over time - that carries real risk too. It’s just familiar risk, and familiar risk has a way of feeling smaller than it actually is.

A more realistic reading is this: modern software delivery is not being held back by a shortage of talent, but by the amount of engineering attention consumed by maintaining the system around the product. Once you measure that honestly, the conversation changes.

Practical steps to take

If this resonates, the first step is to stop treating delivery speed as a vague cultural issue and start measuring where the time actually goes. Break lead time, recovery time, and iteration time apart, then look at how much of each is spent waiting on infrastructure, approvals, or manual change steps.
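A minimal sketch of that breakdown for a single change is shown below. It assumes you can pull four timestamps from your tooling; the event names, segment labels, and sample times are hypothetical.

```python
from datetime import datetime
from typing import Dict

# (start event, end event, label) for each segment of lead time.
SEGMENTS = (
    ("code_ready", "approved", "waiting on approval"),
    ("approved", "deploy_started", "waiting on manual steps"),
    ("deploy_started", "live", "deploying"),
)


def lead_time_hours(events: Dict[str, datetime]) -> Dict[str, float]:
    """Split one change's lead time into hours per segment."""
    return {
        label: (events[end] - events[start]).total_seconds() / 3600
        for start, end, label in SEGMENTS
    }


# Invented example: one change from "ready" to "in front of users".
events = {
    "code_ready": datetime(2025, 3, 3, 9, 0),
    "approved": datetime(2025, 3, 4, 11, 0),
    "deploy_started": datetime(2025, 3, 4, 15, 0),
    "live": datetime(2025, 3, 4, 15, 30),
}
print(lead_time_hours(events))
# {'waiting on approval': 26.0, 'waiting on manual steps': 4.0, 'deploying': 0.5}
```

In this invented case, 30 of the 30.5 hours are spent waiting, which is the kind of split that turns a vague sense of slowness into a specific conversation.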

Next, identify the parts of the stack that absorb engineering time without creating customer value. If the same teams are repeatedly handling patching, provisioning, rollback coordination, or environment setup, that is usually a sign the architecture is carrying too much operational weight.

Then, look at security as part of the delivery system rather than a separate checkpoint. The goal is not to reduce scrutiny, but to make secure delivery repeatable, versioned, and fast enough that teams can change safely without batching risk into larger releases.

Finally, be honest about fit. Serverless edge is not the answer for every workload, but for teams whose bottleneck is operational overhead rather than product complexity, it can remove a surprising amount of drag. The right question is not whether the model is perfect, but whether it eliminates enough friction to let engineering focus on work that actually differentiates the business.

If you're working through these questions and want to understand how CDS approaches this in practice, our Cloudflare Developer Platform page walks through how we build and deliver on it.
