michael werneburg
 

the problem

DevOps aims to eliminate the old and broken model of information technology in which one team (Development) creates or updates a software solution then hand-builds it in the production environment for another team (Operations) to run. The idea is that the developers can continue working on new business solutions while Operations keeps the existing business running. There are two distinct teams with different objectives. It's "dev vs ops".

This model dates to the early days of business software development, when things were far simpler. It's not hard to list complexities that didn't exist when this model was created: flat "growth" and its consequences, such as continuous mergers and acquisitions and constant cost-cutting; cyber-crime; regulatory involvement in the robustness of business systems; constant turnover in technologies; a multiplicity of computing platforms; an acceleration of change in business environments; a new focus on previously un-costed "externalities" like environmental impacts; minority shareholder activism; globalization of markets, work force, and supply chains; consumer privacy concerns; and so on.

the unbearable persistence of "dev vs ops"

Any model that predates major changes to the environment it serves is unlikely to remain fit for purpose. The mis-fit development/operations model persists in IT departments because organizational structures perpetuate it. It persists because a lack of training and research within technology departments leaves the staff complacent. And it persists because businesses don't know that they could and should demand better.

The latter point has many drivers but some important ones are:

ramifications

The biggest negative outcome of this "dev vs ops" regime is that system delivery becomes so difficult and error-ridden that problems are hard to fix and people come to think of change as dangerous. The ramifications of this are many.

As recently as late 2019, I heard from a small technology firm with which I was interviewing that the Operations team simply didn't get any guidance from the Developers when a release happened. The supposed "run" team didn't actually know how to run the software. What I have been seeing for nearly thirty years under the "dev vs ops" model is that Operations comes to resent the developers for delivering unstable software, and Development resents the lack of trust from Operations. When something goes wrong, we descend into finger-pointing and the result is always a patch. So the cycle is: long and difficult release -> production error -> recrimination -> patch.

No one considers security, performance, scalability, or other non-functional requirements to be their responsibility. The ops team can't change the code, so they don't see the environment as theirs; the dev team doesn't understand production, so they don't consider anything but the features the business asked them to build.

IT fails to deal with changes to the business in a timely and effective fashion. This reinforces the hopeless reputation of the IT department, and the cycle perpetuates itself.

the solution

Working in an environment with a traditional build/run dichotomy, I designed a DevOps team that included several functions, either in fact or virtually: systems administration, database design and support, IT security, specialists in cloud platforms such as Azure and Salesforce, and specialists in the software delivery tools. These last specialists understood practices such as version control, continuous integration and delivery, and the use of telemetry in detecting and addressing production issues.

This design was supported by a clear statement of the roles and responsibilities that would be carried out by DevOps and by the other teams, notably the developers who provide the software releases. This explicit statement was required to ensure that not everything fell to the new team, and that the new team had the support it needed from the others: code provided via a repository and pipelines, a standard programming language, and so on.

I was then tasked with building this team, which took time: it required several months of negotiation to get the different management structures to step up and contribute the members needed. Eventually we resorted to hiring. The team now functions as owner of the cloud platform and is working with the developers on incrementally taking on more responsibility for how code is delivered; it's not just a run team. In time, I expect this organization to facilitate self-support by the development teams, to the point that changes come directly from those teams and the DevOps team acts primarily as the provider of the tools that allow this to happen.