Keep Networking Simple - A Guide to Helping your IT Department Overcome Complexity
Jun 22, 2018
There is little doubt that enterprise IT is prohibitively complex. This is true for IT as a whole and for networking in particular. Complexity serves as an anchor, holding enterprises back from the network efficiency and simplicity they need to succeed.
As we speed toward a multicloud future, the complexity problem has to be solved. And while that solution will certainly be technology focused, the business side has a crucial role to play in helping enterprise IT simplify and solve.
Learning from the cloud
What companies do you think of when you think agile? Chances are that you are imagining the major cloud providers and SaaS companies. Names like AWS, Google and Facebook probably race to the top of your mind. But what do they do that makes them so much faster than everyone else?
Yes, they have significant developer skills, but operations begin well before a single line of code is written. These companies have made a decision to put operations first. Operations are not an afterthought, handled by a team that barely has a voice during the design decisions. Instead, the operations teams are included in early architecture discussions, funneling their requirements into the process. If you do nothing else, make sure that the IT operations team is well-represented—and with an equal voice—at the outset.
The obstacle of diverse operating environments
From an architecture perspective, the single-most important thing that an organization can do to help drive complexity down is to reduce diversity in the operating environment.
How can anything be called simple when everything is hyper-contextual? If the underlying infrastructure is a patchwork mosaic of different systems and software cobbled together over two decades, there is no hope of simplifying operations. In many ways, the biggest impediment to eradicating complexity is a lack of will to ever remove anything. A second challenge comes with new requests. If you, as a business leader, are requesting new tech, especially just another version of what is already in place, is it worth the crossing tax that will eventually come as it matures and integration becomes necessary? When you have new requests, are you engaging cross-organizationally on requirements to find the best overall solution, or are you accepting quick fixes just so your team can get the money spent this quarter?
How can you help your IT leaders? A start would be to list all of the devices, versions of software, protocols, policies and more that your team uses. Continue tracking that list year-over-year. If the list is not becoming significantly shorter, then your team is moving in the wrong direction. There need to be explicit draw-down plans to settle on the smallest set of building blocks possible.
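The year-over-year tracking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory entries, the years, and the function names (diversity_trend, is_shrinking) are all hypothetical, and a real inventory would come from whatever asset or configuration database your team already uses.

```python
# Hypothetical inventory snapshots, one per year. Each entry is a
# (category, item) pair; the names are placeholders, not real products.
inventory_by_year = {
    2016: {
        ("device", "switch-model-a"),
        ("device", "switch-model-b"),
        ("software", "os-v1"),
        ("software", "os-v2"),
        ("protocol", "legacy-routing"),
    },
    2017: {
        ("device", "switch-model-a"),
        ("device", "switch-model-b"),
        ("software", "os-v2"),
        ("protocol", "standard-routing"),
    },
    2018: {
        ("device", "switch-model-a"),
        ("software", "os-v2"),
        ("protocol", "standard-routing"),
    },
}


def diversity_trend(snapshots):
    """Return (year, number of distinct building blocks) in year order."""
    return [(year, len(snapshots[year])) for year in sorted(snapshots)]


def is_shrinking(trend):
    """True only if every year has fewer distinct items than the year before."""
    counts = [count for _, count in trend]
    return all(later < earlier for earlier, later in zip(counts, counts[1:]))


trend = diversity_trend(inventory_by_year)
for year, count in trend:
    print(year, count)
print("moving in the right direction:", is_shrinking(trend))
```

The check here is deliberately strict: any year in which the list fails to get shorter is flagged. A team might reasonably choose a softer rule, such as comparing only the first and last years.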
And the process should reinforce uniformity. Expediency is rarely a good reason to make one-off changes that feel good in the moment but lead to accumulating technical debt. Technical debt is the term that IT uses to describe the cost of additional rework, integration and operations overhead caused by the accumulation of many individual decisions to go with easy solutions that are incomplete or one-off. These solutions are often purposefully selected over solutions that take longer or cost more, but that more effectively set up long-term success. The extra cost of these decisions adds up over time. Diverging from the norm should be intentionally difficult and require extra scrutiny.
Be wary of one-way doors
Jeff Bezos once cautioned people to be particularly careful when walking through one-way doors. What he meant was that any decision that is irreversible needs to be made with thought and precision.
As leaders drive change in IT, it’s important to make sure that every decision does two things: moves the company forward and leaves options open. It’s obvious that the future is multicloud, which means that every decision should make the network better and prepare the infrastructure for the continued move toward multicloud.
At every major decision point—initial design, solution purchase, day-2 changes—leaders should explicitly inspect to ensure that there is an identified path forward and it includes the possibilities for alternatives. If the team cannot articulate the path and how options remain available in architectures, vendors and operations to reduce the risk on new decisions, then hold the decision and send the team back to complete the exercise. While the extra time and effort might seem painful, it will be far more palatable than a rip-and-replace later on.
Cattle and pets
One of the seminal moments in the early cloud days was the distinction described with an analogy of cattle versus pets. In the old way of doing things, we treat our servers like pets—as an example, if the mail server goes down, it’s all hands on deck. The CEO can’t get her email, and it’s essentially the end of the world. In the new way, servers are numbered, like cattle in a herd. For example, www001 to www100. When one server goes down, it’s removed and quickly replaced.
While the analogy was about servers, the same can be said of networking equipment. In a Google network, if a switch fails, they don’t troubleshoot it. They just replace it with another identical model that has the standard configuration on it.
The point here is that your teams should not fall in love with their devices. Devices should not be treated as individuals. They should be identical. And simple things like naming can help reinforce that dynamic. Ask your team how they are selecting the software and devices for their infrastructure. How are they adopting a building block approach that eases swap-out during trouble events, makes it easier to scale the architecture, or enables trade-in for newer models with more advanced capabilities?
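As a small illustration of the numbered naming scheme mentioned above (www001 to www100), here is a minimal Python sketch. The function names (herd_names, device_record) and the contents of STANDARD_CONFIG are assumptions made up for this example; they do not describe any particular vendor's tooling or any real network's settings.

```python
# Hypothetical standard configuration shared by every device in the herd.
# The keys and values are placeholders for illustration only.
STANDARD_CONFIG = {"os_version": "example-os-1.0", "mtu": 9000}


def herd_names(role, count, width=3):
    """Generate numbered names such as www001..www100 for a given role."""
    if count < 0:
        raise ValueError("count must not be negative")
    return [f"{role}{number:0{width}d}" for number in range(1, count + 1)]


def device_record(name):
    """Build a device record: only the hostname differs between devices."""
    return {"hostname": name, **STANDARD_CONFIG}


names = herd_names("www", 100)
print(names[0], names[-1])

# A replacement for a failed device reuses the same name and the same
# standard configuration, so nothing about it is unique.
replacement = device_record(names[41])
print(replacement["hostname"])
```

Because every record is built from the same template, swapping out a failed device amounts to applying the standard configuration to an identical model and reusing its number.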
When enterprise IT is left to evolve organically over a couple of decades, strange things start to happen. For enterprises of even moderate size, there are devices that have been in production longer than nearly 90% of the current employees have been with the business. In IT, budget tends to be allocated to new things under the flawed premise that “if it isn’t broken, don’t fix it.”
But that mode of operation leads to infrastructure drift. Old devices live alongside newer devices. Capabilities don’t stay in sync. Unique pieces of infrastructure, like snowflakes, start to form. And those snowflakes disrupt operations.
Management should be aggressive about replacing aging equipment. While the first wave of refresh is difficult from a CapEx planning perspective, the operational gains can be significant. It also allows teams to get more for their dollar, as newer generations of equipment almost certainly come with performance advantages.
When refresh becomes part of the normal capital allocation process, the need for Big Bang projects lessens, and maintaining continuity across multiple years becomes significantly easier, all while simplifying the underlying infrastructure.
Do not be fooled into thinking that complexity is only an artifact of technology decisions. For the vast majority of enterprises, complexity starts above the technology stack with the processes and teams. But by making simplicity an explicit effort that is managed and tracked alongside more traditional IT KPIs, meaningful change can be driven.
What KPIs are you using today? How is complexity measured? Is there a baseline? Is there a target? Are there well-understood initiatives with budgets and deadlines? Is it part of the daily dialogue?
If the answer to any of these is no, it’s likely there is still opportunity to do more. And given the existential threat that unnecessary complexity represents, can you really afford not to?