Welcome to part VIII in our multi-part blog series on the Self-Driving Network™. Find part VII here, and be sure to listen to two new podcasts featuring Kireeti Kompella describing the Self-Driving Network and the challenges we face as we build it. The first podcast is from automation guru Ivan Pepelnjak of ipSpace, while the second is from the always entertaining Packet Pushers.
Much of the recent hype in machine learning has been in the areas of image recognition, language translation, and natural language processing (NLP). And rightly so. Significant breakthroughs have been made in these areas in the last several years, to the point where, in some cases, the machines have become better than humans. Image recognition doesn't interest us particularly in the networking business, but Juniper is eagerly participating in NLP as we experiment with new ways for operators to interact with their networks, including chatbots and voice. And what we have often called "declarative programming" or "intent-driven" capabilities, including those we've built into Contrail, NorthStar, and other systems over the last few years, are additional feats of language processing.
Those on the leading edge of machine learning research often claim that developing the algorithms is not the most difficult aspect of the job, nor is it even the most critical. They’ll tell you that the biggest hurdle is gathering the training data. Specifically, it’s finding enough data to train the algorithms, and just as important, it’s having that data in a structured format.
Let’s take image recognition as an example. You can’t create a system that discerns a bad apple from a good one unless you have many thousands of pictures of apples, each labelled “good” or “bad.” The only way to generate this kind of information is through manual labor. Imagine humans poring over pictures of apples and deciding which ones they would eat.
Or think about language translation. Languages are incredibly complex, filled with grammatical exceptions and semantic nuance. You can’t train a translation algorithm without first having access to massive amounts of translated examples across different contexts. French/English translation algorithms benefitted immensely from many years of Canadian parliamentary debates that had been previously translated (by humans). But similarly rich data pairs do not exist for most languages.
In the networking world, however, we typically don’t face these same hurdles. I’m not here to tell you that collecting network data (i.e., telemetry) is trivial; it’s just that we already have mountains of data, and most of that data is pretty well structured. Log files contain dates, times, transactions, IP addresses, severity levels, and so on.
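To make that concrete, here is a minimal sketch of turning one log line into a structured record. The log format, field names, and addresses are invented for illustration; real syslog formats vary by vendor and facility, but the point stands: the structure is already in the text, waiting to be captured.

```python
import re
from datetime import datetime

# A hypothetical syslog-style line; real formats vary by vendor.
line = "2024-03-01T14:22:07Z 192.0.2.15 WARNING BGP: peer 198.51.100.4 session reset"

# One regex with named groups captures the structure already present.
pattern = re.compile(
    r"(?P<timestamp>\S+)\s+"
    r"(?P<source_ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<severity>[A-Z]+)\s+"
    r"(?P<message>.*)"
)

match = pattern.match(line)
record = match.groupdict()
# Promote the timestamp string to a real datetime for sorting/windowing.
record["timestamp"] = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
print(record["severity"], record["source_ip"])
```

A few lines of this kind of parsing is often all that separates raw telemetry from a feature table ready for a learning algorithm.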

When we’re building an unsupervised learning algorithm to cluster data for anomaly detection and analysis, our biggest challenge is not a scarcity of data in a format ready to process. The more likely challenge is how best to optimize a k-means algorithm so that it runs hyper-efficiently, precisely because we have so much data to run through. In fairness, when you take the next step after an unsupervised learning initiative and move into automated resolution of the anomaly, labelled outcome data is trickier to come by (e.g., the digital records of a network engineer performing steps X, Y, and Z to fix the problem).
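As a sketch of the unsupervised step, here is a plain-Python k-means (Lloyd's algorithm) run over a handful of invented telemetry samples, with points far from every centroid flagged as anomaly candidates. The feature names, values, seeding, and threshold are all illustrative assumptions, not any production implementation; at scale you would also normalize features, since packets-per-second dominates the Euclidean distance here.

```python
import math

# Toy telemetry samples (cpu_util, packets_per_sec); values are invented.
points = [(0.20, 100), (0.25, 110), (0.22, 105),   # a "quiet" cluster
          (0.80, 900), (0.82, 920), (0.78, 880),   # a "busy" cluster
          (0.50, 450)]                              # the planted outlier

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(data, seeds, iters=10):
    """Plain Lloyd's algorithm; centroids seeded deterministically."""
    centroids = list(seeds)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in data:
            i = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[i].append(p)
        # Move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

# Seed one centroid in each apparent cluster to keep the run deterministic.
centroids = kmeans(points, seeds=[points[0], points[3]])

# Anything far from every centroid is an anomaly candidate; the threshold
# is hand-picked for this toy data.
threshold = 150.0
anomalies = [p for p in points
             if min(dist(p, c) for c in centroids) > threshold]
print(anomalies)  # the planted outlier stands out
```

The interesting engineering problem, as the paragraph above notes, is not finding data to feed this loop but making the assignment and update steps fast when `points` holds billions of samples.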
No doubt substantial challenges remain as we apply more machine learning techniques to network design and operation. And those challenges do include a myriad of issues related to the underlying data, such as normalization, ownership, privacy, sharing, and correlation across different domains. But networks generate and collect enormous amounts of data, and most of it is well structured and just waiting to be fed into clever machine learning algorithms so that your network will ultimately run more smoothly.