When I talk with customers about automation, the discussion often runs to which language you should use or what tool is needed. However, the first step in figuring out automation is to look at the workflow of information that you will need to process to achieve your desired state. While this certainly isn’t as exciting as jumping right into the code, it is required when thinking about the architecture of automation. Much like designing a network, you need to put a great deal of thought into designing automation.
Automation needs answers to a few questions before you can begin to design it:
Where do you receive input for automation?
What devices do you need to automate?
How do you apply the automation?
Do you validate the automation after it has run?
When running an automation task, you typically need some initial input before the automation begins. This may come in the form of a change ticket with details, a list of new IP addresses to add to a security policy delivered through a message queue, or information pulled out of a database. This is the element that effectively provides the source of data for your upcoming change. Unfortunately, this information doesn’t always come in a form that is easily automatable.
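Whatever the source, it helps to turn the input into one validated structure before anything touches a device. Here is a minimal sketch of that idea in Python. The field names (`ticket_id`, `policy`, `addresses`) and the `ChangeRequest` type are assumptions made up for illustration, not part of any particular ticketing or queueing system.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    """Structured form of a request, whatever source it arrived from."""
    ticket_id: str
    policy: str
    addresses: tuple


def parse_change_request(raw: dict) -> ChangeRequest:
    """Validate a loosely structured request and normalize its addresses."""
    required = ("ticket_id", "policy", "addresses")
    missing = [key for key in required if key not in raw]
    if missing:
        raise ValueError(f"request is missing fields: {missing}")
    # ip_network raises ValueError on anything that is not a valid prefix,
    # so bad input is rejected here rather than on the device.
    networks = tuple(
        str(ipaddress.ip_network(addr.strip(), strict=False))
        for addr in raw["addresses"]
    )
    return ChangeRequest(str(raw["ticket_id"]), str(raw["policy"]), networks)
```

A ticket, a queue message, or a database row can each be mapped to the same dictionary shape first, so the later steps only ever deal with a `ChangeRequest`.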
The most basic ways to initiate an automation task are an email, someone yelling down the hall at you, or a help desk ticket. These inputs are typically received manually, and the data in them is carried by hand into the following steps. This may sound like a bad thing, almost as if someone has delivered you a change request on a stone tablet. However, having checkpoints in the automation process isn’t horrible. It can prevent automation from running wild or give you a chance to have an approval process for a network change.
More advanced automation systems receive data from a more interesting upstream source to kick off the automation task. I mentioned a few of these above, but let us think about something more exciting, like robots. GitHub has an amazing bot called Hubot. In its simplest form, it can receive information in a chat channel and then take an action based upon what was said. A simple example of Hubot in action is its ability to execute Google image queries on your behalf (Hubot Google image search example).
Let’s take this example and apply it to networking. Imagine Hubot being able to check your network routing table for you, or alert you that there is an issue in your network. While this does not exist today (although I do take requests), it would be an example of a system acting on automatically received input. In the past I have used Hubot in a similar manner to check deployed versions of software builds or to alert me to a GitHub push. Many companies today use instant messaging chat systems as a form of communication, so why not add a little automation to that as well? It is an interface that is easy for people to understand, and it is darn convenient.
While the Hubot example is excellent, it also helps me bring up the next step: determining where to apply your automation. If we continue with this example, Hubot would need to know more before it could talk to our devices. It would need to know which devices to check and how to authenticate to them, and it would need reachability to the devices. This is pretty important to think about, as handing out your network device credentials is an extreme security risk. If I were to implement Hubot to do these tasks for me, I would use an SSH key for authentication and dedicate that specific key to Hubot so it can easily be changed or revoked. This same challenge applies to your own automation as well. It could be Hubot, Ansible, or your own custom script. These considerations are universal to any automation acting on your behalf.
To pick a specific device, you would probably want to specify it to Hubot via the chat channel. The command to it would look like “hubot check route tables on device bigrouter.ketchup.ninja”. Lastly, it would be a good idea to put some access restrictions around what commands it could run. On Junos devices this would be pretty easy to do with access privileges. We don’t want Hubot to become Skynet, now do we?
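To make the command format and the access restriction concrete, here is a rough sketch in Python. It is not Hubot code (Hubot scripts are written in CoffeeScript or JavaScript). The `ALLOWED_ACTIONS` table, the regular expression, and the `parse_chat_command` name are all hypothetical, and the mapping of chat phrases to “show” commands is only an illustration.

```python
import re

# Hypothetical allow-list: chat phrases mapped to read-only device commands.
ALLOWED_ACTIONS = {
    "check route tables": "show route",
    "check interfaces": "show interfaces terse",
}

# Matches messages such as "hubot check route tables on device r1.example".
COMMAND_PATTERN = re.compile(
    r"^hubot\s+(?P<action>.+?)\s+on\s+device\s+(?P<device>[A-Za-z0-9.-]+)$",
    re.IGNORECASE,
)


def parse_chat_command(message: str):
    """Return (device, device_command), or None if the request is not permitted."""
    match = COMMAND_PATTERN.match(message.strip())
    if match is None:
        return None
    device_command = ALLOWED_ACTIONS.get(match.group("action").lower())
    if device_command is None:
        # Unknown or disallowed action: refuse rather than guess.
        return None
    return match.group("device").lower(), device_command
```

An allow-list in the bot is a second layer, not a replacement for restricting the bot's account on the device itself. If the bot is ever tricked or misconfigured, the device-side privileges are what actually stop it.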
Choosing how we want to apply the automation is a pretty critical step. The term idempotence is thrown around for this problem. The basic idea is that you can apply the same change multiple times and the end result is the same as after the first application. Lucky for us, this is generally how network devices are configured. You can perform a CRUD (Create, Read, Update, Delete) set of actions on the configuration. While this is generally true, there may be other dependencies you need to check for your configuration. An example on Junos is that a prefix-list or address object must exist before it is applied to a policy. These dependencies generally become the biggest challenge when thinking about what to apply to a device. So before you start sending configuration, you want to do a fair amount of testing to ensure you do not run into any issues when applying it. Generally, “show”-style commands can be run from automation without issue.
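Both ideas, idempotence and dependency checking, can be sketched against a toy in-memory model of a configuration. This is only an illustration in Python: the dictionary layout and the function names `ensure_prefix` and `ensure_policy_reference` are assumptions, and a real device would be driven through its own API rather than a dict.

```python
def ensure_prefix(config: dict, prefix_list: str, prefix: str) -> bool:
    """Add prefix to prefix_list if absent. Returns True only if config changed."""
    prefixes = config.setdefault("prefix_lists", {}).setdefault(prefix_list, set())
    if prefix in prefixes:
        return False  # already present, so applying again is a no-op
    prefixes.add(prefix)
    return True


def ensure_policy_reference(config: dict, policy: str, prefix_list: str) -> bool:
    """Reference prefix_list from policy, refusing if the list does not exist yet."""
    if prefix_list not in config.get("prefix_lists", {}):
        # Dependency check: the object must exist before a policy can use it.
        raise LookupError(f"prefix-list {prefix_list!r} must exist before use")
    references = config.setdefault("policies", {}).setdefault(policy, set())
    if prefix_list in references:
        return False
    references.add(prefix_list)
    return True
```

Calling either function a second time with the same arguments leaves the model untouched and returns `False`, which is a handy signal for logging whether a run actually changed anything.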
This leads into our last point: validating your automation. Typically, in a scripted style of automation, you would run a set of commands and then let the script exit. But what if the commands that are run cause an issue on the device? This is where it is important to have your automation validate its changes. In the case of running “show” commands, you don’t have to validate them, as they don’t change anything on the device. If you change and commit the configuration, then you want to validate that the change occurred as you expected it to. Without this, who watches the automation? Who watches the Watchmen?
I find that the validation step is often skipped because of the time it takes to write. Really, this is pretty typical in programming. Developers want to get the code to work and then go back to do stricter validation of the program’s input or output. That is why we have seen so many issues with things like SQL injection, cross-site scripting, and buffer overflows. With this in mind, you ideally want to pull the configuration, or a snippet of it, to ensure that the change was applied. This validation is the minimum that you would want to do. In the best-case scenario for a network change, you would actually try to pass traffic. This way you validate not only that the change was applied but also that it actually works.
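A minimal sketch of those two levels of validation, in Python, might look like the following. How the configuration is fetched and how traffic is tested are left as callables you supply. `fetch_config` and `traffic_check` are hypothetical hooks, not functions from any library, and the comparison assumes the configuration is available as one statement per line.

```python
from typing import Callable, Iterable, Optional


def missing_lines(fetch_config: Callable[[], str], expected: Iterable[str]) -> list:
    """Return the expected lines that are absent from the fetched configuration."""
    running = {line.strip() for line in fetch_config().splitlines()}
    return [line for line in expected if line.strip() not in running]


def change_is_good(
    fetch_config: Callable[[], str],
    expected: Iterable[str],
    traffic_check: Optional[Callable[[], bool]] = None,
) -> bool:
    """Minimum check: the config contains the change. Better: traffic also passes."""
    if missing_lines(fetch_config, expected):
        return False
    if traffic_check is not None:
        return bool(traffic_check())
    return True
```

Passing the fetch and traffic steps in as functions keeps the validation logic testable without a device: a test can hand in a function that returns a canned configuration string.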