What is Devops?
For several years now, one of the hottest (and most beaten-to-death) terms thrown about in the IT industry, and in network administration in particular, has been devops. And devops is a great thing. It enables a tiny team to do the work of entire departments.
There is only one problem.
Most of us cannot even define it. This certainly includes many people who have been doing what either they themselves consider devops, or what they are told by their managers is devops. It becomes this fuzzy nonspecific term like ‘synergy’ or a ‘value added outside-the-box paradigm shift’. Soft, obvious marketing gobbledygook with no substance.
Except that isn’t the case with devops.
Devops is a specific thing, and it can be derived by looking at the name itself.
Dev-Ops. Doing operations work using software development tools and methodology.
Ok great, thanks Mr. Obvious, don’t you have some hotel tickets to pimp? I’ll admit, this isn’t necessarily the clearest answer up front, but it works well as a filter for examining a particular issue.
One thing to keep in mind is that devops technologies are about scale. If you have a single web host on a hosted VPS, most of the time anything in the devops landscape will be overkill. Of course, do you ever intend to scale in the future?
So, as an example of a problem that can be solved by looking through a devops lens, let’s look at an organization, Harry’s Aerospace, Inc. They run 150 servers on various versions of Linux-based operating systems, and 15 Windows-based servers, again running different versions.
Every two weeks or so, a new set of security updates comes out for the base servers. Typically, a team of engineers in the NOC will do a rotation, usually on third shift local time, manually logging onto each server using an ssh loop. Then they will go through the Windows machines, RDPing into each and performing the updates. The network links are saturated, traffic is live, and systems are running slowly. The process takes 1.5 to 4 hours and 6 engineers. Cost to the company is through the roof, and everybody is rubbing their wrists to ward off the RSI.
To make matters worse, there is this mysterious demand coming down from upper management via several teams of developers, all wanting to update code, add new features, and change everything you JUST got patched! And so they should. Sites stagnate and die without fresh features and content.
Very quickly this situation escalates into an unmanageable nightmare. Tickets accumulate in the backlog, emails get more aggressive, and eventually some poor engineer is rage-quitting at the end of the day with strong liquor, swearing off the impossible position he seems to be in.
Back to devops. Devops is the solution to this problem.
Doing any of these tasks on one machine is easy, and the toughest part really comes down to exactly specifying the task itself. Hell, even what each engineer does on each individual machine isn’t hard, just painfully tedious and error-prone.
A better solution would be to let the backlog accumulate for a week or however long it takes, and allocate those engineers to building a machine to do the work for you.
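To put rough numbers on that trade, here is a back-of-the-envelope sketch in Python. The engineer count, the 1.5 to 4 hour range, and the two-week cycle come from the scenario above. The 40-hour week, the one-week build, and the half hour of supervision per automated run are my assumptions, so treat the output as an illustration and not a measurement.

```python
# Rough break-even estimate for automating the patch cycle.
# Values marked "assumed" are illustrative and not part of the scenario.

ENGINEERS = 6                     # from the scenario
HOURS_PER_CYCLE = (1.5, 4.0)      # from the scenario: best and worst case
CYCLES_PER_YEAR = 26              # "every two weeks or so"
BUILD_WEEKS = 1                   # assumed: the automation takes one week to build
HOURS_PER_WEEK = 40               # assumed working week
AUTOMATED_HOURS_PER_CYCLE = 0.5   # assumed: one engineer supervising a run


def manual_hours_per_year(hours_per_cycle):
    """Engineer-hours spent per year patching by hand."""
    return ENGINEERS * hours_per_cycle * CYCLES_PER_YEAR


def break_even_cycles(hours_per_cycle):
    """Number of patch cycles before the build time pays for itself."""
    investment = ENGINEERS * BUILD_WEEKS * HOURS_PER_WEEK
    saved_per_cycle = ENGINEERS * hours_per_cycle - AUTOMATED_HOURS_PER_CYCLE
    return investment / saved_per_cycle


for hours in HOURS_PER_CYCLE:
    print(
        f"{hours} h/cycle: {manual_hours_per_year(hours):.0f} engineer-hours/year, "
        f"break-even after {break_even_cycles(hours):.1f} cycles"
    )
```

Under these assumptions the manual process eats somewhere between 234 and 624 engineer-hours a year, and the week of building pays for itself in roughly 10 to 28 patch cycles. That counts only the security updates; every developer deployment that rides on the same automation shortens the payback further.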
There are lots of tools for this task: Puppet, Chef, Ansible, and the list goes on. Which one you pick is honestly irrelevant to this point. The main thing is to use one of them. Set it up, and automate the job.
Now, instead of logging into and managing dozens of mixed servers, you log onto one or two systems, promote some code, and either let it run its course or force the matter with an imperative run. That’s it. Need to revert because of some horrible bug in the new whiz-bang feature? No sweat: you obviously have your source in git, so when a change turns out to be bad, you revert to the last known good branch or release, and everything is running properly again in minutes.
Suddenly your machines are all running the same code, are organized into logical units, are easily searchable, and, even better, can be easily modified by a single engineer at any level, with changes taking effect right away. The devs need an update pushed out? No problem.
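If the idea of "promote some code and let it run" feels abstract, here is a toy model of it in Python. This is not how Puppet, Chef, or Ansible are actually implemented; it only sketches the general pattern of declaring a desired state per group of machines and computing what has drifted. Every host name, group name, package, and version number below is made up for illustration.

```python
# Toy model of desired-state configuration management.
# All host names, groups, packages, and versions are hypothetical.

# What each logical group of machines SHOULD be running.
# In real life this is the file you keep in git.
DESIRED = {
    "linux-web": {"openssl": "3.0.13", "nginx": "1.24.0"},
    "windows-app": {"dotnet-runtime": "8.0.4"},
}

# What each machine is ACTUALLY running right now.
INVENTORY = {
    "web01": {"group": "linux-web",
              "installed": {"openssl": "3.0.11", "nginx": "1.24.0"}},
    "web02": {"group": "linux-web",
              "installed": {"openssl": "3.0.13", "nginx": "1.24.0"}},
    "app01": {"group": "windows-app", "installed": {}},
}


def plan(inventory, desired):
    """Return {host: {package: target_version}} for hosts that have drifted."""
    changes = {}
    for host, facts in inventory.items():
        wanted = desired[facts["group"]]
        drift = {pkg: ver for pkg, ver in wanted.items()
                 if facts["installed"].get(pkg) != ver}
        if drift:
            changes[host] = drift
    return changes


def apply(inventory, changes):
    """Pretend to install packages by updating the recorded state."""
    for host, drift in changes.items():
        inventory[host]["installed"].update(drift)


first = plan(INVENTORY, DESIRED)   # only the drifted hosts show up
apply(INVENTORY, first)
second = plan(INVENTORY, DESIRED)  # running again finds nothing to do

print(first)
print(second)
```

The second run coming back empty is the important part: you can run the thing as often as you like, and it only touches machines that are out of line. Reverting a bad release is the same operation in reverse. Restore the previous version of `DESIRED` from git, run the plan again, and the machines converge back to the last known good state.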
This isn’t a sales pitch for any particular tool, or even for packaged tools altogether. It is about looking at these sorts of problems with a cost/benefit mindset, allowing yourself to invest some time in DEVELOPING solutions to problems instead of just hammering away at the symptoms, in this case updates and upgrades.
What problems have you solved by stepping back and acting instead of reacting?