DevOps: what do you need to learn?
I was recently asked in a forum, by a student, what it takes to be a good DevOps engineer. The answer is not easy and will vary a lot from one person to another, based on different experiences in different contexts and environments.
DevOps is a management philosophy: you manage an infrastructure with automation scripts so you can adapt to any situation quickly and efficiently. The goal is to keep a high availability ratio and keep up with an increasing request load on your infrastructure by scaling the number of servers required at any given moment to cope with the load from customers. More load equals more servers, and more servers equal more time and effort to manage them, so a better way was needed to increase the ratio of servers per human managing them. Operations development was born when companies such as Facebook and Google grew so big, in number of servers, that it was impossible to manage them the classic way. Most servers were doing mostly the same thing and shared the same configuration and setup, so automation became a requirement, and nowadays it is becoming the standard way to manage an infrastructure.
I won’t go further into the philosophy and history, as there is plenty of better documentation around, by better writers than me, covering the subject in its whole scope. I will instead focus on the technologies usually found in a cloud infrastructure and in a DevOps environment. Technologies in this field are constantly evolving and revolutionary new ones keep popping up once in a while. This list is more a reference to get familiar with the current technologies, their reason to be and why they were created. Investigating each project will give more insight into the context in which it was created and the challenges it tries to solve, and will give you a foundation of what to study to become a good DevOps engineer.
1- Linux Operating System
You must master the Linux operating system as a power user or system administrator. (No need to know the developer side, like libraries; you don’t even need to know how to recompile a kernel, *we almost never do it any more nowadays*.) Don’t learn one distribution: master the Linux standard first, then learn the distributions, as you will likely work with both Debian-based and RedHat-based systems.
- Operating System basic structure, kernel, directory tree, and so on.
- Hardware and how a server operates (RAID, Fibre Channel, InfiniBand, SAN, NAS, hot swap, …)
- Networking (TCP/IP, Routing, OpenFlow, VLANs, …)
- Backup Solutions (Tivoli TSM, BackupExec, Bacula, …)
- This list can go on very long; it is a never-ending story
2- Cloud Platform or Infrastructure as a Service (IaaS)
AWS, Azure and Google Cloud are platforms like many others. What you need is to get familiar with one of them, learn how to interact with it programmatically, and learn what frameworks are available to reuse code functionality without reinventing the wheel with fresh code of your own. AWS is mature enough that you can rely on a strong framework and reuse existing code.
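As a sketch of what "interacting programmatically" looks like, here is a minimal example using boto3, the official AWS SDK for Python. The helper that filters the response is my own illustration, not part of the SDK; running `main()` would require boto3 installed and AWS credentials configured.

```python
def running_instance_ids(response):
    """Extract the IDs of running instances from a describe_instances-style
    response dict (the shape boto3's EC2 client returns)."""
    ids = []
    for reservation in response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            if inst.get("State", {}).get("Name") == "running":
                ids.append(inst["InstanceId"])
    return ids

def main():
    # Requires boto3 and configured AWS credentials; shown only as a sketch.
    import boto3
    ec2 = boto3.client("ec2")
    print(running_instance_ids(ec2.describe_instances()))

# main() would print the running instance IDs of your account when executed
# with credentials in place.
```

The same pattern (client object, API call, filter the structured response) applies to Azure and Google Cloud with their own SDKs.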
3- Docker
Docker is another platform, usually seen in private clouds, but Docker has to be seen as a service container rather than a VM. It is a chroot jail on steroids in the Linux world: basically, you just isolate the service from the rest of the OS for security, stability and reusability.
The philosophy of Docker is that you deploy a new version of a container rather than maintaining a fleet of containers, combined with Agile development and continuous delivery. *For an OS update, you update your source file, deploy a new container from that new version and kill the old one.*
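The deploy-new-and-kill-the-old philosophy above can be sketched with a minimal Dockerfile; the base image, package and paths here are illustrative, not a recommendation:

```dockerfile
# Hypothetical example: every release builds a fresh image from source,
# instead of patching running containers in place.
FROM debian:stable-slim
RUN apt-get update && apt-get install -y --no-install-recommends nginx \
    && rm -rf /var/lib/apt/lists/*
COPY ./site /var/www/html
CMD ["nginx", "-g", "daemon off;"]
```

An OS or application update then means rebuilding the image (e.g. `docker build -t mysite:2.0 .`), starting containers from the new tag, and killing the old ones.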
4- Vagrant / OpenVZ
Mostly used to generate labs as needed. Vagrant is a separate entity, and I’m not sure why we would use it nowadays: it is a way to run a type of VM, and I’ve never been a fan. At that point I would rather use KVM, Xen or VMWare with the corresponding API, unless I’m looking to build a lab and I don’t have access to an infrastructure to host it.
5- Private Cloud Platform
This is a game changer right now. VMWare ESXi is a hypervisor, like Xen, KVM, OpenVZ and so on (Vagrant, in contrast, only drives VMs on top of a hypervisor). The cloud platform is built on top of the hypervisor: VMWare vCloud Director, OpenStack.
I would bet on OpenStack, given the current information:
- IBM is investing massively in OpenStack with the acquisition of SoftLayer,
- RedHat is betting its future on OpenStack to penetrate the cloud enterprise market and is investing massively,
- NASA uses OpenStack and was one of the first supporters of the platform,
- There are more OpenStack deployments in private companies than of any other platform.
6- Orchestration tools:
- Ansible: for small modifications on a multitude of hosts. It’s like running a command in Linux versus running a script for a more complex task on a single server; you use Ansible for small tasks on multiple servers.
- Chef / Puppet / Salt: mostly a matter of taste; they are all equivalent, each with its strengths and weaknesses. I’ve seen more Puppet than either of the other two.
- Git: whatever you do, you will need to master Git to keep track of your changes in scripts, configurations, deployment scripts, encrypted password lists or any type of file whose modifications you need to track. You can even deploy configuration on remote servers via Git. *Maybe not best practice, but it just shows how powerful Git alone is.*
- Jenkins: personally, I never had to use it myself, but I’m not a power user of Docker and it might be handy in more complex DevOps environments, like at Facebook for example. I believe knowing it is just one more arrow in your quiver.
- By the way, developers are often very bad at sysadmin operations… Don’t think like a developer with a big head who knows everything; an operations environment isn’t a development lab, so continuous integration has no place directly in a production environment. In every company that wants to push the DevOps philosophy, we keep repeating that developers can be dangerous. 😉
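To make the Ansible bullet above concrete, here is a minimal playbook; the host group `webservers`, the package name and the service name are illustrative assumptions, not something specific to any real inventory:

```yaml
# Hypothetical playbook: make sure ntp is installed and running
# on every host in the "webservers" group.
- hosts: webservers
  become: yes
  tasks:
    - name: Install ntp
      package:
        name: ntp
        state: present
    - name: Ensure ntp is running and enabled at boot
      service:
        name: ntp
        state: started
        enabled: yes
```

For the really small tasks the bullet describes, you don’t even need a playbook; ad-hoc mode is enough, e.g. `ansible webservers -a "uptime"` or `ansible all -m ping`.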
7- Continuous delivery
8- Database platform:
SQL Database:
Usually scales vertically (in CPU and RAM in a single machine). It uses columns and rows stored inside a table, and tables are stored in a database. You cannot grow a table (the smallest object container in a database) bigger than the available local resources (CPU, RAM, disk), and your performance is limited by the bottleneck of the slowest electronic component in the server, usually disk I/O.
- MySQL is declining, largely because of Oracle’s business decisions.
- MariaDB is the new open-source fork of MySQL.
NoSQL Database:
Usually scales horizontally (on multiple servers). The data is logically stored across many servers forming a data cluster, where the load can be distributed and new nodes added as needed to cope with performance requirements. Unlike a table container, which you cannot split when it grows out of available space, NoSQL allows you to split the equivalent of a table into sub-tables and redistribute them across different servers. This is why it scales horizontally instead of vertically.
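The split-and-redistribute idea above can be sketched in a few lines. This is a deliberately minimal illustration of hash-based sharding; real NoSQL systems layer consistent hashing, replication and rebalancing on top of this basic routing step:

```python
# Minimal sketch: route each row key to one of N shard servers.
import hashlib

def shard_for(key, num_shards):
    """Deterministically map a row key to a shard number in [0, num_shards)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Growing the "table" no longer means a bigger single machine: you add
# shards, and the same logical table is split across more servers.
```

The same key always lands on the same shard, which is what lets every node answer reads and writes for its own slice of the data independently.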
DBAs are the experts on databases. Their role is to know everything about performance, database tuning, database architecture and so on. As sysadmins, we are usually only responsible for provisioning new nodes, or physical resources such as disk space. For a NoSQL cluster, we are responsible for building the cluster and all its required components, but we usually don’t touch anything inside the database application. Our role stops at: is the service running or not, and if not, why, and can I fix it? We may also advise on performance distribution between nodes at the infrastructure level.
9- Monitoring:
- Nagios is an old technology from the 1990s. The founder of the project ended up in conflict with the open-source community because he never let them touch the core of Nagios to fix some basic design and architecture flaws that undermined the product. Unfortunately, Nagios doesn’t scale, has many bugs and is very insecure. A design flaw means you need to restart the service every time you make a config change. There is no reusability of configuration blocks: each host has to repeat the whole config even if the objects being monitored have identical roles (like the web servers in a web cluster). Nagios has seen its number of code releases decline drastically over the past five years in the open-source community. The company is now below 20 employees, and its outdated strategy of licensing the product isn’t going well either. Experts predict Nagios is in its final decline.
- Icinga – was born when the Nagios community left and started their own project. Icinga 1 is a rewrite of Nagios in a better language, with the core design and architecture flaws fixed. It is 100% compatible with Nagios configuration, and you can reuse the configuration from your Nagios right out of the box. Even better, there is also a project that provides a package that installs and integrates many products together, like Nagios, Icinga 1, PNP4Nagios and so on; you just need to switch a flag in a configuration to start either Nagios or Icinga from the same configuration files.
- Icinga2 – the Icinga team didn’t stop there: they decided to build a completely new platform. On this new platform, configurations are reusable, more like code; you can use the configuration language like a programming language, for example with a for loop to iterate over items or generate configuration without typing 1000 lines, and you can template a config and include it in another configuration, and so on. I don’t have enough space here to describe the product in all its glory, but if you trust me, this is the way to go to monitor any big environment. A good reason to choose Icinga2 over Nagios or Icinga 1 is that the platform can scale horizontally. Another big plus: Netways.de, the company behind it, was founded by former senior Nagios developers, and they don’t ask for a license fee for commercial support; they only bill their time for consulting services, training and so on. I had the chance to talk to senior employees of the company and I got a very good impression. Netways.de also funds many seminars to promote open-source projects all over Europe.
- Sensu – this is more a framework than a product: you need to code a frontend and build your own logic on top of it. It basically just provides a way to receive signals on the network, put them in a queue, and have a feeder that unpiles the queue and does something with each item. It provides the basic functionality (logic) of a monitoring system. It is very powerful because you can make it into whatever you want with some coding time and effort. It is designed to monitor very dynamic cloud environments, with servers that pop up and disappear all the time. It is highly scalable, as its backend components, such as RabbitMQ and Redis, are themselves very scalable. It scales horizontally too!
- All the other products are mostly clones of Nagios, or crap on steroids (so just crappier). Most big enterprise-level monitoring systems are so heavy and inflexible that they are unusable in a cloud environment; they are only good for providing managers with nice-looking graphs and reports.
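The signal → queue → feeder flow that Sensu provides, as described above, can be sketched in a few lines. This is my own in-process illustration of the pattern, not Sensu code: a `queue.Queue` stands in for RabbitMQ, and the handler just collects events instead of alerting.

```python
# Minimal sketch of a monitoring pipeline: checks arrive, get queued,
# and a feeder drains the queue and hands each event to a handler.
import queue

events = queue.Queue()

def receive(event):
    """A check result arrives from the network and is queued."""
    events.put(event)

def drain(handler):
    """The feeder unpiles the queue, passing each event to the handler."""
    handled = []
    while not events.empty():
        handled.append(handler(events.get()))
    return handled

receive({"host": "web1", "check": "http", "status": 2})
receive({"host": "web2", "check": "http", "status": 0})
results = drain(lambda e: e)  # here the "handler" just returns the event
```

Because producers and consumers only meet at the queue, you can add more receivers or more feeders independently, which is exactly why this design scales horizontally.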
10- Log Management and Analysis:
- Logstash and Lumberjack
- SIEM / ArcSight (more for security folks)
11- Graphing:
- Graphite: a very lightweight and fast service that listens on a port and can plot a dot on a graph in real time from data coming in on the TCP port. You can do `telnet graphitehost 2003` and type `graph1 1 [timestamp]`, and it will create a dot with value 1 at that timestamp; adding more dots creates a graph. So your monitoring agent doesn’t need to do anything fancy to interconnect with your graphing tool, other than sending a packet on a TCP port using telnet or netcat.
- Cacti: kind of obsolete, but still widely used because it works and has been there for a long time. Slow, and doesn’t scale well.
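The Graphite plaintext protocol described above is simple enough to show in full. The line format (`metric value timestamp`) is the real wire format; the host name `graphitehost` is a placeholder, and the socket send is included only to show how little glue code is needed:

```python
# Sketch of Graphite's plaintext protocol: one line per data point.
import socket
import time

def graphite_line(metric, value, timestamp=None):
    """Build one 'metric value timestamp' line - the entire wire format."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (metric, value, timestamp)

def send_metric(metric, value, host="graphitehost", port=2003):
    # Equivalent to: echo "graph1 1 $(date +%s)" | nc graphitehost 2003
    with socket.create_connection((host, port)) as sock:
        sock.sendall(graphite_line(metric, value).encode("ascii"))
```

This is why any agent, however dumb, can feed Graphite: formatting one text line and opening a TCP connection is the whole integration.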
12- Frontends for graphing with Logstash and Graphite
- Kibana: a nice frontend to Elasticsearch (where Logstash typically stores its output), so it acts as the frontend for analysing your logs from Logstash.
- Grafana: a nice, highly customizable dashboard that you can interface natively with Logstash and Graphite to extend their features. It can display graphs, but also any other type of information, to build a nice admin dashboard.
13- Scripting and Programming Languages:
- Bash – very useful for automating things at the OS level; the advantage is that you run native OS commands in a script. Not POSIX compliant.
- Korn Shell – more portable, and exists on platforms other than Linux, like AIX, Solaris and so on. POSIX compliant.
- Python – probably the favorite language right now, used mostly to extend platforms or do complex operations that require a more object-oriented language.
- Ruby – Puppet, Chef, Vagrant and other tools are written in Ruby, so being familiar with it will make your life easier.
- Perl – not as popular any more, but always useful for some low-level tasks, like parsing text files efficiently. Regular expressions predate Perl, but Perl made them famous, and Perl-compatible regexes became the de facto standard.
- Data modelling – XML, JSON
- APIs – SOAP or other API structures
- Vi, Vim or another flavor – you need to master at least one of them to be comfortable in a console.
- You need to be comfortable with a real code editor that integrates with your framework and Git; this is a developer requirement.
- List of other editors: Atom, Notepad++, UltraEdit. *There are many more.*
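The data-modelling bullet above (XML, JSON) is worth a quick illustration, since a very common DevOps task is parsing the JSON an API returns. The server names and fields here are made up for the example:

```python
# Minimal sketch: pull structured data out of a JSON API response.
import json

raw = '{"servers": [{"name": "web1", "state": "running"},' \
      ' {"name": "web2", "state": "stopped"}]}'

data = json.loads(raw)  # parse the JSON text into Python dicts/lists
running = [s["name"] for s in data["servers"] if s["state"] == "running"]
```

Every language in the list above has an equivalent library, which is exactly why JSON became the lingua franca between cloud platforms and automation scripts.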