In my career, I sometimes get to work at companies that made the mistake of deploying Debian at large for their operations.
The funny part is that it always follows a similar pattern. Some developers created an application or product, and at the very beginning the company didn’t have an IT operations team. As such, the developers were in charge of operations and deployed in production what they were familiar with.
This is a natural evolution from a small company to a bigger one, and it is also a very costly mistake.
Why, you might ask?
Debian’s strength is its community of passionate people who are very good at helping newbies learn the system. Also, Debian tends to use more recent libraries, kernels and packages that offer more features for developers to code a solution faster.
On the other hand, being at the latest version of everything makes it a very bad system for operations, where stability, reliability, robustness and maturity are far more important. Also, because there is no enterprise support for Debian, that community lacks an understanding of what IT operations really is. There is also no effort to develop or support tools for large scale environments.
They are clueless about the reality of running a large scale infrastructure. Most of them are developers or manage 4-5 servers in a corner.
LTS or Long Term Support:
In an OS, Long Term Support means that for a given period of time, that version of the OS is supported for your operations. You will receive bug fixes, security patches and updates. This should also cover maintaining packages for the duration of the LTS.
Red Hat, for example, makes sure to have developers working in the open source community, assisting and helping develop applications and packages, to ensure each new version will be supported on its LTS version. This ensures that any software supported by Red Hat is guaranteed to stay active (the community won’t vanish for the new hyped thing) and that you are covered for the duration of the LTS. You will get updates for your OS and for your packages.
Red Hat even made the mistake of adopting kernel version 2.6, which in the end was full of bugs and badly designed. The Linux kernel community quickly canned that version and started a new branch. But Red Hat had already adopted it for Red Hat Enterprise Linux 6 (RHEL 6), with LTS support for 10 years. They had to hire kernel developers to maintain, fix and develop kernel 2.6 and support it until the RHEL 6 End Of Life (EOL).
This is the kind of commitment an enterprise-ready OS needs to match.
With Debian, they understood LTS like this:
- We will keep the ISO available on our repository.
- We will continue to keep our package repository online.
- We will continue to provide security fixes only if the fixes can be applied with the current version of the libraries.
- So if a package is at version 1.8 with a dependency on an older library than the current branch, and the latest version of the package is 1.14 with a security fix that should be backported, they will only fix it if they can backport it directly without effort.
- This also means that the moment the LTS starts (when the newer version becomes the stable branch), the whole open source community abandons the LTS and develops for the new stable branch only. They even remove the LTS version from their supported versions.
The response from the community, if you are running an LTS version, is that you should probably move to the current branch… which goes completely against the purpose of having an LTS. It costs a lot of money and involves a lot of IT risk to upgrade from one major version to another.
I’m currently working with an infrastructure of 5000 Debian 7 (Wheezy) Linux servers for a new client.
Here are some real examples of the challenges I’m facing:
- They have 5000 servers in total, 99% of them on Debian 7 (Wheezy), currently on the LTS.
- They hired me because they have reliability issues in their operations and want to improve their processes to manage that infrastructure.
- They don’t want to invest because this infrastructure will eventually, in 2-3 years, be decommissioned and replaced with new datacenters.
- They use Puppet to orchestrate the configuration management.
- They use SSSD (from Red Hat’s FreeIPA project) to do central authentication against Active Directory (Windows AD).
- They want to keep going with Debian 7 for a few more years.
One of the things we highlighted is that they don’t have any inventory tools.
- They are unable to run reports to know which servers could be impacted by a security flaw affecting a specific version of a package.
- They are unable to get the patching status of their servers.
- They are unable to get an overview of their environment or to see whether servers are not reporting to Puppet correctly.
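To make the first kind of report concrete, here is a minimal sketch of what such a query could look like once per-server package data is collected somewhere. Everything in it is an assumption for illustration: the `inventory` structure, the host names and the `affected_hosts` helper are hypothetical, and the version parsing only handles plain dotted versions, not full Debian version strings.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory: hostname -> {package name: installed version}.
# In practice this data would have to be collected from every server.
Inventory = Dict[str, Dict[str, str]]


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain dotted version such as '1.8.4' into (1, 8, 4).

    Simplification: real Debian versions (epochs, revisions such as
    '1.8.4-2+deb7u1') need the full Debian comparison rules.
    """
    return tuple(int(part) for part in version.split("."))


def affected_hosts(inventory: Inventory, package: str, fixed_in: str) -> List[str]:
    """Return hosts where `package` is installed at a version older than `fixed_in`."""
    fixed = parse_version(fixed_in)
    return sorted(
        host
        for host, packages in inventory.items()
        if package in packages and parse_version(packages[package]) < fixed
    )


# Made-up sample data, for illustration only.
inventory = {
    "web01": {"sssd": "1.8.4", "openssl": "1.0.1"},
    "web02": {"sssd": "1.9.0"},
    "db01": {"openssl": "1.0.1"},
}
print(affected_hosts(inventory, "sssd", "1.9"))  # ['web01']
```

The hard part in a real environment is not this query but collecting and refreshing the inventory for thousands of servers, which is exactly what a dedicated tool is supposed to do.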
With Debian, there are just no tools that were developed for it. Red Hat has Satellite and Spacewalk (the open source version of Satellite), and both are fundamentally incompatible with Debian. There just isn’t any solution that exists for Debian.
Canonical, the company behind Ubuntu, made Landscape, which is an attempt to support enterprises, but it seriously lacks features and capacity compared to Red Hat’s tools. There is no open source version of Landscape, and at a price tag of $150 per server plus $2000 for the Landscape server, it is quite costly for such poor functionality.
We also highlighted a serious bug that affects central authentication
Debian has no tools to do central authentication with Active Directory. It supports Kerberos and LDAP, but I have never seen any big business use those for central authentication globally. Secretaries, VPs, DGs: they all use Windows laptops, which integrate far better with Windows Active Directory. Businesses want to use Active Directory for central authentication with their servers. Corporate policy.
Little problem: the only open source Linux solution for central authentication against AD is SSSD, a component of FreeIPA. (FreeIPA itself is a kind of Linux Active Directory that mimics the functionality of Windows AD.) SSSD is the client-side application that connects to FreeIPA to authenticate, and it supports Active Directory as well.
In Debian and Ubuntu, there is only one person responsible for porting and maintaining the SSSD packages, Timo Aaltonen. He is doing a great job for a one-man effort on a component that critical for enterprises. If this guy gets hit by a bus, Debian loses all support for central authentication against Active Directory.
Another thing: Debian 7 currently ships version 1.8.4 of SSSD, which has a major bug that prevents it from failing over to the secondary servers when the primary is unreachable. If you lose your primary Active Directory server, you are locked out of all your servers. The fix was made in version 1.9, and the latest version is 1.15. But Timo is unable to port the package to Debian 7 because of a library dependency version. That library cannot be ported because of a complex set of dependencies that also affect core OS libraries, and changing them would break fundamental components.
Every time the primary AD server reboots for OS patches, we are locked out of 5000 servers. There is no fix that we can apply for that. *Not even a simple round robin at the DNS level or with a load balancer, because of Windows security mechanisms that rely on certificates, hostnames, IP hashes and so on*.
There are no good tools to maintain remote repositories
When you manage a secure environment, you are forbidden to give your servers direct access to the Internet. As a result, you need to build local repository mirrors, and only those mirrors are allowed to connect to a specific upstream mirror to sync nightly.
The tools available to mirror a repository are primitive and consist mostly of a Perl script in which you need to hard-code the configuration you want, and even so you are not sure whether it will in the end respect the directory layout required by the APT package manager. This is just hackish and not elegant.
A system administrator doesn’t have the time to reinvent the wheel. It should be as simple as putting a tool in a cron job and telling it which mirror to sync with.
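As an illustration of the layout concern, here is a minimal sketch of a sanity check you could run after a nightly sync. It assumes the usual APT repository layout, with a `pool/` directory and a `dists/<suite>/Release` (or `InRelease`) file. The `missing_apt_layout` helper and the temporary directory used in the demo are hypothetical, and the check only looks at the skeleton of the mirror, not at signatures or package indexes.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_apt_layout(mirror_root: Path, suite: str) -> List[str]:
    """Return the expected layout entries that are absent under mirror_root."""
    dist = mirror_root / "dists" / suite
    missing = []
    if not (mirror_root / "pool").is_dir():
        missing.append("pool/")
    if not dist.is_dir():
        missing.append(f"dists/{suite}/")
    # APT reads a Release file, or its inline-signed variant InRelease.
    if not ((dist / "Release").is_file() or (dist / "InRelease").is_file()):
        missing.append(f"dists/{suite}/Release")
    return missing


# Demo on a throwaway directory standing in for a real mirror.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = missing_apt_layout(root, "wheezy")
    (root / "pool").mkdir()
    (root / "dists" / "wheezy").mkdir(parents=True)
    (root / "dists" / "wheezy" / "Release").write_text("placeholder\n")
    after = missing_apt_layout(root, "wheezy")

print(before)  # ['pool/', 'dists/wheezy/', 'dists/wheezy/Release']
print(after)   # []
```

A check like this would not replace a proper mirroring tool, but it could at least stop a broken sync from being served to the fleet.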
It also adds operational risk, since there is no guarantee that a modification or a configuration change will work without introducing issues. When you have 5000 servers to patch monthly, each day is scheduled for 100 servers. If you introduce an error and detect it 5 days later, you have impacted 500 critical servers hosting critical applications.
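The arithmetic behind that figure is simple, and a tiny sketch makes it easy to play with other detection delays. The function name is hypothetical; the numbers are the ones from the paragraph above.

```python
def servers_impacted(servers_per_day: int, days_until_detected: int) -> int:
    """Servers already patched with a faulty change by the time it is noticed."""
    return servers_per_day * days_until_detected


fleet_size = 5000
impacted = servers_impacted(servers_per_day=100, days_until_detected=5)
print(impacted)                        # 500
print(f"{impacted / fleet_size:.0%}")  # 10%
```

In other words, a mistake that goes unnoticed for one working week has already reached a tenth of the fleet.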
Upgrading from Debian 7 to Debian 8
Debian 8 has such major differences compared to Debian 7 that our Puppet 3.8 is no longer supported on Debian 8. The Debian community decided that with Debian 8, Puppet 4 is the only available option.
Between Puppet 3 and Puppet 4, there are major differences, and many of them are simply incompatible with Puppet 3. So, if we were to eventually upgrade to Debian 8, we would have to rebuild our whole Puppet configuration architecture. For this client, this is a huge task and a 1 year project on its own. And that is if we don’t find other dependencies with other tools we use.
After that, you have 5000 servers to upgrade to Debian 8 while ensuring no downtime at all for applications and customers. How many years will that project take?
Overall, the flaws of Debian in a large scale infrastructure are:
- A lack of tools to manage your infrastructure.
- A lack of understanding of IT operations challenges, and a community that doesn’t know anything about them.
- A fake LTS that doesn’t fulfill the purpose of an LTS.
- A lack of an open source community to support the LTS version.
- A lack of an enterprise with dedicated resources to ensure the continuity of the OS. *No company backs Debian financially or with staffing to ensure the continuity of this project*. This is a big liability for businesses.
- A lack of enterprise-level support for emergencies.