In this article, I will focus on one factor that has been the biggest business disruption of the past 50 years and that still drives most of the recent major evolutions in IT:
- Infrastructure: Virtualisation, Cloud, Automation, Telecom
- Methodology: Waterfall, Agile
What is the difference between IT and other Engineering fields?
In IT, we depend on systems made of many smaller components working together to provide a solution, and we use these components at the extreme limit of their physical tolerance.
Ex.: CPUs run near their physical temperature limits.
A system is deemed more complex as you add more parts to it. Computers are extraordinarily complex systems, and our IT infrastructure depends on many smaller complex systems, adding even more complexity.
We call Entropy the mathematical relation that defines a system's potential for outage in relation to the complexity of its parts.
As such, IT uses complex systems and evolves in a very complex business environment that is highly dynamic and unpredictable. Even if we use empirical methodology to build better systems, the unpredictability of the environment makes it impossible to build a perfect system.
Considering the level of Entropy in IT, the fact that we can achieve a 99.9% SLA is the result of dedicated work to build our systems with a level of resilience, at every layer, never seen in any other field.
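The arithmetic behind these availability figures can be sketched quickly (a minimal illustration with round numbers assumed for the example, not figures from a real system):

```python
# Illustrative availability math: a 99.9% SLA leaves a small yearly
# downtime budget, chaining components multiplies their availabilities
# (more parts, more Entropy), and redundancy claws availability back.

HOURS_PER_YEAR = 365.25 * 24

def allowed_downtime_hours(sla: float) -> float:
    """Downtime budget per year for a given availability target."""
    return HOURS_PER_YEAR * (1 - sla)

def serial_availability(component_sla: float, n_components: int) -> float:
    """Availability of n components that must all work (no redundancy)."""
    return component_sla ** n_components

def redundant_availability(component_sla: float, n_replicas: int) -> float:
    """Availability of n redundant replicas (any one of them suffices)."""
    return 1 - (1 - component_sla) ** n_replicas

print(round(allowed_downtime_hours(0.999), 2))    # ~8.77 hours per year
print(round(serial_availability(0.999, 30), 4))   # 30 parts at 99.9% -> ~0.9704
print(round(redundant_availability(0.999, 2), 6)) # 2 replicas -> 0.999999
```

The second line is the Entropy effect in miniature: thirty individually excellent parts, chained, already miss the 99.9% target, which is why resilience has to be designed in at every layer.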
Adding resilience to a system also means adding components, rendering the system even more complex. There is a limit beyond which adding resilience has no more positive impact: we become so resilient that we reach the bottleneck of resilience.
Of course, non-technical people are unaware of this reality and don't understand why computers are not 100% reliable, and they complain as soon as an application or service doesn't work for them.
Other Engineering fields:
In other fields of Engineering, the environment is controllable and relies on very specific metrics for each situation, metrics that are limited and not dynamic. In this context, empirical methodology works well, because from one project to another the variables are very similar and the knowledge from previous projects is applicable to new ones.
Ex.: when you build a bridge, the type of soil, the weather, and the intended usage of the bridge are well known, and all the required metrics are known as well. Unless the soil is of a type never seen before, the metrics should be reusable.
The level of Entropy is quite low in comparison to IT.
Business Management's perception of IT
Because of the Entropy issue, business managers perceive IT as unreliable. And on top of being unreliable, the cost of IT keeps rising year after year.
Why do business managers perceive IT as unreliable?
Business Managers evaluate metrics and plan according to solid metrics, like the known metrics in our “other Engineering fields” example. It is easy to keep tight control over metrics in a controlled environment: you can plan based on them and feel in control.
With IT, there is no metric that will warn you when a system will fail. The only things you can do are prevent the system from failing (regular maintenance) or prevent an outage from impacting the service (resilient design), both of which are beyond the Business Manager's understanding.
He knows that he depends heavily on IT to run a better-performing, more efficient, and more competitive business. But he also knows that he has no control over it and that his business relies on a system that can break at any time without warning. And to prevent the system from breaking, the cost of IT keeps increasing year after year.
Considering that a good Business Manager is an obsessive person who monitors and tracks every small metric to detect any variation and adapt quickly to any situation, IT is probably their worst nightmare.
Impact of the 1970s oil crisis on businesses until today
In the 1970s, the price of oil jumped drastically, putting a lot of pressure on economies that depended on it. The oil crisis caused huge inflation in oil-dependent countries. To survive the rising cost of living, employee salaries had to be raised as well, and the cost of labour became less competitive in Western countries (USA, Canada, European countries).
To adapt to this new reality, businesses had to drastically cut waste and become more efficient, particularly companies evolving in an oil-dependent economy while competing against companies in economies that did not depend on oil.
The impact on Japan
Japan back then didn't rely heavily on oil, and most of the work was done manually by humans. Japanese culture places great importance on honour: the quality of one's work matters deeply, and workers feel accountable if they fail to reach a high (perfect) level of quality. Just after the war, the Japanese had to rebuild their country and economy; it was like a blank sheet on which to draw a new economic model, and Japan became a new economic partner of the USA. A young American Ph.D. researcher in economics took the opportunity to move to Japan and put in place an ideal production system based on his research: many interdependent businesses located in proximity to one another. It was a huge success!
- In the car industry, one company builds the glass for the windshield, another provides another component, and so on. Each supplier supplies the next manufacturer in the chain, ending, for example, with the assembly of the car. Proximity reduced the energy required for transportation and provided the flexibility to order only what was needed for immediate production, without storing inventory. If you need a windshield, the windshield company next door will make you one within the hour. If each supplier in the chain follows this logic, you save a lot of money on warehouses, transportation, and energy.
- Also, because Japanese culture is based on honour, a supplier is paid on honour, only once the car has been sold. Everyone in the chain works unpaid until the product is sold.
- The action of calling a supplier to provide a component you need is called a Kanban (“signboard” in Japanese). Like a domino effect, the Kanban propagates all the way up to the first supplier in the chain. This is the basis of Just In Time (JIT).
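The domino effect described above can be sketched as a toy program (the station names and chain are assumed for illustration, not a real supply network):

```python
# A toy sketch of the Kanban "domino": one customer order triggers the
# assembler, which triggers each supplier upstream, so nothing in the
# chain is produced without downstream demand.

supply_chain = {
    "car assembler": ["windshield maker", "door maker"],
    "windshield maker": ["glass maker"],
    "door maker": ["steel maker"],
    "glass maker": [],
    "steel maker": [],
}

def kanban(station: str, quantity: int, chain: dict) -> list:
    """Propagate a pull signal upstream; return every order it triggers."""
    orders = [(station, quantity)]
    for supplier in chain[station]:
        orders += kanban(supplier, quantity, chain)
    return orders

for station, qty in kanban("car assembler", 1, supply_chain):
    print(f"{station}: produce {qty}")
```

The recursion is the point: each station only knows its direct suppliers, yet a single order reaches the very first producer in the chain.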
It was also common for the whole chain of operations to stop when an employee detected a mistake. It would be a shame if the error were only detected much further along the chain, meaning many more people had missed it. Corrections and revisions of the processes were taken care of immediately, improving the manufacturing processes and the overall quality of the products.
Employees who detected a problem were even rewarded!
With their honour culture and their high sense of work quality, the Japanese developed a new industrial methodology, process flow, and supplier relationship that led to a much more efficient manufacturing model, costing much less and resulting in better-quality products.
The impact on the Americans
The USA depended heavily on oil for its industry because interdependent companies were spread all around the country. In the American economic model, each State specialised in certain types of manufacturing based on proximity to the related primary resources:
- California for plastic,
- Ohio for glass, and so on.
Say a car manufacturer located in Chicago, Illinois, did business with a windshield manufacturer from Ohio and a plastics company from California. The transportation requirements made American companies rely heavily on oil.
With the cost of oil jumping and impacting the cost of production in every industry, companies became less competitive internationally.
When the first “Made in Japan” cars arrived on the US market, their prices were so aggressive that “Made in USA” cars were completely outmatched.
A “Made in Japan” car, including shipping and import-regulation costs, was sold for half the production cost of a “Made in USA” car.
US car manufacturers were dazzled and couldn't explain how a Japanese manufacturer could produce a car at a price difference bigger than the difference in inflation. It was astonishing and unbelievable. So Ford, GM, and Jeep sent people to Japan to understand how this was possible.
They were amazed by how production was organised. US producers were losing about 40% of their production as defects, compared to less than 1% for the Japanese (later formalised as the “Six Sigma” level of about 3.4 defects per million). They realised they had to change drastically or they would disappear in no time. These companies were huge and complex, and this type of change is major, but without improving their production efficiency they could not survive the new competition.
What did the USA do wrong?
- They had suppliers located far away.
- They had complex process flows to manage with their remote suppliers. (no honour requires more validation)
- A client would have to assume responsibility for a supplier's mistake on a shipment order. (no honour)
- Suppliers would often cheat to make more margin. (no honour)
- Because suppliers were often unreliable, clients ordered more than their immediate need, just in case. (no honour)
- If an employee stopped the operations chain for anything short of a life-threatening situation, he would surely get fired.
- Employees were not encouraged to produce quality; they were only encouraged to work fast and produce fast, leading to lower-quality products.
- Quality Assurance was done at the very last step of the process flow. Thus, a defective car would be scrapped if it took more than a few hours to fix; disassembling a car takes considerable time, and time is money.
- They would end up with more than 40% defects per day, negatively impacting production costs.
- They produced cars faster than they could sell them and had to store the cars in huge parking lots while waiting for buyers. This passive inventory was very costly: you invest money to create the goods (production cost) and you must pay to store them until they sell, delaying your Return On Investment (ROI). This is the “Push methodology“.
The essential points are:
- In an aggressive capitalist system without honour, people cheat and try to abuse one another, and this requires more complex systems to detect bad intentions. Most of the time, the complexity of this validation costs more than the cheating itself, and it was largely accepted as an accounting loss.
- Doing Quality Assurance only at the very last step is very costly, because fixing an issue there requires more steps and is more complex than if it had been detected earlier. There is a direct link between the cost of fixing a problem and how early it is discovered: the later a problem is found, the costlier it is to fix.
- Focusing on manufacturing speed cuts significantly into product quality. Customer care and support are costly, and a less reliable car costs the manufacturer more to maintain. *It could also be argued that the parts market was a good source of income for them.*
- They produced more than they could sell, applying a “Push methodology“.
- These practices led to a very poor Return On Investment (ROI) and high production costs.
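The "later is costlier" point can be made concrete with a tiny cost model (the stage names and the order-of-magnitude multipliers are assumed purely for illustration, not measured figures):

```python
# Purely illustrative numbers: each stage a defect travels undetected
# multiplies the cost of fixing it by roughly an order of magnitude.

STAGE_COST = {"design": 1, "assembly": 10, "final QA": 100, "customer": 1000}

def fix_cost(defects: int, stage: str) -> int:
    """Cost of fixing a batch of defects discovered at a given stage."""
    return defects * STAGE_COST[stage]

print(fix_cost(5, "assembly"))  # 50: caught on the line, cheap
print(fix_cost(5, "final QA"))  # 500: the finished car must be reopened
```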
What did Japan do right?
- Suppliers were located in proximity to one another.
- They worked together, as their reason for being was to supply the next step in the chain.
- They segmented manufacturing into many smaller pieces, each supplier being very specialised in producing one product that would become a part of the final product.
Ex.: one supplier would provide the doors of the car.
- Their relationship with suppliers was based on honour: the delivery would be complete, defect-free, and on time, or it meant hara-kiri.
- They did Quality Assurance at every step of the process; a defect detected at the previous step took only minutes to fix.
- A continuously improving process flow and a focus on quality over quantity drastically lowered the number of defective products at the end of the manufacturing process. This alone brought production costs down by about 40% compared to US production costs.
- Being paid only once the final product had been sold to the end customer, they all shared the risk equally and honourably.
- They produced only what they could sell, so every investment made to create a new product was cashed in quickly, bringing a very fast Return On Investment (ROI) with no storage cost. “Just In Time (JIT)“
The essential points are:
- With trust and honour, they don't need complex and costly systems or processes to deal with their suppliers.
- As a culture of honour and perfectionism where quality matters, they intuitively put an early Quality Check at every step of the production line, which lowers the cost of fixing a defect once found. The result is a very low number of defects in the final product: about 3.4 per million, the “Six Sigma” level.
- Because they continuously improve their processes, they keep improving their efficiency, lowering production costs.
- Because everyone gets paid only once the product is sold, they intuitively put in place “Just In Time (JIT)” production, focused only on producing what will be sold tomorrow. As such, the required investment is minimal and the time to earn it back is very short.
Just In Time (JIT) is part of what we call the Pull methodology, which is the opposite of the Push methodology.
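The famous Six Sigma defect figure is not arbitrary: by convention, the specification limit sits six standard deviations from the mean, and a 1.5-sigma long-term drift is allowed, leaving a 4.5-sigma margin. A few lines of Python reproduce the number:

```python
from statistics import NormalDist

# The "Six Sigma" defect rate: 6 sigma to the spec limit, minus the
# conventional 1.5 sigma long-term drift, gives a 4.5 sigma margin,
# i.e. roughly 3.4 defects per million opportunities.

def defects_per_million(sigma_level: float, shift: float = 1.5) -> float:
    """One-sided defect rate, per million, at a given sigma level."""
    return (1 - NormalDist().cdf(sigma_level - shift)) * 1_000_000

print(round(defects_per_million(6.0), 1))  # 3.4
```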
How did the Americans adapt?
- To survive, the Americans adopted the “Just In Time” or Pull methodology.
- They made extensive use of IT systems to automate complex processes.
Ex.: to manage their suppliers more closely and make sure they respected their commitments.
These were the first Supply Chain Management (SCM) systems.
- They created what would become the Six Sigma methodology.
- They put in place control systems and contracts with remote suppliers that punished mistakes.
Ex.: if a supplier doesn't respect “Six Sigma” quality, after 1 mistake the supplier is put on the bench for 3 months and another supplier is used instead. After 3 mistakes, they are permanently banned.
- They automated as many processes as they could to lower their labour costs.
What is the relation between the oil crisis and IT?
The inflation caused by the oil crisis:
Before the inflation caused by the oil crisis, a teacher earned a good salary of $2,000 a year and owned a big house that cost around $15,000–$20,000.
Since the inflation caused by the oil crisis multiplied prices roughly 30x, that $2,000 would be equivalent to $60,000 now.
Unfortunately, the cost of living outpaced salaries, meaning that $60,000 now has less buying power than that $2,000 had back then. (but this is another story)
IT before 1970
IT before 1970 was mostly used to automate operations related to finance, and the cost of computer equipment was colossal.
- Back then, IBM developed the Mainframe, which cost around $800,000 (about $12,000,000 nowadays).
- Programming was a low-wage job, at around $500 per year!
- 300 employees × $500 per year = $150,000 per year.
For a company, every minute the Mainframe sat idle was costly, while programmers were cheap labour.
Banks would hire hundreds of employees, trained on the spot as programmers, to create simple code to be executed on the Mainframe. Even if the code was inefficient or of poor quality, redoing it was still cheaper than leaving the Mainframe idle.
Because coding was cheap, no effort was made to make it more efficient.
- Code was created completely from scratch for each task; no reuse of code across similar contexts.
- Code wasn't efficient or of good quality.
- Many people worked on the same code, for resilience in case a bug occurred in one employee's code.
- There was no need for methodologies or practices to improve code quality.
IT after 1970
Suddenly, in less than 5 years, formerly low-wage programmers cost a fortune, with salaries of up to $15,000 per year.
- 300 employees × $15,000 = $4,500,000 per year.
- The Mainframe was still purchased for $800,000.
The price of producing code was now 30 times higher. Businesses started lowering their number of programmers to cut costs, and found benefit in getting better-quality code from more experienced, expert employees earning slightly more than before.
Better code quality is not enough: a bug is now expensive to fix, but creating code from scratch for every task is even more expensive and adds the risk of introducing new bugs.
A need emerged for reusable code that “adds value“. Also, in the 70s, coding requirements were quite simple, mostly mathematical calculations thrown at the Mainframe. As we all know nowadays, systems and applications keep getting more complex.
Evolution toward more complex IT systems
As businesses evolved and competition among companies kept rising, as in the car industry, a need for more complex systems emerged.
These systems required code of a new level of complexity.
American businesses needed to automate complex processes. For example, they needed to start managing their suppliers more closely, which created the need for Supply Chain Management (SCM) applications.
To become more efficient, a business needs to automate complex processes, but it also needs to make better decisions based on solid metrics. To help businesses make better decisions and gain better control over the metrics influencing them, we developed systems to assist managers in decision-making:
- Decision Support Systems (DSS),
- Management Information Systems (MIS),
- Executive Information Systems (EIS)
- Enterprise Resource Planning (ERP).
Software is a system and, as such, follows the rules of Entropy. With the emergence of more complex software solutions, it became important to develop new techniques to produce better-quality code. Engineering uses simple principles to create an empirical set of practices, rules, methodologies, and knowledge that improve the field of expertise and allow us to build better systems.
Since 1970, programmers haven't been writing more lines of code; statistics show the number of lines is about the same. So why are programmers in 2016 more efficient than those of 1970? They don't waste their time coding low-level technical details. They create code that has more value, by reusing existing code that takes care of the low-level technicalities and by building features instead. They create added-value code!
The first step in developing more complex IT software was to start reusing good code, and thus to find a way to build reusable code by respecting basic rules such as Abstraction and Independence.
Abstraction / Independence
Building code designed to perform one specific task allows that code to be reused every time the task is needed. But to be able to reuse this code, you need a way to interact with it from the upper level. This principle of independence, together with a method to interact with the code, is called Abstraction.
With abstraction, we can develop very complex applications without having to know every micro-level technical detail of the implementation.
For example, when you write data to a file, you don't need to know how the bytes are written onto the magnetic platters of the hard drive. This step alone would be tremendously complex without existing code relieving programmers from understanding all these micro details. Imagine the complexity when new firmware is released, or when new hard-drive models with different hardware configurations come out.
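A minimal sketch of this principle (the `Storage` interface and its implementations are invented here for illustration): callers depend on a small abstraction, never on how the bytes actually land anywhere.

```python
# Abstraction/independence in miniature: code is written against an
# interface, so any implementation can be swapped in without changes.

from abc import ABC, abstractmethod

class Storage(ABC):
    """The abstraction: all that callers are allowed to know."""
    @abstractmethod
    def save(self, name: str, data: bytes) -> None: ...

class DiskStorage(Storage):
    def save(self, name: str, data: bytes) -> None:
        # open() itself hides the filesystem, driver and firmware layers.
        with open(name, "wb") as f:
            f.write(data)

class MemoryStorage(Storage):
    """An independent implementation, e.g. for tests."""
    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}
    def save(self, name: str, data: bytes) -> None:
        self.files[name] = data

def export_report(store: Storage) -> None:
    # Reusable with any Storage implementation, present or future.
    store.save("report.txt", b"quarterly numbers")

mem = MemoryStorage()
export_report(mem)
print(mem.files["report.txt"])  # b'quarterly numbers'
```

`export_report` never changes when a new storage backend appears, just as application code never changes when a new hard-drive firmware is released.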
Because more complex systems create more Entropy, smaller, very specialised applications are easier to develop, easier to maintain, and easier to quality-control. By ensuring the high quality of the smaller parts, we can build more complex systems on top of them.
Another good example of abstraction is the OSI network model, on which all our telecommunications depend. Each layer of the OSI model is independent of the others: a switch operating at Layer 2 doesn't need to know about Layers 3–7, nor how Layer 1 is implemented. This allows any type of physical medium to be supported without impacting the communication above it.
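Layer independence can be sketched as a toy program (simplified layer names, not the full OSI stack): each layer only wraps its own header around an opaque payload, so a Layer 2 device never needs to look above its own header.

```python
# A toy sketch of OSI-style encapsulation: headers nest outside-in,
# and each layer treats everything above it as an opaque payload.

LAYERS = ["Ethernet", "IP", "TCP", "HTTP"]  # roughly Layer 2 up to Layer 7

def encapsulate(payload: str) -> str:
    """Wrap the payload in one header per layer, top layer first."""
    frame = payload
    for layer in reversed(LAYERS):  # HTTP innermost, Ethernet outermost
        frame = f"[{layer}]{frame}"
    return frame

def switch_payload(frame: str) -> str:
    """A Layer 2 switch reads only the outer header; the rest is opaque."""
    header, _, upper = frame.partition("]")
    assert header == "[Ethernet"
    return upper

wire = encapsulate("GET /")
print(wire)                  # [Ethernet][IP][TCP][HTTP]GET /
print(switch_payload(wire))  # [IP][TCP][HTTP]GET /
```

Swapping the physical medium changes only the outermost wrapping; everything above it is untouched, which is exactly the independence the article describes.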
All IT infrastructure uses these engineering principles.
At the beginning of this new era of IT, concepts such as Architecture and Design didn't exist. A programmer would receive a feature request and try to implement it. The lack of planning, standards, proper analysis, and documentation was a major roadblock for any big project; such projects were costly and complex.
It wasn't until the development of the IBM System/360 mainframe, which ended up one of the biggest commercial successes of all time, that we started seeing proper planning, documentation, and notions such as Architecture and Design.
The developers of the System/360 took the initiative to do the opposite of what everyone else was doing: instead of developing features right away and writing the documentation at the very end, they decided to write the documentation first and code at the very end. The direct impact of this principle was that the code implemented what the documentation specified, instead of drifting into unplanned features.
The work on the System/360 triggered the new fields of expertise of that period, Software Architect and Software Designer, along with proper documentation and the Waterfall project management model.
It is no accident that I used the car industry as a parallel to explain the Push and Pull principles. There is an important parallel to draw between industries not directly linked to IT and how IT evolves and learns from their processes. For a long time, we used the Waterfall model to lead our projects in IT.
The waterfall model
It consists of creating milestones with dependencies, where each dependency must be completed before reaching the next milestone, forming a waterfall-shaped schedule.
- One aspect of the Waterfall model is that you cannot do Quality Assurance before having a feature or a product, so Quality Assurance is done at the very end of the project.
- Also, you need to invest a lot of time before you have a working feature to sell, meaning you must “store“ your code until a product is ready to sell.
In comparison to the car industry:
- In the Push model, a car is assembled completely before Quality Assurance, which results in costly fixes or losses. In the Waterfall model, you cannot do Quality Assurance before a feature or product is completed.
- In the Push model, they produce more cars than they can sell and store them, delaying the Return On Investment (ROI); you might even be unable to sell your stock. In the Waterfall model, you need to invest a lot of time coding the whole feature before you get any return on investment, and there is a risk that you will never complete the project because it takes too long and the budget won't allow it.
Waterfall is an application of the Push model. In IT, the longer it takes to deliver a project, the higher the chance the project will never be delivered.
This can be caused by many factors, such as:
- Budget overspending
- Technological evolution that makes the initial project obsolete
- Loss of expertise inside the team
- Accumulation of bad-quality code and loss of scope
- A new business plan or strategy, or other external business pressures
The Agile model
It consists of setting short-term scopes or objectives to deliver features as soon as possible. The way to achieve this is to break the project into much smaller pieces: a feature is decomposed into smaller, more specialised features. The goal is to deliver one micro-feature per week and to do Quality Assurance daily (or continuously, with *nightly builds*). A programmer can only merge their code into the project if it is operationally bug-free.
- By releasing many small micro-features, the project gains value even before it is completed. Some features can even have monetary value without a final product.
- Quality Assurance is done continuously, and any bug detected is taken care of very early in the process, maximising efficiency and minimising the cost of fixing the bug.
- Investors can see the result of their investment and sell a partially functional product while development continues in parallel. (We see this often in the gaming industry, where a game is released early and fixes, patches, and extended features come later as updates.)
- It is easy to adapt the scope of the project to last-minute changes; time is invested and paid for in a very granular way.
Agile is an application of the Pull principle to project management, but it covers a larger scope and is not perfect for every type of project. It is quite likely that new applications of the Pull principle will be created in the future.
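The Push-vs-Pull return-on-investment argument can be illustrated with a toy cash-flow model (all numbers assumed purely for illustration):

```python
# Illustrative: a 10-month project costing 10 per month that yields a
# total value of 150. Waterfall delivers all the value at the end;
# Agile ships a slice of it every month, so the cumulative balance
# turns positive long before the project is finished.

MONTHS, MONTHLY_COST, TOTAL_VALUE = 10, 10, 150

def waterfall_balance(month: int) -> int:
    """Cumulative balance: all value lands at delivery."""
    value = TOTAL_VALUE if month >= MONTHS else 0
    return value - MONTHLY_COST * month

def agile_balance(month: int) -> int:
    """Cumulative balance: value is shipped incrementally."""
    value = TOTAL_VALUE * month // MONTHS  # value delivered so far
    return value - MONTHLY_COST * month

for m in (3, 6, 10):
    print(f"month {m}: waterfall {waterfall_balance(m)}, agile {agile_balance(m)}")
```

If the project is cancelled at month 6, Waterfall has burned its entire investment with nothing to show, while the Agile project already holds shipped, usable features: the Pull model's shared-risk effect in miniature.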
The influence of Pull at the Infrastructure level
In the past, the best practice was to acquire hardware and operate your own datacenter. The reasons were security and vertical control over quality and Service Level Agreements: you wouldn't trust a third party to care for your infrastructure as much as you do. By doing so, you had to invest upfront in hardware and, since you planned to keep that hardware for 5 years, you sized your purchase for your next 5 years of growth and your highest peak of resource usage. As such, in the first year you overspent compared to what you really needed. In the second year you still overspent, even as your hardware came under more load, until you reached the balance between what your current hardware could handle and the need to expand it.
Also, once it is time to expand, you always do your Capacity Planning according to your highest peak, which might occur 3% of the time or less.
Aside from the cost of the hardware, the most expensive part of your infrastructure is electricity and cooling; this alone can be 10x the cost of running the server itself. If your servers are idle 97% of the time, you waste tons of money, and it is hardly efficient financially. Keeping servers shut down is not much more efficient either.
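A back-of-the-envelope comparison makes the over-provisioning problem concrete (the load profile and hourly rates below are assumed for illustration only):

```python
# Illustrative: owning hardware sized for the peak vs paying on demand.

HOURS = 24 * 365
PEAK_SERVERS = 100   # capacity needed during the highest peak
BASE_SERVERS = 10    # what actually runs 97% of the time
OWNED_RATE = 1.0     # amortised cost per owned server-hour
CLOUD_RATE = 2.5     # on-demand costs more per server-hour...

peak_hours = int(HOURS * 0.03)
base_hours = HOURS - peak_hours

# ...but you only pay for servers that are actually running.
owned_cost = PEAK_SERVERS * OWNED_RATE * HOURS
cloud_cost = CLOUD_RATE * (BASE_SERVERS * base_hours + PEAK_SERVERS * peak_hours)

print(int(owned_cost))  # 876000: you pay for the peak all year long
print(int(cloud_cost))  # far less, despite the higher hourly rate
```

Even at a much higher hourly rate, paying only for actual usage wins when the peak is rare, which is the financial core of the Pull model applied to infrastructure.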
Virtualization is not new: it has existed since IBM Power Systems with LPARs, long before VMware, which is one reason IBM was so popular before VMware's arrival. But VMware brought virtualization to the next level by making it less hardware-dependent and extending it to the Infrastructure level instead of just the server level. You virtualize switches, cluster your servers, and automate resource usage by migrating VMs or starting new VMs or nodes as needed. So you can shut down idle servers during low peaks and start nodes when high peaks happen.
Virtualization enables companies to be more efficient with their infrastructure costs, but they still need to pay upfront for the hardware to sustain the highest peak, plus the price of the licenses.
The Cloud is another virtualization concept at the Infrastructure level, but its real innovation is sharing the cost of the infrastructure among many companies. Amazon transformed its biggest IT expense, the infrastructure supporting its online bookstore, into a service department offering IT on demand to other businesses, sharing the cost of the IT and, in the end, making it profitable.
By doing that, they created a new application of the Pull model, applied to infrastructure. Companies now only need to invest for their immediate needs: no need to buy a collection of servers for later use. The Return On Investment is immediate, and when you don't need resources anymore, you just cancel them and pay as you go.
This is one of the main reasons the Cloud became so successful despite all its constraints, such as:
- Data integrity issues
The advantages are worth the risk for businesses. This is also transforming the way we work with infrastructure: we no longer work with physical servers, so we build and destroy servers at will, as needed, creating a very dynamic infrastructure. Dealing with this dynamic infrastructure is challenging and requires more automation and new technologies to operate it.
At this point, it is impossible not to develop expertise in Cloud technology, along with new methodologies and new operational skills. We mostly work with APIs that allow us to manage our remote infrastructure programmatically. This requires good programming skills, good knowledge of the operating system, and knowledge of the operations environment: a whole new set of skills and expertise.
And since it mostly involves programming, we tend to use the Agile methodology, basically transforming the whole stream of processes into a Pull chain.
As you can see, the influence of the 70s and the Japanese methodology has had a major impact on everything, including IT. Everything in a business that still uses a Push methodology should be converted to a Pull methodology; if you find Push processes that could be converted to Pull, it is quite likely a good business idea.
It has been revolutionising the IT industry over the past 10 years, 40 years after the Japanese initial conceptualisation of the Pull (Just In Time) methodology. Better late than never, as we say!