Monday, August 15, 2016

If it isn't broke you better fix it.

This didn't have to happen.
I have been off line for a week as I attended the Gen-Con game fair in Indianapolis and tried to get back into the swing of things at work.  While I was away, I had a chance to recharge my batteries and have a good time playing board games with friends.  I also got to have a little fun with the people at Big Potato Games which is seems like a fun group of people who are making a big splash in the industry.  When I came back two things happened which got my attention which illustrated the paradox of contemporary business and modern technology.

The first was the problem with Delta Airlines and its reservation system which grounded the company for two days.  The second was a small article in the technology press about Windows 10 updates.  Both articles illustrate to me that the business maxim, “…if it isn’t broke don’t fix it,” is seriously wrong.  If you are a company in the 21st century if you want to remain in business it is your responsibility to upgrade your technology infrastructure and applications.

First, Delta airlines relies on its reservation system to be managed on AS/400 systems and mainframes using the IBM Transaction Processing Facility software.  The software was last upgraded by IBM ten years ago and the only people who can fix something if anything goes wrong are IBM consultants.  If something goes wrong a CIO and their company is forced to call IBM to make changes and corrections.  In the same ten year span, Microsoft has had four operating systems; Windows 7, Vista, Windows 8 and Windows 10.  Presently, there is an entire ecosystem of developers outside of Microsoft who can alter, improve or fix these systems.  So if an airline wants more availability to labor and more up to date systems they should go with a Microsoft solution.

This did not happen for a few reasons.  First, airlines for all their talk of customer service and being high tech are notoriously stingy with money to upgrade and improve their technology infrastructure.  So what they did is graft other technology systems on to their old IBM infrastructure.  If the AS/400 went down, it would create a cascading effect which would shut down the airline.  According to the news, that is exactly what happened as numerous technology professionals scrambled to get the systems back up and running.  It also lead to the CEO of the company publicly admitting they are doing the best they could to fix the problem without knowing exactly what went wrong.    Next, the people who make the decisions about the funding felt this risk was so unlikely that they decided that the system was not broken and so they did not need to make improvements.

This kind of thinking is foolish.  Software is like any other machine but it manufactured out of ones and zeros instead of steel.  Machinery needs to be maintained or it will break down.  Fail to change the oil in your car and see what happens after 100,000 miles.  That is the exact situation which happened at Delta. The people driving the organization put off or ignored routine maintenance to its systems because it would cost money to do so.  As long as everything was working, there was no need to do maintenance and upgrades.  As you can see, this cost the company millions of dollars when the system failed and hurt its reputation for quality service.

The other new item I saw this week was a brief blurb about how Windows 10 updates are not an iron clad guarantee that a system will not be compromised by hackers because people generally do not upgrade the other software on their machines.  As a technology professional we have seen people with Windows 10 machines with copies of Office 2007 on them.  This mixing and matching of software in the real world is common because people don’t have the money to upgrade everything.  This creates openings for hackers and people willing to do bad things.

This is short sided like a person not changing the oil in their car.  When you upgrade an operating system you should be able to update the software which is on that operating system.  This is why I am a big fan of Google Documents and Microsoft’s Office 365 software because these cloud based systems update automatically and do not rely on the user purchasing and installing upgrades.  The burden is no longer on the consumer but on the company providing the software which is what it should be.

So in one week the world witnessed an object lesson in why the phrases, “…if it isn’t broke don’t fix it,” is wrong.  Old and outdated software which was not maintained properly failed spectacularly.  The only people who could fix the software was a third party vendor which was not responsive.  The pennies saved on upgrades and improvements became millions of dollars in technical debt which shut down the company.  Finally, the reputation of the company was hurt by this kind of thinking.

It is also clear that just upgrading operating systems is not enough the applications which run on those operating systems need to be improved.  I understand that in the world of technology bragging about your new data center or software upgrades to your core business is not as glamorous as web application or phone app but it is just as important because when those systems fail they fail in an embarrassing and spectacular fashion.  So it is up to everyone from the largest company to the personal consumer to pay attention to how they maintain their software.  If not, expect to be grounded.

Until next time.