“That’s some catch, that catch-22,” says Captain John Yossarian, almost in admiration, as he finds himself stuck in the middle of army bureaucracy. “It’s the best there is,” Doc Daneeka agrees. IT specialists around the world have their own catch-22: scheduled downtime, a true conundrum for organizations that run 24/7. Let’s see how Extreme’s solutions can help break this stalemate!
The term “catch-22” was coined by Joseph Heller, who used it in his 1961 satirical novel of the same title, adapted into a TV miniseries by Hulu in 2019 (a must-read and a must-watch, by the way!). The story examines the absurdity of war and military life through the experiences of Captain Yossarian, a U.S. Army Air Forces B-25 bombardier during World War II, who attempts to maintain his sanity while fulfilling his service requirements so that he may return home.
Alas, Doc Daneeka, the squadron’s flight surgeon, invokes “catch-22” to explain why any pilot who requests a mental evaluation, hoping to be found not sane enough to fly and thereby escape dangerous missions, demonstrates his own sanity by making the request and thus cannot be declared insane.
So… that’s some catch, that catch-22, isn’t it? The term has since filtered into common English usage, where it describes a conundrum or sense-defeating logic from which there is no escape because of mutually conflicting or dependent conditions. And that brings us back to downtime.
The cost of idleness
Let’s face it: no one likes downtime, especially when managing hundreds or thousands of devices in a large, multi-tenant environment. For many organizations, the process of introducing a change is beyond bureaucratic. It’s not really about the change or update itself, you see. It’s about the humongous cost (and not necessarily a financial one!) of the idleness that comes with it.
To keep systems running and secure, we need downtime to make necessary changes. Yet that very downtime can’t be granted because of business or operational requirements, especially when an organization runs 24/7. A perfect example is the healthcare industry, especially during these trying times.
So we’re stuck in a situation where the benefits of an update do not outweigh the cost of the maintenance it requires, or outweigh it only slightly. To approach “zero downtime”, organizations deploy hot standby systems and duplicate their hardware and supporting infrastructure just to avoid taking a system offline. Alas, not everyone can afford to do that. Those who can’t simply come to terms with absorbing the downtime as lost productivity.
But it doesn’t have to be that way. Because every cloud has a silver lining. And in this particular case, quite literally!
Moving at cloud-speed
While developing and deploying ExtremeCloud™ IQ, the world’s first true 4th generation platform, we have made significant strides to eliminate maintenance downtime. To get there, we introduced a new method of updating production that never takes the application offline. How does it work exactly?
ExtremeCloud™ IQ uses two major types of database management systems. All of the unstructured data, such as monitoring statistics, events, alarms, and AI/ML data, is stored in Elasticsearch. The data lives in indexes, and those indexes can be expanded with new fields at any time. The application won’t care what we do to the Elasticsearch indexes, as long as we don’t delete anything.
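To see why adding fields is harmless, here is a minimal Python sketch. It does not use a real Elasticsearch client, and the field names (device_id, cpu_percent, channel_utilization) are purely hypothetical. It only illustrates the idea that a reader which picks out the fields it knows is unaffected when a document gains new ones.

```python
# Illustrative only: plain dicts stand in for indexed documents, and the
# field names are assumptions, not the platform's real mapping.
KNOWN_FIELDS = ("device_id", "cpu_percent")


def read_stat(document: dict) -> dict:
    """Pick only the fields this application version understands."""
    return {name: document.get(name) for name in KNOWN_FIELDS}


# A document written before the index gained a new field...
old_doc = {"device_id": "ap-01", "cpu_percent": 37}
# ...and one written after a hypothetical field was added.
new_doc = {"device_id": "ap-01", "cpu_percent": 41, "channel_utilization": 0.62}

# The reader handles both shapes; the extra field is simply ignored.
assert read_stat(old_doc) == {"device_id": "ap-01", "cpu_percent": 37}
assert read_stat(new_doc) == {"device_id": "ap-01", "cpu_percent": 41}
```

Deleting a field is a different story: the same reader would suddenly get a missing value where it expected data, which is why additions are safe and removals are not.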
However, for all of the structured data, such as configuration objects, authentication information, or the inventory list of managed devices, the platform relies on good old-fashioned SQL databases. And with SQL database schemas we need to be a little more careful. That’s why, with the advent of 4th generation cloud, we’ve done things a bit differently.
– First of all, with our ISO 27001 certification, we’ve amassed hundreds of pages of operational documentation and processes that require developers to document their proposed schema changes. Any proposed schema change for a particular release sprint is vetted by several people, including the Cloud Operations Distinguished Engineer. No database change is ever undertaken without explicit vetting and approval – says Bill Lundgren, Director of Product Management for Cloud Operations and Architecture at Extreme Networks.
– Secondly, we’ve engineered ExtremeCloud™ IQ to be forward and backward compatible. This is actually quite a feat. Every line of code in the application that interacts with a database connection must anticipate that it may interface with a new schema and it must gracefully process it. In addition, the databases involved have to be backward compatible such that during the upgrade process the legacy application can still function. This is a ballet of code, conditional processing, and operating processes that come together to create a situation of zero downtime updates – Bill Lundgren says.
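What might “gracefully processing a new schema” look like in code? The following is a small sketch under stated assumptions, not the actual ExtremeCloud™ IQ implementation: it uses Python’s built-in sqlite3 module, and the devices table, its columns, and the fetch_device helper are all hypothetical. The point is that code which reads columns by name keeps working when a newer schema adds or reorders columns.

```python
import sqlite3


def fetch_device(conn: sqlite3.Connection, serial: str):
    """Read a device by column name, ignoring columns this version does not know."""
    row = conn.execute(
        "SELECT * FROM devices WHERE serial = ?", (serial,)
    ).fetchone()
    if row is None:
        return None
    # Name-based access keeps working when columns are appended or reordered.
    return {"serial": row["serial"], "model": row["model"]}


def make_db(schema_sql: str) -> sqlite3.Connection:
    """Create an in-memory database with the given (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # enables row["column_name"]
    conn.execute(schema_sql)
    return conn


# Schema "N" and a hypothetical schema "N+1" with an extra column in the middle.
db_n = make_db("CREATE TABLE devices (serial TEXT PRIMARY KEY, model TEXT)")
db_n.execute("INSERT INTO devices (serial, model) VALUES (?, ?)", ("SN1", "AP-1"))

db_n1 = make_db(
    "CREATE TABLE devices (serial TEXT PRIMARY KEY, site TEXT, model TEXT)"
)
db_n1.execute(
    "INSERT INTO devices (serial, site, model) VALUES (?, ?, ?)",
    ("SN1", "HQ", "AP-1"),
)

# The same application code returns the same result against both schemas.
assert fetch_device(db_n, "SN1") == fetch_device(db_n1, "SN1")
```

Code that relied on column positions instead (for example row[1] for the model) would silently read the wrong value against the newer schema, which is the kind of mistake this discipline is meant to rule out.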
How to accomplish a zero downtime update?
In the diagram below, we start out on the left with the ExtremeCloud™ IQ application and the supporting databases operating normally. In the first step, our cloud operations team will apply fully vetted and authorized DB schema updates to the active databases, creating a DB version of “N+1”. Meanwhile, utilizing the forward compatibility we’ve engineered into the application, the older version of ExtremeCloud™ IQ continues to run using the new database schema. The backward compatibility of the new “N+1” DB schema allows it to accept and process new transactions from the legacy application format.
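The first step can be pictured as an additive schema change. The sketch below is only an illustration using Python’s built-in sqlite3 module; the devices table, the site column, and the legacy_insert function are hypothetical stand-ins, not the platform’s real schema. It shows a version “N” write path that keeps working, unchanged, after the database moves to “N+1”.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema "N": what the legacy application was written against.
conn.execute("CREATE TABLE devices (serial TEXT PRIMARY KEY, model TEXT NOT NULL)")


def legacy_insert(conn: sqlite3.Connection, serial: str, model: str) -> None:
    """Version N write path: it names its columns explicitly."""
    conn.execute(
        "INSERT INTO devices (serial, model) VALUES (?, ?)", (serial, model)
    )


legacy_insert(conn, "SN1", "AP-1")

# Schema "N+1": a purely additive change. The default value means writes
# that do not know about the new column are still valid.
conn.execute(
    "ALTER TABLE devices ADD COLUMN site TEXT NOT NULL DEFAULT 'unassigned'"
)

# The legacy application keeps writing while the new schema is live.
legacy_insert(conn, "SN2", "AP-2")

rows = conn.execute(
    "SELECT serial, model, site FROM devices ORDER BY serial"
).fetchall()
assert rows == [("SN1", "AP-1", "unassigned"), ("SN2", "AP-2", "unassigned")]
```

Had the change renamed or dropped the model column instead, the legacy insert would have failed immediately, so changes of that kind have to wait until no old application instance is left.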
Next, we apply updates to the ExtremeCloud™ IQ application itself. In 4th generation cloud, this update involves booting and swapping in new containers that are orchestrated by Kubernetes. We do this gracefully, replacing a portion of the running Kubernetes Pods and then failing active connections over to the upgraded instances. We then upgrade the remaining Pods, creating an application version “N+1” that is connected to the new supporting DB schema.
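The batch-by-batch replacement can be modelled in a few lines of Python. This is a toy simulation, not the Kubernetes API: the Pod class, the rolling_update function, and the pod names are all invented for illustration. It only demonstrates the property that matters, namely that some instances stay ready to serve connections at every moment of the upgrade.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    version: str
    ready: bool = True


def rolling_update(pods: list, new_version: str, batch_size: int) -> int:
    """Upgrade pods batch by batch; return the lowest number of ready pods seen."""
    if not 0 < batch_size < len(pods):
        raise ValueError("batch_size must leave at least one pod serving")
    min_ready = len(pods)
    for start in range(0, len(pods), batch_size):
        batch = pods[start:start + batch_size]
        for pod in batch:
            pod.ready = False  # drained: connections go to the other pods
        min_ready = min(min_ready, sum(p.ready for p in pods))
        for pod in batch:
            pod.version = new_version  # swap in the new container
            pod.ready = True  # back in rotation
    return min_ready


pods = [Pod(f"xiq-{i}", "N") for i in range(4)]
lowest = rolling_update(pods, "N+1", batch_size=2)

# Half of the pods were serving at the worst moment, and all ended on N+1.
assert lowest == 2
assert all(p.version == "N+1" for p in pods)
```

A smaller batch size keeps more capacity online during the swap at the price of a longer rollout, which is the usual trade-off when tuning a rolling update.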
Finally, we apply DB cleanup and housekeeping routines to set the new standard for both the application and the database.
Escape the clutches of catch-22
Designed to streamline every aspect of your network from deployment to maintenance, ExtremeCloud™ IQ has been built with the human element in mind, helping IT and business teams focus on what’s really important to them instead of dealing with menial and time-consuming operational tasks.
So be like Captain Yossarian. Steer your organization into the Cloud and escape the clutches of scheduled downtimes – a real catch-22!