When it comes to viruses, it's typically those of the computer/digital variety that IT is concerned about. But with the ongoing pandemic, IT operations teams are on the hook to maintain business functions in the midst of rapid and massive change. One of the biggest challenges for businesses is the shift to remote work at scale. Ensuring that they can continue to provide products and services — and satisfy their customers — against this backdrop is challenging for many.
IT resources that once served business needs well are now faced with a whole different set of employee needs, while home internet and hardware are also being put to the test. There are already examples of infrastructure under strain. One leading cloud provider was hit by an outage for several hours in March, after a connectivity issue in one of its U.S. data centers. And a video calling service, which has seen a massive upsurge in usage, experienced a partial outage in April. Another example is the U.S. government loan plan intended to support small businesses, which was beset with technical issues, leaving many banks and would-be users unable to access the online portal.
It stands to reason that organizations are trying to make do during this unprecedented time, patching existing products and services with rapidly deployed bolt-ons to keep their business running while also improving communication and accessibility for people who work from home.
An outage is a very public kind of failure. But as embarrassing as it is for the companies that make the headlines, it's a risk all businesses face. There are multiple potential causes — network failures, software malfunctions, usage spikes, human error and configuration error among them.
Those headline-making outages give business leaders big headaches as they deal with the huge associated costs, which can run into the hundreds of millions of dollars, as well as the impact on customer confidence. Fortune 1000 companies lose between $1.25 billion and $2.5 billion every year due to unplanned outages.
Counting the Cost
The length, cost and impact of an outage will vary, at least in part because multiple parts of a business are likely to be affected simultaneously. The size and scale of a company can also complicate the problem. Evolving technology and platforms spread across multiple locations can create weak points that are not immediately obvious without oversight of the entire system. With tightening operations budgets, maintaining that oversight can be a constant challenge.
A report by Ponemon put the average cost of downtime at nearly $9,000 a minute. Outage cost, of course, varies greatly depending on the size of the business affected and the sector it operates in. Banking, government, healthcare, manufacturing, media, retail, utilities and transportation are among those most at risk, and where outages are the most costly.
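To put the per-minute figure in perspective, here is a rough back-of-the-envelope sketch. It assumes a flat rate of $9,000 a minute (rounding the "nearly $9,000" average up) and ignores the variation by company size and sector noted above; the function name and the sample durations are illustrative only.

```python
# Rough estimate only: assumes a flat per-minute rate, which real outages rarely follow.
COST_PER_MINUTE_USD = 9_000  # rounded from the "nearly $9,000 a minute" average


def estimate_downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Return a naive downtime cost estimate in U.S. dollars."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # Hypothetical outage durations, chosen for illustration.
    for label, minutes in [("15 minutes", 15), ("1 hour", 60), ("4 hours", 240)]:
        print(f"{label}: ${estimate_downtime_cost(minutes):,.0f}")
```

At that rate, a one-hour outage comes to $540,000 and a four-hour outage to more than $2 million, before any of the indirect costs are counted.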
How much downtime costs an organization isn't just a matter of looking at lost revenue. Business disruption, reputational damage, customer churn and the effect on productivity levels also factor in. Further down the line, there may well be a fallout caused by fines, litigation or settlements, third-party costs and equipment replacement.
Steps Toward Resilience
During downtime, teams usually fall back on a trial-and-error approach that relies on individual know-how and on teams working in operational and technology silos. This is likely to prolong the amount of time businesses are offline.
A better solution is for organizations to determine what they can do in advance to avoid outages and implement a recovery plan to get them back up and running as soon as possible. This should include cooperation with third-party providers and technology partners. Agile businesses will be best placed to weather the current situation. The ability to adapt to demand quickly and fall back on a robust IT application system will help ensure that resilience.
To reduce the risk of downtime, take the obvious steps involved in eliminating single points of failure — balancing load between servers, following good back-up practices and building in technical fail-safes. What is becoming increasingly apparent is that sophisticated AI, predictive processes and automation are starting to play a critical role in prevention.
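As a minimal illustration of balancing load between servers so that no single one becomes a point of failure, the sketch below rotates requests across a pool and skips any server marked as down. The class name, method names and server labels are hypothetical, and a production load balancer would add real health checks, timeouts and retries.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips servers marked unhealthy."""

    def __init__(self, backends):
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        self._healthy = set(self._backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        """Record that a backend failed its (hypothetical) health check."""
        self._healthy.discard(backend)

    def mark_up(self, backend):
        """Return a known backend to rotation once it has recovered."""
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none are left."""
        for _ in range(len(self._backends)):
            candidate = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    balancer.mark_down("app-2")  # simulate one server failing
    print([balancer.pick() for _ in range(4)])
```

With one server marked down, traffic keeps flowing to the remaining two; only when every server is down does the balancer report a failure.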
This cognitive technology essentially operates at three basic levels: the ability to perform tasks, perform activities and handle situations. This last level, intelligent incident or situation handling, prioritizes what needs to be acted on, identifies the root cause and prescribes an action. It further augments productivity by performing the action autonomously. Enabling these mission-critical applications to keep IT running is core to supporting current, essential services such as healthcare systems, utilities, telecom providers, and retail and distribution services.
Though the pandemic is straining IT teams, many of the challenges that companies are facing now have common ground with smaller-scale problems that crop up during "business as usual." It is also important to remember that the business landscape is going to look fundamentally different once the immediate crisis has passed. There will be increasing demand to run businesses effectively by working remotely, managing cash flows through smart supplier management, and shifting from a reactive to a proactive mode of IT operations by eliminating slow and error-prone manual processes.
COVID-19 has created a new "business as usual," and companies will benefit from the assist that intelligent systems management, leveraging AI and automation platforms, can give. Such tools help organizations to become more resilient and adaptable, creating a more reliable infrastructure that minimizes or even eliminates outages.