APMdigest posed the following question to the IT Operations community: How should ITOps adapt to the new normal? In response, industry experts offered their best recommendations for how ITOps can adapt to this new remote work environment. Part 4 covers monitoring and visibility.
Start with: How ITOps Can Adapt to the New Normal - Part 1
Start with: How ITOps Can Adapt to the New Normal - Part 2
Start with: How ITOps Can Adapt to the New Normal - Part 3
AIOPS AND OBSERVABILITY
Implement proper AIOps and Observability solutions, which will reduce the "wild goose chases" undertaken by ITOps teams. The money saved by solving high-profile incidents will pay for the solution many times over within the first year.
Principal, The Field CTO
Read Andy Thurai's recent blog on APMdigest: Getting to Zero Unplanned Downtime with AIOps
Just as pain is called "the gift no one wants", the turbo-pivot to remote everything paved the way for innovation in ITOps. Yes, the crisis pointed out some areas that needed shoring up, but it also permanently 86-ed the old "that's not the way we've always done it" obstacle to change. ITOps teams have the perfect storm of opportunity, necessity, and cultural open-mindedness to innovate and to automate cross-domain collaboration. There's a stunning array of capabilities to choose from across a rich AIOps market landscape — and now is the time to strike.
Research Director, Enterprise Management Associates (EMA)
Read Valerie O'Connell's recent blog on APMdigest: ITSM That's Ready When Tomorrow Happens Today
This has been a unique year that forced companies to accelerate their digital transformation efforts and move faster than ever to keep up with growing customer demands. As a result, business leaders must invest in technology that combines the power of artificial intelligence with observability to easily solve the problems hindering them from delighting customers under mounting pressure. As digital business cements itself as the norm, they'll also need modern tools capable of rapid time to results (literally bringing value in the time it takes to make a cappuccino) and of going from zero to correlated incidents. Relying on legacy tools to gather and integrate data can take months and hinder success well into 2021. Investing in modern solutions that drive innovation is the only way to be successful in the new normal.
Download the eBook: Observability with AIOps For Dummies
APPLICATION AND INFRASTRUCTURE MONITORING
As remote work continues to be an integral part of the new normal, ITOps teams need to be agile and prepared to address common issues such as service outages, including systems going down and applications slowing. Therefore, IT teams should ensure application and infrastructure monitoring solutions are a key part of their long-term strategies. These will allow teams to act quickly to identify and address any issues outside traditional network perimeters.
Sr. Marketing Manager, ManageEngine
END USER EXPERIENCE
For network operations teams going forward, the biggest challenge will be keeping up with the accelerated pace of change now that they've proven to skeptical business leaders their efficiency (and efficacy) in successfully transforming the network. This will require teams to put a greater emphasis on leveraging comprehensive visibility into end-user performance wherever users are located now that the footprint for potential errors has expanded with workers at home.
Marketing Communications Manager, AppNeta
Read Paul Davenport's recent blog on APMdigest: IT Has Proven Rapid Digital Transformation is Possible - What's Next?
In our conversations with partners and customers, we are finding that IT leaders are moving from reactive mode to more opportunistic and proactive thinking. Organizations should continue to invest in collaboration software and in developing the most efficient ways for teams to work together and stay productive during ongoing remote work, yet there needs to be sharper attention to customer experience. This means that IT will need to reduce technical debt to free up investment in targeted innovation, and determine the best way to measure everything it does according to business goals and the delivery of key business services.
VP Product Development and Cloud Operations, OpsRamp
One powerful way ITOps teams can adapt to "the new normal" is to focus on better cloud monitoring and visibility. As the rapid shift to remote work accelerated the cloud migration efforts that were already happening pre-COVID, ITOps teams have been under significant pressure to monitor the cloud from an operations and network perspective. As more businesses move data center assets to the cloud, ITOps must be equipped to monitor the new normal of cloud-based networks. On-premises workloads have long afforded the ability to access all parts of the network that you own, but now, with increasing cloud adoption, your ITOps team needs specific instrumentation provided by cloud providers and integrated by tool vendors. For instance, to assess application performance from the network perspective (in the same way you're accustomed to for an on-premises data center), you must understand how to instrument with traffic mirroring via cloud-based packet analytics tools. To effectively monitor, manage and optimize today's increasingly cloud-based networks, start by looking at your existing toolset and determining if and how it can be adapted or upgraded with new tools and capabilities for cloud visibility.
Founder and CTO, LiveAction
VISIBILITY BEYOND THE NETWORK
Now that IT teams are on the hook to manage what has become a different corporate network for every employee, and ensure that core business applications are seamlessly delivered over third-party cloud and Internet networks, today's new work-from-home infrastructure requires us to adapt to new monitoring frameworks that provide visibility beyond the networks that traditionally lie within enterprise control. The digital supply chain in today's new normal is more complex than ever before and troubleshooting disruptions, managing digital experiences, and scaling support all require that ITOps has full visibility into what has become an exponentially extended IT perimeter.
Head of Product, ThousandEyes
FOCUS ON ENDPOINTS
The focus for ITOps teams needs to be on the endpoint. For the end user, the boundaries of the corporate network have long disappeared, and now IT teams are dealing with what is essentially one big worldwide network instead of the well-defined, enterprise-built network they were previously accustomed to. The potential attack surface has expanded exponentially in parallel, putting both the business and its customers at greater risk of compromise. So, the top priority for IT needs to be ensuring they have the ability to find, manage, and secure every endpoint no matter where it is connecting from.
VP, Sales Engineering, Absolute Software
The pandemic has been an accelerant for digital transformation. For some companies, digital transformation went from whiteboard to production in weeks, producing new applications, services and software updates that consumers and businesses now regularly rely on in the new normal. DevOps and SRE teams responsible for maintaining digital services have seen expectations heightened, with increased demands to quickly address and resolve service degradations resulting from accelerated innovation. Technology teams need a better way to recover quickly, adapt and learn from outages and interruptions related to technical and customer-impacting issues. This can create more space for innovation and fuel more accessible, always-on customer experiences. Leveraging SRE practices to modernize incident management and implementing automated resolution workflows can help teams reduce friction in the entire software development cycle. This adaptive approach applies agile principles to incident management, empowering teams to deliver better customer experiences at a lower cost.
Many IT organizations, even some of the most organized, proactive teams, have accepted a higher level of reactive work than they'd like. That's understandable, especially if it meant saving the business. However, not all have been able to unwind temporary process exceptions, or return to the previous, proactive stance IT professionals prefer. The second wave for IT is redesigning IT processes — device deployment, support, security, and more — to support remote work for the long run. Normalizing a deep queue of support tickets from exceptions into standardized requests recovers headroom teams need for ongoing business-critical transformation projects. More than catching up to the "new normal," returning to proactive ops will make it easier for teams to adapt to the "next normal," whatever it happens to be.
Head Geek, SolarWinds
Go to: How ITOps Can Adapt to the New Normal - Part 5, the final installment in the series.