BigPanda Expands Platform
October 29, 2019

BigPanda announced a major expansion of its platform capabilities to enable IT Ops, network operations center (NOC), and DevOps teams to rapidly investigate and resolve incidents and outages in cloud-native and hybrid-cloud environments.

Leveraging its Open Box Machine Learning and its Open Integration Hub technologies, BigPanda ingests changes from disparate change feeds and tools, and correlates and analyzes these changes against alerts collected from enterprise monitoring tools to rapidly isolate the root cause change that resulted in an incident or outage.

“Today’s IT environments are very fast-moving and constantly changing. Changes in software and infrastructure occur several times a day at most enterprises, which dramatically increases the potential for unexpected incidents and outages. Unfortunately, legacy IT operations tools weren’t designed for environments of rapid change and are slowing down operations teams from discovering and resolving outages in a timely manner,” said Assaf Resnick, CEO and co-founder, BigPanda. “BigPanda’s new offering puts, for the first time, the root-cause change behind an outage at the IT Ops teams’ fingertips, slashing mean-time-to-resolution and improving the performance of critical systems and applications. This is a win for IT operations teams, their enterprises, and most importantly, their customers.”

As enterprises migrate to the cloud, the pace of change in their IT stacks accelerates. These fast-moving IT stacks undergo hundreds or thousands of changes on a continual basis and have ever-shifting application and service topologies. Legacy IT operations tools and root cause analysis techniques are ineffective in these stacks because they were designed for slower-moving monolithic applications and IT stacks, where the root causes of problems were mostly infrastructure and hardware failures.

When IT Ops, NOC, and DevOps teams try to use legacy tools and techniques to support cloud-native and hybrid-cloud architectures and applications, incidents and outages become more frequent, last longer and have a wider impact footprint. This creates serious consequences for businesses in the form of higher operating costs, degraded performance and availability, SLA violations and penalties, and ultimately, unhappy customers and end-users.

The BigPanda platform expansion includes the following features designed to speed up incident and outage resolution:

- Root Cause Changes: BigPanda’s platform expansion equips IT Ops, NOC, and DevOps teams, for the first time, with the tools to contend with the thousands of regular application and infrastructure changes that cause incidents and outages. Leveraging out-of-the-box integrations with all major change feeds and tools, BigPanda’s Root Cause Changes feature ingests changes from any source of change data, including change management, change log, configuration management, and others. Subsequently, BigPanda’s Root Cause Changes feature uses machine learning (ML) to correlate and analyze this dataset alongside the dataset of alerts collected from monitoring tools. The ML-driven cross-correlation and analysis surfaces the root cause change that resulted in an incident or outage, enabling IT Ops, NOC and DevOps teams to rapidly handle the change and resolve the incident or outage.

- Real-time Topology Mesh: Another aspect of the BigPanda platform expansion is the launch of the Real-time Topology Mesh. This new capability makes BigPanda’s platform the first AIOps solution to provide a real-time topology model across the entire IT stack, including the dynamic infrastructures inside fast-moving IT stacks, by piecing together the third critical dataset for IT operations: topology data. Leveraging out-of-the-box integrations, BigPanda’s Real-time Topology Mesh ingests topology data from configuration management, cloud & virtualization management, service discovery, APM and CMDB tools to create a full-stack, always up-to-date topology model. For IT Ops, NOC and DevOps teams struggling to detect, investigate and resolve incidents and outages in fast-moving IT environments, BigPanda’s Real-time Topology Mesh significantly improves their ability to detect those incidents and outages, visualize them, identify their probable root cause, understand their impact on users and customers, and route them to the right teams for rapid resolution, all in real-time.

“The world of hybrid IT — with a mix of cloud-native and legacy, on-prem workloads — is here for the foreseeable future. Old approaches to problem solving in these complex, dynamic environments don’t work, in part because they typically don’t deliver insight into the relationship between changes and incidents,” said Nancy Gohring, senior analyst with 451 Research. “Correlating alerts, change events and topology can help teams narrow in on the cause of performance problems in modern application and infrastructure environments.”

With the launch of Root Cause Changes and Real-time Topology Mesh, BigPanda is now able to ingest the three critical datasets in IT operations: alerts, changes and topology, across all layers of fast-moving IT stacks, and use ML to correlate and analyze this data in real-time. This helps IT Ops, NOC and DevOps teams rapidly detect, investigate and resolve incidents and outages, minimizing the impact on users and customers.
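To make the idea of correlating alerts, changes and topology concrete, here is a minimal Python sketch. It is not BigPanda's method: the announcement does not describe the ML involved, so this swaps in a simple time-window and dependency heuristic. The names `Change`, `Alert`, `rank_candidate_changes`, the one-hour lookback and the sample services are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    change_id: str
    service: str
    at: datetime


@dataclass(frozen=True)
class Alert:
    alert_id: str
    service: str
    at: datetime


def rank_candidate_changes(alert, changes, topology, lookback=timedelta(hours=1)):
    """Rank changes that could plausibly explain an alert.

    Hypothetical heuristic (an assumption, not a vendor algorithm):
    keep changes made within `lookback` before the alert, on the
    alerting service or one of its direct dependencies, then order
    them same-service first and most recent first.
    """
    related = {alert.service} | topology.get(alert.service, set())
    candidates = [
        c for c in changes
        if c.service in related
        and timedelta(0) <= alert.at - c.at <= lookback
    ]
    return sorted(
        candidates,
        key=lambda c: (c.service != alert.service, alert.at - c.at),
    )


# Illustrative data: service names and timestamps are made up.
t0 = datetime(2019, 10, 29, 12, 0)
topology = {"checkout": {"payments-db", "cart"}}  # service -> direct dependencies
changes = [
    Change("chg-1", "payments-db", t0 - timedelta(minutes=50)),
    Change("chg-2", "checkout", t0 - timedelta(minutes=10)),
    Change("chg-3", "search", t0 - timedelta(minutes=5)),   # unrelated service
    Change("chg-4", "checkout", t0 - timedelta(hours=3)),   # outside the window
]
alert = Alert("alt-1", "checkout", t0)

ranked = rank_candidate_changes(alert, changes, topology)
print([c.change_id for c in ranked])  # ['chg-2', 'chg-1']
```

Even this toy version shows why all three datasets matter: without topology, the change to `payments-db` would never be linked to an alert on `checkout`, and without change data there would be nothing to rank at all.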

Both new additions to the BigPanda platform, Root Cause Changes and Real-time Topology Mesh, are currently available to select customers as part of a beta program and will be generally available at the end of the year.

