
As CIOs Address App Sprawl, Observability Can't Be an Afterthought

Bill Lobig
IBM Software

App sprawl has concerned technologists for years, but it has never posed a greater challenge than it does today. As organizations embed generative AI into their applications, the problem will only grow more complex. In fact, a recent Canva report found that 72% of CIOs see application sprawl as a challenge, and with 71% of CIOs expecting to adopt 30-60 new apps this year, this complexity is poised to keep growing.

Potential responses include consolidating applications, optimizing workflows, and automating IT processes to reduce the strain on technologists so they can tackle app sprawl head-on. While these approaches are all valid, observability is essential for making sense of the vast amounts of complex data within AI-infused applications, and it must be the centerpiece of any app- and data-centric strategy to truly manage app sprawl.

Cracking the Code for AI App Sprawl Challenges

In a year of elevated global IT spend, ensuring investments aren't wasted is a necessity for overwhelmed technology leaders, who must not only decide which technologies to implement but also make sense of application performance amid ever-growing volumes of complex data.

When AI enters the mix, complete visibility becomes even more important, yet many organizations still lack it. Observability tools and practices help technologists address AI app sprawl by providing visibility into the performance, behavior, and dependencies of AI applications. Teams operating with incomplete visibility, by contrast, simply don't know what they don't know.

Simply put, traditional application performance monitoring (APM) tools provide visibility to a degree, but they weren't built to account for the influx of generative AI applications that modern enterprises are dealing with.

End users of generative AI-infused applications demand continuous availability and a frictionless experience. Without real-time visibility, however, they will feel every outage or delay that compromises that experience, particularly as applications span numerous platforms. Moreover, many organizations are still working out how to implement complex generative AI models and understand their behavior. These blind spots can create significant performance, compliance, and security issues.

Observe the Full Stack So You Can Add to It

With observability, organizations can infer the internal state of their AI applications from their external outputs. When those insights are connected to business outcomes and acted on, technology teams lay the groundwork for a well-functioning application monitoring and management process. Then, when generative AI is introduced to the environment, the foundation is already in place to see and optimize application processes effectively.
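Inferring internal state from external outputs can start very simply: aggregate the telemetry a service already emits into a few health signals. A minimal Python sketch of this idea follows; the `RequestSample` type, the p95/error-rate choice, and the sample values are all illustrative assumptions, not any vendor's API:

```python
from dataclasses import dataclass

@dataclass
class RequestSample:
    """One externally observable request outcome (illustrative type)."""
    latency_ms: float
    status: int  # HTTP status code

def summarize(samples):
    """Infer service health purely from outputs: tail latency and error rate."""
    latencies = sorted(s.latency_ms for s in samples)
    idx = min(int(0.95 * len(latencies)), len(latencies) - 1)
    error_rate = sum(s.status >= 500 for s in samples) / len(samples)
    return {"p95_latency_ms": latencies[idx], "error_rate": error_rate}

# 95 healthy requests plus 5 slow server errors
samples = [RequestSample(120, 200)] * 95 + [RequestSample(900, 500)] * 5
print(summarize(samples))  # the p95 surfaces the degraded tail the mean would hide
```

Even a toy summary like this shows why tail percentiles, not averages, are the signals worth connecting to business outcomes: five bad requests in a hundred barely move the mean but dominate the p95.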

As generative AI applications join the enterprise equation, observability tools are a must-have for delivering higher-quality software at a faster pace. They enable this by:

Finding and fixing the "unknown unknowns": You can't fix what you can't see. Unfortunately, many monitoring practices and tools can only address flaws that are already known. Observability uncovers conditions that would be impossible to find manually or with traditional platforms, correlates them with performance flaws, and provides the context needed to discover root causes, resulting in quick and easy remediation.

Detecting and remediating issues early on: With observability, monitoring is integrated into the earliest stages of software development, making it easy to pinpoint and rectify new code issues before they violate service level agreements (SLAs) or degrade the customer experience.

Self-healing application infrastructure and automated resolution: Observability can be coupled with automation capabilities to anticipate issues from system outputs and resolve them autonomously, without manual intervention.

Scaling and load balancing: Observability not only reveals the current load on systems but also helps forecast future demand. That data can be used to optimize applications in real time without end users feeling any impact.

Cost management: By optimizing workloads and computational resources, SREs can and should use observability to reduce VM, GPU, cloud, and inferencing costs.
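The self-healing pattern described above is, at its core, a feedback loop: score each new metric reading against a rolling baseline and fire a remediation hook when it deviates too far. A minimal Python sketch follows; the `remediate` callback (in practice it might restart a service or scale a deployment), the window size, the 10-sample warm-up, and the z-score threshold of 3 are all illustrative assumptions:

```python
import statistics
from collections import deque

class SelfHealingMonitor:
    """Flag anomalous metric readings and fire a remediation hook."""

    def __init__(self, remediate, window=30, threshold=3.0):
        self.history = deque(maxlen=window)  # rolling baseline of recent readings
        self.remediate = remediate           # placeholder action, e.g. restart a pod
        self.threshold = threshold           # z-score beyond which we act

    def observe(self, value):
        """Record one reading; return True if it triggered remediation."""
        triggered = False
        if len(self.history) >= 10:  # need a baseline before judging anomalies
            mean = statistics.fmean(self.history)
            stdev = statistics.pstdev(self.history) or 1e-9  # avoid div-by-zero
            if abs(value - mean) / stdev > self.threshold:
                self.remediate(value)
                triggered = True
        self.history.append(value)  # anomalies join the baseline too
        return triggered
```

Feeding the monitor a steady stream of readings around 100 and then a spike of 200 would invoke the hook once; real systems add debouncing and cooldowns so one noisy sample doesn't trigger repeated restarts.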

Observability is crucial for organizations addressing AI app sprawl and the myriad challenges that come with it, especially as generative AI becomes a mainstay across enterprises. By following observability best practices and deploying the right automated tools, organizations can proactively identify and resolve issues and keep all AI applications available and friction-free.

Bill Lobig is VP, Automation Product Management, IBM Software
