
Information is Power, But Only If ...

Robin Lyon

IT has access to an amazing amount of data. We often collect hundreds of data points on a single server: individual processor load, thread state, disk throughput in and out, and so on. We store all of this and use it to create a metric called something like "server performance." When it comes time to provide reports (weekly, monthly and so on), IT assigns some poor person the job of collating the information. That usually means running a report, importing it into a spreadsheet, combining various servers and metrics into some grouping, and calling the result an application. Numbers are then calculated and saved in the spreadsheet to create a performance-over-time graph. The same is done with database numbers, application performance, network statistics and so on. The process is then repeated up through the levels of management, combining more numbers into a single figure representing service performance so it can be reported to more senior management.

Given that IT is all about automating processes, this strikes me as somewhat backwards.

Data Management and IT – Operational Intelligence

IT, by and large, is staffed by realists – the type who don't respond well to marketing, want solutions, and have little time for repetition.

A second reality is that IT is a fledgling science. While it has a century under its belt, it has not developed some of the niceties of older disciplines, like the common taxonomy of biology; every company creates its own rankings and groupings of IT functions. Quite often a great deal of resources goes into creating this custom taxonomy.

To add to the frustration of IT managers everywhere, off-the-shelf applications each present data in a taxonomy hard-coded into that application. It becomes more and more difficult to extract and combine data in a meaningful way.

An IT-friendly application should let its users create rules for grouping data for reports. By combining atomic bits of data, such as unused capacity across a select group of servers, it can report the unused server capacity for an application. Using that application-level figure as a new data point, a well-designed application will allow another ad hoc grouping that provides information on an overall service.
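As a minimal sketch of this idea (all server names, application names, and numbers below are made up for illustration), a grouping rule is just a mapping from an application to the servers that make it up, and the application-level metric is an aggregate over its members:

```python
# Hypothetical sketch: roll atomic per-server metrics up into an
# application-level metric via a user-defined grouping rule.
# Names and values are illustrative only.

# Atomic data points: unused capacity (percent) per server.
unused_capacity = {
    "web01": 35.0,
    "web02": 40.0,
    "db01": 10.0,
    "db02": 15.0,
}

# Grouping rule: which servers make up each application.
applications = {
    "storefront": ["web01", "web02", "db01"],
    "reporting": ["db02"],
}

def app_unused_capacity(app: str) -> float:
    """Average unused capacity across the servers in an application."""
    servers = applications[app]
    return sum(unused_capacity[s] for s in servers) / len(servers)

print(app_unused_capacity("storefront"))  # average of 35, 40 and 10
```

The point is that once the rule exists, the application-level number is derived automatically from the atomic data every reporting period rather than being re-assembled by hand in a spreadsheet.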

This process of using groups to build further groups continues as needed until the application is configured to match the taxonomy the company has designed. Instead of complex calculations every month, a one-time setup is created and automation is achieved.

By allowing data elements to be members of more than one group, we avoid a second common pitfall: how to account for shared components such as the time of DNS queries, or a database server that supports multiple applications.
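Extending the sketch (again with made-up names and values), groups can contain other groups, and a shared element such as a DNS server can appear in every group that uses it, so each roll-up accounts for it without any special-casing:

```python
# Hypothetical sketch: groups that contain other groups, and data
# elements that belong to more than one group (e.g. a DNS server
# shared by several services). Names and values are illustrative only.

metrics = {"web01": 35.0, "db01": 10.0, "dns01": 50.0}

groups = {
    "storefront": ["web01", "db01", "dns01"],    # dns01 shared
    "intranet": ["dns01"],                       # dns01 again
    "all-services": ["storefront", "intranet"],  # group of groups
}

def rollup(name: str) -> float:
    """Average a metric over a group, recursing into sub-groups."""
    if name in metrics:  # atomic data point
        return metrics[name]
    members = groups[name]
    return sum(rollup(m) for m in members) / len(members)

print(rollup("all-services"))
```

This matches the taxonomy-building process in the text: groups of groups nest as deep as the company's taxonomy requires, and membership in multiple groups is just a name appearing in more than one list.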

IT needs to save time, and its internal applications need to accept the reality of reporting against an ever-changing data set that is custom to each company that uses it.

Robin Lyon is Director of Analytics at AppEnsure.

