I/Os Per Second Myths

Terry Critchley

The performance of an application depends on the availability of adequate IT resources, such as CPU, memory, storage and so on.

Storage metrics of interest are:
■ Data capacity
■ Input/output capacity (I/O performance)
■ Durability, space, cooling, cost, ROI and other mainly commercial factors.

We are concerned in this blog with the second item, I/O capability, which is not as simple as saying "my system does X input/output operations per second (IOPs)". First, let us look at some background to input/output. The classical I/O time for a disk access is:

TCPU + TCTL + TSEEK + TWAIT + TACC + TXFR + TCOMP

TCPU = Time to parse and generate the I/O request in the processor

TCTL = Time for the controller to format and issue the request to the HDD, plus the time for the request to reach the HDD

TSEEK = Time to move to the correct track on the HDD (called a SEEK)

TWAIT = Time waiting for the disk to rotate to the required record (rotational latency)

(In the case of disk subsystems with set sector capability, the channel disconnects from the particular I/O until the record position on the track is about to be reached, then reconnects to complete the I/O. In the meantime it can do something else with its time. Prior to this feature, the channel remained tied up while the head moved into the right position and was released only after the I/O was complete.)

TACC = Time to access the record (SEARCH), which carries an overhead depending on the format of the data (RDBMS, flat file, RAID x and so on)

TXFR = Transfer time of the accessed data to the processor via the controller/channel

TCOMP = Time to complete/post the end of the I/O.

This total time, divided into 1 second, gives the I/Os per second (IOPs) rate. Is physical I/O speed all that matters then?
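To make the arithmetic concrete, here is a minimal Python sketch of this model. All of the component times are illustrative assumptions for a plausible HDD, not measurements of any particular drive:

# A minimal sketch of the classical HDD I/O time model above.
# Every component time below is an assumed, illustrative value in ms.
T_CPU = 0.1    # parse and generate the I/O request in the processor
T_CTL = 0.2    # controller formats/issues the request to the HDD
T_SEEK = 4.0   # move the arm to the correct track (SEEK)
T_WAIT = 3.0   # wait for the disk to rotate to the required record
T_ACC = 0.5    # access (SEARCH for) the record on the track
T_XFR = 0.3    # transfer the data back via the controller/channel
T_COMP = 0.1   # complete/post the end of the I/O

t_io_ms = T_CPU + T_CTL + T_SEEK + T_WAIT + T_ACC + T_XFR + T_COMP
iops = 1000.0 / t_io_ms  # divide the I/O time into 1 second (1000 ms)
print(f"I/O time {t_io_ms:.1f} ms -> about {iops:.0f} IOPs, in theory")

With these assumed values the arithmetic gives roughly 122 IOPs, but, as the myths below show, that theoretical figure is not a usable planning number.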

Records: A record to an application usually means a logical record, for example, the name and address of a client. This can be made up of more than one physical record, which is normally retrieved as a block of a certain size, for example, 2048 bytes. Sometimes, though, a physical record may contain more than one logical record.
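As a hypothetical illustration of blocking, with assumed sizes:

# Hypothetical blocking example; both sizes are assumptions.
block_size = 2048          # bytes per physical block, as in the text
logical_record_size = 120  # assumed client name-and-address record size

records_per_block = block_size // logical_record_size  # here, 17
print(f"One physical I/O can retrieve up to {records_per_block} logical records")

This is why counting physical I/Os alone says little about how many logical records an application actually obtains per second.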

Disk Access: An I/O operation consists of several activities, and the list of these depends on how far back you go in the chain from data need to fulfillment. This is shown in the I/O time equation above.

Myth 1

This myth is propagated widely in internet articles and is totally erroneous, so beware. The misconception is as follows:

■ if an I/O operation (seek, search, read) takes X milliseconds, then that disk arm is capable of supporting 1000/X I/Os per second (IOPs). Yes it is, if you don't mind a response time of approximately infinity, give or take a few ms, since the arm would be running at 100% utilization.

A sensible approach would be to do this calculation and settle for, say, 40% of this IOPs rate as an average that might be sustained.
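Elementary queueing theory shows why. If we model the disk arm as an M/M/1 queue (a deliberate simplification, not a claim about any real subsystem), the mean response time is the service time divided by (1 - utilization), which grows without bound as utilization approaches 100%. A sketch with an assumed 10 ms I/O time:

# Response time vs. utilization under an assumed M/M/1 model.
service_time_ms = 10.0  # assumed I/O time -> 100 IOPs at 100% busy

for utilization in (0.2, 0.4, 0.6, 0.8, 0.9, 0.99):
    response_ms = service_time_ms / (1.0 - utilization)  # R = S / (1 - U)
    sustained_iops = utilization * (1000.0 / service_time_ms)
    print(f"U = {utilization:4.0%}: {sustained_iops:5.1f} IOPs, "
          f"response {response_ms:7.1f} ms")

At 40% utilization the arm delivers 40 IOPs with a response time of about 16.7 ms; push it towards 100% and the response time heads for infinity, which is the point of the myth.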

Myth 2

If we make the allowance above, then surely a storage subsystem supporting X IOPs will perform better than one supporting 0.8X IOPs. In its raw form, this statement is not true, I'm afraid, since the number of I/Os needed to satisfy an application's request for data depends on other factors, many within the designer's control:

■ the positioning of the physical data and its fragmentation, the former no longer in the control of the programmer, the latter a fact of life, except for the ability to defragment when necessary

■ the type of application (email, query, OLTP etc.) and access mode (random, sequential, read or write intensive)

■ block sizes and other physical characteristics, such as rotational speed (up to 15,000 rpm)

■ the use of memory caching or disk caching, which can eliminate some I/Os

■ the design of the database layout, which is crucial and trees have been sacrificed writing about this topic

■ what RAID level, or other access method, is employed

■ the program's mode of accessing logical records (see below) might be sub-optimal (to be mild about it); does it chain reads/writes, save records or retrieve them again and so on

■ the key and indexing should be optimized to avoid long synonym chains when composing a single record - the shorter the key, the greater the chance of synonyms (see the sketch after this list)

■ other factors and storage subsystem parameters
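As a toy illustration of synonym chains (the hash scheme, keys and bucket count here are all assumptions):

# Hypothetical sketch of synonym chains (hash collisions) from short keys.
from collections import Counter

def bucket(key: str, buckets: int = 8) -> int:
    # toy hash: sum of character codes modulo the bucket count
    return sum(map(ord, key)) % buckets

keys = ["AB", "BA", "CD", "DC", "EF", "FE"]  # short, easily-colliding keys
chains = Counter(bucket(k) for k in keys)
print(chains)  # e.g. Counter({3: 4, 7: 2}): four keys share one bucket

Every extra key in a chain is potentially an extra I/O before the required record is found.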

The upshot of this is that very fast I/O performance can be negated by poor design, and often is. If the items above are properly thought through, then, and only then, will the system supporting X IOPs outperform the one supporting 0.8X IOPs. These design features assume that any metadata, such as logs, indexes, copies, etc., is not written to the disks containing the application data.
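To put a number on just one of those factors, caching, here is a sketch with assumed figures of how a cache hit ratio shrinks the physical I/O rate a subsystem must sustain:

# Effect of caching on the physical I/O load; all figures are assumed.
logical_io_per_sec = 500  # the application's logical I/O demand
cache_hit_ratio = 0.7     # fraction of requests satisfied from cache

physical_io_per_sec = logical_io_per_sec * (1.0 - cache_hit_ratio)
print(f"{physical_io_per_sec:.0f} physical IOPs must reach the disks")  # 150

A subsystem rated at 0.8X IOPs backed by a good cache can therefore comfortably beat one rated at X IOPs without one, which is exactly why the raw comparison misleads.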

Dr. Terry Critchley is the author of “High Availability IT Services”, ISBN 9781482255904 (CRC Press).
