Does Latency Affect Performance? Yes, but No.
Tech Blog
Apr 24, 2015

As discussed in my last blog post, Latency and Response Time: The Same but Not the Same, latency and response time can be confusing when assessing storage I/O performance. In the storage world, performance is generally defined by two parameters: throughput and I/O's per second. Throughput is measured in bytes per second; these days it is more convenient to use megabytes per second (of course, tomorrow it may be GB/s…). I/O's per second, or IOPS, is self-explanatory.
Nevertheless, people frequently ask about latency. The two questions that need to be asked are:
Is latency considered another performance parameter?
Does latency affect throughput and IOPS?
Let’s look into the second question first. In my previous post, I described the commonly accepted definition of latency. Put simply, latency is the response time when you send a small I/O to a storage device.
If the I/O is a data read, latency is the time it takes for the data to come back. If the I/O is a write, latency is the time for the write acknowledgement to return.
Different systems may have different latency, due to hardware characteristics, mechanical movement, signal distance, and processing logic design. Of course interference from other systems accessing the same storage will affect the dynamic latency. Also, latency may be added if additional hardware is inserted into the data path. Yes, even adding our appliances (or any hardware) into the data path will introduce latency into the system, albeit extremely small latency.
The question is: does higher latency necessarily degrade IOPS or MB/s?
The answer is “yes,” and “no.”
“Yes,” because in the absolute sense, any added delay has some effect when I/O is issued in a certain manner. For example, if an application can process only one I/O at a time, in a completely serial manner, then of course any added delay will reduce IOPS (and thus throughput), because the effect of the latency is cumulative. However, very few applications access data serially. Most of them process many I/O's at the same time. And that is why “no”: latency generally does not affect throughput and IOPS.
We can use an example to illustrate.
Imagine a storage system capable of processing 100,000 IOPS, connected to a client host by a long, long cable (say, ten miles long) that introduces 100 microseconds (µs) of latency. If we issue single 1KB I/O's in series, each I/O will take 100 µs to complete. The IOPS will be 10,000, with a throughput of 10 MB/s. Let’s say we double the cable length (and therefore, the latency) by adding ten more miles. Now each I/O will take 200 µs to complete, and the added latency will cut the performance in half (to about 5,000 IOPS and 5 MB/s).
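The serial arithmetic above can be sketched in a few lines of Python. This is a toy calculation, not a benchmark; the 1KB I/O size and the latency figures come straight from the example, and 1 KB is treated as 1,000 bytes to match the article's round numbers:

```python
def serial_iops(latency_s: float) -> float:
    """With only one outstanding I/O, each request must finish
    before the next can be issued, so IOPS is simply 1 / latency."""
    return 1.0 / latency_s


def throughput_mb_s(iops: float, io_size_bytes: int = 1000) -> float:
    """Throughput in MB/s, treating 1 KB as 1,000 bytes to keep
    the round numbers from the text."""
    return iops * io_size_bytes / 1_000_000


# Ten-mile cable: 100 us per round trip.
print(serial_iops(100e-6), throughput_mb_s(serial_iops(100e-6)))
# Twenty-mile cable: latency doubles, so performance halves.
print(serial_iops(200e-6), throughput_mb_s(serial_iops(200e-6)))
```

Doubling the latency halves both figures, exactly as in the example, because at queue depth one every microsecond of delay is paid on every single I/O.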
Remember, though, that the majority of applications process multiple I/O's at the same time. While each dependent transaction has to proceed in series, multiple transactions are processed in parallel.
Imagine the same system now sends 10 I/O's at the same time (approximately). Using the previous example, the ten-mile cable with 100 µs latency will return 100,000 IOPS and a throughput of 100 MB/s. The twenty-mile cable with twice the latency will once again return half that figure: 50,000 IOPS and 50 MB/s.
But what if we now send 20 I/O's at the same time? For the ten-mile cable, we have already noted that 100,000 IOPS is the limit of the storage system, so bumping the outstanding I/O's to anything higher than 10 will not increase performance.
However, in the case of the twenty-mile cable, bumping the I/O's to 20 now also brings the IOPS up to 100,000, and the throughput to 100 MB/s. This means that as long as the number of I/O's being processed is sufficiently large to compensate for the added latency, throughput and IOPS will not be affected.
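This trade-off between outstanding I/O's and latency is Little's Law in miniature: delivered IOPS is the number of I/O's in flight divided by the latency, capped by what the array can sustain. A minimal sketch, using the 100,000-IOPS ceiling assumed in the example:

```python
ARRAY_LIMIT_IOPS = 100_000  # the storage system's ceiling from the example


def delivered_iops(outstanding: int, latency_s: float) -> float:
    """Little's Law: with N I/O's in flight, completions arrive at
    roughly N / latency per second, capped by the array's limit."""
    return min(outstanding / latency_s, ARRAY_LIMIT_IOPS)


for miles, latency in ((10, 100e-6), (20, 200e-6)):
    for depth in (1, 10, 20):
        iops = delivered_iops(depth, latency)
        print(f"{miles}-mile cable, {depth:>2} outstanding: {iops:>9,.0f} IOPS")
```

Running it reproduces the walkthrough: at queue depth 20 the twenty-mile cable reaches the same 100,000 IOPS as the ten-mile cable, because the extra concurrency fully hides the extra latency.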
With regard to the first question: is latency considered another performance parameter? I personally do not believe so, since I can think of very few applications where performance is measured by absolute latency. I have been told that some high-speed trading firms will actually fight over nanoseconds to get ahead of other traders. But you can rest assured that in that case they will process all transactions in high-speed memory only (prodigious amounts of it), and the systems will be physically as close to the action as possible. Of course, these institutions can afford such extravagance. After all, they are literally “making money” in every sense. Nevertheless, very few applications have such cutthroat requirements.
Again, everything in this post can be verified by performing simple tests using standard tools such as IOMeter or FIO.

Wai Lam