Software Defined Storage – What Is It?

Jun 16, 2015

Every now and then some new buzzword catches the public’s fancy and becomes a sensation that everyone (especially the “experts”) talks about with fervid enthusiasm. Each time this happens, I end up feeling like that naïve boy looking at the naked emperor (a very unpleasant sight).

Today’s phrase is “Software Defined Storage,” or the esoteric-sounding acronym, SDS. But what is it?

Like everyone else, I looked it up on Wikipedia. In essence, “… SDS is an evolving concept for computer data storage software to manage policy-based provisioning and management of data storage independent of hardware … typically include a form of storage virtualization to separate the storage hardware from the software that manages the storage infrastructure…”, etc.

Hmmm, that sounds very familiar. I vaguely remember some history from the not so distant past…

Once upon a time, there was this great visionary named Reijane Huai. Foreseeing the emerging Gigabit Ethernet as a viable challenger to the fibre channel SAN, he gathered a team of engineers to create a SCSI-over-IP product. The hope was to use the new GbE to provide a routable, more ubiquitous, and potentially more cost-effective storage connectivity to break the fibre channel monopoly.

The year was 2000. After the concept was tested on Linux by students of Professor Eric Chen in Taiwan, FalconStor was founded. Our team in New York quickly started work on a prototype for Windows, a much bigger challenge since, unlike Linux, we did not have source code. I experimented with some sample code to confirm NDIS driver access to the network stack. My colleagues, the software whizzes Ron Niles and Jimmy Wu, then used that code to create the world’s first working SCSI-over-IP driver for Windows, demonstrated by playing a movie in Windows from an IP virtual disk backed by a CD drive on a Linux server. For speed, Ron, who could program at the atomic level, implemented an efficient zero-copy path between the network buffer and the SCSI interface in the kernel, achieving 120 MB/s with our SAN over IP. This was 50% higher than the 80 MB/s maximum of fibre channel at that time. We thought we had something that could lead the company to success.

Alas, it was not meant to be. The great “Dot-Bomb” of 2000 killed the potential of many new technologies. We realized we needed more than just SCSI over IP, and started to build a complete storage platform by adding novel storage functions such as differential snapshots and micro-scan IP replication. Bernie Wu, who had great insight into the industry, urged us to add fibre channel connectivity as well. We unknowingly ended up pioneering a whole new frontier. For example, EMC’s snapshot at the time was the BCV: full-volume mirroring that would then be broken off to preserve the data image, so each snapshot consumed a full volume. Our solution was block-level differential, which meant only changed data was stored. After seeing what we did, IBM’s storage experts named our snapshots “space-efficient snapshots.” We invented many techniques for these advanced functions, and subsequently received many patents. Brilliant ideas and designs were also contributed by Wayne Lam, the other visionary and our current CEO at Cirrus Data, who brought back valuable customer input and closed most of the enterprise deals back then.
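The block-level differential idea can be illustrated with a toy copy-on-write model. This is purely a sketch of the general technique, not FalconStor’s actual implementation; all class and method names here are hypothetical.

```python
class Volume:
    """A toy block device: a list of fixed-size blocks."""

    def __init__(self, num_blocks):
        self.blocks = [b"\x00"] * num_blocks
        self.snapshots = []

    def snapshot(self):
        """A space-efficient snapshot stores nothing up front;
        it fills in only the blocks that later change."""
        snap = {}  # block index -> original data at snapshot time
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        """Copy-on-write: preserve the old block in any snapshot
        that has not yet saved it, then overwrite."""
        for snap in self.snapshots:
            if index not in snap:
                snap[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        """A snapshot's view: the preserved block if it has changed,
        otherwise the live block (still identical)."""
        return snap.get(index, self.blocks[index])


vol = Volume(4)
vol.write(0, b"A")
snap = vol.snapshot()
vol.write(0, b"B")           # only now is the old block copied
print(vol.read_snapshot(snap, 0))  # b"A" -- the snapshot's view
print(len(snap))                   # 1 -- only the changed block is stored
```

The contrast with a full-volume mirror is the last line: a mirror would hold all four blocks per snapshot, while the differential approach holds one.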

The result was IPStor, the world’s first storage virtualization product that provided the most advanced storage functionalities available, even by today’s standards. It was pure software. Heterogeneous storage devices were virtualized in one environment and managed by a central platform. Storage pools could be created to allow policy based allocation, even self-allocation from the client hosts.

Wait, how does Wikipedia describe SDS again?

I guess people have either forgotten or are unaware that there was a time, just 15 years ago, when storage had no functions other than acting as the RAID controller. Then we created a product that was “storage virtualization to separate the software to manage policy-based provisioning and management of data storage independent of hardware….” (from Wikipedia). Soon many storage companies started adding similar functions to their hardware. Of course each company did this in its own way, and soon a jungle was created. The word “virtualization” was “good”, then “bad”, then “good”, then “bad”. I guess now it is “good” once again. Someone evidently wandered into today’s storage jungle and came up with the brilliant idea: “Why don’t we create a virtualized storage software platform to centralize and manage all this hardware? And let’s call it … uh… Software Defined Storage, since there is already Software Defined Networking…”

Maybe being old has its advantages. Having actually been at the center of the action helps one recognize things for the way they really are, and how they should be. In fact, today, almost everything is software defined. Storage is no exception. This is especially true for today’s IT environment. Perhaps we should just call it the “Software Defined Age,” since now even servers are software.

A new storage platform for tomorrow must consider not just the typical data storage as a SAN volume or a NAS share; it should also allow flexible integration into today’s cloud infrastructure. This means the complete storage paradigm should encompass a virtualized, distributed computational and data environment over ubiquitous, high-capacity Internet connectivity. The days when individual, isolated islands of storage were the main challenge for storage administrators are fading fast.

So what is SDS? Well, I can only offer my two cents, as above, with a long view informed by personal lessons from history. Years of experience have made me skeptical of hype. From the SDN example, one can best infer that SDS should allow consumers to specify or request particular properties or capabilities of storage devices in a more flexible manner. This requires storage systems to provide appropriate APIs and, more importantly, the capabilities to satisfy such requests, with functions like thin provisioning, snapshots, deduplication, etc. These functions have existed in the industry for a while in various forms, including the storage-pool example above with self-allocation based on performance characteristics. The VM computational environment has further pushed adaptation by many storage vendors. But one can hardly discern these specific facts from all the chit-chat about SDS.
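What “consumers specify capabilities and the system satisfies them” might look like can be sketched in a few lines. This is not any vendor’s real API; the pool names, capability strings, and the `provision` function are all hypothetical, a minimal model of policy-based allocation.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """A pool advertising its free capacity and capabilities."""
    name: str
    capacity_gb: int
    capabilities: set = field(default_factory=set)


def provision(pools, size_gb, required):
    """Pick the first pool that advertises every requested
    capability and has enough free capacity, and carve out
    the requested space from it."""
    for pool in pools:
        if required <= pool.capabilities and pool.capacity_gb >= size_gb:
            pool.capacity_gb -= size_gb
            return pool.name
    raise RuntimeError("no pool satisfies the request")


pools = [
    StoragePool("archive", 10_000, {"thin", "replication"}),
    StoragePool("fast", 2_000, {"thin", "snapshot", "dedup"}),
]
print(provision(pools, 100, {"snapshot", "dedup"}))  # fast
```

The point is the shape of the request: the consumer names properties (snapshot, dedup), not a specific array or LUN, and the platform maps that policy onto whatever hardware can honor it.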

In the recent past, a prodigious number of novel applications has emerged from a confluence of circumstances: cheaper and faster storage, faster connection speeds, and, especially, the explosion of multimedia data. Together these have created a feeding frenzy on storage, and brought very ingenious, specific solutions to meet each new wave of storage challenges. But with each wave of new and innovative architectural frameworks comes an equivalent wave of hype, preached as a panacea for every problem under the sun.

This phenomenon has occurred repeatedly since the information revolution, which, come to think of it, is only a few decades old. We are living in a truly interesting and exciting time. However, beyond concepts and theories, at Cirrus Data we remind ourselves to focus on the truly important objective: to bring concrete, precise solutions to our customers, and useful technologies and products to the market.

At the end of the day, these 1s and 0s (a lot of them) must reside on some physical media somewhere, and that physical media needs to be managed, protected, and available, SDS or not.

A slightly edited version of this post appeared in Business Computing World earlier this month.

About the Author:

Wai Lam

Before joining Cirrus Data Solutions, Wai co-founded FalconStor Software in 2000, where he served as CTO and VP of Engineering. Wai was the chief architect, holding 18 of the 21 FalconStor patents. His inventions and innovations include many of the industry’s “firsts” in advanced storage virtualization, data protection, and disaster recovery. Wai received an MSEE from UCLA in 1984 and a BSEE from SUNY Stony Brook in 1982. He was honored with the Distinguished Alumni Award from Stony Brook in 2008.
