What is RADIO?
I often use the concept of RADIO (Redundant Array of Distributed Independent Objects) when describing VSAN, as people need extra help in understanding why you don’t need RAID in a VMware Virtual SAN. It’s particularly applicable to storage, but the concept also applies to compute (CPU and RAM), and potentially to network virtualisation – but this article will focus on storage. Before I explain what RADIO is, let’s refresh on what RAID is:
- RAID is a redundant array of independent (or inexpensive) disks. Fundamentally, the data that you write to data storage is distributed amongst multiple devices so that if there is a failure of one disk, the data is already distributed amongst the surviving disks and so can be recovered (except RAID 0).
- There are multiple ‘flavours’ of RAID that provide recoverability and differing performance characteristics, but they share the same core requirements of expecting all disks to be the same size, same performance and connected to the same controller device (or controlling software).
- Once the RAID type (‘flavour’) has been set, all data in this RAID set shares the same parameters of performance, availability and accessibility – and this can be hard to change.
- RAID was invented as a response to the issue of spinning disks suffering from failures and losses that could not be tolerated within an enterprise environment, and to keep the data available after a loss.
- RAID has additional benefits in that a disk has performance limitations that can be a bottleneck, but distributing disk traffic amongst multiple disks can improve performance.
- RAID has an overhead in that the amount of usable space is less than the raw space – this is often equal to one disk in the set, but can equal two disks or even 50% of the disks. Each type of RAID also has its own penalty for writes whilst parity is calculated or written.
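To make the capacity overhead in the last point concrete, here is a minimal Python sketch. The mapping of "one disk", "two disks" and "50%" to RAID 5, RAID 6 and mirrored sets (RAID 1 / RAID 10) is the standard one, but the function name and parameters are my own illustration, and it assumes identical disks and two-way mirrors.

```python
def raid_usable_capacity(level: str, disk_count: int, disk_size_gb: float) -> float:
    """Return usable capacity in GB for a RAID set of identical disks.

    Illustrative only: assumes two-way mirroring for RAID1/RAID10 and
    ignores controller metadata and hot spares.
    """
    raw = disk_count * disk_size_gb
    if level == "RAID0":
        # Striping only: no redundancy, so no capacity overhead.
        return raw
    if level in ("RAID1", "RAID10"):
        # Mirroring: every block is stored twice, so 50% is usable.
        return raw / 2
    if level == "RAID5":
        # Single parity: one disk's worth of space is lost.
        return raw - disk_size_gb
    if level == "RAID6":
        # Double parity: two disks' worth of space is lost.
        return raw - 2 * disk_size_gb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level in ("RAID0", "RAID10", "RAID5", "RAID6"):
        print(level, raid_usable_capacity(level, 4, 1000))
```

With four 1000 GB disks, the same raw 4000 GB yields anything from 2000 GB to 4000 GB usable depending on the flavour chosen for the whole set.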
There is a lot more to be said about RAID, however I have focussed on the points that are relevant to this article. So, what is RADIO, and how does it differ from RAID? (not to be confused with RAID10…)
What is RADIO?
RADIO differs from RAID in many subtle ways, starting with the point that not all devices (because they can be spinning disks, SSDs or Flash storage) need to be identical in size/capacity or performance. Additionally, because it is controlled by policy and software, not all devices need to be connected to the same controller device or use the same connectivity (e.g. SATA, SAS, PCIe). The policies regarding performance, availability and accessibility can be changed at the object level – which in VSAN’s case is the Virtual Machine disk level, instead of an entire data store.
With a Redundant Array of Distributed Independent Objects, data is distributed based on a policy defined for the requirements of the object (the VM), instead of defining a setting for disks that may end up holding data. This more granular approach is more dynamic in that decisions can be made for the needs of each system.
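As a rough illustration of policy per object rather than per disk set, here is a small Python sketch. The names `ObjectPolicy`, `failures_to_tolerate` and `raw_capacity_needed` are hypothetical and are not the VSAN API; the sketch also assumes that tolerating N failures means keeping N + 1 full copies, and it ignores any witness or metadata overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectPolicy:
    """Hypothetical per-object policy (not a real VSAN type)."""
    failures_to_tolerate: int = 1

    @property
    def copies(self) -> int:
        # Assumption: tolerating N failures requires N + 1 full copies.
        return self.failures_to_tolerate + 1


def raw_capacity_needed(size_gb: float, policy: ObjectPolicy) -> float:
    """Raw pool capacity consumed by one object under its own policy."""
    return size_gb * policy.copies


if __name__ == "__main__":
    # Two VM disks in the same pool, each with its own policy.
    scratch = ObjectPolicy(failures_to_tolerate=0)
    database = ObjectPolicy(failures_to_tolerate=2)
    print(raw_capacity_needed(100, scratch))   # one copy
    print(raw_capacity_needed(100, database))  # three copies
```

The point of the sketch is that the overhead is a property of each object, so a scratch disk and a critical database can live in the same pool with different levels of protection, and either policy can be changed later without rebuilding the pool.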
How does it work?
Instead of gathering a group of disks with the same capacity and performance under one controller, each system in the cluster provides its resources to the pool of storage, and is able to consume resources from other members of the cluster. It is the same concept as compute resources (RAM and CPU) in a virtual cluster – and in the same way you can either upgrade one host (add RAM or add storage capacity), or add new hosts to the cluster so that their resources expand the pool for all members.
If you had a mismatched cluster of hosts with different amounts of RAM and counts of CPU cores, VMs would run faster on some hosts than on others – and you can define DRS rules and re-arrange machines to take advantage of this. In this scenario, a failure or removal of the fastest host would mean that VMs would run on slower hosts – but importantly they would still be available.
So, in a RADIO scenario, the storage for VMs is replicated by software to multiple distributed independent endpoints. These copies increase availability, and they also mean that instead of all communication needing to go to a SAN (over SAN transport), data can be written to disk directly by the local storage controller – at the speed of the controller (e.g. 16Gbit/s in SATA 3.2) and with only one queue. If your RADIO cluster is mismatched, then you can have a smaller number of very high speed devices doing the actual work, but still have instantly available copies of the live data on lower cost storage media.
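The replication idea above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how VSAN actually places data: the `Host` type, the `fast` flag and the "fast hosts first, then most free space" ordering are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    """Hypothetical cluster member contributing local storage to the pool."""
    name: str
    free_gb: float
    fast: bool  # e.g. flash-backed; an illustrative flag only


def place_copies(hosts: List[Host], size_gb: float, copies: int) -> List[str]:
    """Place `copies` replicas of an object on distinct hosts.

    Fast hosts are preferred, then hosts with the most free space.
    """
    candidates = [h for h in hosts if h.free_gb >= size_gb]
    if len(candidates) < copies:
        raise ValueError("not enough hosts with free capacity")
    candidates.sort(key=lambda h: (not h.fast, -h.free_gb))
    chosen = candidates[:copies]
    for host in chosen:
        host.free_gb -= size_gb  # reserve the space on each chosen host
    return [host.name for host in chosen]


def surviving_copies(placement: List[str], failed_host: str) -> List[str]:
    """Hosts that still hold a copy after one host fails."""
    return [name for name in placement if name != failed_host]


if __name__ == "__main__":
    cluster = [
        Host("esx1", 500, fast=True),
        Host("esx2", 2000, fast=False),
        Host("esx3", 1000, fast=False),
    ]
    placement = place_copies(cluster, size_gb=100, copies=2)
    print(placement)
    print(surviving_copies(placement, "esx1"))
```

In this mismatched cluster the fast host takes one copy and a larger, cheaper host takes the other, so losing the fast host leaves the data available, just on slower media.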
Do I need RAID with RADIO?
The whole point of RADIO is that it provides the capabilities of RAID for today’s scenarios, instead of trying to shoe-horn a 1987 technology (built to increase availability and performance over single disks) onto highly reliable and high performance disks and SSDs. If you use VSAN, you can’t use RAID, because the disks need to be presented to and managed by the hypervisor – and RAID gets in the way. In any other scenario, you would also want to avoid having an extra layer to manage and configure – because it’s not going to add any more benefits. RADIO provides all the redundancy you need through additional copies of the VM files, it enables higher performance by putting the disks directly under the control of the server (with no RAID controllers or SAN transport), and it enhances scalability beyond a RAID set by moving the policy down to the more granular level of the VM.
As far as I know, this is the first time that the concept “RADIO” has been published. Feel free to re-use if it is relevant, and I would appreciate a credit.