NVMe 2.0 - The Future of Storage
The Pros of NVMe
SAS vs SATA
Prior to solid state drives, there were more things to consider with rotational media.
- Full command queuing
- Multipath I/O enhances R/Ws
- 1/2 seek latency time
- 2× rotational vibration tolerance
- 2× transfer rate (full duplex)
- 3× typical IOPS
- Multipath I/O for redundancy
- 1/2 bad sector timeout time
- 2× MTBF hours @ 2× rated temperature
- 1.5× maximum operating temperature
- 1.6× longer average warranty
- Variable sector size
- Dedicated misalignment detection
- Rotational vibration compensation
- Higher RPM
- Tighter run-out
- Dual-anchored spindles
- Less inertial, more rigid head-stack assemblies
- Stronger magnets
- Air turbulence controls
- Dedicated processors for data and servo
- Advanced error handling
Faster Performance
Better Reliability
Common Physical Drive Features
NVMe vs SAS
With solid state drives now prevalent, NVMe was designed to fully realise the benefits of their simpler design through a modern data path.
- Direct storage-processor communication via PCIe bus (no HBA)
- No mechanical parts to delay the signal path
- Transfer rates up to 8 GT/s per lane (PCIe 3.0), or 128 GT/s raw (roughly 16 GB/s) per x16 slot
- Transfer rates up to 16 GT/s per lane (PCIe 4.0), or 256 GT/s raw (roughly 32 GB/s) per x16 slot
- No mechanical parts to fail
- No reliance on a Host Bus Adapter (HBA)
- As NVMe defines the command layer and the host interface, the performance is dictated by the drive and motherboard. For SAS drives, the performance is dictated by the SAS controller.
Fastest Performance
Best Reliability
Protocol Specifics
Version 2.0
Announced in June 2021, with Revision 2.0c the latest as of October 2022.
New Features
Transfer efficiently with the Simple Copy Command, which allows hosts to copy data from multiple source logical block address (LBA) ranges to a single contiguous destination range within the drive.
Maximise logical storage capacity, controllers, and ports with Domains & Partitions, which are native and enterprise-ready.
Optimise performance with NVM Set & Endurance Group Management, offering granular control over drive-data structure.
Align data logically and physically with Zoned Namespaces, boosting SSD performance by increasing sequential access speeds and further reducing excessive wear.
Control data architecture with Namespace Types, whether Logical Block Addresses (LBA), Key-Value (K-V), or with special flags, such as sequential-only writes.
Prohibit command execution with Command and Feature Lockdown, whether directly as a host or remotely via out-of-band management endpoints.
Store and access data faster with Key-Value pairs, which replace traditional logical block addressing with variable-length data references, avoiding translation table overhead.
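To make two of these features concrete, the following is a minimal toy model, not the NVMe command set itself. It assumes a hypothetical `Zone` class that tracks a write pointer (the sequential-write rule behind Zoned Namespaces) and a hypothetical `simple_copy` helper that gathers several source LBA ranges into one contiguous destination, which is the idea behind the Simple Copy Command. The dictionary standing in for drive media, and every name used here, is an illustrative assumption.

```python
from dataclasses import dataclass, field


class ZoneWriteError(Exception):
    """Raised when a write breaks the zone's sequential-write rule."""


@dataclass
class Zone:
    """Hypothetical zone: writes must land exactly at the write pointer."""
    start: int                      # first LBA of the zone
    size: int                       # number of LBAs in the zone
    write_pointer: int = field(init=False)

    def __post_init__(self) -> None:
        self.write_pointer = self.start

    def claim(self, lba: int, count: int) -> None:
        """Accept a write of `count` blocks at `lba`, or raise."""
        if lba != self.write_pointer:
            raise ZoneWriteError(f"write at {lba}, expected {self.write_pointer}")
        if lba + count > self.start + self.size:
            raise ZoneWriteError("write would overflow the zone")
        self.write_pointer += count


def simple_copy(media, source_ranges, zone):
    """Copy each (start_lba, count) source range to the zone's write pointer.

    Returns the first destination LBA. The data never leaves `media`,
    mirroring how a drive-side copy avoids a round trip through the host.
    """
    total = sum(count for _, count in source_ranges)
    dest = zone.write_pointer
    zone.claim(dest, total)  # validate the whole write before copying
    offset = 0
    for start, count in source_ranges:
        for lba in range(start, start + count):
            media[dest + offset] = media[lba]
            offset += 1
    return dest


# Illustrative data: 16 blocks of fake content and an empty zone at LBA 100.
media = {lba: f"block-{lba}".encode() for lba in range(16)}
zone = Zone(start=100, size=8)

# Gather two scattered ranges (LBAs 2-3 and 9-11) into one contiguous run.
first = simple_copy(media, [(2, 2), (9, 3)], zone)
copied = [media[lba] for lba in range(first, zone.write_pointer)]
```

After the copy, the five source blocks sit back to back starting at the zone's first LBA, and any later write that does not start at the advanced write pointer is rejected. Gathering still-valid data into a fresh zone in this way is one reason the two features pair well.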
Rotational Media Support
More than a decade after the release of NVMe 1.0, SSDs are yet to become a commercially viable replacement for HDDs in terms of raw capacity, so NVMe 2.0 adds support for spinning disks.
- Spindown (Spindle Operational Power State ON → OFF)
- Spinup (Spindle Operational Power State OFF → ON)
- With Media - Commands can be processed; initialized and attached Namespaces and media are ready
- Independent of Media - Controller may be ready before namespaces and media are initialized and attached
- Domain ID
- Endurance Group Capacities - Total (TEGCAP), Unallocated (UEGCAP)
- Endurance Estimate - Remaining lifetime TBW
- Host Commands - Endurance Group Controller Read/Writes
- Capacity - spare %, spare threshold %, used %
- Data Units - Bytes Read/Written
- Media Units - Bytes Written
- Media/Data Integrity Errors - uncorrectable ECC, CRC checksum failure, LBA tag mismatch
- Error Information Log Entry Count
- Nominal Rotational Speed (RPM)
- Spinup Count (Operational Power State OFF → ON, Lifetime Total Successful)
- Failed Spinup Count (Operational Power State OFF → ON, Lifetime Total Failed)
- Number of Actuators
- Load Count (Actuator Operational State OFF → ON, Lifetime Total Successful)
- Failed Load Count (Actuator Operational State OFF → ON, Lifetime Total Failed)
- Technical information regarding Namespaces, NVM Sets, and Endurance Groups, with a specific bit flag identifying whether the stored data is on rotational media
- All the operations and states required
1.5.50 - 1.5.55 - 1.5.56
3.5.3 - Controller Ready Modes During Initialization
5.16.1.10 - Endurance Group Information
5.16.1.22 - Rotational Media Information Log
5.17.2.8 - I/O Command Set Independent Identify Namespace data structure
5.27.1.22 - Spinup Control
8.20 - Rotational Media
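As an illustration of how a host tool might consume the rotational media counters listed earlier in this section, here is a minimal sketch. The `RotationalMediaInfo` class, its field names, the 1% threshold, and the sample values are all hypothetical; they mirror the listed items rather than the byte layout of the Rotational Media Information Log, which is defined in the specification.

```python
from dataclasses import dataclass


@dataclass
class RotationalMediaInfo:
    """Hypothetical, already-decoded view of the rotational media counters."""
    nominal_rpm: int
    actuators: int
    spinup_count: int          # lifetime successful spinups
    failed_spinup_count: int   # lifetime failed spinups
    load_count: int            # lifetime successful actuator loads
    failed_load_count: int     # lifetime failed actuator loads

    @staticmethod
    def _failure_ratio(ok: int, failed: int) -> float:
        attempts = ok + failed
        return failed / attempts if attempts else 0.0

    def spinup_failure_ratio(self) -> float:
        return self._failure_ratio(self.spinup_count, self.failed_spinup_count)

    def load_failure_ratio(self) -> float:
        return self._failure_ratio(self.load_count, self.failed_load_count)

    def needs_attention(self, threshold: float = 0.01) -> bool:
        """Flag the drive if either failure ratio exceeds the assumed threshold."""
        return max(self.spinup_failure_ratio(), self.load_failure_ratio()) > threshold


# Sample values are invented for illustration only.
drive = RotationalMediaInfo(
    nominal_rpm=7200,
    actuators=2,
    spinup_count=998,
    failed_spinup_count=2,
    load_count=4000,
    failed_load_count=0,
)
flagged = drive.needs_attention()
```

Keeping the successful and failed counts separate, as the log does, lets monitoring software derive a ratio instead of alarming on a raw failure count that naturally grows with drive age.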
Multiple Controller Support
Enables failover across drives and multiple transceivers in a mixed-interface system.
32-bit / 64-bit CRC Support
Further define different metadata or more specific information regarding drive and data protection.
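To illustrate what a guard checksum provides, the sketch below computes a 32-bit CRC over a block and shows that a single flipped bit is detected. It assumes the Castagnoli polynomial (CRC-32C), which is common in storage protocols; the exact guard formats and metadata layout are defined by the NVMe specification and are not reproduced here. The function name `crc32c` and the sample block are illustrative.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C using the reflected Castagnoli polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial when that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


block = bytes(range(256)) * 16      # a 4096-byte sample block
guard = crc32c(block)               # value that would be stored with the data

corrupted = bytearray(block)
corrupted[1234] ^= 0x01             # flip a single bit somewhere in the block
is_detected = crc32c(bytes(corrupted)) != guard
```

A table-driven or hardware-accelerated implementation would be used in practice; the bitwise loop is shown only because it makes the algorithm easy to follow.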
Backwards Compatibility
Upgrade your system in stages, without having to commit to an entire overhaul of hardware.
Seagate Demo the First NVMe HDD
First shown at the OCP Experience Center summit in November 2021, the system demonstrated a full NVMe architecture utilising NVMe HDDs in a JBOD enclosure via an NVMe-oF networking switch.
The setup comprises fewer components, tri-mode (NVMe, SAS, SATA) transceivers, native drivers, multi-actuator hard drive support, and a common management API - Redfish™.
Why support HDDs with NVMe?
The latest hard drive advancements have pushed the boundaries of magnetic media into the field of flash.
Seagate's latest top-tier HDD, the Exos MACH.2 2X18, is not only massive at 18 TB, but rapid, with speeds of up to 524 MB/s, on par with SATA SSDs. This extreme density is achieved through Heat-Assisted Magnetic Recording (HAMR), with multi-actuator technology providing the speed.
WD is also competing for the fastest and densest drives with Microwave-Assisted Magnetic Recording (MAMR), an alternative to HAMR. While the technology has promised exceptional results for over a decade, it has yet to materialise in practice, with energy-assisted Perpendicular Magnetic Recording (ePMR) the technology of choice for WD's current and near-future Ultrastar DC HC line-ups.
Excluding Seagate's multi-actuator models, the difference between the two manufacturers' flagship 20 TB enterprise drives is around 30 MB/s in read/write speed, but it is important to note that Seagate's Exos X20 utilises CMR, while WD's Ultrastar DC HC650 utilises SMR.
As NVMe protocol logic is physically stored and executed by the controller on the device, to take advantage of NVMe 2.0, you'll need new drives.
However, as PCIe Gen3 buses support roughly 1 GB/s per lane, and Gen4 buses roughly 2 GB/s per lane, there's plenty of bandwidth headroom on current systems.
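The per-lane figures can be checked with some quick arithmetic. The sketch below assumes raw signalling rates of 8 GT/s (Gen3) and 16 GT/s (Gen4) with 128b/130b line encoding, and ignores packet and protocol overhead, so real-world throughput will be somewhat lower. The function names are illustrative, and the 524 MB/s figure is the HDD speed quoted above.

```python
import math

# Raw per-lane signalling rates in GT/s; both generations use 128b/130b encoding.
RAW_GT_PER_S = {"gen3": 8.0, "gen4": 16.0}
ENCODING_EFFICIENCY = 128 / 130


def lane_gb_per_s(gen: str) -> float:
    """Usable bandwidth of one lane in GB/s, before protocol overhead."""
    return RAW_GT_PER_S[gen] * ENCODING_EFFICIENCY / 8


def lane_utilisation(drive_mb_per_s: float, gen: str) -> float:
    """Fraction of a single lane that a drive of the given speed would occupy."""
    return drive_mb_per_s / (lane_gb_per_s(gen) * 1000)


def lanes_needed(drive_mb_per_s: float, gen: str) -> int:
    """Smallest whole number of lanes that can carry the drive's throughput."""
    return math.ceil(lane_utilisation(drive_mb_per_s, gen))


HDD_MB_PER_S = 524  # speed quoted above for the fastest HDD

for gen in RAW_GT_PER_S:
    print(
        f"{gen}: {lane_gb_per_s(gen):.3f} GB/s per lane, "
        f"{lane_utilisation(HDD_MB_PER_S, gen):.0%} of one lane used, "
        f"{lanes_needed(HDD_MB_PER_S, gen)} lane(s) needed"
    )
```

Under these assumptions, even the fastest hard drive mentioned here occupies only about half of a single Gen3 lane, which is why bus bandwidth is not the limiting factor for NVMe HDDs.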