
Is it safe?

Editorial Type: Technology Focus     Date: 05-2014





When it comes to error detection and correction technologies, not all SSD manufacturers are the same, according to SanDisk's Chris Smith.

Threats to data integrity in Solid State Drives, as in HDDs, fall into two basic categories: NAND flash memory errors (analogous to media errors in HDDs) and errors that occur while data is in flight to or from the flash.

Errors in NAND flash consist of both correctable data errors and more catastrophic flash component failures. The correctable errors are analogous to HDD media errors and the SSD deals with them in a similar fashion. However, unlike an HDD, where catastrophic failures generally result in total data loss, SanDisk SSDs incorporate additional features to recover from many of the worst NAND failure modes.

Correctable Errors: Bit errors occur fairly frequently throughout the flash array, and increase as the flash ages. These types of errors are detected and corrected on-the-fly by the controller's Error Correction Code (ECC) engine.
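To illustrate what "detected and corrected on-the-fly" means, here is a minimal Python sketch of a Hamming(7,4) code, a textbook scheme that repairs any single flipped bit in a 7-bit codeword. It is an assumption chosen for brevity: the article does not say which code the controller's ECC engine uses, and production SSD controllers generally rely on far stronger codes. The function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (0..15) into a 7-bit codeword (list, positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 5, 6, 7
    # Layout by position: p1 p2 d0 p4 d1 d2 d3
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]


def hamming74_decode(code):
    """Return (nibble, syndrome); a non-zero syndrome is the 1-based position repaired."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # checks positions 2, 3, 6, 7
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]  # checks positions 4, 5, 6, 7
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the bit the syndrome points at
    nibble = c[2] | (c[4] << 1) | (c[5] << 2) | (c[6] << 3)
    return nibble, syndrome


stored = hamming74_encode(0b1011)
stored[4] ^= 1  # simulate a single bit error in the flash cell at position 5
value, fixed_at = hamming74_decode(stored)
```

With one flipped bit the decoder returns the original value. With two flipped bits this particular code "repairs" the wrong position and returns bad data, which is the kind of failure the error categories discussed next are concerned with.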

Uncorrectable Errors: When data bit errors occur in numbers too great for the ECC engine to correct, or when NAND flash pages or blocks fail outright, an SSD's primary data integrity capability can break down. Without additional protection against these threats, data loss is a real risk.

SanDisk SSDs feature a data fail protection scheme called "F.R.A.M.E." (Flexible Redundant Array of Memory Elements). F.R.A.M.E. correction can recover data in situations where ECC cannot, up to and including the total loss of flash pages (consisting of multiple blocks of user data), and is able to substantially reduce the Uncorrected Bit Error Rate (UBER) risk to better than 1 in 10^17 bits read, exceeding the requirements of JEDEC standard JESD218.
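The article does not describe how F.R.A.M.E. works internally. As a generic illustration of how a redundant array of memory elements can rebuild a page that is lost outright, the Python sketch below uses simple XOR parity across a stripe of pages. The parity scheme, the stripe layout and the function names are all assumptions for illustration, not SanDisk's design.

```python
from functools import reduce


def xor_pages(pages):
    """Bytewise XOR of a list of equal-length pages."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def build_stripe(data_pages):
    """Append one parity page so that any single page can later be rebuilt."""
    return list(data_pages) + [xor_pages(data_pages)]


def recover_page(stripe, lost_index):
    """Rebuild the page at lost_index from all surviving pages in the stripe."""
    survivors = [page for i, page in enumerate(stripe) if i != lost_index]
    return xor_pages(survivors)


# Three hypothetical 4-byte "pages", each imagined on a different flash element.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
stripe = build_stripe(data)

# Pretend the second page failed completely and rebuild it from the others.
rebuilt = recover_page(stripe, 1)
```

XOR parity of this kind can rebuild exactly one missing page per stripe. Surviving more simultaneous failures needs additional redundancy.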

Undetected Errors: While exceedingly rare, single bit errors can escape detection by the ECC engine. Undetected errors in this subcategory will result in incorrect data being returned to the host as good. SanDisk SSDs include additional features to protect against this category of error.

Furthermore, data in transit to or from flash memory is susceptible to soft errors due to radiation-induced bit flips. Unless an SSD is also able to detect these errors, the integrity of mission-critical data will still be at risk, regardless of any other error protection subsystems. The 32-bit Data Path CRC error detection in SanDisk SSDs provides protection here as well.

While all SSD designs include ECC engines, and some include Data Path CRC, few include protection for catastrophic flash memory failures. An SSD that lacks the capability to recover from an uncorrectable error, such as a page or block failure, is unlikely to meet the requirements of high-reliability applications. SanDisk SSDs implement protection from all three of these classes of errors, and so are highly suited for the most demanding enterprise applications.

ERROR CORRECTION AND DETECTION

Figure 1 shows a block diagram of the data path protection implementation for the Optimus family of SSDs.

Figure 2 below shows a block diagram of the data path protection implementation for the CloudSpeed family of SSDs.

SAS AND SATA LINK 32-BIT CRC
Data transacted on the host interface is protected by industry-standard link-level CRC. As data is received from the host by the drive, the drive calculates a 32-bit CRC checksum for the data as it is clocked into the drive's buffer memory. The CRC is encapsulated with the host's data when the data is written to the flash. Errors that occur along this internal data path are detectable via the CRC as well as the ECC and parity protection implemented in the buffer memory itself.
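The encapsulate-then-verify flow described above can be sketched in Python as follows. This is a minimal illustration only: it uses the CRC-32 provided by Python's zlib module and a hypothetical record layout (payload followed by a 4-byte checksum). The polynomial and framing actually used inside the drive are not given in the article.

```python
import zlib


def encapsulate(payload: bytes) -> bytes:
    """Compute a 32-bit CRC on ingest and store it alongside the host data."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def verify(record: bytes) -> bytes:
    """Recompute the CRC on the way back out; raise if the data changed in flight."""
    payload, stored_crc = record[:-4], int.from_bytes(record[-4:], "big")
    if zlib.crc32(payload) != stored_crc:
        raise ValueError("data path CRC mismatch")
    return payload


record = encapsulate(b"host sector data")

# Simulate a soft error: flip one bit somewhere along the internal data path.
corrupted = bytearray(record)
corrupted[3] ^= 0x08

try:
    verify(bytes(corrupted))
    detected = False
except ValueError:
    detected = True
```

A CRC only detects the corruption. Recovering the data is left to the other mechanisms the article describes, or to a retry of the transfer.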


