Thursday, July 21, 2016

QLC Flash on the horizon

Exabytes shipped
Last May in my talk at the Future of Storage workshop I discussed the question of whether flash would displace hard disk as the bulk storage medium. As the graph shows, flash is currently only a small proportion of the total exabytes shipped. How rapidly it could displace hard disk is determined by how rapidly flash manufacturers can increase capacity. Below the fold I revisit this question based on some more recent information about flash technology and the hard disk business.

First, economic stress on the hard disk industry has increased. Seagate plans a 35% reduction in capacity and 14% layoffs. WDC has announced layoffs. Unit shipments for both companies are falling. If disk is in a death spiral, massive increases in flash shipments will be needed.

Flash vs HDD capex
There are a number of ways flash manufacturers could increase capacity. They could build more flash fabs. This is extremely expensive, but as I reported in my talk, flash advocates believe that this is not a problem:
The governments of China, Japan, and other countries are stimulating their economies by encouraging investment, and they regard dominating the market for essential chips as a strategic goal, something that justifies investment. They are thinking long-term, not looking at the next quarter's results. The flash companies can borrow at very low interest rates, so even if they do need to show a return, they only need to show a very low return.
Since then the economic situation has become less clear, and the willingness of the governments involved to subsidize fabs may have decreased, so this argument may be less effective. If there aren't going to be a lot of new flash fabs, what else could the manufacturers do to increase shipments from the fabs they have?

The traditional way of delivering more chip product from the same fab has been to shrink the chip technology. Unfortunately, shrinking the technology from which flash is made has bad effects. The smaller the cells, the less reliable the storage and the fewer times it can be written, as shown by the vertical axis in this table:
Write endurance vs. cell size
Both in logic and in flash, the difficulty in shrinking the technology further has led to 3D, stacking layers on top of each other. Flash is in production with 48 layers, and this has allowed manufacturers to go back to larger cells with better write endurance.

Flash has another way to increase capacity. It can store more bits in each cell, as shown in the horizontal axis of the table. The behavior of flash cells is analog; the bits are the result of signal processing in the flash controller. By tweaking the chip-making process to improve the analog behavior, and by improving the signal processing in the flash controller, it has been possible to move from 1 (SLC) to 2 (MLC) to 3 (TLC) bits per cell. Because 3D has allowed increased cell size (moving up the table), TLC SSDs are now suitable for enterprise workloads.
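To make the bits-per-cell trade-off concrete, here is a minimal sketch of the arithmetic. It assumes only that a cell storing n bits must distinguish 2^n charge levels; the function names are illustrative and not taken from any flash vendor's documentation.

```python
# Bits per cell for each flash type named in the text.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def levels(bits_per_cell: int) -> int:
    """Distinct charge levels the controller must tell apart."""
    return 2 ** bits_per_cell


def capacity_vs_slc(bits_per_cell: int) -> int:
    """Capacity of the same number of cells, relative to SLC."""
    return bits_per_cell // CELL_TYPES["SLC"]


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s)/cell, {levels(bits)} levels, "
          f"{capacity_vs_slc(bits)}x SLC capacity")
```

Each added bit doubles the number of levels the controller must separate, while capacity grows only linearly. That is one way to see why the analog margins, and with them endurance and reliability, get worse as bits per cell increase.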

Back in 2009, thanks to their acquisition of M-Systems, SanDisk briefly shipped some 4-bit-per-cell (QLC) memory (hat tip to Brian Berg). But up to now the practical limit has been 3. As the table shows, storing more bits per cell also reduces the write endurance (and the reliability).

As more and more layers are stacked the difficulty of the process increases, and it is currently expected that 64 layers will be the limit. Beyond that, manufacturers expect to use die-stacking. That involves taking two (or potentially more) complete 64-layer chips and bonding one on top of the other, connecting them via Through Silicon Vias (TSVs). TSVs are holes through the chip substrate containing wires. Although adding 3D layers does add processing steps, and thus some cost, it merely lengthens the processing pipeline. It doesn't slow the rate at which wafers can pass through and, because each wafer contains more storage, it increases the fab's output of storage. Die-stacking, on the other hand, doesn't increase the amount of storage per wafer, only per package. It doesn't increase the fab's output of bytes.
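The distinction between adding layers and stacking dies can be sketched as follows. All the numbers (wafer starts, dies per wafer, bits per layer) are hypothetical placeholders chosen only to show the shape of the calculation, not figures from any manufacturer.

```python
import math


def fab_output_bits(wafers_per_month: int, dies_per_wafer: int,
                    bits_per_layer: int, layers: int) -> int:
    """Bits leaving the fab each month; scales with layers per die."""
    return wafers_per_month * dies_per_wafer * bits_per_layer * layers


def package_bits(bits_per_die: int, dies_per_package: int) -> int:
    """Bits in one package; scales with the number of stacked dies."""
    return bits_per_die * dies_per_package


# Hypothetical fab parameters, for illustration only.
WAFERS, DIES, BITS_PER_LAYER = 10_000, 500, 4 * 10**9

out_48 = fab_output_bits(WAFERS, DIES, BITS_PER_LAYER, 48)
out_64 = fab_output_bits(WAFERS, DIES, BITS_PER_LAYER, 64)
print(f"48 -> 64 layers raises fab output by {out_64 / out_48:.2f}x")

# Stacking two 64-layer dies doubles the package, not the fab's output.
die_bits = BITS_PER_LAYER * 64
print(f"2-die package holds {package_bits(die_bits, 2) / die_bits:.0f}x one die")
print(f"fab output with die-stacking is still {out_64} bits/month")
```

In this simple model the fab's monthly output depends on the layer count but has no term for dies per package, which is the point the paragraph above makes.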

Now, Chris Mellor at The Register reports in "Good gravy, Toshiba QLC flash chips are getting closer":
3D TLC flash is now good enough for mainstream enterprise use. ... QLC could become usable for applications needing read access to a lot of fast, relative to disk and tape, flash capacity but low write access. Archive data, on the active end of a spectrum of high-to-low archive access rates, is one such application.

Back in March, Jeff Ohshima, a Toshiba executive, presented ... QLC flash at the Non-Volatile Memory Workshop and suggested 88TB QLC 3D NAND SSDs with a 500 write cycle life could be put into production.
QLC will not have enough write endurance for conventional SSD applications. So will there be enough demand for manufacturers to produce it, and thus double their output relative to TLC?
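A rough sense of what a 500 write cycle life means for the 88TB drive quoted above can be had from a back-of-the-envelope calculation. The sketch below ignores write amplification and over-provisioning, and the 5-year service life is an assumption introduced here, not a figure from the Toshiba presentation.

```python
def lifetime_writes_tb(capacity_tb: float, write_cycles: int) -> float:
    """Total data that can be written before the cycle budget is spent."""
    return capacity_tb * write_cycles


def drive_writes_per_day(write_cycles: int, service_years: float) -> float:
    """Full-drive writes per day that exhaust the budget over the service life."""
    return write_cycles / (service_years * 365)


CAPACITY_TB = 88      # from the quoted Toshiba suggestion
WRITE_CYCLES = 500    # from the quoted Toshiba suggestion
SERVICE_YEARS = 5     # assumption, for illustration only

total_tb = lifetime_writes_tb(CAPACITY_TB, WRITE_CYCLES)
dwpd = drive_writes_per_day(WRITE_CYCLES, SERVICE_YEARS)
print(f"Lifetime writes: {total_tb:,.0f} TB ({total_tb / 1000:.0f} PB)")
print(f"Sustainable drive writes per day: {dwpd:.2f}")
```

Under these assumptions the drive could absorb about 44PB of writes in total, or roughly a quarter of its capacity per day for five years. That is far below what conventional SSD workloads expect, but plausible for data that is written rarely.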

Exabytes shipped
Cloud systems such as Facebook's use tiered storage architectures in which re-write rates decrease rapidly down the layers. Because most re-writes would be absorbed by higher layers, it is likely that QLC-based SSDs would work well at the bulk storage level despite only a 500 write cycle life. It seems likely that only a few of the 2015 flash exabytes in the graph are 3D TLC; most would be 2D MLC. If we assume that half the flash from existing fabs becomes 3D QLC, flash output might increase 8x. This would still not be enough to completely displace hard disks, but it would reduce disk volumes and thus worsen the economics of building them. Fewer new flash fabs would be needed to displace the rest, which would be more affordable. Both effects would speed up the disk death spiral.
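One simple way to explore how converting part of existing fab capacity affects total output is a weighted blend. This is only an illustrative model, not the calculation behind the 8x estimate; the function names and the per-wafer gain values are hypothetical.

```python
def blended_output(fraction_converted: float, gain: float) -> float:
    """Total output relative to today when a fraction of capacity
    is converted to a technology yielding `gain` times the bytes."""
    return (1.0 - fraction_converted) + fraction_converted * gain


def implied_gain(fraction_converted: float, overall: float) -> float:
    """Per-wafer gain the converted fraction needs for a given overall increase."""
    return (overall - (1.0 - fraction_converted)) / fraction_converted


# Hypothetical inputs: half the capacity converted.
print(blended_output(0.5, 15.0))  # 8.0
print(implied_gain(0.5, 8.0))     # 15.0
```

In this model the overall increase is sensitive both to the fraction of capacity converted and to the combined gain from moving to 3D and from storing more bits per cell, so modest changes in either assumption shift the result considerably.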

9 comments:

David. said...

According to what looks like an edited press release in EE Times:

"The inventor of 3D monolithic chip technology back in 2010, BeSang Inc. (Beaverton, Ore.), claims to have since created a superior three-dimensional (3D) architecture for NAND flash. Frustrated with licensee Hynix's slow implementation of its monolithic 3D technology, BeSang is opening the door to partnerships with other memory houses, as well as offering to contract-fab the chips for resale by others, at a price that reduces the cost-per-bit of 3D NAND from over 20¢ to about 2¢ per gigabyte."

A 10x price reduction would be a big deal but there are reasons to be skeptical about claims for chip technologies that aren't yet in mass production.

David. said...

Chris Mellor at The Register reports:

"Seagate is closing down its factory in Havant on the south coast of the UK and axing 327 jobs."

and:

"HAMR could increase areal density to 1.5Tbit/inch2, 50 per cent or more higher than today's drives; a 10TB helium-filled drive could become a 15TB helium-filled HAMR drive, with higher capacities in prospect. Enterprise valuation drives could come in 2017 with GA in 2018. Client drives might follow."

David. said...

Paul Alcorn at Tom's Hardware reports that:

"The Western Digital Corporation (WDC) announced that it has completed development and is ramping up production of BiCS3, which is the world's first 64-layer 3D NAND."

David. said...

Chris Mellor's "QLC flash is tricky stuff to make and use, so here's a primer" is a readable description of the technology.

David. said...

Chris Mellor reports that Samsung's 64-layer flash will ship in volume by the end of the year.

David. said...

At the Flash Memory Summit, Toshiba described a 100TB QLC-based SSD.

"Analyst haus Stifel Nicolaus' MD, Aaron Rakers, added more details, saying Toshiba is actively working on a product and, in fact, has “been in early/high-level testing with hyper-scale customers”. He said Facebook, a prospective QLC SSD user, thought QLC drives would have a 150 write cycle endurance rating, and it is anticipating product arrival for WORM (write once, ready many) use. Facebook would be the exact sort of customer Toshiba has in mind, and has probably already tested Tosh QLC drives."

David. said...

Seagate's Flash Memory Summit announcements are reviewed by Chris Mellor.

David. said...

Chris Mellor reports that the first version of Intel's XPoint NVM technology suffers performance problems that limit it to SSD applications, and that these are now scheduled at the "end of this year".

To obtain the full potential of NVMs they need to be packaged as memories (DIMMs) not disks (SSDs). Intel is promising XPoint DIMMs in 2018. NVM DIMMs have the potential to eventually displace flash from the high-performance end of the market, pushing it down into the bulk storage space and thus displacing more HDDs. But, as we see, not for some time.

David. said...

Intel just launched six new 3D NAND SSDs.