Facebook's VP of engineering Jay Parikh on storing 350 million images per day, the need for speed and the product he's waiting to buy.
K.L.: What kinds of solutions are you considering?
J.P.: We're looking at all sorts of options, but one that is interesting to me is:
can the industry produce different grades of flash that handle these
different use cases? Today, high-end flash vendors are focused on very,
very high performance use cases where you need a lot of IOPS, you need
very low latency, you need very predictable but very high performance.
You pay a premium here. This is stuff you want to put into your database
server, your app server, your critical analytics service, whatever, but
it’s the wrong price point and it’s the wrong feature set for storing
data that is less frequently accessed.
My plea to the industry
is essentially I’ve got spinning disks which are optimized for one sort
of thing and I’ve got flash on the other side of it and in between I
don’t really have anything. When I’m building up these Big Data
applications I have to ultimately boil down to only picking between two
options here. There are hybrid options where you can put flash on top of
disks and do some things like that but ultimately I want to drive the
cost of our storage down, both in terms of cost per gig and the power
that we consume storing these photos online.
K.L.: I assume you’re talking about going beyond simply the difference between SLC,
MLC and TLC, and looking for something really re-imagined?
J.P.: We don’t need a lot of write endurance for this type of storage. You
read the photo, it stays there until somebody deletes it. It’s not like
you’re updating the photo and changing the color every day so we don’t
need a lot of high write IOPS or write capacity on this NAND. We just
need to be able to read it and we’re probably going to be reading it
K.L.: What about some of the
next-generation nonvolatile memories? A number of them can certainly
deliver the performance you seek, although not at the price point.
J.P.: At our scale, we need to consider density in addition to power and
cost. At the moment we'd be able to put a few hundred gigabytes of RAM
on a server, and it would be expensive. Compare that with, say, our
ability to put 3 TB of flash or tens of terabytes of spinning disk on a
particular host. That's the kind of density we need.
Facebook’s latest storage box design has been open-sourced as part of
Facebook's contribution to the Open Compute Project.