The Internet of Things (IoT) presents numerous engineering challenges, not the least of which is transporting data from remote locations to a datacenter that can distill it into useful information for users. However, the challenge doesn't end at the front door of the datacenter: the efficient movement of data within the datacenter plays a considerable role in the effectiveness of an IoT.
In two previous articles, we have talked about some of the challenges that face engineers when looking to build a network that provides Quality of Service (QoS) for an IoT. The datacenter brings new challenges as data has to be ferried between compute and storage systems that have different requirements.
To meet the need for high-bandwidth, low-latency and energy-efficient interconnect within a datacenter, AMD's SeaMicro SM15000 servers incorporate the state-of-the-art Freedom Fabric. It is designed to ferry mission-critical data between storage and ultra-dense compute nodes, with the AMD SeaMicro SM15000 purpose-built for the Big Data age, allowing organizations to efficiently use the vast amounts of data that an IoT will generate.
To understand the challenges posed by an IoT within the datacenter, one needs to appreciate where the value of an IoT is generated.
The data collected by sensors as part of an IoT must be analyzed in order to provide statistically sound, actionable information for users. Providing such information requires complex analysis over vast datasets, along with access to historical data to give users context. High-performance data analysis systems may comprise compute nodes, database servers and storage hierarchies, all of which must work together to distill vast quantities of data into useful information.
In the world of high-performance computing (HPC), interconnects have long been an important factor in overall cluster performance, and this applies equally to datacenters that will work on data generated by an IoT. While HPC clusters are typically skewed towards compute performance, as witnessed by the biannual Top500 list, data analysis systems may require more than one type of interconnect in order to strike the optimum balance between performance and cost.
Engineering an interconnect that meets these requirements has given rise to fabric compute models, where the interconnect between processors within a single server and external nodes is a fundamental part of the complete system. This is being driven by the characteristics of data analysis, which makes use of well-known scale-out workloads such as Hadoop. Such workloads will drive demand for dense compute, an infrastructure that maximizes processor and core count in order to exploit the divide-and-conquer approach used by Hadoop and other MapReduce-based systems.
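The divide-and-conquer pattern behind Hadoop-style workloads can be sketched in a few lines. This is a toy, in-memory illustration only (the shards, word-count task and function names are assumptions, not Hadoop's actual API): each "node" maps over its own shard independently, and a reduce step then merges the partial results.

```python
from collections import defaultdict
from itertools import chain

def map_phase(shard):
    # Each compute node processes its own shard and emits (word, 1) pairs.
    return [(word, 1) for line in shard for word in line.split()]

def reduce_phase(pairs):
    # After the shuffle, a reducer sums the counts per key.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Two shards standing in for data split across dense compute nodes.
shards = [["the quick brown fox"], ["the lazy dog", "the end"]]
mapped = chain.from_iterable(map_phase(s) for s in shards)
print(reduce_phase(mapped))  # 'the' is counted 3 times across both shards
```

The point for the interconnect is visible even at this scale: the map phase is embarrassingly parallel, but the shuffle between map and reduce moves every emitted pair across the fabric, which is why dense compute puts pressure on node-to-node bandwidth.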
Fabric compute is more than the connections between processors. The interconnect between processors in a dense compute node will be different to that linking compute nodes to data storage systems.
Costs dictate the need for tiered storage infrastructures in the modern datacenter. A hierarchical storage infrastructure makes performance storage such as PCI-Express based flash storage the first port of call, and is complemented by SATA solid state drives and traditional hard drives to provide cost-effective storage capacity. Tiers within the storage hierarchy will incorporate both hot and cold storage, with cold storage playing a critical part in economically storing and retrieving valuable data that is available within a predetermined latency.
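A tiering policy of the kind described above can be sketched as a simple recency-based placement rule. The tiers, latencies, costs and thresholds below are illustrative assumptions, not figures from any particular product: the idea is only that hotter data lands on faster, costlier media while cold data migrates to cheaper tiers with a longer, but predetermined, retrieval latency.

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    latency_ms: float   # typical access latency (illustrative)
    cost_per_gb: float  # relative cost (illustrative)

# Hypothetical hierarchy: PCIe flash -> SATA SSD -> HDD -> cold storage.
TIERS = [
    Tier("pcie-flash", 0.1, 1.00),
    Tier("sata-ssd", 0.5, 0.40),
    Tier("hdd", 10.0, 0.10),
    Tier("cold", 60_000.0, 0.02),  # retrievable within a predetermined latency
]

def place(hours_since_access: float) -> Tier:
    # Simple recency-based policy; the thresholds are assumptions.
    if hours_since_access < 1:
        return TIERS[0]
    if hours_since_access < 24:
        return TIERS[1]
    if hours_since_access < 24 * 30:
        return TIERS[2]
    return TIERS[3]

print(place(0.5).name)   # recently touched data stays on flash
print(place(2000).name)  # data untouched for months moves to cold storage
```

Real tiering engines weigh more signals than recency (access frequency, object size, SLAs), but the cost/latency trade-off they optimize is the one this sketch encodes.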
The interconnect at various tiers of the storage hierarchy will need to meet the bandwidth demands of each tier. For example, a 1 Gb/sec Ethernet link between the compute node and a storage node that makes use of PCI-Express based flash storage will create a bandwidth bottleneck as the storage node will quickly saturate the link and degrade performance.
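The bottleneck in that example is easy to quantify with back-of-the-envelope arithmetic. The flash throughput figure below is an assumption for illustration; the point is only that a single PCIe flash device can source data many times faster than a 1 Gb/sec link can carry it.

```python
# 1 Gb/s Ethernet delivers at most 1000/8 = 125 MB/s of raw throughput
# (before protocol overhead, which only makes the gap wider).
link_gbps = 1.0
link_MBps = link_gbps * 1000 / 8

# Assumed sustained read rate for one PCIe flash card (illustrative figure).
flash_MBps = 1500.0

# The storage node can source data far faster than the link can move it,
# so the network, not the storage media, bounds delivered performance.
print(f"link:  {link_MBps:.0f} MB/s")
print(f"flash: {flash_MBps:.0f} MB/s")
print(f"flash/link ratio: {flash_MBps / link_MBps:.1f}x")
```

Under these assumptions the link saturates at roughly a twelfth of what the flash tier can sustain, which is why faster interconnect is warranted at the performance tier even when 1 Gb/sec Ethernet is adequate for colder tiers.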
High-performance interconnect is more than just bandwidth. Consideration must be given to the protocols that run on top of the wire and whether they are suited to streaming or bursty data. A point-to-point protocol that can be deployed between processors within a dense compute node would not be feasible when connecting tens of thousands of compute nodes within a datacenter. The challenges do not stop at connecting processors; transactional operations are likely to generate bursts of data throughout the datacenter, while moving data from hot to cold storage is likely to be a bulk transfer that lasts for minutes.
Engineering a datacenter to meet these challenges is of vital importance if the raw data collected by an IoT is to be efficiently distilled into meaningful and actionable information. Previously, the notion of a datacenter interconnect started and finished with an Ethernet cable going to a top-of-rack switch. Today workloads dictate the need for fabric compute -- a multi-faceted interconnect, one that meets the needs of a variety of services within the datacenter and is designed to be at the heart of the overall system.
— Lawrence Latif is a Technical Communications Manager at AMD. He has published peer-reviewed research and has over a decade of experience in enterprise IT, networking, system administration, software infrastructure, and data analytics. Lawrence holds a BSc in computer science and management from King's College London, an MSc in systems engineering management, and a PhD in electronic and electrical engineering from University College London.