LAKE WALES, Fla. ‐ Cray Inc. on Tuesday (June 20) launched a free, downloadable analytics software suite for its Intel Xeon-powered high-end Urika-XC line of supercomputers at the International Supercomputing Conference (ISC 2017) in Frankfurt, Germany. According to Cray, the software suite “cuts through the Big Data deluge like-a-knife-through-butter” using open-source analytics, deep learning and artificial intelligence (AI).
The first Cray-1 processor was built completely from NOR gates and SRAM with about 200,000 gates in all.
"All the software in the Big Data Analytics Software Suite is downloadable for the Cray Urika-XC," Tim Barr, Cray’s director of analytics and artificial intelligence product strategy, told EE Times in an exclusive interview. "We only charge you for maintenance and installation if you want those services from Cray."
Cray said its Big Data Analytics Software Suite will accelerate visualization, machine learning, deep learning, weather forecasting, seismic imaging, manufacturing CAE (computer-aided engineering), scientific analysis, climate science, chemistry and materials science, large-scale graph discovery, cancer cell morphology, and fraud and insider-threat detection. However, any application that has to run analytics on large data sets, from sensors to on-line transactions, can benefit, according to Barr.
Today Cray’s XC can be configured with thousands of cores and terabytes of memory with trillions of gates.
Cray's Urika-XC supercomputers do not need massively parallel Xeon Phi processors or Nvidia graphics processing units (GPUs) to get the most out of the Big Data Analytics Software Suite, said Barr. The suite works with Intel Xeon multi-core processors, the specific models of which depend on the XC model you choose. (You can also rent time on a Urika-SC instead of buying by going to cloud provider the Markley Group, Boston.)
The components of the Big Data Analytics Software Suite include Cray's own Graph Engine (which includes some of the company’s fastest graph-theoretic algorithms), the world-famous Apache Spark analytics environment, the BigDL distributed deep learning framework for Spark, the distributed Dask parallel computing libraries for analytics, and widely used languages for analytics including Python, Scala, Java, and R. All are open source, but for a fee Cray will provide full support for the software suite through a software subscription that covers maintenance, updates and technical support.
Dask is a flexible parallel computing library for analytics with two components: dynamic task scheduling optimized for computation, and “Big Data” collections such as parallel arrays, dataframes, and lists that run on top of the dynamic task schedulers, according to Barr. The user can run combined workloads using the same scheduler and in-memory Spark operations with Dask.
BigDL is a distributed deep learning library for Apache Spark written by Intel and contributed to the open source community. Users can write their own deep learning applications as standard Spark programs, which run directly on top of Spark.
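To make the two components concrete, here is a minimal sketch in plain Python of the general idea: a scheduler that launches each task as soon as its inputs are ready, and a collection-style helper that turns a large list into a graph of per-chunk tasks. This is not Dask's API or Cray's implementation; the names `run_graph` and `chunked_sum_graph` and the dictionary graph layout are assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
from operator import add, mul

# Hypothetical task graph: a key maps either to a literal value or to a
# tuple (function, dependency_key, ...). Tuples are always treated as tasks.
graph = {
    "a": 1,
    "b": 2,
    "sum": (add, "a", "b"),
    "prod": (mul, "a", "b"),
    "total": (add, "sum", "prod"),
}

def run_graph(graph, target, max_workers=4):
    """Run every task once its dependencies are done; return the target's value."""
    results = {k: v for k, v in graph.items() if not isinstance(v, tuple)}
    pending = {k: v for k, v in graph.items() if isinstance(v, tuple)}
    running = {}  # maps each in-flight future to its graph key
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Dynamic scheduling: submit whatever has all inputs available now.
            ready = [k for k, (_, *deps) in pending.items()
                     if all(d in results for d in deps)]
            for key in ready:
                func, *deps = pending.pop(key)
                future = pool.submit(func, *(results[d] for d in deps))
                running[future] = key
            if not running:
                raise ValueError("unresolvable dependencies in graph")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results[target]

def chunked_sum_graph(data, chunk_size):
    """Stand-in for a parallel collection: sum a list chunk by chunk."""
    graph, parts = {}, []
    for i in range(0, len(data), chunk_size):
        graph[f"data-{i}"] = data[i:i + chunk_size]  # literal chunk of the list
        graph[f"part-{i}"] = (sum, f"data-{i}")      # one task per chunk
        parts.append(f"part-{i}")
    graph["total"] = (lambda *xs: sum(xs), *parts)   # combine the partial sums
    return graph

print(run_graph(graph, "total"))
print(run_graph(chunked_sum_graph(list(range(10)), 3), "total"))
```

In this toy version the "collection" only knows how to build a graph, and the scheduler only knows how to execute one, which is the division of labor the description above attributes to Dask.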
Cray Big Data Analytics can handle terabytes, petabytes and even exabytes of data to draw conclusions about relationships hidden sparsely within the deluge.
The biggest advantage of running the Big Data Analytics Software Suite, according to Barr, is that workloads can be combined using the same scheduler and in-memory Spark with Dask. Thus, you can move from analytics and AI workloads to scientific modeling and simulations and back to data analytics without having to move a massive data set from the simulation system to a separate analytics system. Cray accomplished this feat by leveraging a container system to run analytics workloads on the Urika-XC instead of on an x86 cluster. The resulting lower cost of ownership and greater return on investment will enable real-time weather forecasting, predictive maintenance, precision medicine, and fraud detection, according to Barr.
The software package has already been running at beta test sites in large accounts, including on the third-fastest supercomputer in the world, nicknamed Piz Daint, at the Swiss National Supercomputing Centre.
The Big Data Analytics Software Suite will be available for everyone in the third quarter of 2017.
— R. Colin Johnson, Advanced Technology Editor, EE Times