A recent AnTuTu benchmark that put Intel's latest Atom ahead of ARM-based processors from Nvidia, Qualcomm, and Samsung made headlines. But something seems odd.
There has been a considerable amount of press around recent AnTuTu benchmark results and a recent ABI Research report claiming, "Intel apps processor [the Atom Z2580] outperforms Nvidia, Qualcomm, and Samsung." (See Intel processor outperforms Nvidia, Qualcomm, Samsung ICs.)
This blanket statement essentially proclaims that Intel has surpassed the entire ARM ecosystem in mobile processors for the all-important high-end smartphone segment. Normally, I would not comment on information or comments from other analyst firms, but in this case, something seems odd.
Evaluating current mobile processors is a challenging effort. These processors are more aptly referred to as systems-on-chips (SoCs) because they are much more than the CPUs common to PCs. These processors are complex systems of heterogeneous processing elements combined with memory, I/O, high-speed networks, communications modems, and a host of other dedicated system functions. This alone makes comparison based on specifications difficult.
Integration of the processors into mobile devices further complicates any evaluation because the overall performance and efficiency of these processors are affected by the other system components, such as software, multimedia accelerators, and wireless networking. As a result, the industry turns to benchmarks to compare processors and devices. Unfortunately, mobile benchmarks still fall short of providing an accurate evaluation.
Benchmarking is plagued with many issues. The first is the complexity and variety of mobile processors and devices. No two mobile processors or devices are designed alike. The second issue is that it is difficult to test for actual usage models. Smartphones, for example, are used for a variety of functions including communication, content creation, and entertainment. In addition, this usage varies, not only between individuals, but for each individual depending on his or her current requirements. And usage models are continuously changing with new applications, content, and devices.
Finally, benchmarks have always been subject to manipulation. It has always been in the best interest of technology vendors to present their goods in the most positive light possible, so vendors have attempted to manipulate benchmarks through various means, ranging from optimizing hardware configurations or modifying software to match the benchmark testing parameters to even attempting to influence the benchmark code or methodology. As a result, no benchmark is completely accurate in evaluating a processor or device. However, where one benchmark falls short, another typically excels. This is why it has become common practice to use a suite of benchmarks in product evaluations.
So, the first thing that strikes me as odd is that the only benchmark cited in the articles about the new Intel processor outperforming all the ARM competitors is AnTuTu. In the case of the ABI Research report, the press release refers only generically to "benchmarks" used to measure overall performance and current drain, the latter an attempt to gauge the power efficiency of the processor. Although ABI Research did not specify which benchmarks were used, Intel and other tech reviewers have confirmed that they experienced similar results using the AnTuTu benchmark.
However, many of these sources indicated that other benchmarks demonstrate better performance and lower power consumption from the devices using the ARM-based processors. To further investigate this discrepancy, I compiled a variety of benchmark information from tech reviewers, benchmarking organizations, and other industry resources.
The figure below shows the performance of the Samsung Exynos 5 Octa processor relative to the Intel Atom Z2580. The two devices used were the Lenovo K900 with the Intel processor and the Samsung Galaxy S4 GT-I9500 with the Samsung processor. Note that the benchmarking results for the Samsung Galaxy S4 GT-I9505, which uses the Qualcomm Snapdragon 600 processor, were very similar to those of the GT-I9500 with the Samsung processor. Because the numbers from different sources varied, the final numbers were averaged, and revisions in the benchmarks were taken into account whenever possible.
The results show a significant advantage for the Intel processor over the Samsung processor, but only in the AnTuTu benchmark, and only in one of the AnTuTu benchmark tests. This raises several questions. The first is why the results of these benchmarks vary so greatly.
The AnTuTu RAM benchmark shows the Intel processor with almost double the performance, while the Quadrant and Geekbench memory benchmarks show the Samsung processor ahead by as much as 50 percent, and the Geekbench stream benchmark shows the Samsung processor with nearly twice the performance. Similarly, the AnTuTu CPU benchmark shows only a 20 percent advantage for the Samsung processor, but the other CPU-centric benchmarks (Quadrant, Geekbench, and Linpack) show a 2.3x or greater advantage for the Samsung processor.
This is a tremendous discrepancy. In a statistical evaluation, it would be normal to eliminate the outlying data points. If we did this, the much-cited AnTuTu benchmark would not be included in the final evaluation. In addition, the outlying figures would be examined for errors.
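To make the outlier-elimination step concrete, here is a minimal sketch in Python using a simple standard-deviation cutoff. The benchmark names and score ratios below are illustrative assumptions loosely patterned on the ranges discussed above, not the actual data behind the figure:

```python
# Hypothetical relative-performance ratios (one processor vs. another)
# across several memory benchmarks. These numbers are made up for
# illustration; they are not the article's measured results.
scores = {
    "AnTuTu RAM": 1.9,       # the single result favoring the other side
    "Quadrant memory": 0.70,
    "Geekbench memory": 0.65,
    "Geekbench stream": 0.55,
}

def remove_outliers(data, threshold=1.5):
    """Drop points more than `threshold` population standard deviations
    from the mean. With only a handful of samples, a low threshold is
    needed for any point to register as an outlier."""
    values = list(data.values())
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return {name: v for name, v in data.items()
            if std == 0 or abs(v - mean) <= threshold * std}

filtered = remove_outliers(scores)
print(sorted(filtered))  # the AnTuTu entry is excluded; the rest remain
```

With only four data points, a cutoff like this is crude; a real evaluation would examine the outlying figure for methodological errors, as noted above, rather than rely on the filter alone.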