There are two target audiences for benchmarks: engineers and end-users.
Engineers have price/performance targets they want to meet when designing systems, and benchmarks are metrics that tell them where they are on meeting the targets.
End-users will see benchmarks as ways of comparing systems.
But in the smartphone market, where the AnTuTu benchmarks seem to be popular, the question is what decisions are made on the basis of them.
Smartphones are better seen as fashion accessories than as tech. Most buyers are interested in cool. Smartphones are status markers, and the incentive will be "My phone is cooler than yours!"
What makes a phone cool? Brand will be critical. iPhone buyers aren't buying iOS, they're buying Apple. There are scads of Android phones, so while Android is cool, it's a common denominator, and just running Android is not a deciding feature. Windows Phone is not cool, and that may be the biggest challenge Microsoft and Nokia have in getting a share of the market.
The smartphone market reminds me of the movies. In the movie business, you are as good as your last hit picture, and if your studio hits a dry patch and doesn't have hit pictures for a while, you may go out of business. The smartphone market is similar. Motorola had a hit with the Razr, didn't have a followup hit, and was rumored for a bit to be looking at getting out of the smartphone business.
I'd bet that most folks who run the AnTuTu benchmarks are measuring performance on the phone they already bought for other reasons, and aren't making a decision on which phone to buy based on them.
Given that, how much should anyone care what the AnTuTu benchmarks say?
Your point is well taken and I completely agree. End-users don't care about benchmarks. PC generation and pre-PC generation users usually look for common brands and features that they prefer. The mobile generation looks for something new. Benchmarks really come into play with the OEMs and carriers that are making decisions about what technology and products to select. So, they still play a role within the industry, even if we all agree that they should not.
And, when we really look at choosing an ARM or Intel processor, the decision is very similar to selecting the OS: what ecosystem do you want? A processor alone does not make a successful product.
Oh, I can see end-users caring about benchmarks. I just don't think most will use them to make a purchase decision. They will use them to validate the decision they already made. The AnTuTu benchmarks are the sort of thing I can see a smartphone owner point to and say "Look how much better the benchmarks are for the phone I bought than they are for your choice." (with the implicit sub-text "I made a better choice than you did, so I'm smarter and cooler than you.")
And I can see cases in the industry where benchmarks will be used, and there will even be agreement they should be used. I just hope for better understanding of what the benchmarks measure, how they measure it, and what the results actually mean. (This may be wishful thinking on my part.)
I also wonder how many mobile device users are really aware of what processor is under the hood, or care even if they do know. As you say, they are buying an ecosystem, and what is available in that ecosystem will be far more important than what that ecosystem runs on. There are probably device owners who are aware that Intel and ARM are battling for share in the mobile device market, but I really doubt anyone will say "I'm running ARM, and you're running (yuck!) Intel! You're a real dweeb!"
I agree with your point about mobile phone users not giving a hoot what kind of processor their phone is running. Obviously there are some who do, many of whom read EE Times, but the vast majority of people buying smartphones couldn't care less.
Great post. Fantastic to see that the EE Times article was influential enough for AnTuTu to update their benchmark, though it does cast doubt on the value of traditional benchmarking. Also a great lesson not to 'follow one number' to judge the capability of a smartphone. I'm sure Intel isn't too happy about such a post, so kudos to EE Times for sticking with the facts/data.
Just showing performance numbers without measuring how much energy was actually used to run the benchmark is meaningless. A quad-core out-of-order CPU beating a dual-core in-order CPU in multi-threaded benchmarks... color me surprised!
In terms of the timing, I and many of my colleagues were made aware of the 3.2.2 revision on Wednesday evening, which is what I am going by.
In terms of performance being meaningless, I agree. The best measure is efficiency (performance/Watt), and the most accurate measurement is platform efficiency. Even measuring just the CPU core or SoC is rather meaningless. Take the Intel Atom vs. the Qualcomm Snapdragon. If you were doing a fair comparison you should include all the functions that are integrated into the Snapdragon, such as Wi-Fi and the cellular baseband modem. On a platform without those functions integrated into the SoC, like the Atom, you have to factor in the combo chip, the baseband modem chip, and the filters managing coexistence issues. Then you need to add in any external accelerators that the other platforms may or may not have or require, such as image signal processors, video processors, DSP, etc. And this is just to make a viable silicon comparison. However, if you really want to measure overall performance and power, you need to factor in all the other system components and software as well.
So, I agree that we need a better way to evaluate mobile platforms and the key components that drive them. Any suggestions?
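To make the platform-level performance/Watt comparison above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Component` type, the `platform_efficiency` helper, the component lists, the benchmark score, and all power figures are made up and are not measurements of any real Atom or Snapdragon device.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    avg_power_w: float  # average power drawn during the benchmark run, in watts


def platform_efficiency(score, components):
    """Return benchmark score per watt over every component that is counted."""
    total_power = sum(c.avg_power_w for c in components)
    if total_power <= 0:
        raise ValueError("total power must be positive")
    return score / total_power


# Hypothetical platform A: Wi-Fi and baseband are integrated into the SoC.
integrated = [Component("SoC with integrated Wi-Fi and baseband", 2.0)]

# Hypothetical platform B: the same functions need extra chips.
discrete = [
    Component("application processor", 1.5),
    Component("Wi-Fi combo chip", 0.5),
    Component("baseband modem", 0.5),
]

score = 1000  # same made-up benchmark score on both platforms

cpu_only_b = platform_efficiency(score, discrete[:1])  # processor alone
platform_a = platform_efficiency(score, integrated)
platform_b = platform_efficiency(score, discrete)

print(f"B, processor only: {cpu_only_b:.1f} points/W")
print(f"A, whole platform: {platform_a:.1f} points/W")
print(f"B, whole platform: {platform_b:.1f} points/W")
```

With these invented numbers, platform B looks better when only its processor is counted, but worse once the external chips are added to the power budget, which is the distortion the comment above describes.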
I feel it is not fair to compare GCC on an ARM processor against ICC on an Intel processor. From my past experience, ARMCC used to outperform GCC 2:1 in some cases. So I feel a fair comparison should be either GCC on both the ARM and Intel processors, or ARMCC on the ARM processor and ICC on the Intel processor. Of course, that will give the ARM processor more of an edge against the Intel processor. I know ARM has been working to improve GNU compiler efficiency, so the gap between ARMCC and GCC might not be as big as it used to be.
Another way to look at this is: Since we're talking about performance of mobile devices running Android, perhaps benchmarks should use whichever compiler is typically used for a given processor when that processor is used in Android mobile devices.
In other words, rather than choosing the compiler that yields the best benchmark results, let's use the compiler (and compiler settings) that yields the most realistic benchmark results.
Agreed, GCC is currently the only official compiler on Android, so the OS and all native code are compiled with it. Therefore any Android benchmarking should be done using GCC. Using GCC also has the advantage that, unlike some other compilers, it is not optimized to do well on benchmarks (or even break them) but actually focusses on performing well on real code.
The next version of AnTuTu will revert to GCC. Hopefully they also fix the compiler options to be the same on all targets (the ARM version was compiled -Os with inlining and unrolling disabled) so that we finally end up with a fair comparison. Of course none of this changes the fact that the benchmark itself remains rubbish: setting bits in memory is not a good memory performance test.
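As a rough illustration of the "same compiler options on all targets" point, here is a small Python sketch that reports which flags differ between builds. The `flag_differences` helper is hypothetical. The ARM flag spellings are an assumed GCC rendering of "-Os with inlining and unrolling disabled", and the x86 entry is purely a placeholder; neither is taken from AnTuTu's actual build files.

```python
def flag_differences(flags_by_target):
    """Return, per target, the flags that are not shared by every target."""
    flag_sets = {target: set(flags) for target, flags in flags_by_target.items()}
    if not flag_sets:
        return {}
    common = set.intersection(*flag_sets.values())
    return {
        target: sorted(flags - common)
        for target, flags in flag_sets.items()
        if flags - common
    }


# Assumed flag spellings, for illustration only.
builds = {
    "arm": ["-Os", "-fno-inline", "-fno-unroll-loops"],
    "x86": ["-O2"],  # hypothetical placeholder, not a known setting
}

for target, extra in flag_differences(builds).items():
    print(f"{target}: not shared with the other targets: {' '.join(extra)}")
```

An empty result would mean every target was built with the same options, which is the precondition for the fair comparison asked for above.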