SAN JOSE, Calif. - A PC benchmark program that claims to measure system performance more realistically than today's other tests has taken Intel Corp.'s Pentium 4 processor to task.
Among other things, Benchmark Studio 1.0 has indicated that Advanced Micro Devices Inc.'s Athlon processor significantly outperforms the Pentium 4 and that PCs with two relatively slow Pentium IIIs far surpass desktops with a single 1.5-GHz Pentium 4.
Randall C. Kennedy, who developed the Benchmark Studio while working almost exclusively for Intel, said other current benchmarks are skewed to favor the chip giant, with which he had an acrimonious falling out last year.
"Basically the benchmarks out there today say what Intel wants said about PC performance," he maintained.
Kennedy charged that existing benchmarks, including Winstone from Ziff Davis and those from the Business Applications Performance Corp. (BAPCo), are heavily influenced by Intel, something spokesmen from both groups denied.
An Intel spokesman would not comment on Kennedy's statements but defended the benchmarks in use today. "BAPCo and Ziff-Davis are the major industry-standard benchmarks and they are well-accepted in the industry," he said.
Kennedy, director of research at Competitive Systems Analysis (Danville, Calif.), the two-man consulting firm that developed the test suite, said his benchmark is unique: It attempts to measure multiple concurrent tasks in an environment similar to real-world business computing, whereas existing benchmarks measure jobs run serially. In addition, he said, the program lets users flexibly define what mix of tasks they want the benchmark to run, customizing the benchmark to their own typical workloads.
So far the tests claim eye-opening results. For instance, they show that an Athlon processor in a system with double-data-rate SDRAMs delivers about 15 percent more performance than a Pentium 4 system with Direct Rambus, Kennedy said. They also show a desktop with two 733-MHz Pentium III chips is almost twice as fast as a PC using the latest 1.5-GHz Pentium 4, a finding that might surprise many in the industry, where speed is king and dual-processor desktops never took off. Kennedy's test also showed the 1.5-GHz Pentium 4 significantly outperforms the 1-GHz Pentium III, he said.
Nathan Brookwood, a consultant, and others who saw a presentation at the Platform Conference here, said that if the software lives up to its billing the benchmark could significantly impact how PCs and their components are measured and marketed. "The existing benchmarks have been showing similar performance for the Pentium III and Pentium 4, which has been a big problem for Intel," said Brookwood, who is the principal consultant with Insight 64 (Saratoga, Calif.).
Nick Stam, senior technical director for PC Magazine, the Ziff Davis Media publication, which helped develop Winstone, said, "I think [Benchmark Studio is] a valiant effort and a good engineering tool, but it could be confusing to an end user if they don't know what they are doing." If the program does indeed measure multiple tasks running concurrently, he said, it could be a significant improvement over today's serial tests.
However, useful benchmarks need to measure both the time a computer takes to handle tasks and how much of the work the PC accomplishes in that time, something Stam said he was not sure the new program does. "And," he said, "to get really pristine, repeatable results you really need a client/server test environment [rather than the standalone PC test the Benchmark Studio performs]."
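Stam's point about counting both elapsed time and work completed can be illustrated with a simple throughput figure. The sketch below is only an illustration: the function name and the sample numbers are hypothetical and are not taken from Benchmark Studio, Winstone or any measured system.

```python
def throughput(tasks_completed: int, elapsed_seconds: float) -> float:
    """Return work done per second: completed tasks divided by elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tasks_completed / elapsed_seconds


# Hypothetical runs: both systems are timed for the same 60 seconds,
# but system B finishes more of the workload in that window.
system_a = throughput(tasks_completed=120, elapsed_seconds=60.0)
system_b = throughput(tasks_completed=150, elapsed_seconds=60.0)

# Relative performance of B over A, based on work per unit time.
ratio = system_b / system_a
```

A test that reported only the 60-second run time would rate the two hypothetical systems as equal; dividing work by time is what separates them.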
Stam rejected Kennedy's charge that the Winstone benchmark is too heavily influenced by Intel. "Intel, AMD, Microsoft, Via and other companies have all had input into Winstone, and we try to correct things that are wrong when someone points a problem out to us," he said.
A falling out
Ironically, although the software shines a negative light on some Intel processors, Kennedy spent most of the 18 months in which he developed the suite essentially working just for the chip maker. Kennedy said he got started with Intel trying to develop a metric that could show the performance advantages of Direct Rambus memories. But once the company started de-emphasizing Rambus he shifted his efforts to helping Intel measure the benefits of its Pentium 4 over the Pentium III.
But Kennedy said he had a falling out with Intel over an article critical of Pentium performance that he published on his Web site at www.xpnet.com, and he subsequently lost what had been more than a half million dollars in Intel business. After taking a vacation over the holidays last month, he decided to provide a free trial of an end-user version of the benchmark over his Web site.
"Dell is working with this in their labs, as is AMD, Cyrix [now Via] and Compaq," Kennedy said. "Most of the major OEMs know who we are and they could become our key development partners."
An engineer at AMD said he has talked to Kennedy, received the benchmark and reviewed Kennedy's Web site, but has not done an evaluation of the product yet. Compaq and Via did not return calls by press time.
Kennedy said he has not yet hashed out a business model but is considering licensing the code to OEMs or asking them for sponsorships to support ongoing work on the benchmark. "I can run for about a year before I have to really worry about how I will make money on this," he said.
One key to the program is its ability to define and isolate particular tasks that can be handled repetitively in mix-and-match application loadings. Users can set a host of parameters for a benchmark, including running up to four application loads simultaneously. Kennedy demonstrated the software benchmarking a desktop while running canned video on three simultaneous Windows Media Player windows and some background Microsoft Office applications.
The point of the software is to test performance on real-world application sets in what Intel calls a "continuous computing environment," where multiple tasks and multiple threads are running in parallel in a Windows environment. The code also tests multiple parameters such as memory and disk accesses, queue lengths, packet speeds and CPU utilization.
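The idea of a user-defined mix of loads run at the same time, rather than one after another, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Benchmark Studio's implementation (which the article describes as built on COM and ActiveX): the names `checksum_load` and `run_mix` and the load sizes are hypothetical, and Python threads show only the structure of a concurrent test, not true parallel CPU scaling.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_LOADS = 4  # the article cites up to four simultaneous loads


def checksum_load(iterations: int) -> int:
    """Hypothetical stand-in for an application load: a small CPU-bound loop."""
    total = 0
    for i in range(iterations):
        total = (total + i * i) % 1_000_003
    return total


def run_mix(mix: dict) -> dict:
    """Run every load in the mix concurrently; return results and wall time."""
    if not 1 <= len(mix) <= MAX_CONCURRENT_LOADS:
        raise ValueError("a mix must contain between 1 and 4 loads")
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(mix)) as pool:
        # Submit all loads before waiting on any, so they overlap in time.
        futures = {name: pool.submit(checksum_load, n) for name, n in mix.items()}
        results = {name: future.result() for name, future in futures.items()}
    elapsed = time.perf_counter() - start
    return {"results": results, "elapsed_seconds": elapsed}


# Hypothetical mix loosely modeled on the demo: video windows plus office work.
report = run_mix({"video_1": 20_000, "video_2": 20_000, "office": 5_000})
```

Changing the dictionary passed to `run_mix` changes the workload mix, which is the kind of flexibility the article attributes to the program.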
"It's really about testing performance under load," said Kennedy, comparing existing CPU-centric benchmarks to quarter mile drag races between sports cars. "The Pentium 4 is really more like an SUV and you need to see how it handles when running down the road with a load of gear in the back, two screaming kids and a canoe on top. That's an analogy Intel doesn't like, but it's fitting.
"Unless you test and simulate all the parameters you will fail to differentiate the real performance difference between one system and another," he said. "Our goal was to come up with an easy way to generate these complex simulations in a short amount of time that didn't require you to have a Microsoft engineer present."
The Windows-specific program relies heavily on Microsoft's COM object-oriented technology and its ActiveX controls. It focuses on the Windows 2000 environment, which Kennedy said was one of the reasons it has uncovered the performance benefits of dual-processor desktops.
Existing benchmarks tend to focus on Windows 95/98, operating systems that were not savvy about multiprocessing environments and thus masked much of the inherent performance advantage of multiprocessor systems over fast uniprocessor systems, Kennedy said. "I would never give up my dual-processor desktop system now," he said.
Kennedy deflected criticism at the conference that the flexibility of the program could allow OEMs or chip makers to create skewed application suites that unfairly emphasize some performance characteristic of their product. "It's one thing to build a simulation that shows off your capabilities," he said, "and it's another thing to just cheat."
The program had its first incarnation in a complex version that required multiple desktops and servers and was demonstrated by Intel's Pat Gelsinger at an Intel sales conference last February. The current desktop-only version "is buggy and only a beta, but we are doing our best," said Kennedy.