SAN JOSE -- Countering claims made recently by an industry microprocessor research firm, Intel Corp. at this week's Intel Developer Forum here said the upcoming Pentium 4 has no deep pipeline performance penalty.
Intel executives here at IDF detailed the Pentium 4's NetBurst technology, which they said significantly increases performance
over other processors, while nearly doubling the number of
processor pipeline stages.
Jeff Austin, Intel's IA-32 architect launch manager, said the Pentium 4's 20-stage pipeline suffers no penalty for pre-fetch misprediction because of its use of the NetBurst technology.
Misprediction, which sounds like an arcane technical question, is a
key performance factor. To increase the speed of operations and
data rates, modern processors literally try to guess in advance
what data will be needed. If the processor guesses wrong, a deep
20-stage pipeline such as the Pentium 4's can take up to 13 clock
cycles to purge all the data and be refilled, slowing operations.
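
As a rough illustration of why the flush penalty matters, the sketch below estimates average cycles per instruction once mispredictions are counted. Only the 13-cycle penalty comes from the figures above; the base rate, the share of instructions that trigger a prediction, and the misprediction rates are hypothetical values chosen for illustration, not measurements of any processor.

```python
def effective_cpi(base_cpi, predicted_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once pipeline flushes are included.

    Each mispredicted instruction is assumed to cost the full flush penalty.
    """
    stall_cycles = predicted_fraction * mispredict_rate * flush_penalty_cycles
    return base_cpi + stall_cycles


FLUSH_PENALTY = 13        # worst-case refill cost quoted in the article
BASE_CPI = 1.0            # hypothetical: one instruction per cycle when nothing stalls
PREDICTED_FRACTION = 0.2  # hypothetical: one instruction in five relies on a prediction

for rate in (0.02, 0.05, 0.10):
    cpi = effective_cpi(BASE_CPI, PREDICTED_FRACTION, rate, FLUSH_PENALTY)
    print(f"misprediction rate {rate:.0%}: {cpi:.3f} cycles per instruction")
```

Under these assumed numbers, the cost grows in direct proportion to the misprediction rate, which is why better prediction matters more as a pipeline gets deeper.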
Bert McComas, an analyst at InQuest Research Inc. in Gilbert, Ariz.,
claimed recently that the pre-fetch misprediction problem causes
the 1.4-GHz Pentium 4 to operate at the same performance level
as the 1.13-GHz Pentium III.
Intel's Austin, however, said NetBurst corrects most of the
misprediction problem, with the Pentium 4 performing at the highest
level of any Intel processor to date. Allowing the deep Pentium 4
pipeline to meet performance targets is only one of NetBurst's
goals, as the device also aims to provide much faster integer and
NetBurst includes Advanced Dynamic Execution, a speculative
engine that helps increase memory pre-fetch prediction rates
greatly, according to Intel. The technique uses three times as
many instructions operating in pre-fetch as the Pentium III and
includes more sophisticated algorithms that look at many prior
executions before making a prediction on data to be accessed.
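
The idea of consulting several prior executions before guessing can be sketched with a generic history-based predictor. This is not Intel's algorithm, whose details are not given here; the class name, the majority-vote rule, the history length, and the loop-like trace are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class HistoryPredictor:
    """Predicts an outcome from a majority vote over recent outcomes at one address."""

    def __init__(self, history_length=4):
        # One bounded history of True/False outcomes per address.
        self.history = defaultdict(lambda: deque(maxlen=history_length))

    def predict(self, address):
        outcomes = self.history[address]
        # With no history, or on a tie, default to predicting True.
        return sum(outcomes) * 2 >= len(outcomes)

    def update(self, address, outcome):
        self.history[address].append(bool(outcome))


def accuracy(predictor, trace):
    """Fraction of outcomes in the trace that the predictor guessed correctly."""
    correct = 0
    for address, outcome in trace:
        if predictor.predict(address) == outcome:
            correct += 1
        predictor.update(address, outcome)
    return correct / len(trace)


# Hypothetical trace: the same event repeats nine times, then breaks once.
trace = [(0x40, i % 10 != 9) for i in range(100)]
print("history of 1:", accuracy(HistoryPredictor(1), trace))
print("history of 4:", accuracy(HistoryPredictor(4), trace))
```

On this made-up trace, the longer history avoids being thrown off by the single exception in each run, which is the general benefit of looking further back.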
The Pentium 4 also features a Level 1 on-chip cache that executes
already decoded instructions, thus eliminating latency delays. The
L1 cache of the Pentium III, in comparison, must decode
instructions each time they are issued, slowing the speed at which
data is fed to the processor.
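
The benefit of caching already decoded instructions can be sketched as follows. This is a toy model under stated assumptions, not a description of the Pentium 4's actual cache: the class, the string-based "instructions," and the decoder are hypothetical, and the sketch ignores capacity limits and eviction.

```python
class DecodedInstructionCache:
    """Stores decoded instructions so repeated fetches skip the decode step."""

    def __init__(self, decode):
        self._decode = decode
        self._decoded = {}
        self.decode_count = 0  # how many times the decoder actually ran

    def fetch(self, address, raw_instruction):
        if address not in self._decoded:
            self._decoded[address] = self._decode(raw_instruction)
            self.decode_count += 1
        return self._decoded[address]


def toy_decode(raw_instruction):
    """Hypothetical decoder: splits text into an opcode and its operands."""
    opcode, *operands = raw_instruction.split()
    return (opcode.upper(), tuple(operands))


cache = DecodedInstructionCache(toy_decode)
loop_body = {0: "add eax 1", 4: "cmp eax 100", 8: "jne 0"}

# Run the same three-instruction loop 100 times.
for _ in range(100):
    for address, raw in loop_body.items():
        cache.fetch(address, raw)

print("instructions fetched:", 100 * len(loop_body))
print("decodes performed:", cache.decode_count)
```

In this sketch each instruction in the loop is decoded once and then reused, whereas a design that decodes on every issue would repeat the work on each pass.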
NetBurst's Rapid Execution Engine is another feature and includes
an arithmetic logic unit (ALU) integer-processor running at 2.8
GHz, which is twice the main-processor clock speed and provides
extremely rapid processing of integer instructions, according to Intel.
New Streaming SIMD Extensions 2 (SSE2) instructions in NetBurst also
speed processing by performing integer arithmetic on 128 bits
every clock cycle, twice as fast as the Pentium III. Additionally, Intel
said, NetBurst adds a 128-bit double-precision floating-point
operation not found in the Pentium III.
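
To make the 128-bit figure concrete, the short sketch below counts how many values of a given size fit in one operation. The 64-bit width used for comparison is inferred from the "twice as fast" claim above, and the helper name is made up for illustration.

```python
SIMD_BITS_PENTIUM_4 = 128   # width quoted in the article
SIMD_BITS_PENTIUM_III = 64  # inferred from the "twice as fast" comparison


def lanes(register_bits, element_bits):
    """Number of elements of a given size processed by one SIMD operation."""
    return register_bits // element_bits


for element_bits in (8, 16, 32, 64):
    print(
        f"{element_bits:2d}-bit integers per operation: "
        f"{lanes(SIMD_BITS_PENTIUM_III, element_bits)} at 64 bits, "
        f"{lanes(SIMD_BITS_PENTIUM_4, element_bits)} at 128 bits"
    )

# A double-precision floating-point value is 64 bits wide.
print("doubles per 128-bit operation:", lanes(SIMD_BITS_PENTIUM_4, 64))
```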