United Business Media EE Times
Posted: 11:45 p.m., EDT, 7/31/98

Intel, Compaq gird for 64-bit battle

By Alexander Wolfe

SANTA CLARA, Calif. — A battle is heating up at the bleeding edge of microprocessor technology as Intel Corp. and Compaq Computer Corp.'s Alpha group rush to ready their competing 64-bit architectures. New technical details have come to light about the race, which pits Intel's Merced, due out in mid-2000, against the next-generation Alpha CPU, known as the 21364. (Compaq acquired the Alpha design team when it bought Digital Equipment Corp. in June.)

Intel hasn't talked much about Merced since last fall, when it outlined the explicitly parallel instruction computing (EPIC) architecture that forms the basis of the CPU. Compaq has said nothing about its plans for an improved Alpha, probably because a current design, called the 21264, is moving to market and won't ship in quantity until later this year.

However, a design team of some 200 engineers is already hard at work on the successor 21364 CPU (also known as EV-7) at the former Digital Palo Alto Design Center, EE Times has learned. The architecture will be highly scalable, sources said, and could sample as soon as late 1999, though it most likely will hit the streets around the same time as Merced.

As for Merced, the processor will delve far deeper into advanced code-optimization techniques than Intel has previously let on, sources told EE Times. Intel has already disclosed that Merced relies heavily on predicated and speculative execution. (Predication removes unnecessary branches from an application program, while speculation masks memory latency by executing load instructions as soon as possible.)
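To make the idea of predication concrete, here is a minimal Python sketch of "if-conversion," the compiler transformation that turns a branch into guarded operations. It is an illustration only: Intel has not published Merced's instructions at this level of detail, and the function names below are hypothetical.

```python
def abs_diff_branching(a, b):
    """Conventional form: control flow decides which arm runs."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a, b):
    """If-converted form: both arms are issued, each guarded by a predicate."""
    p = a > b          # a compare that defines the predicate
    not_p = not p      # the complementary predicate
    result = 0
    # Both guarded operations are evaluated; only the one whose predicate
    # is true is allowed to update `result`. (Python's conditional
    # expression stands in for a hardware-guarded write here.)
    for predicate, value in ((p, a - b), (not_p, b - a)):
        result = value if predicate else result
    return result
```

In hardware the payoff is that no branch has to be predicted. The cost is that both arms occupy execution slots, which is why this approach is usually paired with wide, parallel hardware.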

Now, the sources report that Merced will use two other techniques, called "prepare to branch" and register windowing. In addition, it's becoming clearer precisely how Intel will apply speculation to the task of keeping the multiple execution units in Merced's parallel architecture supplied with usable instructions.

Because all the software techniques Merced uses are cutting-edge and highly complex, it's been difficult to put together a complete picture of how the architecture really works. That's been complicated by the fact that Intel, according to its own publicly stated policy, plans to release information on Merced "incrementally."

Experts caution that Merced watchers shouldn't put too much stock in any single, arcane methodology. Rather, the proof of Intel's 64-bit pudding will lie in whether Merced's highly parallel hardware and highly optimizing software compiler can cooperate to deliver on the promise of EPIC computing.

Reached at press time, an Intel spokeswoman wouldn't comment on Intel's plans for future disclosures, but she did confirm that Intel will probably provide additional information toward the end of the year.

That will likely happen in October, at the Microprocessor Forum in San Jose, Calif., EE Times has learned. There, Stephen Smith, vice president of the company's microprocessor products group, is slated to talk about the "features and futures" of the so-called IA-64 architecture on which Merced is based.

The spokeswoman also emphasized that Intel's plans to produce the chip in mid-2000 remain on track.

Though Intel has positioned Merced as a revolutionary development, it's becoming clear that the architecture stands on the shoulders of advanced concepts that have been kicking around for some time. Two cases in point are the Cydrome Cydra-5 and the Multiflow Trace systems, two very-long-instruction-word architectures of the mid-1980s. Though not microprocessor-based, they are widely considered the best examples of architectures utilizing speculation and predication.

"Their compilers were extremely advanced for their time," said Wen-mei Hwu, chairman of the computer-engineering program at the University of Illinois at Urbana-Champaign and a noted compiler expert. "A lot of the technology is being improved and integrated into the first generation of IA-64 commercial compilers."

At the same time, Hwu pointed out that Merced will have to go beyond the limited support for predication and speculation that's already been demonstrated in several production microprocessor architectures.

"The recent HP PA-RISC and Sun UltraSparc processors have non-trapping load instructions that enable limited speculated execution," Hwu said. "In addition, recent PA-RISC, UltraSparc, Pentium II and MIPS R10000 processors all have very limited support for predicated execution. However, this limited support doesn't result in significant performance advantages because it's mostly peephole optimizations for very limited code constructs."

To make matters worse, Hwu went on, "the commercial compilers for these processors do not have the infrastructure required to perform significant analysis and optimization on predicated and speculated code. They have treated predication mostly as a feature to assist code scheduling, and this limited view has cost them major optimization opportunities." Hwu said that recent DSP chips from Texas Instruments Inc. have predicated support that's "closer to what a compiler can use for more general constructs."
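The non-trapping loads Hwu mentions can be pictured with a short Python sketch. It models a load that is hoisted above the test guarding it, with any fault postponed until the value is actually needed. The `Deferred` marker, the dictionary standing in for memory and all function names are assumptions made for illustration; they do not describe any real instruction set.

```python
class Deferred:
    """Marker for a speculative load whose fault was postponed (hypothetical)."""

    def __init__(self, address):
        self.address = address


def speculative_load(memory, address):
    """Non-trapping load: return a marker instead of faulting on a bad address."""
    if address in memory:
        return memory[address]
    return Deferred(address)


def check(value):
    """Run where the load originally stood: surface any postponed fault."""
    if isinstance(value, Deferred):
        raise KeyError(value.address)
    return value


def lookup(memory, address, wanted):
    # The load is hoisted above the test that guards it, so its latency
    # can overlap with other work.
    value = speculative_load(memory, address)
    if not wanted:
        return None          # the speculated value is simply discarded
    return check(value)      # the fault, if any, is taken only here
```

A bad address costs nothing when the result turns out to be unwanted, which is what lets a compiler move loads early. The `check` step corresponds to the exception-detection and recovery support that this kind of speculation depends on.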

Hwu believes compilers will be crucial to the success of IA-64, because predication and speculation, if they are to be used at all, by definition require a compiler to transform the code. He added that previous architectures have all lacked three key ingredients: efficient predicate-defining instructions, a large number of predicate registers to handle complex code structures, and effective exception-detection and recovery mechanisms.

"Moving forward, these three features will be critical to the success of compiling real-world programs," Hwu said. "More importantly, the commercial compilers for the IA-64-based processors will have to effectively analyze and optimize predication and speculation code in the entire back end of the compiler rather than being limited to some optimizations right before code scheduling. This will require major upgrades of commercial compiler back ends." Hwu declined to comment on the specifics of possible optimization techniques used by Merced.

Compaq's upcoming next-generation Alpha is less of a quantum leap than Merced is for Intel. At the same time, given Intel's dominant market share, Alpha must maintain its performance edge just to survive.

"Obviously, we're looking over our shoulders," said Compaq engineer Allen Baum, a member of the Alpha design team. "Our belief is that when Merced comes out, there will be an Alpha that's faster. When the next Merced comes out, there will also be an Alpha that's faster."

To keep the Merced wolf from the door, Compaq maintains two separate Alpha design teams, one in Palo Alto, Calif., and the other in Massachusetts. (The teams went to work for Compaq when Digital was acquired in June; Digital's semiconductor manufacturing facility and StrongARM technology were sold to Intel.)

The California team is well into the design of the 21364. Although Baum wouldn't comment, an outside source familiar with the effort said the device will feature higher clock speeds and a new, faster bus than the current 21264 Alpha incarnation. That processor, set to ship late this year, will debut at 500 MHz. Speeds will quickly rise to 600 MHz and will ultimately top out at 800 MHz.

In contrast, the 21364 is scheduled to debut at 750 MHz and will eventually push to 1.2 GHz. It will also include an integrated memory controller and a faster, next-generation EV-7 bus. More important, it will be designed for use in symmetric-multiprocessing implementations, where up to 64 processors can be ganged in a single server. On-chip transistor count will jump nearly an order of magnitude, from the 15 million of the 264 generation to the 100 million range.

The new Alpha will also implement some of the same advanced code-optimization techniques Intel is eyeing. "There's already stuff in the architecture to do that predication and speculation, including things like prefetch instructions," said Baum. "The major difference between Merced and Alpha in this respect is static vs. dynamic. That is, they're doing everything as statically as they possibly can. We're doing everything as dynamically as we possibly can."

As to the brewing public perception that Merced uses predication and speculation far more than other architectures, Baum said, "I think that it's more explicit in the architecture as opposed to something that's hidden in the implementation. That is the big difference. Look at their name, EPIC: the 'explicit' part is important."

The first official details on the Compaq effort are scheduled to emerge at the Microprocessor Forum in October, when Pete Bannon of the Alpha design team will discuss the 21364.

Even as the 364 effort proceeds, the Alpha team on the East Coast is beginning work on the 21464. Interestingly, that device, not the 364, is the first Alpha chip slated to use 0.18-micron process technology. However, those plans are believed to be subject to adjustment.

"We think we have ways of keeping ahead [of Merced]," Baum said. "Partly, it's implementation expertise. We think we can implement better."

When it comes to implementation, Intel's engineers have advanced techniques of their own, including a leading-edge method for performing branches. Sources close to Intel revealed that the architecture will in fact implement a "prepare-to-branch" feature, in which the compiler prefetches branch paths well before the application needs them.

In computing the branch in advance, a condition bit is set. It is used to activate that branch at a later point, should the software actually need it.
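One way to picture the mechanism the sources describe is the toy interpreter below, written in Python. It is a sketch under heavy assumptions: the instruction names (`prepare`, `setcond`, `branch` and so on) are invented for illustration, and nothing here reflects Merced's actual encoding, which Intel has not disclosed.

```python
def run(program):
    """Interpret a toy program and return (trace, prefetched).

    Instructions (all hypothetical):
      ("prepare", target)  announce a possible branch target ahead of time
      ("setcond", value)   compute the branch condition early and set the bit
      ("work", label)      an ordinary instruction
      ("branch",)          jump to the prepared target if the bit is set
      ("halt",)            stop
    """
    pc = 0
    condition_bit = False
    prepared_target = None
    prefetched = []   # targets fetched ahead of need
    trace = []        # labels of the work instructions actually executed
    while True:
        op = program[pc]
        if op[0] == "prepare":
            prepared_target = op[1]
            prefetched.append(prepared_target)
        elif op[0] == "setcond":
            condition_bit = bool(op[1])
        elif op[0] == "work":
            trace.append(op[1])
        elif op[0] == "branch":
            if condition_bit:
                pc = prepared_target
                continue
        elif op[0] == "halt":
            return trace, prefetched
        pc += 1


def demo_program(taken):
    """Build a small program whose branch is taken or not, as requested."""
    return [
        ("prepare", 5),         # 0: target announced long before the branch
        ("setcond", taken),     # 1: condition bit computed in advance
        ("work", "a"),          # 2: unrelated work fills the gap
        ("branch",),            # 3: the bit decides; the target is ready
        ("work", "skipped"),    # 4: fall-through path
        ("work", "target"),     # 5: branch target
        ("halt",),              # 6
    ]
```

Running `run(demo_program(True))` skips the fall-through instruction, while `run(demo_program(False))` executes it. In both cases the target was announced at instruction 0, well before the branch at instruction 3 consulted the condition bit.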

The last piece of new information that's come to light involves how Merced manages its 128 internal registers. Because of its use of speculation and predication, the processor must juggle dozens of different instructions and pieces of data that may or may not actually be used during execution. All of that information is held in registers, so the chip's appetite for them often exceeds even the 128 available.

To push beyond this limit, Merced applies a register-windowing feature, which essentially divides up its 128 internal registers into sliding blocks or windows. Additional space is extracted by temporarily overlaying additional registers in software. The approach is less complicated than the register-renaming technique microprocessors typically use to juggle the references between software and the physical registers themselves.
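A simplified Python model shows what register windowing with software overlay could look like. It is one possible reading of the description above, not Merced's actual design: the class name, the per-call window sizes and the oldest-first spill policy are all assumptions made for illustration.

```python
class WindowedRegisterFile:
    """Toy model: a fixed pool of registers handed out in per-call windows."""

    def __init__(self, size=128):
        self.size = size
        self.frames = []   # each frame: {"regs": [...], "resident": bool}
        self.spills = 0    # how many windows were overlaid out to memory

    def _resident_count(self):
        return sum(len(f["regs"]) for f in self.frames if f["resident"])

    def call(self, window_size):
        """Open a new window, spilling the oldest resident windows if needed."""
        if window_size > self.size:
            raise ValueError("window larger than the register file")
        for frame in self.frames:                  # oldest first
            if self._resident_count() + window_size <= self.size:
                break
            if frame["resident"]:
                frame["resident"] = False          # contents now live in memory
                self.spills += 1
        self.frames.append({"regs": [0] * window_size, "resident": True})

    def ret(self):
        """Close the current window and make the caller's window resident."""
        self.frames.pop()
        if self.frames and not self.frames[-1]["resident"]:
            self.frames[-1]["resident"] = True     # refill from memory

    def write(self, index, value):
        self.frames[-1]["regs"][index] = value     # index is window-relative

    def read(self, index):
        return self.frames[-1]["regs"][index]
```

Each routine addresses its registers relative to its own window, so software never names a physical register directly. When three 64-register windows are opened on a 128-register file, the oldest is overlaid to memory and restored when control returns to it.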

All materials on this site Copyright © 2008 TechInsights, a Division of United Business Media LLC All rights reserved.