In computing systems, the need for greater efficiency has become paramount, especially in servers and mainframes, where high power density per square foot has exposed the limited capacity of office buildings to meet the demand. With power delivery improving rapidly, new opportunities to reduce energy consumption must be found within the system itself. These opportunities lie where most people are not looking: not in a single subsystem, but in the interactions between subsystems. This article reviews one such interaction within a typical server system.
They are taking over
In servers, the growth of the Internet has created the need for server farms: large clusters of servers collocated at critical junctures where multiple trunk lines provide very high-speed access to the country's communications backbone. These buildings often sit in high-rent areas, a cost that is passed along in the price per square foot of floor space charged for housing the servers. That cost, plus the need for more bandwidth, has driven the typical server from a 5U chassis only a few years ago down to a single 1U drawer that may contain up to four processors operating at twice the frequency of only three years ago. Historically, power has been proportional to the operating frequency of the CPU, so the power demand per rack unit has increased by 2 (double frequency) x 4 (processors) x 5 (reduction of rack space per server) = 40 times during this short time.
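Multiplying the three growth factors cited above gives the overall increase in power demand per rack unit; a quick sketch of the arithmetic (factors taken from the text, as illustrative figures rather than measured data):

```python
# Back-of-envelope growth in power demand per rack unit, using the
# three factors cited in the text (illustrative, not measured data).
freq_factor = 2      # CPU frequency roughly doubled
cpu_count = 4        # up to four processors per 1U drawer
density_factor = 5   # 5U chassis shrunk to a 1U drawer

growth = freq_factor * cpu_count * density_factor
print(f"Power demand per rack unit grew roughly {growth}x")
```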
This concentration of power has forced the realization that typical buildings are not designed to deliver the power demanded by a room full of these dense servers. Also lacking are redundant power feeds to protect against blackouts or brownouts. Finally, building air conditioning is insufficient to remove the thermal load found in these server clusters, which brings us to the topic of this article.
One design focus has been the layout of the server itself (its card spacing and chassis) and whether it allows efficient airflow for cooling. The AdvancedTCA card cage has brought improvements as well as challenges. The denser, larger blades in the 8U x 280 mm format boost performance in server and communications switching centers, but thermal management can be difficult for the cards, especially where board spacing is tight. With higher processing speeds, the drive to maximize density in the cabinet, and the necessity of virtually no downtime, there are tremendous constraints on providing all the functionality needed and cooling it properly. Cooling 200 W per slot while cramming boards and modules into the smallest amount of rack space requires careful attention to power efficiency: every percentage point of lost efficiency is dissipated as heat, and we have a stake in reducing, not increasing, the number of fans required.
Another focus of many designers has been increasing the dc-dc conversion efficiency of the CPU's core supply (the Vcore). But new controllers and better MOSFETs solve only one leg of the power conversion problem, albeit the most visible part of it.
In many servers, a pool of Intel Pentium processors serves as the core CPUs. These devices can demand in excess of 100 amps at 1.2 V, often with 150-ns load transients. For this reason, much attention has been focused on the efficiency of the Pentium power supply modules, which are effectively point-of-load (POL) converters. But other parts of the power delivery system also affect efficiency and energy conservation.
In a typical server system, even when power is distributed at 48 V dc, there is always an ac power source. This primary power source is often neglected when looking for sources of power loss and heat. Today there are standards limiting how much distortion may be introduced onto the power grid, making it necessary to add power factor correction at the input of the bulk regulator, or ac-dc power system.
The next stage is usually a traditional isolated dc-dc converter, often housed in a standard brick-type module, followed by several POL converters distributed throughout the server and located close to the critical circuits (figure 1). The power of each of these circuits has also increased as more memory was added to computer systems and more complex chip sets were added to enable greater parallel operation of the processors and faster I/O to the processors. (The distribution of these loads is illustrated in figure 2.)
Figure 1: Server power delivery systems use three power stages: a front end with power factor correction (PFC) converts the ac line voltage to a -48V dc distribution voltage, an intermediate supply converts this to the 12V used by each circuit card, and POL converters generate the individual load voltages.
Figure 2: Each card in a server must power a high-current processor (often a 100-amp Pentium), banks of memory devices, and several ASICs, each with its own regulated voltage and current requirements.
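Because these three stages are cascaded, the overall efficiency is the product of the stage efficiencies, so losses compound from the wall to the load. A minimal sketch of that arithmetic; the per-stage efficiency values here are illustrative assumptions, not figures from the article:

```python
# Cascaded conversion: overall efficiency is the product of the stage
# efficiencies. The per-stage values below are illustrative assumptions.
stages = {
    "PFC front end (ac to -48V)": 0.90,
    "Bus converter (-48V to 12V)": 0.94,
    "POL converter (12V to Vcore)": 0.88,
}

overall = 1.0
for name, eff in stages.items():
    overall *= eff

load_power = 130 * 1.3  # 169 W delivered to the CPU, per the text
input_power = load_power / overall
print(f"Overall efficiency: {overall:.1%}")
print(f"Input power for {load_power:.0f} W at the load: {input_power:.0f} W")
```

Even with each stage above 88 percent efficient, roughly a quarter of the input power is lost before it reaches the processor.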
Who consumes power?
If we start at the load and work back toward the power source, we can see how the power is consumed.
In a typical server processor, the Vcore converter delivers 130 amps x 1.3 volts = 169 watts at approximately 88 percent efficiency. This creates a loss of 23 watts and requires an input of 192 watts. If the efficiency is increased by 2 points to 90 percent, the input is reduced by only about 4 watts. But if the converter is designed to require less cooling, or even no cooling, the resulting savings can eliminate a fan required to blow 200 linear feet per minute (LFM) against a static pressure of 0.25 inch-H2O. That savings can be 2 to 4 amps at 12 volts, or 20 to 50 watts.
In the past, when Vcore converter efficiency was only 75 percent, the loss would have been 56 watts per converter, or 112 watts for a dual-processor board; with the latest technology, that figure drops to only 46 watts. But now the power consumed by the fan is approximately equal to the losses of the converter.
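The loss arithmetic above can be checked directly. A small sketch using the article's numbers (169 W load, 75 percent then 88 percent efficiency, two processors per board):

```python
def converter_loss(p_out_w, efficiency):
    """Power dissipated in a converter delivering p_out_w at the given efficiency."""
    return p_out_w / efficiency - p_out_w

vcore_w = 130 * 1.3                      # 169 W delivered to the processor
old = converter_loss(vcore_w, 0.75)      # roughly 56 W per converter
new = converter_loss(vcore_w, 0.88)      # roughly 23 W per converter

print(f"Loss per converter at 75%: {old:.0f} W")
print(f"Loss per converter at 88%: {new:.0f} W")
print(f"Dual-processor total: {2 * old:.0f} W -> {2 * new:.0f} W")
```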
Therefore, designing the Vcore converter with a technology that eliminates the need for a dedicated fan can save more power than if the converter operated at 100 percent efficiency. Such a solution makes use of high-efficiency MOSFETs and a multi-phase controller. (Parts like IR's DirectFET transfer heat through PCB traces, and the XPhase controller uses up to 7 phases to promote switching-regulator efficiency.) A comparison of converter efficiency from 3 years ago and today shows how a combination of improved efficiency and improved MOSFET packaging can eliminate the need for dedicated cooling, reducing thermal density by 50 percent over the previous solution at the new, higher current (figure 3).
Figure 3: The efficiency of Pentium voltage regulator modules (VRMs) has improved dramatically—especially with current requirements in excess of 100 amps.
Less heat; more power
Whether the savings is achieved by reducing airflow or by eliminating the fan entirely, the opportunity exists to significantly reduce the power consumption of a typical server. There can be as many as 7 fans in a single 1U server, each drawing up to 1 amp (figure 4).
Figure 4: System efficiency can be improved by eliminating fan motors and carefully regulating the duty cycle of those remaining. More efficient thermal dissipation remains key.
More fans have been added because the back pressure of the new, higher-density servers requires more power to move the same airflow, as described in reference 13 on processor cooling. If the total power dissipated within the server is 500 watts, then each fan is cooling approximately 70 watts while consuming 12 watts to do so. From this simple illustration it becomes clear that simply adding more airflow has reached the point of diminishing returns. More efficient thermal design is now as important as more efficient power conversion.
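The diminishing return is easy to quantify: compare the heat each fan removes with the power each fan itself consumes. A sketch using the figures above (500 W dissipated, 7 fans at 1 A and 12 V):

```python
# Diminishing returns of adding fans: per-fan cooling burden versus the
# power each fan itself consumes (figures taken from the text).
total_dissipation_w = 500
fan_count = 7
fan_current_a = 1.0
fan_voltage_v = 12.0

cooled_per_fan = total_dissipation_w / fan_count   # roughly 70 W per fan
consumed_per_fan = fan_current_a * fan_voltage_v   # 12 W per fan

print(f"Each fan cools ~{cooled_per_fan:.0f} W while consuming {consumed_per_fan:.0f} W")
print(f"Cooling overhead: {consumed_per_fan / cooled_per_fan:.0%}")
```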
If we now step back and look at the entire power train, we can see the impact of fan motors in a broader view. Beginning with electronic loads totaling 280 watts (typical, not worst case), the fan losses are still among the largest. At 88 percent conversion efficiency, there is little room to reduce the converter losses further. Eliminating a fan can cut losses by 12 watts. Replacing the dc-dc converter with a new bus converter can cut the losses of that section by 12 watts. Improving the efficiency of the PFC section and off-line rectifier (bulk converter) can cut losses by another 41 watts. Together these account for about 50 percent of the losses at full load.
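Summing the savings named above gives a rough full-load loss budget; since the text says these items account for about half the losses, the total conversion and cooling loss can be implied as well. All figures come from the text; the 50 percent share is used as stated:

```python
# Rough full-load loss budget, using the savings quoted in the text.
savings_w = {
    "eliminate one fan": 12,
    "new bus converter": 12,
    "improved PFC + bulk rectifier": 41,
}
total_savings_w = sum(savings_w.values())

# The text states these items are about 50% of the full-load losses.
implied_total_losses_w = total_savings_w / 0.5

print(f"Recoverable losses: {total_savings_w} W")
print(f"Implied total conversion + cooling losses: {implied_total_losses_w:.0f} W")
```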
At light load the fan represents a much greater percentage of the total power losses (Table 1); that is why the fan is turned off whenever possible in many desktop and notebook computers. In servers, variable-speed fan drivers can dramatically improve light-load efficiency when used in conjunction with the power converter to match cooling capacity to power losses. Variable-speed drives can also reduce audible noise by raising the switching frequency above 20 kHz.
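A variable-speed policy of this kind can be sketched as a simple proportional map from estimated converter losses to fan duty cycle. The function, the 130 W full-speed point, and the carrier frequency below are hypothetical illustrations, not an actual fan-driver API:

```python
# Hypothetical proportional fan policy: match cooling to estimated losses.
FULL_SPEED_LOSS_W = 130.0   # assumed loss level that demands full airflow
PWM_FREQUENCY_HZ = 25_000   # carrier above 20 kHz keeps switching inaudible

def fan_duty(power_loss_w, full_speed_loss_w=FULL_SPEED_LOSS_W):
    """Map estimated power losses to a PWM duty cycle between 0 and 1."""
    duty = power_loss_w / full_speed_loss_w
    return max(0.0, min(1.0, duty))

print(fan_duty(0))     # idle: fan off
print(fan_duty(20))    # light load: roughly 15% duty
print(fan_duty(130))   # full load: full speed
```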
Table 1: Summary of Power Conversion and Cooling Losses. Even with light loads, fans absorb considerable percentages of available system power.
The outlook for the future
Past improvements in power conversion efficiency have dramatically reduced the total losses from ac prime power to the CPU input. Future improvements, however, will demand overall system solutions that simultaneously improve the thermal, mechanical, and power conversion subsystems to achieve new levels of energy utilization. Reviewing each stage of the power conversion process, to determine whether that stage is still required, whether its interface voltage is optimal, and whether the thermal packages of its components minimize the energy consumed to remove the dissipated power, will be essential to continuing the rapid improvements achieved during the past 8 years.