

Managing QoS for video delivery in shared-I/O environments

Shreyas Shah
Chief Systems Architect
PLX Technology

2/13/2009 2:12 AM EST

QoS (quality of service) has been discussed in the networking world for more than three decades. But as the differences between the computing and networking industries blur and scale-out architectures become more popular, the QoS that applications receive on compute farms has resurged as a hot topic in today's consolidated infrastructures and converged networks. These compute farms require mission-critical and non-mission-critical applications to run on the same physical hardware while still receiving differentiated services.

Compute farms also call for I/O QoS that can differentiate the various types of application and virtual-machine flows within the infrastructure. In a shared-I/O environment, I/O devices are shared across multiple processor complexes, which requires QoS built into both the I/O devices and the converged fabric. The converged fabric must carry that QoS through to the I/O devices to provide end-to-end application QoS.

QoS means different things to different people, and it is often a source of confusion among designers, not least those who develop video and imaging systems. Let's look at the role of QoS in consolidated and converged systems and the challenges designers face in implementing it, and then examine straightforward methods for successful QoS deployment.

Video and IPTV systems
Today's converged infrastructures carry voice, video and data traffic. Video is a bandwidth-hungry application that requires higher speeds and feeds from servers, switches and other systems within the infrastructure. Video is delivered in different forms, such as IPTV, surveillance-camera feeds and video on demand (VOD). Real-time applications such as these need QoS from both the network and the infrastructure. QoS is supported in data centers as well as in telecom networks for end-to-end service-level agreements and support for real-time applications.

Figure 1 shows video ENDECs on a PCI Express (PCIe) subsystem. This card needs QoS implementation within the PCIe switch. The PCIe system provides policing and shaping functionality as part of the QoS implementation.



Figure 1: Surveillance camera and video servers

Figure 1 illustrates how video is captured by the camera subsystem and sent as RGB (red, green and blue components) to the ENDEC, which digitizes the video signal and writes it to local CPU memory. The data is then sent to graphics cards or over a network for display at a common console. The PCIe switch plays a very important role here, providing QoS functionality that guarantees bandwidth and latency for the video traffic.

IPTV system
Figure 2 shows an IPTV system, in which servers read movies or video files from network-attached storage or local disks. The video is sent over IP through a network I/O device to the viewers. Again, a PCIe switch with QoS helps prioritize the IPTV traffic over other I/O traffic and guarantee its bandwidth and latency.



Figure 2: IPTV and VOD servers

In the case of VOD, the video files are read from the disks and sent over Ethernet I/O devices. This application requires QoS implementation within the PCIe switch as well as QoS implemented in the network.

Shared-I/O system
Figure 3 illustrates a shared-I/O system and the importance of QoS within it. In this example, a PCIe multi-root I/O virtualization (MR-IOV) switch needs QoS functionality implemented on a per-port basis to guarantee bandwidth and latency for the traffic, and to guarantee service-level agreements for the applications running in this converged and consolidated infrastructure.



Figure 3: Shared-IO system

QoS makes it possible to run mission-critical and non-mission-critical applications side by side on the same infrastructure, and hence to harvest the true power of virtualization: savings in power, cost and infrastructure management.

QoS in a nutshell
QoS, in a nutshell, is a technique that polices traffic when it arrives at a device or system and shapes it, based on egress parameters, when it leaves the chip or system. Once the traffic is shaped and the parameters are set correctly, the flow will be well behaved in the network. As shown in Figure 4, flow A arrives in system A, then leaves system A for system B, and so on. In each system, flows are policed at ingress and shaped at egress; QoS is applied to the flows hop by hop.



Figure 4: QoS polices traffic flows

Typically in the network, queues are implemented where most of the system's memory resides. The number of queues and flows depends on the complexity of the chips and systems. Flows are classified before policing functions are applied. The chips used in these systems should support QoS so that service levels stay consistent and continuous with those of the preceding chips or systems. These systems, along with selected applications, support end-to-end QoS on their application flows. The egress shapers and schedulers employ different algorithms (explained later in this article), and the shaper shapes the traffic before it leaves the system.

Ingress flow classification
When traffic is received at the chip or system, the first step is flow classification. Classification identifies the flow and generates a flow ID (FlowId), which maps to a queue number or indexes into a table entry that holds the queue number. Classification can be implemented with a variety of algorithms, the simplest being a TCAM lookup. Another approach, also very popular in QoS implementations, hashes certain fields of the packet to produce the FlowId (index). In the case of contention for a hash entry, the flows are serviced by control-plane processors, which reprogram them in SRAM with either a different hash algorithm or an increased bucket size to accommodate the newly searched entry.

What is classified depends on where classification is applied. In a typical network, flows are identified by a five-tuple (source MAC, destination MAC, source IP, destination IP, protocol type). If early de-multiplexing is used to identify flows in servers, the key can grow to a seven-tuple by adding the TCP/UDP source and destination port numbers. This level of granularity gives the system designer and administrator a way to prevent malicious attacks as well as to provide QoS to the applications running on these systems.
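The hash-based classification described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular chip: the table size, the queue count, the use of CRC-32 as the hash, and the round-robin queue assignment for new flows are all assumptions made for the example, and the names `FlowKey`, `FlowTable` and `flow_id` are hypothetical.

```python
import zlib
from typing import NamedTuple

NUM_BUCKETS = 1024  # assumed hash-table size
NUM_QUEUES = 8      # assumed number of queues


class FlowKey(NamedTuple):
    """The five fields the article lists for network flow identification."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    protocol: int


def flow_id(key: FlowKey) -> int:
    """Hash the key fields into a table index (the FlowId)."""
    data = "|".join(str(field) for field in key).encode()
    return zlib.crc32(data) % NUM_BUCKETS


class FlowTable:
    def __init__(self):
        self.buckets = {}    # FlowId -> (FlowKey, queue number)
        self.next_queue = 0  # illustrative round-robin queue assignment

    def classify(self, key: FlowKey):
        """Return (FlowId, queue number) for a packet's flow key."""
        fid = flow_id(key)
        entry = self.buckets.get(fid)
        if entry is None:
            # First packet of a new flow: install an entry.
            queue = self.next_queue
            self.next_queue = (self.next_queue + 1) % NUM_QUEUES
            self.buckets[fid] = (key, queue)
            return fid, queue
        if entry[0] != key:
            # Two different flows hashed to the same entry. In the article
            # this case is handed to a control-plane processor.
            raise LookupError("hash contention: punt to control plane")
        return fid, entry[1]
```

Every packet of the same flow produces the same FlowId and therefore lands in the same queue, which is what lets the later policing and shaping stages treat the flow as a unit.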

Once the flows are identified, an access control list (ACL) stage decides whether to drop, allow or redirect them. A flow marked "allow" becomes a candidate for the policing function. Some chips and systems also keep flow counters, which help in debugging and provide statistics on the flows being actively monitored and managed. Both proprietary and public management standards exist for measuring the statistics of flows passing through the system.
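As a rough sketch of that ACL stage with flow counters, assuming the ACL is a simple table keyed by FlowId with a default action of drop (the class and field names here are hypothetical, and real devices usually match on packet fields rather than a precomputed ID):

```python
from collections import Counter
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    DROP = "drop"
    REDIRECT = "redirect"


class AclStage:
    def __init__(self, rules, default=Action.DROP):
        self.rules = dict(rules)  # FlowId -> Action
        self.default = default    # action for flows with no rule (assumed)
        self.packets = Counter()  # (FlowId, Action) -> packet count
        self.bytes = Counter()    # (FlowId, Action) -> byte count

    def check(self, flow_id, length):
        """Look up the action for a flow and update its counters."""
        action = self.rules.get(flow_id, self.default)
        self.packets[(flow_id, action)] += 1
        self.bytes[(flow_id, action)] += length
        return action
```

Only packets for which `check` returns `Action.ALLOW` would go on to the policing function; the counters are the kind of per-flow statistics that management software could read back.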



Ingress policing function
Typically, in network or I/O devices, the policing function is applied after flow identification, once the flow engine has allowed the packets into the system.



Figure 5: Network ingress flow classification and policing function

In network and I/O devices, the algorithm used for the policing function is typically RFC 2697 (SrTCM, the single-rate three-color marker) or RFC 2698 (TrTCM, the two-rate three-color marker). Either can be implemented in color-aware or color-blind mode. The packet is marked green, yellow or red, based on the flow pattern entering the system. Once the color is identified, some systems apply a WRED (weighted random early discard) algorithm and accept or drop packets based on the queue depth accumulated within the device. RED/WRED monitors queue occupancy and keeps it at a level where bursty traffic does not overflow the queue. The algorithm calculates a drop probability from the queue occupancy, using a non-linear function or an equivalent implementation, and drops or accepts each packet accordingly. The profile applied depends on the color: at the same queue occupancy, a red packet typically has a higher drop probability than a green one. Once WRED accepts the packet, it is placed in a queue and will be shaped before it leaves the chip or system (see Figure 5). The queue number was either generated at the flow classification stage or is looked up in a table using a tag generated at that stage.
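To make the two steps concrete, here is a minimal sketch of a color-blind single-rate meter in the style of RFC 2697, followed by a per-color WRED drop check. Several things are assumptions for illustration only: time is passed in explicitly rather than read from a clock, tokens are refilled continuously rather than in discrete ticks, the WRED curve is the classic piecewise-linear RED ramp, the threshold values are made up, and the average queue depth is supplied by the caller rather than computed.

```python
import random


class SrTcm:
    """Color-blind single-rate three-color meter (after RFC 2697)."""

    def __init__(self, cir, cbs, ebs):
        self.cir = cir  # committed information rate, bytes per second
        self.cbs = cbs  # committed burst size, bytes
        self.ebs = ebs  # excess burst size, bytes
        self.tc = float(cbs)  # committed bucket starts full
        self.te = float(ebs)  # excess bucket starts full
        self.last = 0.0

    def _refill(self, now):
        tokens = (now - self.last) * self.cir
        self.last = now
        # Fill the committed bucket first; overflow goes to the excess bucket.
        to_c = min(tokens, self.cbs - self.tc)
        self.tc += to_c
        self.te = min(self.ebs, self.te + tokens - to_c)

    def mark(self, length, now):
        """Return the color for a packet of `length` bytes arriving at `now`."""
        self._refill(now)
        if self.tc >= length:
            self.tc -= length
            return "green"
        if self.te >= length:
            self.te -= length
            return "yellow"
        return "red"


# color -> (min threshold, max threshold, max drop probability); made-up values
WRED_PROFILES = {
    "green": (40, 60, 0.02),
    "yellow": (25, 50, 0.10),
    "red": (10, 30, 0.50),
}


def wred_drop_probability(color, avg_queue_depth):
    """Drop probability for a packet of this color at this queue depth."""
    min_th, max_th, max_p = WRED_PROFILES[color]
    if avg_queue_depth < min_th:
        return 0.0
    if avg_queue_depth >= max_th:
        return 1.0
    return max_p * (avg_queue_depth - min_th) / (max_th - min_th)


def wred_accept(color, avg_queue_depth, rng=random.random):
    """Randomly accept or drop according to the color's profile."""
    return rng() >= wred_drop_probability(color, avg_queue_depth)
```

With these profiles, a red packet is always dropped earlier and more often than a green one at the same queue depth, which is the differentiation the color marking exists to provide.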

Egress shaper
The egress shaper sets the rate at which traffic is dequeued (read) from the queues. Packets are dequeued based on an algorithm like SrTCM (RFC 2697) or TrTCM (RFC 2698), programmed as a scheduler rather than as a policing function (see Figure 6). Tokens accumulate at the specified rate, and packets are dequeued as tokens become available.



Figure 6: Egress shaper

The scheduler schedules a packet if the length of the packet at the head of the queue is less than or equal to the number of tokens available, and then subtracts the packet length from the token count.

The rate at which packets are dequeued is determined by the CIR (committed information rate) alone or by the CIR together with the PIR (peak information rate). These rates are programmable and exposed to the system's management software, so the administrator can program the shapers based on application requirements.
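The token mechanism described above can be sketched as a single-rate token-bucket shaper. This is a simplified illustration that assumes one queue, a CIR-only configuration and caller-supplied timestamps; `TokenBucketShaper` and its method names are hypothetical.

```python
from collections import deque


class TokenBucketShaper:
    def __init__(self, cir, cbs):
        self.cir = cir            # committed information rate, bytes per second
        self.cbs = cbs            # committed burst size, bytes (bucket depth)
        self.tokens = float(cbs)  # bucket starts full
        self.last = 0.0
        self.queue = deque()      # packet lengths in bytes, in arrival order

    def enqueue(self, length):
        self.queue.append(length)

    def dequeue_ready(self, now):
        """Return the lengths of the packets that may be sent at time `now`."""
        # Tokens accumulate at the programmed rate, capped at the bucket depth.
        elapsed = now - self.last
        self.tokens = min(self.cbs, self.tokens + elapsed * self.cir)
        self.last = now
        sent = []
        # Send while the head packet fits in the available tokens.
        while self.queue and self.queue[0] <= self.tokens:
            length = self.queue.popleft()
            self.tokens -= length
            sent.append(length)
        return sent
```

A packet that does not fit simply waits at the head of the queue until enough tokens have accumulated, which is what distinguishes shaping (delay) from policing (mark or drop).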

Priority queues
Once the packets are scheduled to go out based on the programmed CIR/PIR and CBS/PBS parameters, they are enqueued in the priority queues. These queues can be served with strict priority or with weighted round robin. The priority groups can match those defined in the IEEE CEE/DCE specifications.

Strict priority provides the least delay through the device or system for the highest-priority queue. Weighted round robin ensures that lower-priority groups do not starve when traffic on a high-priority queue is very heavy.
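The difference between the two service disciplines is easy to see in a small sketch. The version below assumes packet-count weights and treats the first queue in the list as the highest priority; both function names are hypothetical.

```python
from collections import deque


def strict_priority(queues):
    """Always serve the first (highest-priority) non-empty queue."""
    while any(queues):
        for q in queues:
            if q:
                yield q.popleft()
                break  # restart the scan from the highest priority


def weighted_round_robin(queues, weights):
    """Serve up to weights[i] packets from queue i in each round."""
    while any(queues):
        for q, w in zip(queues, weights):
            for _ in range(w):
                if not q:
                    break
                yield q.popleft()
```

Under strict priority, the low-priority queue is served only when every higher-priority queue is empty; under weighted round robin, it gets a turn in every round regardless of how busy the high-priority queue is.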

Egress scheduler
The egress shaper shapes the traffic and adds it to the priority queues. Round robin or strict priority is then applied to the queues before traffic leaves the system, and priority groups can be applied to the queues. Typical network systems apply weighted round robin (WRR) or deficit weighted round robin (DWRR); because DWRR accounts for packet length, it tracks the configured weights more closely when packet sizes vary. As shown in Figure 7, packets are dequeued at the shaped rate and leave the system based on the priority scheduling algorithm. Plain round-robin (RR) arbitration keeps all the queues moving at the same speed, whereas DWRR differentiates the traffic based on the weights assigned to queues or queue groups.



Figure 7: Egress shaper and scheduler
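A compact sketch of deficit weighted round robin shows how the weights translate into bandwidth shares. It assumes each queue holds packet lengths in bytes and that each weight is expressed as a positive per-round byte quantum; the function name `dwrr` and the return format are choices made for this example.

```python
from collections import deque


def dwrr(queues, quanta):
    """Drain the queues with deficit weighted round robin.

    queues: list of deques of packet lengths in bytes.
    quanta: bytes credited to each queue per round (must be positive).
    Returns the service order as (queue index, packet length) pairs.
    """
    deficits = [0] * len(queues)
    order = []
    while any(queues):
        for i, q in enumerate(queues):
            if not q:
                deficits[i] = 0  # an idle queue does not bank credit
                continue
            deficits[i] += quanta[i]
            # Send head packets while they fit in the accumulated deficit.
            while q and q[0] <= deficits[i]:
                length = q.popleft()
                deficits[i] -= length
                order.append((i, length))
            if not q:
                deficits[i] = 0
    return order
```

Because unused credit carries over to the next round, a queue with large packets and a small quantum still gets its share over time, just less often than a queue with a larger quantum.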

Conclusion
This overview of QoS as implemented in I/O devices and network systems should give designers a greater understanding of the major role QoS plays in application service-level agreements in shared-I/O environments. QoS also helps in running both mission-critical and non-mission-critical applications on the same physical servers. Designers will want non-mission-critical traffic to be throttled whenever mission-critical traffic is passing through the systems.

Video and imaging, as well as a number of other market segments, are embracing QoS; they're evolving the technology and taking advantage of it in servers, I/O devices and network switches for end-to-end QoS applications. Early de-multiplexing techniques could be implemented in I/O devices to de-multiplex traffic based on protocol type and fill the queues associated with each protocol type, which is highly useful in shared-I/O environments.

About the author
Shreyas Shah is chief systems architect at PLX Technology, a provider of PCI Express and other I/O interconnect technology for the communications, server, storage, embedded-control, and consumer markets. He has more than 15 years of experience in engineering, with an emphasis on the computing, networking and storage communications market segments. Shah holds an MSEE from the Indian Institute of Technology, Bombay, India. He can be reached at shreyas@plxtech.com.

