Before using a TCP/IP stack in an embedded product, it is important to acknowledge and understand the reasons to do so. Obviously, the product is to be connected to an IP network. We can say the embedded system in this case requires connectivity. An embedded system might be called on to exchange large amounts of data over a reliable connection. In this case, we can say that the embedded system demands performance.
Most embedded system resources are extremely limited when compared to the resources available on a desktop or laptop PC. Product manufacturers must create products at the lowest cost possible to offer them at the best possible price to their customers, yet work within the constraints of RAM, CPU speed, and peripheral hardware performance inherent in hardware platforms used for embedded design. With limited hardware resources, how can embedded designers meet system design requirements? They begin by asking and answering a fundamental question:
Do you need a TCP/IP stack…
- to connect to an IP network without any minimum performance requirement?
- to connect to an IP network and obtain high throughput?
The answer to this question has a major impact on hardware choices that ultimately drive product cost. These hardware choices include CPU performance, NIC interface type, and RAM availability.
Connectivity, throughput, and bandwidth are concepts that shape the configuration of system hardware and software parameters. Let's look at an overview of each:
As a best practice, the performance of an Ethernet connection should be measured in Megabits per second (Mbps). This allows us to easily compare system performance with respect to the Ethernet link’s maximum bandwidth.
Currently, Ethernet over twisted pair is the preferred physical medium. The available bandwidth of the link is normally 10 Mbps, 100 Mbps or 1 Gbps. These numbers are used as the reference for the efficiency of an Ethernet NIC. For example, if we have an Ethernet NIC with a 100 Mbps link, we already know that our embedded system's maximum bandwidth is 100 Mbps. However, there are a number of limiting factors in embedded systems that do not allow them to reach what we call the Ethernet line speed, in this case 100 Mbps. Such factors include duplex mismatch, TCP/IP stack performance based on CPU speed, RAM available for buffers, DMA vs. non-DMA Ethernet driver design, performance related to clock and peripheral power management, and the use of a true zero-copy architecture. These embedded system bandwidth-limiting factors are discussed in this and subsequent chapters.
Connectivity in this context is the exchange of information without any performance constraints. Many embedded systems requiring connectivity only may work optimally with hardware and software that provide a low-bandwidth TCP/IP connection.
For example, if an embedded system is sending or receiving a few hundred bytes every second (let's say of sensor data), then the constraints on the system are fairly relaxed. It means that the CPU may be clocked at a lower speed. It may also mean that if the NIC is Ethernet, it can be a 10-Mbps instead of a 100-Mbps interface, and the RAM requirement is reduced since there is less data flowing in the system.
A system that needs throughput can be one that transmits or receives streamed video, for example. Streamed video transmission can be anything from a few megabits per second (Mbps) to many Mbps depending on the signal quality and the compression rate used.
This type of application requires an embedded system with sufficient resources to achieve higher bandwidth than a “connectivity-only” system. Constraints on the NIC, CPU and RAM availability are clearly higher. For the CPU and NIC, these issues are hardware dependent, but for RAM usage, the constraints are related to software and the requirements of the application.
The transport protocols at Layer 4 have the greatest influence on RAM usage. It is at this layer, for example, that flow control, or how much data is in transit in the network between the hosts, is implemented. The basic premise of flow control is that the more data in transit, the more RAM is required by the system to handle the data volume. Details on how these protocols work and their impact on RAM usage are located in Chapter 7, “Transport Protocols” on page 167.
Achieving high throughput in a system requires greater resources. The question becomes, how much? Each element influencing performance must be analyzed separately.
There is an inherent asymmetry in a TCP/IP stack whereby it is simpler to transmit than to receive. Substantially more processing is involved in receiving a packet as opposed to transmitting one, which is why embedded system transmit speeds are typically faster. We therefore say that most embedded targets are slow consumers.
Let's look at a personal computer by way of an example. On a PC, the CPU is clocked at approximately 3 GHz and has access to gigabytes of memory. These high-powered computers invariably have an Ethernet NIC with its own processor and dedicated memory (often megabytes worth). However, even with all of these resources, we sometimes question our machine's network performance!
Now, imagine an embedded system with a 32-bit processor clocked at 70 MHz and containing a mere 64 Kbytes of RAM, much of which must be allocated to duties apart from networking. The Ethernet controller is capable of 100 Mbps; semiconductor manufacturers integrate controllers supporting the standard link bandwidths of 10 Mbps, 100 Mbps, and 1 Gbps into their microcontrollers. However, it is unrealistic to ask a 70-MHz processor with only 64 Kbytes of RAM to achieve this performance level: the CPU may not be able to fill the link to its maximum capacity, however efficient the software. A high-performance PC as described above will have no trouble transmitting Ethernet frames at bandwidths approaching the Ethernet line speed. However, if an embedded system is connected to such a PC, it is very possible that the embedded system will not be able to keep up with the high data rates and therefore some of the frames will be lost (dropped).
Performance is not only limited by the embedded system’s CPU, but also by the limited amount of RAM available to receive packets. In the embedded system, packets are stored in buffers (called network buffers) that are processed by the CPU. A network buffer contains one Ethernet frame plus control information concerning that frame. The maximum Ethernet frame payload is 1500 bytes, so each network buffer requires at least that much RAM plus space for the control information. On our PC, in comparison, there is sufficient RAM to configure hundreds (possibly even thousands) of network buffers, yet this is typically not the case for an embedded target. Certain protocols will have difficulty performing their duties when the system has few buffers.
Packets generated by a fast producer and received by the target will consume most or all the TCP/IP stack network buffers and, as a result, packets will be dropped. This point will be explained in greater detail when we look at Transport protocols.
Hardware features such as Direct Memory Access (DMA) and CPU speed may improve this situation. The faster the target can receive and process the packets, the faster the network buffers are freed. No matter how quickly data comes in or goes out, the CPU still must process every single byte.
Chapters of "µC/TCP-IP, the embedded protocol" are provided to UBM for users to download and consult, not for resale. Users wanting to know more about TCP/IP and Micrium can visit Micrium.