Sustaining competitive edge: Benefits of hierarchical Ethernet switching in next-generation 40 G AdvancedTCA platforms
Wireless network migration to 4G architecture requires AdvancedTCA platforms that deliver increased packet throughput, reduced packet switching latency, and flexibility to adapt to future requirements and changes. This article discusses how Ethernet switching, implemented in a hierarchical fashion, can help solve these critical requirements in 40G AdvancedTCA platforms, enabling the design and delivery of competitive network elements.
Global mobile traffic is growing at a breathtaking pace. It is projected to increase a staggering 39-fold between 2009 and 2014[[i]]. Driving the demand are rising expectations and a growing appetite for information, including business and consumer applications and entertainment delivered through mobile devices, anywhere, any time, and at greater speeds. For wireless carriers to achieve greater speeds and pervasive connectedness, their networks need to start behaving more like landline broadband networks. This line of thinking represents a fundamental shift in perspective, from mobile services to broadband connections, for customers, service providers, and telecom equipment manufacturers alike.
Enter the 4th generation (4 G) wireless network. Unlike earlier wireless standards, 4 G technology is based on TCP/IP, the core protocol of the Internet. TCP/IP enables wireless networks to deliver higher-level services, such as video and multimedia, in a cost-effective manner while supporting the devices and applications of the future. While there are known advantages of deploying 4 G networks in terms of higher bandwidth, there are significant challenges to delivering high-bandwidth, real-time applications with existing wireless infrastructure. This article discusses how a hierarchical Ethernet switching architecture will enable the creation of wireless solutions that not only enable 4 G applications today, but also future-proof them for tomorrow.
Breaking the packet per second bottleneck
In packet processing systems, the key bottleneck is not necessarily bits-per-second throughput, but rather packets-per-second throughput. From that perspective, a 40 Gbps link can carry roughly 60 million packets per second when the smallest 64-byte packets are used, counting the 20 bytes of preamble and inter-frame gap that accompany every Ethernet frame on the wire. This is one packet every 17 ns. In the ATCA architecture, the ATCA hub, the main Ethernet switch blade interconnecting all other ATCA blades in the chassis, needs to provide at least 18 such 40 Gbps links:
· 12-14 links to other payload blades (depending on the chassis size)
· 1 link for hub-to-hub connectivity
· 2-4 external links
· Links to additional processing elements on the hub, such as those installed in AdvancedMC (AMC) sites
Consequently, at the ATCA hub level, eighteen 40 Gbps links result in a 720 Gbps aggregate throughput (1071 million packets per second). Not only should the Ethernet switch provide that much aggregate throughput, but more importantly, it should have the capability to sustain that bandwidth while performing diverse packet-handling tasks such as Layer 2 and Layer 3 switching, ACL processing, and traffic management, enabling a truly non-blocking 40 Gb ATCA system.
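The arithmetic behind these figures can be reproduced with a few lines of Python. This is a minimal sketch, not a vendor sizing tool: the function name `packets_per_second` is made up for illustration, and the calculation assumes standard Ethernet framing, where every frame carries 20 extra bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap).

```python
# Per-frame wire overhead on Ethernet: 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int = 64) -> float:
    """Maximum frame rate on one link for a fixed frame size (hypothetical helper)."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


# One 40 Gbps link carrying minimum-size 64-byte frames.
per_link = packets_per_second(40e9)
print(f"per link: {per_link / 1e6:.1f} Mpps, one packet every {1e9 / per_link:.1f} ns")

# Eighteen such links at the hub.
hub = 18 * per_link
print(f"hub: {18 * 40} Gbps aggregate, {hub / 1e6:.0f} Mpps")
```

Larger frames lower the packet rate proportionally, which is why the 64-byte case is the one that stresses the switch's lookup and classification pipeline.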
Higher performance packet processing
In the ATCA architecture, Ethernet switching is directly responsible for packet distribution, also known as load balancing. Load balancing starts with packet parsing and classification. In fact, flexible and efficient packet classification is the key requirement for a number of follow-on packet processing tasks, such as Access Control List (ACL) functionality, packet steering and load balancing, traffic management and policing, and packet processing offloading. The packet parser and classifier within the Ethernet switch need to look deep into packet data, at least 128 bytes deep. They must be flexible enough to support a variety of protocols and their corresponding headers, support wild cards and arbitrary bit fields, and, most importantly, adapt to new protocols and protocol changes. This means that the packet parsing and classification engine cannot be implemented in hard logic; it needs to be microcode-based and have the option to be upgraded in the field. Selecting an ATCA hub that employs an Ethernet switch with microcode-based hardware architecture will ensure high performance levels that can be maintained under all conditions, as well as provide the flexibility to adapt to new protocols or header processing requirements.
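To make the idea of wild cards and arbitrary bit fields concrete, the following is a minimal software sketch of value/mask (ternary) matching over the first 128 bytes of a packet. It is an illustration only and does not describe any particular switch device: the names `Rule`, `matches`, and `classify` are hypothetical, and the byte offsets assume an untagged Ethernet II frame carrying IPv4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    offset: int   # byte offset into the packet where matching starts
    value: bytes  # expected bytes
    mask: bytes   # 1-bits are compared, 0-bits are wild cards
    action: str


def matches(rule: Rule, packet: bytes) -> bool:
    """True if every masked bit of the packet window equals the rule value."""
    window = packet[rule.offset:rule.offset + len(rule.value)]
    if len(window) < len(rule.value):
        return False
    return all((p & m) == (v & m) for p, v, m in zip(window, rule.value, rule.mask))


def classify(rules, packet: bytes, depth: int = 128, default: str = "forward") -> str:
    """Return the action of the first matching rule, looking only `depth` bytes deep."""
    header = packet[:depth]
    for rule in rules:
        if matches(rule, header):
            return rule.action
    return default


rules = [
    # EtherType 0x0800 (IPv4) at bytes 12-13, byte 14 wild-carded,
    # DSCP field (top 6 bits of byte 15) equal to 46 (Expedited Forwarding).
    Rule(12, bytes.fromhex("080000b8"), bytes.fromhex("ffff00fc"), "priority-queue"),
    # Any other IPv4 packet.
    Rule(12, bytes.fromhex("0800"), bytes.fromhex("ffff"), "ipv4"),
]


def frame(ethertype: int, tos: int) -> bytes:
    """Build a dummy frame: 12 address bytes, EtherType, 2 IPv4 bytes, padding."""
    return bytes(12) + ethertype.to_bytes(2, "big") + bytes([0x45, tos]) + bytes(46)


print(classify(rules, frame(0x0800, 0xB9)))  # DSCP 46, ECN bits ignored by the mask
print(classify(rules, frame(0x0800, 0x00)))
print(classify(rules, frame(0x0806, 0xB8)))  # not IPv4, so the DSCP rule must not fire
```

Because rules are plain data rather than fixed logic, supporting a new header means loading new rules, which is the software analogue of the field-upgradable microcode argued for above.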
Once packets are properly parsed and classified, they can be distributed and steered to specific payload blades. Load balancing (Figure 1) can be based on packet types, protocols being used, address ranges, or even custom bit fields within the packet header. Hashing algorithms can be used to evenly distribute packet flows across multiple ATCA blades and multiple processors on each blade. A typical ATCA payload blade has multiple packet processing elements – consequently, packet load balancing needs to occur at multiple levels. First, the packet is steered to the appropriate blade by an ATCA hub blade. Next, the packet should be steered to an appropriate packet processor by the switch on the payload blade. Flexibility of the packet parser and classifier, sophistication of the hashing algorithms, and a hierarchy of Ethernet switches are the keys to achieving a well-balanced system, which in turn results in 40 Gbps performance levels.
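The two-level steering described here can be sketched in software as follows. This is an illustration under stated assumptions, not how any specific switch computes its hash: the function names are hypothetical, the flow key is assumed to be the usual 5-tuple, and SHA-256 stands in for whatever hardware hash a real device implements.

```python
import hashlib


def flow_digest(five_tuple) -> bytes:
    """Hash a flow key (src IP, dst IP, protocol, src port, dst port) to a digest."""
    key = "|".join(str(field) for field in five_tuple).encode()
    return hashlib.sha256(key).digest()


def steer(five_tuple, n_blades: int, n_processors: int):
    """Pick a payload blade (hub level) and a processor on it (blade level)."""
    digest = flow_digest(five_tuple)
    # The two levels use different digest bytes so the choices are uncorrelated.
    blade = int.from_bytes(digest[0:4], "big") % n_blades
    processor = int.from_bytes(digest[4:8], "big") % n_processors
    return blade, processor


flow = ("10.0.0.1", "192.168.1.20", 17, 40000, 2152)
print(steer(flow, n_blades=12, n_processors=4))
```

Every packet of a flow hashes to the same blade and processor, which preserves packet order within the flow. Taking the two selections from different digest bytes matters: if both levels reduced the same hash value, all flows arriving at a given blade would tend to land on the same processor.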
Even when the 40 Gbps data stream is properly load balanced and distributed between multiple payload blades, it still poses a huge workload for packet processors. Ethernet switches can be very helpful in offloading tasks such as tunneling. Telecom applications use a variety of tunnels, such as GTP (Figure 2), which is commonly used in GSM, UMTS, and LTE networks; MPLS, VPWS, and VPLS, which are commonly used by wireline service providers; and IPv6 tunneling and Network Address Translation (NAT), which play a role in almost any network. Although packet processors are well suited to perform such tunneling and translation operations, new, sophisticated Ethernet switches can be used to offload these tasks. In such an offload architecture, packet processors need to dynamically control and configure Ethernet switches; therefore, it is essential that each ATCA payload blade has its own Ethernet switch.
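As an example of the per-packet work involved, the sketch below adds and removes the mandatory 8-byte GTP-U header around a user packet. It is a simplified illustration: the function names are hypothetical, only the basic G-PDU header with no optional fields is handled, and the outer UDP/IP headers that carry GTP-U in a real network are omitted.

```python
import struct

GTPU_FLAGS = 0x30  # version 1, protocol type GTP, no optional fields present
GTPU_G_PDU = 0xFF  # message type: G-PDU (carries a user packet)


def gtpu_encapsulate(teid: int, inner_packet: bytes) -> bytes:
    """Prepend the 8-byte GTP-U header: flags, type, payload length, tunnel ID."""
    header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_G_PDU, len(inner_packet), teid)
    return header + inner_packet


def gtpu_decapsulate(frame: bytes):
    """Strip the GTP-U header and return (tunnel ID, inner packet)."""
    flags, msg_type, length, teid = struct.unpack("!BBHI", frame[:8])
    if flags != GTPU_FLAGS or msg_type != GTPU_G_PDU:
        raise ValueError("not a plain G-PDU")
    return teid, frame[8:8 + length]


tunnelled = gtpu_encapsulate(0x1234ABCD, b"user-ip-packet")
print(tunnelled[:8].hex())
print(gtpu_decapsulate(tunnelled))
```

The operation is simple but must be repeated for every packet in every tunnel, which is what makes it a candidate for offloading to a switch that the packet processor configures with the tunnel identifiers.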
Lowering latency, cutting congestion
Voice and video traffic is latency- and jitter-sensitive, and considering that next-generation networks are oriented toward multimedia traffic, low latency requirements come to the top of the list. In fact, the LTE specification is very explicit in setting significantly more stringent latency requirements. A number of factors can influence the overall packet processing latency within the network and within the ATCA system, including:
· Latency within the Ethernet switching silicon
· Congestion and congestion management
Perhaps the most obvious latency component is the latency within the Ethernet switch silicon itself. While traditional Ethernet switches perform store-and-forward operation, some switches implement cut-through switching, which starts transmitting a packet on an egress port even before the whole packet has been received, resulting in a packet switching latency of only a couple hundred nanoseconds as compared to multiple microseconds. Although at first glance these latency values appear to be small, they do start to add up when considering the multiple hops that a packet needs to make when being sent from one blade to another, and from one ATCA system to another. Selecting an ATCA system that employs cut-through switching will ensure that Ethernet switches add as little latency as possible.
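How the per-hop difference accumulates can be estimated with a simple model. The sketch below is a back-of-the-envelope illustration, not measured data: it assumes a store-and-forward switch must receive the entire frame before forwarding it, a cut-through switch waits only for the header, and both add the same internal pipeline delay. The 200 ns pipeline figure, the 64-byte header depth, and the three-hop path (payload blade switch, hub, payload blade switch) are assumptions chosen for illustration.

```python
def serialization_delay_ns(num_bytes: int, link_bps: float) -> float:
    """Time to clock num_bytes off a link, in nanoseconds."""
    return num_bytes * 8 / link_bps * 1e9


def store_and_forward_hop_ns(frame_bytes: int, link_bps: float, pipeline_ns: float) -> float:
    """The whole frame is received before forwarding begins."""
    return serialization_delay_ns(frame_bytes, link_bps) + pipeline_ns


def cut_through_hop_ns(header_bytes: int, link_bps: float, pipeline_ns: float) -> float:
    """Forwarding begins once the header has been received."""
    return serialization_delay_ns(header_bytes, link_bps) + pipeline_ns


LINK_BPS = 40e9      # backplane link speed from the article
PIPELINE_NS = 200.0  # assumed internal switch delay, for illustration only
HEADER_BYTES = 64    # assumed bytes needed before a forwarding decision
HOPS = 3             # assumed path: blade switch -> hub -> blade switch

for frame_bytes in (64, 512, 1518):
    saf = HOPS * store_and_forward_hop_ns(frame_bytes, LINK_BPS, PIPELINE_NS)
    ct = HOPS * cut_through_hop_ns(HEADER_BYTES, LINK_BPS, PIPELINE_NS)
    print(f"{frame_bytes:5d}-byte frame: store-and-forward {saf:7.1f} ns, cut-through {ct:7.1f} ns")
```

In this model the cut-through figure is independent of frame size, while the store-and-forward figure grows with both frame size and hop count; real devices add further buffering and arbitration delays that the model leaves out.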
Although engineers strive to design networks and systems with sufficient processing and connectivity resources, congestion will always occur, resulting in packet buffering, increased latency, and eventually packet drops. In most cases congestion cannot be avoided, but it can be managed in a graceful way, making sure that Service Level Agreements (SLAs) are met and latency for voice and video packets is kept low. Once again, it all starts with the packet parser and classifier, which help to identify priority packets, such as voice and video, as well as different flows. The traffic manager helps to tri-color mark packets, and the policer can drop packets that exceed the excess rate profile. This ensures fairness between flows and helps enforce SLAs. The number of flows and corresponding SLAs that can be identified, monitored, and enforced depends directly on the amount of resources (tables, rules, counters, policers) available within the Ethernet switch. To this end, having Ethernet switches on the ATCA payload blades dramatically increases the amount of those resources. In the case of a 14-slot ATCA chassis employing 10 payload blades, the number of all resources could increase by 10x if a hierarchical Ethernet switch architecture – that is, Ethernet switches on both payload blades and hub – were used.
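Tri-color marking can be illustrated with a small token-bucket model. The article does not say which marking scheme a given switch uses, so the sketch below assumes one common choice, the two-rate three-color marker of RFC 2698 in color-blind mode. The class name and the rate values are hypothetical; rates are in bytes per second and burst sizes in bytes.

```python
class TwoRateThreeColorMarker:
    """Color-blind two-rate three-color marker in the style of RFC 2698."""

    def __init__(self, cir: float, cbs: float, pir: float, pbs: float):
        self.cir, self.cbs = cir, cbs  # committed rate and burst size
        self.pir, self.pbs = pir, pbs  # peak rate and burst size
        self.tc, self.tp = float(cbs), float(pbs)  # both buckets start full
        self.last = 0.0

    def mark(self, now: float, size: int) -> str:
        """Return 'green', 'yellow', or 'red' for a packet of `size` bytes at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill both buckets for the elapsed time, capped at their burst sizes.
        self.tc = min(self.cbs, self.tc + elapsed * self.cir)
        self.tp = min(self.pbs, self.tp + elapsed * self.pir)
        if self.tp < size:
            return "red"     # exceeds the peak profile: candidate for dropping
        self.tp -= size
        if self.tc < size:
            return "yellow"  # within peak but above committed: drop first under congestion
        self.tc -= size
        return "green"       # within the committed profile


marker = TwoRateThreeColorMarker(cir=1000, cbs=1500, pir=2000, pbs=3000)
for t in (0.0, 0.0, 0.0, 1.5):
    print(t, marker.mark(t, 1500))
```

Each flow that is policed this way needs its own bucket state and counters, which is why the number of enforceable SLAs is bounded by the policer resources in the switch.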
After traffic management, the next step is flow control. Priority Flow Control (PFC) can be invoked to slow the packet queues that are less sensitive to delay and jitter, allowing packets such as voice and video to go through. PFC can also be used as link-level flow control for lossless operation. Another scheme, Enhanced Transmission Selection (ETS), provides minimum bandwidth guarantees for special traffic classes. Quantized Congestion Notification (QCN), a congestion feedback mechanism, can be used to reduce fabric congestion. These are among a number of networking technologies that can be employed for congestion management. Selecting an ATCA system architecture that uses a hierarchy of Ethernet switches and employs switch devices that support most congestion management protocols, such as GE Intelligent Platforms’ next-generation ATCA platform, will result in the highest system performance, with the lowest latency and the fewest dropped packets.
Making the switch
In summary, emerging wireless network architectures require orders of magnitude higher data bandwidth coupled with the capability to handle more complex packet processing tasks to provide real-time applications. GE Intelligent Platforms is addressing these requirements with new, innovative products by using hierarchical switching in a next-generation ATCA platform. When evaluating next-generation platforms, consider not only 40 G backplane connectivity, but also make sure that the selected platform provides the lowest latency, the best performance, and the most flexibility in Ethernet switching and packet processing to help design a solution that delivers tomorrow’s applications today, and thereby sustains a competitive edge.
GE Intelligent Platforms
gene.juknevicius@ge.com
www.ge-ip.com/industries/communications
Fulcrum Microsystems
glee@fulcrummicro.com
www.fulcrummicro.com
[[i]] “Cisco Visual Networking Index Forecast Predicts Continued Mobile Data Traffic Surge.” http://www.cisco.com/web/MT/news/10/news_220210.html
