Scalable Techniques for Multi-terabit Cloud Networking
Current data center network designs are heavily compromised. They rely on blocking multi-tier designs with 50:1 over-subscription and use the Spanning Tree Protocol (STP) to prevent loops and provide device redundancy. Link Aggregation (LAG) was introduced to distribute network traffic across multiple links for more capacity, but a conventional LAG terminates on a single switch, which made scalable bandwidth and device redundancy mutually exclusive. With these architectures, flexible designs for new cloud applications such as MapReduce and Hadoop clusters, video/CDN, and HPC were prohibitive in both price and performance.
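To make the over-subscription argument concrete, here is a minimal sketch of the arithmetic. It assumes a simplified model that is not taken from the text: an access switch with an even number of uplinks split evenly across two upstream switches, where STP leaves only the links toward one upstream switch forwarding. The function names and the example port counts are hypothetical.

```python
def usable_uplink_gbps(link_gbps: float, links: int, active_active: bool) -> float:
    """Usable uplink capacity of an access switch with two upstream switches.

    Simplified assumption: uplinks are split evenly across the two upstream
    switches. With STP, only the half toward one upstream switch forwards;
    with an active-active design, every link forwards.
    """
    if links < 2 or links % 2:
        raise ValueError("expected an even number of uplinks (at least 2)")
    forwarding_links = links if active_active else links // 2
    return forwarding_links * link_gbps


def oversubscription_ratio(downlink_gbps: float, uplink_gbps: float) -> float:
    """Server-facing capacity divided by usable uplink capacity."""
    if uplink_gbps <= 0:
        raise ValueError("uplink capacity must be positive")
    return downlink_gbps / uplink_gbps


# Hypothetical rack: 40 servers at 10 Gb/s each, 4 x 10GbE uplinks.
downlink = 40 * 10.0
for mode, active_active in (("STP", False), ("active-active", True)):
    uplink = usable_uplink_gbps(10.0, 4, active_active)
    print(f"{mode}: {oversubscription_ratio(downlink, uplink):.0f}:1")
```

In this model, letting all uplinks forward halves the over-subscription ratio without adding any hardware, which is the gap the next section addresses.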
Introducing MLAG for cross-sectional bandwidth capacity
Arista’s introduction of Multi-chassis Link Aggregation (MLAG) is an innovative and profound development in cloud networking, offering fully non-blocking, scalable two-tier designs in the data center.
MLAG is best described as a set of ports spread across two cooperating switches (known as "peer switches") that behave like an ordinary link aggregation group. The beauty of MLAG is that it creates large Layer 2 domains that offer simplicity, reliability, and vastly improved capacity for communication between machines.
MLAG overcomes the inherent limitations of legacy Spanning Tree mechanisms by enabling “active-active” multi-path topologies with multiple uplinks. This allows very large Layer 2 networks to better support virtualization or high-performance computing (HPC) cluster applications, and it can also be combined with Layer 3 options.
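The active-active behavior can be sketched as flow hashing over member links that happen to live on two different switches. This is an illustration only: real switches hash in hardware with vendor-specific fields and algorithms, so the CRC32 hash, the port names, and the helper functions below are all assumptions made for the example.

```python
import zlib
from collections import Counter

# Hypothetical member ports of one MLAG: two on each peer switch.
MLAG_MEMBERS = [
    ("switch1", "Ethernet1"),
    ("switch1", "Ethernet2"),
    ("switch2", "Ethernet1"),
    ("switch2", "Ethernet2"),
]


def pick_member(src_mac: str, dst_mac: str, members: list) -> tuple:
    """Map a flow to one member link by hashing its addresses.

    CRC32 is a stand-in for the hardware hash (an assumption). Hashing per
    flow keeps all frames of one flow on one link, preserving frame order.
    """
    if not members:
        raise ValueError("no active member links")
    key = f"{src_mac}-{dst_mac}".encode()
    return members[zlib.crc32(key) % len(members)]


def surviving_members(members: list, failed_switch: str) -> list:
    """Member links that remain usable after one peer switch fails."""
    return [m for m in members if m[0] != failed_switch]


# Spread 1000 synthetic flows across the group, then fail one peer.
flows = [(f"00:00:00:00:{i // 256:02x}:{i % 256:02x}", "00:00:00:ff:ff:01")
         for i in range(1000)]
print(Counter(pick_member(s, d, MLAG_MEMBERS)[0] for s, d in flows))

remaining = surviving_members(MLAG_MEMBERS, "switch1")
print(Counter(pick_member(s, d, remaining)[0] for s, d in flows))
```

From the downstream device's point of view, the list of members looks like one ordinary LAG. The only difference is that losing a whole switch removes some members instead of all of them.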
A real world case
Arista customers are deploying elegant MLAG topologies with rack servers, or with blade servers that have built-in blade switches, running HPC applications. In the figure below, multiple blade server enclosures, each with an integrated switch and interconnected via 10 Gigabit Ethernet, are depicted as the “Switch 3” element. Each Switch 3 element can be connected with multiple uplinks for scalable server connectivity. Because these links connect transparently across a pair of upstream Arista 7148SX switches (cooperating peers Switch 1 and Switch 2), the servers get up to 160Gb/s of upstream bandwidth, and the two-tier topology needs no complex manual configuration. The result is an elegant design that is fully non-blocking and scales to thousands of ports.
MLAG Design: Switch 3 is Blade Server + integrated Switch connected via MLAG to Switches 1 and 2 as peer Arista 7148SXs
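As a rough check on the 160Gb/s figure, the sketch below assumes one way of reaching it: 16 uplinks of 10 Gigabit Ethernet from Switch 3, split as 8 to each peer. The text does not state the link count, so `links_per_peer=8` and the `MlagUplinks` class are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MlagUplinks:
    """Uplinks from one Switch 3 element to the two MLAG peers."""
    links_per_peer: int      # assumed count; not given in the text
    link_gbps: float = 10.0  # 10 Gigabit Ethernet, as in the text

    def total_gbps(self) -> float:
        # All links forward at once because the topology is active-active.
        return 2 * self.links_per_peer * self.link_gbps

    def gbps_after_peer_failure(self) -> float:
        # Only the links to the surviving peer remain.
        return self.links_per_peer * self.link_gbps


uplinks = MlagUplinks(links_per_peer=8)
print(uplinks.total_gbps())               # 160.0
print(uplinks.gbps_after_peer_failure())  # 80.0
```

Under these assumptions, the loss of one peer switch halves the upstream bandwidth but does not interrupt connectivity.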
MLAG achieves multi-terabit capacity by making maximum use of each device and each link. It is an important tool in the network architect’s toolbox for building highly scalable and resilient cloud networks. I am excited about innovations like MLAG because they can be deployed on existing standards-based 1/10GbE switches without inventing proprietary mechanisms. At the same time, MLAG can co-exist with future interoperability standards such as TRILL, developed through IEEE/IETF/ANSI committees. As always, I welcome your comments at firstname.lastname@example.org.