Big-Data is a Big Deal for Cloud Networking

Jayshree Ullal : Jul 11, 2011 12:42:44 AM

Once again, new applications are pushing the envelope for modern cloud networking, a radical departure from traditional enterprise networks! Big Data was a hot topic at the GigaOm Structure conference last month.

As companies acquire and analyze vast amounts of structured and unstructured data, they increasingly re-engineer their data centers for new applications, which in turn fuels their need for a new cloud network. We are witnessing a growing trend toward Big Data for storing and efficiently processing petabytes to exabytes of unstructured data. Big Data and its associated storage archives are changing the way networks must be repurposed for these new storage behaviors.

Return of Direct Attached Storage (DAS):

Storage Area Networks (SANs) using block-based Fibre Channel were considered the only option in the past decade. Then came Network Attached Storage (NAS) with file protocols, as well as iSCSI over high-speed Ethernet. In the 2010s, the advent of dense SSD storage, combined with the popularity of Hadoop, is driving a third generation of storage that resembles the early direct-attached storage models. What goes around comes around, with greater speed, scale and intelligence.

Hadoop, a big data software framework, drives the compute engines in data centers from IBM to Google, a Yahoo spin-off, eBay and Facebook. The framework comprises distributed file systems, databases, and data mining algorithms. A noteworthy aspect is the ability to create Hadoop clusters out of standards-based computing and networking elements to run parallelized, data-intensive workloads. This is a departure from convention, where the storage archival process happens up front as the first step in the data's lifecycle.

In order to efficiently access stored results or simply calculate new ones, a well-designed network with full any-to-any capacity, low latency, and high bandwidth can significantly improve Hadoop performance. Also, as workloads grow, it is important that the network can sustain the inclusion of additional servers in an incremental fashion. Hadoop only scales in proportion to the compute resources networked together at any time.

Moving Computation closer to Data Storage:

Using Hadoop (with the MapReduce algorithm) to process large quantities of unstructured data is quickly changing the storage paradigm. In the map phase, in order to handle large data sets, the data is broken up into small chunks that are spread across the cluster nodes. Servers are given tasks relevant to the data already present in their directly attached storage (DAS). Pushing the computation to the data and its storage is a critical part of processing petabytes: a single petabyte takes roughly 22 hours to transfer at 100 Gb/s, so even with the fattest pipe of 100 GbE, a badly allocated workload could take weeks to simply read in all the data necessary! Once this initial processing of data is completed, the resulting outputs are sent to a smaller subset of the nodes for further processing and summarization. The resulting data movement is called a “shuffle”, and involves large amounts of data converging on a few nodes, which is often demanding on the underlying Cloud Networking infrastructure.
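
To make the map, shuffle and reduce steps described above concrete, here is a minimal single-process sketch in Python. It is not Hadoop code and does not use any Hadoop API: the function names, the round-robin chunking and the byte-sum partitioner are all assumptions chosen for illustration, and the "nodes" are just Python lists standing in for servers with their own direct-attached storage.

```python
from collections import defaultdict


def split_into_chunks(records, num_nodes):
    """Spread records round-robin across nodes (stand-in for data on each node's DAS)."""
    chunks = [[] for _ in range(num_nodes)]
    for i, record in enumerate(records):
        chunks[i % num_nodes].append(record)
    return chunks


def map_phase(chunk):
    """Each node works only on its local chunk and emits (word, 1) pairs."""
    return [(word, 1) for line in chunk for word in line.split()]


def shuffle(mapped_outputs, num_reducers):
    """Route every key to exactly one reducer. In a real cluster this is the
    step that crosses the network, with many nodes sending to a few."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapped_outputs:
        for key, value in pairs:
            # Illustrative, deterministic partitioner (an assumption, not Hadoop's).
            reducer = sum(key.encode("utf-8")) % num_reducers
            partitions[reducer][key].append(value)
    return partitions


def reduce_phase(partition):
    """Summarize all values collected for each key on one reducer."""
    return {key: sum(values) for key, values in partition.items()}


def word_count(records, num_nodes=4, num_reducers=2):
    chunks = split_into_chunks(records, num_nodes)
    mapped = [map_phase(chunk) for chunk in chunks]
    partitions = shuffle(mapped, num_reducers)
    result = {}
    for partition in partitions:
        result.update(reduce_phase(partition))
    return result
```

Calling word_count(["big data big network", "data moves", "big shuffle"]) counts each word across all chunks. Note that num_reducers is smaller than num_nodes by default, which mirrors the fan-in pattern of the shuffle: many mappers send their intermediate results to a few reducers.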

The Big-Data Arista Advantage:

Arista is once again at the forefront of delivering the attributes required to build a reliable Big Data Cloud Network, including:

Self-Healing Resilience: Highly available and resilient software such as Arista EOS ensures that the Hadoop cluster stays up and running. The ability of EOS to automatically contain faults or, better yet, repair them through its recovery mechanisms is unique and compelling. The quick re-provisioning of any failed server nodes during execution assures new levels of high availability and fault tolerance unavailable in enterprise networking.
Large or Dynamic Buffers and Congestion Visibility: Unpredictable data patterns during the shuffle phase can cause some server nodes to receive significantly more traffic than others. A successful Big Data deployment requires a highly scalable, high-performance network interconnect featuring non-blocking capacity and large or intelligent dynamic buffer allocation schemes that minimize congestion in case of fan-in. With Arista’s spine-leaf design that scales to 180000+ nodes today, large network buffers are deployed, so even the most oversubscribed server can receive all its intermediate data without packet loss. Arista switches also feature dynamic buffer allocation (DBA) schemes, which ensure that the buffer resources are dedicated to the ports that need them the most. Tools such as Arista’s Latency Analyzer (LANZ) enable network introspection and simplify congestion management for optimized workload clusters.
Rapid Configuration & Ease of Operations: Arista EOS (Extensible Operating System) is based on standard Linux and a unique system database (SysDB) model. EOS provides the ideal foundation for user-defined functionality and leverages open source tools to help any cluster reach its full potential. Additionally, Arista’s Zero Touch Provisioning (ZTP) allows rapid configuration and operation of the cluster in minutes instead of hours, saving key IT staff resources in running large Hadoop installations.
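
To give a feel for why fan-in during the shuffle stresses switch buffers, here is a rough back-of-the-envelope sketch in Python. It is a deliberately simplified fluid model, not a description of Arista's dynamic buffer allocation or of any real switch: it assumes every sender bursts at the same moment, at the same line rate as the receiver's port, and the function name, parameters and the 10 Gb/s default are hypothetical values chosen for illustration.

```python
def fan_in_estimate(num_senders, burst_bytes, link_gbps=10.0):
    """Estimate the peak queue at one egress port under simultaneous fan-in.

    Simplifying assumptions (illustrative only):
      * every sender transmits burst_bytes at the same time, at link_gbps;
      * the receiver's port also drains at link_gbps;
      * no flow control, no drops, no other traffic.

    While the bursts arrive, the port drains one burst's worth of data,
    so (num_senders - 1) bursts have to wait in the buffer.
    """
    if num_senders < 1 or burst_bytes < 0 or link_gbps <= 0:
        raise ValueError("invalid fan-in parameters")
    link_bytes_per_s = link_gbps * 1e9 / 8
    peak_queue_bytes = (num_senders - 1) * burst_bytes
    drain_time_s = peak_queue_bytes / link_bytes_per_s
    return peak_queue_bytes, drain_time_s
```

Under these assumptions, eight mappers each sending a 1 MB burst to one reducer would leave about 7 MB queued at the reducer's port, which takes a few milliseconds to drain at 10 Gb/s. Real traffic is burstier and less synchronized than this model, so it is best read as an intuition aid rather than a sizing rule.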

Big Data is indeed creating an extraordinary challenge and opportunity requiring critical decisions on application and storage performance. As always, I welcome your views at feedback@arista.com and I am excited by this trend.