
Redefining Cloud-Networking Resilience & Visibility

by Jayshree Ullal on Oct 1, 2013 7:36:33 PM

In modern two-tier leaf-spine cloud networks, the growing dominance of east-west traffic, the sheer volume of that traffic, and the increase in data rates from 10G to 40G/100G combine to make it challenging to predict and analyze performance issues proactively. The scale involved in connecting 100K+ physical servers, 1M+ virtual machines and large big data storage elements is redefining expectations of a resilient network. Self-healing networks and new levels of visibility are no longer optional; they are mandatory.


So how is today’s network software doing in these environments?

Self Healing & Programmable Software is Paramount



Most network operating systems are, unfortunately, decades old and were designed for the enterprise data center applications of the 1990s. Traditional vendors try valiantly to patch these legacy operating systems, but modern networks need a new ground-up architecture. Arista’s Extensible OS (EOS) is the ONLY purpose-built data center networking software to address this requirement. EOS was designed with support for mission-critical clouds and data centers as its primary goal. Brilliant engineers lead the EOS team, and our SVP of Software Engineering and CTO, Ken Duda, pioneered the architecture. Indeed, it is this feat of software engineering that drew me to the company several years ago.


Published studies have shown that, over time, the operational costs of running a network are many times greater than the capital expenditures. The cost of operational downtime caused by a lack of visibility into the network infrastructure is estimated at $5,600 per minute. At that rate an hour of downtime costs about $336,000, so an outage of just three hours exceeds a million dollars. Cloud-scale operators must reduce downtime and detect, isolate, and resolve application performance problems proactively in order to meet their customers' expectations and Service Level Agreements.


The secret sauce of Arista EOS is a multi-process state-sharing architecture that is self-healing and exposes open APIs to enable programmability. EOS keeps all system state in a central database (Sysdb), which validates that state and propagates updates to the agents that depend on it. The schema-specific code in Sysdb is machine generated, providing the performance of hand-written code without the errors. The stateful publish-subscribe approach of EOS is intrinsically deterministic, borrowing heavily from the world of databases, where state survives application shutdown. Many alternative data center vendors claim “improved” operating systems, yet they deploy archaic message-passing schemes in which agents convey state by sending messages back and forth, adding complexity and delays. Where these systems offer check-pointing services, they are often used for restart only. That can be error-prone, because agents read their checkpoints only during a restart rather than all the time, so the recovery path is rarely exercised. Within EOS, the initialization and the restart of agents are handled consistently through the same repository, without reliance on a separate recovery path.
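The stateful publish-subscribe idea described above can be illustrated with a minimal Python sketch. This is not Arista's implementation or the EOS API: the names `StateStore`, `Agent`, `demo` and the path `interfaces/Ethernet1/status` are hypothetical, chosen only to show how an agent that starts, or restarts, by subscribing to a central store sees current state without a separate checkpoint-recovery step.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Callback = Callable[[str, Any], None]


class StateStore:
    """Hypothetical central state database (a stand-in for the role Sysdb plays)."""

    def __init__(self) -> None:
        self._state: Dict[str, Any] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def publish(self, path: str, value: Any) -> None:
        # Validate before storing; a real system would check against a schema.
        if not path:
            raise ValueError("state path must be non-empty")
        self._state[path] = value
        # Propagate the update to every current subscriber of this path.
        for callback in list(self._subscribers[path]):
            callback(path, value)

    def subscribe(self, path: str, callback: Callback) -> None:
        self._subscribers[path].append(callback)
        # Replay the current value so a new or restarted agent catches up
        # through the same code path it uses for live updates.
        if path in self._state:
            callback(path, self._state[path])

    def unsubscribe(self, path: str, callback: Callback) -> None:
        self._subscribers[path].remove(callback)


class Agent:
    """Hypothetical agent whose local view is derived entirely from the store."""

    def __init__(self, name: str, store: StateStore, paths: List[str]) -> None:
        self.name = name
        self.view: Dict[str, Any] = {}
        self._store = store
        self._paths = list(paths)
        for path in self._paths:
            store.subscribe(path, self._on_update)

    def _on_update(self, path: str, value: Any) -> None:
        self.view[path] = value

    def stop(self) -> None:
        # Simulates the agent process exiting; the store keeps the state.
        for path in self._paths:
            self._store.unsubscribe(path, self._on_update)


def demo():
    path = "interfaces/Ethernet1/status"  # hypothetical state path
    store = StateStore()
    store.publish(path, "up")

    agent = Agent("routing", store, [path])
    store.publish(path, "down")
    agent.stop()                 # the agent goes away
    store.publish(path, "up")    # state changes while the agent is absent

    restarted = Agent("routing", store, [path])
    return agent.view, restarted.view


if __name__ == "__main__":
    stale_view, fresh_view = demo()
    print("view held by the stopped agent:", stale_view)
    print("view of the restarted agent:  ", fresh_view)
```

In this sketch the restarted agent picks up the latest value simply by subscribing again, so initialization and restart share one code path. A message-passing design would instead need the peers to resend whatever the agent missed, or a checkpoint that is read only at restart.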

Virtual to Physical to Application Visibility



To reduce downtime and save costs, dynamic network troubleshooting and monitoring tool sets are needed. We must provide both fine-grained visibility into application performance and more global, network-wide monitoring capabilities. How can you capture, analyze and troubleshoot traffic between two virtual servers when there are literally hundreds of paths between the racks where the servers are located and the exact location of each server is unknown?


Arista Network Telemetry works in conjunction with applications so that the network is no longer in the way. It dramatically reduces application downtime and network operational costs through improved real-time visibility into system and network performance, correlation with application behavior and advanced end-to-end path monitoring tools. This saves millions of dollars and hours of downtime. Arista Tracers are enhancements to the Arista Network Telemetry application that bring deeper application-level visibility by integrating with distributed applications in big data, cloud, and virtualized environments (see Figure below).



Figure: Arista EOS Tracer Technologies
Examples such as Health Tracer, Path Tracer, VM Tracer and MapReduce Tracer redefine resiliency and visibility.

The programmable foundation of Arista EOS combined with Network Tracers provides a real-world solution to the real-world problems of cloud network visibility, monitoring and troubleshooting. It enables tight linkages between the physical, virtual and application infrastructure that result in considerable savings in operational expenditures.

Welcome to the new world of software-defined cloud networking, with increased visibility, lower operational costs and reduced downtime. I look forward to your comments at feedback@arista.com.


Opinions expressed here are the personal opinions of the original authors, not of Arista Networks. The content is provided for informational purposes only and is not meant to be an endorsement or representation by Arista Networks or any other party.
Written by Jayshree Ullal
Jayshree Ullal is a networking executive veteran with 30+ years of experience. In 2018 Barron's named her one of the "World's Best CEOs." In 2015 she was co-awarded "EY 2015 Entrepreneur of the Year" across National USA and "#3 IT Industry Disrupter" by CRN. In 2005, she was named one of the "50 Most Powerful People" by Network World and one of the "Top Executives" by Forbes magazine 2012. As President and CEO for a decade, Jayshree led Arista Networks to a successful IPO in June 2014 at NYSE. She is responsible for building a multibillion dollar business in cloud networking and has forged strategic alliances with Microsoft, HP and VMware to name a few.
