How Robust is your cloud network?

by Jayshree Ullal on Mar 1, 2009 11:14:53 PM

The IT industry has touted networking OS advantages for many generations. Cisco was the pioneer and leader in the 1990s with Cisco IOS, rich in features albeit highly monolithic. In the second decade, both Juniper and Cisco developed more modular, service provider-class software with JUNOS and IOS-XR. As we enter the third decade and the era of data center and cloud computing, the scale of connecting 10,000 to 100,000 clusters of compute and storage elements redefines the expectation of a resilient network.

So how robust, really, are the software architectures of today's data centers? Can they cope with the scale and demands of emerging clouds? It all depends on how well they are designed and architected. A new ground-up architecture that handles reboots and intrusive restarts responsively is imperative. This, in my view, simply cannot be achieved with incremental enhancements to an existing OS. Arista's Extensible OS (EOS) and Cisco's NX-OS are the only two examples of purpose-built data center networking software that I am aware of.

EOS's architectural design is perhaps best illustrated by a real customer experience recalled fondly by our VP of Software Engineering, Ken Duda. During early field trials with our 71XX products, one customer observed a rather high bit-error rate on links between our switch and a third-party 10GE NIC. We immediately rolled new settings into a patch in the form of an updated PHY driver, which the customer then installed live. EOS restarted the patched agent within one second, while everything else in the switch remained unchanged and unaffected by the fix. The customer was delighted with this type of live patching without scheduled downtime in a data center. "Vendors have been promising me this type of OS for years, and finally one has delivered!" he declared approvingly.

The secret sauce of Arista's EOS is a multi-process state-sharing architecture. EOS defines its state using a central database (Sysdb) that holds and validates all system state and propagates updates to the agents. The schema-specific code in Sysdb is machine generated, providing the performance of hand-written code without the errors. This stateful approach of EOS is intrinsically more deterministic, borrowing heavily from the world of databases, where the state survives application failure.

Alternative data center architectures may deploy a message-passing scheme, where agents interact by sending messages back and forth to convey state, adding delays. Checkpointing services are often used for restart only, which can be error-prone because agents read their checkpoints only on restart, not all the time, so the recovery path is rarely exercised. EOS is different. Initialization and restart of agents are both handled consistently through the same stateful repository, without reliance on separate recovery code.

Arista EOS customer Paul Mullen, Project Manager of HEAnet in Ireland, sums it up well: "It was immediately clear that Arista's 7100 is a game changer delivering the kind of performance and availability that heretofore would not have been possible."

I am excited too! Welcome to the new decade of cloud and data center networking, delivering self-healing innovations never seen before in classic networks.
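
For readers who want to see the state-sharing idea in miniature, here is a rough Python sketch. It is only an illustration under stated assumptions, not Arista's implementation: the names StateStore and PhyAgent, the key names, and the callback interface are all hypothetical and are not part of any EOS or Sysdb API. What it tries to show is that when a central store owns and validates the state, an agent can start and restart through the same code path, with no separate checkpoint-recovery logic.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[str, Any], None]


class StateStore:
    """Hypothetical central state database (an illustration, not Sysdb's API)."""

    def __init__(self, schema: Dict[str, type]) -> None:
        self._schema = schema                 # allowed keys and their types
        self._state: Dict[str, Any] = {}      # the single source of truth
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        # Replay the current state first, so a first start and a restart
        # both go through exactly the same code path.
        for key, value in self._state.items():
            listener(key, value)
        self._listeners.append(listener)

    def unsubscribe(self, listener: Listener) -> None:
        self._listeners.remove(listener)

    def write(self, key: str, value: Any) -> None:
        # Validate against the schema before accepting the update.
        expected = self._schema.get(key)
        if expected is None or not isinstance(value, expected):
            raise ValueError(f"rejected write: {key}={value!r}")
        self._state[key] = value
        # Propagate the accepted update to every subscribed agent.
        for listener in list(self._listeners):
            listener(key, value)


class PhyAgent:
    """Hypothetical agent that keeps only a cache of state owned by the store."""

    def __init__(self, store: StateStore) -> None:
        self.view: Dict[str, Any] = {}
        self._store = store
        store.subscribe(self._on_update)

    def _on_update(self, key: str, value: Any) -> None:
        self.view[key] = value

    def stop(self) -> None:
        self._store.unsubscribe(self._on_update)


store = StateStore({"eth1/speed": int, "eth1/enabled": bool})
agent = PhyAgent(store)
store.write("eth1/speed", 10_000)
store.write("eth1/enabled", True)

agent.stop()                         # simulate the agent being patched or crashing
store.write("eth1/enabled", False)   # state keeps changing while the agent is down
agent = PhyAgent(store)              # restart uses the same constructor as first start

print(agent.view)
```

In this toy version the restarted agent sees the current state, including the change made while it was down, simply because it reads from the store the same way it did on first start. A checkpoint-based design would need extra recovery code that runs only after a failure.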

I welcome your comments at feedback@arista.com

Opinions expressed here are the personal opinions of the original authors, not of Arista Networks. The content is provided for informational purposes only and is not meant to be an endorsement or representation by Arista Networks or any other party.
Written by Jayshree Ullal
As CEO and Chairperson of Arista, Jayshree Ullal is responsible for Arista's business and thought leadership in AI and cloud networking. She led the company to a historic and successful IPO in June 2014 from zero to a multibillion-dollar business. Formerly Jayshree was Senior Vice President at Cisco, responsible for a $10B business in datacenter, switching and services. With more than 40 years of networking experience, she is the recipient of numerous awards including E&Y's "Entrepreneur of the Year" in 2015, Barron's "World's Best CEOs" in 2018 and one of Fortune's "Top 20 Business persons" in 2019. Jayshree holds a B.S. in Engineering (Electrical) and an M.S. degree in engineering management. She is a recipient of the SFSU and SCU Distinguished Alumni Awards in 2013 and 2016.