Delivering a Multi Cloud and Cloud Native Operator Experience

Douglas Gourlay : Nov 5, 2019 3:57:53 AM

Right around the time I joined Arista in 2009, Amazon Web Services developed the concept of the Virtual Private Cloud. It became one of the seminal technologies of the public cloud, a core construct that lets enterprise customers corral and protect resources, provision them into logical groups, align security policies, and simplify management. Following this, Google developed a model for Virtual Private Clouds that spanned regions, allocating one subnet per region by default - creating the first multi-region VPC within a single cloud provider.

In our endless quest at Arista to deliver an amazing operator experience, we realized that while this was a great start and amazing technology, we needed to continue the evolution. That meant delivering a multi-region, multi-provider virtual private cloud capability that spans from the campus, to the data center, to the AWS, Azure, and Google public clouds, and even reaches down into the Kubernetes host - a true multi cloud and cloud native capability - delivered through Arista EOS®.

Naturally, we named this version of EOS CloudEOS™. It inherits all of the best capabilities of EOS - in fact, it is built from the same source code, so it is completely feature, API, and code compatible with its progenitor. But there are some amazing new differences worth highlighting and discussing:

CloudEOS is infrastructure as code. We have partnered with the leading DevOps/SRE provisioning tools for the cloud, such as HashiCorp’s Terraform, and CloudEOS enables the declarative provisioning of an entire multi-cloud network in about a half-dozen lines of simple code. The topology is a variable. The segmentation and namespace, a variable. The CIDR block, another variable. Terraform then makes a set of calls to Arista CloudVision® to drive the network deployment from there.
CloudEOS is born in, and procured through, the cloud. CloudVision, initiated by Terraform, then jumpstarts your multi cloud network: it procures on-demand instances of CloudEOS through the public cloud marketplaces and automatically deploys them into each local cloud VPC. These instances are amazing - they can scale up and scale back as network load changes, enabling you to reduce operating costs in the cloud for variable-usage workloads.
Cloud Network Private Segments - segmentation done right, for the enterprise, from campus to data center to cloud. I never understood why networking vendors always try to make things more complex for clients - yet that seems to be all most of them ever do. For instance, why do you need one network segmentation model in the campus, a different one in the WAN, another in the data center, and yet another for the cloud? They are all IP and largely Ethernet - why not help network operators simplify their environments by offering one consistent segmentation model that can scale to almost any deployment? So that’s what we built: scalable, encrypted where necessary, with simplified key management and open standards for multi-vendor interoperability as others decide to catch up or partners want to integrate.
Just to repeat myself - Open Standards. We based the entire CloudEOS and Multi Cloud and Cloud Native solutions on open standards: BGP, EVPN, VXLAN, IPsec. Even the real-time state streaming is gNMI/gRPC using OpenConfig models - when it came time to make a decision, we erred on the side of open rather than closed and proprietary. I haven’t seen many times when closed and proprietary beat open standards in the networking world.
Real Time Telemetry. Every router in every VPC around the globe, every container network interface in an auto-scaling k8s cluster, and every physical switch across the data center and campus is capable of streaming live network state back to Arista CloudVision. It’s like the networking version of Doctor Who’s famous ‘TARDIS’ - in this case, a network time machine that empowers network operators to significantly reduce the time it takes to troubleshoot and gives them full observability into what is happening, and what has happened, in the network.
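
To make the infrastructure-as-code idea from the first item above more concrete, here is a minimal Python sketch of a declarative network description in which the topology, the segment, and the CIDR block are all plain variables. It is not the CloudEOS Terraform provider syntax and makes no calls to CloudVision; the `NetworkSpec` class, the `plan_subnets` helper, and the region names are hypothetical, chosen only to illustrate how a handful of variables could expand into a per-region addressing plan.

```python
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class NetworkSpec:
    """Hypothetical declarative description (not the real CloudEOS schema)."""
    topology: str     # e.g. "hub-and-spoke"
    segment: str      # segmentation / namespace label
    cidr_block: str   # address space to carve into per-region subnets
    regions: tuple    # (cloud provider, region) pairs


def plan_subnets(spec, new_prefix=24):
    """Assign one subnet per region out of the spec's CIDR block."""
    block = ipaddress.ip_network(spec.cidr_block)
    subnets = block.subnets(new_prefix=new_prefix)
    plan = {}
    for region in spec.regions:
        try:
            plan[region] = next(subnets)
        except StopIteration:
            raise ValueError("CIDR block too small for the requested regions") from None
    return plan


# Assumed example values; change a variable and the whole plan changes with it.
spec = NetworkSpec(
    topology="hub-and-spoke",
    segment="prod",
    cidr_block="10.0.0.0/16",
    regions=(("aws", "us-east-1"), ("azure", "eastus"), ("gcp", "us-central1")),
)

plan = plan_subnets(spec)
for (cloud, region), subnet in plan.items():
    print(f"{spec.segment} {cloud}/{region}: {subnet}")
```

The point of the sketch is the shape of the workflow: the operator edits a few variables, and tooling derives everything else from them.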

Cloud with Guard Rails

I used to hear that people wanted to ‘run fast and break things’. Sure, this is fine if you’re hacking away on some web app for non-mission-critical stuff. But we’re talking about networks here, things that power heart monitors in surgical theatres, connect up the physical security cameras at airports, and move trillions of dollars a day of transactions in financial exchanges. Here, in the network, things are different - we want to move fast, and not break things.

Our goal with CloudEOS is simple: we want to enable the DevOps teams to use the tools they are familiar with, and to add, scale, and adjust their compute, storage, and cloud service requirements in a rapid-fire and dynamic environment as fast as possible - we want to enable true business agility. We also want the networking team to have the confidence to know that the guardrails are in place - that the CIDR block for the ERP system won’t get accidentally duplicated, that someone can’t connect an external test segment to a production data store in the enterprise, and that critical changes to production segments may require a code review in the network too before they are deployed.
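
The three guardrails named above can be expressed as simple pre-deployment checks. The Python sketch below is only an illustration of that idea, using the standard `ipaddress` module; the `Segment` class, the `check_change` function, and the example segments are hypothetical and do not represent how CloudEOS or CloudVision implement their policy checks.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical network segment record, for illustration only."""
    name: str
    cidr: str
    environment: str                      # "production" or "test"
    connects_to: list = field(default_factory=list)


def check_change(existing, proposed):
    """Return (violations, needs_review) for a proposed new segment."""
    violations = []
    by_name = {seg.name: seg for seg in existing}
    new_net = ipaddress.ip_network(proposed.cidr)

    # Guardrail 1: the new CIDR block must not duplicate or overlap an existing one.
    for seg in existing:
        if new_net.overlaps(ipaddress.ip_network(seg.cidr)):
            violations.append(
                f"{proposed.name}: {proposed.cidr} overlaps {seg.name} ({seg.cidr})"
            )

    # Guardrail 2: segments may only connect within the same environment.
    for peer_name in proposed.connects_to:
        peer = by_name.get(peer_name)
        if peer is not None and peer.environment != proposed.environment:
            violations.append(
                f"{proposed.name}: {proposed.environment} segment cannot "
                f"connect to {peer.environment} segment {peer.name}"
            )

    # Guardrail 3: changes to production segments require a code review.
    needs_review = proposed.environment == "production"
    return violations, needs_review


# Assumed example inventory and change request.
existing = [
    Segment("erp", "10.20.0.0/16", "production"),
    Segment("datastore", "10.30.0.0/16", "production"),
]
proposed = Segment("ext-test", "10.20.4.0/24", "test", connects_to=["datastore"])

violations, needs_review = check_change(existing, proposed)
for v in violations:
    print("BLOCKED:", v)
print("code review required:", needs_review)
```

In this example the proposed test segment is blocked twice: its address range falls inside the ERP block, and it tries to reach a production data store. A clean production change would pass both checks but still be flagged for review.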

See, when you’re on a racetrack going as fast as you can to win, guard rails help you go faster. You’re more confident that the track itself is helping you stay on it and with that confidence, you can push harder and go even faster.

So go on, take CloudEOS for a spin - it’s available in Azure and AWS today, and shortly in Google Cloud as well. We have clients with as few as 4-5 VPCs simplifying transit and we have massive scale-out customers with 500-1000 VPCs leveraging elastic scale up/down and dynamic path selection.

Also worth checking out is this demo, in which the inestimable Fred Hsu takes us on a tour of CloudEOS at work to simplify, scale, and operate a multi-cloud and cloud-native infrastructure.