The Future of Cognitive Cloud Networking

by Jayshree Ullal on Feb 27, 2018 3:03:44 PM

Artificial intelligence, machine learning and deep learning have to be among the most popular tech buzzwords of the past few years, and I was hoping that I wouldn’t get swept away by them. But when I heard a panel on the state of AI networks at our customer event this month, I found it fascinating and it piqued my curiosity! Let me start with a disclaimer for readers who are expecting a deep tutorial from me. There is a vast amount of research behind the models and algorithms in this area that I will not even attempt to cover. Instead I will try to share some thoughts on the practical relevance of this promising field.

Behind the Buzz Words

AI and machine learning have been around for a long time. The difference now is that far more powerful compute and network infrastructure is available, along with exponentially more data to analyze. Efficient data movement is critical across the whole deep learning lifecycle of ingesting, processing and inferring from data, and improvements at each stage create higher-layer abstractions that let data scientists develop and train models quickly.

The result is that problems previously in the realm of the impossible, such as real-time language translation, fraud detection and autonomous vehicle control, are being addressed with neural network models that detect patterns and behaviors across huge amounts of structured and unstructured data. As an example, an AI program that learns from Van Gogh paintings can identify new paintings in a similar style. While the human brain may be better at detecting deeper meaning and “conscious thought”, AI is radically increasing the benefits of “raw intelligence.” The continuing goal is to minimize the cycle time, both for developing new algorithms and models and for scaling AI applications to serve billions of devices in real time. Scientists can now reduce their research time for trials and studies from years to hours. Machine learning training is typically implemented in floating-point arithmetic, which is why NVIDIA GPUs have been so popular here. Inference, by contrast, is typically done in integer logic. This combination delivers the most training and inference performance at the lowest cost and power. It also allows these systems to be tuned for AI applications.
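To make the floating-point versus integer distinction concrete, here is a minimal sketch of one common approach, symmetric 8-bit quantization. It is an illustration only: the helper names (quantize_int8, int_dot) and the sample numbers are assumptions of mine, not any vendor's actual implementation.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int_dot(q_a, q_b):
    """Dot product using integer multiplies and adds only."""
    return sum(a * b for a, b in zip(q_a, q_b))


# Hypothetical trained weights (floating point) and one input vector.
weights = [0.52, -1.27, 0.03, 0.9]
inputs = [0.8, 0.5, -0.25, 2.0]

q_w, scale_w = quantize_int8(weights)
q_x, scale_x = quantize_int8(inputs)

# The integer result is rescaled once at the end to recover a float estimate.
approx = int_dot(q_w, q_x) * scale_w * scale_x
exact = sum(w * x for w, x in zip(weights, inputs))

print("exact float result:   ", exact)
print("int8 approximation:   ", approx)
```

The two printed values should agree closely, which is the trade being made: a small loss of precision in exchange for cheaper integer arithmetic at inference time.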

The Network Relevance

Within a typical AI appliance, multiple GPUs are interconnected with very high-speed chip-to-chip interfaces. The NVIDIA DGX-1 with Volta system interconnects 8 GPU chips with NVLink in a hybrid cube-mesh topology, packaged together with general-purpose CPUs. 100G Ethernet and RDMA over Converged Ethernet (RoCE) can be used to enable any GPU in the network to access any other GPU’s memory. The same high-performance Ethernet network used between DGX systems also connects the DGX-1 servers to storage devices (such as Pure Storage FlashBlade), vastly simplifying AI system configuration and deployment. The NVIDIA DGX-1 system starts at four 100G networking ports that deliver a total of 400G (400 gigabits/sec), or 50 Gigabytes/sec, of throughput, which is 4 times as much network bandwidth as general-purpose servers in cloud networks.
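The throughput figures above follow from simple unit conversion. A quick sketch of the arithmetic, assuming raw line rate with no allowance for protocol overhead (the function name is mine, for illustration):

```python
BITS_PER_BYTE = 8


def aggregate_throughput(ports, gbits_per_port):
    """Return (total gigabits/sec, total gigabytes/sec) for a set of ports."""
    total_gbits = ports * gbits_per_port
    return total_gbits, total_gbits / BITS_PER_BYTE


total_gbits, total_gbytes = aggregate_throughput(ports=4, gbits_per_port=100)
print(total_gbits, "Gbit/s =", total_gbytes, "GByte/s")  # 400 Gbit/s = 50.0 GByte/s
```

By the same arithmetic, a server with a single 100G port tops out at 12.5 Gigabytes/sec, which is where the factor of 4 comes from.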

Cognitive Networking Implications

AI servers together with an Arista leaf-and-spine network and storage appliances can form an important AI nucleus. We have tested these solutions with NVIDIA and Pure Storage to offer the highest IO density per appliance. The common theme is that both AI storage and AI networking have an insatiable need for bandwidth to feed these powerful applications. The NVIDIA DGX-1 system has just a 3U footprint, with four 100G interfaces that can ingest up to 50 Gigabytes/second.

Cloud titans may migrate easily between different kinds of AI workloads without compromising AI applications. This improves monetization, for example by optimizing the ad or movie they recommend to drive real-time user experiences. Yet the potential goes beyond the cloud to enterprises as well. In small steps, Arista has already begun its journey with machine learning implementations in CloudVision®. If there is an abnormal traffic rate, anomalies are quickly pinpointed and corrected.
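As a rough illustration of what pinpointing an abnormal traffic rate can look like, here is a minimal sketch that flags samples far outside the recent average. This is a generic rolling z-score check with made-up numbers; it is not a description of how CloudVision works, and the function name, window size and threshold are all assumptions.

```python
from statistics import mean, stdev


def find_rate_anomalies(rates, window=5, threshold=3.0):
    """Return indexes of samples that deviate sharply from the preceding window."""
    anomalies = []
    for i in range(window, len(rates)):
        history = rates[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is treated as unusual.
            if rates[i] != mu:
                anomalies.append(i)
            continue
        if abs(rates[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical link utilization samples in Gbit/s, with one obvious spike.
samples = [40.1, 39.8, 40.3, 40.0, 39.9, 40.2, 95.0, 40.1, 39.7]
print(find_rate_anomalies(samples))  # [6]
```

A production system would need far more than this, such as seasonality, per-flow baselines and a way to act on the alert, but the core idea of learning a baseline and flagging deviations is the same.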

At Arista we are on the cusp of building new, transformative technologies in our Arista EOS® architecture for machine learning, telemetry and failure mitigation. I am excited by the prospects ahead in this decade of transformation and innovation. Welcome to 2018 and the age of cognitive cloud networking.


Opinions expressed here are the personal opinions of the original authors, not of Arista Networks. The content is provided for informational purposes only and is not meant to be an endorsement or representation by Arista Networks or any other party.
Written by Jayshree Ullal
Jayshree Ullal is a veteran networking executive with 30+ years of experience. In 2018 Barron's named her one of the "World's Best CEOs." In 2015 she was co-awarded "EY 2015 Entrepreneur of the Year" across National USA and named "#3 IT Industry Disrupter" by CRN. She was named one of the "50 Most Powerful People" by Network World in 2005 and one of the "Top Executives" by Forbes magazine in 2012. As President and CEO for a decade, Jayshree led Arista Networks to a successful IPO on the NYSE in June 2014. She is responsible for building a multibillion-dollar business in cloud networking and has forged strategic alliances with Microsoft, HP and VMware, to name a few.
