Paper Study - Opera: Expanding across time to deliver bandwidth efficiency and low latency
02 Mar 2020. This is a study & review of the Opera paper. I’m not an author.
Opera: Expanding across time to deliver bandwidth efficiency and low latency
Abstract
Opera: a dynamic network that serves latency-sensitive traffic with multi-hop forwarding (an expander-graph-based approach), while providing near-optimal bandwidth for bulk flows through direct forwarding over time-varying source-to-destination circuits.
Key: rapid & deterministic reconfiguration of the network, so that at any moment the network implements an expander graph; across time, the network provides bandwidth-efficient single-hop paths between all racks.
Intro
Reconfigurable networks (aka dynamic networks)
They provision direct links between endpoints with high demand, but identifying that demand and reconfiguring the network takes a long time.
Problem: a trade-off between amortizing the overhead of reconfiguration and the inefficiency of sub-optimal configurations.
Observation: most bytes in today’s datacenter networks are bulk traffic.
Goal: To minimize bandwidth tax paid by bulk traffic.
Solution: a dynamic, circuit-switched topology that reconfigures a small number of each TOR switch's uplinks at a time, moving through a series of time-varying expander graphs.
Idea: this changing topology ensures every pair of endpoints is allocated a direct link at some point, giving bandwidth-efficient connectivity for bulk traffic.
Challenge with Fat-tree: communication between racks has to go through intermediate switches, which reduces network capacity between racks; this is why frameworks such as Hadoop/MapReduce favor “locality”.
Expander topology: directly link each TOR uplink to other TORs (a sparse graph with multiple short paths between source and destination) -> traffic takes multiple TOR-to-TOR hops to reach the destination TOR -> bandwidth waste -> so establish single-hop paths between source and destination.
Design
To find short paths: use an expander graph, which is also fault-tolerant because each pair has multiple paths.
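To make the short-path and fault-tolerance points concrete, here is a minimal sketch that unions a few matchings into a graph and counts TOR-to-TOR hops with breadth-first search. The three matchings over 8 racks are hypothetical, picked by hand for illustration; they are not the paper's topology, and the function names are my own.

```python
from collections import deque

def build_graph(matchings):
    """Union several matchings (lists of rack pairs) into an adjacency map."""
    adj = {}
    for matching in matchings:
        for a, b in matching:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj

def hop_count(adj, src, dst, failed_links=frozenset()):
    """Breadth-first search; returns the number of TOR-to-TOR hops, or None."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in adj.get(node, ()):
            if nxt in dist or frozenset((node, nxt)) in failed_links:
                continue
            dist[nxt] = dist[node] + 1
            queue.append(nxt)
    return None

# Hypothetical matchings: each one uses exactly one uplink per TOR.
MATCHING_A = [(0, 1), (2, 3), (4, 5), (6, 7)]
MATCHING_B = [(1, 2), (3, 4), (5, 6), (7, 0)]
MATCHING_C = [(0, 4), (1, 5), (2, 6), (3, 7)]

ring_only = build_graph([MATCHING_A, MATCHING_B])
with_chords = build_graph([MATCHING_A, MATCHING_B, MATCHING_C])

print(hop_count(ring_only, 0, 4))    # 4: two matchings only form a ring
print(hop_count(with_chords, 0, 3))  # 2: a third matching shortens paths
# Fail the direct 0-4 link: another path still exists, just longer.
print(hop_count(with_chords, 0, 4, failed_links={frozenset((0, 4))}))  # 3
```

Adding one more matching per TOR shortens the worst-case path in this toy graph from 4 hops to 2, and a failed link only lengthens the path instead of disconnecting the pair.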
To avoid bandwidth waste: reconfigure the topology so that bulk traffic gets direct paths; the amortized cost of waiting is very small for bulk traffic.
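A back-of-the-envelope sketch of why waiting is cheap for bulk traffic but not for small flows. It assumes, as a simplification of mine, that a flow waits once for its direct circuit and then sends at full link rate until done; the link speed and wait time are made-up numbers, not figures from the paper.

```python
def sending_fraction(flow_bytes, link_gbps, wait_us):
    """Share of a flow's completion time spent transmitting rather than waiting."""
    bits_per_us = link_gbps * 1000          # 1 Gb/s = 1000 bits per microsecond
    send_us = flow_bytes * 8 / bits_per_us  # time on the wire
    return send_us / (send_us + wait_us)

# Hypothetical numbers: 10 Gb/s link, 10 ms wait for the direct circuit.
bulk = sending_fraction(100_000_000, link_gbps=10, wait_us=10_000)  # 100 MB flow
small = sending_fraction(10_000, link_gbps=10, wait_us=10_000)      # 10 KB flow

print(f"bulk flow:  {bulk:.1%} of its time is spent sending")
print(f"small flow: {small:.2%} of its time is spent sending")
```

Under these assumptions the bulk flow spends most of its completion time sending, while the small flow spends almost all of it waiting, which is why small flows are better served by an immediately available multi-hop path.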
Staggered reconfiguration to ensure valid paths always exist. Continuous connectivity.
Expander graph ensures: 1. A multi-hop path always exists. 2. A single-hop path is eventually provisioned. The “cycle time” is the time needed for every source-destination pair to get a direct connection.
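The "every pair gets a direct connection within one cycle" property can be sketched with a set of disjoint matchings that together cover all rack pairs. The sketch below uses the classic round-robin (circle) method as a stand-in; it is not the paper's generation procedure, and the slot duration is a hypothetical parameter.

```python
from itertools import combinations

def round_robin_matchings(n_racks):
    """Circle method: n_racks - 1 perfect matchings covering every rack pair once."""
    if n_racks < 2 or n_racks % 2 != 0:
        raise ValueError("this sketch assumes an even number of racks")
    others = list(range(1, n_racks))
    matchings = []
    for _ in range(n_racks - 1):
        ring = [0] + others
        # Pair position i with its mirror position in the ring.
        matching = {
            frozenset((ring[i], ring[n_racks - 1 - i]))
            for i in range(n_racks // 2)
        }
        matchings.append(matching)
        others = others[-1:] + others[:-1]  # rotate everyone except rack 0
    return matchings

N_RACKS = 8
SLOT_TIME_US = 100  # hypothetical time each matching is held

matchings = round_robin_matchings(N_RACKS)
covered = set().union(*matchings)
all_pairs = {frozenset(p) for p in combinations(range(N_RACKS), 2)}

print(covered == all_pairs)            # every pair gets a direct link once per cycle
print(len(matchings) * SLOT_TIME_US)   # cycle time if matchings run one after another
```

In this simplified view the cycle time is the number of matchings times the slot time. Spreading the matchings over several circuit switches would shorten it, but that is left out here.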
Example: Two-tier leaf-spine topology
Each TOR connects to all four circuit switches. Matching A provides a multi-hop path for low-latency traffic; matching B provides a one-hop path for bulk traffic. Bulk traffic can wait until matching B is instantiated in switch 2 before it begins transmission. Latency-sensitive data can begin transmission immediately over matching A.
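Because the schedule is deterministic, a sender can work out how long bulk traffic must wait for its direct circuit. A minimal sketch, assuming a single repeating sequence of matchings over 4 racks (the schedule and function name are hypothetical, not the paper's four-switch example):

```python
def slots_until_direct(schedule, src, dst, now):
    """Slots bulk traffic must wait before a direct src-dst circuit is up."""
    pair = frozenset((src, dst))
    for wait in range(len(schedule)):
        if pair in schedule[(now + wait) % len(schedule)]:
            return wait
    return None  # this pair never gets a direct circuit in the schedule

# Hypothetical repeating schedule: one matching per time slot.
SCHEDULE = [
    {frozenset((0, 1)), frozenset((2, 3))},
    {frozenset((0, 2)), frozenset((1, 3))},
    {frozenset((0, 3)), frozenset((1, 2))},
]

print(slots_until_direct(SCHEDULE, 0, 1, now=0))  # 0: circuit is already up
print(slots_until_direct(SCHEDULE, 0, 3, now=0))  # 2: wait two slots
print(slots_until_direct(SCHEDULE, 1, 3, now=2))  # 2: wraps around the cycle
```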
The topology is generated before the network starts operating (at bootstrapping); no topology computation happens during operation.
Flow size determines whether traffic uses an indirect path right away or waits for a direct path. Alternatively, application-provided information can determine which one to use.
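The decision rule can be sketched as a simple classifier. The threshold value, the hint strings, and the function name are all hypothetical choices of mine for illustration; the notes above only say that flow size or application information drives the choice.

```python
BULK_THRESHOLD_BYTES = 10 * 1024 * 1024  # hypothetical cutoff, not from the paper

def choose_path(flow_bytes, app_hint=None):
    """Return "indirect" (send now, multi-hop) or "direct" (wait for a circuit)."""
    # An explicit application hint, when present, overrides the size heuristic.
    if app_hint == "low_latency":
        return "indirect"
    if app_hint == "bulk":
        return "direct"
    return "direct" if flow_bytes >= BULK_THRESHOLD_BYTES else "indirect"

print(choose_path(4 * 1024))                            # indirect: small flow
print(choose_path(500 * 1024 * 1024))                   # direct: bulk flow
print(choose_path(4 * 1024, app_hint="bulk"))           # direct: app says bulk
print(choose_path(500 * 1024 * 1024, app_hint="low_latency"))  # indirect
```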
Eval
Opera outperforms the other designs it is compared against, with lower flow completion times for both low-latency traffic and bulk traffic.
Opera is designed to handle bulk traffic over direct links, and the paper's throughput results show that Opera achieves higher throughput with direct paths.
Conclusion
Opera is a topology that implements a series of time-varying expander graphs that support low-latency traffic, and when integrated over time, provide direct connections between all endpoints to deliver high throughput to bulk traffic.