Zibo's Website A digital collection of my work and life

Paper Study - ExLL An Extremely Low-latency Congestion Control for Mobile Cellular Networks

** This is a study & review of the ExLL paper. I’m not an author.

ExLL: An Extremely Low-latency Congestion Control for Mobile Cellular Networks

Intro

  • Data latency: devastating effect on applications, such as self-driving cars, autonomous robots, and remote surgery (5G apps)

  • TCP sessions often have long delay: bufferfloat: over-buffering problem at bottleneck link because of TCP’s loose congestion control
  • More paper confirming bufferfloat is severe in cellular network

Cellular Network Characteristics for Protocol Design:

  • Downlink Scheduling: the base station (BS) schedules downlink packets towards multiple user equipments (UEs) at 1 ms granularity (a.k.a. transmission time interval, TTI), based on both the signal strength reported by each UE and the current traffic load -> Infer cellular link bandwidth instantly with reception pattern rather than explicitly probing for bandwidth or measuring for a long period of time
  • Uplink Scheduling: (the BS needs to grant uplink transmission eligibility for each UE, which happens at a regular interval known as SR (scheduling request) periodicity) Ignoring this periodicity -> underestimate minimum RTT -> CC to run in unrealistic operating conditions.
  • Minimum RTT is not affected by channel condition between UR and BS (due to adaptive MCS (modulation and coding scheme) selection used in LTE systems, which approximately eliminates MAC-layer retransmissions, but also not affected by other UEs connected to the same BS since per-UE queue is isolated)

Existing Protocols:

​ Testing done with Android phones and LTE connections

  • Low latency and throughput can’t be achieved together

fig1

  • ExLL:
    • Target: Low latency with high throughput
    • Estimations done at UE -> ExLL a receiver-drive CC
    • Estimates bandwidth of links by analyzing packet reception pattern in downlink
    • Estimates min RTT with SR periodicity in uplink
    • Uses control feedback from FAST for receive winding RWND
    • Server set its congestion window, CWND, upon RWND from receiver -> immediately deployable
    • ExLL can be sender-drive, small performance gap

Observation

  • Measurement Set-up

    table1

  • Max throughput and min RTT:

    • Moving device (change in RSSI, signal strength): stable RTT, unstable throughput
    • Screen Shot 2020-02-07 at 20.16.11
  • Min RTT over different RSSI values:

    • Min RTT not much affected by RSSI values
    • Screen Shot 2020-02-07 at 20.20.40
  • Per-UE queue:

    • Per-UE queue is important because delay in one UE will not affect other devices in the same Base Station (BS).
    • Screen Shot 2020-02-08 at 03.37.04

ExLL’s Network Inference

  • Downlink: efficient probing is thought to be important

    • Challenge: injecting packets that causes network over-buffering to probe bandwidth is bad for latency

    • Solutions:

      • Extracting physical layer parameters
        • Only supported by specific vendors
        • Needs software tools needing rooting a device
      • Machine Learning
        • Takes time to learn and respond -> latency in response to traffic load/channel condition change
      • ExLL: estimates bandwidth rather than probing
    • Scheduling Pattern

      • Screen Shot 2020-02-08 at 03.56.55
      • Figure 4 (a): At initial stage, the number of received packets in a radio frame increases rapidly as CWND increases -> downlink behavior temporarily depends on CWND growth
      • Screen Shot 2020-02-08 at 03.57.16
      • Figure 4 (b): the patterns of allocated subframes change, but the total number of packet receptions per radio frame becomes stable ( N(f) = 159, 162, 167 )
    • Downlink bandwidth estimation

      • F(·): packet reception during one radio frame (received bytes in one radio frame / 10 ms)
      • C(·): estimation of max bandwidth (before splitting for UEs)
      • Screen Shot 2020-02-10 at 12.11.27
      • F(·) ≈ Measured throughput
      • C(·)(a) ≈ 300 Mbps (30 MHz capacity), C(·)~(b)~ ≈ 400 Mbps (40 MHz capacity)
      • Poor channel condition is detected & C(·) is adjusted in (a) at t = 14s

      • Screen Shot 2020-02-10 at 12.13.02
  • Uplink

    • Scheduling pattern
    • Screen Shot 2020-02-10 at 12.23.21
    • Granularity of (a) is ~ 1ms, (b) is ~ 10ms -> SR (scheduling request) periodicity for uplink is 10ms.
    • CDF (cumulative density function) of RTT, 37 - 47 ms, avg = 42 ms, max - min = 10ms = SR periodicity
    • Challenge: use min RTT as measurement or target for controlling CWND -> conservative, loss throughput
      • Solution: new estimation technique that incorporates SR periodicity.

ExLL Design

  • Control Algorithm

    • Based on FAST control equation:

      ​ \(w_{i+1} = (i - \gamma)w_i + \gamma(\frac{mRE_i}{R_i}w_i + \alpha)\)

      • \[\gamma \in (0,1] , \alpha > 0, w_i \ , \ R_i \ , \ mER_i \ is \ CWND, \ RTT, minimum \ RTT \ estimate \ at \ i\]
      • CWND grows at constant factor: $\alpha$

      • $ \alpha $ determines the amount of queueing in the bottleneck of a flow, which accumulates for multiple flows, the agility of bandwidth adaptation, and the robustness in maintaining high throughput (many inherent trade-offs)

      • Example: small $ \alpha $

        Pros Cons
        small queue, low latency slow adaption, RTT fluctuation -> -> over reduce CWND -> empty queue -> less throughput
    • ExLL solves the above trade-offs by

      \[\alpha \to \alpha(1-\frac{T_i}{MTE_i})\]
    • $ T_i $, $ MTE_i $ is measured throughput, maximum throughput estimate at time i

      \(w_{i+1} = (i - \gamma)w_i + \gamma(\frac{mRE_i}{R_i}w_i + \alpha(1-\frac{T_i}{MTE_i}))\) Eq.3

    • Large $ \alpha $ : agile and robust bandwidth probing + no over buffering at bottleneck link, because growth factor $ \alpha(1-\frac{T_i}{MTE_i}) \to 0$ as $ T_i \to MTEi $

  • State Transition (workflow) receiver-drivenScreen Shot 2020-02-10 at 14.00.50

    • Observation mode

      • When Cubic’s CWND <= bandwidth estimate
    • Control mode

      • When Cubic’s CWND > bandwidth estimate
        • report RWND, compute wi+1, send back to server
    • Recovery logic (more on this later)

      • Packet loss or CWND reduced <= RWND
        • Store RWND value, stop updating RWND until CWND >= RWND again
      • Cubic reset
        • Restart ExLL from observation mode
  • Sender-driven: plug-in for CubicScreen Shot 2020-02-10 at 14.01.13

    • ExLL Control mode
      • When \(\frac{cwnd^C}{mRE_S} > MTE_S\)
        • Calculate $ cwnd^E_i $ from Eq.3
        • Overrides Cubic when $ cwnd^E < cwnd^C $
  • MTE calculation

    • F(·) is updated per radio frame = 10ms, much fast than sender’s information delayed by RTT ~ 40+ ms.
    • Sender-drive: $ MTE_S = \frac{cwnd}{\Delta t} $ , where $ \Delta t = t_{last Ack} - t_{first Ack} $, small difference in actual experiment measurements
    • Screen Shot 2020-02-10 at 14.24.44
  • mRE calculation

    • Observation: track minimum, average RTT per packet with mpRTT, apRTT

    • Control

      • $ mRE = mpRTT + D(2*(apRTT - mpRTT)) $

      Where D(·) finds matching SR periodicity in (5, 10, 20, 40, 80) from

      $ \hat T^{SR} = 2*(apRTT - mpRTT) $

      • Experiment: a eNB that uses SR periodicity of 10ms ≈ $\hat T^{SR}$

      • Screen Shot 2020-02-10 at 14.32.09
      • eNBs with 10, 20, 30 ms SR periodicity
      • Screen Shot 2020-02-10 at 14.34.38
  • Recovery from loss or timeout

  • CWND reduced to < RWND for packet loss
    • Store RWND value, stop updating RWND until CWND >= RWND again
  • Cubic resets, CWND = default initial CWND, for timeout
    • Restart ExLL from observation mode

Evaluation

  • Receiver- vs. sender-side ExLL: marginal difference, receiver-side slightly better
    • A real LTE network, mRTT = 50ms, max throughput = 150 Mbps.
    • Showing a similar results, stable RTT, throughput, both close to LTE limits.
    • CWND of sender-side ExLL less stable.
    • Screen Shot 2020-02-10 at 14.49.58
    • In multiple LTE networks
    • Sender-side ExLL slightly worse than receiver-side
    • Screen Shot 2020-02-10 at 14.59.11
    • LTE, RTT = 50ms, Tput = 100 -> 50 Mbps
    • Smooth transition from 100 Mbps to 50 Mbps
    • Screen Shot 2020-02-10 at 15.01.51
    • In different mobility scenarios
    • Screen Shot 2020-02-10 at 15.03.38
  • Static Channel:
    • 90 Mbps, 50ms, ExLL vs. BRR: more accurate RTT, CWND, less fluctuation in throughput
    • Screen Shot 2020-02-10 at 15.05.58
    • High throughput, close to Cubic + low latency, close to BDP, PropRate
    • Screen Shot 2020-02-10 at 15.08.06
  • Mobile Channel
    • Smaller RTT, smoother CWND in congested time, 20-40, 80-100
    • Screen Shot 2020-02-10 at 15.12.46
    • Outperforms others by a large margin
    • Screen Shot 2020-02-10 at 15.14.55
  • Multiple Flows
    • Fairness (Tput) much better than BRR
    • Screen Shot 2020-02-10 at 15.16.49
    • Screen Shot 2020-02-10 at 15.18.50
  • Non-cellular bottleneck adaptation
    • 0-30s: non-cellular bottleneck: observation mode
    • 30-60s: cellular bottleneck: control mode
    • beyond 60s:non-cellular bottleneck: observation mode
    • Screen Shot 2020-02-10 at 15.22.01
    • Runs as Cubic for fairness in first 120s, when non-cellular bottleneck disappears, fully utilize bandwidth
    • Screen Shot 2020-02-10 at 15.26.41
    • Fair share of bandwidth with other ExLL flows
    • Screen Shot 2020-02-10 at 15.28.51

Applications Performance

  • Showed significant amount of improvement on applications level speed

Screen Shot 2020-02-10 at 15.31.02

  • For Internet
    • Vegas, FAST: Limited deployment, complex.
    • BBR: high latency
    • Copa: low throughput
  • For Datacenter
    • pHost: unrealistic assumptions: core is free, size of flows is known in advance
    • ExpressPass: new switch function, overhead in packets
  • For Cellular:
    • DRWA, CQIC, Verus, CLAW, PropRate: worse performances

Conclusion

  • Develop novel techniques that can estimate the cellular link bandwidth and realistic minimum RTT without explicit probing, which can be easily extended to next-generation cellular technologies such as 5G.
  • Incorporate the control logic of FAST into ExLL to minimize the latency even in dynamic cellular channel conditions.
  • Implement ExLL in both receiver- and sender-side versions that give wider deployment opportunities. The receiver-side ExLL can provide an immediate solution for untouched commodity servers while the sender-side ExLL can provide a fundamental solution for 5G URLLC (ultra reliability and low latency communication).
[ paper academic ]