Existing Work

    To address the problem of congestion in datacenters, existing work falls mainly into two categories.

    The first category is implicit congestion control, which uses ECN to anticipate buffer overflow at the switch/router, e.g. DCTCP \cite{bhakar_Sengupta_Sridharan_2010} and D2TCP \cite{Vamanan_Hasan_Vijaykumar_2012}. These schemes are well compatible with current devices. However, smoothly adjusting the sliding-window size takes several RTTs, and this seemingly short period is in fact significant for small flows, which may finish within a few RTTs.

    The second category is explicit flow control at the switch/router, e.g. D3 \cite{llani_Karagiannis_Rowtron_2011}, PDQ \cite{Hong_Caesar_Godfrey_2012}, and pFabric \cite{McKeown_Prabhakar_Shenker_2013}. Although these designs can be much simpler and their simulation results look promising, they are hard to deploy in the real world because they require hardware changes such as specially designed switches/routers. The table below compares the five approaches on five metrics.

\begin{tabular}{|l|c|c|c|c|c|}
\hline
Protocol & Fast response (no complex rate control) & Compatibility & Distributed design & Shares info (scalable) & Preemptive \\
\hline
DCTCP & \({\times}\) & \(\surd\) & \(\surd\) & implicitly & \({\times}\) \\
D2TCP & \({\times}\) & \(\surd\) & \(\surd\) & implicitly & \(\surd\) \\
D3 & \({\times}\) & \({\times}\) (no backwards compatibility) & \(\surd\) & \(\surd\) & \({\times}\) \\
PDQ & \(\surd\) & \({\times}\) & \(\surd\) & \(\surd\) & \(\surd\) \\
pFabric & \(\surd\) & \({\times}\) (clean-slate) & \(\surd\) & \({\times}\) & \(\surd\) \\
\hline
\end{tabular}


DCTCP (Data Center TCP): As a new variant of TCP, DCTCP is designed for the needs of datacenters that host diverse applications: it aims to keep buffer occupancy persistently low while maintaining high throughput for long flows in a mix of short and long flows. The main idea of DCTCP is to use ECN (Explicit Congestion Notification) to provide multi-bit feedback to the end hosts, which then react early to congestion.
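Concretely, each DCTCP sender keeps a running estimate \(\alpha\) of the fraction of its packets that were ECN-marked, updated once per window of data, and scales its window cut by \(\alpha\) instead of always halving:
\[
\alpha \leftarrow (1-g)\,\alpha + g\,F,
\qquad
\mathit{cwnd} \leftarrow \mathit{cwnd} \times \Bigl(1-\frac{\alpha}{2}\Bigr),
\]
where \(F\) is the fraction of packets marked in the most recent window and \(g \in (0,1)\) is a fixed estimation weight. Under mild congestion (\(\alpha\) near 0) the window shrinks only slightly, keeping throughput high; under persistent congestion (\(\alpha = 1\)) DCTCP halves its window like conventional TCP.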
D2TCP (Deadline-Aware Data Center TCP): The key idea of D2TCP is to vary the sending window (sending rate) based on both the deadline and the extent of congestion: near-deadline flows back off less while far-deadline flows back off more. Built on top of DCTCP and using per-flow state at the end hosts, D2TCP is deadline-aware and can handle fan-in bursts. As a result, senders react to congestion without any knowledge of other flows.
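This policy is realized by a penalty function: D2TCP reuses DCTCP's congestion estimate \(\alpha\) but raises it to a deadline-imminence factor \(d\), where \(d > 1\) for near-deadline flows and \(d < 1\) for far-deadline flows:
\[
p = \alpha^{d},
\qquad
\mathit{cwnd} \leftarrow
\begin{cases}
\mathit{cwnd} \times \bigl(1-\frac{p}{2}\bigr), & p > 0,\\
\mathit{cwnd} + 1, & p = 0.
\end{cases}
\]
Because \(0 \le \alpha \le 1\), a near-deadline flow sees \(p \le \alpha\) and backs off less, while a far-deadline flow sees \(p \ge \alpha\) and backs off more; with no congestion (\(p = 0\)) the window grows as in standard TCP.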
D3 (Deadline-Driven Delivery protocol): The main idea of D3 is to make the network aware of flow deadlines and to prioritize flows based on those deadlines. D3 schedules network traffic based on SLAs and can ultimately double the peak load a datacenter supports.
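Concretely, a D3 sender with \(s\) bytes remaining and \(t\) seconds left until its deadline requests the rate
\[
r = \frac{s}{t}
\]
from the network once per RTT, carried in the packet header; each router along the path greedily grants requests from its spare capacity, and flows without deadlines share whatever capacity is left over.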
PDQ (Preemptive Distributed Quick flow scheduling): The main idea of PDQ is to schedule flows based on flow criticality: the sender appends the flow's criticality to the packet header, switches allocate bandwidth among flows and tag each flow's sending rate in its packet headers, and the sender then transmits at the rate carried in the header. The scheduling is both preemptive and dynamic: under preemptive scheduling, less-critical flows yield to more critical ones, while under dynamic scheduling a flow's criticality may change over time. PDQ supports four main scheduling disciplines, which determine flow criticality: EDF (Earliest Deadline First), SJF (Shortest Job First), EDF+SJF, and policy-based.
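To make the criticality comparison concrete, the sketch below shows the switch-side logic in Python; the data layout and function names are our own illustration of the disciplines named above, not code from the PDQ paper.
\begin{verbatim}
# Illustrative sketch only: field and function names are ours, not PDQ's.
def criticality_key(flow, discipline="EDF+SJF"):
    # Smaller key = more critical. EDF ranks by deadline, SJF by
    # remaining size; EDF+SJF breaks deadline ties by remaining size.
    if discipline == "EDF":
        return (flow["deadline"],)
    if discipline == "SJF":
        return (flow["remaining_size"],)
    return (flow["deadline"], flow["remaining_size"])

def allocate(flows, link_rate):
    # Preemptive allocation: the most critical flow gets the whole
    # link; every other flow is paused (rate 0) until it finishes
    # or a more critical flow arrives.
    if not flows:
        return {}
    ranked = sorted(flows, key=criticality_key)
    return {f["id"]: (link_rate if f is ranked[0] else 0.0)
            for f in ranked}

# Example: flow A's deadline is earlier, so it preempts flow B.
# allocate([{"id": "A", "deadline": 2.0, "remaining_size": 10},
#           {"id": "B", "deadline": 5.0, "remaining_size": 1}], 10.0)
# -> {"A": 10.0, "B": 0.0}
\end{verbatim}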
pFabric (Minimal Near-Optimal Datacenter Transport): pFabric decouples flow scheduling from rate control in datacenter packet transport and uses a simple mechanism to achieve each goal separately. For flow scheduling, every packet carries a single priority number, set independently by each flow. For rate control, all flows start at line rate and throttle their sending rate only if they see high and persistent loss \cite{McKeown_Prabhakar_Shenker_2013}. In simulations, the results confirm that this mechanism provides good performance.
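The switch-side behavior admits a very small sketch: packets carry a priority set by the end host (e.g., the flow's remaining size, so smaller is more urgent); the switch always dequeues the highest-priority packet and, when its small buffer overflows, drops the lowest-priority one. The Python below is our own illustration of that mechanism, not code from the pFabric paper.
\begin{verbatim}
# Illustrative sketch only: class and method names are ours.
class PFabricQueue:
    def __init__(self, capacity):
        self.capacity = capacity   # pFabric assumes very small buffers
        self.packets = []          # (priority, payload); lower = more urgent

    def enqueue(self, priority, payload):
        self.packets.append((priority, payload))
        if len(self.packets) > self.capacity:
            # Priority dropping: evict the least urgent packet,
            # which may be the one that just arrived.
            worst = max(self.packets, key=lambda p: p[0])
            self.packets.remove(worst)

    def dequeue(self):
        # Priority scheduling: always send the most urgent packet.
        if not self.packets:
            return None
        best = min(self.packets, key=lambda p: p[0])
        self.packets.remove(best)
        return best
\end{verbatim}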