Introduction

    A datacenter is a facility used to house computer systems and associated components [1]. Many services that require high-performance computing or large storage volumes today, such as web search (Google, Bing), social networks (Facebook, Twitter), cloud computing platforms (Amazon EMR and EC2), and cloud storage services (Amazon S3), are supported by large-scale datacenters. Depending on its purpose, a datacenter may contain anywhere from several hundred to tens of thousands of nodes.
    Nodes in a datacenter are connected through multiple levels of routers and switches, as illustrated in Graph 1. As in any other network infrastructure, problems such as insufficient bandwidth, congestion, and long latency occur in datacenter networks. However, because of the characteristics of datacenter traffic, some network issues cause more trouble in datacenters than in other kinds of network infrastructure.
    Incast: Incast is a many-to-one communication pattern commonly found in cloud datacenters [2]. When incast occurs, multiple nodes respond to a single node simultaneously, which causes the switch or router buffer to overflow. When the buffer overflows, standard TCP tries to solve the problem by reducing the size of its sliding window. However, this does not work well in datacenters because of the many-to-one pattern: when buffer overflow is detected, all the responding nodes shrink and then re-grow their windows at the same time, which results in poor performance and does not resolve the underlying problem.
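    The synchronized shrink-and-regrow behaviour can be illustrated with a toy round-based model. The sketch below is not a TCP implementation: the function name, the one-packet-per-round window growth, and the single shared capacity figure are all simplifying assumptions made for illustration. It only shows that when every sender reacts to the same overflow in the same round, the total offered load repeatedly climbs back to the overflow point.

```python
def simulate_incast(num_senders, capacity_pkts, rounds):
    """Toy model: all senders share one switch port and react in lockstep.

    capacity_pkts is an assumed figure for how many packets the port can
    absorb per round (buffer plus drain); it is not taken from the paper.
    """
    windows = [1] * num_senders          # every sender starts with a window of 1
    offered_history = []                 # total packets offered in each round
    overflow_rounds = []                 # rounds in which the buffer overflowed

    for rnd in range(rounds):
        offered = sum(windows)
        offered_history.append(offered)
        if offered > capacity_pkts:
            # Overflow: every sender detects loss and halves its window
            # in the same round.
            overflow_rounds.append(rnd)
            windows = [max(1, w // 2) for w in windows]
        else:
            # No loss: every sender grows its window by one packet.
            windows = [w + 1 for w in windows]

    return offered_history, overflow_rounds


if __name__ == "__main__":
    history, overflows = simulate_incast(num_senders=8, capacity_pkts=32, rounds=10)
    print("offered load per round:", history)
    print("overflow rounds:", overflows)
```

    Because the senders never desynchronize in this model, the overflow recurs with a fixed period no matter how long the simulation runs.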
    Queue buildup and buffer pressure: Data flows in datacenters can be categorized into short flows and long flows. Short flows consist of few packets, while long flows may consist of many more. When a long flow is transmitted, it can fill up router and switch buffers; this does not affect overall throughput, but it adds significant delay to the responses of short flows that share those buffers.
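    The added delay can be estimated with simple arithmetic: a packet arriving at a FIFO queue must wait for every packet ahead of it to be transmitted. The sketch below assumes a single FIFO output queue and uses illustrative numbers (1000 queued packets of 1500 bytes on a 1 Gbps link); none of these figures come from the measurements in this paper.

```python
def queueing_delay_ms(queued_pkts, pkt_bytes, link_gbps):
    """Time (in ms) a newly arrived packet waits behind `queued_pkts`
    packets of `pkt_bytes` bytes each on a link of `link_gbps` Gbps."""
    bits_ahead = queued_pkts * pkt_bytes * 8
    seconds = bits_ahead / (link_gbps * 1e9)
    return seconds * 1e3


if __name__ == "__main__":
    # Illustrative values only: a long flow has left 1000 full-size packets queued.
    print(queueing_delay_ms(queued_pkts=1000, pkt_bytes=1500, link_gbps=1.0))
```

    Under these assumed values, a short flow's packet waits roughly 12 ms before it is even transmitted, although the link itself stays fully utilized.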
    In typical datacenter usage, short flows are more time-sensitive than long flows. Short flows are more likely to be generated by interactive user operations, for example submitting a search query or pulling an order list. A long flow might be a large file download or a disk backup. While users may not be irritated if a download takes 5 seconds longer, they expect near-instant responses to their short-flow requests.
    All of the issues discussed above cause response delays and impair the datacenter's functionality. Reported statistics indicate that Amazon's revenue decreases by 1 percent for every 100 ms of latency, and that Walmart users who experience load times of 0-1 s convert at twice the rate of those who experience load times of 1-2 s. Hence, a mechanism that addresses these problems in datacenters is essential.
    In this paper, we present the ADaption Transmission Control Protocol (ADTCP) and its experimental evaluation. ADTCP is a new transport-layer protocol whose congestion control mechanism is implemented entirely on the end hosts, with no modifications to switches. The basic idea of ADTCP is to construct a flow schedule table (FST) on each end host that schedules flows by priority and flow identifier; hosts in the datacenter cooperate and update their FSTs to determine the send time and route of each flow, instead of leaving these decisions to the switches. A fully host-side design poses two main challenges. First, the information available to a server for scheduling flows is limited, because networks were not originally designed to expose such information to end hosts. To address this, we use ECN (Explicit Congestion Notification) to inform the sending host of congestion at the routers or switches along the path. Second, routers and switches conventionally make routing decisions in real time. In ADTCP, the sending host instead makes routing decisions through source routing to achieve multi-path balancing, which we refer to in this paper as priority random jump.
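    To make the idea of a host-side flow schedule table more concrete, the following is a minimal sketch under stated assumptions. The class and method names (FlowEntry, FlowScheduleTable, on_ecn_mark), the convention that a larger priority value means a more urgent flow, and the halving back-off factor are all hypothetical; they are not the ADTCP design, which is described in the ADTCP section. The sketch only illustrates one plausible way a host could keep per-flow priorities and slow down its least urgent flow when an ECN mark arrives.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    flow_id: str
    priority: int       # assumption: larger value = more time-sensitive
    rate_mbps: float    # current sending rate allotted to the flow


class FlowScheduleTable:
    """Hypothetical per-host table of active flows, keyed by flow identifier."""

    def __init__(self):
        self._flows = {}

    def add_flow(self, flow_id, priority, rate_mbps):
        self._flows[flow_id] = FlowEntry(flow_id, priority, rate_mbps)

    def rate_of(self, flow_id):
        return self._flows[flow_id].rate_mbps

    def send_order(self):
        # Most urgent flows are scheduled first.
        ordered = sorted(self._flows.values(), key=lambda f: -f.priority)
        return [f.flow_id for f in ordered]

    def on_ecn_mark(self, backoff=0.5):
        """React to an ECN congestion signal by slowing the least urgent flow.

        Returns the identifier of the flow that was slowed, or None if the
        table is empty. The back-off factor is an illustrative assumption.
        """
        if not self._flows:
            return None
        victim = min(self._flows.values(), key=lambda f: f.priority)
        victim.rate_mbps *= backoff
        return victim.flow_id


if __name__ == "__main__":
    fst = FlowScheduleTable()
    fst.add_flow("search-query", priority=10, rate_mbps=100.0)
    fst.add_flow("disk-backup", priority=1, rate_mbps=800.0)
    print(fst.send_order())
    print(fst.on_ecn_mark(), fst.rate_of("disk-backup"))
```

    In this sketch the short, time-sensitive flow keeps its rate while the long background flow absorbs the back-off, which matches the motivation given earlier for treating short and long flows differently.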
    The contributions of this work are:

  1. We design and implement the flow-down mechanism to avoid flow congestion.

  2. We design the flow-random mechanism to maintain multi-path balancing.

  3. We use priority scheduling with the FST to decide which flows should be slowed down.
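    As a rough illustration of how random path selection at the sender can spread flows across multiple paths, consider the sketch below. The function name pick_path, the representation of a path as a tuple of switch names, and the rule of avoiding the path that was just reported as congested are assumptions made for this example only; the actual flow-random mechanism is specified in the ADTCP section.

```python
import random
from collections import Counter


def pick_path(paths, rng, avoid=None):
    """Choose a source route uniformly at random, skipping `avoid` if possible.

    `paths` is a list of candidate routes; `avoid` is an optional route the
    sender wants to move away from (for example, one that was ECN-marked).
    """
    candidates = [p for p in paths if p != avoid] or list(paths)
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)   # seeded only to make the demonstration repeatable
    # Hypothetical two-level topology: three equal-cost routes to the destination.
    paths = [("agg1", "core1"), ("agg1", "core2"), ("agg2", "core1")]

    # Spread 3000 new flows across the candidate routes.
    usage = Counter(pick_path(paths, rng) for _ in range(3000))
    print(usage)

    # A flow on a congested route jumps to a different, randomly chosen route.
    current = paths[0]
    print(pick_path(paths, rng, avoid=current))
```

    Uniform random choice needs no coordination between hosts, which is why it is a common baseline for multi-path balancing; a priority-aware variant could restrict which flows are allowed to jump.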


    In the following sections, we analyze traffic measurements in datacenter networks with ADTCP. In the existing work section, we discuss the shortcomings of existing solutions and compare them with ADTCP. In the ADTCP section, we introduce the design details of the ADaption Transmission Control Protocol.