\section{Introduction}
The term ``datacenter'' refers to a facility used to house computer systems and associated components [1]. Many services that require high-performance computing or large storage volumes, for example web search (Google, Bing), social networks (Facebook, Twitter), cloud computing platforms (Amazon EMR and EC2), and cloud storage services (Amazon S3), are supported by large-scale datacenters. Depending on their usage, datacenters can contain anywhere from several hundred to tens of thousands of nodes.

Nodes in a datacenter are connected via multiple levels of routers and switches, as illustrated in Figure~1. As in any other network infrastructure, issues such as insufficient bandwidth, congestion, and long latency arise in datacenter networks. However, because of the characteristics of datacenters, some of these issues cause more trouble there than in other kinds of network infrastructure.

Incast: Incast is a many-to-one communication pattern commonly found in cloud datacenters [2]. When incast happens, multiple nodes respond to a single node simultaneously, causing switch/router buffer overflow. When a buffer overflows, standard TCP tries to recover by shrinking the congestion window. This does not work well in a datacenter precisely because of the many-to-one pattern: when overflow is detected, all of the responding nodes shrink and then re-grow their windows simultaneously, which yields poor performance without actually resolving the problem (a toy sketch of this behavior appears at the end of this section).

Queue buildup and buffer pressure: Flows in a datacenter can be categorized into short flows and long flows. A short flow consists of only a few packets, while a long flow may consist of far more. When a long flow is transmitted, router/switch buffers can fill up; this does not hurt overall throughput, but it adds significant delay to the responses of short flows.

In typical datacenter usage, short flows are more time-sensitive than long flows. Short flows tend to be generated by interactive user operations, for example submitting a search query or pulling up an order list, whereas a long flow might be a large file download or a disk backup. While users may not be irritated if a download takes five seconds longer, they expect instant responses to their short-flow requests.

All of the issues discussed above lead to response delay and impair a datacenter's functionality. Statistics show that Amazon's revenue decreases by 1 percent for every 100~ms of added latency, and that Walmart users with 0--1~s load times convert at twice the rate of users with 1--2~s load times. A mechanism that addresses such problems in datacenters is therefore essential.

In this paper, we present the ADaption Transmission Control Protocol (ADTCP) and an experimental evaluation of it. ADTCP is a new congestion control mechanism implemented entirely on end hosts, with no modifications to switches. The basic idea of ADTCP is that each end host maintains a flow schedule table (FST); hosts cooperate to update their FSTs and use them, rather than the switches, to determine the send time and route of each flow.
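As a rough illustration of this idea, the following minimal sketch shows what a per-host FST might look like. The field names, table structure, and scheduling policy here are our own assumptions for exposition, not ADTCP's actual design.

\begin{verbatim}
# Hypothetical sketch of a per-host flow schedule table (FST).
# Field names and the scheduling policy are illustrative
# assumptions, not ADTCP's actual design.
from dataclasses import dataclass

@dataclass
class FlowEntry:
    flow_id: int        # identifier of the flow
    size_bytes: int     # remaining size (short vs. long flow)
    deadline_ms: float  # time sensitivity of the flow
    route: int          # path index chosen through the fabric
    send_at: float      # send time agreed upon by the hosts

class FlowScheduleTable:
    # Each end host keeps one table; hosts exchange updates so
    # that send times and routes are decided cooperatively,
    # with no switch involvement.
    def __init__(self):
        self.entries = {}

    def update(self, entry):
        self.entries[entry.flow_id] = entry

    def next_to_send(self, now):
        # Example policy: among flows whose scheduled send time
        # has arrived, favor short, deadline-sensitive flows.
        ready = [e for e in self.entries.values()
                 if e.send_at <= now]
        return min(ready,
                   key=lambda e: (e.size_bytes, e.deadline_ms),
                   default=None)
\end{verbatim}

Keeping the table on the end hosts is what allows the switches to remain unmodified: the hosts themselves agree on when, and along which path, each flow is sent.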
Because the design is fully host-side, there are two main challenges in implementing ADTCP. The first is that the information available to an end host for scheduling flows is limited, since the network was not originally designed to expose such information to servers.

The contributions of this work are:
\begin{enumerate}
  \item contributions
  \item contributions
\end{enumerate}
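To make the incast behavior described above concrete, the following toy simulation (our own illustrative sketch, not part of ADTCP) shows how senders that halve and then re-grow their windows in lockstep cause the offered load to oscillate around the buffer size instead of converging. The buffer size, fan-in, and window dynamics are assumed values chosen only for illustration.

\begin{verbatim}
# Toy illustration of synchronized window collapse under incast:
# N senders share one switch buffer; on overflow, every sender
# halves its window at the same time, so the aggregate offered
# load oscillates instead of settling. All constants are assumed.
BUFFER_PKTS = 64          # assumed switch buffer size, in packets
N_SENDERS = 32            # many-to-one fan-in
window = [8] * N_SENDERS  # per-sender congestion window

for rtt in range(10):
    offered = sum(window)
    if offered > BUFFER_PKTS:
        # Overflow detected: standard TCP halves every window.
        window = [max(1, w // 2) for w in window]
        event = "overflow -> all senders back off"
    else:
        # No loss: every sender grows again, in lockstep.
        window = [w + 1 for w in window]
        event = "all senders grow"
    print(f"RTT {rtt}: offered = {offered} pkts ({event})")
\end{verbatim}

Because every sender reacts to the same overflow signal at the same time, the fabric swings between underutilization and renewed overflow; this synchronized behavior is what ADTCP's host-side scheduling is intended to avoid.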