T-RACKs: A Faster Recovery Mechanism for TCP in Data Center Networks
KAUST Department: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology (HKUST), Hong Kong. He is now with the Computer Science Department, Assiut University, Assiut 71515, Egypt, and also with the CEMSE Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
Permanent link to this record: http://hdl.handle.net/10754/668040
Abstract: Interactive, data-driven cloud applications generate swarms of small TCP flows that compete for the small switch buffer space in data centers. Such applications require a small flow completion time (FCT) to be effective. Unfortunately, TCP is myopic with respect to the composite nature of application data. In addition, because its Internet-centric design fixes the retransmission timeout (RTO) at no less than hundreds of milliseconds, it tends to artificially inflate the FCT of individual flows by several orders of magnitude. To better understand this problem, in this paper we use empirical measurements in a small data center testbed to study, at a microscopic level, the effects of various types of packet losses on TCP's performance. In particular, we single out packet losses that affect the tail end of small flows, as well as bursty losses that span a significant fraction of small TCP congestion windows, and show that such losses have a non-negligible effect on the FCT. Based on this, we propose timely-retransmitted ACKs (T-RACKs), a simple loss recovery mechanism that conceals the drawbacks of the long RTO even in the presence of heavy packet losses. T-RACKs achieves this transparently to TCP itself, as it does not require any change to TCP in the tenant's virtual machine (VM) or container. T-RACKs can be implemented as a software shim layer in the hypervisor, between the VMs and the server's NIC, or in hardware as a networking function in a SmartNIC. Simulation and real testbed results show remarkable performance improvements.
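To make the "several orders of magnitude" claim concrete, the following back-of-the-envelope sketch compares the FCT of a small flow with and without a single timeout. It is not taken from the paper: the 100-microsecond RTT, the 200 ms minimum RTO, the 30-packet flow, the initial window of 10 packets, and the idealized slow-start model are all illustrative assumptions, and the function names are hypothetical.

```python
def rounds_to_send(num_packets, init_cwnd=10):
    """RTT rounds needed under idealized slow start (window doubles each round).

    Assumption: no losses, no receiver-window limit, init_cwnd in packets.
    """
    rounds, sent, cwnd = 0, 0, init_cwnd
    while sent < num_packets:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds


def fct_seconds(num_packets, rtt, rto_events=0, min_rto=0.2, init_cwnd=10):
    """Approximate FCT: transmission rounds plus one min_rto wait per timeout."""
    return rounds_to_send(num_packets, init_cwnd) * rtt + rto_events * min_rto


# Assumed values for illustration only.
RTT = 100e-6      # assumed intra-data-center round-trip time (seconds)
MIN_RTO = 0.2     # assumed minimum RTO of 200 ms (seconds)
FLOW_PKTS = 30    # assumed small-flow size (packets)

ideal = fct_seconds(FLOW_PKTS, RTT)
with_timeout = fct_seconds(FLOW_PKTS, RTT, rto_events=1, min_rto=MIN_RTO)
inflation = with_timeout / ideal

print(f"ideal FCT:        {ideal * 1e3:.3f} ms")
print(f"FCT with one RTO: {with_timeout * 1e3:.3f} ms")
print(f"inflation factor: {inflation:.0f}x")
```

Under these assumptions a single tail loss that can only be recovered by a timeout stretches a two-round flow from about 0.2 ms to about 200 ms, roughly a thousandfold increase, which is the kind of inflation a faster recovery mechanism aims to avoid.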
Citation: Abdelmoniem, A. M., & Bensaou, B. (2021). T-RACKs: A Faster Recovery Mechanism for TCP in Data Center Networks. IEEE/ACM Transactions on Networking, 1–14. doi:10.1109/tnet.2021.3059913
Sponsors: The work of Ahmed M. Abdelmoniem was supported by the Hong Kong Ph.D. Fellowship Scheme through the Research Grant Council (RGC) under Grant HKPFS PF12-16707.