TideWatch: Fingerprinting the cyclicality of big data workloads

Handle URI:
http://hdl.handle.net/10754/564894
Title:
TideWatch: Fingerprinting the cyclicality of big data workloads
Authors:
Williams, Daniel W.; Zheng, Shuai; Zhang, Xiangliang (0000-0002-3574-5665); Jamjoom, Hani T.
Abstract:
Intrinsic to 'big data' processing workloads (e.g., iterative MapReduce, Pregel, etc.) are cyclical resource utilization patterns that are highly synchronized across different resource types as well as the workers in a cluster. In Infrastructure as a Service settings, cloud providers do not exploit this characteristic to better manage VMs because they view VMs as 'black boxes.' We present TideWatch, a system that automatically identifies cyclicality and similarity in running VMs. TideWatch predicts period lengths of most VMs in Hadoop workloads within 9% of actual iteration boundaries and successfully classifies up to 95% of running VMs as participating in the appropriate Hadoop cluster. Furthermore, we show how TideWatch can be used to improve the timing of VM migrations, reducing both migration time and network impact by over 50% when compared to a random approach. © 2014 IEEE.
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program; Machine Intelligence & kNowledge Engineering Lab
Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Journal:
IEEE INFOCOM 2014 - IEEE Conference on Computer Communications
Conference/Event name:
33rd IEEE Conference on Computer Communications, IEEE INFOCOM 2014
Issue Date:
Apr-2014
DOI:
10.1109/INFOCOM.2014.6848144
Type:
Conference Paper
ISSN:
0743-166X
ISBN:
9781479933600
Appears in Collections:
Conference Papers; Computer Science Program; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Williams, Daniel W. | en
dc.contributor.author | Zheng, Shuai | en
dc.contributor.author | Zhang, Xiangliang | en
dc.contributor.author | Jamjoom, Hani T. | en
dc.date.accessioned | 2015-08-04T07:24:23Z | en
dc.date.available | 2015-08-04T07:24:23Z | en
dc.date.issued | 2014-04 | en
dc.identifier.isbn | 9781479933600 | en
dc.identifier.issn | 0743-166X | en
dc.identifier.doi | 10.1109/INFOCOM.2014.6848144 | en
dc.identifier.uri | http://hdl.handle.net/10754/564894 | en
dc.description.abstract | Intrinsic to 'big data' processing workloads (e.g., iterative MapReduce, Pregel, etc.) are cyclical resource utilization patterns that are highly synchronized across different resource types as well as the workers in a cluster. In Infrastructure as a Service settings, cloud providers do not exploit this characteristic to better manage VMs because they view VMs as 'black boxes.' We present TideWatch, a system that automatically identifies cyclicality and similarity in running VMs. TideWatch predicts period lengths of most VMs in Hadoop workloads within 9% of actual iteration boundaries and successfully classifies up to 95% of running VMs as participating in the appropriate Hadoop cluster. Furthermore, we show how TideWatch can be used to improve the timing of VM migrations, reducing both migration time and network impact by over 50% when compared to a random approach. © 2014 IEEE. | en
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en
dc.title | TideWatch: Fingerprinting the cyclicality of big data workloads | en
dc.type | Conference Paper | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Computer Science Program | en
dc.contributor.department | Machine Intelligence & kNowledge Engineering Lab | en
dc.identifier.journal | IEEE INFOCOM 2014 - IEEE Conference on Computer Communications | en
dc.conference.date | 27 April 2014 through 2 May 2014 | en
dc.conference.name | 33rd IEEE Conference on Computer Communications, IEEE INFOCOM 2014 | en
dc.conference.location | Toronto, ON | en
dc.contributor.institution | IBM T. J. Watson Research Center, Yorktown Heights, NY, United States | en
kaust.author | Zheng, Shuai | en
kaust.author | Zhang, Xiangliang | en