Type
Conference Paper
KAUST Department
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Computer Science Program
Machine Intelligence & kNowledge Engineering Lab
Date
2014-04
Permanent link to this record
http://hdl.handle.net/10754/564894
Abstract
Intrinsic to 'big data' processing workloads (e.g., iterative MapReduce, Pregel, etc.) are cyclical resource utilization patterns that are highly synchronized across different resource types as well as the workers in a cluster. In Infrastructure as a Service settings, cloud providers do not exploit this characteristic to better manage VMs because they view VMs as 'black boxes.' We present TideWatch, a system that automatically identifies cyclicality and similarity in running VMs. TideWatch predicts period lengths of most VMs in Hadoop workloads within 9% of actual iteration boundaries and successfully classifies up to 95% of running VMs as participating in the appropriate Hadoop cluster. Furthermore, we show how TideWatch can be used to improve the timing of VM migrations, reducing both migration time and network impact by over 50% when compared to a random approach. © 2014 IEEE.
Citation
Williams, D., Zheng, S., Zhang, X., & Jamjoom, H. (2014). TideWatch: Fingerprinting the cyclicality of big data workloads. IEEE INFOCOM 2014 - IEEE Conference on Computer Communications. doi:10.1109/infocom.2014.6848144
Conference/Event name
33rd IEEE Conference on Computer Communications, IEEE INFOCOM 2014
ISBN
9781479933600
DOI
10.1109/INFOCOM.2014.6848144