Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph

Handle URI:
http://hdl.handle.net/10754/622527
Title:
Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph
Authors:
Alabdulmohsin, Ibrahim (0000-0002-9387-5820); Han, Yufei; Shen, Yun; Zhang, Xiangliang (0000-0002-3574-5665)
Abstract:
Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph fusing file dropping relationship and the topology of the file distribution network. The integration offers a unique ability of structuring the end-to-end distribution relationship. However, it brings large heterogeneous graphs to analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach does not need to examine the source codes nor inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with a high accuracy. © 2016 Copyright held by the owner/author(s).
KAUST Department:
King Abdullah University of Science and Technology, Saudi Arabia
Citation:
Alabdulmohsin I, Han Y, Shen Y, Zhang X (2016) Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM ’16. Available: http://dx.doi.org/10.1145/2983323.2983700.
Publisher:
Association for Computing Machinery (ACM)
Journal:
Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16
Conference/Event name:
25th ACM International Conference on Information and Knowledge Management, CIKM 2016
Issue Date:
26-Oct-2016
DOI:
10.1145/2983323.2983700
Type:
Conference Paper
Additional Links:
http://dl.acm.org/citation.cfm?doid=2983323.2983700
Appears in Collections:
Conference Papers

Full metadata record

DC Field | Value | Language
--- | --- | ---
dc.contributor.author | Alabdulmohsin, Ibrahim | en
dc.contributor.author | Han, Yufei | en
dc.contributor.author | Shen, Yun | en
dc.contributor.author | Zhang, Xiangliang | en
dc.date.accessioned | 2017-01-02T09:55:28Z | -
dc.date.available | 2017-01-02T09:55:28Z | -
dc.date.issued | 2016-10-26 | en
dc.identifier.citation | Alabdulmohsin I, Han Y, Shen Y, Zhang X (2016) Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM ’16. Available: http://dx.doi.org/10.1145/2983323.2983700. | en
dc.identifier.doi | 10.1145/2983323.2983700 | en
dc.identifier.uri | http://hdl.handle.net/10754/622527 | -
dc.description.abstract | Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph fusing file dropping relationship and the topology of the file distribution network. The integration offers a unique ability of structuring the end-to-end distribution relationship. However, it brings large heterogeneous graphs to analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach does not need to examine the source codes nor inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with a high accuracy. © 2016 Copyright held by the owner/author(s). | en
dc.publisher | Association for Computing Machinery (ACM) | en
dc.relation.url | http://dl.acm.org/citation.cfm?doid=2983323.2983700 | en
dc.title | Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph | en
dc.type | Conference Paper | en
dc.contributor.department | King Abdullah University of Science and Technology, Saudi Arabia | en
dc.identifier.journal | Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16 | en
dc.conference.date | 2016-10-24 to 2016-10-28 | en
dc.conference.name | 25th ACM International Conference on Information and Knowledge Management, CIKM 2016 | en
dc.conference.location | Indianapolis, IN, USA | en
dc.contributor.institution | Symantec Research Labs, Saudi Arabia | en
kaust.author | Alabdulmohsin, Ibrahim | en
kaust.author | Zhang, Xiangliang | en