Characterizing Android apps’ behavior for effective detection of malapps at large scale

Handle URI:
http://hdl.handle.net/10754/623405
Title:
Characterizing Android apps’ behavior for effective detection of malapps at large scale
Authors:
Wang, Xing; Wang, Wei; He, Yongzhong; Liu, Jiqiang; Han, Zhen; Zhang, Xiangliang (ORCID: 0000-0002-3574-5665)
Abstract:
Android malicious applications (malapps) have surged and become more sophisticated, posing a great threat to users. How to characterize, understand and detect Android malapps at a large scale is thus a big challenge. In this work, we are motivated to discover the discriminatory and persistent features extracted from Android APK files for automated malapp detection at a large scale. To achieve this goal, we first extract a very large number of features from each app and categorize the features into two groups, namely app-specific features and platform-defined features. These feature sets are then fed into four classifiers (i.e., Logistic Regression, linear SVM, Decision Tree and Random Forest) for the detection of malapps. Secondly, we evaluate the persistence of app-specific and platform-defined features on classification performance with two data sets collected in different time periods. Thirdly, we comprehensively analyze the relevant features selected by the Logistic Regression classifier to identify the contributions of each feature set. We conduct extensive experiments on large real-world app sets consisting of 213,256 benign apps collected from six app markets, 4,363 benign apps from the Google Play market, and 18,363 malapps. The experimental results and our analysis give insights into which discriminatory features are most effective for characterizing malapps and for building an effective and efficient malapp detection system. With the selected discriminatory features, the Logistic Regression classifier yields the best true positive rate of 96% with a false positive rate of 0.06%.
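The abstract reports detection quality as a true positive rate (share of malapps correctly flagged) and a false positive rate (share of benign apps wrongly flagged). The following is a minimal sketch of how these two rates can be computed from labels and predictions. The function name `detection_rates` and the toy labels are hypothetical illustrations, not taken from the paper, and the numbers are not the paper's results.

```python
def detection_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    Assumed label convention (not from the paper): 1 = malapp, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malapps flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malapps missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign apps flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign apps passed

    # Guard against empty classes to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical toy labels: 4 malapps followed by 6 benign apps.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tpr, fpr = detection_rates(y_true, y_pred)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")  # prints "TPR = 0.75, FPR = 0.17"
```

When benign apps greatly outnumber malapps, as in the data sets described above, even a small false positive rate can correspond to many flagged benign apps, which is why the two rates are reported together.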
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Citation:
Wang X, Wang W, He Y, Liu J, Han Z, et al. (2017) Characterizing Android apps’ behavior for effective detection of malapps at large scale. Future Generation Computer Systems. Available: http://dx.doi.org/10.1016/j.future.2017.04.041.
Publisher:
Elsevier BV
Journal:
Future Generation Computer Systems
Issue Date:
6-May-2017
DOI:
10.1016/j.future.2017.04.041
Type:
Article
ISSN:
0167-739X
Sponsors:
The work reported in this paper was supported in part by the Scientific Research Foundation through the Returned Overseas Chinese Scholars, Ministry of Education of China, under Grant K14C300020, in part by Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, under Grant AGK2015002, in part by ZTE Corporation Foundation, and in part by National Natural Science Foundation of China, under Grant 61672092.
Additional Links:
http://www.sciencedirect.com/science/article/pii/S0167739X17307720
Appears in Collections:
Articles; Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Wang, Xing | en
dc.contributor.author | Wang, Wei | en
dc.contributor.author | He, Yongzhong | en
dc.contributor.author | Liu, Jiqiang | en
dc.contributor.author | Han, Zhen | en
dc.contributor.author | Zhang, Xiangliang | en
dc.date.accessioned | 2017-05-09T08:34:33Z | -
dc.date.available | 2017-05-09T08:34:33Z | -
dc.date.issued | 2017-05-06 | en
dc.identifier.citation | Wang X, Wang W, He Y, Liu J, Han Z, et al. (2017) Characterizing Android apps’ behavior for effective detection of malapps at large scale. Future Generation Computer Systems. Available: http://dx.doi.org/10.1016/j.future.2017.04.041. | en
dc.identifier.issn | 0167-739X | en
dc.identifier.doi | 10.1016/j.future.2017.04.041 | en
dc.identifier.uri | http://hdl.handle.net/10754/623405 | -
dc.description.abstract | Android malicious applications (malapps) have surged and become more sophisticated, posing a great threat to users. How to characterize, understand and detect Android malapps at a large scale is thus a big challenge. In this work, we are motivated to discover the discriminatory and persistent features extracted from Android APK files for automated malapp detection at a large scale. To achieve this goal, we first extract a very large number of features from each app and categorize the features into two groups, namely app-specific features and platform-defined features. These feature sets are then fed into four classifiers (i.e., Logistic Regression, linear SVM, Decision Tree and Random Forest) for the detection of malapps. Secondly, we evaluate the persistence of app-specific and platform-defined features on classification performance with two data sets collected in different time periods. Thirdly, we comprehensively analyze the relevant features selected by the Logistic Regression classifier to identify the contributions of each feature set. We conduct extensive experiments on large real-world app sets consisting of 213,256 benign apps collected from six app markets, 4,363 benign apps from the Google Play market, and 18,363 malapps. The experimental results and our analysis give insights into which discriminatory features are most effective for characterizing malapps and for building an effective and efficient malapp detection system. With the selected discriminatory features, the Logistic Regression classifier yields the best true positive rate of 96% with a false positive rate of 0.06%. | en
dc.description.sponsorship | The work reported in this paper was supported in part by the Scientific Research Foundation through the Returned Overseas Chinese Scholars, Ministry of Education of China, under Grant K14C300020, in part by Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, under Grant AGK2015002, in part by ZTE Corporation Foundation, and in part by National Natural Science Foundation of China, under Grant 61672092. | en
dc.publisher | Elsevier BV | en
dc.relation.url | http://www.sciencedirect.com/science/article/pii/S0167739X17307720 | en
dc.rights | NOTICE: this is the author’s version of a work that was accepted for publication in Future Generation Computer Systems. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Future Generation Computer Systems (2017-05-06), DOI: 10.1016/j.future.2017.04.041. © 2017. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ | en
dc.subject | Android | en
dc.subject | Malicious apps detection | en
dc.subject | Feature comparison | en
dc.title | Characterizing Android apps’ behavior for effective detection of malapps at large scale | en
dc.type | Article | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.identifier.journal | Future Generation Computer Systems | en
dc.eprint.version | Post-print | en
dc.contributor.institution | Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University, 3 Shangyuancun, Beijing 100044, China | en
kaust.author | Zhang, Xiangliang | en