This is an updating survey of deep learning-based Android malware defenses, a constantly updated version of the manuscript "Deep Learning for Android Malware Defenses: a Systematic Literature Review" by Yue Liu, Li Li, Chakkrit Tantithamthavorn and Yepang Liu.
To the best of our knowledge, no systematic literature review focusing on deep learning approaches for Android malware defenses exists. In this paper, we conducted a systematic literature review to search and analyze how deep learning approaches have been applied in the context of malware defenses in the Android environment. As a result, a total of 132 studies covering the period 2014-2021 were identified. Our investigation reveals that, while the majority of these sources mainly consider DL-based Android malware detection, 53 primary studies (40.1 percent) design defense approaches based on other scenarios. This review also discusses research trends, research focuses, challenges, and future research directions in DL-based Android malware defenses.
Please kindly cite this paper if it helps your research:
@article{liu2021deep,
title={Deep Learning for Android Malware Defenses: a Systematic Literature Review},
author={Liu, Yue and Tantithamthavorn, Chakkrit and Li, Li and Liu, Yepang},
journal={arXiv preprint arXiv:2103.05292},
year={2021}
}
If you have any questions, please contact: [email protected]
You are welcome to update our review list!!
- fork this repository, add the paper, and merge it back;
- or email us.
- Systematic review process
- Paper structure
- Malware data collection
- Public malware defense tools
- Supplementary materials
- Recent Publications (Updating)
We collected primary studies related to DL-based Android malware defenses from a variety of sources (IEEE, ACM Digital Library, Springer, Science Direct, Wiley Online Library, Google Scholar and Web of Knowledge). Only studies related to deep learning-based Android malware defenses were considered for further review; in addition, we proposed quality appraisal criteria to obtain high-quality studies. The complete list of exclusion criteria and quality appraisal criteria is available on this page. After that, we obtained 132 relevant primary studies.
We uploaded our complete paper list, with detailed review information, to Google Drive.
Our paper is structured as follows:
- Malware Defenses Objectives
    - binary malware classification
    - malware family attribution
    - repackaged/fake app detection
    - adversarial learning attacks and protections
    - malware evolution detection and defense
    - malicious behavior analysis
- APK Characterization
    - Program analysis approaches (static analysis, dynamic analysis, hybrid analysis)
    - Feature categories (permission, API calls, filtered intents, app component, URL, string, hardware component, app metadata, system call, dynamic activities, program graph, opcode, bytecode, Java code)
    - Feature encoding approaches (categorical, text-based, graph-based, image-based, hybrid)
- Deep Learning Techniques
    - Learning paradigms (supervised, supervised & unsupervised, unsupervised, reinforcement learning)
    - Deep learning models (Multilayer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, Deep Belief Networks, Autoencoders, Generative Adversarial Networks, Graph Neural Networks, Attention-based neural networks, Deep Reinforcement Learning, Transformers, Hybrid models)
    - Model explanation
- Deployment
    - Off-device, Distributed, On-device
- Performance evaluation
    - Dataset
    - Evaluation approaches
    - Evaluation metrics
- Availability
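To make the "Feature encoding approaches" item in the outline more concrete, here is a minimal sketch of two of the listed encodings: a categorical (binary vector) encoding of requested permissions and an image-based encoding of raw bytes. It is not taken from any of the reviewed studies. The vocabulary `PERMISSION_VOCAB`, the function names, and the default image width are all assumptions made for illustration.

```python
from typing import Iterable, List, Sequence

# Hypothetical, tiny vocabulary chosen for illustration only; a real study
# would typically derive the vocabulary from its training corpus.
PERMISSION_VOCAB: Sequence[str] = (
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
)


def encode_permissions(requested: Iterable[str],
                       vocab: Sequence[str] = PERMISSION_VOCAB) -> List[int]:
    """Categorical encoding: one binary slot per permission in the vocabulary."""
    requested_set = set(requested)
    return [1 if perm in requested_set else 0 for perm in vocab]


def bytes_to_image(data: bytes, width: int = 4) -> List[List[int]]:
    """Image-based encoding: each byte becomes one grayscale pixel (0-255).

    The last row is zero-padded so that every row has exactly `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = list(data)
    pixels.extend([0] * ((-len(pixels)) % width))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


if __name__ == "__main__":
    print(encode_permissions(["android.permission.INTERNET",
                              "android.permission.SEND_SMS"]))
    print(bytes_to_image(b"\x01\x02\x03\x04\x05", width=4))
```

The binary vector could feed a Multilayer Perceptron and the pixel grid a Convolutional Neural Network, but which pairing a given study uses is documented in the survey itself, not here.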
If you are interested in a summary of each subtopic across these 132 primary studies, you can read our survey for more information. If you want detailed information on each primary study, you can read our review table.
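As a small illustration of the "Evaluation metrics" item in the outline, the sketch below computes metrics that are commonly reported for binary classifiers (accuracy, precision, recall, F1), treating label 1 as malware and 0 as benign. Which metrics each primary study actually reports is covered in the review table; the function name and label convention here are assumptions.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 (1 = malware, 0 = benign)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware flagged as malware
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as malware
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed as benign

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))
```

On imbalanced datasets (many more benign apps than malware), accuracy alone can look high even when recall on malware is poor, which is one reason to report the other three alongside it.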
Data source | Updated | Paper | Details |
---|---|---|---|
Drebin | - | NDSS-2014 | 123,453 benign samples and 5,560 malware samples (179 malware families); samples from 2010-2012 |
Genome | - | S&P-2012 | 863 benign and 1260 malware; 2010-2011 samples |
Contagio | - | - | Consists of 11,960 mobile malware samples and 16,800 benign samples, collected until 2018 |
AMD | - | DIMVA-2017 | 24553 malware (2010-2016) |
AndroZoo | Yes | MSR-2016 | AndroZoo is a growing collection of Android Applications collected from several sources, including the official Google Play app market. It currently contains 17,951,878 different APKs. |
VirusTotal | Yes | - | VirusTotal aggregates many antivirus products and online scan engines. It also provides datasets for researchers |
VirusShare | Yes | - | VirusShare is a repository of malware samples for security researchers. It currently contains 44,390,572 malware samples. |
CICMalDroid | - | - | It has more than 17,341 Android samples, collected until 2018. |
RmvDroid | - | MSR-2019 | 9,133 malware samples, which belong to 56 malware families |
Google Play | Yes | - | Google Play is the official Android market. PlayDrone: a Google Play crawler |
Third-party markets | Yes | - | HUAWEI, APKpure, MI store, Tencent, 360, Wandoujia, Aptoide, Anzhi, APKmirror, Amazon Appstore, 9APPS |
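For the VirusTotal entry in the table above, a file report is usually looked up by hash. The sketch below only prepares such a request with the standard library and does not send it. It assumes the public VirusTotal v3 REST endpoint `https://www.virustotal.com/api/v3/files/{hash}` with the API key passed in the `x-apikey` header; quotas and the terms of research access are not covered here and should be checked in the official documentation. The function names are hypothetical.

```python
import hashlib
import urllib.request

# Assumed base URL of the public VirusTotal v3 file-report endpoint.
VT_FILES_ENDPOINT = "https://www.virustotal.com/api/v3/files/"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_report_request(file_hash: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for a file report."""
    return urllib.request.Request(
        VT_FILES_ENDPOINT + file_hash,
        headers={"x-apikey": api_key},
        method="GET",
    )


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder, not a working credential.
    request = build_report_request(sha256_hex(b"example apk bytes"), "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
```

Sending the prepared request with `urllib.request.urlopen(request)` would return a JSON body on success; that step is left out so the example runs without network access or a valid key.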
- VirusTotal: Analyze suspicious files and URLs to detect types of malware, automatically share them with the security community. [Project link] [Request for research API]
- Deep Android Malware Detection
- A Multimodal Deep Learning Method for Android Malware Detection Using Various Features
- Detecting Android malware using Long Short-term Memory (LSTM)
- TESSERACT: Eliminating experimental bias in malware classification across space and time
- DeepIntent: Deep Icon-Behavior Learning for Detecting Intention-Behavior Discrepancy in Mobile Apps
- An Android mutation malware detection based on deep learning using visualization of importance from codes
- Familial Clustering for Weakly-Labeled Android Malware Using Hybrid Representation Learning
- Android Malware Detection Based on System Calls Analysis and CNN Classification
- Adversarial Deep Ensemble: Evasion Attacks and Defenses for Malware Detection
- Evaluating explanation methods for deep learning in security
- Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware
- A Multi-modal Neural Embeddings Approach for Detecting Mobile Counterfeit Apps: A Case Study on Google Play Store
- DENAS: Automated Rule Generation by Knowledge Extraction from Neural Networks
- Experiences of Landing Machine Learning onto Market-Scale Mobile Malware Detection
- Hybrid Analysis of Android Apps for Security Vetting using Deep Learning
- Understanding Privacy Awareness in Android App Descriptions Using Deep Learning
- Combining multi-features with a neural joint model for Android malware detection
- Experimental comparison of features and classifiers for Android malware detection
- A Framework for Enhancing Deep Neural Networks Against Adversarial Malware
- Towards an interpretable deep learning model for mobile malware detection and family identification
- Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers
- CADE: Detecting and Explaining Concept Drift Samples for Security Applications
- DexRay: A Simple, yet Effective Deep Learning Approach to Android Malware Detection Based on Image Representation of Bytecode
- Can We Leverage Predictive Uncertainty to Detect Dataset Shift and Adversarial Examples in Android Malware Detection?
- Why an Android App is Classified as Malware? Towards Malware Classification Interpretation
- Heterogeneous Temporal Graph Transformer: An Intelligent System for Evolving Android Malware Detection
- Robust Android Malware Detection against Adversarial Example Attacks
- PetaDroid: Adaptive Android Malware Detection Using Deep Learning
- Structural Attack against Graph Based Android Malware Detection
- Adversarial Deep Learning for Robust Detection of Binary Encoded Malware, in IEEE Security and Privacy Workshops (SPW), 2018, Adversarial deep learning, [code]
- DroidCC: Android malware detection using deep learning; contains Android malware samples, papers, tools, etc.
- MADLIRA: Malware detection using learning and information retrieval for Android
- android-malware-detection: Android Malware Detection Using Machine Learning Classifiers (using permissions requested by apps)
- MLDroid/drebin: Drebin - NDSS 2014 Re-implementation
- MaMadroid: Implementation of MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models in NDSS 2017
- Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection, in Empirical Software Engineering (EMSE), 2021, Reproduction of Drebin, MaMadroid, Malscan, Droidcat, Revealdroid [code]
- Lightweight, Obfuscation-Resilient Detection and Family Identification of Android Malware, in TOSEM 2018, [code]
- Apktool: A tool for reverse engineering Android apk files [link]
- Androguard: Reverse engineering, Malware and goodware static analysis of Android applications ... and more [link]
- FlowDroid: FlowDroid statically computes data flows in Android apps and Java programs. [link]
- Monkey: An open source security tool for testing a data center's resiliency to perimeter breaches and internal server infection. The Monkey uses various methods to self propagate across a data center and reports success to a centralized Monkey Island server. [link]
- DroidBox: Dynamic analysis of Android apps [link]
- DroidBot: A lightweight test input generator for Android. Similar to Monkey, but with more intelligence and cool features. [link]
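The static-analysis tools listed above all start from the APK file itself. Since an APK is a ZIP archive, a stdlib-only sketch can locate its top-level DEX bytecode files (`classes.dex`, `classes2.dex`, ...) before any heavier tooling is involved. This is not a substitute for Apktool or Androguard and does not decode the manifest or bytecode. The function names and the demo archive are assumptions for illustration.

```python
import io
import re
import zipfile
from typing import BinaryIO, Dict, Union

# Top-level DEX entries are named classes.dex, classes2.dex, classes3.dex, ...
DEX_NAME = re.compile(r"^classes\d*\.dex$")
# Every DEX file starts with the bytes "dex\n" followed by a version string.
DEX_MAGIC_PREFIX = b"dex\n"


def read_dex_files(apk: Union[str, BinaryIO]) -> Dict[str, bytes]:
    """Return {entry name: raw bytes} for top-level DEX files in an APK."""
    dex_files: Dict[str, bytes] = {}
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            if not DEX_NAME.match(name):
                continue
            data = archive.read(name)
            # Skip entries that have a DEX-like name but not the DEX magic.
            if data.startswith(DEX_MAGIC_PREFIX):
                dex_files[name] = data
    return dex_files


def make_demo_apk() -> io.BytesIO:
    """Build a tiny in-memory ZIP that mimics an APK layout (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"<manifest/>")
        archive.writestr("classes.dex", b"dex\n035\x00payload")
        archive.writestr("classes2.dex", b"not a dex file")
        archive.writestr("lib/classes.dex", b"dex\n035\x00nested")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, data in read_dex_files(make_demo_apk()).items():
        print(name, len(data), "bytes")
```

The raw bytes returned here are the kind of input that bytecode-based or image-based feature encodings start from; opcode, API-call, or graph features require a real disassembler or analysis framework.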
Research Papers
- Deep learning - LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. Nature, 2015, [pdf]
- Deep learning - Goodfellow, Ian, et al. MIT press, 2016, [pdf1][pdf2]
- Deep learning in neural networks: An overview - Schmidhuber, Jürgen. Neural networks, 2015, [pdf]
Online Tutorials and Repositories
- Awesome - Most Cited Deep Learning Papers - [Project link]
- Deep Learning Papers Reading Roadmap - [Project link]
- Top Deep Learning Projects - [Project link]
- Tracking Progress in Natural Language Processing - [Project link]
- Deep Learning Tutorial - by Haozan Liang, Chinese version only, continuously maintained and updated, [Project link]
Tools: TensorFlow, Keras, scikit-learn, PyTorch
Research Papers
- Android security: a survey of issues, malware penetration, and defenses - Faruki P, Bharmal A, Laxmi V, et al. IEEE communications surveys & tutorials, 2014, [pdf]
- A taxonomy and qualitative comparison of program analysis techniques for security assessment of android software - Sadeghi A, Bagheri H, Garcia J, et al. IEEE Transactions on Software Engineering, 2016, [pdf]
- The Evolution of Android Malware and Android Analysis Techniques - Tam K, Feizollah A, Anuar N B, et al. ACM Computing Surveys (CSUR), 2017, [pdf]
- Static analysis of android apps: A systematic literature review - Li L, Bissyandé T F, Papadakis M, et al. Information and Software Technology, 2017, [pdf] [Project link]
- A Survey on Malware Detection Using Data Mining Techniques - Ye Y, Li T, Adjeroh D, et al. ACM Computing Surveys (CSUR), 2017, [pdf]
- A survey on various threats and current state of security in android platform - Bhat P, Dutta K. ACM Computing Surveys (CSUR), 2019, [pdf]
- A survey of Android malware detection with deep neural models - Qiu J, Zhang J, Luo W, et al. ACM Computing Surveys (CSUR), 2020, [pdf]
Recent relevant studies (Last update: 2022-01, we welcome our fellow researchers to update recent works)
- MEGDroid: A model-driven event generation framework for dynamic android malware analysis; Information and Software Technology, 2021
- GDroid: Android Malware Detection and Classification with Graph Convolutional Network; Computers & Security, 2021
- Op2Vec: An Opcode Embedding Technique and Dataset Design for End-to-End Detection of Android Malware; arXiv preprint arXiv:2104.04798, 2021
- Towards an interpretable deep learning model for mobile malware detection and family identification; Computers & Security, 2021
- NATICUSdroid: A malware detection framework for Android using native and custom permissions; Journal of Information Security and Applications, 2021
- Mimosa: Reducing malware analysis overhead with coverings; arXiv preprint arXiv:2101.07328, 2021.
- IoTMalware: Android IoT Malware Detection based on Deep Neural Network and Blockchain Technology; arXiv preprint, 2021.
- Formal Equivalence Checking for Mobile Malware Detection and Family Classification; IEEE Transactions on Software Engineering, 2021
- A privacy and security analysis of early-deployed COVID-19 contact tracing Android apps; Empirical Software Engineering, 2021, 26(3): 1-51.
- Understanding worldwide private information collection on android; arXiv preprint, 2021.
- Systematic Mutation-Based Evaluation of the Soundness of Security-Focused Android Static Analysis Techniques; ACM Transactions on Privacy and Security (TOPS), 2021
- Malware Detection employed by Visualization and Deep Neural Network; Computers & Security, 2021
- Malware Detection and Analysis: Challenges and Research Opportunities; arXiv preprint, 2021.
- Towards interpreting ML-based automated malware detection models: a survey; arXiv preprint, 2021.
- A Novel Few-Shot Malware Classification Approach for Unknown Family Recognition with Multi-Prototype Modeling; Computers & Security, 2021
- Obfuscation-Resilient Executable Payload Extraction From Packed Malware; USENIX Security, 2021
- Marked for Disruption: Tracing the Evolution of Malware Delivery Operations Targeted for Takedown; arXiv preprint, 2021.
- Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection; Empirical Software Engineering (EMSE), 2021