by Iman M. Almomani, Aala Alkhayer, and W. El-shafai
Recently, cybersecurity experts and researchers have given special attention to developing cost-effective deep learning (DL)-based algorithms for Android malware detection (AMD) systems. However, conventional AMD solutions require extensive computations to achieve high accuracy in detecting Android malware apps. Consequently, there is significant benefit in utilizing convolutional neural networks (CNNs) in vision-based AMD applications to learn quickly and efficiently without prior reverse-engineering stages. Thus, this paper introduces an efficient and automated vision-based AMD model composed of 16 well-developed and fine-tuned CNN algorithms. This model removes the need for a predefined feature-extraction process while generating accurate predictions from malware images with minimal cost and high detection speed. Such performance is achieved with either color or grayscale malware images, using balanced or imbalanced datasets. First, the bytecodes of the “classes.dex” files extracted from benign and malicious Android apps were converted into color and grayscale images before being forwarded to the developed CNN algorithms for classification. Then, the detection efficiency of the proposed AMD model was examined and evaluated using the imbalanced benchmark Leopard Android dataset, which comprises 14,733 malware samples and 2,486 benign samples. Finally, different experimental scenarios were conducted using balanced and imbalanced Android samples of color and grayscale images generated from the Leopard dataset to extensively validate the detection and classification performance of the suggested model. Comprehensive classification assessment metrics were applied in the evaluation experiments to demonstrate the high capability of the developed fine-tuned CNN algorithms in recognizing Android malware attacks with low computational overhead.
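The bytecode-to-image conversion described above can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: the row width of 256 pixels and zero-padding of the final row are common conventions in vision-based malware work and are assumptions here, as the abstract does not specify them.

```python
import math
import numpy as np

def dex_to_grayscale(dex_bytes: bytes, width: int = 256) -> np.ndarray:
    """Map raw classes.dex bytecode to a 2-D grayscale image array.

    Each byte (0-255) becomes one pixel intensity. The byte stream is
    zero-padded so it fills the final row exactly. The width of 256 is
    an illustrative choice, not a parameter stated in the paper.
    """
    data = np.frombuffer(dex_bytes, dtype=np.uint8)
    height = math.ceil(len(data) / width)
    padded = np.zeros(width * height, dtype=np.uint8)
    padded[:len(data)] = data
    return padded.reshape(height, width)

# Example: a synthetic byte stream stands in for a real classes.dex file.
sample = bytes(range(256)) * 4            # 1024 bytes of dummy "bytecode"
img = dex_to_grayscale(sample, width=256)
print(img.shape)                          # → (4, 256)
```

A color variant would typically reshape the same stream into three channels (height, width, 3) instead; either array can then be resized to the fixed input resolution expected by a CNN.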
As a result, the detection accuracy reached 99.40% for balanced samples and 98.05% for imbalanced samples. Furthermore, the proposed AMD model outperforms existing approaches that utilize conventional vision-based algorithms and were tested on the same benchmark Android dataset.