| dc.description.abstract |
Web-based systems are increasingly popular due to their extensive online features, but they also
introduce significant vulnerabilities, particularly in PHP frameworks that are easily exploited
due to their open access. Traditional methods such as static analysis, dynamic testing, and machine
learning often struggle to effectively detect these vulnerabilities in complex codebases. Most
previous studies have typically focused on either graph-based or sequence-based approaches, each
capturing only part of the code’s intricacies, either its structural relations or its temporal patterns, but
not both. To address these limitations, the researchers proposed a novel deep learning approach that
combines the strengths of both. The researchers developed a stacking ensemble model that uses
bidirectional long short-term memory (BiLSTM) and CodeBERT for sequential analysis and a
graph convolutional network (GCN) and GraphCodeBERT to capture structural relations as base models,
with a deep neural network (DNN) as the meta-model for the final prediction. The study used the
pretrained models CodeBERT and GraphCodeBERT to embed the sequential and graph data,
respectively. The model was evaluated on about 20,080 PHP code snippets from the SARD dataset. The first
combination, an ensemble of CodeBERT and GraphCodeBERT, achieved 99.97% accuracy for binary
classification and 99.86% accuracy for multiclass classification. This approach enhances the
overall reliability of the detection process. Based on these results, the study outlines future
directions for improvement: employing additional models and data representations, as well as
more diverse datasets and programming languages. |
en_US |