




Deep learning is widely used in computer vision, natural language processing, data mining and other fields. Although it handles problems in these fields efficiently, it is vulnerable to adversarial attacks, which can make its applications unstable and even introduce security vulnerabilities. Improving the transferability of adversarial examples between different models can help improve network robustness, and transfer-based attacks are also more efficient than other attack methods. Building on a summary of existing adversarial attack algorithms for image classification, this thesis focuses on improving the transferability of adversarial examples in three areas: iterative attacks, intermediate layer attacks and attacks on ViT models, and proposes corresponding improvements. The main contributions are the following three:

(1) To improve the transferability of iterative attack algorithms, this thesis analyses the relationship between overfitting and transferability from a geometric perspective and proposes the angle-based transferable attack algorithm (ANI-FGSM). The algorithm uses, as a regularization term, the angle between the gradient direction of the loss function at the adversarial example and the gradient directions at random samples in its neighbourhood. This smooths the model's loss function and yields adversarial examples with high transferability. The thesis also gives a theoretical proof of the enhanced transferability of ANI-FGSM. In practice, ANI-FGSM outperforms conventional algorithms on normally trained models, adversarially trained models and defence models, achieving a higher black-box attack success rate and better usability.
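
The angle term described in (1) can be illustrated with a minimal sketch. It is not the ANI-FGSM implementation: the function name `mean_gradient_angle`, the uniform sampling radius, the number of samples and the two toy losses are all assumptions made for illustration. The sketch only computes the mean angle between the loss gradient at a point and the gradients at random neighbours, which is the quantity such a regularizer would penalize.

```python
import numpy as np

def mean_gradient_angle(grad_fn, x_adv, radius=0.1, n_samples=8, seed=0):
    """Mean angle (radians) between the gradient at x_adv and the gradients
    at random neighbours. Sampling scheme and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    g = grad_fn(x_adv).ravel()
    angles = []
    for _ in range(n_samples):
        # Draw a random point from the neighbourhood of x_adv.
        neighbour = x_adv + rng.uniform(-radius, radius, size=x_adv.shape)
        g_n = grad_fn(neighbour).ravel()
        denom = max(np.linalg.norm(g) * np.linalg.norm(g_n), 1e-12)
        cos = float(g @ g_n) / denom
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return float(np.mean(angles))

# Toy losses (hypothetical): a linear loss has a constant gradient,
# while sum(sin(x)) has a gradient that turns as x moves.
w = np.array([1.0, -2.0, 0.5])
linear_grad = lambda x: w
curved_grad = lambda x: np.cos(x)

x = np.array([0.2, -0.4, 1.0])
flat_angle = mean_gradient_angle(linear_grad, x)
curved_angle = mean_gradient_angle(curved_grad, x)
print(flat_angle, curved_angle)
```

On the linear loss the angle is essentially zero, while on the curved loss it is clearly positive. A small angle therefore indicates a locally smooth loss surface, which is the property the regularization term is meant to encourage.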

(2) To make full use of the gradient information in the augmentation phase of the two-stage approach, this thesis proposes an intermediate layer attack algorithm based on augmentation-phase gradients (UILA). The algorithm uses snapshot points to store the latest guidance direction and adds an outer loop that updates the snapshot points, overcoming the high input sensitivity of existing algorithms. In addition, the thesis introduces a procedure guided by empirical observation that selects the best layer index using the source model alone, which avoids re-evaluating the target model during hyperparameter optimization and reduces experimental cost. Finally, experiments on datasets of different sizes demonstrate the effectiveness of UILA in white-box attacks and verify the transferability of the adversarial examples it generates from different source models.

(3) To improve the transferability of adversarial examples on ViT models, this thesis proposes the chunking and sparse alternating attack algorithm (PSAA). It first introduces the basic structure of the ViT model and compares how ViT and CNN models extract features. Because the self-attention mechanism extracts global and local features simultaneously, PSAA alternates chunking attacks and sparse attacks to generate adversarial examples, which effectively interferes with the ViT model's extraction of both image features and the interaction information between patches. The transferability of PSAA is further enhanced by introducing the PNA mechanism into the sparse attacks. Experimental results show that PSAA effectively attacks both ViT and CNN models in the white-box setting and also performs well in the black-box setting.
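
The alternation described in (3) can be sketched as follows. This is an illustrative reading, not the PSAA algorithm itself: the abstract does not specify how chunks or sparse pixels are selected, so the random patch selection, the top-k gradient rule, the even/odd schedule, the toy linear loss and all names (`patch_mask`, `sparse_mask`, `alternating_attack`) are assumptions. The PNA mechanism is not modelled.

```python
import numpy as np

def patch_mask(shape, patch, keep_ratio, rng):
    """Select whole patches at random (assumed chunking rule)."""
    h, w = shape
    grid = rng.random((h // patch, w // patch)) < keep_ratio
    # Expand each grid cell to a patch x patch block of pixels.
    return np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)

def sparse_mask(grad, k):
    """Select the k pixels with the largest gradient magnitude (assumed rule)."""
    flat = np.abs(grad).ravel()
    idx = np.argpartition(flat, -k)[-k:]
    mask = np.zeros(flat.shape, dtype=bool)
    mask[idx] = True
    return mask.reshape(grad.shape)

def alternating_attack(x, grad_fn, eps=0.1, step=0.02, iters=6,
                       patch=4, keep_ratio=0.5, k=8, seed=0):
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    for t in range(iters):
        g = grad_fn(x_adv)
        # Even iterations perturb whole patches, odd iterations a few pixels.
        if t % 2 == 0:
            mask = patch_mask(x.shape, patch, keep_ratio, rng)
        else:
            mask = sparse_mask(g, k)
        x_adv = x_adv + step * np.sign(g) * mask
        # Keep the perturbation inside the L-infinity ball of radius eps.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# Toy single-channel 8x8 "image" with a hypothetical linear loss sum(W * x).
W = np.arange(64, dtype=float).reshape(8, 8) - 31.5
x = np.zeros((8, 8))
eps = 0.1
x_adv = alternating_attack(x, lambda z: W, eps=eps)
loss_before, loss_after = float(np.sum(W * x)), float(np.sum(W * x_adv))
```

The patch steps disturb groups of neighbouring pixels together, while the sparse steps concentrate the budget on a few high-gradient pixels. Alternating the two is one simple way to target both patch-level and pixel-level features within the same perturbation budget.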

In summary, this thesis proposes three effective algorithms for generating highly transferable adversarial examples. They perform well on different models and can be widely applied in practical scenarios.