Patch Attack
Attack type: white-box (supported by HEART), black-box (currently supported by ART), evasion, digital or physical. For more information on types of patch attacks, see these more detailed explanations.
Best for: patch attacks are localized and unbounded, making them easy to transfer to the physical world (while remaining applicable in the digital space).
Attack summary: Patch attacks are carried out by adding an object to an image that degrades the results of a visual model ingesting that image, either producing the wrong classification, or failing to detect a relevant object within the image. Adversarial patches can be created with access to only the model’s output, and are not norm-bound or specific to a single image. Patch attacks are highly versatile and can be implemented both digitally and physically.
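As a rough illustration of the digital case described above, the sketch below pastes a patch onto an image with NumPy. It only shows how a patch is applied, not how it is optimized; the function name `apply_patch`, the HWC array layout, and the placeholder patch are assumptions made for this example and are not part of HEART or ART.

```python
import numpy as np


def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at row `top`, column `left`.

    Hypothetical helper for illustration. Both arrays are assumed to be
    height x width x channels (HWC) with the same dtype and pixel range.
    """
    h, w = patch.shape[:2]
    if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
        raise ValueError("patch does not fit inside the image at this location")
    patched = image.copy()  # leave the benign image untouched
    patched[top:top + h, left:left + w] = patch
    return patched


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3), dtype=np.float32)  # stand-in for a real image in 0-1
patch = np.ones((50, 50, 3), dtype=np.float32)       # placeholder, not an optimized patch
patched = apply_patch(image, patch, top=20, left=30)
```

Because the patch overwrites a region instead of adding a small perturbation to every pixel, the change is localized and not constrained by a norm budget, which matches the "localized and unbounded" description above.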
Task: Object detection or image classification
Modality: HEART currently only supports images; ART supports images and video
Data: Single- or three-channel images of standardized dimensions. Specify the pixel range as 0-1 or 0-255 to match the input data
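A mismatch between the declared pixel range and the actual data is an easy mistake to make, so a quick check before running an attack can help. The sketch below is a heuristic only; `infer_clip_values` is a hypothetical helper written for this example. ART estimators generally accept a `clip_values` pair, and the tuple returned here is meant to be inspected before being passed along, not trusted blindly.

```python
import numpy as np


def infer_clip_values(images):
    """Guess whether pixels are scaled to 0-1 or 0-255 (heuristic).

    Hypothetical helper: a batch of very dark 0-255 images whose maximum is
    0 or 1 would be misread as 0-1, so confirm against how the data was loaded.
    """
    lo, hi = float(images.min()), float(images.max())
    if lo < 0.0:
        raise ValueError("negative pixel values: data may already be normalized")
    if hi <= 1.0:
        return (0.0, 1.0)
    if hi <= 255.0:
        return (0.0, 255.0)
    raise ValueError("pixel values exceed 255: unexpected range")


float_batch = np.array([[0.0, 0.5], [0.25, 1.0]], dtype=np.float32)
uint8_batch = np.array([[0, 128], [64, 255]], dtype=np.uint8)
float_range = infer_clip_values(float_batch)
uint8_range = infer_clip_values(uint8_batch)
```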
Model: Computer vision model
To get started with Patch attacks, see the Patch Attack Notebook, available via the IBM HEART-library GitHub repository.
For increased relevance to your use case, replace the selected Hugging Face model with your own model, and the test dataset with one of your own.
A model’s robustness can be assessed by comparing performance before and after an attack. For details on how to evaluate model performance and attack effectiveness, see this explanation of evaluation metrics.
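The before-and-after comparison can be sketched for a classifier as follows. This is a minimal example with made-up labels and predictions; the `accuracy` helper and the toy values are assumptions for illustration, and object detection tasks would need a detection metric instead.

```python
def accuracy(predictions, labels):
    """Percentage of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Toy values for illustration only, not real results.
labels = [0, 1, 2, 1, 0]
benign_preds = [0, 1, 2, 1, 1]       # model output on clean images
adversarial_preds = [0, 2, 2, 0, 1]  # model output on patched images

benign_acc = accuracy(benign_preds, labels)
adversarial_acc = accuracy(adversarial_preds, labels)
accuracy_drop = benign_acc - adversarial_acc  # larger drop = more effective attack
```

Evaluating both accuracies on the same samples keeps the comparison fair; a drop close to zero suggests either a robust model or an attack that did not work as configured.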
Pre-processing mitigation steps (image compression, spatial smoothing, variance minimization)
Defenses like adversarial training (currently supported by ART)
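To make the spatial smoothing idea from the list above concrete, here is a minimal median-filter sketch in NumPy. The function `median_smooth` and its edge-padding choice are assumptions for this example, not the implementation used by HEART or ART, and how much smoothing helps against a large patch is something to measure on your own model.

```python
import numpy as np


def median_smooth(image, window=3):
    """Apply a window x window median filter to each channel of an HWC image.

    Hypothetical helper for illustration; borders are handled by repeating
    the edge pixels.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pad = window // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    # Collect every shifted view of the image that falls inside the window.
    shifted = [
        padded[dy:dy + h, dx:dx + w]
        for dy in range(window)
        for dx in range(window)
    ]
    return np.median(np.stack(shifted, axis=0), axis=0).astype(image.dtype)


noisy = np.zeros((8, 8, 1), dtype=np.float32)
noisy[4, 4, 0] = 1.0  # a single outlier pixel
smoothed = median_smooth(noisy, window=3)
```

The smoothed image would be passed to the model in place of the raw input, so benign accuracy should be re-measured with the filter enabled as well.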
The examples of time and compute requirements below cover a variety of models and datasets to guide users’ expectations. These data can be used for resource planning for model testing and evaluation (T&E).
| Execution Date | Dataset | Model | Attack | Device | Num samples | Peak memory | Duration (seconds) | Benign accuracy | Adversarial accuracy |
|---|---|---|---|---|---|---|---|---|---|
| 12/6/24 | MITLL/LADI-v2-dataset | MITLL/LADI-v2-classifier-small | PGD | CPU | 50 | 1811.5 | 284 | 81.67 | 66.67 |
| 12/7/24 | MITLL/LADI-v2-dataset | MITLL/LADI-v2-classifier-small | PGD | CPU | 100 | 2069 | 527.64 | 78.17 | 72.17 |
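For resource planning, the two runs above can be turned into a per-sample cost and a rough projection. The sketch below assumes that duration scales linearly with the number of samples on the same device, which the two data points suggest but do not prove; `estimate_duration` and the target of 1000 samples are hypothetical choices for this example.

```python
# Sample counts and durations taken from the two table rows above.
runs = [
    {"num_samples": 50, "duration_s": 284.0},
    {"num_samples": 100, "duration_s": 527.64},
]

# Roughly 5.7 s and 5.3 s per sample for the two runs.
seconds_per_sample = [r["duration_s"] / r["num_samples"] for r in runs]


def estimate_duration(num_samples, per_sample_s):
    """Project total run time, assuming cost grows linearly with sample count."""
    return num_samples * per_sample_s


# Use the slower of the two rates for a conservative estimate.
projected_s = estimate_duration(1000, max(seconds_per_sample))
projected_minutes = projected_s / 60.0
```

Under these assumptions, 1000 samples on the same CPU would take on the order of an hour and a half; a different model, dataset, or device would need its own measurement.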
Model and input data are not compatible (see ‘Compatibility considerations’ above)
Patch may be too easily detected
Incorrect size, shape, or placement of the patch relative to the original image
[in physical patch use] Changes in lighting or object orientation can decrease effectiveness
For more information on causes of attack failure, see Carlini’s Indicators of Attack Failure and Tramer’s On Adaptive Attacks to Adversarial Example Defenses.
Similar attacks:
A second patch attack notebook, Adversarial Patch for Object Detection, can be found via the IBM HEART-library GitHub repository.
Other physically realizable attacks include adversarial laser beam.
Further reading:
For more information on which attacks are relevant in which conditions, please see HEART’s Adversarial Evaluation Pathways.