# Attack Types

AI is vulnerable to evasion, poisoning, inference, and extraction attacks. **HEART currently focuses on the evaluation of computer vision models, including image classification and object detection, against evasion attacks**, but will soon be extended to include the other three types of threat.

## Evasion

Evasion attacks manipulate the data fed to a trained model at runtime, crafting perturbations that cause the model to perform poorly (sometimes in a specific, targeted way). [^1]

HEART supports the following types of evasion attacks:

- Hop skip jump
- Laser
- Query efficient Black Box
- [Projected Gradient Descent](/reference_materials/attack_cards/projected_gradient_descent)
- [Patch](/reference_materials/attack_cards/patch_attack)

Evasion attacks may have different goals. For object detection models, evasion attacks may aim to cause the model to hallucinate objects (_appearing attack_), ignore existing objects (_hiding attack_ or _invisibility_) [^2], or _mislocate_ or _misclassify_ detected objects. Variations of the attack goal for object detection include overflow attacks, which lead the model to detect an extremely large number of objects. This can result in memory overflows and/or increased latency [^3], rendering the detector unusable (_denial of service_).

Evasion attacks can be either _untargeted_ (the output is any valid output other than the correct output) or _targeted_ (the output is incorrect and has been defined as a specific target by the attacker). In the case of image classification, if the correct classification is ‘airplane’, any classification other than ‘airplane’ caused by the crafted adversarial perturbation is a successful _untargeted_ attack. In most cases, untargeted attacks shift the adversarial point into a nearby neighboring class. On the other hand, _targeted_ attacks aim to cause the classifier to output a specific target class.
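
As a rough illustration of the untargeted/targeted distinction, the sketch below runs a sign-gradient projected gradient descent loop with an L-infinity budget against a toy linear softmax classifier in plain Python. It does not use HEART's API: the weight matrix `W`, the helper names (`predict`, `loss_gradient`, `pgd`), and the values of `eps`, `alpha`, and `steps` are all assumptions chosen for illustration.

```python
import math

# Hypothetical 3-class linear classifier on 2-D inputs (illustrative weights).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

def logits(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def predict(x):
    z = logits(x)
    return z.index(max(z))

def loss_gradient(x, label):
    """Gradient of the cross-entropy loss for `label` with respect to x."""
    z = logits(x)
    m = max(z)
    e = [math.exp(v - m) for v in z]      # numerically stable softmax
    s = sum(e)
    p = [v / s for v in e]
    p[label] -= 1.0                        # softmax(z) - onehot(label)
    return [sum(W[k][j] * p[k] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def pgd(x0, label, eps, target=None, alpha=0.1, steps=40):
    """Untargeted if target is None: ascend the loss of the true label.
    Targeted otherwise: descend the loss of the target label."""
    x = list(x0)
    for _ in range(steps):
        if target is None:
            g, direction = loss_gradient(x, label), 1
        else:
            g, direction = loss_gradient(x, target), -1
        x = [xi + direction * alpha * sign(gi) for xi, gi in zip(x, g)]
        # Project back into the L-infinity ball of radius eps around x0.
        x = [min(max(xi, c - eps), c + eps) for xi, c in zip(x, x0)]
    return x

x0 = [1.0, 0.2]          # clean input, classified as class 0
true_label = 0

x_untargeted = pgd(x0, true_label, eps=0.5)
x_targeted_small = pgd(x0, true_label, eps=0.5, target=2)
x_targeted_large = pgd(x0, true_label, eps=1.0, target=2)

print(predict(x0), predict(x_untargeted),
      predict(x_targeted_small), predict(x_targeted_large))
```

In this toy setup, the untargeted attack changes the prediction within a budget of `eps=0.5`, while the targeted attack toward class 2 does not reach its target at that budget and only succeeds once the budget is raised to `eps=1.0`.
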
Adversarial perturbations for targeted attacks are often harder to craft than for untargeted attacks, as they need to find a perturbation that shifts the point into a specific class, not just any neighboring class.

## Poisoning

In a poisoning attack, the attacker introduces specifically crafted, _poisonous_ samples into the training data. The goal may be to insert a _backdoor_ into the trained model that influences its behavior at runtime, to make the model more difficult to train on that data, or to change its behavior in unexpected ways, such as confusing two image categories. [^4]

## Inference

Inference attacks, also known as privacy attacks, do not directly affect model performance. Instead, these attacks infer private or sensitive information about the model's training dataset by interacting with the model (without any access to the training data itself). Attackers probe the model with specifically selected input samples and analyze the model output to derive the targeted insights.

## Extraction

In an extraction attack, the attacker attempts to recover information about the model, under special circumstances including its architecture and parameters, in order to replicate its functionality and steal its intellectual property.

Attacks can be staged in concert to increase their combined effectiveness. For example, extraction attacks can provide key information to enable stronger [white-box](white_vs_black_box) evasion attacks, or to extract a model that could later leak private information in a strong white-box inference attack.

## References

[^1]: Muthalagu, Raja, Jasmita Malik, and Pranav M. Pawar. "Detection and prevention of evasion attacks on machine learning models." Expert Systems with Applications 266 (25 March 2025): 126044.

[^2]: Hu, Shengnan, et al. "CCA: Exploring the possibility of contextual camouflage attack on object detection." 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021.

[^3]: Shapira, Avishag, et al. "Phantom sponges: Exploiting non-maximum suppression to attack deep object detectors." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2023.

[^4]: Koffas, Stefanos, Jing Xu, Mauro Conti, and Stjepan Picek. "Can You Hear It? Backdoor Attacks via Ultrasonic Triggers." arXiv, 2021. doi:10.48550/arXiv.2107.14569.