The era of rapid AI development, often called the AI Spring, is pressuring companies to drastically shorten their time to market. To stay competitive, they rush to release AI-based products, often sacrificing thorough development and testing. This rush results in underdeveloped AI systems that lack robustness and reliability. New machine learning (ML) algorithms, which are central to these products, may enter production without adequate large-scale review, increasing the risk of ineffective or even hazardous implementations. A shortage of software development resources compounds these risks and challenges: today's demand for AI expertise far outweighs the supply of skilled professionals, and the high cost of computational resources such as GPUs limits general access to AI models. Data quality adds a further complication, as training data often fails to meet the required standards of quality and relevance, resulting in weak AI outputs.
A study from UC Berkeley in Adversarial Machine Learning (AML) showed how AI systems can be fooled by simple environmental changes. Researchers demonstrated that self-driving cars could be tricked into incorrect actions merely by placing stickers on the road. This experiment underlines the sensitivity of AI systems to minor manipulations and shows the importance of designing AI with greater resilience against such interference. For further details, Berkeley's CLTC website on AML offers more information.
Adversarial Machine Learning (AML) is a concept that sits at the intersection of cybersecurity and AI. It addresses the issue of adversarial attacks, which exploit the weaknesses in AI solutions. AML focuses on understanding and reducing the vulnerabilities inherent in AI systems, providing strategies and methods to defend against such attacks. This field is becoming increasingly important as AI systems become more mainstream and are being integrated into various aspects of technology and daily life.
Understanding and defending against adversarial attacks is essential for ensuring the reliability and safety of AI applications.
Because ML models are data-driven, adversarial ML attacks introduce unique security challenges during model development, deployment and inference.
AML attacks can be categorized into two main types: white-box attacks and black-box attacks. In a white-box attack, the attacker has extensive knowledge about the ML model, including its underlying architecture, training data, and the optimization algorithm employed during training. This deep understanding enables the attacker to execute highly targeted exploits.
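To make the white-box setting concrete, here is a minimal sketch of a gradient-based evasion (in the spirit of the Fast Gradient Sign Method) against a toy logistic regression model. The weights, input values, and epsilon below are illustrative assumptions, not taken from any real system.

```python
# Hypothetical white-box evasion sketch: the attacker knows the model's
# weights, so a single gradient computation is enough to craft a
# perturbation (FGSM-style). All values here are made up for illustration.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Known model parameters (white-box assumption: attacker has full access).
w = np.array([1.5, -2.0, 0.5])
b = 0.1

x = np.array([0.6, 0.1, 0.2])   # original input, scored as class 1
y = 1.0                         # true label

# Gradient of the binary cross-entropy loss with respect to the input x.
p = sigmoid(w @ x + b)
grad_x = (p - y) * w

# Fast Gradient Sign Method: nudge the input in the direction that
# increases the loss, bounded by a small epsilon.
eps = 0.3
x_adv = x + eps * np.sign(grad_x)

print("original score   :", sigmoid(w @ x + b))      # above 0.5 (class 1)
print("adversarial score:", sigmoid(w @ x_adv + b))  # pushed below 0.5
```

Because the attacker knows the exact weights, one gradient step suffices to flip the prediction in this toy case; real white-box attacks iterate the same idea against far larger models.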
On the other hand, in a black-box attack, the attacker has little or no knowledge of the ML model, including its architecture, training data, decision boundaries, and optimization algorithm. Consequently, the attacker must engage with the ML model as an external user, taking a trial-and-error approach: probing it with inputs and analyzing its responses to uncover vulnerabilities.
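By contrast, a black-box attacker can only submit inputs and observe outputs. The sketch below assumes a hypothetical query_model function standing in for a remote prediction API and simply searches for a label flip by trial and error.

```python
# Hypothetical black-box probing sketch: the attacker can only call
# query_model() and observe the returned label, so they search for an
# evasive input with small random perturbations.
import numpy as np

rng = np.random.default_rng(0)

def query_model(x):
    # Placeholder for the opaque target model; the attacker never sees
    # this code, only the label it returns.
    w, b = np.array([1.5, -2.0, 0.5]), 0.1
    return int(w @ x + b > 0)

x = np.array([0.6, 0.1, 0.2])   # input currently labelled 1
assert query_model(x) == 1

for attempt in range(1000):
    candidate = x + rng.normal(scale=0.2, size=x.shape)
    if query_model(candidate) == 0:   # label flipped: evasion found
        print(f"evasive input found after {attempt + 1} queries:", candidate)
        break
```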
In general, AML focuses on four main threats: Evasion Attacks, Poisoning Attacks, Extraction Attacks, and Inference Attacks.
Evasion Attacks:
These occur during a model's testing or inference phase, where attackers modify inputs to cause incorrect predictions. These subtle changes are hard to detect but can significantly mislead the model. In this context, the case of the Cylance INFINITY AI engine presents a notable example. Cylance INFINITY AI used a scoring mechanism ranging from -1000 to +1000 to evaluate files, with certain executable file families whitelisted to minimize false positives (FP). This approach inadvertently introduced a bias towards code from these whitelisted files. Exploiting this vulnerability, researchers conducted an evasion attack by extracting strings from an online gaming program that was on Cylance's whitelist. They then appended these strings to malicious files, specifically the WannaCry and SamSam ransomware. As a result, the Cylance engine misclassified the altered ransomware files as benign, shifting their scores from strongly negative to strongly positive. The research findings were presented at BSides Sydney 2019.
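The string-appending trick can be illustrated with a toy scorer. Everything in this sketch, the scorer, the string lists, and the score weights, is hypothetical and only mimics the reported bias; it is not Cylance's actual engine.

```python
# Toy illustration of string-appending evasion: the made-up scorer rewards
# strings it associates with whitelisted software, so padding a malicious
# file's strings with "benign" ones flips its verdict.
BENIGN_HINTS = {"GameEngine.dll", "RenderLoop", "LevelEditor", "Copyright Example Games"}
MALICIOUS_HINTS = {"DecryptFilesHere.txt", "bitcoin_wallet", "vssadmin delete shadows"}

def toy_score(strings):
    # Positive score => classified benign, negative => malicious.
    score = 0
    for s in strings:
        if s in BENIGN_HINTS:
            score += 300   # bias toward strings from whitelisted software
        if s in MALICIOUS_HINTS:
            score -= 200
    return max(-1000, min(1000, score))

ransomware_strings = ["DecryptFilesHere.txt", "bitcoin_wallet", "vssadmin delete shadows"]
print("before:", toy_score(ransomware_strings))   # strongly negative

# Evasion: append strings harvested from a whitelisted program.
padded = ransomware_strings + list(BENIGN_HINTS)
print("after :", toy_score(padded))               # flips to positive (benign)
```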
Poisoning Attacks:
Targeting the training phase, these attacks inject harmful data into the training set, corrupting the model and degrading its performance, which poses serious security risks. A notable instance involved altering training images for self-driving car algorithms: manipulated stop sign images caused misidentification of road signs, demonstrating the severe real-world implications of such attacks.
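Below is a minimal sketch of label-flipping poisoning on synthetic data (using scikit-learn purely for illustration, unrelated to any self-driving pipeline). Flipping the labels of part of one class typically biases the learned boundary and lowers test accuracy; the dataset, model, and poisoning rate are illustrative assumptions.

```python
# Minimal label-flipping poisoning sketch on toy data: relabelling a chunk
# of one class in the training set biases the learned decision boundary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression().fit(X_train, y_train)
print("clean accuracy   :", clean_model.score(X_test, y_test))

# Attacker relabels 40% of class-1 training examples as class 0.
rng = np.random.default_rng(0)
ones = np.where(y_train == 1)[0]
idx = rng.choice(ones, size=int(0.4 * len(ones)), replace=False)
poisoned = y_train.copy()
poisoned[idx] = 0

# The poisoned model under-predicts class 1, so test accuracy usually drops.
poisoned_model = LogisticRegression().fit(X_train, poisoned)
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```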
Extraction Attacks:
Attackers reverse-engineer a model to extract crucial details such as its structure or training data, compromising its integrity and enabling more targeted follow-up attacks. Researchers have repeatedly demonstrated the feasibility of model extraction in practice, showing that even sophisticated image analysis systems are susceptible to such security breaches.
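A common extraction recipe is to treat the victim model as a labelling oracle and train a local surrogate on its answers. The sketch below uses toy scikit-learn models as stand-ins for both victim and surrogate; the query budget and model choices are illustrative assumptions, not a claim about any particular deployed system.

```python
# Minimal model-extraction sketch: query the victim on attacker-chosen
# inputs, then train a local surrogate that mimics its decision boundary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
victim = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=1).fit(X, y)

# Step 1: generate query inputs (the attacker has no access to the training data).
rng = np.random.default_rng(1)
queries = rng.normal(size=(5000, 10))

# Step 2: label the queries with the victim's own predictions.
stolen_labels = victim.predict(queries)

# Step 3: train a surrogate on the stolen input/label pairs.
surrogate = LogisticRegression().fit(queries, stolen_labels)

# Agreement between surrogate and victim on fresh inputs (typically high
# for simple victims like this toy example).
probe = rng.normal(size=(2000, 10))
agreement = (surrogate.predict(probe) == victim.predict(probe)).mean()
print("surrogate/victim agreement:", agreement)
```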
Inference Attacks:
These attacks analyze a model's outputs to infer private data, posing a significant privacy risk, especially with sensitive information such as medical or financial records. The Netflix Prize is a well-known example of an inference attack. When Netflix released an anonymized dataset for a competition aimed at improving its recommendation algorithm, researchers found a way to de-anonymize part of it. By cross-referencing the Netflix data with publicly available IMDb movie ratings, they were able to identify certain Netflix users.
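Model-based inference attacks often exploit the gap between how a model behaves on records it was trained on and on records it has never seen. Below is a minimal sketch of one such technique, membership inference via confidence thresholding, on synthetic data; the model, threshold, and dataset are all illustrative assumptions.

```python
# Minimal membership-inference sketch: an overfit model tends to be more
# confident on its training records than on unseen ones, which leaks
# whether a given record was part of the training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=2)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=2)

# Target model trained only on the "member" split (prone to overfitting).
target = RandomForestClassifier(n_estimators=50, random_state=2).fit(X_in, y_in)

def top_confidence(model, X):
    # Highest predicted class probability for each record.
    return model.predict_proba(X).max(axis=1)

conf_members = top_confidence(target, X_in)       # records the model trained on
conf_nonmembers = top_confidence(target, X_out)   # records it never saw

# Simple attack: guess "member" whenever confidence exceeds a threshold.
threshold = 0.95
print("members flagged    :", (conf_members > threshold).mean())
print("non-members flagged:", (conf_nonmembers > threshold).mean())
```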
Muninn Innovation Lab (MIL) is a crucial bridge connecting Muninn with industry partners and academic institutions, putting us at the forefront of cybersecurity research and development. With successful collaborations involving grants, research projects, and internships with leading universities, MIL continues to foster innovation, enhance our cybersecurity solutions, and cultivate the next generation of cybersecurity talent.