Attack Strength vs. Detectability Dilemma in Adversarial Machine Learning.

Christopher Frederickson, Michael Moore, Glenn Dawson, Robi Polikar

As the prevalence and everyday use of machine learning algorithms, along with our reliance on these algorithms, grow dramatically, so do the efforts to attack and undermine these algorithms with malicious intent, resulting in a growing interest in adversarial machine learning. A number of approaches have been developed that can render a machine learning algorithm ineffective through poisoning or other types of attacks. Most attack algorithms typically use sophisticated optimization approaches whose objective function is designed to cause maximum damage to the accuracy and performance of the algorithm on some task. In this effort, we show that while such an objective function is indeed brutally effective in causing maximum damage on an embedded feature selection task, it often results in an attack mechanism that can be easily detected with an embarrassingly simple novelty or outlier detection algorithm. We then propose an equally simple yet elegant solution: adding a regularization term to the attacker's objective function that penalizes outlying attack points.
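The detectability side of this dilemma can be illustrated with a minimal sketch. The snippet below is not the paper's method; it is a generic k-nearest-neighbor outlier score (mean distance to the k nearest neighbors), with a synthetic benign cluster and a single hypothetical far-out poisoning point, showing how an unconstrained, maximally damaging attack point stands out to even a very simple detector.

```python
import numpy as np

def knn_outlier_scores(X, k=5):
    """Score each point by its mean distance to its k nearest neighbors.

    Higher scores indicate points lying far from the rest of the data;
    this is the kind of embarrassingly simple novelty/outlier check that
    can expose aggressive poisoning points.
    """
    # Pairwise Euclidean distances between all points
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)          # exclude self-distance
    nearest = np.sort(dists, axis=1)[:, :k]  # k smallest distances per point
    return nearest.mean(axis=1)

rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=(100, 2))  # benign training data
attack = np.array([[8.0, 8.0]])  # hypothetical maximally damaging poison point
X = np.vstack([clean, attack])

scores = knn_outlier_scores(X)
print(int(np.argmax(scores)))  # index of the most outlying point -> 100 (the attack point)
```

A regularized attacker, in the spirit the abstract describes, would trade off some damage to keep its poisoning points closer to the benign cluster, lowering exactly this kind of outlier score.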
