Reverse-engineering deception-based attacks
The Defense Advanced Research Projects Agency is looking for ways to automate the identification of the tools and processes used in deepfakes and other adversarial machine learning attacks, so an adversary’s unique weaknesses can be targeted and responses developed at scale.
With deception playing an increasingly large role in information-based attacks, the Defense Advanced Research Projects Agency is looking for ways to reverse engineer the toolchains behind the deliberately falsified images, video, audio and text used in adversarial machine learning attacks.
The Reverse Engineering of Deceptions (RED) program, described in a July 1 Artificial Intelligence Exploration Opportunity, aims to develop techniques that automatically identify the tools and processes used in attacks and expose an adversary’s unique weaknesses so a targeted, intelligent response can be developed.
RED will initially produce algorithms for automatically identifying the toolchains behind information deception attacks. In Phase 2, the program will develop scalable databases of attack toolchains to support attribution and defense.
To illustrate how RED might work, DARPA described a simple scenario in which an intelligence analyst is reviewing a video. Attackers may surreptitiously alter the video before the analyst sees it, perhaps changing the identity of a person in the video to misdirect attribution. For this deepfake, or adversarial video, RED would capture the adversarial input (the inserted or altered data) and determine whether it has characteristics seen in other deepfake algorithms.
That analysis may implicitly or explicitly capture information that reveals the adversary’s goals and the families of tools used, which may be publicly available, modified from publicly available techniques or completely new. In the wild, such analysis will be far more complex, DARPA said, as “toolchains may contain multiple tools, including ones that attempt to obscure deception.”
“Recovering such information could provide important clues as to the originator of the attack, their goals, and provide insight into what defenses might be most effective against their attacks,” DARPA said. Evidence that an attacker slightly modified a publicly available toolchain, for example, could indicate the adversary has enough skill to modify an existing tool. If the adversarial input is of an unknown type and unrelated to previous attack tools, the adversary may be highly skilled and have access to significant resources.
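The announcement stops at that level of description, but the basic matching step in the scenario can be sketched. The Python snippet below is only an illustration under assumed conditions: it supposes both the original and the altered frame are available, that known toolchain families leave statistically distinct residue, and that simple summary statistics are enough to tell them apart. The toolchain names, fingerprint values and tolerance are invented for the example.

```python
# Hypothetical sketch only -- not DARPA's method. Assumes the adversarial input
# can be recovered as the difference between an original and an altered frame,
# and that known toolchain families leave statistically distinct residue.
import numpy as np

# Invented fingerprints: mean and standard deviation of the residue each
# hypothetical toolchain family tends to leave behind.
KNOWN_TOOLCHAINS = {
    "public_gan_v1": {"mean": 0.02, "std": 0.11},
    "modified_faceswap": {"mean": 0.05, "std": 0.19},
}

def extract_residue(original_frame, suspect_frame):
    """Estimate the adversarial input as the pixel-level difference."""
    return suspect_frame.astype(np.float64) - original_frame.astype(np.float64)

def match_toolchain(residue, tolerance=0.05):
    """Compare simple residue statistics against the known fingerprints."""
    stats = {"mean": float(residue.mean()), "std": float(residue.std())}
    for name, fingerprint in KNOWN_TOOLCHAINS.items():
        if all(abs(stats[key] - value) <= tolerance for key, value in fingerprint.items()):
            return name
    return None  # unknown residue -- possibly a novel, well-resourced toolchain

# Toy usage: a synthetic "frame" perturbed with noise matching the first family.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
altered = frame + rng.normal(0.02, 0.11, frame.shape)
print(match_toolchain(extract_residue(frame, altered)))  # -> "public_gan_v1"
```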
Although researchers have already shown they can detect or defend against some adversarial attacks, “these techniques fail in the real world scenario where direct access to the network or large training datasets are not available,” DARPA said. Additionally, an effective program must be able to scale, cataloging the unique signatures that identify toolchains of adversarial attacks in a format suitable for storage and search.
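DARPA does not say what that format should look like. One plausible shape, offered purely as a hedged sketch, is a catalog of fixed-length signature vectors keyed by toolchain family, with nearest-neighbor search over new attack instances; the class, vector contents and distance threshold below are assumptions for illustration, not anything specified in the announcement.

```python
# Illustrative catalog of toolchain signatures with simple nearest-neighbor
# search. The signature layout and threshold are invented for this sketch.
import numpy as np

class SignatureCatalog:
    """Stores per-toolchain signature vectors and supports approximate lookup."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.entries = []  # list of (family_name, signature_vector) pairs

    def add(self, family, signature):
        self.entries.append((family, np.asarray(signature, dtype=np.float64)))

    def search(self, signature):
        """Return (family, distance) for the closest catalogued signature, or None."""
        signature = np.asarray(signature, dtype=np.float64)
        if not self.entries:
            return None
        family, dist = min(
            ((name, float(np.linalg.norm(vec - signature))) for name, vec in self.entries),
            key=lambda item: item[1],
        )
        return (family, dist) if dist <= self.threshold else None

# Toy usage with invented signature vectors.
catalog = SignatureCatalog()
catalog.add("public_gan_v1", [0.02, 0.11, 0.80])
catalog.add("modified_faceswap", [0.05, 0.19, 0.30])
print(catalog.search([0.03, 0.12, 0.75]))  # -> ('public_gan_v1', ~0.05)
```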
Long term, DARPA wants RED to support development of techniques that require little or no a priori knowledge of specific deception toolchains and can generalize across multiple information deception scenarios. Ideally, RED would require minimal attack instances to learn unique attack signatures, automatically group examples together so toolchain families can be identified, and scale to internet volumes of information.
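As a rough illustration of that grouping goal, and assuming attack instances can be reduced to numeric signature vectors, an unsupervised clustering pass could separate toolchain families without knowing them in advance. The feature values, DBSCAN parameters and choice of scikit-learn below are illustrative assumptions, not part of the RED announcement.

```python
# Hedged sketch: density-based clustering of invented signature vectors into
# toolchain "families", with unmatched instances flagged as noise (-1).
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# Synthetic signatures: two tight families plus one outlier instance.
family_a = rng.normal([0.02, 0.11], 0.005, size=(20, 2))
family_b = rng.normal([0.05, 0.19], 0.005, size=(20, 2))
outlier = np.array([[0.30, 0.40]])
signatures = np.vstack([family_a, family_b, outlier])

labels = DBSCAN(eps=0.02, min_samples=5).fit_predict(signatures)
print(sorted(set(int(label) for label in labels)))  # e.g. [-1, 0, 1]
```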
For this first RED effort, however, DARPA is looking for proposals that describe technical approaches that will generalize broadly across information domains and toolchain families and deliver roadmaps for highly automated systems that would satisfy many of the long-term goals.
Read more about RED here.