Advanced

The Adversarial class provides an advanced way to specify the adversarial example that an attack should find, gives access to detailed information about the created adversarial, and makes it possible to improve a previously found adversarial example by re-running an attack.

from foolbox.v1 import Adversarial
from foolbox.v1.attacks import LBFGSAttack
from foolbox.models import TensorFlowModel
from foolbox.criteria import TargetClassProbability

Implicit

model = TensorFlowModel(images, logits, bounds=(0, 255))
criterion = TargetClassProbability('ostrich', p=0.99)
attack = LBFGSAttack(model, criterion)

Running the attack by passing an input and a label will implicitly create an Adversarial instance. By passing unpack=False we tell the attack to return the Adversarial instance rather than a numpy array.

adversarial = attack(image, label=label, unpack=False)

We can then get the actual adversarial input using the perturbed attribute:

adversarial_image = adversarial.perturbed
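
The Adversarial instance also gives access to detailed information about the result. The sketch below assumes the attribute names of the Foolbox 1.x/2.x Adversarial class (unperturbed, original_class, output, adversarial_class, distance); check the version you have installed.

# the original, unperturbed input and the label it was created for
original = adversarial.unperturbed
original_class = adversarial.original_class

# the model's prediction (logits) for the perturbed input and the resulting class
logits = adversarial.output
predicted_class = adversarial.adversarial_class

# the distance between original and perturbed input under the chosen distance measure
print(adversarial.distance)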

Explicit

model = TensorFlowModel(images, logits, bounds=(0, 255))
criterion = TargetClassProbability('ostrich', p=0.99)
attack = LBFGSAttack()

We can also create the Adversarial instance ourselves and then pass it to the attack.

adversarial = Adversarial(model, criterion, image, label)
attack(adversarial)

Again, we can get the adversarial input using the perturbed attribute:

adversarial_image = adversarial.perturbed

This approach gives us more flexibility and allows us to specify a different distance measure:

from foolbox.distances import MeanAbsoluteDistance

distance = MeanAbsoluteDistance
adversarial = Adversarial(model, criterion, image, label, distance=distance)
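
As mentioned above, a previously found adversarial example can be improved by re-running an attack on the same Adversarial instance: the instance keeps track of the best (smallest-distance) perturbation found so far and only replaces it when a new attack finds a closer one. The sketch below assumes that FGSM is available in foolbox.v1.attacks; any other attack that fits the criterion can be used as the second attack.

from foolbox.v1.attacks import FGSM

adversarial = Adversarial(model, criterion, image, label, distance=distance)

# the first attack populates adversarial.perturbed and adversarial.distance
LBFGSAttack()(adversarial)
print(adversarial.distance)

# re-running another attack on the same instance only updates the stored
# perturbation if the new one is closer under the chosen distance measure
FGSM()(adversarial)
print(adversarial.distance)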