foolbox.adversarial

Provides a class that represents an adversarial example.

class foolbox.adversarial.Adversarial(model, criterion, original_image, original_class, distance=<class 'foolbox.distances.MeanSquaredDistance'>, verbose=False)[source]

Defines an adversarial that should be found and stores the result.

The Adversarial class represents a single adversarial example for a given model, criterion and reference image. It can be passed to an adversarial attack to find the actual adversarial.

Parameters:

model : a Model instance

The model that should be fooled by the adversarial.

criterion : a Criterion instance

The criterion that determines which images are adversarial.

original_image : a numpy.ndarray

The original image to which the adversarial image should be as close as possible.

original_class : int

The ground-truth label of the original image.

distance : a Distance class

The distance measure used to quantify how close the adversarial is to the original image. Defaults to MeanSquaredDistance.
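To illustrate the bookkeeping this class performs, here is a pure-NumPy sketch (a toy stand-in, not the real Foolbox implementation): an Adversarial-like object checks each candidate image against a misclassification criterion and keeps the closest adversarial seen so far. The `TinyAdversarial` class and the brightness "model" are hypothetical names invented for this example.

```python
import numpy as np

class TinyAdversarial:
    """Toy stand-in (not the real Foolbox class) that records the
    closest misclassified image seen so far."""

    def __init__(self, predict, original_image, original_class):
        self._predict = predict          # callable: image -> logits
        self._original = original_image
        self._label = original_class
        self.best_image = None
        self.best_distance = np.inf      # MSE to the original image

    def predictions(self, image):
        """Check a candidate and update the best adversarial found so far."""
        logits = self._predict(image)
        if np.argmax(logits) != self._label:     # misclassification criterion
            d = np.mean((image - self._original) ** 2)
            if d < self.best_distance:
                self.best_distance = d
                self.best_image = image
        return logits

# Usage: a fake "model" that labels an image by its mean brightness.
predict = lambda img: np.array([1.0, 0.0]) if img.mean() < 0.5 else np.array([0.0, 1.0])
original = np.zeros((2, 2, 3))            # predicted as class 0
adv = TinyAdversarial(predict, original, original_class=0)
adv.predictions(np.ones((2, 2, 3)))       # class 1 -> adversarial, recorded
print(adv.best_distance)                  # 1.0
```

The real class works the same way in spirit: attacks call its prediction methods, and it transparently tracks the best adversarial found.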

batch_predictions(images, greedy=False, strict=True, return_details=False)[source]

Interface to model.batch_predictions for attacks.

Parameters:

images : numpy.ndarray

Batch of images with shape (batch size, height, width, channels).

greedy : bool

If True, the method returns as soon as the first adversarial image in the batch is found, rather than processing the whole batch.

strict : bool

Controls whether the pixel values are checked against the model's bounds.
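The greedy semantics can be sketched in plain NumPy (illustrative only, not Foolbox internals): iterate over the batch and stop early at the first image the criterion flags as adversarial. The function and helper names below are hypothetical.

```python
import numpy as np

def batch_predictions_sketch(predict, is_adversarial, images, greedy=False):
    """Sketch of greedy batch handling: with greedy=True, stop at the
    first image whose predictions satisfy the adversarial criterion."""
    outputs = []
    for i, image in enumerate(images):
        logits = predict(image)
        outputs.append(logits)
        if greedy and is_adversarial(logits):
            # Return predictions computed so far plus the index found.
            return np.stack(outputs), i
    return np.stack(outputs), None

# Toy model: two logits derived from the mean pixel value.
predict = lambda img: np.array([img.mean(), 1.0 - img.mean()])
is_adv = lambda logits: np.argmax(logits) != 0   # ground-truth class is 0
batch = np.stack([np.full((2, 2), 0.9),          # class 0, not adversarial
                  np.full((2, 2), 0.1),          # class 1, adversarial
                  np.full((2, 2), 0.8)])
logits, idx = batch_predictions_sketch(predict, is_adv, batch, greedy=True)
print(idx)   # 1 -> stopped at the second image
```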

channel_axis(batch)[source]

Interface to model.channel_axis for attacks.

Parameters:

batch : bool

Controls whether the index of the axis for a batch of images (4 dimensions) or a single image (3 dimensions) should be returned.
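For a channels-last layout the behavior can be sketched as follows: a batch is NHWC, so channels sit at axis 3, while a single image is HWC with channels at axis 2. The actual axis returned depends on the data format of the wrapped model (channels-first models would report axis 1); this sketch assumes channels-last.

```python
def channel_axis_sketch(batch):
    """Illustrative sketch assuming a channels-last layout:
    NHWC batches keep channels at axis 3, HWC images at axis 2."""
    return 3 if batch else 2

print(channel_axis_sketch(batch=True))   # 3
print(channel_axis_sketch(batch=False))  # 2
```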

gradient(image=None, label=None, strict=True)[source]

Interface to model.gradient for attacks.

Parameters:

image : numpy.ndarray

Image with shape (height, width, channels). Defaults to the original image.

label : int

Label used to calculate the loss that is differentiated. Defaults to the original label.

strict : bool

Controls whether the pixel values are checked against the model's bounds.

has_gradient()[source]

Returns True if _backward and _forward_backward can be called by an attack, False otherwise.

normalized_distance(image)[source]

Calculates the distance of a given image to the original image.

Parameters:

image : numpy.ndarray

The image that should be compared to the original image.

Returns:

Distance

The distance between the given image and the original image.
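As a sketch of what a normalized distance looks like, the following computes a mean squared distance divided by the squared bound range, so values are comparable across models with different pixel ranges (e.g. [0, 1] vs. [0, 255]). This is an illustrative assumption about the normalization, not a verbatim copy of the library's implementation.

```python
import numpy as np

def normalized_mse(image, reference, bounds=(0.0, 255.0)):
    """Sketch of a bounds-normalized mean squared distance: dividing by
    (max - min)^2 makes the value independent of the pixel range."""
    min_, max_ = bounds
    return np.mean((image - reference) ** 2) / (max_ - min_) ** 2

a = np.zeros((4, 4, 3))
b = np.full((4, 4, 3), 255.0)
print(normalized_mse(b, a))   # 1.0 -- maximally distant under (0, 255) bounds
```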

predictions(image, strict=True, return_details=False)[source]

Interface to model.predictions for attacks.

Parameters:

image : numpy.ndarray

Image with shape (height, width, channels).

strict : bool

Controls whether the pixel values are checked against the model's bounds.

predictions_and_gradient(image=None, label=None, strict=True, return_details=False)[source]

Interface to model.predictions_and_gradient for attacks.

Parameters:

image : numpy.ndarray

Image with shape (height, width, channels). Defaults to the original image.

label : int

Label used to calculate the loss that is differentiated. Defaults to the original label.

strict : bool

Controls whether the pixel values are checked against the model's bounds.
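Conceptually, this method returns the model's predictions together with the gradient of the loss (with respect to the given label) with respect to the image. The sketch below illustrates that contract using a finite-difference gradient of a cross-entropy loss on a toy model; real models use analytic backpropagation, and all names here are hypothetical.

```python
import numpy as np

def predictions_and_gradient_sketch(predict, image, label, eps=1e-5):
    """Illustrative: return logits and a finite-difference estimate of
    d(cross-entropy loss)/d(image) for the given label."""
    def loss(img):
        logits = predict(img)
        z = logits - logits.max()                 # numerically stable softmax
        log_probs = z - np.log(np.exp(z).sum())
        return -log_probs[label]                  # cross-entropy for `label`

    logits = predict(image)
    grad = np.zeros_like(image)
    it = np.nditer(image, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        bumped = image.copy()
        bumped[idx] += eps                        # perturb one pixel
        grad[idx] = (loss(bumped) - loss(image)) / eps
    return logits, grad

# Toy linear "model": two logits derived from the pixel sum.
predict = lambda img: np.array([img.sum(), -img.sum()])
image = np.zeros((2, 2))
logits, grad = predictions_and_gradient_sketch(predict, image, label=0)
print(grad.shape)   # (2, 2)
```

Attacks use this gradient to decide in which direction to perturb the image; here every pixel gets a negative gradient because increasing any pixel increases the logit of the correct class and thus lowers the loss.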

target_class()[source]

Interface to criterion.target_class for attacks.