foolbox.adversarial

Provides a class that represents an adversarial example.

class foolbox.adversarial.Adversarial(model, criterion, unperturbed, original_class, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None, verbose=False)[source]
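
The following is a minimal construction sketch, not part of this API: ToyModel is a hypothetical stand-in for a real wrapped network, and the snippet assumes foolbox 2.x, where foolbox.models.DifferentiableModel subclasses implement forward, gradient, backward and num_classes. The later snippets on this page reuse model, image, label and adv from here.

    import numpy as np
    import foolbox

    class ToyModel(foolbox.models.DifferentiableModel):
        # Hypothetical linear two-class model, for illustration only.
        def __init__(self):
            super(ToyModel, self).__init__(bounds=(0, 1), channel_axis=3)
            self.w = np.random.RandomState(0).randn(2, 32 * 32 * 3).astype(np.float32)

        def num_classes(self):
            return 2

        def forward(self, inputs):
            # logits for a batch of inputs
            return inputs.reshape(len(inputs), -1).dot(self.w.T)

        def gradient(self, inputs, labels):
            # gradient of the cross-entropy loss w.r.t. the inputs
            logits = self.forward(inputs)
            e = np.exp(logits - logits.max(axis=1, keepdims=True))
            probs = e / e.sum(axis=1, keepdims=True)
            probs[np.arange(len(labels)), labels] -= 1.0
            return probs.dot(self.w).reshape(inputs.shape)

        def backward(self, gradient, inputs):
            # chain rule: gradient w.r.t. logits -> gradient w.r.t. inputs
            return gradient.dot(self.w).reshape(inputs.shape)

    model = ToyModel()
    image = np.random.RandomState(1).rand(32, 32, 3).astype(np.float32)
    label = int(np.argmax(model.forward_one(image)))  # model's own prediction
    adv = foolbox.adversarial.Adversarial(
        model, foolbox.criteria.Misclassification(), image, label)
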
adversarial_class[source]

The argmax of the model predictions for the best adversarial found so far.

None if no adversarial has been found.
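
Continuing the sketch above: immediately after construction no adversarial exists yet, so the property is None.

    print(adv.adversarial_class)  # None until an adversarial has been found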

backward_one(self, gradient, x=None, strict=True)[source]

Interface to model.backward_one for attacks.

Parameters:
gradient : numpy.ndarray

Gradient of some loss w.r.t. the logits.

x : numpy.ndarray

Single input with shape as expected by the model (without the batch dimension). Defaults to the original input.

Returns:
gradient : numpy.ndarray

The gradient w.r.t. the input.

See also

gradient_one()
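
Continuing the sketch above, a hedged usage example; the made-up loss below simply sums the logits, so its gradient w.r.t. the logits is a vector of ones.

    logit_grad = np.ones(model.num_classes(), dtype=np.float32)
    input_grad = adv.backward_one(logit_grad)  # x=None: the original input
    assert input_grad.shape == adv.unperturbed.shape
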
channel_axis(self, batch)[source]

Interface to model.channel_axis for attacks.

Parameters:
batch : bool

If True, returns the index of the channel axis for a batch of inputs (4 dimensions); if False, for a single input (3 dimensions).
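
Continuing the sketch above (the ToyModel was wrapped with channel_axis=3, i.e. channels-last):

    print(adv.channel_axis(batch=True))   # 3: axis index within a batch
    print(adv.channel_axis(batch=False))  # 2: axis index within a single input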

distance[source]

The distance of the adversarial input to the original input.
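
Continuing the sketch above; by default this is a foolbox.distances.MeanSquaredDistance instance.

    print(adv.distance)  # worst-case value until an adversarial is found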

forward(self, inputs, greedy=False, strict=True, return_details=False)[source]

Interface to model.forward for attacks.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the model.

greedy : bool

If True, the call returns as soon as the first adversarial input in the batch has been found.

strict : bool

Controls if the bounds for the pixel values should be checked.
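
Continuing the sketch above, a hedged batch query; the exact return values follow the foolbox 2.x conventions, so the snippet only unpacks the predictions.

    batch = np.stack([image, np.clip(image + 0.3, 0, 1)]).astype(np.float32)
    result = adv.forward(batch)  # predictions plus adversarial bookkeeping
    predictions = result[0]
    assert predictions.shape == (2, model.num_classes())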

forward_and_gradient(self, x, label=None, strict=True, return_details=False)[source]

Interface to model.forward_and_gradient for attacks.

Parameters:
x : numpy.ndarray

Batch of inputs with shape as expected by the model (with the batch dimension).

label : numpy.ndarray

Labels used to calculate the loss that is differentiated. Defaults to the original label for every input.

strict : bool

Controls if the bounds for the pixel values should be checked.
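
Continuing the sketch above; this assumes the (predictions, gradients, is_adversarial) return order of foolbox 2.x.

    batch = np.stack([image, image])
    labels = np.array([label, label])
    predictions, gradients, is_adv = adv.forward_and_gradient(batch, labels)
    assert gradients.shape == batch.shape  # one loss gradient per input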

forward_and_gradient_one(self, x=None, label=None, strict=True, return_details=False)[source]

Interface to model.forward_and_gradient_one for attacks.

Parameters:
x : numpy.ndarray

Single input with shape as expected by the model (without the batch dimension). Defaults to the original input.

label : int

Label used to calculate the loss that is differentiated. Defaults to the original label.

strict : bool

Controls if the bounds for the pixel values should be checked.
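
Continuing the sketch above; with no arguments the original input and label are used (assuming the (predictions, gradient, is_adversarial) return order of foolbox 2.x).

    predictions, gradient, is_adv = adv.forward_and_gradient_one()
    assert gradient.shape == image.shape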

forward_one(self, x, strict=True, return_details=False)[source]

Interface to model.forward_one for attacks.

Parameters:
x : numpy.ndarray

Single input with shape as expected by the model (without the batch dimension).

strict : bool

Controls if the bounds for the pixel values should be checked.
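
Continuing the sketch above; assuming the (predictions, is_adversarial) return pair of foolbox 2.x.

    candidate = np.clip(image + 0.1, 0, 1).astype(np.float32)
    predictions, is_adv = adv.forward_one(candidate)
    print(is_adv)  # True if candidate satisfies the criterion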

gradient_one(self, x=None, label=None, strict=True)[source]

Interface to model.gradient_one for attacks.

Parameters:
x : numpy.ndarray

Single input with shape as expected by the model (without the batch dimension). Defaults to the original input.

label : int

Label used to calculate the loss that is differentiated. Defaults to the original label.

strict : bool

Controls if the bounds for the pixel values should be checked.
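
Continuing the sketch above, a hedged one-step usage in the style of a fast gradient sign attack (not foolbox's implementation of it):

    g = adv.gradient_one()  # loss gradient at the original input and label
    candidate = np.clip(image + 0.1 * np.sign(g), 0, 1).astype(np.float32)
    _, is_adv = adv.forward_one(candidate)  # recorded if adversarial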

has_gradient(self)[source]

Returns True if _backward and _forward_backward can be called by an attack, and False otherwise.
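
Continuing the sketch above, the typical attack-side guard:

    if adv.has_gradient():
        g = adv.gradient_one()
    # else: fall back to a gradient-free search strategy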

normalized_distance(self, x)[source]

Calculates the distance of a given input x to the original input.

Parameters:
x : numpy.ndarray

The input x that should be compared to the original input.

Returns:
Distance

The distance between the given input and the original input.
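
Continuing the sketch above; the returned Distance object exposes its scalar via .value.

    d = adv.normalized_distance(np.clip(image + 0.05, 0, 1).astype(np.float32))
    print(d.value)  # MSE between the candidate and the original input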

original_class[source]

The class of the original input (ground-truth, not model prediction).

output[source]

The model predictions for the best adversarial found so far.

None if no adversarial has been found.

perturbed[source]

The best adversarial example found so far.
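
Continuing the sketch above, the bookkeeping properties together; the large perturbation is only meant to make a misclassification likely.

    _, is_adv = adv.forward_one(np.clip(image + 0.5, 0, 1).astype(np.float32))
    if is_adv:
        print(adv.perturbed.shape)    # best adversarial found so far
        print(adv.output)             # its logits
        print(adv.adversarial_class)  # argmax of those logits
    print(adv.original_class == label)  # ground-truth label, always set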

reached_threshold(self)[source]

Returns True if a threshold is given and the currently best adversarial distance is smaller than the threshold.
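
Continuing the sketch above; a threshold lets attacks stop early (here a hypothetical MSE threshold of 1e-2).

    adv2 = foolbox.adversarial.Adversarial(
        model, foolbox.criteria.Misclassification(), image, label,
        threshold=1e-2)
    if adv2.reached_threshold():
        pass  # attack loops typically return early at this point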

target_class[source]

Interface to criterion.target_class for attacks.

unperturbed[source]

The original input.
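
Continuing the sketch above; Misclassification is untargeted, so no target class is set.

    print(adv.target_class)  # None for untargeted criteria
    assert np.array_equal(adv.unperturbed, image)  # original input, unchanged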