foolbox.adversarial
Provides a class that represents an adversarial example.

class foolbox.adversarial.Adversarial(model, criterion, original_image, original_class, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None, verbose=False)
Defines an adversarial that should be found and stores the result.
The Adversarial class represents a single adversarial example for a given model, criterion and reference image. It can be passed to an adversarial attack to find the actual adversarial.
Parameters:  model : a Model instance
The model that should be fooled by the adversarial.
 criterion : a Criterion instance
The criterion that determines which images are adversarial.
 original_image : a numpy.ndarray
The original image to which the adversarial image should be as close as possible.
 original_class : int
The ground-truth label of the original image.
 distance : a Distance class
The measure used to quantify similarity between images.
 threshold : float or Distance
If not None, the attack stops as soon as the adversarial perturbation is smaller than this threshold. It can be an instance of the Distance class passed to the distance argument, or a float assumed to have the same unit as the given distance. If None, the attack simply minimizes the distance as well as possible. Note that the threshold only influences early stopping of the attack; the returned adversarial does not necessarily have a perturbation smaller than this threshold. Use the reached_threshold() method to check whether the threshold has been reached.
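
The threshold semantics described above can be illustrated with a small stand-in. This is only a sketch, not the foolbox implementation: the class name BestSoFar, its update method, the plain mean squared error and the inclusive comparison against the threshold are all assumptions made for illustration.

```python
import numpy as np


class BestSoFar:
    """Hypothetical stand-in (not the foolbox class): tracks the closest adversarial."""

    def __init__(self, original, threshold=None):
        self.original = np.asarray(original, dtype=np.float64)
        self.threshold = threshold  # float, assumed to share the unit of the distance
        self.best_image = None
        self.best_distance = np.inf

    def distance(self, image):
        # Plain mean squared error; the real Distance classes may normalize differently.
        return float(np.mean((np.asarray(image) - self.original) ** 2))

    def update(self, image, is_adversarial):
        """Record a candidate and report whether the attack may stop early."""
        d = self.distance(image)
        if is_adversarial and d < self.best_distance:
            self.best_image, self.best_distance = np.array(image), d
        return self.reached_threshold()

    def reached_threshold(self):
        # Assumption: inclusive comparison; without a threshold this is never True.
        return bool(self.threshold is not None
                    and self.best_distance <= self.threshold)
```

With threshold=None the tracker keeps minimizing and never signals an early stop, which mirrors the note that the threshold only affects when an attack stops, not what it returns.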

adversarial_class
The argmax of the model predictions for the best adversarial found so far.
None if no adversarial has been found.

backward(gradient, image=None, strict=True)
Interface to model.backward for attacks.
Parameters:  gradient : numpy.ndarray
Gradient of some loss w.r.t. the logits.
 image : numpy.ndarray
Single input with shape as expected by the model (without the batch dimension).
Returns:  gradient : numpy.ndarray
The gradient w.r.t. the image.

batch_predictions(images, greedy=False, strict=True, return_details=False)
Interface to model.batch_predictions for attacks.
Parameters:  images : numpy.ndarray
Batch of inputs with shape as expected by the model.
 greedy : bool
Whether the first adversarial should be returned.
 strict : bool
Controls if the bounds for the pixel values should be checked.
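
To make the effect of greedy concrete, here is a minimal sketch of how a batch of logits might be scanned. The helper batch_is_adversarial and the misclassification criterion are hypothetical; the actual return values of batch_predictions are not shown on this page and are not reproduced here.

```python
import numpy as np


def batch_is_adversarial(logits, original_class, greedy=False):
    """Hypothetical helper: flag rows of a logits batch that are misclassified.

    With greedy=True, report only the index of the first adversarial row
    (or None if there is none); otherwise return one boolean per row.
    """
    predicted = np.argmax(logits, axis=1)
    flags = predicted != original_class  # assumed criterion: misclassification
    if greedy:
        hits = np.flatnonzero(flags)
        return int(hits[0]) if hits.size else None
    return flags
```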

channel_axis(batch)
Interface to model.channel_axis for attacks.
Parameters:  batch : bool
Controls whether the index of the axis for a batch of images (4 dimensions) or a single image (3 dimensions) should be returned.
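
The relationship between the two axis indices can be sketched as follows, assuming the model reports its channel axis for batched input and that a single image simply lacks the leading batch dimension. The free function below is hypothetical and takes the model's axis as an argument instead of a model instance.

```python
import numpy as np


def channel_axis(model_channel_axis, batch):
    """Hypothetical helper: model_channel_axis indexes channels in a batched input."""
    # Dropping the leading batch dimension shifts every axis index down by one.
    return model_channel_axis if batch else model_channel_axis - 1


images = np.zeros((8, 32, 32, 3))  # channels-last batch (N, H, W, C)
axis_batch = channel_axis(3, batch=True)
axis_single = channel_axis(3, batch=False)
```

Here images.shape[axis_batch] and images[0].shape[axis_single] both refer to the channel dimension of size 3.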

gradient(image=None, label=None, strict=True)
Interface to model.gradient for attacks.
Parameters:  image : numpy.ndarray
Single input with shape as expected by the model (without the batch dimension). Defaults to the original image.
 label : int
Label used to calculate the loss that is differentiated. Defaults to the original label.
 strict : bool
Controls if the bounds for the pixel values should be checked.
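
As an illustration of what such a gradient contains, the sketch below computes the gradient of a cross-entropy loss with respect to the image for a toy linear model. Everything here is an assumption for illustration: the linear model (W, b), the cross-entropy loss, the default bounds, and the use of an assertion for the strict bounds check are not taken from this page.

```python
import numpy as np


def softmax(logits):
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()


def cross_entropy(W, b, image, label):
    logits = W @ image.ravel() + b
    return -np.log(softmax(logits)[label])


def gradient(W, b, image, label, bounds=(0.0, 1.0), strict=True):
    """Gradient of the cross-entropy loss w.r.t. the image for a toy linear model."""
    if strict:
        lo, hi = bounds
        # Assumed behavior: reject images whose pixel values leave the bounds.
        assert lo <= image.min() and image.max() <= hi, "image outside bounds"
    logits = W @ image.ravel() + b
    dlogits = softmax(logits)
    dlogits[label] -= 1.0  # d(loss)/d(logits) = softmax - one_hot(label)
    return (W.T @ dlogits).reshape(image.shape)  # chain rule through the linear layer
```

The result has the same shape as the input image, without a batch dimension, which matches the shape convention stated for the image parameter.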

has_gradient()
Returns True if _backward and _forward_backward can be called by an attack, False otherwise.

normalized_distance(image)
Calculates the distance of a given image to the original image.
Parameters:  image : numpy.ndarray
The image that should be compared to the original image.
Returns:  Distance
The distance between the given image and the original image.
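
A rough sketch of one way such a normalized distance could be computed is shown below. It assumes a mean squared error on pixel differences scaled by the width of the model's input bounds; the function name normalized_mse and the default bounds are hypothetical, and the value is returned as a plain float instead of a Distance instance.

```python
import numpy as np


def normalized_mse(image, original, bounds=(0.0, 255.0)):
    """Hypothetical helper: mean squared error after scaling by the bounds width."""
    min_, max_ = bounds
    diff = (np.asarray(image, dtype=np.float64) - original) / (max_ - min_)
    return float(np.mean(diff ** 2))
```

Scaling by the bounds makes distances comparable between models that expect pixel values in different ranges, for example [0, 1] and [0, 255].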

output
The model predictions for the best adversarial found so far.
None if no adversarial has been found.

predictions(image, strict=True, return_details=False)
Interface to model.predictions for attacks.
Parameters:  image : numpy.ndarray
Single input with shape as expected by the model (without the batch dimension).
 strict : bool
Controls if the bounds for the pixel values should be checked.

predictions_and_gradient(image=None, label=None, strict=True, return_details=False)
Interface to model.predictions_and_gradient for attacks.
Parameters:  image : numpy.ndarray
Single input with shape as expected by the model (without the batch dimension). Defaults to the original image.
 label : int
Label used to calculate the loss that is differentiated. Defaults to the original label.
 strict : bool
Controls if the bounds for the pixel values should be checked.
 model : a