Decision-based attacks

class foolbox.attacks.BoundaryAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

A powerful adversarial attack that requires neither gradients nor probabilities.

This is the reference implementation for the attack introduced in [Re72ca268aa551].
Notes
This implementation provides several advanced features:
- ability to continue previous attacks by passing an instance of the Adversarial class
- ability to pass an explicit starting point; especially to initialize a targeted attack
- ability to pass an alternative attack used for initialization
- fine-grained control over logging
- ability to specify the batch size
- optional automatic batch size tuning
- optional multithreading for random number generation
- optional multithreading for candidate point generation
References
[Re72ca268aa551] Wieland Brendel (*), Jonas Rauber (*), Matthias Bethge, “Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models”, https://arxiv.org/abs/1712.04248
__call__(self, input_or_adv, label=None, unpack=True, iterations=5000, max_directions=25, starting_point=None, initialization_attack=None, log_every_n_steps=1, spherical_step=0.01, source_step=0.01, step_adaptation=1.5, batch_size=1, tune_batch_size=True, threaded_rnd=True, threaded_gen=True, alternative_generator=False, internal_dtype=numpy.float64, verbose=False)

Applies the Boundary Attack.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, correctly classified input. If it is a numpy array, label must be passed as well. If it is an Adversarial instance, label must not be passed.
 label : int
The reference label of the original input. Must be passed if input is a numpy array, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 iterations : int
Maximum number of iterations to run. Might converge and stop before that.
 max_directions : int
Maximum number of trials per iteration.
 starting_point : numpy.ndarray
Adversarial input to use as a starting point, in particular for targeted attacks.
 initialization_attack :
Attack
Attack to use to find a starting point. Defaults to BlendedUniformNoiseAttack.
 log_every_n_steps : int
Determines the verbosity of the logging.
 spherical_step : float
Initial step size for the orthogonal (spherical) step.
 source_step : float
Initial step size for the step towards the target.
 step_adaptation : float
Factor by which the step sizes are multiplied or divided.
 batch_size : int
Batch size, or the initial batch size if tune_batch_size is True.
 tune_batch_size : bool
Whether or not the batch size should be automatically chosen between 1 and max_directions.
 threaded_rnd : bool
Whether the random number generation should be multithreaded.
 threaded_gen : bool
Whether the candidate point generation should be multithreaded.
 alternative_generator : bool
Whether an alternative implementation of the candidate generator should be used.
 internal_dtype : np.float32 or np.float64
Higher precision might be slower but is numerically more stable.
 verbose : bool
Controls verbosity of the attack.

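The core loop of the attack can be illustrated with a toy sketch in plain Python. The linear decision function and all helper names below are hypothetical, not Foolbox's actual implementation: a random orthogonal ("spherical") perturbation scaled to the current distance is followed by a small step towards the original, and a candidate is kept only if it remains adversarial.

```python
import math
import random

def is_adversarial(x):
    # hypothetical decision function: the class flips where x0 + x1 = 1
    return x[0] + x[1] > 1.0

def boundary_attack(original, adv, iterations=200,
                    spherical_step=0.1, source_step=0.05):
    for _ in range(iterations):
        # random perturbation scaled to the current distance (spherical step) ...
        direction = [random.gauss(0, 1) for _ in adv]
        norm = math.sqrt(sum(d * d for d in direction))
        dist = math.sqrt(sum((a - o) ** 2 for a, o in zip(adv, original)))
        candidate = [a + spherical_step * dist * d / norm
                     for a, d in zip(adv, direction)]
        # ... then a small step towards the original (source step)
        candidate = [c + source_step * (o - c) for c, o in zip(candidate, original)]
        if is_adversarial(candidate):   # keep only candidates that stay adversarial
            adv = candidate
    return adv

random.seed(0)
original = [0.2, 0.2]   # correctly classified input (class 0)
start = [1.0, 1.0]      # any misclassified starting point (class 1)
result = boundary_attack(original, start)
```

The real attack additionally adapts spherical_step and source_step (step_adaptation) based on recent success rates, batches candidate generation, and tracks the best adversarial found.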
class foolbox.attacks.SpatialAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Adversarially chosen rotations and translations [1].

This implementation is based on the reference implementation by Madry et al.: https://github.com/MadryLab/adversarial_spatial
References
[Rdffd25498f9d1] Logan Engstrom*, Brandon Tran*, Dimitris Tsipras*, Ludwig Schmidt, Aleksander Mądry: “A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations”, http://arxiv.org/abs/1712.02779 
__call__(self, input_or_adv, label=None, unpack=True, do_rotations=True, do_translations=True, x_shift_limits=(-5, 5), y_shift_limits=(-5, 5), angular_limits=(-5, 5), granularity=10, random_sampling=False, abort_early=True)

Adversarially chosen rotations and translations.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 do_rotations : bool
If False no rotations will be applied to the image.
 do_translations : bool
If False no translations will be applied to the image.
 x_shift_limits : int or (int, int)
Limits for horizontal translations in pixels. If one integer is provided the limits will be (-x_shift_limits, x_shift_limits).
 y_shift_limits : int or (int, int)
Limits for vertical translations in pixels. If one integer is provided the limits will be (-y_shift_limits, y_shift_limits).
 angular_limits : int or (int, int)
Limits for rotations in degrees. If one integer is provided the limits will be (-angular_limits, angular_limits).
 granularity : int
Density of sampling within limits for each dimension.
 random_sampling : bool
If True we sample translations/rotations randomly within limits, otherwise we use a regular grid.
 abort_early : bool
If True, the attack stops as soon as it finds an adversarial.

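With random_sampling=False, the attack exhaustively evaluates a regular grid of transformations. A minimal sketch of how such a grid could be built (hypothetical helper names, not Foolbox's code):

```python
# Build a regular grid of (x_shift, y_shift, angle) candidate transformations,
# as used by the exhaustive (non-random) search.
def linspace(lo, hi, n):
    # n evenly spaced values from lo to hi inclusive
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def build_grid(x_shift_limits=(-5, 5), y_shift_limits=(-5, 5),
               angular_limits=(-5, 5), granularity=10):
    return [(dx, dy, angle)
            for dx in linspace(*x_shift_limits, granularity)
            for dy in linspace(*y_shift_limits, granularity)
            for angle in linspace(*angular_limits, granularity)]

grid = build_grid()  # granularity**3 candidate transformations
```

Each candidate transform is applied to the input and the model is queried; with abort_early=True the search stops at the first misclassified transform.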

class foolbox.attacks.PointwiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Starts with an adversarial and performs a binary search between the adversarial and the original for each dimension of the input individually.
References
[R739f80a248751] L. Schott, J. Rauber, M. Bethge, W. Brendel: “Towards the first adversarially robust neural network model on MNIST”, ICLR (2019) https://arxiv.org/abs/1805.09190 
__call__(self, input_or_adv, label=None, unpack=True, starting_point=None, initialization_attack=None)

Starts with an adversarial and performs a binary search between the adversarial and the original for each dimension of the input individually.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 starting_point : numpy.ndarray
Adversarial input to use as a starting point, in particular for targeted attacks.
 initialization_attack :
Attack
Attack to use to find a starting point. Defaults to SaltAndPepperNoiseAttack.

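The per-dimension search can be sketched as follows. The decision function and starting values are hypothetical, and the real attack additionally repeats passes over the dimensions until no further improvement is found:

```python
def is_adversarial(x):
    # hypothetical decision function: the class flips where x0 + x1 = 1
    return x[0] + x[1] > 1.0

def pointwise(original, adv, steps=20):
    adv = list(adv)
    for i in range(len(adv)):
        # first try resetting this dimension to its original value entirely
        old = adv[i]
        adv[i] = original[i]
        if is_adversarial(adv):
            continue
        # otherwise binary search between the original and adversarial values
        lo, hi = original[i], old
        for _ in range(steps):
            mid = (lo + hi) / 2
            adv[i] = mid
            if is_adversarial(adv):
                hi = mid
            else:
                lo = mid
        adv[i] = hi          # hi was verified to keep the input adversarial
    return adv

# original (0.2, 0.2) is class 0; (1.0, 1.0) is an adversarial starting point
result = pointwise([0.2, 0.2], [1.0, 1.0])
```

Dimension 0 can be reset completely, while dimension 1 converges to just above the decision boundary, minimizing the number of changed input values.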

class foolbox.attacks.GaussianBlurAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Blurs the input until it is misclassified.

__call__(self, input_or_adv, label=None, unpack=True, epsilons=1000)

Blurs the input until it is misclassified.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 epsilons : int or Iterable[float]
Either Iterable of standard deviations of the Gaussian blur or number of standard deviations between 0 and 1 that should be tried.

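The strategy — sweep increasing blur strengths and stop at the first misclassification — can be sketched with a toy 1-D "peak detector" model (all names below are hypothetical; Foolbox blurs 2-D images instead):

```python
import math

def blur(signal, sigma):
    # simple 1-D Gaussian blur with edge clamping (toy stand-in for image blur)
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(i * i) / (2 * sigma * sigma))
              for i in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(i + j - radius, 0), len(signal) - 1)
            acc += k * signal[idx]
        out.append(acc)
    return out

def predicted_class(x):
    # hypothetical model: detects a sharp peak
    return 1 if max(x) > 0.5 else 0

signal = [0.0] * 10 + [1.0] + [0.0] * 10   # classified as class 1
label = predicted_class(signal)
adversarial = None
for eps in [i / 20 for i in range(1, 21)]:  # sweep blur strengths
    candidate = blur(signal, sigma=5 * eps)
    if predicted_class(candidate) != label:
        adversarial = candidate             # first blur level that fools the model
        break
```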

class foolbox.attacks.ContrastReductionAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Reduces the contrast of the input until it is misclassified.

__call__(self, input_or_adv, label=None, unpack=True, epsilons=1000)

Reduces the contrast of the input until it is misclassified.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 epsilons : int or Iterable[float]
Either Iterable of contrast levels or number of contrast levels between 1 and 0 that should be tried. Epsilons are one minus the contrast level.

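Since epsilons are one minus the contrast level, the sweep can be sketched as blending the input towards the midpoint of its value range (toy model and helper names are hypothetical):

```python
def reduce_contrast(x, eps, bounds=(0.0, 1.0)):
    # contrast level = 1 - eps; blend every value towards the range midpoint
    target = (bounds[0] + bounds[1]) / 2
    return [(1 - eps) * v + eps * target for v in x]

def predicted_class(x):
    # hypothetical model: sensitive to the value range of the input
    return 1 if max(x) - min(x) > 0.3 else 0

x = [0.1, 0.9, 0.1]       # classified as class 1
label = predicted_class(x)
found = None
for i in range(1, 1001):  # epsilons=1000: contrast levels between 1 and 0
    eps = i / 1000
    candidate = reduce_contrast(x, eps)
    if predicted_class(candidate) != label:
        found = eps        # smallest contrast reduction that fools the model
        break
```

Here the value range shrinks by the factor (1 - eps), so the prediction flips once eps reaches about 0.625.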

class foolbox.attacks.AdditiveUniformNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Adds uniform noise to the input, gradually increasing the standard deviation until the input is misclassified.

__call__(self, input_or_adv, label=None, unpack=True, epsilons=1000)

Adds uniform or Gaussian noise to the input, gradually increasing the standard deviation until the input is misclassified.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 epsilons : int or Iterable[float]
Either Iterable of noise levels or number of noise levels between 0 and 1 that should be tried.

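The noise-level sweep can be sketched as follows (the model and the [0, 1] value range are hypothetical; Foolbox derives the range from the model's bounds):

```python
import random

def predicted_class(x):
    # hypothetical model: class 1 only if every value exceeds 0.5
    return 1 if min(x) > 0.5 else 0

random.seed(0)
x = [0.6] * 8                       # classified as class 1
label = predicted_class(x)
found = None
for i in range(1, 101):             # sweep noise levels (epsilons)
    eps = i / 100
    # add uniform noise and clip back into the valid value range
    candidate = [min(max(v + random.uniform(-eps, eps), 0.0), 1.0) for v in x]
    if predicted_class(candidate) != label:
        found = candidate
        break
```

AdditiveGaussianNoiseAttack below follows the same sweep with random.gauss-style noise instead of uniform noise.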
__init__(self, model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Initialize self. See help(type(self)) for accurate signature.

__new__(*args, **kwargs)

Create and return a new object. See help(type) for accurate signature.

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

name(self)

Returns a human-readable name that uniquely identifies the attack with its hyperparameters.

Returns:
 str
Human-readable name that uniquely identifies the attack with its hyperparameters.
Notes
Defaults to the class name but subclasses can provide more descriptive names and must take hyperparameters into account.


class foolbox.attacks.AdditiveGaussianNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Adds Gaussian noise to the input, gradually increasing the standard deviation until the input is misclassified.

__call__(self, input_or_adv, label=None, unpack=True, epsilons=1000)

Adds uniform or Gaussian noise to the input, gradually increasing the standard deviation until the input is misclassified.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 epsilons : int or Iterable[float]
Either Iterable of noise levels or number of noise levels between 0 and 1 that should be tried.

__init__(self, model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Initialize self. See help(type(self)) for accurate signature.

__new__(*args, **kwargs)

Create and return a new object. See help(type) for accurate signature.

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

name(self)

Returns a human-readable name that uniquely identifies the attack with its hyperparameters.

Returns:
 str
Human-readable name that uniquely identifies the attack with its hyperparameters.
Notes
Defaults to the class name but subclasses can provide more descriptive names and must take hyperparameters into account.


class foolbox.attacks.SaltAndPepperNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Increases the amount of salt and pepper noise until the input is misclassified.

__call__(self, input_or_adv, label=None, unpack=True, epsilons=100, repetitions=10)

Increases the amount of salt and pepper noise until the input is misclassified.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 epsilons : int
Number of steps to try between probability 0 and 1.
 repetitions : int
Specifies how often the attack will be repeated.

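The probability sweep with repetitions can be sketched like this (the mean-brightness model and the [0, 1] value range are hypothetical):

```python
import random

def predicted_class(x):
    # hypothetical model based on mean brightness
    return 1 if sum(x) / len(x) < 0.35 else 0

random.seed(0)
x = [0.2] * 16                          # dark input, classified as class 1
label = predicted_class(x)
found = None
for _ in range(10):                     # repetitions
    for i in range(1, 101):             # epsilons: probability grid from 0 to 1
        p = i / 100
        candidate = []
        for v in x:
            r = random.random()
            if r < p / 2:
                candidate.append(0.0)   # pepper: minimum value
            elif r < p:
                candidate.append(1.0)   # salt: maximum value
            else:
                candidate.append(v)     # leave the value unchanged
        if predicted_class(candidate) != label:
            found = candidate
            break
    if found is not None:
        break
```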

class foolbox.attacks.BlendedUniformNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Blends the input with a uniform noise input until it is misclassified.

__call__(self, input_or_adv, label=None, unpack=True, epsilons=1000, max_directions=1000)

Blends the input with a uniform noise input until it is misclassified.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
 label : int
The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 epsilons : int or Iterable[float]
Either Iterable of blending steps or number of blending steps between 0 and 1 that should be tried.
 max_directions : int
Maximum number of random inputs to try.

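The blending sweep over several random noise inputs can be sketched as follows (toy model, hypothetical names):

```python
import random

def predicted_class(x):
    # hypothetical model based on mean brightness
    return 1 if sum(x) / len(x) < 0.35 else 0

random.seed(0)
x = [0.2] * 16                               # classified as class 1
label = predicted_class(x)
found = None
for _ in range(5):                           # max_directions: noise inputs to try
    noise = [random.random() for _ in x]     # a uniform random input
    for i in range(1, 101):                  # epsilons: blending steps
        eps = i / 100
        # linear blend between the original input and the noise input
        candidate = [(1 - eps) * v + eps * n for v, n in zip(x, noise)]
        if predicted_class(candidate) != label:
            found = candidate
            break
    if found is not None:
        break
```

This is also the default initialization attack of BoundaryAttack, since it cheaply finds some misclassified input to start the random walk from.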

class foolbox.attacks.HopSkipJumpAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

A powerful adversarial attack that requires neither gradients nor probabilities.

Notes
Features:
- ability to switch between two types of distances: MSE and Linf
- ability to continue previous attacks by passing an instance of the Adversarial class
- ability to pass an explicit starting point; especially to initialize a targeted attack
- ability to pass an alternative attack used for initialization
- ability to specify the batch size
References
HopSkipJumpAttack was originally proposed by Chen, Jordan and Wainwright. It is a decision-based attack that requires access to the output labels of a model alone. Paper link: https://arxiv.org/abs/1904.02144 The implementation in Foolbox is based on the Boundary Attack.

__call__(self, input_or_adv, label=None, unpack=True, iterations=64, initial_num_evals=100, max_num_evals=10000, stepsize_search='geometric_progression', gamma=1.0, starting_point=None, batch_size=256, internal_dtype=numpy.float64, log_every_n_steps=1, verbose=False)

Applies HopSkipJumpAttack.

Parameters:
 input_or_adv : numpy.ndarray or Adversarial
The original, correctly classified input. If it is a numpy array, label must be passed as well. If it is an Adversarial instance, label must not be passed.
 label : int
The reference label of the original input. Must be passed if input is a numpy array, must not be passed if input is an Adversarial instance.
 unpack : bool
If true, returns the adversarial input, otherwise returns the Adversarial object.
 iterations : int
Number of iterations to run.
 initial_num_evals: int
Initial number of evaluations for gradient estimation. Larger initial_num_evals increases time efficiency, but may decrease query efficiency.
 max_num_evals: int
Maximum number of evaluations for gradient estimation.
 stepsize_search : str
How to search for the stepsize; choices are 'geometric_progression' and 'grid_search'. 'geometric_progression' initializes the stepsize as ‖x_t − x_p‖ / sqrt(iteration) and keeps halving it until reaching the target side of the boundary. 'grid_search' chooses the optimal epsilon over a grid, on the scale of ‖x_t − x_p‖.
 gamma : float
The binary search threshold theta is gamma / d^1.5 for the l2 attack and gamma / d^2 for the linf attack.
 starting_point : numpy.ndarray
Adversarial input to use as a starting point, required for targeted attacks.
 batch_size : int
Batch size for model prediction.
 internal_dtype : np.float32 or np.float64
Higher precision might be slower but is numerically more stable.
 log_every_n_steps : int
Determines the verbosity of the logging.
 verbose : bool
Controls verbosity of the attack.

approximate_gradient(self, decision_function, sample, num_evals, delta)

Gradient direction estimation.

binary_search_batch(self, unperturbed, perturbed_inputs, decision_function)

Binary search to approach the boundary.

geometric_progression_for_stepsize(self, x, update, dist, decision_function, current_iteration)

Geometric progression to search for the stepsize. Keep halving the stepsize until reaching the desired side of the boundary.
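The geometric-progression rule can be sketched with a toy decision function (all names hypothetical; in the real attack, update is an estimated gradient direction and dist the current distance to the original input):

```python
import math

def is_adversarial(x):
    # hypothetical decision function with the boundary at x0 + x1 = 1
    return x[0] + x[1] > 1.0

def geometric_progression_stepsize(x, update, dist, current_iteration):
    # initialize the stepsize as dist / sqrt(iteration) ...
    epsilon = dist / math.sqrt(current_iteration)
    # ... and keep halving it until the stepped point is adversarial
    while not is_adversarial([xi + epsilon * ui for xi, ui in zip(x, update)]):
        epsilon /= 2.0
    return epsilon

x = [0.6, 0.6]        # current adversarial point (0.6 + 0.6 > 1)
update = [-1.0, 0.0]  # direction that moves towards the decision boundary
eps = geometric_progression_stepsize(x, update, dist=0.5, current_iteration=4)
```

Here the initial stepsize 0.25 overshoots onto the non-adversarial side, so one halving to 0.125 is needed before the step is accepted.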