Decision-based attacks

class foolbox.attacks.BoundaryAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

A powerful adversarial attack that requires neither gradients nor probabilities.

This is the reference implementation for the attack introduced in [Re72ca268aa55-1].

Notes

This implementation provides several advanced features:

ability to continue previous attacks by passing an instance of the Adversarial class
ability to pass an explicit starting point; especially to initialize a targeted attack
ability to pass an alternative attack used for initialization
fine-grained control over logging
ability to specify the batch size
optional automatic batch size tuning
optional multithreading for random number generation
optional multithreading for candidate point generation

References

[Re72ca268aa55-1]

Wieland Brendel (*), Jonas Rauber (*), Matthias Bethge, “Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models”, https://arxiv.org/abs/1712.04248

as_generator(self, a, iterations=5000, max_directions=25, starting_point=None, initialization_attack=None, log_every_n_steps=None, spherical_step=0.01, source_step=0.01, step_adaptation=1.5, batch_size=1, tune_batch_size=True, threaded_rnd=True, threaded_gen=True, alternative_generator=False, internal_dtype=numpy.float64, loggingLevel=30)

Applies the Boundary Attack.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, correctly classified input. If it is a numpy array, label must be passed as well. If it is an Adversarial instance, label must not be passed.
label : int: The reference label of the original input. Must be passed if input is a numpy array, must not be passed if input is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
iterations : int: Maximum number of iterations to run. Might converge and stop before that.
max_directions : int: Maximum number of trials per iteration.
starting_point : numpy.ndarray: Adversarial input to use as a starting point, in particular for targeted attacks.
initialization_attack : Attack: Attack to use to find a starting point. Defaults to BlendedUniformNoiseAttack.
log_every_n_steps : int: Determines verbosity of the logging.
spherical_step : float: Initial step size for the orthogonal (spherical) step.
source_step : float: Initial step size for the step towards the target.
step_adaptation : float: Factor by which the step sizes are multiplied or divided.
batch_size : int: Batch size or initial batch size if tune_batch_size is True
tune_batch_size : bool: Whether or not the batch size should be automatically chosen between 1 and max_directions.
threaded_rnd : bool: Whether the random number generation should be multithreaded.
threaded_gen : bool: Whether the candidate point generation should be multithreaded.
alternative_generator : bool: Whether an alternative implementation of the candidate generator should be used.
internal_dtype : np.float32 or np.float64: Higher precision might be slower but is numerically more stable.
loggingLevel : int: Controls the verbosity of the logging, e.g. logging.INFO or logging.WARNING.
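
The roles of spherical_step, source_step and step_adaptation can be illustrated with a minimal, pure-Python sketch of one style of boundary walk. This is not the Foolbox implementation: the function name, the toy decision rule and the simple grow/shrink adaptation rule are assumptions made for illustration, and batching, threading and step statistics are left out.

```python
import math
import random


def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def boundary_attack_sketch(original, start, is_adversarial, iterations=200,
                           spherical_step=0.01, source_step=0.01,
                           step_adaptation=1.5, seed=0):
    """Walk along the decision boundary towards `original` (needs >= 2 dims)."""
    rng = random.Random(seed)
    current = list(start)  # must already be adversarial
    for _ in range(iterations):
        to_source = [o - c for o, c in zip(original, current)]
        dist = norm(to_source)
        if dist == 0.0:
            break
        # 1) Random direction, made orthogonal to the direction of the source.
        eta = [rng.gauss(0.0, 1.0) for _ in current]
        dot = sum(e * d for e, d in zip(eta, to_source)) / (dist * dist)
        eta = [e - dot * d for e, d in zip(eta, to_source)]
        scale = spherical_step * dist / norm(eta)
        candidate = [c + scale * e for c, e in zip(current, eta)]
        # 2) Project back onto the sphere of radius `dist` around the original.
        diff = [c - o for c, o in zip(candidate, original)]
        ratio = dist / norm(diff)
        candidate = [o + ratio * d for o, d in zip(original, diff)]
        # 3) Small step towards the original input.
        candidate = [c + source_step * (o - c)
                     for c, o in zip(candidate, original)]
        # Assumed adaptation rule: grow the step after a success,
        # shrink it after a failure.
        if is_adversarial(candidate):
            current = candidate
            source_step = min(source_step * step_adaptation, 0.5)
        else:
            source_step /= step_adaptation
    return current
```

Because a candidate is accepted only when the decision function still reports it as adversarial, the returned point is adversarial by construction, and its distance to the original never increases.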

class foolbox.attacks.SpatialAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Adversarially chosen rotations and translations [Rdffd25498f9d-1].

This implementation is based on the reference implementation by Madry et al.: https://github.com/MadryLab/adversarial_spatial

References

[Rdffd25498f9d-1]

Logan Engstrom*, Brandon Tran*, Dimitris Tsipras*, Ludwig Schmidt, Aleksander Mądry: “A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations”, http://arxiv.org/abs/1712.02779

as_generator(self, a, do_rotations=True, do_translations=True, x_shift_limits=(-5, 5), y_shift_limits=(-5, 5), angular_limits=(-5, 5), granularity=10, random_sampling=False, abort_early=True)

Adversarially chosen rotations and translations.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
do_rotations : bool: If False no rotations will be applied to the image.
do_translations : bool: If False no translations will be applied to the image.
x_shift_limits : int or (int, int): Limits for horizontal translations in pixels. If one integer is provided the limits will be (-x_shift_limits, x_shift_limits).
y_shift_limits : int or (int, int): Limits for vertical translations in pixels. If one integer is provided the limits will be (-y_shift_limits, y_shift_limits).
angular_limits : int or (int, int): Limits for rotations in degrees. If one integer is provided the limits will be [-angular_limits, angular_limits].
granularity : int: Density of sampling within limits for each dimension.
random_sampling : bool: If True we sample translations/rotations randomly within limits, otherwise we use a regular grid.
abort_early : bool: If True, the attack stops as soon as it finds an adversarial.
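
How the limits, granularity and random_sampling interact can be sketched as a generator of candidate (x shift, y shift, angle) triples. The helper names are hypothetical, and the choice to draw exactly granularity random triples in the random case is an assumption of this sketch, not a statement about the library.

```python
import itertools
import random


def as_limits(limits):
    """Expand a single integer l into the symmetric pair (-l, l)."""
    if isinstance(limits, int):
        return (-limits, limits)
    return tuple(limits)


def linspace(low, high, num):
    """Return num evenly spaced values from low to high (both included)."""
    if num == 1:
        return [float(low)]
    return [low + (high - low) * i / (num - 1) for i in range(num)]


def candidate_transformations(x_shift_limits=(-5, 5), y_shift_limits=(-5, 5),
                              angular_limits=(-5, 5), granularity=10,
                              random_sampling=False, seed=0):
    """List the (x_shift, y_shift, angle) triples an attack could try."""
    limits = [as_limits(l)
              for l in (x_shift_limits, y_shift_limits, angular_limits)]
    if random_sampling:
        # Assumption: one uniformly random triple per unit of granularity.
        rng = random.Random(seed)
        return [tuple(rng.uniform(lo, hi) for lo, hi in limits)
                for _ in range(granularity)]
    # Regular grid: every combination of the per-dimension sample points.
    axes = [linspace(lo, hi, granularity) for lo, hi in limits]
    return list(itertools.product(*axes))
```

With the default arguments the regular grid contains 10 values per dimension, so 1000 candidate transformations in total.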

class foolbox.attacks.PointwiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Starts with an adversarial and performs a binary search between the adversarial and the original for each dimension of the input individually.

References

[R739f80a24875-1]

L. Schott, J. Rauber, M. Bethge, W. Brendel: “Towards the first adversarially robust neural network model on MNIST”, ICLR (2019) https://arxiv.org/abs/1805.09190

as_generator(self, a, starting_point=None, initialization_attack=None)

Starts with an adversarial and performs a binary search between the adversarial and the original for each dimension of the input individually.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
starting_point : numpy.ndarray: Adversarial input to use as a starting point, in particular for targeted attacks.
initialization_attack : Attack: Attack to use to find a starting point. Defaults to SaltAndPepperNoiseAttack.
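
The per-dimension binary search described above can be sketched as follows. This is a simplified illustration under stated assumptions: dimensions are visited in index order and only once, the number of search steps is a made-up parameter, and the function name is hypothetical.

```python
def pointwise_search_sketch(original, adversarial, is_adversarial, steps=10):
    """Move each coordinate of `adversarial` as close to `original` as possible."""
    x = list(adversarial)
    for i in range(len(x)):
        if x[i] == original[i]:
            continue
        trial = list(x)
        # Cheapest option first: reset the coordinate to its original value.
        trial[i] = original[i]
        if is_adversarial(trial):
            x = trial
            continue
        # Otherwise bisect between the adversarial and the original value.
        good = x[i]        # value known to keep the input adversarial
        bad = original[i]  # value known to break the adversarial property
        for _ in range(steps):
            mid = (good + bad) / 2
            trial[i] = mid
            if is_adversarial(trial):
                good = mid
            else:
                bad = mid
        x[i] = good
    return x
```

Only values confirmed by the decision function are ever written back, so the result stays adversarial while each coordinate moves towards the original.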

class foolbox.attacks.GaussianBlurAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Blurs the input until it is misclassified.

as_generator(self, a, epsilons=1000)

Blurs the input until it is misclassified.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if input is a numpy.ndarray, must not be passed if input is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
epsilons : int or Iterable[float]: Either Iterable of standard deviations of the Gaussian blur or number of standard deviations between 0 and 1 that should be tried.

class foolbox.attacks.ContrastReductionAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Reduces the contrast of the input until it is misclassified.

as_generator(self, a, epsilons=1000)

Reduces the contrast of the input until it is misclassified.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
epsilons : int or Iterable[float]: Either Iterable of contrast levels or number of contrast levels between 1 and 0 that should be tried. Epsilons are one minus the contrast level.
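
Since epsilon is one minus the contrast level, the search can be sketched as a blend between the input and a uniform target value. The sketch assumes the target is the midpoint of the model bounds and that an integer epsilons is expanded into evenly spaced values from 0 to 1; both are assumptions of this example, as is the function name.

```python
def contrast_reduction_sketch(x, is_adversarial, epsilons=1000,
                              bounds=(0.0, 1.0)):
    """Return (epsilon, perturbed) for the first epsilon that fools the model."""
    if isinstance(epsilons, int):
        # Assumed expansion: epsilons + 1 evenly spaced values in [0, 1].
        epsilons = [i / epsilons for i in range(epsilons + 1)]
    target = (bounds[0] + bounds[1]) / 2  # assumed zero-contrast value
    for eps in epsilons:
        # Contrast level is 1 - eps: eps = 0 keeps x, eps = 1 is flat grey.
        perturbed = [(1 - eps) * v + eps * target for v in x]
        if is_adversarial(perturbed):
            return eps, perturbed
    return None
```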

class foolbox.attacks.AdditiveUniformNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Adds uniform noise to the input, gradually increasing the standard deviation until the input is misclassified.

__call__(self, inputs, labels, unpack=True, individual_kwargs=None, **kwargs): Call self as a function.

as_generator(self, a, epsilons=1000)

Adds uniform or Gaussian noise to the input, gradually increasing the standard deviation until the input is misclassified.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
epsilons : int or Iterable[float]: Either Iterable of noise levels or number of noise levels between 0 and 1 that should be tried.
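
A minimal sketch of the noise search, assuming that the noise level epsilon scales the half-width of the uniform distribution relative to the model bounds and that an integer epsilons expands to evenly spaced levels in (0, 1]. The function name and these scaling details are assumptions for illustration.

```python
import random


def additive_uniform_noise_sketch(x, is_adversarial, epsilons=1000,
                                  bounds=(0.0, 1.0), seed=0):
    """Return (epsilon, perturbed) for the first noise level that succeeds."""
    rng = random.Random(seed)
    lo, hi = bounds
    if isinstance(epsilons, int):
        # Assumed expansion: 1/epsilons, 2/epsilons, ..., 1.
        epsilons = [(i + 1) / epsilons for i in range(epsilons)]
    for eps in epsilons:
        width = eps * (hi - lo)
        # Add independent uniform noise per element, then clip to the bounds.
        perturbed = [min(hi, max(lo, v + rng.uniform(-width, width)))
                     for v in x]
        if is_adversarial(perturbed):
            return eps, perturbed
    return None
```

Each noise level is tried with a fresh random draw, so a given level can fail on one draw and succeed on another; the result is a single-sample estimate rather than a minimal perturbation.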

name(self)

Returns a human readable name that uniquely identifies the attack with its hyperparameters.

Returns:	str Human readable name that uniquely identifies the attack with its hyperparameters.

Notes

Defaults to the class name but subclasses can provide more descriptive names and must take hyperparameters into account.

class foolbox.attacks.AdditiveGaussianNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Adds Gaussian noise to the input, gradually increasing the standard deviation until the input is misclassified.

__call__(self, inputs, labels, unpack=True, individual_kwargs=None, **kwargs): Call self as a function.

as_generator(self, a, epsilons=1000)

Adds uniform or Gaussian noise to the input, gradually increasing the standard deviation until the input is misclassified.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
epsilons : int or Iterable[float]: Either Iterable of noise levels or number of noise levels between 0 and 1 that should be tried.

name(self)

Returns a human readable name that uniquely identifies the attack with its hyperparameters.

Returns:	str Human readable name that uniquely identifies the attack with its hyperparameters.

Notes

Defaults to the class name but subclasses can provide more descriptive names and must take hyperparameters into account.

class foolbox.attacks.SaltAndPepperNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Increases the amount of salt and pepper noise until the input is misclassified.

as_generator(self, a, epsilons=100, repetitions=10)

Increases the amount of salt and pepper noise until the input is misclassified.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
epsilons : int: Number of steps to try between probability 0 and 1.
repetitions : int: Specifies how often the attack will be repeated.
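
The interplay of epsilons (probability steps) and repetitions can be sketched as below. The sketch assumes that a probability p sets roughly p/2 of the elements to the lower bound and p/2 to the upper bound, and that the best result over all repetitions is the one with the fewest changed elements. Names and selection rule are assumptions of this example.

```python
import random


def salt_and_pepper(x, p, bounds=(0.0, 1.0), rng=random):
    """Set each element to min (pepper) or max (salt) with total probability p."""
    lo, hi = bounds
    out = []
    for v in x:
        u = rng.random()
        if u < p / 2:
            out.append(lo)   # pepper
        elif u >= 1 - p / 2:
            out.append(hi)   # salt
        else:
            out.append(v)    # unchanged
    return out


def salt_and_pepper_attack_sketch(x, is_adversarial, epsilons=100,
                                  repetitions=10, bounds=(0.0, 1.0), seed=0):
    """Return (changed_count, perturbed) of the best attempt, or None."""
    rng = random.Random(seed)
    best = None
    for _ in range(repetitions):
        for i in range(1, epsilons + 1):
            candidate = salt_and_pepper(x, i / epsilons, bounds, rng)
            if is_adversarial(candidate):
                changed = sum(1 for a, b in zip(candidate, x) if a != b)
                if best is None or changed < best[0]:
                    best = (changed, candidate)
                break  # this repetition succeeded; start the next one
    return best
```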

class foolbox.attacks.BlendedUniformNoiseAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

Blends the input with a uniform noise input until it is misclassified.

as_generator(self, a, epsilons=1000, max_directions=1000)

Blends the input with a uniform noise input until it is misclassified.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, unperturbed input as a numpy.ndarray or an Adversarial instance.
label : int: The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
epsilons : int or Iterable[float]: Either Iterable of blending steps or number of blending steps between 0 and 1 that should be tried.
max_directions : int: Maximum number of random inputs to try.
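
A sketch of the two phases implied by these parameters: first draw up to max_directions random inputs until one is itself adversarial, then blend the original towards it in increasing steps. The function name and the expansion of an integer epsilons into evenly spaced values in (0, 1] are assumptions of this example.

```python
import random


def blended_uniform_noise_sketch(x, is_adversarial, epsilons=1000,
                                 max_directions=1000, bounds=(0.0, 1.0),
                                 seed=0):
    """Return (epsilon, blended) for the smallest tried blend that succeeds."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Phase 1: look for a uniform-noise input that is already adversarial.
    noise = None
    for _ in range(max_directions):
        candidate = [rng.uniform(lo, hi) for _ in x]
        if is_adversarial(candidate):
            noise = candidate
            break
    if noise is None:
        return None
    # Phase 2: blend the original with that noise input in growing steps.
    if isinstance(epsilons, int):
        epsilons = [(i + 1) / epsilons for i in range(epsilons)]
    for eps in epsilons:
        blended = [(1 - eps) * v + eps * n for v, n in zip(x, noise)]
        if is_adversarial(blended):
            return eps, blended
    return None
```

The blend with epsilon equal to 1 is the noise input itself, so once phase 1 succeeds the sweep up to 1 always finds an adversarial.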

class foolbox.attacks.HopSkipJumpAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

A powerful adversarial attack that requires neither gradients nor probabilities.

Notes

Features:

ability to switch between two types of distances: MSE and Linf
ability to continue previous attacks by passing an instance of the Adversarial class
ability to pass an explicit starting point; especially to initialize a targeted attack
ability to pass an alternative attack used for initialization
ability to specify the batch size

References

HopSkipJumpAttack was originally proposed by Chen, Jordan and Wainwright. It is a decision-based attack that requires access only to the output labels of a model. Paper: https://arxiv.org/abs/1904.02144. The implementation in Foolbox is based on Boundary Attack.

approximate_gradient(self, decision_function, sample, num_evals, delta): Gradient direction estimation

as_generator(self, a, iterations=64, initial_num_evals=100, max_num_evals=10000, stepsize_search='geometric_progression', gamma=1.0, starting_point=None, batch_size=256, internal_dtype=numpy.float64, log_every_n_steps=None, loggingLevel=30)

Applies HopSkipJumpAttack.

Parameters:

input_or_adv : numpy.ndarray or Adversarial: The original, correctly classified input. If it is a numpy array, label must be passed as well. If it is an Adversarial instance, label must not be passed.
label : int: The reference label of the original input. Must be passed if input is a numpy array, must not be passed if input is an Adversarial instance.
unpack : bool: If true, returns the adversarial input, otherwise returns the Adversarial object.
iterations : int: Number of iterations to run.
initial_num_evals : int: Initial number of evaluations for gradient estimation. Larger initial_num_evals increases time efficiency, but may decrease query efficiency.
max_num_evals : int: Maximum number of evaluations for gradient estimation.
stepsize_search : str: How to search for the step size; choices are 'geometric_progression' and 'grid_search'. 'geometric_progression' initializes the step size to ||x_t - x||_p / sqrt(iteration) and keeps halving it until the target side of the boundary is reached. 'grid_search' chooses the optimal epsilon over a grid, in the scale of ||x_t - x||_p.
gamma : float: The binary search threshold theta is gamma / d^1.5 for the l2 attack and gamma / d^2 for the linf attack, where d is the input dimension.
starting_point : numpy.ndarray: Adversarial input to use as a starting point, required for targeted attacks.
batch_size : int: Batch size for model prediction.
internal_dtype : np.float32 or np.float64: Higher precision might be slower but is numerically more stable.
log_every_n_steps : int: Determines verbosity of the logging.
loggingLevel : int: Controls the verbosity of the logging, e.g. logging.INFO or logging.WARNING.
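
The 'geometric_progression' rule described for stepsize_search can be sketched in a few lines. The function name and the max_halvings guard (which returns 0.0 if no step size works) are assumptions added to keep the example safe to run; plain lists stand in for arrays.

```python
import math


def stepsize_by_geometric_progression(x, update, dist, decision_function,
                                      current_iteration, max_halvings=50):
    """Halve the step size until x + epsilon * update is adversarial."""
    # Initial step: distance to the original, damped by sqrt(iteration).
    epsilon = dist / math.sqrt(current_iteration)
    for _ in range(max_halvings):
        candidate = [xi + epsilon * ui for xi, ui in zip(x, update)]
        if decision_function(candidate):
            return epsilon
        epsilon /= 2.0
    # Hypothetical fallback: no usable step found within the guard.
    return 0.0
```

For example, with x = [1.0, 0.0], update = [-1.0, 0.0], dist = 1.0 and a decision function that accepts points whose first coordinate exceeds 0.5, the steps 1.0 and 0.5 are rejected and 0.25 is returned.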

attack(self, a, iterations)

iterations : int: Maximum number of iterations to run.

binary_search_batch(self, unperturbed, perturbed_inputs, decision_function): Binary search to approach the boundary.
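
A minimal single-input sketch of such a binary search, using the threshold theta from the gamma parameter (gamma / d^1.5 for l2, gamma / d^2 for linf). The function names are hypothetical, the search uses a linear blend between the two inputs (the l2-style projection), and the stopping rule "interval no wider than theta" is a simplifying assumption.

```python
import math


def search_threshold(d, gamma=1.0, norm="l2"):
    """Binary search threshold theta for an input with d dimensions."""
    if norm == "l2":
        return gamma / (d * math.sqrt(d))  # gamma / d^1.5
    return gamma / d ** 2                  # linf case


def binary_search_to_boundary(unperturbed, perturbed, decision_function,
                              gamma=1.0):
    """Blend `perturbed` towards `unperturbed` while staying adversarial."""
    theta = search_threshold(len(unperturbed), gamma, "l2")

    def blend(alpha):
        # alpha = 0 gives the unperturbed input, alpha = 1 the perturbed one.
        return [(1 - alpha) * u + alpha * p
                for u, p in zip(unperturbed, perturbed)]

    low, high = 0.0, 1.0  # `high` is always on the adversarial side
    while high - low > theta:
        mid = (low + high) / 2
        if decision_function(blend(mid)):
            high = mid
        else:
            low = mid
    return blend(high)
```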

geometric_progression_for_stepsize(self, x, update, dist, decision_function, current_iteration): Geometric progression to search for stepsize. Keep decreasing stepsize by half until reaching the desired side of the boundary.

project(self, unperturbed, perturbed_inputs, alphas): Projection onto given l2 / linf balls in a batch.

select_delta(self, dist_post_update, current_iteration): Choose the delta at the scale of distance between x and perturbed sample.

class foolbox.attacks.GenAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)

The GenAttack introduced in [R996613153a1e-1].

This attack performs a genetic search in order to find an adversarial perturbation in a black-box scenario in as few queries as possible.

References

[R996613153a1e-1]

Moustafa Alzantot, Yash Sharma, Supriyo Chakraborty, Huan Zhang, Cho-Jui Hsieh, Mani Srivastava, “GenAttack: Practical Black-box Attacks with Gradient-Free Optimization”, https://arxiv.org/abs/1805.11090

as_generator(self, a, generations=10, alpha=1.0, p=0.05, N=10, tau=0.1, search_shape=None, epsilon=0.3, binary_search=20)

A black-box attack based on genetic algorithms. Can either try to find an adversarial perturbation for a fixed epsilon distance or perform a binary search over epsilon values in order to find a minimal perturbation.

Parameters:

inputs : numpy.ndarray: Batch of inputs with shape as expected by the underlying model.
labels : numpy.ndarray: Class labels of the inputs as a vector of integers in [0, number of classes).
unpack : bool: If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.
generations : int: Number of generations, i.e. iterations, in the genetic algorithm.
alpha : float: Mutation-range.
p : float: Mutation probability.
N : int: Population size of the genetic algorithm.
tau: float: Temperature for the softmax sampling used to determine the parents of the new crossover.
search_shape : tuple (default: None): Set this to a smaller image shape than the true shape to search in a smaller input space. The input will be scaled using a linear interpolation to match the required input shape of the model.
binary_search : bool or int: Whether to perform a binary search over epsilon, using the given value to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).
epsilon : float: Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.
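
To make the roles of N, tau, p, alpha and epsilon concrete, here is a sketch of one generation of a genetic search over perturbations. It is an illustration under stated assumptions, not the Foolbox code: individuals are flat lists of perturbation values, the fittest individual is carried over unchanged, and crossover picks each gene from the first parent with a probability given by the parents' relative selection weights.

```python
import math
import random


def softmax(values, tau):
    """Softmax with temperature tau (shifted by the max for stability)."""
    top = max(values)
    exps = [math.exp((v - top) / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def next_generation(population, fitness, alpha=1.0, p=0.05, tau=0.1,
                    epsilon=0.3, rng=None):
    """Build a new population of the same size N from the current one."""
    rng = rng or random.Random(0)
    scores = [fitness(ind) for ind in population]
    probs = softmax(scores, tau)
    # Assumed elitism: the fittest individual survives unchanged.
    children = [list(population[scores.index(max(scores))])]
    while len(children) < len(population):
        # Parents are sampled according to the softmax over fitness.
        i, j = rng.choices(range(len(population)), weights=probs, k=2)
        share = probs[i] / (probs[i] + probs[j])
        child = [a if rng.random() < share else b
                 for a, b in zip(population[i], population[j])]
        # Mutation: with probability p, shift a gene by up to alpha * epsilon.
        child = [g + rng.uniform(-alpha * epsilon, alpha * epsilon)
                 if rng.random() < p else g for g in child]
        # Keep the perturbation inside the epsilon limit.
        children.append([max(-epsilon, min(epsilon, g)) for g in child])
    return children
```

A lower tau concentrates parent selection on the fittest individuals, while a higher tau spreads it more evenly across the population.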