Gradient-based attacks

class foolbox.attacks.GradientAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Perturbs the input with the gradient of the loss w.r.t. the input, gradually increasing the magnitude until the input is misclassified.

Does not do anything if the model does not have a gradient.

as_generator(self, a, epsilons=1000, max_epsilon=1)[source]

Perturbs the input with the gradient of the loss w.r.t. the input, gradually increasing the magnitude until the input is misclassified.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

epsilons : int or Iterable[float]

Either Iterable of step sizes in the gradient direction or number of step sizes between 0 and max_epsilon that should be tried.

max_epsilon : float

Largest step size if epsilons is not an iterable.
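
A minimal usage sketch under assumptions: net is a hypothetical PyTorch classifier with ten classes and inputs in [0, 1], and images and labels are placeholder arrays matching the batched inputs/labels interface documented above; the bounds, num_classes and epsilon values are illustrative.

import foolbox

# wrap the (hypothetical) classifier; bounds and num_classes are assumptions
fmodel = foolbox.models.PyTorchModel(net, bounds=(0, 1), num_classes=10)

attack = foolbox.attacks.GradientAttack(fmodel)

# try 20 step sizes between 0 and max_epsilon along the gradient direction
adversarials = attack(images, labels, epsilons=20, max_epsilon=0.3)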

class foolbox.attacks.GradientSignAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Adds the sign of the gradient to the input, gradually increasing the magnitude until the input is misclassified. This attack is often referred to as Fast Gradient Sign Method and was introduced in [R20d0064ee4c9-1].

Does not do anything if the model does not have a gradient.

References

[R20d0064ee4c9-1]Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy, “Explaining and Harnessing Adversarial Examples”, https://arxiv.org/abs/1412.6572
as_generator(self, a, epsilons=1000, max_epsilon=1)[source]

Adds the sign of the gradient to the input, gradually increasing the magnitude until the input is misclassified.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

epsilons : int or Iterable[float]

Either Iterable of step sizes in the direction of the sign of the gradient or number of step sizes between 0 and max_epsilon that should be tried.

max_epsilon : float

Largest step size if epsilons is not an iterable.

foolbox.attacks.FGSM[source]

alias of foolbox.attacks.gradient.GradientSignAttack
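
A hedged sketch of the FGSM alias in use, reusing the fmodel, images and labels placeholders from the GradientAttack example above:

import foolbox

attack = foolbox.attacks.FGSM(fmodel)

# default epsilon schedule (1000 candidate step sizes up to max_epsilon=1)
adversarials = attack(images, labels)

# unpack=False returns Adversarial objects instead of a plain array
advs = attack(images, labels, unpack=False)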

class foolbox.attacks.LinfinityBasicIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Basic Iterative Method introduced in [R37dbc8f24aee-1].

This attack is also known as Projected Gradient Descent (PGD) without random start, or FGSM^k.

References

[R37dbc8f24aee-1]Alexey Kurakin, Ian Goodfellow, Samy Bengio, “Adversarial examples in the physical world”, https://arxiv.org/abs/1607.02533

as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.05, iterations=10, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

foolbox.attacks.BasicIterativeMethod[source]

alias of foolbox.attacks.iterative_projected_gradient.LinfinityBasicIterativeAttack

foolbox.attacks.BIM[source]

alias of foolbox.attacks.iterative_projected_gradient.LinfinityBasicIterativeAttack
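
A sketch with fixed hyperparameters (no binary search), reusing the placeholders from the examples above; choosing the Linfinity distance instead of the default MeanSquaredDistance is an illustrative assumption:

import foolbox

attack = foolbox.attacks.BIM(fmodel, distance=foolbox.distances.Linfinity)

adversarials = attack(images, labels,
                      binary_search=False,
                      epsilon=0.3, stepsize=0.05, iterations=10)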

class foolbox.attacks.L1BasicIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Modified version of the Basic Iterative Method that minimizes the L1 distance.

as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.05, iterations=10, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

class foolbox.attacks.L2BasicIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Modified version of the Basic Iterative Method that minimizes the L2 distance.

as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.05, iterations=10, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

class foolbox.attacks.ProjectedGradientDescentAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Projected Gradient Descent Attack introduced in [R367e8e10528a-1] without random start.

When used without a random start, this attack is also known as Basic Iterative Method (BIM) or FGSM^k.

References

[R367e8e10528a-1]Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks”, https://arxiv.org/abs/1706.06083
as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.01, iterations=40, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

foolbox.attacks.ProjectedGradientDescent[source]

alias of foolbox.attacks.iterative_projected_gradient.ProjectedGradientDescentAttack

foolbox.attacks.PGD[source]

alias of foolbox.attacks.iterative_projected_gradient.ProjectedGradientDescentAttack
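
A sketch of the PGD alias using the defaults from the signature above, again assuming the fmodel, images and labels placeholders; unpack=False is used here to inspect the individual Adversarial objects:

import foolbox

attack = foolbox.attacks.PGD(fmodel, distance=foolbox.distances.Linfinity)

advs = attack(images, labels, binary_search=False,
              epsilon=0.3, stepsize=0.01, iterations=40,
              unpack=False)

for adv in advs:
    # adv.perturbed is the adversarial input (None if the attack failed)
    print(adv.distance)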

class foolbox.attacks.RandomStartProjectedGradientDescentAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Projected Gradient Descent Attack introduced in [Re6066bc39e14-1] with random start.

References

[Re6066bc39e14-1]Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks”, https://arxiv.org/abs/1706.06083
as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.01, iterations=40, random_start=True, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

foolbox.attacks.RandomProjectedGradientDescent[source]

alias of foolbox.attacks.iterative_projected_gradient.RandomStartProjectedGradientDescentAttack

foolbox.attacks.RandomPGD[source]

alias of foolbox.attacks.iterative_projected_gradient.RandomStartProjectedGradientDescentAttack

class foolbox.attacks.AdamL1BasicIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Modified version of the Basic Iterative Method that minimizes the L1 distance using the Adam optimizer.

as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.05, iterations=10, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

class foolbox.attacks.AdamL2BasicIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Modified version of the Basic Iterative Method that minimizes the L2 distance using the Adam optimizer.

as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.05, iterations=10, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

class foolbox.attacks.AdamProjectedGradientDescentAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Projected Gradient Descent Attack introduced in [Re2d4f39a0205-1], [Re2d4f39a0205-2], without random start and using the Adam optimizer.

When used without a random start, this attack is also known as Basic Iterative Method (BIM) or FGSM^k.

References

[Re2d4f39a0205-1]Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks”, https://arxiv.org/abs/1706.06083
[Re2d4f39a0205-2]Nicholas Carlini, David Wagner: “Towards Evaluating the Robustness of Neural Networks”, https://arxiv.org/abs/1608.04644
as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.01, iterations=40, random_start=False, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

foolbox.attacks.AdamProjectedGradientDescent[source]

alias of foolbox.attacks.iterative_projected_gradient.AdamProjectedGradientDescentAttack

foolbox.attacks.AdamPGD[source]

alias of foolbox.attacks.iterative_projected_gradient.AdamProjectedGradientDescentAttack

class foolbox.attacks.AdamRandomStartProjectedGradientDescentAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Projected Gradient Descent Attack introduced in [R3210aa339085-1], [R3210aa339085-2], with random start and using the Adam optimizer.

References

[R3210aa339085-1]Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks”, https://arxiv.org/abs/1706.06083
[R3210aa339085-2]Nicholas Carlini, David Wagner: “Towards Evaluating the Robustness of Neural Networks”, https://arxiv.org/abs/1608.04644
as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.01, iterations=40, random_start=True, return_early=True)[source]

Simple iterative gradient-based attack known as Basic Iterative Method, Projected Gradient Descent or FGSM^k.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

foolbox.attacks.AdamRandomProjectedGradientDescent[source]

alias of foolbox.attacks.iterative_projected_gradient.AdamRandomStartProjectedGradientDescentAttack

foolbox.attacks.AdamRandomPGD[source]

alias of foolbox.attacks.iterative_projected_gradient.AdamRandomStartProjectedGradientDescentAttack

class foolbox.attacks.MomentumIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Momentum Iterative Method attack introduced in [R86d363e1fb2f-1]. It is similar to the Basic Iterative Method and Projected Gradient Descent, but adds a momentum term to the gradient updates.

References

[R86d363e1fb2f-1]Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li, “Boosting Adversarial Attacks with Momentum”, https://arxiv.org/abs/1710.06081
as_generator(self, a, binary_search=True, epsilon=0.3, stepsize=0.06, iterations=10, decay_factor=1.0, random_start=False, return_early=True)[source]

Momentum-based iterative gradient attack known as Momentum Iterative Method.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

decay_factor : float

Decay factor used by the momentum term.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.

foolbox.attacks.MomentumIterativeMethod[source]

alias of foolbox.attacks.iterative_projected_gradient.MomentumIterativeAttack
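
A sketch with the documented defaults, assuming the same placeholders as in the earlier examples:

import foolbox

attack = foolbox.attacks.MomentumIterativeAttack(
    fmodel, distance=foolbox.distances.Linfinity)

adversarials = attack(images, labels, binary_search=False,
                      epsilon=0.3, stepsize=0.06, iterations=10,
                      decay_factor=1.0)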

class foolbox.attacks.DeepFoolAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Simple and close to optimal gradient-based adversarial attack.

Implements DeepFool, introduced in [Rb4dd02640756-1].

References

[Rb4dd02640756-1]Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Pascal Frossard, “DeepFool: a simple and accurate method to fool deep neural networks”, https://arxiv.org/abs/1511.04599
as_generator(self, a, steps=100, subsample=10, p=None)[source]

Simple and close to optimal gradient-based adversarial attack.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

steps : int

Maximum number of steps to perform.

subsample : int

Limit on the number of the most likely classes that should be considered. A small value is usually sufficient and much faster.

p : int or float

Lp-norm that should be minimized; must be 2 or np.inf.
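
A sketch assuming the placeholders from the earlier examples and the batched inputs/labels call; restricting the search to the ten most likely classes and minimizing the L2 norm mirrors the defaults:

import foolbox

attack = foolbox.attacks.DeepFoolAttack(fmodel)

adversarials = attack(images, labels, steps=100, subsample=10, p=2)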

class foolbox.attacks.NewtonFoolAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Implements the NewtonFool Attack.

The attack was introduced in [R6a972939b320-1].

References

[R6a972939b320-1]Uyeong Jang et al., “Objective Metrics and Gradient Descent Algorithms for Adversarial Examples in Machine Learning”, https://dl.acm.org/citation.cfm?id=3134635
as_generator(self, a, max_iter=100, eta=0.01)[source]
Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

max_iter : int

The maximum number of iterations.

eta : float

The eta coefficient.
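
A sketch with the documented defaults, assuming the same placeholders and batched call as above:

import foolbox

attack = foolbox.attacks.NewtonFoolAttack(fmodel)

adversarials = attack(images, labels, max_iter=100, eta=0.01)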

class foolbox.attacks.DeepFoolL2Attack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

DeepFool variant that minimizes the L2 distance.

as_generator(self, a, steps=100, subsample=10)[source]

Simple and close to optimal gradient-based adversarial attack.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

steps : int

Maximum number of steps to perform.

subsample : int

Limit on the number of the most likely classes that should be considered. A small value is usually sufficient and much faster.

p : int or float

Lp-norm that should be minimized; must be 2 or np.inf.

class foolbox.attacks.DeepFoolLinfinityAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

DeepFool variant that minimizes the Linfinity distance.

as_generator(self, a, steps=100, subsample=10)[source]

Simple and close to optimal gradient-based adversarial attack.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

steps : int

Maximum number of steps to perform.

subsample : int

Limit on the number of the most likely classes that should be considered. A small value is usually sufficient and much faster.

p : int or float

Lp-norm that should be minimized; must be 2 or np.inf.

class foolbox.attacks.ADefAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Adversarial attack that distorts the image, i.e. changes the locations of pixels. The algorithm is described in [Rf241e6d2664d-1]; a repository with the original code can be found in [Rf241e6d2664d-2].

References

[Rf241e6d2664d-1]Rima Alaifari, Giovanni S. Alberti, and Tandri Gauksson, “ADef: an Iterative Algorithm to Construct Adversarial Deformations”, https://arxiv.org/abs/1804.07729
as_generator(self, a, max_iter=100, smooth=1.0, subsample=10)[source]
Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

max_iter : int > 0

Maximum number of iterations (default max_iter = 100).

smooth : float >= 0

Width of the Gaussian kernel used for smoothing; smooth = 0 disables smoothing.

subsample : int >= 2

Limit on the number of the most likely classes that should be considered. A small value is usually sufficient and much faster. (default subsample = 10)
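
A sketch with the documented defaults (placeholders and batched call as in the earlier examples); smooth is the Gaussian smoothing width:

import foolbox

attack = foolbox.attacks.ADefAttack(fmodel)

adversarials = attack(images, labels, max_iter=100, smooth=1.0, subsample=10)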

class foolbox.attacks.SaliencyMapAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Implements the Saliency Map Attack.

The attack was introduced in [R08e06ca693ba-1].

References

[R08e06ca693ba-1]Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, Ananthram Swami, “The Limitations of Deep Learning in Adversarial Settings”, https://arxiv.org/abs/1511.07528
as_generator(self, a, max_iter=2000, num_random_targets=0, fast=True, theta=0.1, max_perturbations_per_pixel=7)[source]

Implements the Saliency Map Attack.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

max_iter : int

The maximum number of iterations to run.

num_random_targets : int

Number of random target classes if no target class is given by the criterion.

fast : bool

Whether to use the fast saliency map calculation.

theta : float

Perturbation per pixel, relative to the [min, max] range.

max_perturbations_per_pixel : int

Maximum number of times a pixel can be modified.
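
The Saliency Map Attack can be run with a target class; a sketch assuming the placeholders from the earlier examples and an arbitrary, hypothetical target class 3:

import foolbox

target = foolbox.criteria.TargetClass(3)
attack = foolbox.attacks.SaliencyMapAttack(fmodel, criterion=target)

adversarials = attack(images, labels, max_iter=2000, theta=0.1,
                      max_perturbations_per_pixel=7)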

class foolbox.attacks.IterativeGradientAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Like GradientAttack but with several steps for each epsilon.

as_generator(self, a, epsilons=100, max_epsilon=1, steps=10)[source]

Like GradientAttack but with several steps for each epsilon.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

epsilons : int or Iterable[float]

Either Iterable of step sizes in the gradient direction or number of step sizes between 0 and max_epsilon that should be tried.

max_epsilon : float

Largest step size if epsilons is not an iterable.

steps : int

Number of iterations to run.

class foolbox.attacks.IterativeGradientSignAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Like GradientSignAttack but with several steps for each epsilon.

as_generator(self, a, epsilons=100, max_epsilon=1, steps=10)[source]

Like GradientSignAttack but with several steps for each epsilon.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

epsilons : int or Iterable[float]

Either Iterable of step sizes in the direction of the sign of the gradient or number of step sizes between 0 and max_epsilon that should be tried.

max_epsilon : float

Largest step size if epsilons is not an iterable.

steps : int

Number of iterations to run.

class foolbox.attacks.CarliniWagnerL2Attack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The L2 version of the Carlini & Wagner attack.

This attack is described in [Rc2cb572b91c5-1]. This implementation is based on the reference implementation by Carlini [Rc2cb572b91c5-2]. For bounds ≠ (0, 1), it differs from [Rc2cb572b91c5-2] because we normalize the squared L2 loss with the bounds.

References

[Rc2cb572b91c5-1]Nicholas Carlini, David Wagner: “Towards Evaluating the Robustness of Neural Networks”, https://arxiv.org/abs/1608.04644
[Rc2cb572b91c5-2](1, 2) https://github.com/carlini/nn_robust_attacks
as_generator(self, a, binary_search_steps=5, max_iterations=1000, confidence=0, learning_rate=0.005, initial_const=0.01, abort_early=True)[source]

The L2 version of the Carlini & Wagner attack.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search_steps : int

The number of steps for the binary search used to find the optimal tradeoff-constant between distance and confidence.

max_iterations : int

The maximum number of iterations. Larger values are more accurate; setting it too small will require a large learning rate and will produce poor results.

confidence : int or float

Confidence of adversarial examples: a higher value produces adversarials that are further away, but more strongly classified as adversarial.

learning_rate : float

The learning rate for the attack algorithm. Smaller values produce better results but take longer to converge.

initial_const : float

The initial tradeoff-constant to use to tune the relative importance of distance and confidence. If binary_search_steps is large, the initial constant is not important.

abort_early : bool

If True, Adam will be aborted if the loss hasn’t decreased for some time (a tenth of max_iterations).

static best_other_class(logits, exclude)[source]

Returns the index of the largest logit, ignoring the class that is passed as exclude.

classmethod loss_function(const, a, x, logits, reconstructed_original, confidence, min_, max_)[source]

Returns the loss and the gradient of the loss w.r.t. x, assuming that logits = model(x).
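
A sketch with the documented defaults, assuming the same placeholders as above; increasing confidence produces higher-confidence but larger perturbations:

import foolbox

attack = foolbox.attacks.CarliniWagnerL2Attack(fmodel)

adversarials = attack(images, labels,
                      binary_search_steps=5, max_iterations=1000,
                      confidence=0, learning_rate=0.005,
                      initial_const=0.01, abort_early=True)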

class foolbox.attacks.EADAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Gradient-based attack that uses elastic-net regularization [Rf0e4124daa63-1]. This implementation is based on the attack's description [Rf0e4124daa63-1] and its reference implementation [Rf0e4124daa63-2].

References

[Rf0e4124daa63-1]Pin-Yu Chen (*), Yash Sharma (*), Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, “EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples”, https://arxiv.org/abs/1709.04114
[Rf0e4124daa63-2]Pin-Yu Chen (*), Yash Sharma (*), Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, “Reference Implementation of ‘EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples’”, https://github.com/ysharma1126/EAD_Attack/blob/master/en_attack.py
as_generator(self, a, binary_search_steps=5, max_iterations=1000, confidence=0, initial_learning_rate=0.01, regularization=0.01, initial_const=0.01, abort_early=True)[source]

The EAD attack: an elastic-net regularized variant of the Carlini & Wagner L2 attack.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

binary_search_steps : int

The number of steps for the binary search used to find the optimal tradeoff-constant between distance and confidence.

max_iterations : int

The maximum number of iterations. Larger values are more accurate; setting it too small will require a large learning rate and will produce poor results.

confidence : int or float

Confidence of adversarial examples: a higher value produces adversarials that are further away, but more strongly classified as adversarial.

initial_learning_rate : float

The initial learning rate for the attack algorithm. Smaller values produce better results but take longer to converge. During the attack a square-root decay in the learning rate is performed.

initial_const : float

The initial tradeoff-constant to use to tune the relative importance of distance and confidence. If binary_search_steps is large, the initial constant is not important.

regularization : float

The L1 regularization parameter (also called beta). A value of 0 corresponds to the attacks.CarliniWagnerL2Attack attack.

abort_early : bool

If True, Adam will be aborted if the loss hasn’t decreased for some time (a tenth of max_iterations).

static best_other_class(logits, exclude)[source]

Returns the index of the largest logit, ignoring the class that is passed as exclude.

classmethod loss_function(const, a, x, logits, reconstructed_original, confidence, min_, max_)[source]

Returns the loss and the gradient of the loss w.r.t. x, assuming that logits = model(x).

classmethod project_shrinkage_thresholding(z, x0, regularization, min_, max_)[source]

Performs the element-wise projected shrinkage-thresholding operation.
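
A sketch assuming the placeholders from the earlier examples; regularization is the L1 weight (beta), and 0 reduces EAD to the C&W L2 attack, as noted above:

import foolbox

attack = foolbox.attacks.EADAttack(fmodel)

adversarials = attack(images, labels, binary_search_steps=5,
                      max_iterations=1000, confidence=0,
                      initial_learning_rate=0.01, regularization=0.01)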

class foolbox.attacks.DecoupledDirectionNormL2Attack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

The Decoupled Direction and Norm L2 adversarial attack from [R0e9d4da0ab48-1].

References

[R0e9d4da0ab48-1]Jérôme Rony, Luiz G. Hafemann, Luiz S. Oliveira, Ismail Ben Ayed, Robert Sabourin, Eric Granger, “Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses”, https://arxiv.org/abs/1811.09600

as_generator(self, a, steps=100, gamma=0.05, initial_norm=1, quantize=True, levels=256)[source]

The Decoupled Direction and Norm L2 adversarial attack.

Parameters:
input_or_adv : numpy.ndarray or Adversarial

The original, unperturbed input as a numpy.ndarray or an Adversarial instance.

label : int

The reference label of the original input. Must be passed if a is a numpy.ndarray, must not be passed if a is an Adversarial instance.

unpack : bool

If true, returns the adversarial input, otherwise returns the Adversarial object.

steps : int

Number of steps for the optimization.

gamma : float, optional

Factor by which the norm is modified in each step: new_norm = norm * (1 ± gamma).

initial_norm : float, optional

Initial value for the norm.

quantize : bool, optional

If True, the returned adversarials will have quantized values to the specified number of levels.

levels : int, optional

Number of levels to use for quantization (e.g. 256 for 8 bit images).
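
A sketch with the documented defaults (placeholders and batched call as above); quantization to 256 levels keeps the adversarials valid 8-bit images:

import foolbox

attack = foolbox.attacks.DecoupledDirectionNormL2Attack(fmodel)

adversarials = attack(images, labels, steps=100, gamma=0.05,
                      initial_norm=1, quantize=True, levels=256)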

class foolbox.attacks.SparseL1BasicIterativeAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Sparse version of the Basic Iterative Method that minimizes the L1 distance, introduced in [R0591d14da1c3-1].

References

[R0591d14da1c3-1]Florian Tramèr, Dan Boneh, “Adversarial Training and Robustness for Multiple Perturbations”, https://arxiv.org/abs/1904.13000
as_generator(self, a, q=80.0, binary_search=True, epsilon=0.3, stepsize=0.05, iterations=10, random_start=False, return_early=True)[source]

Sparse version of a gradient-based attack that minimizes the L1 distance.

Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

q : float

Relative percentile to make gradients sparse (must be in [0, 100))

binary_search : bool or int

Whether to perform a binary search over epsilon and stepsize, keeping their ratio constant and using their values to start the search. If False, hyperparameters are not optimized. Can also be an integer, specifying the number of binary search steps (default 20).

epsilon : float

Limit on the perturbation size; if binary_search is True, this value is only for initialization and automatically adapted.

stepsize : float

Step size for gradient descent; if binary_search is True, this value is only for initialization and automatically adapted.

iterations : int

Number of iterations for each gradient descent run.

random_start : bool

Start the attack from a random point rather than from the original input.

return_early : bool

Whether an individual gradient descent run should stop as soon as an adversarial is found.
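
A sketch assuming the placeholders from the earlier examples:

import foolbox

attack = foolbox.attacks.SparseL1BasicIterativeAttack(fmodel)

# q controls the gradient sparsification percentile (see above)
adversarials = attack(images, labels, q=80.0, binary_search=True,
                      iterations=10)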

class foolbox.attacks.VirtualAdversarialAttack(model=None, criterion=<foolbox.criteria.Misclassification object>, distance=<class 'foolbox.distances.MeanSquaredDistance'>, threshold=None)[source]

Calculates an untargeted adversarial perturbation by performing an approximated second-order optimization step on the KL divergence between the unperturbed predictions and the predictions for the perturbed input. This attack was introduced in [Rc6516d158ac2-1].

References

[Rc6516d158ac2-1]Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Ken Nakae, Shin Ishii, “Distributional Smoothing with Virtual Adversarial Training”, https://arxiv.org/abs/1507.00677
as_generator(self, a, xi=1e-05, iterations=1, epsilons=1000, max_epsilon=0.3)[source]
Parameters:
inputs : numpy.ndarray

Batch of inputs with shape as expected by the underlying model.

labels : numpy.ndarray

Class labels of the inputs as a vector of integers in [0, number of classes).

unpack : bool

If true, returns the adversarial inputs as an array, otherwise returns Adversarial objects.

xi : float

The finite difference size for performing the power method.

iterations : int

Number of power-method iterations used to approximate the second-order perturbation of the KL divergence.

epsilons : int or Iterable[float]

Either Iterable of step sizes in the direction of the sign of the gradient or number of step sizes between 0 and max_epsilon that should be tried.

max_epsilon : float

Largest step size if epsilons is not an iterable.
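
A sketch with the documented defaults, assuming the same fmodel, images and labels placeholders as in the first example:

import foolbox

attack = foolbox.attacks.VirtualAdversarialAttack(fmodel)

adversarials = attack(images, labels, xi=1e-05, iterations=1,
                      epsilons=1000, max_epsilon=0.3)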