foolbox.criteria
Criteria are used to define which inputs are adversarial.
We provide common criteria for untargeted and targeted adversarial attacks, e.g. Misclassification and TargetedMisclassification.
New criteria can easily be implemented by subclassing Criterion and implementing Criterion.__call__().
Criteria can be combined using a logical and (criterion1 & criterion2) to create a new criterion that is only satisfied when both operands are.
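The combined criterion marks a perturbed input as adversarial only if both sub-criteria do. A minimal, self-contained sketch of that semantics using NumPy (the class and helper names here are illustrative, not Foolbox's internal ones):

```python
import numpy as np

class _And:
    """Adversarial only where BOTH criteria hold (the result of a & b)."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __call__(self, perturbed, outputs):
        return np.logical_and(self.a(perturbed, outputs),
                              self.b(perturbed, outputs))

# Two toy criteria over a batch of model outputs (logits):
misclassified = lambda perturbed, outputs: outputs.argmax(-1) != np.array([0, 0, 1])
low_confidence = lambda perturbed, outputs: outputs.max(-1) < 1.0

combined = _And(misclassified, low_confidence)

outputs = np.array([[0.2, 0.9],    # predicted 1 (label 0): misclassified, max logit 0.9
                    [2.0, 0.1],    # predicted 0 (label 0): correct
                    [0.3, 0.4]])   # predicted 1 (label 1): correct
is_adv = combined(None, outputs)
print(is_adv)  # [ True False False]
```

Only the first input satisfies both criteria, so only it is reported as adversarial by the combination.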
Misclassification
from foolbox.criteria import Misclassification
criterion = Misclassification(labels)
- class foolbox.criteria.Misclassification(labels)
Considers those perturbed inputs adversarial whose predicted class differs from the label.
- Parameters
  labels (Any) – Tensor with labels of the unperturbed inputs (batch,).
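The decision Misclassification makes can be sketched in a few lines: an input counts as adversarial when the class predicted from the model outputs differs from the true label. A hedged NumPy illustration (the data here is mock, not from Foolbox):

```python
import numpy as np

# Mock model outputs (logits) for a batch of 3 perturbed inputs, 4 classes.
outputs = np.array([
    [0.1, 2.0, 0.3, 0.4],   # predicted class 1
    [3.0, 0.2, 0.1, 0.0],   # predicted class 0
    [0.0, 0.1, 0.2, 4.0],   # predicted class 3
])
labels = np.array([1, 2, 3])  # labels of the unperturbed inputs, shape (batch,)

# Misclassification semantics: adversarial where prediction != label
is_adv = outputs.argmax(axis=-1) != labels
print(is_adv)  # [False  True False]
```

Only the second input is predicted as a class other than its label, so only it is flagged.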
TargetedMisclassification
from foolbox.criteria import TargetedMisclassification
criterion = TargetedMisclassification(target_classes)
- class foolbox.criteria.TargetedMisclassification(target_classes)
Considers those perturbed inputs adversarial whose predicted class matches the target class.
- Parameters
  target_classes (Any) – Tensor with target classes (batch,).
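TargetedMisclassification inverts the comparison: an input is adversarial when the predicted class equals the attacker-chosen target. A small NumPy sketch of that check (mock data, illustrative only):

```python
import numpy as np

# Mock model outputs (logits) for a batch of 2 perturbed inputs, 3 classes.
outputs = np.array([
    [0.1, 2.0, 0.3],   # predicted class 1
    [3.0, 0.2, 0.1],   # predicted class 0
])
target_classes = np.array([1, 1])  # attacker's target classes, shape (batch,)

# TargetedMisclassification semantics: adversarial where prediction == target
is_adv = outputs.argmax(axis=-1) == target_classes
print(is_adv)  # [ True False]
```

The first input already reaches the target class, so it is the only one reported as adversarial.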
Criterion
- class foolbox.criteria.Criterion
Abstract base class to implement new criteria.
- abstract __call__(perturbed, outputs)
Returns a boolean tensor indicating which perturbed inputs are adversarial.
- Parameters
  - perturbed (T) – Tensor with perturbed inputs (batch, ...).
  - outputs (T) – Tensor with model outputs for the perturbed inputs (batch, ...).
- Returns
  A boolean tensor indicating which perturbed inputs are adversarial (batch,).
- Return type
  T
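Subclassing Criterion amounts to implementing __call__ with the signature above. The sketch below uses a minimal stand-in for the abstract base class so it runs without Foolbox installed; ConfidentMisclassification is a hypothetical example criterion, not part of the library:

```python
import numpy as np
from abc import ABC, abstractmethod

class Criterion(ABC):
    """Minimal stand-in for foolbox.criteria.Criterion."""
    @abstractmethod
    def __call__(self, perturbed, outputs):
        ...

class ConfidentMisclassification(Criterion):
    """Hypothetical criterion: adversarial only if the input is
    misclassified AND the softmax confidence of the predicted class
    exceeds a threshold."""
    def __init__(self, labels, threshold=0.5):
        self.labels = labels
        self.threshold = threshold

    def __call__(self, perturbed, outputs):
        # Numerically stable softmax over the class dimension
        exp = np.exp(outputs - outputs.max(axis=-1, keepdims=True))
        probs = exp / exp.sum(axis=-1, keepdims=True)
        misclassified = outputs.argmax(axis=-1) != self.labels
        confident = probs.max(axis=-1) >= self.threshold
        return np.logical_and(misclassified, confident)

labels = np.array([1, 0])
criterion = ConfidentMisclassification(labels, threshold=0.6)
outputs = np.array([[4.0, 0.0],    # predicted 0 (label 1): misclassified, confident
                    [0.1, 0.2]])   # predicted 1 (label 0): misclassified, not confident
is_adv = criterion(None, outputs)
print(is_adv)  # [ True False]
```

Both inputs are misclassified, but only the first clears the confidence threshold, so only it counts as adversarial under this criterion.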