---
license: apache-2.0
---

Reproduction of "Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness" (https://arxiv.org/abs/2408.05446).

Source code and usage examples: https://github.com/ETH-DISCO/self-ensembling

The architecture is based on Torchvision's default ResNet152 implementation.

Hyperparameters:

- criterion: `torch.nn.CrossEntropyLoss()`
- optimizer: `torch.optim.AdamW`
- scaler: `GradScaler`
- datasets: `["cifar10", "cifar100"]`
- lr: `0.0001`
- num_epochs: `16` (more epochs would likely help further, but probably by less than 1%)
- crossmax_k: `2` (the difference between `crossmax_k=2` and `crossmax_k=3` is about 1-2%, so the choice is not critical)
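
To make the role of `crossmax_k` concrete, here is a minimal sketch of CrossMax-style aggregation as we understand it from the paper: normalize each predictor's logits by its own maximum, normalize each class by its maximum across predictors, then take the k-th highest value per class. It is written in plain Python on nested lists for readability. The function name `crossmax` and the example logits are illustrative assumptions, not code from the linked repository, which works on tensors and may differ in detail.

```python
from typing import List


def crossmax(logits: List[List[float]], k: int = 2) -> List[float]:
    """Aggregate logits of shape [num_predictors][num_classes] into one vector.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    num_predictors = len(logits)
    if not 1 <= k <= num_predictors:
        raise ValueError("k must be between 1 and the number of predictors")
    num_classes = len(logits[0])

    # Step 1: subtract each predictor's own maximum logit,
    # so that no single predictor dominates through sheer scale.
    per_predictor = [[z - max(row) for z in row] for row in logits]

    # Step 2: subtract each class's maximum across predictors,
    # so that no single class dominates across the ensemble.
    class_max = [max(row[c] for row in per_predictor) for c in range(num_classes)]
    normalized = [
        [row[c] - class_max[c] for c in range(num_classes)] for row in per_predictor
    ]

    # Step 3: for every class, keep the k-th highest value across predictors.
    return [
        sorted((row[c] for row in normalized), reverse=True)[k - 1]
        for c in range(num_classes)
    ]


# Three predictors, three classes; the second predictor is an outlier
# that votes strongly for class 2.
logits = [
    [2.0, 1.0, 0.0],
    [1.5, 0.5, 3.0],
    [2.5, 0.0, 0.5],
]
aggregated = crossmax(logits, k=2)
predicted_class = aggregated.index(max(aggregated))
print(aggregated, predicted_class)  # [0.0, -1.5, -2.0] 0
```

After step 2 the highest value for every class is exactly zero, so `k=1` would return all zeros and carry no information. This is why the aggregation uses a lower rank such as `k=2` or `k=3`.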