WARNING:__main__:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
| distributed init (rank 7): env://, gpu 7
| distributed init (rank 0): env://, gpu 0
| distributed init (rank 5): env://, gpu 5
| distributed init (rank 1): env://, gpu 1
| distributed init (rank 3): env://, gpu 3
| distributed init (rank 4): env://, gpu 4
| distributed init (rank 2): env://, gpu 2
| distributed init (rank 6): env://, gpu 6
Namespace(aa='rand-m9-mstd0.5-inc1', auto_resume=True, batch_size=256, clip_grad=None, color_jitter=0.4, crop_pct=None, cutmix=0.0, cutmix_minmax=None, data_path='/data/benchmarks/ILSVRC2012_LMDB', data_set='IMNET_LMDB', device='cuda', disable_eval=False, dist_backend='nccl', dist_eval=True, dist_on_itp=False, dist_url='env://', distributed=True, drop_path=0.2, enable_wandb=False, epochs=300, eval=False, eval_data_path=None, finetune='', gpu=0, head_init_scale=1.0, imagenet_default_mean_and_std=True, input_size=224, layer_decay=1.0, layer_scale_init_value=1e-06, local_rank=-1, log_dir=None, lr=0.004, min_lr=1e-06, mixup=0.0, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='convnext_tiny', model_ema=False, model_ema_decay=0.9999, model_ema_eval=False, model_ema_force_cpu=False, model_key='model|module', model_prefix='', momentum=0.9, nb_classes=1000, num_workers=10, opt='adamw', opt_betas=None, opt_eps=1e-08, output_dir='./checkpoint', pin_mem=True, project='convnext', rank=0, recount=1, remode='pixel', reprob=0.25, resplit=False, resume='', save_ckpt=True, save_ckpt_freq=1, save_ckpt_num=3, seed=0, smoothing=0.1, start_epoch=0, train_interpolation='bicubic', update_freq=2, use_amp=True, wandb_ckpt=False, warmup_epochs=20, warmup_steps=-1, weight_decay=0.05, weight_decay_end=None, world_size=8)
Transform =
RandomResizedCropAndInterpolation(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=bicubic)
RandomHorizontalFlip(p=0.5)
RandAugment(n=2, ops=
    AugmentOp(name=AutoContrast, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=Equalize, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=Invert, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=Rotate, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=PosterizeIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=SolarizeIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=SolarizeAdd, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ColorIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ContrastIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=BrightnessIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=SharpnessIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ShearX, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ShearY, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=TranslateXRel, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=TranslateYRel, p=0.5, m=9, mstd=0.5))
ToTensor()
Normalize(mean=tensor([0.4850, 0.4560, 0.4060]), std=tensor([0.2290, 0.2240, 0.2250]))
RandomErasing(p=0.25, mode=pixel, count=(1, 1))
---------------------------
reading from datapath /data/benchmarks/ILSVRC2012_LMDB
Number of the class = 1000
Transform =
Resize(size=256, interpolation=bicubic, max_size=None, antialias=None)
CenterCrop(size=(224, 224))
ToTensor()
Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
---------------------------
reading from datapath /data/benchmarks/ILSVRC2012_LMDB
Number of the class = 1000
Sampler_train = <torch.utils.data.distributed.DistributedSampler object at 0x7f6084969c40>
Model = MobileNetV3_Small(
(conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs1): Hardswish()
(bneck): Sequential(
(0): Block(
(conv1): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
(bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(16, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(8, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
(skip): Sequential(
(0): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
(1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Block(
(conv1): Conv2d(16, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(72, 72, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=72, bias=False)
(bn2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv3): Conv2d(72, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
(skip): Sequential(
(0): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
(1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(16, 24, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(2): Block(
(conv1): Conv2d(24, 88, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(88, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(88, 88, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=88, bias=False)
(bn2): BatchNorm2d(88, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv3): Conv2d(88, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
)
(3): Block(
(conv1): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(96, 96, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=96, bias=False)
(bn2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(96, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(24, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=24, bias=False)
(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(24, 40, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(4): Block(
(conv1): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(240, 240, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=240, bias=False)
(bn2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(240, 60, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(60, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(60, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(5): Block(
(conv1): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(240, 240, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=240, bias=False)
(bn2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(240, 60, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(60, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(60, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(6): Block(
(conv1): Conv2d(40, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(120, 120, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=120, bias=False)
(bn2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(120, 30, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(30, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(30, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(120, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(40, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Block(
(conv1): Conv2d(48, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(144, 144, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=144, bias=False)
(bn2): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(144, 36, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(36, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(36, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(144, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(8): Block(
(conv1): Conv2d(48, 288, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(288, 288, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=288, bias=False)
(bn2): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(288, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(72, 288, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(288, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(48, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=48, bias=False)
(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(48, 96, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(9): Block(
(conv1): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(576, 576, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=576, bias=False)
(bn2): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(576, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(144, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(10): Block(
(conv1): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(576, 576, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=576, bias=False)
(bn2): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(576, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(144, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
)
(conv2): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn2): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs2): Hardswish()
(gap): AdaptiveAvgPool2d(output_size=1)
(linear3): Linear(in_features=576, out_features=1280, bias=False)
(bn3): BatchNorm1d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs3): Hardswish()
(drop): Dropout(p=0.2, inplace=False)
(linear4): Linear(in_features=1280, out_features=1000, bias=True)
)
number of params: 2950524
LR = 0.00400000
Batch size = 4096
Update frequent = 2
Number of training examples = 1281167
Number of training training per epoch = 312
Param groups = {
"decay": {
"weight_decay": 0.05,
"params": [
"conv1.weight",
"bneck.0.conv1.weight",
"bneck.0.conv2.weight",
"bneck.0.se.se.1.weight",
"bneck.0.se.se.4.weight",
"bneck.0.conv3.weight",
"bneck.0.skip.0.weight",
"bneck.1.conv1.weight",
"bneck.1.conv2.weight",
"bneck.1.conv3.weight",
"bneck.1.skip.0.weight",
"bneck.1.skip.2.weight",
"bneck.2.conv1.weight",
"bneck.2.conv2.weight",
"bneck.2.conv3.weight",
"bneck.3.conv1.weight",
"bneck.3.conv2.weight",
"bneck.3.se.se.1.weight",
"bneck.3.se.se.4.weight",
"bneck.3.conv3.weight",
"bneck.3.skip.0.weight",
"bneck.3.skip.2.weight",
"bneck.4.conv1.weight",
"bneck.4.conv2.weight",
"bneck.4.se.se.1.weight",
"bneck.4.se.se.4.weight",
"bneck.4.conv3.weight",
"bneck.5.conv1.weight",
"bneck.5.conv2.weight",
"bneck.5.se.se.1.weight",
"bneck.5.se.se.4.weight",
"bneck.5.conv3.weight",
"bneck.6.conv1.weight",
"bneck.6.conv2.weight",
"bneck.6.se.se.1.weight",
"bneck.6.se.se.4.weight",
"bneck.6.conv3.weight",
"bneck.6.skip.0.weight",
"bneck.7.conv1.weight",
"bneck.7.conv2.weight",
"bneck.7.se.se.1.weight",
"bneck.7.se.se.4.weight",
"bneck.7.conv3.weight",
"bneck.8.conv1.weight",
"bneck.8.conv2.weight",
"bneck.8.se.se.1.weight",
"bneck.8.se.se.4.weight",
"bneck.8.conv3.weight",
"bneck.8.skip.0.weight",
"bneck.8.skip.2.weight",
"bneck.9.conv1.weight",
"bneck.9.conv2.weight",
"bneck.9.se.se.1.weight",
"bneck.9.se.se.4.weight",
"bneck.9.conv3.weight",
"bneck.10.conv1.weight",
"bneck.10.conv2.weight",
"bneck.10.se.se.1.weight",
"bneck.10.se.se.4.weight",
"bneck.10.conv3.weight",
"conv2.weight",
"linear3.weight",
"linear4.weight"
],
"lr_scale": 1.0
},
"no_decay": {
"weight_decay": 0.0,
"params": [
"bn1.weight",
"bn1.bias",
"bneck.0.bn1.weight",
"bneck.0.bn1.bias",
"bneck.0.bn2.weight",
"bneck.0.bn2.bias",
"bneck.0.se.se.2.weight",
"bneck.0.se.se.2.bias",
"bneck.0.bn3.weight",
"bneck.0.bn3.bias",
"bneck.0.skip.1.weight",
"bneck.0.skip.1.bias",
"bneck.1.bn1.weight",
"bneck.1.bn1.bias",
"bneck.1.bn2.weight",
"bneck.1.bn2.bias",
"bneck.1.bn3.weight",
"bneck.1.bn3.bias",
"bneck.1.skip.1.weight",
"bneck.1.skip.1.bias",
"bneck.1.skip.2.bias",
"bneck.1.skip.3.weight",
"bneck.1.skip.3.bias",
"bneck.2.bn1.weight",
"bneck.2.bn1.bias",
"bneck.2.bn2.weight",
"bneck.2.bn2.bias",
"bneck.2.bn3.weight",
"bneck.2.bn3.bias",
"bneck.3.bn1.weight",
"bneck.3.bn1.bias",
"bneck.3.bn2.weight",
"bneck.3.bn2.bias",
"bneck.3.se.se.2.weight",
"bneck.3.se.se.2.bias",
"bneck.3.bn3.weight",
"bneck.3.bn3.bias",
"bneck.3.skip.1.weight",
"bneck.3.skip.1.bias",
"bneck.3.skip.2.bias",
"bneck.3.skip.3.weight",
"bneck.3.skip.3.bias",
"bneck.4.bn1.weight",
"bneck.4.bn1.bias",
"bneck.4.bn2.weight",
"bneck.4.bn2.bias",
"bneck.4.se.se.2.weight",
"bneck.4.se.se.2.bias",
"bneck.4.bn3.weight",
"bneck.4.bn3.bias",
"bneck.5.bn1.weight",
"bneck.5.bn1.bias",
"bneck.5.bn2.weight",
"bneck.5.bn2.bias",
"bneck.5.se.se.2.weight",
"bneck.5.se.se.2.bias",
"bneck.5.bn3.weight",
"bneck.5.bn3.bias",
"bneck.6.bn1.weight",
"bneck.6.bn1.bias",
"bneck.6.bn2.weight",
"bneck.6.bn2.bias",
"bneck.6.se.se.2.weight",
"bneck.6.se.se.2.bias",
"bneck.6.bn3.weight",
"bneck.6.bn3.bias",
"bneck.6.skip.1.weight",
"bneck.6.skip.1.bias",
"bneck.7.bn1.weight",
"bneck.7.bn1.bias",
"bneck.7.bn2.weight",
"bneck.7.bn2.bias",
"bneck.7.se.se.2.weight",
"bneck.7.se.se.2.bias",
"bneck.7.bn3.weight",
"bneck.7.bn3.bias",
"bneck.8.bn1.weight",
"bneck.8.bn1.bias",
"bneck.8.bn2.weight",
"bneck.8.bn2.bias",
"bneck.8.se.se.2.weight",
"bneck.8.se.se.2.bias",
"bneck.8.bn3.weight",
"bneck.8.bn3.bias",
"bneck.8.skip.1.weight",
"bneck.8.skip.1.bias",
"bneck.8.skip.2.bias",
"bneck.8.skip.3.weight",
"bneck.8.skip.3.bias",
"bneck.9.bn1.weight",
"bneck.9.bn1.bias",
"bneck.9.bn2.weight",
"bneck.9.bn2.bias",
"bneck.9.se.se.2.weight",
"bneck.9.se.se.2.bias",
"bneck.9.bn3.weight",
"bneck.9.bn3.bias",
"bneck.10.bn1.weight",
"bneck.10.bn1.bias",
"bneck.10.bn2.weight",
"bneck.10.bn2.bias",
"bneck.10.se.se.2.weight",
"bneck.10.se.se.2.bias",
"bneck.10.bn3.weight",
"bneck.10.bn3.bias",
"bn2.weight",
"bn2.bias",
"bn3.weight",
"bn3.bias",
"linear4.bias"
],
"lr_scale": 1.0
}
}
Use Cosine LR scheduler
Set warmup steps = 6240
Set warmup steps = 0
Max WD = 0.0500000, Min WD = 0.0500000
criterion = LabelSmoothingCrossEntropy()
Auto resume checkpoint:
Start training for 300 epochs
Epoch: [0] [ 0/625] eta: 5:22:53 lr: 0.000000 min_lr: 0.000000 loss: 6.9073 (6.9073) class_acc: 0.0000 (0.0000) weight_decay: 0.0500 (0.0500) time: 30.9969 data: 24.4808 max mem: 2905
Epoch: [0] [200/625] eta: 0:15:28 lr: 0.000064 min_lr: 0.000064 loss: 6.8783 (6.8985) class_acc: 0.0000 (0.0013) weight_decay: 0.0500 (0.0500) grad_norm: 0.4079 (0.4513) time: 1.9811 data: 0.3600 max mem: 2905
Epoch: [0] [400/625] eta: 0:08:01 lr: 0.000128 min_lr: 0.000128 loss: 6.7819 (6.8701) class_acc: 0.0039 (0.0018) weight_decay: 0.0500 (0.0500) grad_norm: 0.5901 (0.4735) time: 2.2324 data: 0.0009 max mem: 2905
Epoch: [0] [600/625] eta: 0:00:53 lr: 0.000192 min_lr: 0.000192 loss: 6.5924 (6.8041) class_acc: 0.0039 (0.0032) weight_decay: 0.0500 (0.0500) grad_norm: 0.9315 (0.5781) time: 2.0935 data: 0.0452 max mem: 2905
Epoch: [0] [624/625] eta: 0:00:02 lr: 0.000199 min_lr: 0.000199 loss: 6.5503 (6.7951) class_acc: 0.0078 (0.0034) weight_decay: 0.0500 (0.0500) grad_norm: 0.9210 (0.5922) time: 0.6318 data: 0.0064 max mem: 2905
Epoch: [0] Total time: 0:21:52 (2.1002 s / it)
Averaged stats: lr: 0.000199 min_lr: 0.000199 loss: 6.5503 (6.7943) class_acc: 0.0078 (0.0036) weight_decay: 0.0500 (0.0500) grad_norm: 0.9210 (0.5922)
Test: [ 0/50] eta: 0:11:29 loss: 6.2122 (6.2122) acc1: 0.0000 (0.0000) acc5: 2.4000 (2.4000) time: 13.7935 data: 13.0781 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 6.2798 (6.2349) acc1: 0.8000 (2.0364) acc5: 5.6000 (6.4727) time: 2.0441 data: 1.9611 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 6.2798 (6.2465) acc1: 1.6000 (1.9048) acc5: 5.6000 (6.4381) time: 0.9553 data: 0.9362 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 6.2106 (6.2254) acc1: 1.6000 (2.1161) acc5: 6.4000 (6.7097) time: 1.0293 data: 1.0112 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 6.2730 (6.2426) acc1: 1.6000 (1.9707) acc5: 5.6000 (6.1463) time: 0.9834 data: 0.9653 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 6.2752 (6.2454) acc1: 0.8000 (1.8560) acc5: 4.8000 (6.0320) time: 0.9657 data: 0.9474 max mem: 2905
Test: Total time: 0:00:59 (1.1806 s / it)
* Acc@1 1.620 Acc@5 5.984 loss 6.224
Accuracy of the model on the 50000 test images: 1.6%
Max accuracy: 1.62%
Epoch: [1] [ 0/625] eta: 3:30:46 lr: 0.000200 min_lr: 0.000200 loss: 6.5212 (6.5212) class_acc: 0.0078 (0.0078) weight_decay: 0.0500 (0.0500) time: 20.2340 data: 18.8082 max mem: 2905
Epoch: [1] [200/625] eta: 0:14:09 lr: 0.000264 min_lr: 0.000264 loss: 6.3831 (6.4747) class_acc: 0.0156 (0.0131) weight_decay: 0.0500 (0.0500) grad_norm: 1.1870 (1.0887) time: 1.9414 data: 0.0293 max mem: 2905
Epoch: [1] [400/625] eta: 0:07:25 lr: 0.000328 min_lr: 0.000328 loss: 6.3248 (6.4075) class_acc: 0.0234 (0.0163) weight_decay: 0.0500 (0.0500) grad_norm: 1.2754 (1.1543) time: 1.9028 data: 0.0008 max mem: 2905
Epoch: [1] [600/625] eta: 0:00:49 lr: 0.000392 min_lr: 0.000392 loss: 6.1402 (6.3414) class_acc: 0.0273 (0.0195) weight_decay: 0.0500 (0.0500) grad_norm: 1.3534 (1.2217) time: 2.0527 data: 0.0007 max mem: 2905
Epoch: [1] [624/625] eta: 0:00:01 lr: 0.000399 min_lr: 0.000399 loss: 6.1052 (6.3331) class_acc: 0.0312 (0.0201) weight_decay: 0.0500 (0.0500) grad_norm: 1.4380 (1.2335) time: 0.8558 data: 0.0018 max mem: 2905
Epoch: [1] Total time: 0:20:10 (1.9367 s / it)
Averaged stats: lr: 0.000399 min_lr: 0.000399 loss: 6.1052 (6.3370) class_acc: 0.0312 (0.0199) weight_decay: 0.0500 (0.0500) grad_norm: 1.4380 (1.2335)
Test: [ 0/50] eta: 0:10:19 loss: 5.3883 (5.3883) acc1: 5.6000 (5.6000) acc5: 20.8000 (20.8000) time: 12.3925 data: 12.3576 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 5.4225 (5.4007) acc1: 6.4000 (7.0545) acc5: 20.0000 (18.8364) time: 2.1014 data: 2.0793 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 5.4212 (5.4345) acc1: 5.6000 (5.9048) acc5: 15.2000 (17.8286) time: 1.1690 data: 1.1492 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 5.3942 (5.4341) acc1: 4.8000 (5.8839) acc5: 15.2000 (17.3161) time: 1.2309 data: 1.2122 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 5.4350 (5.4472) acc1: 4.8000 (5.6976) acc5: 14.4000 (16.6049) time: 0.8849 data: 0.8655 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 5.5120 (5.4625) acc1: 4.8000 (5.6800) acc5: 14.4000 (16.3680) time: 0.8166 data: 0.7956 max mem: 2905
Test: Total time: 0:00:54 (1.0842 s / it)
* Acc@1 6.006 Acc@5 17.070 loss 5.431
Accuracy of the model on the 50000 test images: 6.0%
Max accuracy: 6.01%
Epoch: [2] [ 0/625] eta: 3:47:03 lr: 0.000400 min_lr: 0.000400 loss: 6.1433 (6.1433) class_acc: 0.0234 (0.0234) weight_decay: 0.0500 (0.0500) time: 21.7982 data: 18.5213 max mem: 2905
Epoch: [2] [200/625] eta: 0:13:53 lr: 0.000464 min_lr: 0.000464 loss: 6.0624 (6.0685) class_acc: 0.0391 (0.0372) weight_decay: 0.0500 (0.0500) grad_norm: 1.6316 (1.5256) time: 1.8131 data: 0.0244 max mem: 2905
Epoch: [2] [400/625] eta: 0:07:09 lr: 0.000528 min_lr: 0.000528 loss: 5.8892 (6.0085) class_acc: 0.0508 (0.0422) weight_decay: 0.0500 (0.0500) grad_norm: 1.5450 (1.5541) time: 1.8028 data: 0.0042 max mem: 2905
Epoch: [2] [600/625] eta: 0:00:47 lr: 0.000592 min_lr: 0.000592 loss: 5.7767 (5.9473) class_acc: 0.0586 (0.0475) weight_decay: 0.0500 (0.0500) grad_norm: 1.7937 (1.6152) time: 1.9841 data: 0.0006 max mem: 2905
Epoch: [2] [624/625] eta: 0:00:01 lr: 0.000599 min_lr: 0.000599 loss: 5.7209 (5.9391) class_acc: 0.0625 (0.0482) weight_decay: 0.0500 (0.0500) grad_norm: 1.7196 (1.6232) time: 0.8288 data: 0.0015 max mem: 2905
Epoch: [2] Total time: 0:19:34 (1.8798 s / it)
Averaged stats: lr: 0.000599 min_lr: 0.000599 loss: 5.7209 (5.9395) class_acc: 0.0625 (0.0480) weight_decay: 0.0500 (0.0500) grad_norm: 1.7196 (1.6232)
Test: [ 0/50] eta: 0:09:46 loss: 4.8598 (4.8598) acc1: 11.2000 (11.2000) acc5: 27.2000 (27.2000) time: 11.7263 data: 11.6884 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 4.8561 (4.7968) acc1: 12.0000 (12.5818) acc5: 28.8000 (29.4545) time: 1.9294 data: 1.9089 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 4.8561 (4.8427) acc1: 11.2000 (11.2762) acc5: 28.8000 (28.1524) time: 0.9804 data: 0.9605 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 4.8037 (4.8281) acc1: 10.4000 (11.1742) acc5: 28.0000 (28.5419) time: 0.9209 data: 0.9003 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 4.8505 (4.8551) acc1: 8.8000 (10.9659) acc5: 27.2000 (28.1756) time: 0.6264 data: 0.6058 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 4.9349 (4.8786) acc1: 9.6000 (10.8160) acc5: 24.8000 (27.5040) time: 0.6236 data: 0.6034 max mem: 2905
Test: Total time: 0:00:46 (0.9271 s / it)
* Acc@1 11.498 Acc@5 28.242 loss 4.837
Accuracy of the model on the 50000 test images: 11.5%
Max accuracy: 11.50%
Epoch: [3] [ 0/625] eta: 3:34:40 lr: 0.000600 min_lr: 0.000600 loss: 5.7872 (5.7872) class_acc: 0.0508 (0.0508) weight_decay: 0.0500 (0.0500) time: 20.6085 data: 20.4796 max mem: 2905
Epoch: [3] [200/625] eta: 0:13:29 lr: 0.000664 min_lr: 0.000664 loss: 5.6046 (5.6750) class_acc: 0.0742 (0.0743) weight_decay: 0.0500 (0.0500) grad_norm: 1.9255 (1.8777) time: 1.8261 data: 0.2726 max mem: 2905
Epoch: [3] [400/625] eta: 0:07:08 lr: 0.000728 min_lr: 0.000728 loss: 5.4507 (5.6158) class_acc: 0.1055 (0.0807) weight_decay: 0.0500 (0.0500) grad_norm: 2.1282 (1.9889) time: 1.8797 data: 0.0230 max mem: 2905
Epoch: [3] [600/625] eta: 0:00:48 lr: 0.000792 min_lr: 0.000792 loss: 5.4681 (5.5621) class_acc: 0.1016 (0.0856) weight_decay: 0.0500 (0.0500) grad_norm: 1.9530 (2.0324) time: 1.9466 data: 0.0006 max mem: 2905
Epoch: [3] [624/625] eta: 0:00:01 lr: 0.000799 min_lr: 0.000799 loss: 5.3850 (5.5557) class_acc: 0.1055 (0.0865) weight_decay: 0.0500 (0.0500) grad_norm: 2.1287 (2.0300) time: 0.7554 data: 0.0016 max mem: 2905
Epoch: [3] Total time: 0:19:31 (1.8738 s / it)
Averaged stats: lr: 0.000799 min_lr: 0.000799 loss: 5.3850 (5.5575) class_acc: 0.1055 (0.0864) weight_decay: 0.0500 (0.0500) grad_norm: 2.1287 (2.0300)
Test: [ 0/50] eta: 0:10:33 loss: 4.4218 (4.4218) acc1: 12.8000 (12.8000) acc5: 36.0000 (36.0000) time: 12.6686 data: 12.6427 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 4.3667 (4.2635) acc1: 19.2000 (18.6182) acc5: 40.0000 (40.8727) time: 1.8861 data: 1.8650 max mem: 2905
Test: [20/50] eta: 0:00:40 loss: 4.3667 (4.3161) acc1: 16.0000 (16.3810) acc5: 39.2000 (38.1333) time: 0.7907 data: 0.7708 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 4.3468 (4.2962) acc1: 15.2000 (16.4903) acc5: 36.8000 (38.0903) time: 0.9523 data: 0.9335 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 4.2393 (4.3090) acc1: 16.8000 (16.9366) acc5: 36.8000 (37.7951) time: 1.0633 data: 1.0448 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 4.4039 (4.3490) acc1: 16.0000 (16.4960) acc5: 34.4000 (37.3280) time: 0.7165 data: 0.6985 max mem: 2905
Test: Total time: 0:00:54 (1.0855 s / it)
* Acc@1 17.338 Acc@5 37.778 loss 4.303
Accuracy of the model on the 50000 test images: 17.3%
Max accuracy: 17.34%
Epoch: [4] [ 0/625] eta: 3:15:07 lr: 0.000800 min_lr: 0.000800 loss: 5.2683 (5.2683) class_acc: 0.1016 (0.1016) weight_decay: 0.0500 (0.0500) time: 18.7315 data: 18.1657 max mem: 2905
Epoch: [4] [200/625] eta: 0:13:45 lr: 0.000864 min_lr: 0.000864 loss: 5.3035 (5.3295) class_acc: 0.1133 (0.1126) weight_decay: 0.0500 (0.0500) grad_norm: 2.0196 (2.1965) time: 1.8294 data: 0.0499 max mem: 2905
Epoch: [4] [400/625] eta: 0:07:13 lr: 0.000928 min_lr: 0.000928 loss: 5.2115 (5.2869) class_acc: 0.1328 (0.1181) weight_decay: 0.0500 (0.0500) grad_norm: 2.1358 (inf) time: 1.9779 data: 0.0099 max mem: 2905
Epoch: [4] [600/625] eta: 0:00:48 lr: 0.000992 min_lr: 0.000992 loss: 5.1308 (5.2411) class_acc: 0.1406 (0.1237) weight_decay: 0.0500 (0.0500) grad_norm: 2.1777 (inf) time: 1.9683 data: 0.0520 max mem: 2905
Epoch: [4] [624/625] eta: 0:00:01 lr: 0.001000 min_lr: 0.001000 loss: 5.1033 (5.2360) class_acc: 0.1484 (0.1244) weight_decay: 0.0500 (0.0500) grad_norm: 2.3770 (inf) time: 0.8879 data: 0.0299 max mem: 2905
Epoch: [4] Total time: 0:19:35 (1.8809 s / it)
Averaged stats: lr: 0.001000 min_lr: 0.001000 loss: 5.1033 (5.2305) class_acc: 0.1484 (0.1258) weight_decay: 0.0500 (0.0500) grad_norm: 2.3770 (inf)
Test: [ 0/50] eta: 0:10:10 loss: 3.8296 (3.8296) acc1: 24.8000 (24.8000) acc5: 44.0000 (44.0000) time: 12.2025 data: 12.1732 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.8296 (3.8120) acc1: 24.8000 (25.7455) acc5: 48.0000 (47.7091) time: 2.2404 data: 2.2214 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 3.9097 (3.9069) acc1: 21.6000 (22.8571) acc5: 46.4000 (45.1810) time: 1.2378 data: 1.2191 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.9239 (3.8921) acc1: 20.8000 (23.0968) acc5: 41.6000 (45.0839) time: 1.0447 data: 1.0252 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.9548 (3.9210) acc1: 22.4000 (22.8683) acc5: 41.6000 (44.7610) time: 0.6728 data: 0.6523 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 4.0196 (3.9464) acc1: 20.8000 (22.3360) acc5: 42.4000 (44.3680) time: 0.6484 data: 0.6262 max mem: 2905
Test: Total time: 0:00:52 (1.0593 s / it)
* Acc@1 22.582 Acc@5 44.934 loss 3.915
Accuracy of the model on the 50000 test images: 22.6%
Max accuracy: 22.58%
Epoch: [5] [ 0/625] eta: 3:37:49 lr: 0.001000 min_lr: 0.001000 loss: 5.1298 (5.1298) class_acc: 0.1367 (0.1367) weight_decay: 0.0500 (0.0500) time: 20.9108 data: 18.1586 max mem: 2905
Epoch: [5] [200/625] eta: 0:13:51 lr: 0.001064 min_lr: 0.001064 loss: 5.0483 (5.0432) class_acc: 0.1602 (0.1535) weight_decay: 0.0500 (0.0500) grad_norm: 2.8175 (2.6947) time: 1.9335 data: 0.0055 max mem: 2905
Epoch: [5] [400/625] eta: 0:07:11 lr: 0.001128 min_lr: 0.001128 loss: 4.9352 (5.0098) class_acc: 0.1641 (0.1558) weight_decay: 0.0500 (0.0500) grad_norm: 2.1319 (2.5978) time: 2.0246 data: 0.0043 max mem: 2905
Epoch: [5] [600/625] eta: 0:00:48 lr: 0.001192 min_lr: 0.001192 loss: 4.8698 (4.9720) class_acc: 0.1641 (0.1608) weight_decay: 0.0500 (0.0500) grad_norm: 2.8755 (2.6583) time: 1.9903 data: 0.0335 max mem: 2905
Epoch: [5] [624/625] eta: 0:00:01 lr: 0.001200 min_lr: 0.001200 loss: 4.8149 (4.9663) class_acc: 0.1719 (0.1614) weight_decay: 0.0500 (0.0500) grad_norm: 2.6388 (2.6477) time: 0.7885 data: 0.0014 max mem: 2905
Epoch: [5] Total time: 0:19:32 (1.8765 s / it)
Averaged stats: lr: 0.001200 min_lr: 0.001200 loss: 4.8149 (4.9633) class_acc: 0.1719 (0.1627) weight_decay: 0.0500 (0.0500) grad_norm: 2.6388 (2.6477)
Test: [ 0/50] eta: 0:10:16 loss: 3.5700 (3.5700) acc1: 28.8000 (28.8000) acc5: 50.4000 (50.4000) time: 12.3370 data: 12.3068 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 3.5700 (3.5258) acc1: 28.8000 (27.9273) acc5: 52.0000 (52.3636) time: 2.0841 data: 2.0649 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.6373 (3.6148) acc1: 25.6000 (25.3714) acc5: 50.4000 (50.6667) time: 1.0670 data: 1.0480 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.6760 (3.5991) acc1: 24.8000 (26.0387) acc5: 49.6000 (50.6839) time: 0.9670 data: 0.9467 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.6760 (3.6394) acc1: 25.6000 (26.1659) acc5: 49.6000 (49.9122) time: 0.6360 data: 0.6151 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.6795 (3.6541) acc1: 25.6000 (25.9040) acc5: 50.4000 (50.0000) time: 0.5791 data: 0.5590 max mem: 2905
Test: Total time: 0:00:48 (0.9651 s / it)
* Acc@1 26.876 Acc@5 50.772 loss 3.622
Accuracy of the model on the 50000 test images: 26.9%
Max accuracy: 26.88%
Epoch: [6] [ 0/625] eta: 3:29:02 lr: 0.001200 min_lr: 0.001200 loss: 4.9009 (4.9009) class_acc: 0.1836 (0.1836) weight_decay: 0.0500 (0.0500) time: 20.0682 data: 19.9396 max mem: 2905
Epoch: [6] [200/625] eta: 0:13:53 lr: 0.001264 min_lr: 0.001264 loss: 4.7983 (4.7987) class_acc: 0.1953 (0.1882) weight_decay: 0.0500 (0.0500) grad_norm: 2.6991 (2.5532) time: 1.9000 data: 1.6467 max mem: 2905
Epoch: [6] [400/625] eta: 0:07:08 lr: 0.001328 min_lr: 0.001328 loss: 4.7366 (4.7691) class_acc: 0.1992 (0.1925) weight_decay: 0.0500 (0.0500) grad_norm: 2.5836 (2.5758) time: 1.6145 data: 1.4355 max mem: 2905
Epoch: [6] [600/625] eta: 0:00:47 lr: 0.001393 min_lr: 0.001393 loss: 4.6280 (4.7387) class_acc: 0.2070 (0.1974) weight_decay: 0.0500 (0.0500) grad_norm: 2.2042 (2.5552) time: 2.0588 data: 1.8792 max mem: 2905
Epoch: [6] [624/625] eta: 0:00:01 lr: 0.001400 min_lr: 0.001400 loss: 4.6438 (4.7372) class_acc: 0.2070 (0.1974) weight_decay: 0.0500 (0.0500) grad_norm: 2.6381 (2.5659) time: 0.6559 data: 0.4986 max mem: 2905
Epoch: [6] Total time: 0:19:26 (1.8664 s / it)
Averaged stats: lr: 0.001400 min_lr: 0.001400 loss: 4.6438 (4.7416) class_acc: 0.2070 (0.1962) weight_decay: 0.0500 (0.0500) grad_norm: 2.6381 (2.5659)
Test: [ 0/50] eta: 0:09:43 loss: 3.2068 (3.2068) acc1: 33.6000 (33.6000) acc5: 59.2000 (59.2000) time: 11.6622 data: 11.6334 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 3.2586 (3.2402) acc1: 34.4000 (33.4545) acc5: 60.0000 (59.3455) time: 2.1058 data: 2.0845 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.3495 (3.3518) acc1: 30.4000 (30.8571) acc5: 57.6000 (56.6476) time: 1.1618 data: 1.1412 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.3963 (3.3467) acc1: 29.6000 (30.6065) acc5: 54.4000 (56.4903) time: 0.9146 data: 0.8941 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.3060 (3.3673) acc1: 29.6000 (30.2829) acc5: 55.2000 (55.8829) time: 0.4684 data: 0.4486 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.4843 (3.3869) acc1: 28.8000 (29.8400) acc5: 53.6000 (55.4240) time: 0.4594 data: 0.4407 max mem: 2905
Test: Total time: 0:00:45 (0.9178 s / it)
* Acc@1 30.902 Acc@5 55.788 loss 3.356
Accuracy of the model on the 50000 test images: 30.9%
Max accuracy: 30.90%
Epoch: [7] [ 0/625] eta: 4:03:25 lr: 0.001400 min_lr: 0.001400 loss: 4.6585 (4.6585) class_acc: 0.2344 (0.2344) weight_decay: 0.0500 (0.0500) time: 23.3680 data: 16.2731 max mem: 2905
Epoch: [7] [200/625] eta: 0:13:40 lr: 0.001464 min_lr: 0.001464 loss: 4.5794 (4.6002) class_acc: 0.2070 (0.2162) weight_decay: 0.0500 (0.0500) grad_norm: 2.3339 (2.6517) time: 2.0130 data: 0.0005 max mem: 2905
Epoch: [7] [400/625] eta: 0:07:09 lr: 0.001528 min_lr: 0.001528 loss: 4.5334 (4.5822) class_acc: 0.2305 (0.2191) weight_decay: 0.0500 (0.0500) grad_norm: 2.3935 (2.6114) time: 1.9204 data: 0.0005 max mem: 2905
Epoch: [7] [600/625] eta: 0:00:47 lr: 0.001593 min_lr: 0.001593 loss: 4.4742 (4.5610) class_acc: 0.2422 (0.2231) weight_decay: 0.0500 (0.0500) grad_norm: 2.2934 (2.5822) time: 2.0638 data: 0.0005 max mem: 2905
Epoch: [7] [624/625] eta: 0:00:01 lr: 0.001600 min_lr: 0.001600 loss: 4.4444 (4.5575) class_acc: 0.2383 (0.2238) weight_decay: 0.0500 (0.0500) grad_norm: 2.4272 (2.5888) time: 0.7204 data: 0.0013 max mem: 2905
Epoch: [7] Total time: 0:19:30 (1.8731 s / it)
Averaged stats: lr: 0.001600 min_lr: 0.001600 loss: 4.4444 (4.5618) class_acc: 0.2383 (0.2250) weight_decay: 0.0500 (0.0500) grad_norm: 2.4272 (2.5888)
Test: [ 0/50] eta: 0:10:50 loss: 2.9093 (2.9093) acc1: 36.8000 (36.8000) acc5: 63.2000 (63.2000) time: 13.0051 data: 12.9644 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.9527 (3.0195) acc1: 37.6000 (37.1636) acc5: 62.4000 (62.3273) time: 2.1962 data: 2.1751 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.2389 (3.1744) acc1: 35.2000 (34.0952) acc5: 58.4000 (59.2762) time: 1.1524 data: 1.1297 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.3120 (3.1727) acc1: 31.2000 (34.0129) acc5: 56.8000 (59.1226) time: 1.0619 data: 1.0387 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.2125 (3.1923) acc1: 32.0000 (34.0098) acc5: 57.6000 (58.9463) time: 0.6523 data: 0.6316 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3117 (3.2221) acc1: 32.0000 (33.6320) acc5: 57.6000 (58.6240) time: 0.6373 data: 0.6169 max mem: 2905
Test: Total time: 0:00:49 (0.9976 s / it)
* Acc@1 34.238 Acc@5 59.060 loss 3.189
Accuracy of the model on the 50000 test images: 34.2%
Max accuracy: 34.24%
Epoch: [8] [ 0/625] eta: 3:37:37 lr: 0.001600 min_lr: 0.001600 loss: 4.4698 (4.4698) class_acc: 0.2773 (0.2773) weight_decay: 0.0500 (0.0500) time: 20.8919 data: 17.4404 max mem: 2905
Epoch: [8] [200/625] eta: 0:13:38 lr: 0.001664 min_lr: 0.001664 loss: 4.4204 (4.4612) class_acc: 0.2422 (0.2399) weight_decay: 0.0500 (0.0500) grad_norm: 2.5189 (2.6102) time: 1.8606 data: 0.0007 max mem: 2905
Epoch: [8] [400/625] eta: 0:07:10 lr: 0.001728 min_lr: 0.001728 loss: 4.4202 (4.4448) class_acc: 0.2461 (0.2436) weight_decay: 0.0500 (0.0500) grad_norm: 2.3623 (2.5957) time: 1.9448 data: 0.0006 max mem: 2905
Epoch: [8] [600/625] eta: 0:00:48 lr: 0.001793 min_lr: 0.001793 loss: 4.3859 (4.4276) class_acc: 0.2617 (0.2474) weight_decay: 0.0500 (0.0500) grad_norm: 2.4784 (2.5829) time: 2.0012 data: 0.0006 max mem: 2905
Epoch: [8] [624/625] eta: 0:00:01 lr: 0.001800 min_lr: 0.001800 loss: 4.3258 (4.4244) class_acc: 0.2656 (0.2481) weight_decay: 0.0500 (0.0500) grad_norm: 2.2915 (2.5702) time: 0.7980 data: 0.0013 max mem: 2905
Epoch: [8] Total time: 0:19:42 (1.8919 s / it)
Averaged stats: lr: 0.001800 min_lr: 0.001800 loss: 4.3258 (4.4180) class_acc: 0.2656 (0.2493) weight_decay: 0.0500 (0.0500) grad_norm: 2.2915 (2.5702)
Test: [ 0/50] eta: 0:09:58 loss: 2.8935 (2.8935) acc1: 39.2000 (39.2000) acc5: 61.6000 (61.6000) time: 11.9733 data: 11.9122 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.8935 (2.9067) acc1: 40.0000 (40.0727) acc5: 67.2000 (64.8727) time: 2.2268 data: 2.2035 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 3.0836 (3.0588) acc1: 36.0000 (36.6095) acc5: 61.6000 (61.8667) time: 1.3003 data: 1.2802 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.0992 (3.0312) acc1: 34.4000 (36.9032) acc5: 60.8000 (62.0387) time: 1.1798 data: 1.1599 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0874 (3.0463) acc1: 38.4000 (36.7415) acc5: 60.8000 (61.8341) time: 0.7516 data: 0.7325 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0975 (3.0647) acc1: 35.2000 (36.2720) acc5: 60.8000 (61.3920) time: 0.6948 data: 0.6749 max mem: 2905
Test: Total time: 0:00:54 (1.0808 s / it)
* Acc@1 36.742 Acc@5 61.776 loss 3.036
Accuracy of the model on the 50000 test images: 36.7%
Max accuracy: 36.74%
Epoch: [9] [ 0/625] eta: 3:36:02 lr: 0.001800 min_lr: 0.001800 loss: 4.3272 (4.3272) class_acc: 0.2500 (0.2500) weight_decay: 0.0500 (0.0500) time: 20.7408 data: 16.6444 max mem: 2905
Epoch: [9] [200/625] eta: 0:14:23 lr: 0.001864 min_lr: 0.001864 loss: 4.2635 (4.3240) class_acc: 0.2734 (0.2654) weight_decay: 0.0500 (0.0500) grad_norm: 2.4228 (2.4175) time: 1.9545 data: 0.0091 max mem: 2905
Epoch: [9] [400/625] eta: 0:07:29 lr: 0.001929 min_lr: 0.001929 loss: 4.2678 (4.3155) class_acc: 0.2578 (0.2662) weight_decay: 0.0500 (0.0500) grad_norm: 2.1543 (2.5136) time: 1.9232 data: 0.0007 max mem: 2905
Epoch: [9] [600/625] eta: 0:00:49 lr: 0.001993 min_lr: 0.001993 loss: 4.2901 (4.3029) class_acc: 0.2617 (0.2689) weight_decay: 0.0500 (0.0500) grad_norm: 2.5049 (2.5141) time: 1.9850 data: 0.0731 max mem: 2905
Epoch: [9] [624/625] eta: 0:00:01 lr: 0.002000 min_lr: 0.002000 loss: 4.2756 (4.3026) class_acc: 0.2656 (0.2688) weight_decay: 0.0500 (0.0500) grad_norm: 2.2845 (2.5103) time: 0.7254 data: 0.0210 max mem: 2905
Epoch: [9] Total time: 0:20:10 (1.9361 s / it)
Averaged stats: lr: 0.002000 min_lr: 0.002000 loss: 4.2756 (4.2989) class_acc: 0.2656 (0.2701) weight_decay: 0.0500 (0.0500) grad_norm: 2.2845 (2.5103)
Test: [ 0/50] eta: 0:08:55 loss: 2.7651 (2.7651) acc1: 43.2000 (43.2000) acc5: 67.2000 (67.2000) time: 10.7198 data: 10.6874 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.7537 (2.7308) acc1: 40.8000 (41.3818) acc5: 67.2000 (66.6909) time: 1.9963 data: 1.9748 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.9111 (2.8878) acc1: 37.6000 (38.1333) acc5: 63.2000 (64.1905) time: 1.1607 data: 1.1401 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.9691 (2.8784) acc1: 36.0000 (38.1935) acc5: 61.6000 (64.1548) time: 1.1207 data: 1.1007 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.0237 (2.9040) acc1: 37.6000 (38.1073) acc5: 61.6000 (63.2781) time: 0.6884 data: 0.6686 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.0316 (2.9240) acc1: 36.8000 (37.6480) acc5: 60.8000 (62.9120) time: 0.5287 data: 0.5094 max mem: 2905
Test: Total time: 0:00:48 (0.9770 s / it)
* Acc@1 38.540 Acc@5 63.834 loss 2.886
Accuracy of the model on the 50000 test images: 38.5%
Max accuracy: 38.54%
Epoch: [10] [ 0/625] eta: 3:53:22 lr: 0.002000 min_lr: 0.002000 loss: 4.2604 (4.2604) class_acc: 0.2812 (0.2812) weight_decay: 0.0500 (0.0500) time: 22.4036 data: 16.9625 max mem: 2905
Epoch: [10] [200/625] eta: 0:14:06 lr: 0.002064 min_lr: 0.002064 loss: 4.2511 (4.2439) class_acc: 0.2852 (0.2789) weight_decay: 0.0500 (0.0500) grad_norm: 2.1119 (2.3423) time: 1.9231 data: 0.0280 max mem: 2905
Epoch: [10] [400/625] eta: 0:07:13 lr: 0.002129 min_lr: 0.002129 loss: 4.1512 (4.2185) class_acc: 0.2969 (0.2835) weight_decay: 0.0500 (0.0500) grad_norm: 2.5338 (2.3170) time: 1.8732 data: 0.0006 max mem: 2905
Epoch: [10] [600/625] eta: 0:00:48 lr: 0.002193 min_lr: 0.002193 loss: 4.1442 (4.2089) class_acc: 0.3008 (0.2857) weight_decay: 0.0500 (0.0500) grad_norm: 1.9480 (inf) time: 2.0251 data: 0.0221 max mem: 2905
Epoch: [10] [624/625] eta: 0:00:01 lr: 0.002200 min_lr: 0.002200 loss: 4.1306 (4.2070) class_acc: 0.2812 (0.2856) weight_decay: 0.0500 (0.0500) grad_norm: 1.9993 (inf) time: 0.7978 data: 0.0014 max mem: 2905
Epoch: [10] Total time: 0:19:35 (1.8811 s / it)
Averaged stats: lr: 0.002200 min_lr: 0.002200 loss: 4.1306 (4.2042) class_acc: 0.2812 (0.2866) weight_decay: 0.0500 (0.0500) grad_norm: 1.9993 (inf)
Test: [ 0/50] eta: 0:10:14 loss: 2.6962 (2.6962) acc1: 43.2000 (43.2000) acc5: 64.8000 (64.8000) time: 12.2856 data: 12.2574 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.6962 (2.6542) acc1: 46.4000 (44.2909) acc5: 68.0000 (68.0000) time: 2.0653 data: 2.0455 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.8521 (2.8162) acc1: 39.2000 (40.6476) acc5: 65.6000 (66.1333) time: 1.0829 data: 1.0643 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.8902 (2.8010) acc1: 37.6000 (40.8000) acc5: 65.6000 (66.2710) time: 1.0456 data: 1.0258 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8382 (2.8281) acc1: 38.4000 (40.6829) acc5: 65.6000 (65.5024) time: 0.7540 data: 0.7338 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9067 (2.8491) acc1: 37.6000 (40.1280) acc5: 65.6000 (65.2000) time: 0.7291 data: 0.7101 max mem: 2905
Test: Total time: 0:00:50 (1.0079 s / it)
* Acc@1 40.592 Acc@5 65.836 loss 2.813
Accuracy of the model on the 50000 test images: 40.6%
Max accuracy: 40.59%
Epoch: [11] [ 0/625] eta: 3:43:47 lr: 0.002200 min_lr: 0.002200 loss: 3.9268 (3.9268) class_acc: 0.3516 (0.3516) weight_decay: 0.0500 (0.0500) time: 21.4832 data: 17.0127 max mem: 2905
Epoch: [11] [200/625] eta: 0:14:26 lr: 0.002264 min_lr: 0.002264 loss: 4.1573 (4.1289) class_acc: 0.3008 (0.3006) weight_decay: 0.0500 (0.0500) grad_norm: 2.0794 (2.3988) time: 1.9358 data: 0.0012 max mem: 2905
Epoch: [11] [400/625] eta: 0:07:24 lr: 0.002329 min_lr: 0.002329 loss: 4.1037 (4.1273) class_acc: 0.2930 (0.2998) weight_decay: 0.0500 (0.0500) grad_norm: 2.4230 (2.4047) time: 1.8472 data: 0.0245 max mem: 2905
Epoch: [11] [600/625] eta: 0:00:49 lr: 0.002393 min_lr: 0.002393 loss: 4.1005 (4.1241) class_acc: 0.2891 (0.3001) weight_decay: 0.0500 (0.0500) grad_norm: 1.6508 (2.3153) time: 2.1331 data: 0.0204 max mem: 2905
Epoch: [11] [624/625] eta: 0:00:01 lr: 0.002400 min_lr: 0.002400 loss: 4.0749 (4.1226) class_acc: 0.3047 (0.3005) weight_decay: 0.0500 (0.0500) grad_norm: 2.1353 (2.3125) time: 0.7291 data: 0.0015 max mem: 2905
Epoch: [11] Total time: 0:20:11 (1.9381 s / it)
Averaged stats: lr: 0.002400 min_lr: 0.002400 loss: 4.0749 (4.1200) class_acc: 0.3047 (0.3013) weight_decay: 0.0500 (0.0500) grad_norm: 2.1353 (2.3125)
Test: [ 0/50] eta: 0:10:36 loss: 2.4763 (2.4763) acc1: 46.4000 (46.4000) acc5: 71.2000 (71.2000) time: 12.7394 data: 12.7065 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.5527 (2.5122) acc1: 46.4000 (46.8364) acc5: 71.2000 (70.9091) time: 2.1897 data: 2.1654 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.6607 (2.6901) acc1: 44.8000 (43.1238) acc5: 67.2000 (68.1524) time: 1.2351 data: 1.2141 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 2.7662 (2.6840) acc1: 40.8000 (42.6581) acc5: 65.6000 (67.9484) time: 1.3126 data: 1.2931 max mem: 2905
Test: [40/50] eta: 0:00:14 loss: 2.7282 (2.7012) acc1: 41.6000 (42.7707) acc5: 68.0000 (67.7659) time: 1.0129 data: 0.9931 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7282 (2.7157) acc1: 40.8000 (42.4960) acc5: 68.0000 (67.6000) time: 0.8635 data: 0.8440 max mem: 2905
Test: Total time: 0:00:58 (1.1762 s / it)
* Acc@1 42.952 Acc@5 67.950 loss 2.687
Accuracy of the model on the 50000 test images: 43.0%
Max accuracy: 42.95%
Epoch: [12] [ 0/625] eta: 3:48:46 lr: 0.002400 min_lr: 0.002400 loss: 4.2393 (4.2393) class_acc: 0.2773 (0.2773) weight_decay: 0.0500 (0.0500) time: 21.9627 data: 18.4014 max mem: 2905
Epoch: [12] [200/625] eta: 0:14:12 lr: 0.002464 min_lr: 0.002464 loss: 4.0651 (4.0561) class_acc: 0.3047 (0.3143) weight_decay: 0.0500 (0.0500) grad_norm: 2.0893 (2.3006) time: 1.7633 data: 0.3592 max mem: 2905
Epoch: [12] [400/625] eta: 0:07:15 lr: 0.002529 min_lr: 0.002529 loss: 4.0072 (4.0528) class_acc: 0.3164 (0.3130) weight_decay: 0.0500 (0.0500) grad_norm: 2.0570 (2.3277) time: 1.8035 data: 0.0569 max mem: 2905
Epoch: [12] [600/625] eta: 0:00:48 lr: 0.002593 min_lr: 0.002593 loss: 4.0482 (4.0528) class_acc: 0.3203 (0.3136) weight_decay: 0.0500 (0.0500) grad_norm: 1.9792 (2.3293) time: 2.1068 data: 0.0625 max mem: 2905
Epoch: [12] [624/625] eta: 0:00:01 lr: 0.002600 min_lr: 0.002600 loss: 4.0376 (4.0532) class_acc: 0.3125 (0.3135) weight_decay: 0.0500 (0.0500) grad_norm: 2.0785 (2.3239) time: 0.8143 data: 0.0186 max mem: 2905
Epoch: [12] Total time: 0:20:07 (1.9318 s / it)
Averaged stats: lr: 0.002600 min_lr: 0.002600 loss: 4.0376 (4.0536) class_acc: 0.3125 (0.3139) weight_decay: 0.0500 (0.0500) grad_norm: 2.0785 (2.3239)
Test: [ 0/50] eta: 0:11:07 loss: 2.4376 (2.4376) acc1: 46.4000 (46.4000) acc5: 71.2000 (71.2000) time: 13.3431 data: 13.3089 max mem: 2905
Test: [10/50] eta: 0:01:31 loss: 2.4719 (2.5471) acc1: 47.2000 (46.0364) acc5: 70.4000 (69.7455) time: 2.2773 data: 2.2560 max mem: 2905
Test: [20/50] eta: 0:00:55 loss: 2.7218 (2.7209) acc1: 41.6000 (41.6762) acc5: 67.2000 (67.6952) time: 1.2851 data: 1.2659 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.8736 (2.7055) acc1: 39.2000 (41.8581) acc5: 66.4000 (67.5097) time: 1.1669 data: 1.1483 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7616 (2.7235) acc1: 40.8000 (41.5024) acc5: 67.2000 (67.0829) time: 0.6777 data: 0.6549 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8082 (2.7487) acc1: 40.0000 (41.5040) acc5: 65.6000 (66.6880) time: 0.6488 data: 0.6265 max mem: 2905
Test: Total time: 0:00:53 (1.0601 s / it)
* Acc@1 42.014 Acc@5 67.190 loss 2.726
Accuracy of the model on the 50000 test images: 42.0%
Max accuracy: 42.95%
Epoch: [13] [ 0/625] eta: 4:13:58 lr: 0.002600 min_lr: 0.002600 loss: 3.8410 (3.8410) class_acc: 0.3633 (0.3633) weight_decay: 0.0500 (0.0500) time: 24.3811 data: 23.5717 max mem: 2905
Epoch: [13] [200/625] eta: 0:15:25 lr: 0.002665 min_lr: 0.002665 loss: 3.9449 (3.9901) class_acc: 0.3320 (0.3244) weight_decay: 0.0500 (0.0500) grad_norm: 2.1871 (2.3120) time: 1.8859 data: 0.0125 max mem: 2905
Epoch: [13] [400/625] eta: 0:07:40 lr: 0.002729 min_lr: 0.002729 loss: 4.0058 (3.9963) class_acc: 0.3125 (0.3242) weight_decay: 0.0500 (0.0500) grad_norm: 2.2123 (2.2978) time: 1.8758 data: 0.0232 max mem: 2905
Epoch: [13] [600/625] eta: 0:00:52 lr: 0.002793 min_lr: 0.002793 loss: 3.9816 (3.9945) class_acc: 0.3125 (0.3250) weight_decay: 0.0500 (0.0500) grad_norm: 2.0018 (2.2742) time: 2.2631 data: 0.0006 max mem: 2905
Epoch: [13] [624/625] eta: 0:00:02 lr: 0.002800 min_lr: 0.002800 loss: 3.9411 (3.9933) class_acc: 0.3320 (0.3251) weight_decay: 0.0500 (0.0500) grad_norm: 1.9408 (2.2684) time: 1.0124 data: 0.0016 max mem: 2905
Epoch: [13] Total time: 0:21:28 (2.0615 s / it)
Averaged stats: lr: 0.002800 min_lr: 0.002800 loss: 3.9411 (3.9959) class_acc: 0.3320 (0.3249) weight_decay: 0.0500 (0.0500) grad_norm: 1.9408 (2.2684)
Test: [ 0/50] eta: 0:11:37 loss: 2.5089 (2.5089) acc1: 45.6000 (45.6000) acc5: 76.8000 (76.8000) time: 13.9592 data: 13.9329 max mem: 2905
Test: [10/50] eta: 0:01:35 loss: 2.5266 (2.5857) acc1: 47.2000 (47.0545) acc5: 70.4000 (70.5455) time: 2.3968 data: 2.3777 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.6529 (2.6943) acc1: 44.0000 (43.2381) acc5: 68.8000 (69.2571) time: 1.2106 data: 1.1920 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.6981 (2.6633) acc1: 40.0000 (43.2774) acc5: 68.8000 (69.6516) time: 1.1508 data: 1.1319 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.7048 (2.6942) acc1: 40.8000 (42.6732) acc5: 68.0000 (68.8195) time: 0.9040 data: 0.8853 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7685 (2.7166) acc1: 40.8000 (42.3680) acc5: 66.4000 (68.2560) time: 0.8507 data: 0.8325 max mem: 2905
Test: Total time: 0:01:00 (1.2075 s / it)
* Acc@1 42.820 Acc@5 68.040 loss 2.698
Accuracy of the model on the 50000 test images: 42.8%
Max accuracy: 42.95%
Epoch: [14] [ 0/625] eta: 4:22:44 lr: 0.002800 min_lr: 0.002800 loss: 3.6967 (3.6967) class_acc: 0.3594 (0.3594) weight_decay: 0.0500 (0.0500) time: 25.2232 data: 25.0372 max mem: 2905
Epoch: [14] [200/625] eta: 0:15:03 lr: 0.002865 min_lr: 0.002865 loss: 3.8945 (3.9471) class_acc: 0.3359 (0.3323) weight_decay: 0.0500 (0.0500) grad_norm: 1.8518 (2.3200) time: 1.9372 data: 0.0008 max mem: 2905
Epoch: [14] [400/625] eta: 0:07:35 lr: 0.002929 min_lr: 0.002929 loss: 3.9635 (3.9485) class_acc: 0.3164 (0.3321) weight_decay: 0.0500 (0.0500) grad_norm: 2.0855 (2.2405) time: 2.0097 data: 0.0007 max mem: 2905
Epoch: [14] [600/625] eta: 0:00:50 lr: 0.002993 min_lr: 0.002993 loss: 3.9552 (3.9478) class_acc: 0.3242 (0.3329) weight_decay: 0.0500 (0.0500) grad_norm: 1.8658 (2.2335) time: 2.0167 data: 0.0008 max mem: 2905
Epoch: [14] [624/625] eta: 0:00:01 lr: 0.003000 min_lr: 0.003000 loss: 3.8815 (3.9471) class_acc: 0.3359 (0.3331) weight_decay: 0.0500 (0.0500) grad_norm: 2.0965 (2.2444) time: 0.6867 data: 0.0016 max mem: 2905
Epoch: [14] Total time: 0:20:22 (1.9556 s / it)
Averaged stats: lr: 0.003000 min_lr: 0.003000 loss: 3.8815 (3.9474) class_acc: 0.3359 (0.3333) weight_decay: 0.0500 (0.0500) grad_norm: 2.0965 (2.2444)
Test: [ 0/50] eta: 0:09:58 loss: 2.2925 (2.2925) acc1: 44.8000 (44.8000) acc5: 78.4000 (78.4000) time: 11.9628 data: 11.9327 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.3968 (2.4221) acc1: 48.0000 (48.2182) acc5: 72.8000 (72.1455) time: 2.0113 data: 1.9916 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.5434 (2.5883) acc1: 42.4000 (43.9619) acc5: 67.2000 (69.7524) time: 0.9887 data: 0.9698 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.6395 (2.5606) acc1: 40.8000 (43.7677) acc5: 68.8000 (70.0645) time: 0.9109 data: 0.8925 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.6302 (2.5910) acc1: 41.6000 (43.3951) acc5: 69.6000 (69.3268) time: 0.8205 data: 0.8026 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6377 (2.6034) acc1: 41.6000 (43.1360) acc5: 67.2000 (69.1040) time: 0.6856 data: 0.6677 max mem: 2905
Test: Total time: 0:00:52 (1.0425 s / it)
* Acc@1 44.156 Acc@5 69.362 loss 2.579
Accuracy of the model on the 50000 test images: 44.2%
Max accuracy: 44.16%
Epoch: [15] [ 0/625] eta: 3:41:04 lr: 0.003000 min_lr: 0.003000 loss: 3.8014 (3.8014) class_acc: 0.3711 (0.3711) weight_decay: 0.0500 (0.0500) time: 21.2231 data: 20.8451 max mem: 2905
Epoch: [15] [200/625] eta: 0:14:34 lr: 0.003065 min_lr: 0.003065 loss: 3.9123 (3.8989) class_acc: 0.3398 (0.3444) weight_decay: 0.0500 (0.0500) grad_norm: 1.8742 (2.2813) time: 1.9342 data: 0.0007 max mem: 2905
Epoch: [15] [400/625] eta: 0:07:34 lr: 0.003129 min_lr: 0.003129 loss: 4.0027 (3.9085) class_acc: 0.3203 (0.3407) weight_decay: 0.0500 (0.0500) grad_norm: 2.6344 (2.2771) time: 2.1218 data: 0.0007 max mem: 2905
Epoch: [15] [600/625] eta: 0:00:50 lr: 0.003193 min_lr: 0.003193 loss: 3.8798 (3.9093) class_acc: 0.3477 (0.3401) weight_decay: 0.0500 (0.0500) grad_norm: 2.2548 (2.2944) time: 2.0437 data: 0.0156 max mem: 2905
Epoch: [15] [624/625] eta: 0:00:01 lr: 0.003200 min_lr: 0.003200 loss: 3.9158 (3.9099) class_acc: 0.3359 (0.3400) weight_decay: 0.0500 (0.0500) grad_norm: 1.7865 (2.2740) time: 0.7800 data: 0.0014 max mem: 2905
Epoch: [15] Total time: 0:20:22 (1.9558 s / it)
Averaged stats: lr: 0.003200 min_lr: 0.003200 loss: 3.9158 (3.9047) class_acc: 0.3359 (0.3413) weight_decay: 0.0500 (0.0500) grad_norm: 1.7865 (2.2740)
Test: [ 0/50] eta: 0:10:48 loss: 2.4163 (2.4163) acc1: 52.8000 (52.8000) acc5: 72.8000 (72.8000) time: 12.9715 data: 12.9431 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 2.4163 (2.3852) acc1: 49.6000 (49.6727) acc5: 73.6000 (72.9455) time: 2.2191 data: 2.1984 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.6175 (2.5718) acc1: 44.0000 (44.9143) acc5: 69.6000 (70.2857) time: 1.1909 data: 1.1703 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.6738 (2.5493) acc1: 42.4000 (44.8516) acc5: 67.2000 (70.0645) time: 1.1545 data: 1.1335 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5792 (2.5740) acc1: 42.4000 (44.2732) acc5: 68.8000 (69.7756) time: 0.7458 data: 0.7242 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5827 (2.5912) acc1: 42.4000 (44.1600) acc5: 68.8000 (69.5680) time: 0.6467 data: 0.6245 max mem: 2905
Test: Total time: 0:00:52 (1.0484 s / it)
* Acc@1 44.670 Acc@5 69.832 loss 2.567
Accuracy of the model on the 50000 test images: 44.7%
Max accuracy: 44.67%
Epoch: [16] [ 0/625] eta: 4:06:53 lr: 0.003201 min_lr: 0.003201 loss: 3.8835 (3.8835) class_acc: 0.3281 (0.3281) weight_decay: 0.0500 (0.0500) time: 23.7022 data: 23.5759 max mem: 2905
Epoch: [16] [200/625] eta: 0:14:28 lr: 0.003265 min_lr: 0.003265 loss: 3.8391 (3.8736) class_acc: 0.3516 (0.3472) weight_decay: 0.0500 (0.0500) grad_norm: 2.2771 (2.2100) time: 1.9181 data: 0.0462 max mem: 2905
Epoch: [16] [400/625] eta: 0:07:21 lr: 0.003329 min_lr: 0.003329 loss: 3.8863 (3.8710) class_acc: 0.3477 (0.3481) weight_decay: 0.0500 (0.0500) grad_norm: 2.0969 (2.1265) time: 1.9864 data: 0.1882 max mem: 2905
Epoch: [16] [600/625] eta: 0:00:48 lr: 0.003393 min_lr: 0.003393 loss: 3.8983 (3.8678) class_acc: 0.3477 (0.3492) weight_decay: 0.0500 (0.0500) grad_norm: 2.1502 (2.1549) time: 2.0331 data: 0.0263 max mem: 2905
Epoch: [16] [624/625] eta: 0:00:01 lr: 0.003400 min_lr: 0.003400 loss: 3.8396 (3.8680) class_acc: 0.3516 (0.3491) weight_decay: 0.0500 (0.0500) grad_norm: 1.6281 (2.1470) time: 0.7000 data: 0.0013 max mem: 2905
Epoch: [16] Total time: 0:19:51 (1.9058 s / it)
Averaged stats: lr: 0.003400 min_lr: 0.003400 loss: 3.8396 (3.8673) class_acc: 0.3516 (0.3485) weight_decay: 0.0500 (0.0500) grad_norm: 1.6281 (2.1470)
Test: [ 0/50] eta: 0:10:17 loss: 2.2174 (2.2174) acc1: 55.2000 (55.2000) acc5: 76.0000 (76.0000) time: 12.3581 data: 12.3300 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.3906 (2.3496) acc1: 50.4000 (50.4000) acc5: 75.2000 (73.8182) time: 1.9922 data: 1.9721 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.4548 (2.5181) acc1: 44.8000 (45.7524) acc5: 71.2000 (71.4286) time: 0.9715 data: 0.9526 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.6267 (2.5054) acc1: 40.8000 (45.4968) acc5: 69.6000 (71.3806) time: 0.9471 data: 0.9276 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5787 (2.5398) acc1: 43.2000 (45.0146) acc5: 70.4000 (70.8488) time: 0.7296 data: 0.7099 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5787 (2.5577) acc1: 43.2000 (44.6720) acc5: 69.6000 (70.3680) time: 0.6492 data: 0.6307 max mem: 2905
Test: Total time: 0:00:49 (0.9956 s / it)
* Acc@1 46.040 Acc@5 71.156 loss 2.523
Accuracy of the model on the 50000 test images: 46.0%
Max accuracy: 46.04%
Epoch: [17] [ 0/625] eta: 3:13:39 lr: 0.003401 min_lr: 0.003401 loss: 3.8910 (3.8910) class_acc: 0.3594 (0.3594) weight_decay: 0.0500 (0.0500) time: 18.5909 data: 18.4678 max mem: 2905
Epoch: [17] [200/625] eta: 0:14:01 lr: 0.003465 min_lr: 0.003465 loss: 3.8857 (3.8389) class_acc: 0.3477 (0.3538) weight_decay: 0.0500 (0.0500) grad_norm: 1.8196 (2.2447) time: 1.9227 data: 1.6070 max mem: 2905
Epoch: [17] [400/625] eta: 0:07:11 lr: 0.003529 min_lr: 0.003529 loss: 3.7862 (3.8369) class_acc: 0.3555 (0.3550) weight_decay: 0.0500 (0.0500) grad_norm: 1.8312 (inf) time: 1.8386 data: 1.6031 max mem: 2905
Epoch: [17] [600/625] eta: 0:00:47 lr: 0.003593 min_lr: 0.003593 loss: 3.8457 (3.8430) class_acc: 0.3516 (0.3534) weight_decay: 0.0500 (0.0500) grad_norm: 1.5092 (inf) time: 1.8567 data: 1.6188 max mem: 2905
Epoch: [17] [624/625] eta: 0:00:01 lr: 0.003600 min_lr: 0.003600 loss: 3.8486 (3.8432) class_acc: 0.3477 (0.3534) weight_decay: 0.0500 (0.0500) grad_norm: 2.1188 (inf) time: 0.7762 data: 0.6235 max mem: 2905
Epoch: [17] Total time: 0:19:21 (1.8588 s / it)
Averaged stats: lr: 0.003600 min_lr: 0.003600 loss: 3.8486 (3.8353) class_acc: 0.3477 (0.3548) weight_decay: 0.0500 (0.0500) grad_norm: 2.1188 (inf)
Test: [ 0/50] eta: 0:10:14 loss: 2.5408 (2.5408) acc1: 40.8000 (40.8000) acc5: 71.2000 (71.2000) time: 12.2809 data: 12.2577 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.5408 (2.5432) acc1: 46.4000 (45.6727) acc5: 68.8000 (69.8182) time: 2.0601 data: 2.0388 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.7378 (2.7221) acc1: 42.4000 (41.4095) acc5: 67.2000 (67.0476) time: 1.0946 data: 1.0739 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.8656 (2.7270) acc1: 39.2000 (41.2903) acc5: 64.0000 (66.9936) time: 1.0673 data: 1.0479 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8656 (2.7829) acc1: 39.2000 (40.3902) acc5: 64.0000 (65.8146) time: 0.6944 data: 0.6748 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8625 (2.7906) acc1: 40.0000 (40.6880) acc5: 64.0000 (65.7600) time: 0.6024 data: 0.5829 max mem: 2905
Test: Total time: 0:00:48 (0.9757 s / it)
* Acc@1 41.648 Acc@5 66.760 loss 2.742
Accuracy of the model on the 50000 test images: 41.6%
Max accuracy: 46.04%
Epoch: [18] [ 0/625] eta: 3:33:38 lr: 0.003601 min_lr: 0.003601 loss: 3.7695 (3.7695) class_acc: 0.3867 (0.3867) weight_decay: 0.0500 (0.0500) time: 20.5104 data: 18.1215 max mem: 2905
Epoch: [18] [200/625] eta: 0:13:40 lr: 0.003665 min_lr: 0.003665 loss: 3.7912 (3.8065) class_acc: 0.3633 (0.3593) weight_decay: 0.0500 (0.0500) grad_norm: 1.4906 (1.9935) time: 1.7403 data: 0.0926 max mem: 2905
Epoch: [18] [400/625] eta: 0:07:02 lr: 0.003729 min_lr: 0.003729 loss: 3.8296 (3.8077) class_acc: 0.3594 (0.3597) weight_decay: 0.0500 (0.0500) grad_norm: 1.6810 (2.0978) time: 1.9347 data: 0.0006 max mem: 2905
Epoch: [18] [600/625] eta: 0:00:47 lr: 0.003793 min_lr: 0.003793 loss: 3.7772 (3.8113) class_acc: 0.3438 (0.3588) weight_decay: 0.0500 (0.0500) grad_norm: 1.7198 (2.0771) time: 2.0009 data: 0.0006 max mem: 2905
Epoch: [18] [624/625] eta: 0:00:01 lr: 0.003800 min_lr: 0.003800 loss: 3.8208 (3.8127) class_acc: 0.3516 (0.3589) weight_decay: 0.0500 (0.0500) grad_norm: 2.2762 (2.1282) time: 0.7129 data: 0.0014 max mem: 2905
Epoch: [18] Total time: 0:19:17 (1.8526 s / it)
Averaged stats: lr: 0.003800 min_lr: 0.003800 loss: 3.8208 (3.8136) class_acc: 0.3516 (0.3592) weight_decay: 0.0500 (0.0500) grad_norm: 2.2762 (2.1282)
Test: [ 0/50] eta: 0:09:54 loss: 2.6247 (2.6247) acc1: 45.6000 (45.6000) acc5: 70.4000 (70.4000) time: 11.8913 data: 11.8576 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.6708 (2.6645) acc1: 44.0000 (43.0545) acc5: 68.8000 (68.4364) time: 2.0214 data: 2.0011 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.8277 (2.7971) acc1: 38.4000 (40.0381) acc5: 66.4000 (66.6667) time: 1.0846 data: 1.0656 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.8516 (2.7834) acc1: 38.4000 (40.4903) acc5: 64.8000 (66.4000) time: 1.0508 data: 1.0310 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.7912 (2.8077) acc1: 40.8000 (39.9220) acc5: 63.2000 (66.0293) time: 0.7072 data: 0.6882 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8734 (2.8130) acc1: 36.8000 (39.8080) acc5: 65.6000 (66.0320) time: 0.6628 data: 0.6440 max mem: 2905
Test: Total time: 0:00:48 (0.9739 s / it)
* Acc@1 41.306 Acc@5 66.252 loss 2.793
Accuracy of the model on the 50000 test images: 41.3%
Max accuracy: 46.04%
Epoch: [19] [ 0/625] eta: 3:27:37 lr: 0.003801 min_lr: 0.003801 loss: 3.7461 (3.7461) class_acc: 0.3789 (0.3789) weight_decay: 0.0500 (0.0500) time: 19.9313 data: 19.0259 max mem: 2905
Epoch: [19] [200/625] eta: 0:13:18 lr: 0.003865 min_lr: 0.003865 loss: 3.7814 (3.7913) class_acc: 0.3633 (0.3657) weight_decay: 0.0500 (0.0500) grad_norm: 1.7745 (2.0692) time: 1.7525 data: 0.0081 max mem: 2905
Epoch: [19] [400/625] eta: 0:06:59 lr: 0.003929 min_lr: 0.003929 loss: 3.7692 (3.7926) class_acc: 0.3672 (0.3659) weight_decay: 0.0500 (0.0500) grad_norm: 1.5919 (2.0602) time: 1.7423 data: 0.0006 max mem: 2905
Epoch: [19] [600/625] eta: 0:00:46 lr: 0.003993 min_lr: 0.003993 loss: 3.7743 (3.7874) class_acc: 0.3672 (0.3665) weight_decay: 0.0500 (0.0500) grad_norm: 1.8381 (2.0593) time: 1.7971 data: 0.0006 max mem: 2905
Epoch: [19] [624/625] eta: 0:00:01 lr: 0.004000 min_lr: 0.004000 loss: 3.7813 (3.7878) class_acc: 0.3633 (0.3663) weight_decay: 0.0500 (0.0500) grad_norm: 1.7304 (2.0414) time: 0.9912 data: 0.0013 max mem: 2905
Epoch: [19] Total time: 0:19:13 (1.8450 s / it)
Averaged stats: lr: 0.004000 min_lr: 0.004000 loss: 3.7813 (3.7908) class_acc: 0.3633 (0.3636) weight_decay: 0.0500 (0.0500) grad_norm: 1.7304 (2.0414)
Test: [ 0/50] eta: 0:10:33 loss: 2.4847 (2.4847) acc1: 39.2000 (39.2000) acc5: 69.6000 (69.6000) time: 12.6661 data: 12.6366 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.3167 (2.3055) acc1: 51.2000 (48.5818) acc5: 73.6000 (74.7636) time: 2.2625 data: 2.2422 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.5198 (2.4631) acc1: 45.6000 (45.0286) acc5: 72.0000 (72.3048) time: 1.2663 data: 1.2454 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.6153 (2.4630) acc1: 43.2000 (45.3161) acc5: 69.6000 (71.8194) time: 1.0732 data: 1.0521 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5907 (2.4913) acc1: 44.8000 (45.3659) acc5: 68.0000 (71.2781) time: 0.5919 data: 0.5697 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6554 (2.5000) acc1: 44.8000 (45.2480) acc5: 68.0000 (70.8800) time: 0.5238 data: 0.5017 max mem: 2905
Test: Total time: 0:00:50 (1.0104 s / it)
* Acc@1 46.238 Acc@5 71.376 loss 2.472
Accuracy of the model on the 50000 test images: 46.2%
Max accuracy: 46.24%
Epoch: [20] [ 0/625] eta: 3:41:23 lr: 0.004000 min_lr: 0.004000 loss: 3.5793 (3.5793) class_acc: 0.3906 (0.3906) weight_decay: 0.0500 (0.0500) time: 21.2543 data: 16.9177 max mem: 2905
Epoch: [20] [200/625] eta: 0:14:01 lr: 0.004000 min_lr: 0.004000 loss: 3.7860 (3.7508) class_acc: 0.3516 (0.3733) weight_decay: 0.0500 (0.0500) grad_norm: 1.7203 (1.9910) time: 1.8292 data: 0.0011 max mem: 2905
Epoch: [20] [400/625] eta: 0:07:15 lr: 0.004000 min_lr: 0.004000 loss: 3.7795 (3.7539) class_acc: 0.3711 (0.3730) weight_decay: 0.0500 (0.0500) grad_norm: 1.7622 (2.0548) time: 1.9608 data: 0.0008 max mem: 2905
Epoch: [20] [600/625] eta: 0:00:48 lr: 0.004000 min_lr: 0.004000 loss: 3.7585 (3.7597) class_acc: 0.3672 (0.3716) weight_decay: 0.0500 (0.0500) grad_norm: 1.5971 (2.0682) time: 1.8815 data: 0.0011 max mem: 2905
Epoch: [20] [624/625] eta: 0:00:01 lr: 0.004000 min_lr: 0.004000 loss: 3.8036 (3.7596) class_acc: 0.3633 (0.3715) weight_decay: 0.0500 (0.0500) grad_norm: 1.7657 (2.0750) time: 0.7967 data: 0.0013 max mem: 2905
Epoch: [20] Total time: 0:19:32 (1.8759 s / it)
Averaged stats: lr: 0.004000 min_lr: 0.004000 loss: 3.8036 (3.7625) class_acc: 0.3633 (0.3694) weight_decay: 0.0500 (0.0500) grad_norm: 1.7657 (2.0750)
Test: [ 0/50] eta: 0:10:29 loss: 3.0842 (3.0842) acc1: 32.8000 (32.8000) acc5: 60.8000 (60.8000) time: 12.5883 data: 12.5592 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.8227 (2.7515) acc1: 44.0000 (41.7455) acc5: 65.6000 (67.2000) time: 2.0425 data: 2.0229 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.8696 (2.9127) acc1: 39.2000 (38.8571) acc5: 64.8000 (65.2571) time: 0.9919 data: 0.9734 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.0171 (2.8860) acc1: 36.8000 (39.7419) acc5: 62.4000 (64.8258) time: 0.9784 data: 0.9600 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8158 (2.9152) acc1: 36.8000 (38.9073) acc5: 61.6000 (64.4293) time: 0.7555 data: 0.7357 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8438 (2.9174) acc1: 36.8000 (38.8960) acc5: 64.8000 (64.5600) time: 0.5881 data: 0.5677 max mem: 2905
Test: Total time: 0:00:50 (1.0061 s / it)
* Acc@1 39.712 Acc@5 64.622 loss 2.871
Accuracy of the model on the 50000 test images: 39.7%
Max accuracy: 46.24%
Epoch: [21] [ 0/625] eta: 4:05:07 lr: 0.004000 min_lr: 0.004000 loss: 3.7904 (3.7904) class_acc: 0.3672 (0.3672) weight_decay: 0.0500 (0.0500) time: 23.5315 data: 17.3222 max mem: 2905
Epoch: [21] [200/625] eta: 0:14:13 lr: 0.004000 min_lr: 0.004000 loss: 3.6553 (3.7189) class_acc: 0.3828 (0.3741) weight_decay: 0.0500 (0.0500) grad_norm: 1.8959 (1.9880) time: 2.0153 data: 0.0016 max mem: 2905
Epoch: [21] [400/625] eta: 0:07:19 lr: 0.004000 min_lr: 0.004000 loss: 3.6694 (3.7235) class_acc: 0.3750 (0.3756) weight_decay: 0.0500 (0.0500) grad_norm: 2.2716 (2.1337) time: 1.9323 data: 0.0006 max mem: 2905
Epoch: [21] [600/625] eta: 0:00:48 lr: 0.004000 min_lr: 0.004000 loss: 3.7389 (3.7282) class_acc: 0.3750 (0.3741) weight_decay: 0.0500 (0.0500) grad_norm: 1.7289 (2.0225) time: 1.9361 data: 0.0006 max mem: 2905
Epoch: [21] [624/625] eta: 0:00:01 lr: 0.003999 min_lr: 0.003999 loss: 3.6977 (3.7290) class_acc: 0.3750 (0.3740) weight_decay: 0.0500 (0.0500) grad_norm: 1.8270 (2.0189) time: 0.8300 data: 0.0016 max mem: 2905
Epoch: [21] Total time: 0:19:52 (1.9078 s / it)
Averaged stats: lr: 0.003999 min_lr: 0.003999 loss: 3.6977 (3.7321) class_acc: 0.3750 (0.3755) weight_decay: 0.0500 (0.0500) grad_norm: 1.8270 (2.0189)
Test: [ 0/50] eta: 0:10:33 loss: 2.4033 (2.4033) acc1: 46.4000 (46.4000) acc5: 70.4000 (70.4000) time: 12.6693 data: 12.6438 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.3841 (2.4315) acc1: 47.2000 (47.6364) acc5: 71.2000 (70.9818) time: 2.2617 data: 2.2409 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.5151 (2.5363) acc1: 46.4000 (45.3333) acc5: 71.2000 (70.4381) time: 1.2458 data: 1.2264 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.6047 (2.5368) acc1: 44.0000 (45.2129) acc5: 70.4000 (70.3226) time: 1.0357 data: 1.0170 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.6984 (2.5929) acc1: 43.2000 (44.3317) acc5: 68.0000 (69.2293) time: 0.5806 data: 0.5621 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5740 (2.5983) acc1: 44.0000 (44.4480) acc5: 68.8000 (69.2640) time: 0.5218 data: 0.5029 max mem: 2905
Test: Total time: 0:00:50 (1.0020 s / it)
* Acc@1 44.660 Acc@5 69.776 loss 2.574
Accuracy of the model on the 50000 test images: 44.7%
Max accuracy: 46.24%
Epoch: [22] [ 0/625] eta: 3:27:10 lr: 0.003999 min_lr: 0.003999 loss: 3.8757 (3.8757) class_acc: 0.3516 (0.3516) weight_decay: 0.0500 (0.0500) time: 19.8884 data: 18.6958 max mem: 2905
Epoch: [22] [200/625] eta: 0:13:47 lr: 0.003999 min_lr: 0.003999 loss: 3.7881 (3.7122) class_acc: 0.3672 (0.3787) weight_decay: 0.0500 (0.0500) grad_norm: 1.7065 (1.9623) time: 1.8335 data: 0.0006 max mem: 2905
Epoch: [22] [400/625] eta: 0:07:11 lr: 0.003999 min_lr: 0.003999 loss: 3.6754 (3.7072) class_acc: 0.3750 (0.3799) weight_decay: 0.0500 (0.0500) grad_norm: 1.7173 (1.9558) time: 1.8211 data: 0.0008 max mem: 2905
Epoch: [22] [600/625] eta: 0:00:47 lr: 0.003999 min_lr: 0.003999 loss: 3.6931 (3.7025) class_acc: 0.3828 (0.3805) weight_decay: 0.0500 (0.0500) grad_norm: 1.7012 (1.9935) time: 1.7528 data: 0.0110 max mem: 2905
Epoch: [22] [624/625] eta: 0:00:01 lr: 0.003999 min_lr: 0.003999 loss: 3.6958 (3.7020) class_acc: 0.3906 (0.3807) weight_decay: 0.0500 (0.0500) grad_norm: 1.8089 (1.9925) time: 0.8893 data: 0.0082 max mem: 2905
Epoch: [22] Total time: 0:19:43 (1.8940 s / it)
Averaged stats: lr: 0.003999 min_lr: 0.003999 loss: 3.6958 (3.7013) class_acc: 0.3906 (0.3812) weight_decay: 0.0500 (0.0500) grad_norm: 1.8089 (1.9925)
Test: [ 0/50] eta: 0:09:19 loss: 2.1642 (2.1642) acc1: 53.6000 (53.6000) acc5: 73.6000 (73.6000) time: 11.1931 data: 11.1673 max mem: 2905
Test: [10/50] eta: 0:01:08 loss: 2.2309 (2.1733) acc1: 53.6000 (53.8182) acc5: 75.2000 (76.0000) time: 1.7047 data: 1.6841 max mem: 2905
Test: [20/50] eta: 0:00:41 loss: 2.2977 (2.3197) acc1: 49.6000 (49.9048) acc5: 72.8000 (74.2857) time: 0.8950 data: 0.8757 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.4963 (2.3411) acc1: 45.6000 (49.4452) acc5: 72.0000 (73.4710) time: 1.0874 data: 1.0689 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.4343 (2.3794) acc1: 47.2000 (48.4293) acc5: 71.2000 (72.8390) time: 0.9650 data: 0.9471 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.3533 (2.3700) acc1: 48.0000 (48.4800) acc5: 72.0000 (73.0880) time: 0.5751 data: 0.5571 max mem: 2905
Test: Total time: 0:00:52 (1.0429 s / it)
* Acc@1 48.866 Acc@5 73.468 loss 2.344
Accuracy of the model on the 50000 test images: 48.9%
Max accuracy: 48.87%
Epoch: [23] [ 0/625] eta: 3:39:53 lr: 0.003999 min_lr: 0.003999 loss: 3.5973 (3.5973) class_acc: 0.4336 (0.4336) weight_decay: 0.0500 (0.0500) time: 21.1095 data: 20.3723 max mem: 2905
Epoch: [23] [200/625] eta: 0:14:14 lr: 0.003999 min_lr: 0.003999 loss: 3.6491 (3.6711) class_acc: 0.3789 (0.3890) weight_decay: 0.0500 (0.0500) grad_norm: 1.9880 (2.2326) time: 1.9598 data: 0.6415 max mem: 2905
Epoch: [23] [400/625] eta: 0:07:16 lr: 0.003998 min_lr: 0.003998 loss: 3.7368 (3.6840) class_acc: 0.3750 (0.3860) weight_decay: 0.0500 (0.0500) grad_norm: 1.6813 (2.1713) time: 1.8327 data: 0.0009 max mem: 2905
Epoch: [23] [600/625] eta: 0:00:48 lr: 0.003998 min_lr: 0.003998 loss: 3.6761 (3.6815) class_acc: 0.3789 (0.3858) weight_decay: 0.0500 (0.0500) grad_norm: 1.5312 (inf) time: 1.8845 data: 0.0417 max mem: 2905
Epoch: [23] [624/625] eta: 0:00:01 lr: 0.003998 min_lr: 0.003998 loss: 3.6482 (3.6813) class_acc: 0.3789 (0.3858) weight_decay: 0.0500 (0.0500) grad_norm: 1.7138 (inf) time: 0.8022 data: 0.0152 max mem: 2905
Epoch: [23] Total time: 0:19:34 (1.8799 s / it)
Averaged stats: lr: 0.003998 min_lr: 0.003998 loss: 3.6482 (3.6764) class_acc: 0.3789 (0.3861) weight_decay: 0.0500 (0.0500) grad_norm: 1.7138 (inf)
Test: [ 0/50] eta: 0:10:17 loss: 2.3520 (2.3520) acc1: 49.6000 (49.6000) acc5: 72.8000 (72.8000) time: 12.3509 data: 12.3117 max mem: 2905
Test: [10/50] eta: 0:01:09 loss: 2.3520 (2.3535) acc1: 48.8000 (48.9455) acc5: 72.0000 (73.0909) time: 1.7484 data: 1.7276 max mem: 2905
Test: [20/50] eta: 0:00:38 loss: 2.5317 (2.5242) acc1: 44.0000 (45.4476) acc5: 71.2000 (70.1714) time: 0.7383 data: 0.7184 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.6113 (2.5221) acc1: 43.2000 (45.2903) acc5: 68.0000 (70.1419) time: 0.9622 data: 0.9422 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.6022 (2.5431) acc1: 43.2000 (44.9561) acc5: 68.0000 (70.0488) time: 0.8729 data: 0.8520 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6396 (2.5664) acc1: 43.2000 (44.6560) acc5: 68.8000 (69.9200) time: 0.4563 data: 0.4363 max mem: 2905
Test: Total time: 0:00:47 (0.9547 s / it)
* Acc@1 45.160 Acc@5 70.234 loss 2.548
Accuracy of the model on the 50000 test images: 45.2%
Max accuracy: 48.87%
Epoch: [24] [ 0/625] eta: 4:03:15 lr: 0.003998 min_lr: 0.003998 loss: 3.6114 (3.6114) class_acc: 0.4141 (0.4141) weight_decay: 0.0500 (0.0500) time: 23.3534 data: 19.9394 max mem: 2905
Epoch: [24] [200/625] eta: 0:14:07 lr: 0.003998 min_lr: 0.003998 loss: 3.6764 (3.6450) class_acc: 0.3828 (0.3905) weight_decay: 0.0500 (0.0500) grad_norm: 1.4303 (2.1383) time: 1.8211 data: 0.0007 max mem: 2905
Epoch: [24] [400/625] eta: 0:07:12 lr: 0.003997 min_lr: 0.003997 loss: 3.6918 (3.6442) class_acc: 0.3789 (0.3912) weight_decay: 0.0500 (0.0500) grad_norm: 1.6180 (2.1398) time: 1.8983 data: 0.0186 max mem: 2905
Epoch: [24] [600/625] eta: 0:00:48 lr: 0.003997 min_lr: 0.003997 loss: 3.5939 (3.6418) class_acc: 0.3828 (0.3918) weight_decay: 0.0500 (0.0500) grad_norm: 1.8641 (2.1125) time: 2.1217 data: 0.0462 max mem: 2905
Epoch: [24] [624/625] eta: 0:00:01 lr: 0.003997 min_lr: 0.003997 loss: 3.6427 (3.6419) class_acc: 0.3906 (0.3918) weight_decay: 0.0500 (0.0500) grad_norm: 1.6812 (2.1027) time: 0.6184 data: 0.0013 max mem: 2905
Epoch: [24] Total time: 0:20:08 (1.9335 s / it)
Averaged stats: lr: 0.003997 min_lr: 0.003997 loss: 3.6427 (3.6528) class_acc: 0.3906 (0.3904) weight_decay: 0.0500 (0.0500) grad_norm: 1.6812 (2.1027)
Test: [ 0/50] eta: 0:11:01 loss: 2.1656 (2.1656) acc1: 52.8000 (52.8000) acc5: 79.2000 (79.2000) time: 13.2280 data: 13.1921 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.1610 (2.1598) acc1: 54.4000 (54.2545) acc5: 78.4000 (76.4364) time: 2.1764 data: 2.1563 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.3438 (2.3316) acc1: 50.4000 (49.6381) acc5: 73.6000 (74.5143) time: 1.1365 data: 1.1169 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.4677 (2.3443) acc1: 44.8000 (48.9290) acc5: 72.8000 (74.0645) time: 1.1834 data: 1.1620 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.3905 (2.3670) acc1: 46.4000 (48.8976) acc5: 72.0000 (73.4634) time: 0.9036 data: 0.8819 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.3807 (2.3688) acc1: 48.8000 (49.2160) acc5: 72.0000 (73.4560) time: 0.8287 data: 0.8077 max mem: 2905
Test: Total time: 0:00:54 (1.0976 s / it)
* Acc@1 48.894 Acc@5 73.396 loss 2.367
Accuracy of the model on the 50000 test images: 48.9%
Max accuracy: 48.89%
Epoch: [25] [ 0/625] eta: 4:01:03 lr: 0.003997 min_lr: 0.003997 loss: 3.5686 (3.5686) class_acc: 0.3945 (0.3945) weight_decay: 0.0500 (0.0500) time: 23.1420 data: 22.6885 max mem: 2905
Epoch: [25] [200/625] eta: 0:14:33 lr: 0.003996 min_lr: 0.003996 loss: 3.6051 (3.6239) class_acc: 0.4023 (0.3945) weight_decay: 0.0500 (0.0500) grad_norm: 1.9610 (2.0712) time: 1.9828 data: 0.0008 max mem: 2905
Epoch: [25] [400/625] eta: 0:07:28 lr: 0.003996 min_lr: 0.003996 loss: 3.6724 (3.6338) class_acc: 0.3750 (0.3932) weight_decay: 0.0500 (0.0500) grad_norm: 2.1317 (2.0402) time: 1.9872 data: 0.0009 max mem: 2905
Epoch: [25] [600/625] eta: 0:00:49 lr: 0.003996 min_lr: 0.003996 loss: 3.6125 (3.6347) class_acc: 0.3711 (0.3930) weight_decay: 0.0500 (0.0500) grad_norm: 2.1590 (2.0136) time: 1.9020 data: 0.0012 max mem: 2905
Epoch: [25] [624/625] eta: 0:00:01 lr: 0.003995 min_lr: 0.003995 loss: 3.6628 (3.6352) class_acc: 0.3867 (0.3928) weight_decay: 0.0500 (0.0500) grad_norm: 1.7407 (2.0130) time: 0.7558 data: 0.0026 max mem: 2905
Epoch: [25] Total time: 0:20:02 (1.9239 s / it)
Averaged stats: lr: 0.003995 min_lr: 0.003995 loss: 3.6628 (3.6307) class_acc: 0.3867 (0.3951) weight_decay: 0.0500 (0.0500) grad_norm: 1.7407 (2.0130)
Test: [ 0/50] eta: 0:09:56 loss: 2.2252 (2.2252) acc1: 52.0000 (52.0000) acc5: 78.4000 (78.4000) time: 11.9393 data: 11.9016 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.3495 (2.3328) acc1: 49.6000 (50.1091) acc5: 73.6000 (73.4545) time: 2.0886 data: 2.0668 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.4747 (2.4821) acc1: 47.2000 (46.3238) acc5: 71.2000 (72.1143) time: 1.1564 data: 1.1366 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.5827 (2.4736) acc1: 43.2000 (46.0645) acc5: 69.6000 (72.1032) time: 1.0338 data: 1.0141 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5845 (2.4907) acc1: 44.8000 (46.1463) acc5: 68.0000 (71.5902) time: 0.5820 data: 0.5610 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4665 (2.4988) acc1: 45.6000 (46.0480) acc5: 71.2000 (71.3280) time: 0.4792 data: 0.4575 max mem: 2905
Test: Total time: 0:00:47 (0.9536 s / it)
* Acc@1 46.482 Acc@5 71.380 loss 2.475
Accuracy of the model on the 50000 test images: 46.5%
Max accuracy: 48.89%
Epoch: [26] [ 0/625] eta: 3:26:09 lr: 0.003995 min_lr: 0.003995 loss: 3.7965 (3.7965) class_acc: 0.3867 (0.3867) weight_decay: 0.0500 (0.0500) time: 19.7910 data: 18.9403 max mem: 2905
Epoch: [26] [200/625] eta: 0:13:49 lr: 0.003995 min_lr: 0.003995 loss: 3.6161 (3.5845) class_acc: 0.4062 (0.4035) weight_decay: 0.0500 (0.0500) grad_norm: 1.5103 (2.1486) time: 2.0134 data: 0.7893 max mem: 2905
Epoch: [26] [400/625] eta: 0:07:11 lr: 0.003994 min_lr: 0.003994 loss: 3.5623 (3.5993) class_acc: 0.4023 (0.4004) weight_decay: 0.0500 (0.0500) grad_norm: 1.4205 (2.0950) time: 1.9136 data: 0.5641 max mem: 2905
Epoch: [26] [600/625] eta: 0:00:48 lr: 0.003994 min_lr: 0.003994 loss: 3.5630 (3.6026) class_acc: 0.3984 (0.3999) weight_decay: 0.0500 (0.0500) grad_norm: 1.7351 (2.1099) time: 1.8916 data: 0.0527 max mem: 2905
Epoch: [26] [624/625] eta: 0:00:01 lr: 0.003994 min_lr: 0.003994 loss: 3.5775 (3.6028) class_acc: 0.3984 (0.3997) weight_decay: 0.0500 (0.0500) grad_norm: 1.6100 (2.0984) time: 0.6861 data: 0.0291 max mem: 2905
Epoch: [26] Total time: 0:19:50 (1.9047 s / it)
Averaged stats: lr: 0.003994 min_lr: 0.003994 loss: 3.5775 (3.6107) class_acc: 0.3984 (0.3985) weight_decay: 0.0500 (0.0500) grad_norm: 1.6100 (2.0984)
Test: [ 0/50] eta: 0:10:45 loss: 2.3278 (2.3278) acc1: 48.8000 (48.8000) acc5: 74.4000 (74.4000) time: 12.9197 data: 12.8940 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.3480 (2.3127) acc1: 48.8000 (49.0182) acc5: 74.4000 (74.4000) time: 2.1292 data: 2.1102 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.4423 (2.4436) acc1: 46.4000 (46.5524) acc5: 71.2000 (72.4952) time: 1.0935 data: 1.0744 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.5074 (2.4321) acc1: 45.6000 (46.9936) acc5: 70.4000 (72.2839) time: 1.1002 data: 1.0800 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.4564 (2.4516) acc1: 45.6000 (46.9268) acc5: 68.8000 (71.8439) time: 0.7694 data: 0.7475 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5118 (2.4673) acc1: 44.8000 (46.4320) acc5: 68.0000 (71.2640) time: 0.7267 data: 0.7061 max mem: 2905
Test: Total time: 0:00:50 (1.0174 s / it)
* Acc@1 47.270 Acc@5 72.132 loss 2.443
Accuracy of the model on the 50000 test images: 47.3%
Max accuracy: 48.89%
Epoch: [27] [ 0/625] eta: 3:49:47 lr: 0.003994 min_lr: 0.003994 loss: 3.5339 (3.5339) class_acc: 0.4141 (0.4141) weight_decay: 0.0500 (0.0500) time: 22.0595 data: 15.4624 max mem: 2905
Epoch: [27] [200/625] eta: 0:14:15 lr: 0.003993 min_lr: 0.003993 loss: 3.5980 (3.5782) class_acc: 0.3984 (0.4067) weight_decay: 0.0500 (0.0500) grad_norm: 2.2039 (inf) time: 1.8993 data: 0.0070 max mem: 2905
Epoch: [27] [400/625] eta: 0:07:14 lr: 0.003993 min_lr: 0.003993 loss: 3.6139 (3.5753) class_acc: 0.3867 (0.4067) weight_decay: 0.0500 (0.0500) grad_norm: 1.8785 (inf) time: 1.8946 data: 0.0008 max mem: 2905
Epoch: [27] [600/625] eta: 0:00:47 lr: 0.003992 min_lr: 0.003992 loss: 3.6045 (3.5791) class_acc: 0.3984 (0.4064) weight_decay: 0.0500 (0.0500) grad_norm: 1.4802 (inf) time: 2.0196 data: 0.0316 max mem: 2905
Epoch: [27] [624/625] eta: 0:00:01 lr: 0.003992 min_lr: 0.003992 loss: 3.5842 (3.5794) class_acc: 0.3867 (0.4060) weight_decay: 0.0500 (0.0500) grad_norm: 1.4526 (inf) time: 0.3925 data: 0.0258 max mem: 2905
Epoch: [27] Total time: 0:19:41 (1.8902 s / it)
Averaged stats: lr: 0.003992 min_lr: 0.003992 loss: 3.5842 (3.5927) class_acc: 0.3867 (0.4023) weight_decay: 0.0500 (0.0500) grad_norm: 1.4526 (inf)
Test: [ 0/50] eta: 0:10:33 loss: 2.0578 (2.0578) acc1: 54.4000 (54.4000) acc5: 79.2000 (79.2000) time: 12.6712 data: 12.6441 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.1337 (2.1419) acc1: 54.4000 (53.8182) acc5: 76.0000 (76.3636) time: 2.1702 data: 2.1514 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.2473 (2.2373) acc1: 51.2000 (51.1238) acc5: 76.0000 (75.6191) time: 1.1922 data: 1.1741 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.2980 (2.2494) acc1: 48.0000 (50.6065) acc5: 74.4000 (75.2000) time: 1.1977 data: 1.1790 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.2357 (2.2689) acc1: 47.2000 (50.3415) acc5: 72.8000 (74.7707) time: 0.7709 data: 0.7522 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.2537 (2.2768) acc1: 51.2000 (50.1760) acc5: 73.6000 (74.5120) time: 0.7491 data: 0.7296 max mem: 2905
Test: Total time: 0:00:52 (1.0513 s / it)
* Acc@1 50.538 Acc@5 75.020 loss 2.257
Accuracy of the model on the 50000 test images: 50.5%
Max accuracy: 50.54%
Epoch: [28] [ 0/625] eta: 3:31:58 lr: 0.003992 min_lr: 0.003992 loss: 3.6734 (3.6734) class_acc: 0.3750 (0.3750) weight_decay: 0.0500 (0.0500) time: 20.3503 data: 20.0512 max mem: 2905
Epoch: [28] [200/625] eta: 0:14:28 lr: 0.003991 min_lr: 0.003991 loss: 3.6027 (3.5692) class_acc: 0.3906 (0.4049) weight_decay: 0.0500 (0.0500) grad_norm: 1.6694 (2.1667) time: 1.9741 data: 0.4945 max mem: 2905
Epoch: [28] [400/625] eta: 0:07:31 lr: 0.003991 min_lr: 0.003991 loss: 3.5839 (3.5724) class_acc: 0.4141 (0.4070) weight_decay: 0.0500 (0.0500) grad_norm: 1.7739 (2.0792) time: 2.1572 data: 0.0009 max mem: 2905
Epoch: [28] [600/625] eta: 0:00:49 lr: 0.003990 min_lr: 0.003990 loss: 3.5119 (3.5714) class_acc: 0.4219 (0.4068) weight_decay: 0.0500 (0.0500) grad_norm: 1.7313 (2.2045) time: 1.8744 data: 0.0008 max mem: 2905
Epoch: [28] [624/625] eta: 0:00:01 lr: 0.003990 min_lr: 0.003990 loss: 3.5770 (3.5711) class_acc: 0.4102 (0.4070) weight_decay: 0.0500 (0.0500) grad_norm: 1.5188 (2.1853) time: 0.8201 data: 0.0022 max mem: 2905
Epoch: [28] Total time: 0:20:14 (1.9439 s / it)
Averaged stats: lr: 0.003990 min_lr: 0.003990 loss: 3.5770 (3.5763) class_acc: 0.4102 (0.4058) weight_decay: 0.0500 (0.0500) grad_norm: 1.5188 (2.1853)
Test: [ 0/50] eta: 0:10:08 loss: 2.6075 (2.6075) acc1: 41.6000 (41.6000) acc5: 72.0000 (72.0000) time: 12.1648 data: 12.1361 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.3099 (2.2855) acc1: 52.0000 (51.3455) acc5: 73.6000 (74.2545) time: 1.9731 data: 1.9537 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.4136 (2.4097) acc1: 51.2000 (48.9905) acc5: 72.0000 (72.5714) time: 0.9957 data: 0.9773 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.4623 (2.4255) acc1: 45.6000 (47.8968) acc5: 71.2000 (72.2323) time: 0.9975 data: 0.9782 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.4623 (2.4689) acc1: 44.0000 (46.8683) acc5: 69.6000 (71.4537) time: 0.7210 data: 0.7008 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5017 (2.4676) acc1: 44.0000 (46.8640) acc5: 72.0000 (71.5840) time: 0.6412 data: 0.6217 max mem: 2905
Test: Total time: 0:00:48 (0.9798 s / it)
* Acc@1 47.210 Acc@5 72.040 loss 2.436
Accuracy of the model on the 50000 test images: 47.2%
Max accuracy: 50.54%
Epoch: [29] [ 0/625] eta: 3:32:55 lr: 0.003990 min_lr: 0.003990 loss: 3.7204 (3.7204) class_acc: 0.3633 (0.3633) weight_decay: 0.0500 (0.0500) time: 20.4412 data: 17.6682 max mem: 2905
Epoch: [29] [200/625] eta: 0:13:53 lr: 0.003989 min_lr: 0.003989 loss: 3.6205 (3.5465) class_acc: 0.4062 (0.4141) weight_decay: 0.0500 (0.0500) grad_norm: 1.8717 (2.3388) time: 1.8569 data: 0.0008 max mem: 2905
Epoch: [29] [400/625] eta: 0:07:16 lr: 0.003988 min_lr: 0.003988 loss: 3.6023 (3.5596) class_acc: 0.4102 (0.4110) weight_decay: 0.0500 (0.0500) grad_norm: 1.2989 (2.1954) time: 1.9144 data: 0.0195 max mem: 2905
Epoch: [29] [600/625] eta: 0:00:49 lr: 0.003988 min_lr: 0.003988 loss: 3.5655 (3.5613) class_acc: 0.3945 (0.4096) weight_decay: 0.0500 (0.0500) grad_norm: 1.4800 (2.1547) time: 2.0340 data: 0.0050 max mem: 2905
Epoch: [29] [624/625] eta: 0:00:01 lr: 0.003987 min_lr: 0.003987 loss: 3.5300 (3.5614) class_acc: 0.4102 (0.4095) weight_decay: 0.0500 (0.0500) grad_norm: 1.5709 (2.1348) time: 0.4527 data: 0.0014 max mem: 2905
Epoch: [29] Total time: 0:20:05 (1.9282 s / it)
Averaged stats: lr: 0.003987 min_lr: 0.003987 loss: 3.5300 (3.5587) class_acc: 0.4102 (0.4088) weight_decay: 0.0500 (0.0500) grad_norm: 1.5709 (2.1348)
Test: [ 0/50] eta: 0:10:31 loss: 2.3556 (2.3556) acc1: 48.8000 (48.8000) acc5: 73.6000 (73.6000) time: 12.6234 data: 12.5951 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.3297 (2.3093) acc1: 51.2000 (51.0545) acc5: 73.6000 (74.6909) time: 2.0217 data: 1.9997 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.5162 (2.4935) acc1: 45.6000 (46.8191) acc5: 71.2000 (72.0000) time: 1.0032 data: 0.9816 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.5584 (2.4914) acc1: 43.2000 (46.4516) acc5: 68.8000 (71.2774) time: 1.0244 data: 1.0035 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5514 (2.5317) acc1: 43.2000 (45.9902) acc5: 67.2000 (70.6927) time: 0.9106 data: 0.8909 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5967 (2.5288) acc1: 44.8000 (46.1600) acc5: 70.4000 (71.0560) time: 0.8203 data: 0.8014 max mem: 2905
Test: Total time: 0:00:54 (1.0977 s / it)
* Acc@1 46.138 Acc@5 71.058 loss 2.510
Accuracy of the model on the 50000 test images: 46.1%
Max accuracy: 50.54%
Epoch: [30] [ 0/625] eta: 3:25:37 lr: 0.003987 min_lr: 0.003987 loss: 3.2437 (3.2437) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 19.7393 data: 17.9200 max mem: 2905
Epoch: [30] [200/625] eta: 0:14:18 lr: 0.003987 min_lr: 0.003987 loss: 3.5421 (3.5261) class_acc: 0.4102 (0.4111) weight_decay: 0.0500 (0.0500) grad_norm: 2.3030 (2.4305) time: 1.7800 data: 0.0008 max mem: 2905
Epoch: [30] [400/625] eta: 0:07:21 lr: 0.003986 min_lr: 0.003986 loss: 3.5666 (3.5352) class_acc: 0.4141 (0.4122) weight_decay: 0.0500 (0.0500) grad_norm: 1.6741 (2.2632) time: 1.9781 data: 0.0007 max mem: 2905
Epoch: [30] [600/625] eta: 0:00:49 lr: 0.003985 min_lr: 0.003985 loss: 3.5600 (3.5398) class_acc: 0.3984 (0.4118) weight_decay: 0.0500 (0.0500) grad_norm: 2.0842 (2.2458) time: 2.2599 data: 0.0626 max mem: 2905
Epoch: [30] [624/625] eta: 0:00:01 lr: 0.003985 min_lr: 0.003985 loss: 3.5901 (3.5410) class_acc: 0.3906 (0.4114) weight_decay: 0.0500 (0.0500) grad_norm: 1.8053 (2.2553) time: 1.1417 data: 0.0063 max mem: 2905
Epoch: [30] Total time: 0:20:03 (1.9257 s / it)
Averaged stats: lr: 0.003985 min_lr: 0.003985 loss: 3.5901 (3.5421) class_acc: 0.3906 (0.4121) weight_decay: 0.0500 (0.0500) grad_norm: 1.8053 (2.2553)
Test: [ 0/50] eta: 0:10:32 loss: 2.7928 (2.7928) acc1: 40.8000 (40.8000) acc5: 67.2000 (67.2000) time: 12.6565 data: 12.6298 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.6436 (2.5998) acc1: 47.2000 (45.9636) acc5: 68.8000 (70.1091) time: 2.1983 data: 2.1711 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.7254 (2.7817) acc1: 40.0000 (41.5619) acc5: 67.2000 (67.2000) time: 1.0826 data: 1.0593 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.9964 (2.8042) acc1: 36.8000 (40.9806) acc5: 63.2000 (66.3484) time: 0.8918 data: 0.8714 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9047 (2.8522) acc1: 36.8000 (40.0000) acc5: 61.6000 (65.4244) time: 0.5747 data: 0.5550 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8493 (2.8554) acc1: 36.0000 (39.9040) acc5: 64.8000 (65.4400) time: 0.4939 data: 0.4732 max mem: 2905
Test: Total time: 0:00:46 (0.9399 s / it)
* Acc@1 40.982 Acc@5 65.704 loss 2.836
Accuracy of the model on the 50000 test images: 41.0%
Max accuracy: 50.54%
Epoch: [31] [ 0/625] eta: 3:53:25 lr: 0.003985 min_lr: 0.003985 loss: 3.6486 (3.6486) class_acc: 0.4297 (0.4297) weight_decay: 0.0500 (0.0500) time: 22.4095 data: 20.3318 max mem: 2905
Epoch: [31] [200/625] eta: 0:14:04 lr: 0.003984 min_lr: 0.003984 loss: 3.6013 (3.5261) class_acc: 0.4023 (0.4139) weight_decay: 0.0500 (0.0500) grad_norm: 2.3462 (1.8529) time: 1.7383 data: 0.1542 max mem: 2905
Epoch: [31] [400/625] eta: 0:07:14 lr: 0.003983 min_lr: 0.003983 loss: 3.5149 (3.5341) class_acc: 0.4219 (0.4127) weight_decay: 0.0500 (0.0500) grad_norm: 1.7822 (2.0280) time: 1.7648 data: 0.8956 max mem: 2905
Epoch: [31] [600/625] eta: 0:00:48 lr: 0.003982 min_lr: 0.003982 loss: 3.5508 (3.5355) class_acc: 0.4102 (0.4137) weight_decay: 0.0500 (0.0500) grad_norm: 1.6879 (2.0432) time: 1.8432 data: 0.0007 max mem: 2905
Epoch: [31] [624/625] eta: 0:00:01 lr: 0.003982 min_lr: 0.003982 loss: 3.5453 (3.5359) class_acc: 0.4141 (0.4137) weight_decay: 0.0500 (0.0500) grad_norm: 1.4899 (2.0233) time: 0.8732 data: 0.0015 max mem: 2905
Epoch: [31] Total time: 0:19:48 (1.9012 s / it)
Averaged stats: lr: 0.003982 min_lr: 0.003982 loss: 3.5453 (3.5250) class_acc: 0.4141 (0.4158) weight_decay: 0.0500 (0.0500) grad_norm: 1.4899 (2.0233)
Test: [ 0/50] eta: 0:11:25 loss: 2.2735 (2.2735) acc1: 48.0000 (48.0000) acc5: 79.2000 (79.2000) time: 13.7045 data: 13.6744 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.2915 (2.2470) acc1: 50.4000 (50.5455) acc5: 76.8000 (76.2182) time: 2.2507 data: 2.2287 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.3651 (2.3817) acc1: 46.4000 (47.6952) acc5: 74.4000 (74.5905) time: 1.0518 data: 1.0316 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.4947 (2.3762) acc1: 44.8000 (47.8710) acc5: 72.8000 (74.2194) time: 0.9471 data: 0.9277 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.4947 (2.4134) acc1: 48.0000 (47.4927) acc5: 70.4000 (73.0341) time: 0.6323 data: 0.6124 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4857 (2.4287) acc1: 47.2000 (47.3760) acc5: 68.8000 (72.4640) time: 0.6758 data: 0.6558 max mem: 2905
Test: Total time: 0:00:49 (0.9914 s / it)
* Acc@1 47.818 Acc@5 72.532 loss 2.410
Accuracy of the model on the 50000 test images: 47.8%
Max accuracy: 50.54%
Epoch: [32] [ 0/625] eta: 3:29:47 lr: 0.003982 min_lr: 0.003982 loss: 3.4008 (3.4008) class_acc: 0.4297 (0.4297) weight_decay: 0.0500 (0.0500) time: 20.1404 data: 16.2818 max mem: 2905
Epoch: [32] [200/625] eta: 0:14:22 lr: 0.003981 min_lr: 0.003981 loss: 3.5179 (3.4990) class_acc: 0.4180 (0.4237) weight_decay: 0.0500 (0.0500) grad_norm: 2.2271 (2.1251) time: 1.7957 data: 0.0008 max mem: 2905
Epoch: [32] [400/625] eta: 0:07:14 lr: 0.003980 min_lr: 0.003980 loss: 3.4666 (3.5082) class_acc: 0.4336 (0.4210) weight_decay: 0.0500 (0.0500) grad_norm: 2.2451 (2.2061) time: 1.9811 data: 0.0006 max mem: 2905
Epoch: [32] [600/625] eta: 0:00:48 lr: 0.003979 min_lr: 0.003979 loss: 3.4920 (3.5121) class_acc: 0.4102 (0.4188) weight_decay: 0.0500 (0.0500) grad_norm: 1.6433 (2.0921) time: 1.8876 data: 0.0009 max mem: 2905
Epoch: [32] [624/625] eta: 0:00:01 lr: 0.003979 min_lr: 0.003979 loss: 3.4904 (3.5116) class_acc: 0.4062 (0.4187) weight_decay: 0.0500 (0.0500) grad_norm: 1.6433 (2.0771) time: 0.7863 data: 0.0018 max mem: 2905
Epoch: [32] Total time: 0:19:50 (1.9042 s / it)
Averaged stats: lr: 0.003979 min_lr: 0.003979 loss: 3.4904 (3.5114) class_acc: 0.4062 (0.4186) weight_decay: 0.0500 (0.0500) grad_norm: 1.6433 (2.0771)
Test: [ 0/50] eta: 0:10:48 loss: 2.6732 (2.6732) acc1: 39.2000 (39.2000) acc5: 68.0000 (68.0000) time: 12.9644 data: 12.9375 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.3532 (2.4733) acc1: 48.0000 (46.9091) acc5: 73.6000 (71.7818) time: 2.0976 data: 2.0761 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.6492 (2.6222) acc1: 42.4000 (43.5429) acc5: 70.4000 (69.8667) time: 1.0304 data: 1.0106 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.7581 (2.6196) acc1: 41.6000 (43.9226) acc5: 68.0000 (69.3936) time: 1.0148 data: 0.9960 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.7418 (2.6598) acc1: 41.6000 (43.5317) acc5: 68.0000 (68.6634) time: 0.7184 data: 0.6994 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6973 (2.6637) acc1: 42.4000 (43.6800) acc5: 68.0000 (68.6720) time: 0.6728 data: 0.6523 max mem: 2905
Test: Total time: 0:00:48 (0.9736 s / it)
* Acc@1 43.536 Acc@5 68.598 loss 2.643
Accuracy of the model on the 50000 test images: 43.5%
Max accuracy: 50.54%
Epoch: [33] [ 0/625] eta: 3:21:09 lr: 0.003979 min_lr: 0.003979 loss: 3.6944 (3.6944) class_acc: 0.3672 (0.3672) weight_decay: 0.0500 (0.0500) time: 19.3115 data: 16.1847 max mem: 2905
Epoch: [33] [200/625] eta: 0:13:48 lr: 0.003978 min_lr: 0.003978 loss: 3.4921 (3.4781) class_acc: 0.4180 (0.4240) weight_decay: 0.0500 (0.0500) grad_norm: 1.6415 (2.2971) time: 1.7956 data: 0.0209 max mem: 2905
Epoch: [33] [400/625] eta: 0:07:05 lr: 0.003977 min_lr: 0.003977 loss: 3.4732 (3.4920) class_acc: 0.4297 (0.4222) weight_decay: 0.0500 (0.0500) grad_norm: 1.5325 (2.2462) time: 1.9609 data: 0.0069 max mem: 2905
Epoch: [33] [600/625] eta: 0:00:48 lr: 0.003976 min_lr: 0.003976 loss: 3.4680 (3.4957) class_acc: 0.4219 (0.4212) weight_decay: 0.0500 (0.0500) grad_norm: 1.4542 (2.2352) time: 2.1684 data: 0.0011 max mem: 2905
Epoch: [33] [624/625] eta: 0:00:01 lr: 0.003975 min_lr: 0.003975 loss: 3.5286 (3.4964) class_acc: 0.4141 (0.4210) weight_decay: 0.0500 (0.0500) grad_norm: 1.5404 (2.2102) time: 0.6252 data: 0.0030 max mem: 2905
Epoch: [33] Total time: 0:19:59 (1.9199 s / it)
Averaged stats: lr: 0.003975 min_lr: 0.003975 loss: 3.5286 (3.5017) class_acc: 0.4141 (0.4206) weight_decay: 0.0500 (0.0500) grad_norm: 1.5404 (2.2102)
Test: [ 0/50] eta: 0:11:32 loss: 2.2907 (2.2907) acc1: 50.4000 (50.4000) acc5: 75.2000 (75.2000) time: 13.8549 data: 13.8182 max mem: 2905
Test: [10/50] eta: 0:01:38 loss: 2.2598 (2.2844) acc1: 50.4000 (50.9818) acc5: 75.2000 (74.4000) time: 2.4684 data: 2.4484 max mem: 2905
Test: [20/50] eta: 0:00:56 loss: 2.3548 (2.3906) acc1: 48.0000 (48.9143) acc5: 75.2000 (73.6000) time: 1.2919 data: 1.2731 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.4564 (2.3973) acc1: 46.4000 (48.0516) acc5: 72.8000 (73.3419) time: 1.0758 data: 1.0563 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.5022 (2.4034) acc1: 46.4000 (47.9610) acc5: 72.0000 (73.3463) time: 0.6888 data: 0.6695 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.4441 (2.4173) acc1: 46.4000 (47.7440) acc5: 71.2000 (73.1360) time: 0.6201 data: 0.6010 max mem: 2905
Test: Total time: 0:00:54 (1.0916 s / it)
* Acc@1 48.582 Acc@5 73.274 loss 2.370
Accuracy of the model on the 50000 test images: 48.6%
Max accuracy: 50.54%
Epoch: [34] [ 0/625] eta: 3:23:05 lr: 0.003975 min_lr: 0.003975 loss: 3.4742 (3.4742) class_acc: 0.4258 (0.4258) weight_decay: 0.0500 (0.0500) time: 19.4971 data: 17.0031 max mem: 2905
Epoch: [34] [200/625] eta: 0:14:41 lr: 0.003974 min_lr: 0.003974 loss: 3.4882 (3.4755) class_acc: 0.4258 (0.4260) weight_decay: 0.0500 (0.0500) grad_norm: 1.2058 (2.2156) time: 1.8631 data: 0.0006 max mem: 2905
Epoch: [34] [400/625] eta: 0:07:33 lr: 0.003973 min_lr: 0.003973 loss: 3.4412 (3.4874) class_acc: 0.4258 (0.4244) weight_decay: 0.0500 (0.0500) grad_norm: 2.0687 (inf) time: 1.9048 data: 0.0007 max mem: 2905
Epoch: [34] [600/625] eta: 0:00:50 lr: 0.003972 min_lr: 0.003972 loss: 3.4508 (3.4869) class_acc: 0.4219 (0.4240) weight_decay: 0.0500 (0.0500) grad_norm: 1.4027 (inf) time: 1.8883 data: 0.0006 max mem: 2905
Epoch: [34] [624/625] eta: 0:00:01 lr: 0.003972 min_lr: 0.003972 loss: 3.4583 (3.4864) class_acc: 0.4219 (0.4241) weight_decay: 0.0500 (0.0500) grad_norm: 1.6445 (inf) time: 0.8677 data: 0.0014 max mem: 2905
Epoch: [34] Total time: 0:20:24 (1.9595 s / it)
Averaged stats: lr: 0.003972 min_lr: 0.003972 loss: 3.4583 (3.4904) class_acc: 0.4219 (0.4227) weight_decay: 0.0500 (0.0500) grad_norm: 1.6445 (inf)
Test: [ 0/50] eta: 0:10:56 loss: 2.3697 (2.3697) acc1: 48.0000 (48.0000) acc5: 74.4000 (74.4000) time: 13.1339 data: 13.1075 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.3371 (2.3366) acc1: 50.4000 (50.5455) acc5: 73.6000 (74.1091) time: 2.1663 data: 2.1474 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.4313 (2.4743) acc1: 47.2000 (47.0095) acc5: 71.2000 (72.2667) time: 1.1212 data: 1.1027 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.6178 (2.4762) acc1: 44.8000 (46.6323) acc5: 70.4000 (71.7936) time: 1.0683 data: 1.0487 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5198 (2.4820) acc1: 46.4000 (46.6927) acc5: 71.2000 (71.6488) time: 0.6772 data: 0.6585 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4586 (2.4786) acc1: 45.6000 (46.7840) acc5: 72.0000 (71.6960) time: 0.5905 data: 0.5721 max mem: 2905
Test: Total time: 0:00:50 (1.0058 s / it)
* Acc@1 47.320 Acc@5 72.012 loss 2.447
Accuracy of the model on the 50000 test images: 47.3%
Max accuracy: 50.54%
Epoch: [35] [ 0/625] eta: 3:39:56 lr: 0.003972 min_lr: 0.003972 loss: 3.3530 (3.3530) class_acc: 0.4297 (0.4297) weight_decay: 0.0500 (0.0500) time: 21.1136 data: 16.9161 max mem: 2905
Epoch: [35] [200/625] eta: 0:14:37 lr: 0.003971 min_lr: 0.003971 loss: 3.4848 (3.4611) class_acc: 0.4141 (0.4300) weight_decay: 0.0500 (0.0500) grad_norm: 2.7237 (2.2412) time: 1.7407 data: 0.0104 max mem: 2905
Epoch: [35] [400/625] eta: 0:07:26 lr: 0.003969 min_lr: 0.003969 loss: 3.4997 (3.4627) class_acc: 0.4258 (0.4276) weight_decay: 0.0500 (0.0500) grad_norm: 1.6701 (2.2118) time: 1.8030 data: 0.0653 max mem: 2905
Epoch: [35] [600/625] eta: 0:00:49 lr: 0.003968 min_lr: 0.003968 loss: 3.5058 (3.4728) class_acc: 0.4219 (0.4254) weight_decay: 0.0500 (0.0500) grad_norm: 2.8002 (2.2541) time: 2.0367 data: 0.0062 max mem: 2905
Epoch: [35] [624/625] eta: 0:00:01 lr: 0.003968 min_lr: 0.003968 loss: 3.5165 (3.4741) class_acc: 0.4219 (0.4252) weight_decay: 0.0500 (0.0500) grad_norm: 1.8629 (2.2401) time: 0.4576 data: 0.0141 max mem: 2905
Epoch: [35] Total time: 0:20:22 (1.9558 s / it)
Averaged stats: lr: 0.003968 min_lr: 0.003968 loss: 3.5165 (3.4797) class_acc: 0.4219 (0.4247) weight_decay: 0.0500 (0.0500) grad_norm: 1.8629 (2.2401)
Test: [ 0/50] eta: 0:10:42 loss: 2.6505 (2.6505) acc1: 37.6000 (37.6000) acc5: 69.6000 (69.6000) time: 12.8504 data: 12.8258 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.4185 (2.3711) acc1: 52.0000 (49.5273) acc5: 74.4000 (73.6000) time: 2.1436 data: 2.1250 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.4582 (2.4916) acc1: 45.6000 (46.1333) acc5: 72.8000 (71.6571) time: 1.0952 data: 1.0773 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.6083 (2.5177) acc1: 42.4000 (45.5484) acc5: 69.6000 (71.4581) time: 1.0996 data: 1.0820 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.6142 (2.5576) acc1: 44.0000 (45.3659) acc5: 68.0000 (70.4585) time: 0.8511 data: 0.8329 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6923 (2.5644) acc1: 42.4000 (44.9280) acc5: 68.0000 (70.2240) time: 0.7552 data: 0.7365 max mem: 2905
Test: Total time: 0:00:54 (1.0944 s / it)
* Acc@1 45.012 Acc@5 70.206 loss 2.546
Accuracy of the model on the 50000 test images: 45.0%
Max accuracy: 50.54%
Epoch: [36] [ 0/625] eta: 3:58:20 lr: 0.003968 min_lr: 0.003968 loss: 3.3311 (3.3311) class_acc: 0.4453 (0.4453) weight_decay: 0.0500 (0.0500) time: 22.8804 data: 18.0996 max mem: 2905
Epoch: [36] [200/625] eta: 0:14:25 lr: 0.003967 min_lr: 0.003967 loss: 3.4543 (3.4543) class_acc: 0.4336 (0.4306) weight_decay: 0.0500 (0.0500) grad_norm: 2.4093 (2.3944) time: 2.0214 data: 0.0008 max mem: 2905
Epoch: [36] [400/625] eta: 0:07:26 lr: 0.003965 min_lr: 0.003965 loss: 3.4779 (3.4624) class_acc: 0.4180 (0.4283) weight_decay: 0.0500 (0.0500) grad_norm: 1.6931 (2.1741) time: 1.9241 data: 0.0102 max mem: 2905
Epoch: [36] [600/625] eta: 0:00:49 lr: 0.003964 min_lr: 0.003964 loss: 3.3859 (3.4684) class_acc: 0.4297 (0.4278) weight_decay: 0.0500 (0.0500) grad_norm: 1.4834 (2.1673) time: 2.1073 data: 0.0040 max mem: 2905
Epoch: [36] [624/625] eta: 0:00:01 lr: 0.003964 min_lr: 0.003964 loss: 3.4411 (3.4672) class_acc: 0.4219 (0.4281) weight_decay: 0.0500 (0.0500) grad_norm: 1.7825 (2.1718) time: 0.6709 data: 0.0014 max mem: 2905
Epoch: [36] Total time: 0:20:11 (1.9380 s / it)
Averaged stats: lr: 0.003964 min_lr: 0.003964 loss: 3.4411 (3.4641) class_acc: 0.4219 (0.4281) weight_decay: 0.0500 (0.0500) grad_norm: 1.7825 (2.1718)
Test: [ 0/50] eta: 0:08:49 loss: 2.0329 (2.0329) acc1: 52.8000 (52.8000) acc5: 80.0000 (80.0000) time: 10.5943 data: 10.5649 max mem: 2905
Test: [10/50] eta: 0:01:08 loss: 2.1859 (2.2179) acc1: 52.8000 (53.0909) acc5: 76.0000 (75.0545) time: 1.7098 data: 1.6906 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 2.3918 (2.3210) acc1: 50.4000 (49.4095) acc5: 72.8000 (73.6381) time: 0.9658 data: 0.9471 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.4170 (2.3351) acc1: 46.4000 (49.1871) acc5: 71.2000 (73.5484) time: 1.1629 data: 1.1430 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.3483 (2.3676) acc1: 47.2000 (48.5463) acc5: 71.2000 (73.1122) time: 0.9476 data: 0.9282 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.4027 (2.3716) acc1: 45.6000 (48.4640) acc5: 73.6000 (73.1520) time: 0.5007 data: 0.4828 max mem: 2905
Test: Total time: 0:00:52 (1.0439 s / it)
* Acc@1 49.020 Acc@5 73.458 loss 2.348
Accuracy of the model on the 50000 test images: 49.0%
Max accuracy: 50.54%
Epoch: [37] [ 0/625] eta: 3:44:21 lr: 0.003964 min_lr: 0.003964 loss: 3.5528 (3.5528) class_acc: 0.4375 (0.4375) weight_decay: 0.0500 (0.0500) time: 21.5379 data: 21.2904 max mem: 2905
Epoch: [37] [200/625] eta: 0:14:39 lr: 0.003962 min_lr: 0.003962 loss: 3.4587 (3.4448) class_acc: 0.4141 (0.4326) weight_decay: 0.0500 (0.0500) grad_norm: 1.9882 (2.3421) time: 2.0363 data: 0.2053 max mem: 2905
Epoch: [37] [400/625] eta: 0:07:31 lr: 0.003961 min_lr: 0.003961 loss: 3.4508 (3.4458) class_acc: 0.4453 (0.4317) weight_decay: 0.0500 (0.0500) grad_norm: 1.8675 (2.2136) time: 1.9872 data: 0.0010 max mem: 2905
Epoch: [37] [600/625] eta: 0:00:49 lr: 0.003960 min_lr: 0.003960 loss: 3.4958 (3.4564) class_acc: 0.4141 (0.4294) weight_decay: 0.0500 (0.0500) grad_norm: 1.3652 (2.1520) time: 2.1020 data: 0.0366 max mem: 2905
Epoch: [37] [624/625] eta: 0:00:01 lr: 0.003959 min_lr: 0.003959 loss: 3.5078 (3.4588) class_acc: 0.4180 (0.4289) weight_decay: 0.0500 (0.0500) grad_norm: 1.3907 (2.1503) time: 0.6706 data: 0.0018 max mem: 2905
Epoch: [37] Total time: 0:20:12 (1.9398 s / it)
Averaged stats: lr: 0.003959 min_lr: 0.003959 loss: 3.5078 (3.4561) class_acc: 0.4180 (0.4296) weight_decay: 0.0500 (0.0500) grad_norm: 1.3907 (2.1503)
Test: [ 0/50] eta: 0:09:47 loss: 2.3570 (2.3570) acc1: 48.8000 (48.8000) acc5: 76.0000 (76.0000) time: 11.7520 data: 11.7171 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.1305 (2.1705) acc1: 52.8000 (51.7091) acc5: 77.6000 (77.0182) time: 1.8980 data: 1.8762 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 2.3069 (2.3413) acc1: 47.2000 (48.5714) acc5: 74.4000 (74.5143) time: 0.9281 data: 0.9079 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.5020 (2.3560) acc1: 44.8000 (47.8968) acc5: 72.0000 (73.8065) time: 0.8678 data: 0.8476 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.4412 (2.3812) acc1: 45.6000 (47.9415) acc5: 72.0000 (73.3463) time: 0.6246 data: 0.6049 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.3646 (2.3709) acc1: 47.2000 (47.9840) acc5: 72.0000 (73.5040) time: 0.4498 data: 0.4308 max mem: 2905
Test: Total time: 0:00:45 (0.9003 s / it)
* Acc@1 49.484 Acc@5 74.074 loss 2.317
Accuracy of the model on the 50000 test images: 49.5%
Max accuracy: 50.54%
Epoch: [38] [ 0/625] eta: 3:34:39 lr: 0.003959 min_lr: 0.003959 loss: 3.5082 (3.5082) class_acc: 0.4180 (0.4180) weight_decay: 0.0500 (0.0500) time: 20.6079 data: 20.1486 max mem: 2905
Epoch: [38] [200/625] eta: 0:14:18 lr: 0.003958 min_lr: 0.003958 loss: 3.4614 (3.4361) class_acc: 0.4258 (0.4308) weight_decay: 0.0500 (0.0500) grad_norm: 1.6697 (2.3046) time: 1.9057 data: 0.0942 max mem: 2905
Epoch: [38] [400/625] eta: 0:07:27 lr: 0.003956 min_lr: 0.003956 loss: 3.4751 (3.4424) class_acc: 0.4141 (0.4296) weight_decay: 0.0500 (0.0500) grad_norm: 1.4432 (2.4068) time: 1.9917 data: 0.0008 max mem: 2905
Epoch: [38] [600/625] eta: 0:00:49 lr: 0.003955 min_lr: 0.003955 loss: 3.4399 (3.4434) class_acc: 0.4297 (0.4302) weight_decay: 0.0500 (0.0500) grad_norm: 1.7011 (2.3015) time: 2.2151 data: 0.0007 max mem: 2905
Epoch: [38] [624/625] eta: 0:00:01 lr: 0.003955 min_lr: 0.003955 loss: 3.4671 (3.4436) class_acc: 0.4219 (0.4301) weight_decay: 0.0500 (0.0500) grad_norm: 1.9491 (2.3007) time: 0.6622 data: 0.0013 max mem: 2905
Epoch: [38] Total time: 0:20:12 (1.9395 s / it)
Averaged stats: lr: 0.003955 min_lr: 0.003955 loss: 3.4671 (3.4478) class_acc: 0.4219 (0.4311) weight_decay: 0.0500 (0.0500) grad_norm: 1.9491 (2.3007)
Test: [ 0/50] eta: 0:10:14 loss: 2.7064 (2.7064) acc1: 40.0000 (40.0000) acc5: 72.0000 (72.0000) time: 12.2815 data: 12.2534 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.5738 (2.5380) acc1: 45.6000 (45.0909) acc5: 71.2000 (70.9818) time: 2.1405 data: 2.1216 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.7629 (2.6967) acc1: 43.2000 (42.4762) acc5: 68.8000 (69.1048) time: 1.2291 data: 1.2106 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.8041 (2.6914) acc1: 40.0000 (42.4000) acc5: 67.2000 (68.5677) time: 1.2787 data: 1.2603 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.6971 (2.7138) acc1: 41.6000 (42.1073) acc5: 68.0000 (68.0390) time: 0.8855 data: 0.8666 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7160 (2.7101) acc1: 43.2000 (42.5280) acc5: 68.0000 (68.0480) time: 0.7354 data: 0.7161 max mem: 2905
Test: Total time: 0:00:55 (1.1124 s / it)
* Acc@1 43.506 Acc@5 68.474 loss 2.667
Accuracy of the model on the 50000 test images: 43.5%
Max accuracy: 50.54%
Epoch: [39] [ 0/625] eta: 3:45:22 lr: 0.003955 min_lr: 0.003955 loss: 3.1841 (3.1841) class_acc: 0.4766 (0.4766) weight_decay: 0.0500 (0.0500) time: 21.6368 data: 18.6295 max mem: 2905
Epoch: [39] [200/625] eta: 0:15:01 lr: 0.003953 min_lr: 0.003953 loss: 3.4117 (3.4293) class_acc: 0.4336 (0.4337) weight_decay: 0.0500 (0.0500) grad_norm: 1.4930 (2.4655) time: 1.9263 data: 0.0006 max mem: 2905
Epoch: [39] [400/625] eta: 0:07:42 lr: 0.003952 min_lr: 0.003952 loss: 3.3971 (3.4350) class_acc: 0.4297 (0.4320) weight_decay: 0.0500 (0.0500) grad_norm: 1.6770 (2.2379) time: 1.9215 data: 0.0007 max mem: 2905
Epoch: [39] [600/625] eta: 0:00:51 lr: 0.003950 min_lr: 0.003950 loss: 3.4286 (3.4383) class_acc: 0.4414 (0.4327) weight_decay: 0.0500 (0.0500) grad_norm: 1.7661 (2.1673) time: 2.0051 data: 0.0007 max mem: 2905
Epoch: [39] [624/625] eta: 0:00:01 lr: 0.003950 min_lr: 0.003950 loss: 3.4282 (3.4385) class_acc: 0.4297 (0.4327) weight_decay: 0.0500 (0.0500) grad_norm: 1.5666 (2.1542) time: 0.7318 data: 0.0014 max mem: 2905
Epoch: [39] Total time: 0:20:43 (1.9903 s / it)
Averaged stats: lr: 0.003950 min_lr: 0.003950 loss: 3.4282 (3.4390) class_acc: 0.4297 (0.4330) weight_decay: 0.0500 (0.0500) grad_norm: 1.5666 (2.1542)
Test: [ 0/50] eta: 0:10:55 loss: 2.3686 (2.3686) acc1: 44.8000 (44.8000) acc5: 72.8000 (72.8000) time: 13.1138 data: 13.0878 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.1979 (2.1789) acc1: 53.6000 (53.7455) acc5: 75.2000 (75.7091) time: 2.0881 data: 2.0680 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.3461 (2.3167) acc1: 49.6000 (50.3619) acc5: 73.6000 (74.0191) time: 1.0489 data: 1.0295 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.3705 (2.3122) acc1: 48.0000 (50.3742) acc5: 73.6000 (73.9355) time: 1.0630 data: 1.0444 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.3351 (2.3320) acc1: 48.8000 (49.8537) acc5: 73.6000 (73.5220) time: 0.7608 data: 0.7425 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.3770 (2.3480) acc1: 47.2000 (49.2960) acc5: 72.8000 (73.3920) time: 0.6592 data: 0.6413 max mem: 2905
Test: Total time: 0:00:50 (1.0056 s / it)
* Acc@1 49.532 Acc@5 73.846 loss 2.324
Accuracy of the model on the 50000 test images: 49.5%
Max accuracy: 50.54%
Epoch: [40] [ 0/625] eta: 3:28:09 lr: 0.003950 min_lr: 0.003950 loss: 3.5175 (3.5175) class_acc: 0.4023 (0.4023) weight_decay: 0.0500 (0.0500) time: 19.9833 data: 15.8166 max mem: 2905
Epoch: [40] [200/625] eta: 0:13:59 lr: 0.003948 min_lr: 0.003948 loss: 3.4259 (3.4236) class_acc: 0.4297 (0.4370) weight_decay: 0.0500 (0.0500) grad_norm: 2.3682 (2.3779) time: 1.9172 data: 0.2894 max mem: 2905
Epoch: [40] [400/625] eta: 0:07:14 lr: 0.003947 min_lr: 0.003947 loss: 3.3919 (3.4230) class_acc: 0.4297 (0.4370) weight_decay: 0.0500 (0.0500) grad_norm: 1.8893 (2.2478) time: 1.7788 data: 0.0706 max mem: 2905
Epoch: [40] [600/625] eta: 0:00:47 lr: 0.003945 min_lr: 0.003945 loss: 3.4431 (3.4283) class_acc: 0.4336 (0.4363) weight_decay: 0.0500 (0.0500) grad_norm: 1.6500 (2.1530) time: 1.8195 data: 0.2604 max mem: 2905
Epoch: [40] [624/625] eta: 0:00:01 lr: 0.003945 min_lr: 0.003945 loss: 3.4649 (3.4298) class_acc: 0.4297 (0.4360) weight_decay: 0.0500 (0.0500) grad_norm: 1.6349 (2.1598) time: 0.7839 data: 0.1111 max mem: 2905
Epoch: [40] Total time: 0:19:32 (1.8761 s / it)
Averaged stats: lr: 0.003945 min_lr: 0.003945 loss: 3.4649 (3.4282) class_acc: 0.4297 (0.4356) weight_decay: 0.0500 (0.0500) grad_norm: 1.6349 (2.1598)
Test: [ 0/50] eta: 0:09:56 loss: 2.6887 (2.6887) acc1: 40.8000 (40.8000) acc5: 68.0000 (68.0000) time: 11.9333 data: 11.9080 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.4430 (2.3694) acc1: 48.8000 (50.3273) acc5: 72.8000 (73.4545) time: 2.0103 data: 1.9918 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.6287 (2.5685) acc1: 44.8000 (45.6000) acc5: 68.0000 (70.3238) time: 1.0345 data: 1.0155 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.6538 (2.5495) acc1: 42.4000 (45.5484) acc5: 68.0000 (70.5806) time: 0.9829 data: 0.9632 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5278 (2.5705) acc1: 43.2000 (45.4634) acc5: 68.8000 (69.9317) time: 0.7307 data: 0.7109 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5604 (2.5863) acc1: 44.8000 (45.1360) acc5: 68.8000 (69.7600) time: 0.6761 data: 0.6564 max mem: 2905
Test: Total time: 0:00:48 (0.9642 s / it)
* Acc@1 45.532 Acc@5 70.668 loss 2.553
Accuracy of the model on the 50000 test images: 45.5%
Max accuracy: 50.54%
Epoch: [41] [ 0/625] eta: 3:26:26 lr: 0.003945 min_lr: 0.003945 loss: 3.5007 (3.5007) class_acc: 0.4336 (0.4336) weight_decay: 0.0500 (0.0500) time: 19.8185 data: 17.3438 max mem: 2905
Epoch: [41] [200/625] eta: 0:14:00 lr: 0.003943 min_lr: 0.003943 loss: 3.3732 (3.3916) class_acc: 0.4375 (0.4430) weight_decay: 0.0500 (0.0500) grad_norm: 1.8629 (2.4691) time: 1.8012 data: 0.3605 max mem: 2905
Epoch: [41] [400/625] eta: 0:07:15 lr: 0.003941 min_lr: 0.003941 loss: 3.4659 (3.4016) class_acc: 0.4219 (0.4415) weight_decay: 0.0500 (0.0500) grad_norm: 1.4757 (inf) time: 1.9578 data: 0.0576 max mem: 2905
Epoch: [41] [600/625] eta: 0:00:48 lr: 0.003940 min_lr: 0.003940 loss: 3.4806 (3.4165) class_acc: 0.4258 (0.4386) weight_decay: 0.0500 (0.0500) grad_norm: 1.9142 (inf) time: 1.9242 data: 0.0218 max mem: 2905
Epoch: [41] [624/625] eta: 0:00:01 lr: 0.003939 min_lr: 0.003939 loss: 3.3626 (3.4157) class_acc: 0.4492 (0.4387) weight_decay: 0.0500 (0.0500) grad_norm: 1.6419 (inf) time: 0.9451 data: 0.0015 max mem: 2905
Epoch: [41] Total time: 0:19:49 (1.9028 s / it)
Averaged stats: lr: 0.003939 min_lr: 0.003939 loss: 3.3626 (3.4211) class_acc: 0.4492 (0.4371) weight_decay: 0.0500 (0.0500) grad_norm: 1.6419 (inf)
Test: [ 0/50] eta: 0:10:34 loss: 2.2427 (2.2427) acc1: 50.4000 (50.4000) acc5: 77.6000 (77.6000) time: 12.6959 data: 12.6652 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.2100 (2.2620) acc1: 52.8000 (51.9273) acc5: 76.0000 (75.6364) time: 2.1897 data: 2.1676 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.4331 (2.3792) acc1: 47.2000 (48.4571) acc5: 74.4000 (74.2857) time: 1.1603 data: 1.1400 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.4820 (2.4063) acc1: 44.8000 (47.7936) acc5: 72.8000 (73.8323) time: 1.0871 data: 1.0658 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.4954 (2.4318) acc1: 44.8000 (47.4537) acc5: 71.2000 (73.1122) time: 0.6651 data: 0.6445 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4891 (2.4456) acc1: 44.8000 (47.3920) acc5: 71.2000 (72.9280) time: 0.6625 data: 0.6444 max mem: 2905
Test: Total time: 0:00:49 (0.9910 s / it)
* Acc@1 47.910 Acc@5 72.916 loss 2.415
Accuracy of the model on the 50000 test images: 47.9%
Max accuracy: 50.54%
Epoch: [42] [ 0/625] eta: 3:33:02 lr: 0.003939 min_lr: 0.003939 loss: 3.1996 (3.1996) class_acc: 0.5078 (0.5078) weight_decay: 0.0500 (0.0500) time: 20.4526 data: 17.3140 max mem: 2905
Epoch: [42] [200/625] eta: 0:13:47 lr: 0.003938 min_lr: 0.003938 loss: 3.3309 (3.3978) class_acc: 0.4531 (0.4408) weight_decay: 0.0500 (0.0500) grad_norm: 2.2845 (2.5679) time: 1.8660 data: 0.0112 max mem: 2905
Epoch: [42] [400/625] eta: 0:07:21 lr: 0.003936 min_lr: 0.003936 loss: 3.4166 (3.4028) class_acc: 0.4375 (0.4404) weight_decay: 0.0500 (0.0500) grad_norm: 1.4697 (2.3329) time: 2.1669 data: 0.0209 max mem: 2905
Epoch: [42] [600/625] eta: 0:00:50 lr: 0.003934 min_lr: 0.003934 loss: 3.3962 (3.4140) class_acc: 0.4453 (0.4388) weight_decay: 0.0500 (0.0500) grad_norm: 2.1224 (2.3716) time: 2.0179 data: 0.0056 max mem: 2905
Epoch: [42] [624/625] eta: 0:00:02 lr: 0.003934 min_lr: 0.003934 loss: 3.4417 (3.4153) class_acc: 0.4375 (0.4387) weight_decay: 0.0500 (0.0500) grad_norm: 1.6972 (2.3574) time: 0.7335 data: 0.0014 max mem: 2905
Epoch: [42] Total time: 0:21:02 (2.0197 s / it)
Averaged stats: lr: 0.003934 min_lr: 0.003934 loss: 3.4417 (3.4156) class_acc: 0.4375 (0.4376) weight_decay: 0.0500 (0.0500) grad_norm: 1.6972 (2.3574)
Test: [ 0/50] eta: 0:09:13 loss: 2.4724 (2.4724) acc1: 51.2000 (51.2000) acc5: 72.0000 (72.0000) time: 11.0760 data: 11.0492 max mem: 2905
Test: [10/50] eta: 0:01:31 loss: 2.3714 (2.3949) acc1: 48.8000 (48.5091) acc5: 73.6000 (72.9455) time: 2.2761 data: 2.2572 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.4774 (2.5129) acc1: 46.4000 (45.5238) acc5: 72.8000 (71.4667) time: 1.3361 data: 1.3179 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.5873 (2.5127) acc1: 44.8000 (45.6774) acc5: 70.4000 (71.4323) time: 1.1018 data: 1.0825 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.5052 (2.5162) acc1: 46.4000 (45.8146) acc5: 72.0000 (71.4732) time: 0.7906 data: 0.7713 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.4843 (2.5266) acc1: 45.6000 (45.9840) acc5: 72.0000 (71.4080) time: 0.7060 data: 0.6867 max mem: 2905
Test: Total time: 0:00:55 (1.1005 s / it)
* Acc@1 46.254 Acc@5 71.254 loss 2.505
Accuracy of the model on the 50000 test images: 46.3%
Max accuracy: 50.54%
Epoch: [43] [ 0/625] eta: 3:39:59 lr: 0.003934 min_lr: 0.003934 loss: 3.3257 (3.3257) class_acc: 0.4531 (0.4531) weight_decay: 0.0500 (0.0500) time: 21.1192 data: 19.2549 max mem: 2905
Epoch: [43] [200/625] eta: 0:14:44 lr: 0.003932 min_lr: 0.003932 loss: 3.4162 (3.4085) class_acc: 0.4414 (0.4399) weight_decay: 0.0500 (0.0500) grad_norm: 1.7694 (2.3014) time: 1.9873 data: 0.0009 max mem: 2905
Epoch: [43] [400/625] eta: 0:07:39 lr: 0.003930 min_lr: 0.003930 loss: 3.3546 (3.4110) class_acc: 0.4570 (0.4383) weight_decay: 0.0500 (0.0500) grad_norm: 1.6722 (2.2402) time: 2.1345 data: 0.0006 max mem: 2905
Epoch: [43] [600/625] eta: 0:00:51 lr: 0.003928 min_lr: 0.003928 loss: 3.4291 (3.4132) class_acc: 0.4258 (0.4374) weight_decay: 0.0500 (0.0500) grad_norm: 1.7748 (2.2886) time: 1.9398 data: 0.0007 max mem: 2905
Epoch: [43] [624/625] eta: 0:00:02 lr: 0.003928 min_lr: 0.003928 loss: 3.3666 (3.4122) class_acc: 0.4531 (0.4376) weight_decay: 0.0500 (0.0500) grad_norm: 2.0819 (2.3060) time: 0.8716 data: 0.0013 max mem: 2905
Epoch: [43] Total time: 0:20:51 (2.0021 s / it)
Averaged stats: lr: 0.003928 min_lr: 0.003928 loss: 3.3666 (3.4030) class_acc: 0.4531 (0.4402) weight_decay: 0.0500 (0.0500) grad_norm: 2.0819 (2.3060)
Test: [ 0/50] eta: 0:10:58 loss: 2.6287 (2.6287) acc1: 46.4000 (46.4000) acc5: 72.0000 (72.0000) time: 13.1616 data: 13.1340 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.2934 (2.3228) acc1: 49.6000 (50.2545) acc5: 74.4000 (73.8909) time: 2.1315 data: 2.1103 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.4850 (2.4914) acc1: 47.2000 (46.6286) acc5: 72.8000 (72.4952) time: 1.0617 data: 1.0402 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.6065 (2.4646) acc1: 42.4000 (46.7355) acc5: 71.2000 (72.4129) time: 1.0437 data: 1.0223 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5615 (2.4729) acc1: 44.0000 (46.6732) acc5: 70.4000 (72.3122) time: 0.8354 data: 0.8146 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5612 (2.4825) acc1: 45.6000 (46.5920) acc5: 70.4000 (72.2080) time: 0.7857 data: 0.7649 max mem: 2905
Test: Total time: 0:00:55 (1.1032 s / it)
* Acc@1 47.476 Acc@5 72.522 loss 2.442
Accuracy of the model on the 50000 test images: 47.5%
Max accuracy: 50.54%
Epoch: [44] [ 0/625] eta: 4:18:49 lr: 0.003928 min_lr: 0.003928 loss: 3.1990 (3.1990) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 24.8479 data: 22.2422 max mem: 2905
Epoch: [44] [200/625] eta: 0:15:02 lr: 0.003926 min_lr: 0.003926 loss: 3.4597 (3.3822) class_acc: 0.4297 (0.4461) weight_decay: 0.0500 (0.0500) grad_norm: 2.5751 (2.3647) time: 2.0278 data: 0.0237 max mem: 2905
Epoch: [44] [400/625] eta: 0:07:30 lr: 0.003924 min_lr: 0.003924 loss: 3.4122 (3.3933) class_acc: 0.4453 (0.4438) weight_decay: 0.0500 (0.0500) grad_norm: 1.6542 (2.2966) time: 1.8663 data: 1.6861 max mem: 2905
Epoch: [44] [600/625] eta: 0:00:50 lr: 0.003922 min_lr: 0.003922 loss: 3.3890 (3.3998) class_acc: 0.4375 (0.4425) weight_decay: 0.0500 (0.0500) grad_norm: 1.4126 (2.2748) time: 1.9420 data: 0.7089 max mem: 2905
Epoch: [44] [624/625] eta: 0:00:01 lr: 0.003922 min_lr: 0.003922 loss: 3.4104 (3.4009) class_acc: 0.4453 (0.4426) weight_decay: 0.0500 (0.0500) grad_norm: 2.5555 (2.3638) time: 0.8427 data: 0.3045 max mem: 2905
Epoch: [44] Total time: 0:20:24 (1.9587 s / it)
Averaged stats: lr: 0.003922 min_lr: 0.003922 loss: 3.4104 (3.3967) class_acc: 0.4453 (0.4417) weight_decay: 0.0500 (0.0500) grad_norm: 2.5555 (2.3638)
Test: [ 0/50] eta: 0:09:43 loss: 2.8062 (2.8062) acc1: 40.8000 (40.8000) acc5: 68.0000 (68.0000) time: 11.6620 data: 11.6140 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.6173 (2.6161) acc1: 44.0000 (45.7455) acc5: 68.8000 (69.4545) time: 2.0819 data: 2.0603 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.7449 (2.7464) acc1: 40.8000 (42.2476) acc5: 67.2000 (67.7714) time: 1.1730 data: 1.1528 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.8054 (2.7631) acc1: 39.2000 (42.1677) acc5: 65.6000 (67.6903) time: 0.9937 data: 0.9739 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8054 (2.7753) acc1: 37.6000 (41.5610) acc5: 64.8000 (66.9463) time: 0.5233 data: 0.5035 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7604 (2.7762) acc1: 41.6000 (41.6160) acc5: 66.4000 (67.0880) time: 0.4337 data: 0.4133 max mem: 2905
Test: Total time: 0:00:46 (0.9396 s / it)
* Acc@1 42.200 Acc@5 67.380 loss 2.747
Accuracy of the model on the 50000 test images: 42.2%
Max accuracy: 50.54%
Epoch: [45] [ 0/625] eta: 3:22:33 lr: 0.003922 min_lr: 0.003922 loss: 3.1638 (3.1638) class_acc: 0.4766 (0.4766) weight_decay: 0.0500 (0.0500) time: 19.4456 data: 16.4242 max mem: 2905
Epoch: [45] [200/625] eta: 0:14:08 lr: 0.003920 min_lr: 0.003920 loss: 3.4061 (3.3835) class_acc: 0.4453 (0.4440) weight_decay: 0.0500 (0.0500) grad_norm: 2.0137 (2.2698) time: 1.9818 data: 0.0008 max mem: 2905
Epoch: [45] [400/625] eta: 0:07:21 lr: 0.003918 min_lr: 0.003918 loss: 3.3883 (3.3812) class_acc: 0.4453 (0.4436) weight_decay: 0.0500 (0.0500) grad_norm: 1.5594 (2.2185) time: 2.0136 data: 0.0011 max mem: 2905
Epoch: [45] [600/625] eta: 0:00:49 lr: 0.003916 min_lr: 0.003916 loss: 3.3852 (3.3876) class_acc: 0.4492 (0.4421) weight_decay: 0.0500 (0.0500) grad_norm: 1.4303 (2.1894) time: 2.1442 data: 0.0318 max mem: 2905
Epoch: [45] [624/625] eta: 0:00:01 lr: 0.003916 min_lr: 0.003916 loss: 3.4186 (3.3894) class_acc: 0.4297 (0.4419) weight_decay: 0.0500 (0.0500) grad_norm: 1.8138 (2.1946) time: 0.7823 data: 0.0020 max mem: 2905
Epoch: [45] Total time: 0:20:02 (1.9247 s / it)
Averaged stats: lr: 0.003916 min_lr: 0.003916 loss: 3.4186 (3.3921) class_acc: 0.4297 (0.4426) weight_decay: 0.0500 (0.0500) grad_norm: 1.8138 (2.1946)
Test: [ 0/50] eta: 0:11:25 loss: 2.8380 (2.8380) acc1: 36.8000 (36.8000) acc5: 66.4000 (66.4000) time: 13.7012 data: 13.6766 max mem: 2905
Test: [10/50] eta: 0:01:33 loss: 2.6871 (2.6018) acc1: 45.6000 (44.8000) acc5: 70.4000 (69.5273) time: 2.3336 data: 2.3127 max mem: 2905
Test: [20/50] eta: 0:00:55 loss: 2.8070 (2.8404) acc1: 39.2000 (41.0286) acc5: 66.4000 (66.0191) time: 1.2635 data: 1.2439 max mem: 2905
Test: [30/50] eta: 0:00:33 loss: 2.9869 (2.8374) acc1: 38.4000 (40.6194) acc5: 64.0000 (65.8065) time: 1.2906 data: 1.2722 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.9134 (2.8546) acc1: 38.4000 (40.2732) acc5: 64.0000 (65.3073) time: 0.8778 data: 0.8594 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8201 (2.8462) acc1: 38.4000 (40.4320) acc5: 64.8000 (65.5040) time: 0.7974 data: 0.7773 max mem: 2905
Test: Total time: 0:00:57 (1.1475 s / it)
* Acc@1 40.948 Acc@5 66.198 loss 2.818
Accuracy of the model on the 50000 test images: 40.9%
Max accuracy: 50.54%
Epoch: [46] [ 0/625] eta: 3:38:00 lr: 0.003916 min_lr: 0.003916 loss: 3.4364 (3.4364) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 20.9293 data: 16.6041 max mem: 2905
Epoch: [46] [200/625] eta: 0:14:44 lr: 0.003913 min_lr: 0.003913 loss: 3.4116 (3.3715) class_acc: 0.4375 (0.4477) weight_decay: 0.0500 (0.0500) grad_norm: 2.8070 (2.6428) time: 1.9882 data: 0.0006 max mem: 2905
Epoch: [46] [400/625] eta: 0:07:30 lr: 0.003911 min_lr: 0.003911 loss: 3.3764 (3.3810) class_acc: 0.4375 (0.4464) weight_decay: 0.0500 (0.0500) grad_norm: 1.5905 (2.3596) time: 1.9217 data: 0.0006 max mem: 2905
Epoch: [46] [600/625] eta: 0:00:49 lr: 0.003909 min_lr: 0.003909 loss: 3.4171 (3.3907) class_acc: 0.4258 (0.4433) weight_decay: 0.0500 (0.0500) grad_norm: 2.3015 (2.3302) time: 1.9684 data: 0.0006 max mem: 2905
Epoch: [46] [624/625] eta: 0:00:01 lr: 0.003909 min_lr: 0.003909 loss: 3.4027 (3.3911) class_acc: 0.4453 (0.4435) weight_decay: 0.0500 (0.0500) grad_norm: 2.7630 (2.3911) time: 0.7131 data: 0.0013 max mem: 2905
Epoch: [46] Total time: 0:20:17 (1.9483 s / it)
Averaged stats: lr: 0.003909 min_lr: 0.003909 loss: 3.4027 (3.3825) class_acc: 0.4453 (0.4445) weight_decay: 0.0500 (0.0500) grad_norm: 2.7630 (2.3911)
Test: [ 0/50] eta: 0:10:19 loss: 2.4703 (2.4703) acc1: 48.0000 (48.0000) acc5: 72.8000 (72.8000) time: 12.3987 data: 12.3723 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.4856 (2.4961) acc1: 46.4000 (46.9091) acc5: 71.2000 (71.2000) time: 1.9487 data: 1.9267 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.5880 (2.5977) acc1: 44.0000 (44.8762) acc5: 69.6000 (70.2476) time: 0.9253 data: 0.9051 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.6890 (2.5957) acc1: 42.4000 (44.6710) acc5: 68.8000 (70.1677) time: 0.8834 data: 0.8637 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.7281 (2.6489) acc1: 39.2000 (43.4146) acc5: 67.2000 (69.2293) time: 0.6471 data: 0.6269 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7281 (2.6545) acc1: 40.8000 (43.5680) acc5: 67.2000 (68.9600) time: 0.6357 data: 0.6158 max mem: 2905
Test: Total time: 0:00:44 (0.8898 s / it)
* Acc@1 44.068 Acc@5 69.208 loss 2.618
Accuracy of the model on the 50000 test images: 44.1%
Max accuracy: 50.54%
Epoch: [47] [ 0/625] eta: 3:52:41 lr: 0.003909 min_lr: 0.003909 loss: 3.1954 (3.1954) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 22.3390 data: 18.0115 max mem: 2905
Epoch: [47] [200/625] eta: 0:14:28 lr: 0.003907 min_lr: 0.003907 loss: 3.3203 (3.3673) class_acc: 0.4492 (0.4469) weight_decay: 0.0500 (0.0500) grad_norm: 2.1779 (2.3875) time: 1.9263 data: 0.0006 max mem: 2905
Epoch: [47] [400/625] eta: 0:07:27 lr: 0.003905 min_lr: 0.003905 loss: 3.3434 (3.3741) class_acc: 0.4414 (0.4463) weight_decay: 0.0500 (0.0500) grad_norm: 1.8667 (2.2675) time: 1.9413 data: 0.0013 max mem: 2905
Epoch: [47] [600/625] eta: 0:00:50 lr: 0.003902 min_lr: 0.003902 loss: 3.3939 (3.3779) class_acc: 0.4336 (0.4456) weight_decay: 0.0500 (0.0500) grad_norm: 1.8650 (inf) time: 1.9792 data: 0.0008 max mem: 2905
Epoch: [47] [624/625] eta: 0:00:01 lr: 0.003902 min_lr: 0.003902 loss: 3.4238 (3.3784) class_acc: 0.4492 (0.4457) weight_decay: 0.0500 (0.0500) grad_norm: 1.8650 (inf) time: 0.8030 data: 0.0022 max mem: 2905
Epoch: [47] Total time: 0:20:26 (1.9628 s / it)
Averaged stats: lr: 0.003902 min_lr: 0.003902 loss: 3.4238 (3.3762) class_acc: 0.4492 (0.4452) weight_decay: 0.0500 (0.0500) grad_norm: 1.8650 (inf)
Test: [ 0/50] eta: 0:11:01 loss: 2.3562 (2.3562) acc1: 45.6000 (45.6000) acc5: 74.4000 (74.4000) time: 13.2286 data: 13.1987 max mem: 2905
Test: [10/50] eta: 0:01:31 loss: 2.4464 (2.3885) acc1: 51.2000 (50.6909) acc5: 71.2000 (72.5091) time: 2.2831 data: 2.2626 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.5119 (2.5128) acc1: 48.0000 (46.8952) acc5: 70.4000 (71.2000) time: 1.2430 data: 1.2241 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 2.6188 (2.5403) acc1: 44.0000 (46.1419) acc5: 70.4000 (70.7613) time: 1.2836 data: 1.2650 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.5691 (2.5498) acc1: 44.0000 (46.0878) acc5: 70.4000 (70.3610) time: 0.9235 data: 0.9041 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5691 (2.5482) acc1: 44.0000 (45.8240) acc5: 71.2000 (70.6240) time: 0.8419 data: 0.8226 max mem: 2905
Test: Total time: 0:00:57 (1.1471 s / it)
* Acc@1 46.410 Acc@5 71.536 loss 2.492
Accuracy of the model on the 50000 test images: 46.4%
Max accuracy: 50.54%
Epoch: [48] [ 0/625] eta: 3:52:03 lr: 0.003902 min_lr: 0.003902 loss: 3.1786 (3.1786) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 22.2780 data: 20.1751 max mem: 2905
Epoch: [48] [200/625] eta: 0:14:21 lr: 0.003900 min_lr: 0.003900 loss: 3.3386 (3.3467) class_acc: 0.4414 (0.4495) weight_decay: 0.0500 (0.0500) grad_norm: 2.4963 (2.2951) time: 1.7452 data: 0.3338 max mem: 2905
Epoch: [48] [400/625] eta: 0:07:31 lr: 0.003898 min_lr: 0.003898 loss: 3.3932 (3.3649) class_acc: 0.4375 (0.4464) weight_decay: 0.0500 (0.0500) grad_norm: 1.3706 (2.1958) time: 1.9722 data: 0.0006 max mem: 2905
Epoch: [48] [600/625] eta: 0:00:49 lr: 0.003895 min_lr: 0.003895 loss: 3.3079 (3.3682) class_acc: 0.4609 (0.4462) weight_decay: 0.0500 (0.0500) grad_norm: 2.2261 (2.2688) time: 2.0331 data: 0.4061 max mem: 2905
Epoch: [48] [624/625] eta: 0:00:01 lr: 0.003895 min_lr: 0.003895 loss: 3.4283 (3.3705) class_acc: 0.4258 (0.4457) weight_decay: 0.0500 (0.0500) grad_norm: 2.7065 (2.2906) time: 0.8636 data: 0.2198 max mem: 2905
Epoch: [48] Total time: 0:20:20 (1.9523 s / it)
Averaged stats: lr: 0.003895 min_lr: 0.003895 loss: 3.4283 (3.3713) class_acc: 0.4258 (0.4466) weight_decay: 0.0500 (0.0500) grad_norm: 2.7065 (2.2906)
Test: [ 0/50] eta: 0:09:59 loss: 2.7328 (2.7328) acc1: 37.6000 (37.6000) acc5: 66.4000 (66.4000) time: 11.9817 data: 11.9570 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.5133 (2.4935) acc1: 49.6000 (48.1455) acc5: 72.0000 (72.0727) time: 2.0612 data: 2.0427 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.5796 (2.5711) acc1: 44.8000 (46.0190) acc5: 70.4000 (70.6667) time: 1.1021 data: 1.0837 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.5841 (2.5604) acc1: 44.8000 (46.0645) acc5: 69.6000 (70.5290) time: 1.0160 data: 0.9962 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5841 (2.6177) acc1: 45.6000 (45.2293) acc5: 68.0000 (69.2878) time: 0.5999 data: 0.5794 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5875 (2.6161) acc1: 46.4000 (45.0400) acc5: 68.0000 (69.6640) time: 0.5985 data: 0.5793 max mem: 2905
Test: Total time: 0:00:46 (0.9272 s / it)
* Acc@1 45.200 Acc@5 70.318 loss 2.603
Accuracy of the model on the 50000 test images: 45.2%
Max accuracy: 50.54%
Epoch: [49] [ 0/625] eta: 3:44:40 lr: 0.003895 min_lr: 0.003895 loss: 3.2399 (3.2399) class_acc: 0.4609 (0.4609) weight_decay: 0.0500 (0.0500) time: 21.5688 data: 17.3616 max mem: 2905
Epoch: [49] [200/625] eta: 0:14:05 lr: 0.003893 min_lr: 0.003893 loss: 3.3509 (3.3598) class_acc: 0.4492 (0.4500) weight_decay: 0.0500 (0.0500) grad_norm: 2.2987 (2.4050) time: 1.7431 data: 0.0009 max mem: 2905
Epoch: [49] [400/625] eta: 0:07:10 lr: 0.003890 min_lr: 0.003890 loss: 3.3827 (3.3656) class_acc: 0.4453 (0.4474) weight_decay: 0.0500 (0.0500) grad_norm: 3.0852 (2.3308) time: 1.7235 data: 0.0007 max mem: 2905
Epoch: [49] [600/625] eta: 0:00:47 lr: 0.003888 min_lr: 0.003888 loss: 3.3552 (3.3663) class_acc: 0.4375 (0.4469) weight_decay: 0.0500 (0.0500) grad_norm: 2.3929 (2.3984) time: 1.9828 data: 0.0009 max mem: 2905
Epoch: [49] [624/625] eta: 0:00:01 lr: 0.003888 min_lr: 0.003888 loss: 3.4198 (3.3683) class_acc: 0.4297 (0.4465) weight_decay: 0.0500 (0.0500) grad_norm: 1.7160 (2.3695) time: 0.7373 data: 0.0339 max mem: 2905
Epoch: [49] Total time: 0:19:39 (1.8867 s / it)
Averaged stats: lr: 0.003888 min_lr: 0.003888 loss: 3.4198 (3.3642) class_acc: 0.4297 (0.4474) weight_decay: 0.0500 (0.0500) grad_norm: 1.7160 (2.3695)
Test: [ 0/50] eta: 0:10:45 loss: 2.0867 (2.0867) acc1: 50.4000 (50.4000) acc5: 79.2000 (79.2000) time: 12.9013 data: 12.8659 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.2798 (2.2423) acc1: 51.2000 (51.7818) acc5: 74.4000 (74.6182) time: 2.2561 data: 2.2347 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.4046 (2.4170) acc1: 47.2000 (48.0381) acc5: 72.0000 (72.9524) time: 1.2368 data: 1.2163 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.5913 (2.4262) acc1: 44.8000 (47.4839) acc5: 71.2000 (73.0581) time: 1.1101 data: 1.0898 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.4528 (2.4268) acc1: 46.4000 (47.6488) acc5: 71.2000 (72.7805) time: 0.6835 data: 0.6651 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.3834 (2.4297) acc1: 46.4000 (47.3440) acc5: 73.6000 (72.9440) time: 0.5962 data: 0.5773 max mem: 2905
Test: Total time: 0:00:52 (1.0450 s / it)
* Acc@1 48.114 Acc@5 73.030 loss 2.409
Accuracy of the model on the 50000 test images: 48.1%
Max accuracy: 50.54%
Epoch: [50] [ 0/625] eta: 4:07:52 lr: 0.003888 min_lr: 0.003888 loss: 3.3523 (3.3523) class_acc: 0.4023 (0.4023) weight_decay: 0.0500 (0.0500) time: 23.7968 data: 19.1362 max mem: 2905
Epoch: [50] [200/625] eta: 0:13:33 lr: 0.003885 min_lr: 0.003885 loss: 3.3637 (3.3469) class_acc: 0.4492 (0.4533) weight_decay: 0.0500 (0.0500) grad_norm: 3.0410 (2.2378) time: 1.7296 data: 0.0336 max mem: 2905
Epoch: [50] [400/625] eta: 0:07:13 lr: 0.003883 min_lr: 0.003883 loss: 3.3803 (3.3584) class_acc: 0.4336 (0.4501) weight_decay: 0.0500 (0.0500) grad_norm: 2.2490 (2.2780) time: 2.0968 data: 0.1872 max mem: 2905
Epoch: [50] [600/625] eta: 0:00:47 lr: 0.003881 min_lr: 0.003881 loss: 3.4428 (3.3672) class_acc: 0.4453 (0.4478) weight_decay: 0.0500 (0.0500) grad_norm: 2.6783 (2.2415) time: 1.9913 data: 0.0180 max mem: 2905
Epoch: [50] [624/625] eta: 0:00:01 lr: 0.003880 min_lr: 0.003880 loss: 3.3217 (3.3662) class_acc: 0.4375 (0.4478) weight_decay: 0.0500 (0.0500) grad_norm: 1.9183 (2.2416) time: 0.7108 data: 0.0150 max mem: 2905
Epoch: [50] Total time: 0:19:27 (1.8674 s / it)
Averaged stats: lr: 0.003880 min_lr: 0.003880 loss: 3.3217 (3.3603) class_acc: 0.4375 (0.4487) weight_decay: 0.0500 (0.0500) grad_norm: 1.9183 (2.2416)
Test: [ 0/50] eta: 0:09:06 loss: 2.3292 (2.3292) acc1: 46.4000 (46.4000) acc5: 75.2000 (75.2000) time: 10.9310 data: 10.9059 max mem: 2905
Test: [10/50] eta: 0:01:12 loss: 2.3329 (2.4272) acc1: 47.2000 (47.7091) acc5: 75.2000 (72.3636) time: 1.8244 data: 1.8051 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 2.6642 (2.5996) acc1: 44.0000 (43.5429) acc5: 68.0000 (69.4095) time: 0.9403 data: 0.9215 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.6907 (2.5875) acc1: 42.4000 (44.2065) acc5: 68.0000 (69.6516) time: 0.9788 data: 0.9597 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5704 (2.6325) acc1: 44.8000 (43.5512) acc5: 68.0000 (68.8585) time: 0.8873 data: 0.8681 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7138 (2.6392) acc1: 44.0000 (43.7440) acc5: 67.2000 (68.7840) time: 0.5399 data: 0.5206 max mem: 2905
Test: Total time: 0:00:50 (1.0120 s / it)
* Acc@1 44.782 Acc@5 69.550 loss 2.608
Accuracy of the model on the 50000 test images: 44.8%
Max accuracy: 50.54%
Epoch: [51] [ 0/625] eta: 3:59:38 lr: 0.003880 min_lr: 0.003880 loss: 3.0909 (3.0909) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 23.0050 data: 17.1572 max mem: 2905
Epoch: [51] [200/625] eta: 0:13:39 lr: 0.003878 min_lr: 0.003878 loss: 3.3334 (3.3462) class_acc: 0.4492 (0.4511) weight_decay: 0.0500 (0.0500) grad_norm: 2.1487 (2.3534) time: 1.7712 data: 0.2478 max mem: 2905
Epoch: [51] [400/625] eta: 0:07:15 lr: 0.003875 min_lr: 0.003875 loss: 3.3337 (3.3544) class_acc: 0.4609 (0.4518) weight_decay: 0.0500 (0.0500) grad_norm: 1.7774 (2.3754) time: 1.8824 data: 0.0014 max mem: 2905
Epoch: [51] [600/625] eta: 0:00:49 lr: 0.003873 min_lr: 0.003873 loss: 3.3345 (3.3542) class_acc: 0.4492 (0.4509) weight_decay: 0.0500 (0.0500) grad_norm: 1.4334 (2.3757) time: 2.0715 data: 0.0067 max mem: 2905
Epoch: [51] [624/625] eta: 0:00:01 lr: 0.003873 min_lr: 0.003873 loss: 3.3649 (3.3550) class_acc: 0.4531 (0.4509) weight_decay: 0.0500 (0.0500) grad_norm: 1.4907 (2.3431) time: 1.1988 data: 0.0014 max mem: 2905
Epoch: [51] Total time: 0:20:08 (1.9338 s / it)
Averaged stats: lr: 0.003873 min_lr: 0.003873 loss: 3.3649 (3.3532) class_acc: 0.4531 (0.4504) weight_decay: 0.0500 (0.0500) grad_norm: 1.4907 (2.3431)
Test: [ 0/50] eta: 0:09:50 loss: 2.2951 (2.2951) acc1: 44.8000 (44.8000) acc5: 77.6000 (77.6000) time: 11.8174 data: 11.7762 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.2951 (2.3385) acc1: 50.4000 (50.8364) acc5: 75.2000 (74.1818) time: 1.9801 data: 1.9595 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.4481 (2.4977) acc1: 47.2000 (46.6667) acc5: 71.2000 (72.2286) time: 1.0553 data: 1.0363 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.6484 (2.4995) acc1: 43.2000 (46.5032) acc5: 69.6000 (71.8968) time: 1.0884 data: 1.0690 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.6053 (2.5626) acc1: 43.2000 (45.4244) acc5: 68.0000 (70.8683) time: 0.9251 data: 0.9045 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5410 (2.5572) acc1: 45.6000 (45.9040) acc5: 70.4000 (70.9440) time: 0.8118 data: 0.7913 max mem: 2905
Test: Total time: 0:00:56 (1.1241 s / it)
* Acc@1 46.402 Acc@5 71.460 loss 2.520
Accuracy of the model on the 50000 test images: 46.4%
Max accuracy: 50.54%
Epoch: [52] [ 0/625] eta: 3:43:35 lr: 0.003873 min_lr: 0.003873 loss: 3.3729 (3.3729) class_acc: 0.4766 (0.4766) weight_decay: 0.0500 (0.0500) time: 21.4641 data: 21.3380 max mem: 2905
Epoch: [52] [200/625] eta: 0:14:33 lr: 0.003870 min_lr: 0.003870 loss: 3.3844 (3.3343) class_acc: 0.4375 (0.4534) weight_decay: 0.0500 (0.0500) grad_norm: 1.8153 (2.3459) time: 1.9381 data: 1.3273 max mem: 2905
Epoch: [52] [400/625] eta: 0:07:29 lr: 0.003867 min_lr: 0.003867 loss: 3.3677 (3.3465) class_acc: 0.4375 (0.4509) weight_decay: 0.0500 (0.0500) grad_norm: 1.4532 (2.4347) time: 1.9345 data: 0.0007 max mem: 2905
Epoch: [52] [600/625] eta: 0:00:50 lr: 0.003865 min_lr: 0.003865 loss: 3.3735 (3.3534) class_acc: 0.4336 (0.4499) weight_decay: 0.0500 (0.0500) grad_norm: 1.8741 (2.3795) time: 2.0584 data: 0.0008 max mem: 2905
Epoch: [52] [624/625] eta: 0:00:01 lr: 0.003865 min_lr: 0.003865 loss: 3.3151 (3.3529) class_acc: 0.4531 (0.4498) weight_decay: 0.0500 (0.0500) grad_norm: 2.3008 (2.4096) time: 0.7025 data: 0.0040 max mem: 2905
Epoch: [52] Total time: 0:20:21 (1.9546 s / it)
Averaged stats: lr: 0.003865 min_lr: 0.003865 loss: 3.3151 (3.3513) class_acc: 0.4531 (0.4506) weight_decay: 0.0500 (0.0500) grad_norm: 2.3008 (2.4096)
Test: [ 0/50] eta: 0:10:44 loss: 2.3972 (2.3972) acc1: 46.4000 (46.4000) acc5: 76.0000 (76.0000) time: 12.8923 data: 12.8580 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.3972 (2.4438) acc1: 49.6000 (49.1636) acc5: 72.8000 (72.4364) time: 2.2326 data: 2.2132 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.5716 (2.6270) acc1: 46.4000 (44.4952) acc5: 68.8000 (69.5619) time: 1.2482 data: 1.2296 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.7839 (2.6609) acc1: 38.4000 (43.3290) acc5: 68.0000 (69.2129) time: 1.1617 data: 1.1430 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7189 (2.6865) acc1: 40.0000 (42.8488) acc5: 68.0000 (68.7610) time: 0.6817 data: 0.6640 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6566 (2.6838) acc1: 40.0000 (42.8000) acc5: 67.2000 (68.6080) time: 0.5759 data: 0.5584 max mem: 2905
Test: Total time: 0:00:52 (1.0487 s / it)
* Acc@1 43.642 Acc@5 68.832 loss 2.653
Accuracy of the model on the 50000 test images: 43.6%
Max accuracy: 50.54%
Epoch: [53] [ 0/625] eta: 3:38:28 lr: 0.003865 min_lr: 0.003865 loss: 3.3371 (3.3371) class_acc: 0.4688 (0.4688) weight_decay: 0.0500 (0.0500) time: 20.9733 data: 20.7793 max mem: 2905
Epoch: [53] [200/625] eta: 0:14:13 lr: 0.003862 min_lr: 0.003862 loss: 3.2799 (3.3253) class_acc: 0.4492 (0.4560) weight_decay: 0.0500 (0.0500) grad_norm: 1.7506 (2.3380) time: 1.9518 data: 1.3974 max mem: 2905
Epoch: [53] [400/625] eta: 0:07:25 lr: 0.003859 min_lr: 0.003859 loss: 3.3510 (3.3319) class_acc: 0.4453 (0.4556) weight_decay: 0.0500 (0.0500) grad_norm: 2.6383 (2.4065) time: 1.9923 data: 1.7855 max mem: 2905
Epoch: [53] [600/625] eta: 0:00:48 lr: 0.003857 min_lr: 0.003857 loss: 3.3991 (3.3381) class_acc: 0.4180 (0.4539) weight_decay: 0.0500 (0.0500) grad_norm: 2.0424 (2.3773) time: 1.7336 data: 0.9385 max mem: 2905
Epoch: [53] [624/625] eta: 0:00:01 lr: 0.003856 min_lr: 0.003856 loss: 3.3224 (3.3387) class_acc: 0.4414 (0.4538) weight_decay: 0.0500 (0.0500) grad_norm: 1.8395 (2.3796) time: 0.6873 data: 0.1658 max mem: 2905
Epoch: [53] Total time: 0:19:50 (1.9047 s / it)
Averaged stats: lr: 0.003856 min_lr: 0.003856 loss: 3.3224 (3.3406) class_acc: 0.4414 (0.4530) weight_decay: 0.0500 (0.0500) grad_norm: 1.8395 (2.3796)
Test: [ 0/50] eta: 0:09:10 loss: 2.1193 (2.1193) acc1: 54.4000 (54.4000) acc5: 77.6000 (77.6000) time: 11.0062 data: 10.9774 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.1193 (2.1892) acc1: 52.0000 (51.8545) acc5: 76.8000 (76.0000) time: 1.9799 data: 1.9603 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.3489 (2.3282) acc1: 48.0000 (49.1810) acc5: 75.2000 (74.1714) time: 1.0845 data: 1.0654 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.3489 (2.3033) acc1: 48.8000 (50.1677) acc5: 73.6000 (74.6581) time: 0.9899 data: 0.9690 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.3220 (2.3416) acc1: 48.8000 (48.9366) acc5: 74.4000 (73.8732) time: 0.5976 data: 0.5760 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.3411 (2.3414) acc1: 47.2000 (49.0880) acc5: 73.6000 (73.8720) time: 0.5045 data: 0.4844 max mem: 2905
Test: Total time: 0:00:45 (0.9145 s / it)
* Acc@1 50.186 Acc@5 74.632 loss 2.302
Accuracy of the model on the 50000 test images: 50.2%
Max accuracy: 50.54%
Epoch: [54] [ 0/625] eta: 3:28:15 lr: 0.003856 min_lr: 0.003856 loss: 3.5111 (3.5111) class_acc: 0.4219 (0.4219) weight_decay: 0.0500 (0.0500) time: 19.9926 data: 19.8575 max mem: 2905
Epoch: [54] [200/625] eta: 0:13:54 lr: 0.003854 min_lr: 0.003854 loss: 3.3217 (3.3241) class_acc: 0.4648 (0.4572) weight_decay: 0.0500 (0.0500) grad_norm: 1.9949 (2.3172) time: 1.9306 data: 1.7631 max mem: 2905
Epoch: [54] [400/625] eta: 0:07:08 lr: 0.003851 min_lr: 0.003851 loss: 3.3417 (3.3287) class_acc: 0.4492 (0.4559) weight_decay: 0.0500 (0.0500) grad_norm: 2.9670 (inf) time: 1.7860 data: 1.6256 max mem: 2905
Epoch: [54] [600/625] eta: 0:00:47 lr: 0.003848 min_lr: 0.003848 loss: 3.3792 (3.3330) class_acc: 0.4414 (0.4550) weight_decay: 0.0500 (0.0500) grad_norm: 2.0976 (inf) time: 1.7896 data: 0.5088 max mem: 2905
Epoch: [54] [624/625] eta: 0:00:01 lr: 0.003848 min_lr: 0.003848 loss: 3.3487 (3.3338) class_acc: 0.4453 (0.4551) weight_decay: 0.0500 (0.0500) grad_norm: 2.2949 (inf) time: 0.8153 data: 0.3852 max mem: 2905
Epoch: [54] Total time: 0:19:25 (1.8656 s / it)
Averaged stats: lr: 0.003848 min_lr: 0.003848 loss: 3.3487 (3.3365) class_acc: 0.4453 (0.4541) weight_decay: 0.0500 (0.0500) grad_norm: 2.2949 (inf)
Test: [ 0/50] eta: 0:10:09 loss: 2.6353 (2.6353) acc1: 44.0000 (44.0000) acc5: 68.0000 (68.0000) time: 12.1905 data: 12.1571 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.6075 (2.5707) acc1: 46.4000 (47.4182) acc5: 68.0000 (69.5273) time: 1.8995 data: 1.8780 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 2.6569 (2.6665) acc1: 44.8000 (44.3810) acc5: 68.0000 (68.8381) time: 0.9079 data: 0.8879 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.7139 (2.6335) acc1: 41.6000 (44.5936) acc5: 68.8000 (69.1097) time: 0.9731 data: 0.9534 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5953 (2.6445) acc1: 41.6000 (44.6829) acc5: 68.8000 (69.1122) time: 0.9164 data: 0.8971 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6022 (2.6296) acc1: 41.6000 (44.9280) acc5: 68.8000 (69.2960) time: 0.6418 data: 0.6237 max mem: 2905
Test: Total time: 0:00:53 (1.0665 s / it)
* Acc@1 45.138 Acc@5 69.818 loss 2.596
Accuracy of the model on the 50000 test images: 45.1%
Max accuracy: 50.54%
Epoch: [55] [ 0/625] eta: 3:45:14 lr: 0.003848 min_lr: 0.003848 loss: 3.3281 (3.3281) class_acc: 0.4336 (0.4336) weight_decay: 0.0500 (0.0500) time: 21.6229 data: 18.5020 max mem: 2905
Epoch: [55] [200/625] eta: 0:14:17 lr: 0.003845 min_lr: 0.003845 loss: 3.2878 (3.3199) class_acc: 0.4609 (0.4595) weight_decay: 0.0500 (0.0500) grad_norm: 1.8169 (2.3169) time: 1.9516 data: 0.0713 max mem: 2905
Epoch: [55] [400/625] eta: 0:07:21 lr: 0.003842 min_lr: 0.003842 loss: 3.3595 (3.3283) class_acc: 0.4531 (0.4565) weight_decay: 0.0500 (0.0500) grad_norm: 2.1946 (2.3024) time: 1.8699 data: 0.3091 max mem: 2905
Epoch: [55] [600/625] eta: 0:00:48 lr: 0.003839 min_lr: 0.003839 loss: 3.2922 (3.3343) class_acc: 0.4570 (0.4550) weight_decay: 0.0500 (0.0500) grad_norm: 2.0148 (2.3592) time: 2.0353 data: 0.2675 max mem: 2905
Epoch: [55] [624/625] eta: 0:00:01 lr: 0.003839 min_lr: 0.003839 loss: 3.3192 (3.3334) class_acc: 0.4609 (0.4552) weight_decay: 0.0500 (0.0500) grad_norm: 1.9418 (2.3589) time: 0.7735 data: 0.0180 max mem: 2905
Epoch: [55] Total time: 0:19:56 (1.9138 s / it)
Averaged stats: lr: 0.003839 min_lr: 0.003839 loss: 3.3192 (3.3316) class_acc: 0.4609 (0.4546) weight_decay: 0.0500 (0.0500) grad_norm: 1.9418 (2.3589)
Test: [ 0/50] eta: 0:10:13 loss: 2.5900 (2.5900) acc1: 42.4000 (42.4000) acc5: 72.0000 (72.0000) time: 12.2720 data: 12.2428 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.4106 (2.4076) acc1: 48.0000 (48.1455) acc5: 72.0000 (72.5091) time: 2.0528 data: 2.0318 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 2.4658 (2.5802) acc1: 44.8000 (44.8762) acc5: 68.8000 (69.7905) time: 0.9145 data: 0.8941 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.6815 (2.5835) acc1: 41.6000 (44.4645) acc5: 68.0000 (69.4452) time: 0.7759 data: 0.7548 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.5540 (2.6018) acc1: 40.8000 (43.9415) acc5: 70.4000 (69.4244) time: 0.6095 data: 0.5892 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5929 (2.6230) acc1: 43.2000 (43.6800) acc5: 69.6000 (69.1360) time: 0.3771 data: 0.3575 max mem: 2905
Test: Total time: 0:00:45 (0.9085 s / it)
* Acc@1 44.820 Acc@5 69.716 loss 2.599
Accuracy of the model on the 50000 test images: 44.8%
Max accuracy: 50.54%
Epoch: [56] [ 0/625] eta: 3:37:05 lr: 0.003839 min_lr: 0.003839 loss: 3.2067 (3.2067) class_acc: 0.4727 (0.4727) weight_decay: 0.0500 (0.0500) time: 20.8406 data: 19.3918 max mem: 2905
Epoch: [56] [200/625] eta: 0:14:23 lr: 0.003836 min_lr: 0.003836 loss: 3.2960 (3.3139) class_acc: 0.4609 (0.4597) weight_decay: 0.0500 (0.0500) grad_norm: 1.9604 (2.2977) time: 1.9227 data: 0.1201 max mem: 2905
Epoch: [56] [400/625] eta: 0:07:30 lr: 0.003833 min_lr: 0.003833 loss: 3.3843 (3.3160) class_acc: 0.4414 (0.4577) weight_decay: 0.0500 (0.0500) grad_norm: 1.6814 (2.3021) time: 1.9409 data: 0.0009 max mem: 2905
Epoch: [56] [600/625] eta: 0:00:49 lr: 0.003831 min_lr: 0.003831 loss: 3.3578 (3.3277) class_acc: 0.4492 (0.4549) weight_decay: 0.0500 (0.0500) grad_norm: 1.7998 (2.4202) time: 2.0004 data: 0.0007 max mem: 2905
Epoch: [56] [624/625] eta: 0:00:01 lr: 0.003830 min_lr: 0.003830 loss: 3.3479 (3.3292) class_acc: 0.4375 (0.4547) weight_decay: 0.0500 (0.0500) grad_norm: 1.5540 (2.4038) time: 0.8399 data: 0.0325 max mem: 2905
Epoch: [56] Total time: 0:20:19 (1.9514 s / it)
Averaged stats: lr: 0.003830 min_lr: 0.003830 loss: 3.3479 (3.3284) class_acc: 0.4375 (0.4555) weight_decay: 0.0500 (0.0500) grad_norm: 1.5540 (2.4038)
Test: [ 0/50] eta: 0:11:16 loss: 3.0927 (3.0927) acc1: 38.4000 (38.4000) acc5: 63.2000 (63.2000) time: 13.5395 data: 13.5018 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.7362 (2.7250) acc1: 44.8000 (44.3636) acc5: 68.0000 (67.6364) time: 2.1336 data: 2.1126 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.9154 (2.8930) acc1: 39.2000 (40.5714) acc5: 65.6000 (65.1429) time: 0.9039 data: 0.8844 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.9884 (2.9015) acc1: 37.6000 (40.6452) acc5: 63.2000 (64.8516) time: 0.8217 data: 0.8028 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8629 (2.9498) acc1: 40.0000 (39.7073) acc5: 58.4000 (63.6293) time: 0.8414 data: 0.8234 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9843 (2.9531) acc1: 41.6000 (39.8240) acc5: 61.6000 (63.8400) time: 0.5726 data: 0.5548 max mem: 2905
Test: Total time: 0:00:51 (1.0302 s / it)
* Acc@1 40.296 Acc@5 64.894 loss 2.903
Accuracy of the model on the 50000 test images: 40.3%
Max accuracy: 50.54%
Epoch: [57] [ 0/625] eta: 3:30:03 lr: 0.003830 min_lr: 0.003830 loss: 3.2730 (3.2730) class_acc: 0.4570 (0.4570) weight_decay: 0.0500 (0.0500) time: 20.1652 data: 19.7166 max mem: 2905
Epoch: [57] [200/625] eta: 0:14:21 lr: 0.003827 min_lr: 0.003827 loss: 3.2875 (3.3136) class_acc: 0.4609 (0.4594) weight_decay: 0.0500 (0.0500) grad_norm: 2.2192 (2.4328) time: 1.9736 data: 0.0192 max mem: 2905
Epoch: [57] [400/625] eta: 0:07:28 lr: 0.003824 min_lr: 0.003824 loss: 3.2251 (3.3176) class_acc: 0.4688 (0.4580) weight_decay: 0.0500 (0.0500) grad_norm: 1.3674 (2.3841) time: 1.9043 data: 0.0007 max mem: 2905
Epoch: [57] [600/625] eta: 0:00:49 lr: 0.003821 min_lr: 0.003821 loss: 3.3144 (3.3283) class_acc: 0.4570 (0.4563) weight_decay: 0.0500 (0.0500) grad_norm: 2.2357 (2.3572) time: 2.0957 data: 0.0006 max mem: 2905
Epoch: [57] [624/625] eta: 0:00:01 lr: 0.003821 min_lr: 0.003821 loss: 3.3472 (3.3292) class_acc: 0.4492 (0.4561) weight_decay: 0.0500 (0.0500) grad_norm: 1.4799 (2.3247) time: 0.6861 data: 0.0013 max mem: 2905
Epoch: [57] Total time: 0:20:15 (1.9448 s / it)
Averaged stats: lr: 0.003821 min_lr: 0.003821 loss: 3.3472 (3.3227) class_acc: 0.4492 (0.4567) weight_decay: 0.0500 (0.0500) grad_norm: 1.4799 (2.3247)
Test: [ 0/50] eta: 0:10:09 loss: 2.5353 (2.5353) acc1: 41.6000 (41.6000) acc5: 73.6000 (73.6000) time: 12.1871 data: 12.1532 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.5353 (2.5381) acc1: 46.4000 (46.2545) acc5: 73.6000 (72.0000) time: 2.1339 data: 2.1129 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.6292 (2.6446) acc1: 44.0000 (43.7333) acc5: 70.4000 (70.6286) time: 1.1516 data: 1.1311 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.6422 (2.6565) acc1: 42.4000 (43.6903) acc5: 68.8000 (69.9097) time: 1.1590 data: 1.1388 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7315 (2.6883) acc1: 40.8000 (43.2000) acc5: 66.4000 (69.1707) time: 0.7926 data: 0.7714 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7178 (2.6932) acc1: 42.4000 (43.3920) acc5: 66.4000 (69.0400) time: 0.7077 data: 0.6872 max mem: 2905
Test: Total time: 0:00:51 (1.0361 s / it)
* Acc@1 43.756 Acc@5 68.748 loss 2.676
Accuracy of the model on the 50000 test images: 43.8%
Max accuracy: 50.54%
Epoch: [58] [ 0/625] eta: 4:51:19 lr: 0.003821 min_lr: 0.003821 loss: 3.0344 (3.0344) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 27.9675 data: 18.8793 max mem: 2905
Epoch: [58] [200/625] eta: 0:14:29 lr: 0.003818 min_lr: 0.003818 loss: 3.3546 (3.3013) class_acc: 0.4570 (0.4609) weight_decay: 0.0500 (0.0500) grad_norm: 2.1308 (2.3499) time: 2.1118 data: 0.0007 max mem: 2905
Epoch: [58] [400/625] eta: 0:07:27 lr: 0.003815 min_lr: 0.003815 loss: 3.2634 (3.3127) class_acc: 0.4648 (0.4570) weight_decay: 0.0500 (0.0500) grad_norm: 2.8615 (2.3693) time: 1.8460 data: 0.0134 max mem: 2905
Epoch: [58] [600/625] eta: 0:00:49 lr: 0.003812 min_lr: 0.003812 loss: 3.3320 (3.3137) class_acc: 0.4570 (0.4573) weight_decay: 0.0500 (0.0500) grad_norm: 1.9713 (2.3525) time: 1.9947 data: 0.0399 max mem: 2905
Epoch: [58] [624/625] eta: 0:00:01 lr: 0.003812 min_lr: 0.003812 loss: 3.2654 (3.3127) class_acc: 0.4609 (0.4576) weight_decay: 0.0500 (0.0500) grad_norm: 1.9461 (2.3378) time: 0.5971 data: 0.0015 max mem: 2905
Epoch: [58] Total time: 0:20:23 (1.9581 s / it)
Averaged stats: lr: 0.003812 min_lr: 0.003812 loss: 3.2654 (3.3198) class_acc: 0.4609 (0.4577) weight_decay: 0.0500 (0.0500) grad_norm: 1.9461 (2.3378)
Test: [ 0/50] eta: 0:10:21 loss: 2.7599 (2.7599) acc1: 36.0000 (36.0000) acc5: 65.6000 (65.6000) time: 12.4284 data: 12.3951 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.4294 (2.4196) acc1: 46.4000 (48.5091) acc5: 72.8000 (71.6364) time: 2.0481 data: 2.0272 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.5258 (2.6131) acc1: 44.8000 (44.4190) acc5: 67.2000 (69.0286) time: 1.0430 data: 1.0240 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.7874 (2.6474) acc1: 41.6000 (43.7936) acc5: 65.6000 (69.0065) time: 1.0583 data: 1.0383 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7533 (2.7165) acc1: 41.6000 (43.2000) acc5: 66.4000 (67.9610) time: 0.8515 data: 0.8311 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7404 (2.7161) acc1: 40.0000 (43.0720) acc5: 66.4000 (68.1280) time: 0.7604 data: 0.7415 max mem: 2905
Test: Total time: 0:00:54 (1.0802 s / it)
* Acc@1 43.572 Acc@5 68.920 loss 2.678
Accuracy of the model on the 50000 test images: 43.6%
Max accuracy: 50.54%
Epoch: [59] [ 0/625] eta: 3:17:50 lr: 0.003812 min_lr: 0.003812 loss: 3.2325 (3.2325) class_acc: 0.4688 (0.4688) weight_decay: 0.0500 (0.0500) time: 18.9934 data: 18.4385 max mem: 2905
Epoch: [59] [200/625] eta: 0:14:26 lr: 0.003809 min_lr: 0.003809 loss: 3.3065 (3.2989) class_acc: 0.4570 (0.4624) weight_decay: 0.0500 (0.0500) grad_norm: 1.4441 (2.4589) time: 1.8814 data: 0.0467 max mem: 2905
Epoch: [59] [400/625] eta: 0:07:24 lr: 0.003805 min_lr: 0.003805 loss: 3.2682 (3.3057) class_acc: 0.4492 (0.4601) weight_decay: 0.0500 (0.0500) grad_norm: 2.0477 (2.4738) time: 1.9482 data: 0.0946 max mem: 2905
Epoch: [59] [600/625] eta: 0:00:48 lr: 0.003802 min_lr: 0.003802 loss: 3.3696 (3.3190) class_acc: 0.4414 (0.4570) weight_decay: 0.0500 (0.0500) grad_norm: 2.6339 (2.4898) time: 1.6085 data: 0.6684 max mem: 2905
Epoch: [59] [624/625] eta: 0:00:01 lr: 0.003802 min_lr: 0.003802 loss: 3.3636 (3.3204) class_acc: 0.4570 (0.4570) weight_decay: 0.0500 (0.0500) grad_norm: 1.8166 (2.4651) time: 0.8589 data: 0.3184 max mem: 2905
Epoch: [59] Total time: 0:20:02 (1.9239 s / it)
Averaged stats: lr: 0.003802 min_lr: 0.003802 loss: 3.3636 (3.3148) class_acc: 0.4570 (0.4582) weight_decay: 0.0500 (0.0500) grad_norm: 1.8166 (2.4651)
Test: [ 0/50] eta: 0:09:50 loss: 2.5679 (2.5679) acc1: 41.6000 (41.6000) acc5: 68.0000 (68.0000) time: 11.8145 data: 11.7875 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.5679 (2.6337) acc1: 43.2000 (44.9455) acc5: 68.8000 (69.0909) time: 1.9793 data: 1.9586 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.8220 (2.7727) acc1: 41.6000 (41.4857) acc5: 67.2000 (66.9714) time: 1.0306 data: 1.0111 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.9081 (2.8047) acc1: 36.8000 (40.7226) acc5: 65.6000 (66.1419) time: 0.9852 data: 0.9664 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8539 (2.8500) acc1: 38.4000 (40.1951) acc5: 64.0000 (65.8341) time: 0.6664 data: 0.6475 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8539 (2.8624) acc1: 38.4000 (40.2560) acc5: 63.2000 (65.6960) time: 0.5486 data: 0.5293 max mem: 2905
Test: Total time: 0:00:48 (0.9692 s / it)
* Acc@1 41.166 Acc@5 66.114 loss 2.834
Accuracy of the model on the 50000 test images: 41.2%
Max accuracy: 50.54%
Epoch: [60] [ 0/625] eta: 3:44:21 lr: 0.003802 min_lr: 0.003802 loss: 3.2687 (3.2687) class_acc: 0.4648 (0.4648) weight_decay: 0.0500 (0.0500) time: 21.5383 data: 19.5985 max mem: 2905
Epoch: [60] [200/625] eta: 0:14:03 lr: 0.003799 min_lr: 0.003799 loss: 3.3259 (3.3044) class_acc: 0.4492 (0.4620) weight_decay: 0.0500 (0.0500) grad_norm: 1.7361 (2.3662) time: 2.1894 data: 0.1077 max mem: 2905
Epoch: [60] [400/625] eta: 0:07:19 lr: 0.003796 min_lr: 0.003796 loss: 3.3135 (3.3169) class_acc: 0.4531 (0.4596) weight_decay: 0.0500 (0.0500) grad_norm: 2.7666 (2.3642) time: 1.9194 data: 0.0060 max mem: 2905
Epoch: [60] [600/625] eta: 0:00:48 lr: 0.003793 min_lr: 0.003793 loss: 3.3658 (3.3245) class_acc: 0.4531 (0.4576) weight_decay: 0.0500 (0.0500) grad_norm: 2.8036 (2.4371) time: 1.9693 data: 0.0098 max mem: 2905
Epoch: [60] [624/625] eta: 0:00:01 lr: 0.003792 min_lr: 0.003792 loss: 3.2929 (3.3241) class_acc: 0.4531 (0.4576) weight_decay: 0.0500 (0.0500) grad_norm: 2.6227 (2.4382) time: 0.4429 data: 0.0021 max mem: 2905
Epoch: [60] Total time: 0:19:49 (1.9039 s / it)
Averaged stats: lr: 0.003792 min_lr: 0.003792 loss: 3.2929 (3.3129) class_acc: 0.4531 (0.4587) weight_decay: 0.0500 (0.0500) grad_norm: 2.6227 (2.4382)
Test: [ 0/50] eta: 0:10:19 loss: 2.4493 (2.4493) acc1: 49.6000 (49.6000) acc5: 72.8000 (72.8000) time: 12.3956 data: 12.3685 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 2.4073 (2.2970) acc1: 49.6000 (50.4000) acc5: 72.8000 (74.6182) time: 2.1071 data: 2.0854 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.4407 (2.4134) acc1: 47.2000 (48.0000) acc5: 71.2000 (73.1048) time: 1.1267 data: 1.1064 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.4924 (2.4219) acc1: 44.8000 (47.4323) acc5: 71.2000 (72.6710) time: 1.1551 data: 1.1348 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5364 (2.4671) acc1: 44.0000 (46.8878) acc5: 71.2000 (71.5707) time: 0.8972 data: 0.8773 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6074 (2.4902) acc1: 44.8000 (46.7360) acc5: 69.6000 (71.6160) time: 0.8044 data: 0.7864 max mem: 2905
Test: Total time: 0:00:53 (1.0737 s / it)
* Acc@1 46.664 Acc@5 71.960 loss 2.469
Accuracy of the model on the 50000 test images: 46.7%
Max accuracy: 50.54%
Epoch: [61] [ 0/625] eta: 3:28:39 lr: 0.003792 min_lr: 0.003792 loss: 3.4290 (3.4290) class_acc: 0.4180 (0.4180) weight_decay: 0.0500 (0.0500) time: 20.0314 data: 16.9222 max mem: 2905
Epoch: [61] [200/625] eta: 0:14:03 lr: 0.003789 min_lr: 0.003789 loss: 3.3371 (3.2947) class_acc: 0.4609 (0.4648) weight_decay: 0.0500 (0.0500) grad_norm: 1.3305 (inf) time: 1.7724 data: 0.0008 max mem: 2905
Epoch: [61] [400/625] eta: 0:07:18 lr: 0.003786 min_lr: 0.003786 loss: 3.3204 (3.3116) class_acc: 0.4570 (0.4608) weight_decay: 0.0500 (0.0500) grad_norm: 1.5571 (inf) time: 1.9372 data: 0.0364 max mem: 2905
Epoch: [61] [600/625] eta: 0:00:48 lr: 0.003782 min_lr: 0.003782 loss: 3.3354 (3.3174) class_acc: 0.4570 (0.4592) weight_decay: 0.0500 (0.0500) grad_norm: 2.0327 (inf) time: 1.9096 data: 0.0569 max mem: 2905
Epoch: [61] [624/625] eta: 0:00:01 lr: 0.003782 min_lr: 0.003782 loss: 3.3305 (3.3174) class_acc: 0.4453 (0.4591) weight_decay: 0.0500 (0.0500) grad_norm: 2.0398 (inf) time: 0.9253 data: 0.1530 max mem: 2905
Epoch: [61] Total time: 0:19:49 (1.9027 s / it)
Averaged stats: lr: 0.003782 min_lr: 0.003782 loss: 3.3305 (3.3062) class_acc: 0.4453 (0.4601) weight_decay: 0.0500 (0.0500) grad_norm: 2.0398 (inf)
Test: [ 0/50] eta: 0:10:14 loss: 2.6603 (2.6603) acc1: 42.4000 (42.4000) acc5: 74.4000 (74.4000) time: 12.2876 data: 12.2547 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 2.4463 (2.4569) acc1: 46.4000 (47.4182) acc5: 72.8000 (73.2364) time: 2.1033 data: 2.0830 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.5358 (2.6238) acc1: 45.6000 (44.0762) acc5: 69.6000 (70.4381) time: 1.1224 data: 1.1031 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.7010 (2.6093) acc1: 41.6000 (44.0516) acc5: 68.0000 (70.0129) time: 1.1473 data: 1.1284 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5750 (2.6176) acc1: 44.0000 (44.1951) acc5: 68.8000 (69.8341) time: 0.8082 data: 0.7889 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6903 (2.6433) acc1: 43.2000 (43.9840) acc5: 67.2000 (69.2800) time: 0.7064 data: 0.6862 max mem: 2905
Test: Total time: 0:00:51 (1.0300 s / it)
* Acc@1 44.604 Acc@5 69.548 loss 2.620
Accuracy of the model on the 50000 test images: 44.6%
Max accuracy: 50.54%
Epoch: [62] [ 0/625] eta: 3:34:36 lr: 0.003782 min_lr: 0.003782 loss: 3.2953 (3.2953) class_acc: 0.4727 (0.4727) weight_decay: 0.0500 (0.0500) time: 20.6029 data: 18.2351 max mem: 2905
Epoch: [62] [200/625] eta: 0:14:20 lr: 0.003779 min_lr: 0.003779 loss: 3.3156 (3.2916) class_acc: 0.4492 (0.4614) weight_decay: 0.0500 (0.0500) grad_norm: 1.9701 (2.4303) time: 2.0279 data: 0.0635 max mem: 2905
Epoch: [62] [400/625] eta: 0:07:30 lr: 0.003775 min_lr: 0.003775 loss: 3.2369 (3.2940) class_acc: 0.4766 (0.4630) weight_decay: 0.0500 (0.0500) grad_norm: 1.5507 (2.3717) time: 2.1138 data: 0.0062 max mem: 2905
Epoch: [62] [600/625] eta: 0:00:50 lr: 0.003772 min_lr: 0.003772 loss: 3.3230 (3.3032) class_acc: 0.4609 (0.4611) weight_decay: 0.0500 (0.0500) grad_norm: 1.9138 (2.3462) time: 1.9723 data: 0.0007 max mem: 2905
Epoch: [62] [624/625] eta: 0:00:01 lr: 0.003772 min_lr: 0.003772 loss: 3.3741 (3.3057) class_acc: 0.4492 (0.4608) weight_decay: 0.0500 (0.0500) grad_norm: 3.4766 (2.4026) time: 0.7307 data: 0.0019 max mem: 2905
Epoch: [62] Total time: 0:20:24 (1.9587 s / it)
Averaged stats: lr: 0.003772 min_lr: 0.003772 loss: 3.3741 (3.3025) class_acc: 0.4492 (0.4614) weight_decay: 0.0500 (0.0500) grad_norm: 3.4766 (2.4026)
Test: [ 0/50] eta: 0:09:41 loss: 4.1889 (4.1889) acc1: 24.8000 (24.8000) acc5: 44.8000 (44.8000) time: 11.6306 data: 11.5982 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 4.2697 (4.3001) acc1: 23.2000 (23.2000) acc5: 43.2000 (41.8909) time: 2.0929 data: 2.0735 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 4.2838 (4.2976) acc1: 21.6000 (22.4762) acc5: 43.2000 (43.3143) time: 1.2279 data: 1.2095 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 4.2838 (4.2823) acc1: 21.6000 (23.1226) acc5: 44.8000 (43.6645) time: 1.2612 data: 1.2428 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 4.3064 (4.3028) acc1: 22.4000 (22.9463) acc5: 44.8000 (43.3171) time: 0.8346 data: 0.8143 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 4.1999 (4.2837) acc1: 23.2000 (23.5200) acc5: 42.4000 (43.5360) time: 0.6891 data: 0.6682 max mem: 2905
Test: Total time: 0:00:54 (1.0802 s / it)
* Acc@1 23.526 Acc@5 44.086 loss 4.261
Accuracy of the model on the 50000 test images: 23.5%
Max accuracy: 50.54%
Epoch: [63] [ 0/625] eta: 3:55:24 lr: 0.003772 min_lr: 0.003772 loss: 3.0685 (3.0685) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 22.5997 data: 17.5528 max mem: 2905
Epoch: [63] [200/625] eta: 0:15:04 lr: 0.003768 min_lr: 0.003768 loss: 3.2956 (3.2822) class_acc: 0.4688 (0.4638) weight_decay: 0.0500 (0.0500) grad_norm: 1.6090 (2.1121) time: 2.0489 data: 0.0008 max mem: 2905
Epoch: [63] [400/625] eta: 0:07:41 lr: 0.003765 min_lr: 0.003765 loss: 3.3249 (3.2900) class_acc: 0.4375 (0.4625) weight_decay: 0.0500 (0.0500) grad_norm: 2.0915 (2.2819) time: 2.0329 data: 0.0098 max mem: 2905
Epoch: [63] [600/625] eta: 0:00:51 lr: 0.003762 min_lr: 0.003762 loss: 3.3034 (3.2949) class_acc: 0.4570 (0.4615) weight_decay: 0.0500 (0.0500) grad_norm: 1.7658 (2.2952) time: 2.1978 data: 0.0008 max mem: 2905
Epoch: [63] [624/625] eta: 0:00:01 lr: 0.003761 min_lr: 0.003761 loss: 3.3163 (3.2961) class_acc: 0.4570 (0.4613) weight_decay: 0.0500 (0.0500) grad_norm: 2.4834 (2.3165) time: 0.7203 data: 0.0016 max mem: 2905
Epoch: [63] Total time: 0:20:48 (1.9969 s / it)
Averaged stats: lr: 0.003761 min_lr: 0.003761 loss: 3.3163 (3.2990) class_acc: 0.4570 (0.4616) weight_decay: 0.0500 (0.0500) grad_norm: 2.4834 (2.3165)
Test: [ 0/50] eta: 0:10:58 loss: 2.8210 (2.8210) acc1: 39.2000 (39.2000) acc5: 64.8000 (64.8000) time: 13.1770 data: 13.1492 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.5112 (2.5639) acc1: 49.6000 (47.7818) acc5: 69.6000 (69.8182) time: 2.1529 data: 2.1330 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.7240 (2.7259) acc1: 41.6000 (44.1905) acc5: 67.2000 (67.3143) time: 1.1248 data: 1.1057 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.8464 (2.7408) acc1: 40.0000 (43.3290) acc5: 64.8000 (67.5355) time: 1.1552 data: 1.1357 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7646 (2.7504) acc1: 40.0000 (42.9268) acc5: 64.8000 (67.5512) time: 0.8427 data: 0.8226 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6743 (2.7355) acc1: 41.6000 (42.8480) acc5: 68.0000 (67.9200) time: 0.7518 data: 0.7322 max mem: 2905
Test: Total time: 0:00:53 (1.0701 s / it)
* Acc@1 43.300 Acc@5 68.072 loss 2.707
Accuracy of the model on the 50000 test images: 43.3%
Max accuracy: 50.54%
Epoch: [64] [ 0/625] eta: 4:13:30 lr: 0.003761 min_lr: 0.003761 loss: 3.2595 (3.2595) class_acc: 0.4648 (0.4648) weight_decay: 0.0500 (0.0500) time: 24.3364 data: 21.6845 max mem: 2905
Epoch: [64] [200/625] eta: 0:14:48 lr: 0.003758 min_lr: 0.003758 loss: 3.3159 (3.2854) class_acc: 0.4531 (0.4652) weight_decay: 0.0500 (0.0500) grad_norm: 1.4851 (2.6517) time: 1.8460 data: 0.0233 max mem: 2905
Epoch: [64] [400/625] eta: 0:07:34 lr: 0.003754 min_lr: 0.003754 loss: 3.2846 (3.2904) class_acc: 0.4688 (0.4646) weight_decay: 0.0500 (0.0500) grad_norm: 1.9850 (2.5268) time: 2.0707 data: 0.0432 max mem: 2905
Epoch: [64] [600/625] eta: 0:00:49 lr: 0.003751 min_lr: 0.003751 loss: 3.2551 (3.2931) class_acc: 0.4648 (0.4638) weight_decay: 0.0500 (0.0500) grad_norm: 1.5453 (2.4160) time: 1.7913 data: 0.0008 max mem: 2905
Epoch: [64] [624/625] eta: 0:00:01 lr: 0.003751 min_lr: 0.003751 loss: 3.2928 (3.2933) class_acc: 0.4453 (0.4638) weight_decay: 0.0500 (0.0500) grad_norm: 1.8197 (2.4034) time: 1.1334 data: 0.0013 max mem: 2905
Epoch: [64] Total time: 0:20:16 (1.9471 s / it)
Averaged stats: lr: 0.003751 min_lr: 0.003751 loss: 3.2928 (3.2943) class_acc: 0.4453 (0.4625) weight_decay: 0.0500 (0.0500) grad_norm: 1.8197 (2.4034)
Test: [ 0/50] eta: 0:10:20 loss: 2.5753 (2.5753) acc1: 43.2000 (43.2000) acc5: 73.6000 (73.6000) time: 12.4140 data: 12.3846 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.2326 (2.2686) acc1: 51.2000 (51.7818) acc5: 73.6000 (74.6909) time: 2.1583 data: 2.1376 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.4573 (2.4576) acc1: 45.6000 (47.9238) acc5: 71.2000 (71.8095) time: 1.1754 data: 1.1558 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.4989 (2.4497) acc1: 45.6000 (47.8710) acc5: 69.6000 (72.2839) time: 1.0760 data: 1.0568 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.4338 (2.4724) acc1: 47.2000 (47.4927) acc5: 73.6000 (71.8634) time: 0.6565 data: 0.6360 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4733 (2.4769) acc1: 47.2000 (47.5520) acc5: 72.0000 (72.0000) time: 0.5592 data: 0.5377 max mem: 2905
Test: Total time: 0:00:49 (0.9991 s / it)
* Acc@1 47.702 Acc@5 72.402 loss 2.465
Accuracy of the model on the 50000 test images: 47.7%
Max accuracy: 50.54%
Epoch: [65] [ 0/625] eta: 3:28:46 lr: 0.003751 min_lr: 0.003751 loss: 3.0546 (3.0546) class_acc: 0.5234 (0.5234) weight_decay: 0.0500 (0.0500) time: 20.0417 data: 17.8946 max mem: 2905
Epoch: [65] [200/625] eta: 0:13:52 lr: 0.003747 min_lr: 0.003747 loss: 3.2824 (3.2756) class_acc: 0.4492 (0.4665) weight_decay: 0.0500 (0.0500) grad_norm: 2.0486 (2.1572) time: 1.7947 data: 0.0153 max mem: 2905
Epoch: [65] [400/625] eta: 0:07:15 lr: 0.003744 min_lr: 0.003744 loss: 3.2794 (3.2817) class_acc: 0.4570 (0.4666) weight_decay: 0.0500 (0.0500) grad_norm: 2.1750 (2.3234) time: 1.7721 data: 0.0006 max mem: 2905
Epoch: [65] [600/625] eta: 0:00:48 lr: 0.003740 min_lr: 0.003740 loss: 3.2455 (3.2915) class_acc: 0.4648 (0.4642) weight_decay: 0.0500 (0.0500) grad_norm: 2.1300 (2.3734) time: 1.8006 data: 0.0007 max mem: 2905
Epoch: [65] [624/625] eta: 0:00:01 lr: 0.003740 min_lr: 0.003740 loss: 3.2493 (3.2904) class_acc: 0.4570 (0.4640) weight_decay: 0.0500 (0.0500) grad_norm: 1.5518 (2.3458) time: 0.5022 data: 0.0013 max mem: 2905
Epoch: [65] Total time: 0:19:46 (1.8983 s / it)
Averaged stats: lr: 0.003740 min_lr: 0.003740 loss: 3.2493 (3.2930) class_acc: 0.4570 (0.4627) weight_decay: 0.0500 (0.0500) grad_norm: 1.5518 (2.3458)
Test: [ 0/50] eta: 0:11:23 loss: 3.3565 (3.3565) acc1: 30.4000 (30.4000) acc5: 58.4000 (58.4000) time: 13.6671 data: 13.6356 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 2.6612 (2.7729) acc1: 45.6000 (43.3455) acc5: 68.0000 (66.6182) time: 2.2197 data: 2.1993 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.7806 (2.9154) acc1: 40.8000 (39.4667) acc5: 66.4000 (65.0286) time: 1.1249 data: 1.1030 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.9907 (2.9645) acc1: 35.2000 (38.9419) acc5: 62.4000 (64.2839) time: 1.1072 data: 1.0860 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9907 (2.9921) acc1: 37.6000 (38.8488) acc5: 62.4000 (63.9220) time: 0.7560 data: 0.7372 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0347 (2.9969) acc1: 38.4000 (39.0080) acc5: 63.2000 (63.8560) time: 0.7157 data: 0.6959 max mem: 2905
Test: Total time: 0:00:51 (1.0373 s / it)
* Acc@1 38.858 Acc@5 64.376 loss 2.958
Accuracy of the model on the 50000 test images: 38.9%
Max accuracy: 50.54%
Epoch: [66] [ 0/625] eta: 4:13:42 lr: 0.003740 min_lr: 0.003740 loss: 3.1317 (3.1317) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 24.3558 data: 18.3498 max mem: 2905
Epoch: [66] [200/625] eta: 0:14:38 lr: 0.003736 min_lr: 0.003736 loss: 3.2005 (3.2693) class_acc: 0.4883 (0.4694) weight_decay: 0.0500 (0.0500) grad_norm: 1.6101 (2.0653) time: 1.9880 data: 0.0007 max mem: 2905
Epoch: [66] [400/625] eta: 0:07:36 lr: 0.003732 min_lr: 0.003732 loss: 3.2568 (3.2826) class_acc: 0.4727 (0.4666) weight_decay: 0.0500 (0.0500) grad_norm: 1.8743 (2.2674) time: 1.9582 data: 0.0007 max mem: 2905
Epoch: [66] [600/625] eta: 0:00:50 lr: 0.003729 min_lr: 0.003729 loss: 3.2907 (3.2877) class_acc: 0.4688 (0.4656) weight_decay: 0.0500 (0.0500) grad_norm: 2.8982 (2.3220) time: 1.9225 data: 0.0008 max mem: 2905
Epoch: [66] [624/625] eta: 0:00:01 lr: 0.003728 min_lr: 0.003728 loss: 3.2886 (3.2894) class_acc: 0.4531 (0.4651) weight_decay: 0.0500 (0.0500) grad_norm: 1.8921 (2.3150) time: 0.8207 data: 0.0017 max mem: 2905
Epoch: [66] Total time: 0:20:28 (1.9652 s / it)
Averaged stats: lr: 0.003728 min_lr: 0.003728 loss: 3.2886 (3.2860) class_acc: 0.4531 (0.4647) weight_decay: 0.0500 (0.0500) grad_norm: 1.8921 (2.3150)
Test: [ 0/50] eta: 0:10:32 loss: 2.5922 (2.5922) acc1: 45.6000 (45.6000) acc5: 74.4000 (74.4000) time: 12.6414 data: 12.6096 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.4585 (2.3893) acc1: 48.0000 (49.5273) acc5: 74.4000 (73.9636) time: 2.1562 data: 2.1346 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.5236 (2.5445) acc1: 45.6000 (45.6762) acc5: 71.2000 (71.4286) time: 1.1626 data: 1.1431 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.5935 (2.5335) acc1: 42.4000 (45.6258) acc5: 70.4000 (71.2000) time: 1.1773 data: 1.1582 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.4917 (2.5227) acc1: 46.4000 (46.1659) acc5: 71.2000 (71.0439) time: 0.8395 data: 0.8202 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5178 (2.5222) acc1: 46.4000 (46.0960) acc5: 71.2000 (71.1840) time: 0.7331 data: 0.7132 max mem: 2905
Test: Total time: 0:00:53 (1.0723 s / it)
* Acc@1 46.272 Acc@5 71.408 loss 2.506
Accuracy of the model on the 50000 test images: 46.3%
Max accuracy: 50.54%
Epoch: [67] [ 0/625] eta: 3:35:29 lr: 0.003728 min_lr: 0.003728 loss: 3.3032 (3.3032) class_acc: 0.4453 (0.4453) weight_decay: 0.0500 (0.0500) time: 20.6867 data: 16.2919 max mem: 2905
Epoch: [67] [200/625] eta: 0:14:11 lr: 0.003725 min_lr: 0.003725 loss: 3.2558 (3.2759) class_acc: 0.4531 (0.4656) weight_decay: 0.0500 (0.0500) grad_norm: 1.8062 (2.4144) time: 1.9989 data: 0.0008 max mem: 2905
Epoch: [67] [400/625] eta: 0:07:21 lr: 0.003721 min_lr: 0.003721 loss: 3.2501 (3.2790) class_acc: 0.4648 (0.4651) weight_decay: 0.0500 (0.0500) grad_norm: 2.6835 (inf) time: 1.8869 data: 0.0009 max mem: 2905
Epoch: [67] [600/625] eta: 0:00:49 lr: 0.003717 min_lr: 0.003717 loss: 3.3442 (3.2892) class_acc: 0.4531 (0.4630) weight_decay: 0.0500 (0.0500) grad_norm: 2.3558 (inf) time: 1.9344 data: 0.0008 max mem: 2905
Epoch: [67] [624/625] eta: 0:00:01 lr: 0.003717 min_lr: 0.003717 loss: 3.3001 (3.2896) class_acc: 0.4609 (0.4629) weight_decay: 0.0500 (0.0500) grad_norm: 2.4876 (inf) time: 0.4860 data: 0.0027 max mem: 2905
Epoch: [67] Total time: 0:20:17 (1.9482 s / it)
Averaged stats: lr: 0.003717 min_lr: 0.003717 loss: 3.3001 (3.2834) class_acc: 0.4609 (0.4649) weight_decay: 0.0500 (0.0500) grad_norm: 2.4876 (inf)
Test: [ 0/50] eta: 0:10:11 loss: 2.4155 (2.4155) acc1: 49.6000 (49.6000) acc5: 75.2000 (75.2000) time: 12.2247 data: 12.1922 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.3761 (2.3209) acc1: 51.2000 (51.3455) acc5: 72.8000 (74.9818) time: 2.0864 data: 2.0654 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.5671 (2.5084) acc1: 46.4000 (47.2381) acc5: 70.4000 (72.2667) time: 1.1415 data: 1.1221 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.7034 (2.5077) acc1: 43.2000 (46.8903) acc5: 69.6000 (71.6387) time: 1.1633 data: 1.1441 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5463 (2.5183) acc1: 45.6000 (46.6537) acc5: 69.6000 (71.3561) time: 0.8611 data: 0.8416 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.4716 (2.5196) acc1: 45.6000 (46.4960) acc5: 68.8000 (71.1040) time: 0.7859 data: 0.7671 max mem: 2905
Test: Total time: 0:00:52 (1.0524 s / it)
* Acc@1 46.630 Acc@5 71.640 loss 2.499
Accuracy of the model on the 50000 test images: 46.6%
Max accuracy: 50.54%
Epoch: [68] [ 0/625] eta: 3:35:23 lr: 0.003717 min_lr: 0.003717 loss: 3.3656 (3.3656) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 20.6781 data: 17.5342 max mem: 2905
Epoch: [68] [200/625] eta: 0:14:32 lr: 0.003713 min_lr: 0.003713 loss: 3.2633 (3.2471) class_acc: 0.4648 (0.4744) weight_decay: 0.0500 (0.0500) grad_norm: 1.9967 (2.2418) time: 2.0033 data: 0.0008 max mem: 2905
Epoch: [68] [400/625] eta: 0:07:29 lr: 0.003710 min_lr: 0.003710 loss: 3.3248 (3.2670) class_acc: 0.4609 (0.4687) weight_decay: 0.0500 (0.0500) grad_norm: 1.8128 (2.3008) time: 1.9474 data: 0.0017 max mem: 2905
Epoch: [68] [600/625] eta: 0:00:49 lr: 0.003706 min_lr: 0.003706 loss: 3.2695 (3.2754) class_acc: 0.4648 (0.4669) weight_decay: 0.0500 (0.0500) grad_norm: 2.1800 (2.2377) time: 2.0684 data: 0.0064 max mem: 2905
Epoch: [68] [624/625] eta: 0:00:01 lr: 0.003705 min_lr: 0.003705 loss: 3.2841 (3.2756) class_acc: 0.4453 (0.4665) weight_decay: 0.0500 (0.0500) grad_norm: 2.5654 (2.2713) time: 0.8496 data: 0.0015 max mem: 2905
Epoch: [68] Total time: 0:20:19 (1.9518 s / it)
Averaged stats: lr: 0.003705 min_lr: 0.003705 loss: 3.2841 (3.2779) class_acc: 0.4453 (0.4654) weight_decay: 0.0500 (0.0500) grad_norm: 2.5654 (2.2713)
Test: [ 0/50] eta: 0:10:57 loss: 2.4177 (2.4177) acc1: 44.0000 (44.0000) acc5: 72.0000 (72.0000) time: 13.1497 data: 13.1174 max mem: 2905
Test: [10/50] eta: 0:01:32 loss: 2.4550 (2.4474) acc1: 46.4000 (47.5636) acc5: 72.0000 (71.7818) time: 2.3061 data: 2.2873 max mem: 2905
Test: [20/50] eta: 0:00:55 loss: 2.5473 (2.5723) acc1: 44.8000 (45.2952) acc5: 68.0000 (70.0191) time: 1.2749 data: 1.2568 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 2.6019 (2.5602) acc1: 44.0000 (46.1419) acc5: 67.2000 (70.1677) time: 1.2130 data: 1.1923 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.6563 (2.6148) acc1: 45.6000 (46.0098) acc5: 68.0000 (69.5024) time: 0.7932 data: 0.7721 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7203 (2.6443) acc1: 44.0000 (45.5840) acc5: 68.8000 (69.3760) time: 0.7319 data: 0.7132 max mem: 2905
Test: Total time: 0:00:55 (1.1041 s / it)
* Acc@1 44.886 Acc@5 69.590 loss 2.635
Accuracy of the model on the 50000 test images: 44.9%
Max accuracy: 50.54%
Epoch: [69] [ 0/625] eta: 3:47:55 lr: 0.003705 min_lr: 0.003705 loss: 3.2073 (3.2073) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 21.8809 data: 18.4495 max mem: 2905
Epoch: [69] [200/625] eta: 0:14:43 lr: 0.003702 min_lr: 0.003702 loss: 3.2782 (3.2626) class_acc: 0.4688 (0.4683) weight_decay: 0.0500 (0.0500) grad_norm: 1.8088 (2.2343) time: 2.0267 data: 0.1072 max mem: 2905
Epoch: [69] [400/625] eta: 0:07:30 lr: 0.003698 min_lr: 0.003698 loss: 3.2588 (3.2713) class_acc: 0.4648 (0.4668) weight_decay: 0.0500 (0.0500) grad_norm: 1.8190 (2.2371) time: 1.7620 data: 0.0054 max mem: 2905
Epoch: [69] [600/625] eta: 0:00:50 lr: 0.003694 min_lr: 0.003694 loss: 3.2714 (3.2814) class_acc: 0.4648 (0.4644) weight_decay: 0.0500 (0.0500) grad_norm: 1.9751 (2.3004) time: 2.0135 data: 0.0411 max mem: 2905
Epoch: [69] [624/625] eta: 0:00:01 lr: 0.003694 min_lr: 0.003694 loss: 3.2842 (3.2822) class_acc: 0.4570 (0.4643) weight_decay: 0.0500 (0.0500) grad_norm: 1.9751 (2.3004) time: 0.7078 data: 0.0012 max mem: 2905
Epoch: [69] Total time: 0:20:18 (1.9504 s / it)
Averaged stats: lr: 0.003694 min_lr: 0.003694 loss: 3.2842 (3.2774) class_acc: 0.4570 (0.4661) weight_decay: 0.0500 (0.0500) grad_norm: 1.9751 (2.3004)
Test: [ 0/50] eta: 0:10:14 loss: 2.5352 (2.5352) acc1: 47.2000 (47.2000) acc5: 72.8000 (72.8000) time: 12.2867 data: 12.2539 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.3039 (2.2981) acc1: 50.4000 (50.6909) acc5: 73.6000 (74.0364) time: 2.0250 data: 2.0035 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.4613 (2.4317) acc1: 48.0000 (48.0762) acc5: 72.8000 (72.5714) time: 1.0466 data: 1.0250 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.5963 (2.4802) acc1: 44.8000 (47.2516) acc5: 70.4000 (71.4323) time: 0.9370 data: 0.9161 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.5883 (2.4845) acc1: 44.0000 (47.1024) acc5: 70.4000 (71.1805) time: 0.5641 data: 0.5457 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5888 (2.5088) acc1: 43.2000 (46.5440) acc5: 69.6000 (70.8960) time: 0.5196 data: 0.5018 max mem: 2905
Test: Total time: 0:00:45 (0.9007 s / it)
* Acc@1 46.142 Acc@5 71.584 loss 2.487
Accuracy of the model on the 50000 test images: 46.1%
Max accuracy: 50.54%
Epoch: [70] [ 0/625] eta: 3:31:54 lr: 0.003694 min_lr: 0.003694 loss: 3.1680 (3.1680) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 20.3439 data: 19.5478 max mem: 2905
Epoch: [70] [200/625] eta: 0:14:13 lr: 0.003690 min_lr: 0.003690 loss: 3.2342 (3.2535) class_acc: 0.4688 (0.4697) weight_decay: 0.0500 (0.0500) grad_norm: 1.8662 (2.0418) time: 1.8954 data: 1.1768 max mem: 2905
Epoch: [70] [400/625] eta: 0:07:21 lr: 0.003686 min_lr: 0.003686 loss: 3.2927 (3.2590) class_acc: 0.4609 (0.4698) weight_decay: 0.0500 (0.0500) grad_norm: 2.2044 (2.1714) time: 1.9010 data: 1.6722 max mem: 2905
Epoch: [70] [600/625] eta: 0:00:48 lr: 0.003682 min_lr: 0.003682 loss: 3.2808 (3.2661) class_acc: 0.4648 (0.4688) weight_decay: 0.0500 (0.0500) grad_norm: 2.2191 (2.2436) time: 1.8598 data: 1.6945 max mem: 2905
Epoch: [70] [624/625] eta: 0:00:01 lr: 0.003682 min_lr: 0.003682 loss: 3.2867 (3.2673) class_acc: 0.4531 (0.4683) weight_decay: 0.0500 (0.0500) grad_norm: 1.5888 (2.2310) time: 0.8394 data: 0.6855 max mem: 2905
Epoch: [70] Total time: 0:19:45 (1.8970 s / it)
Averaged stats: lr: 0.003682 min_lr: 0.003682 loss: 3.2867 (3.2710) class_acc: 0.4531 (0.4675) weight_decay: 0.0500 (0.0500) grad_norm: 1.5888 (2.2310)
Test: [ 0/50] eta: 0:09:43 loss: 2.3033 (2.3033) acc1: 44.8000 (44.8000) acc5: 75.2000 (75.2000) time: 11.6675 data: 11.6301 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.4451 (2.5108) acc1: 47.2000 (47.2727) acc5: 71.2000 (71.4909) time: 2.0781 data: 2.0551 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.5803 (2.6281) acc1: 45.6000 (44.8381) acc5: 69.6000 (70.1714) time: 1.1449 data: 1.1249 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.6033 (2.6347) acc1: 40.0000 (44.6194) acc5: 69.6000 (70.4774) time: 0.9306 data: 0.9109 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.6644 (2.6802) acc1: 44.8000 (44.1366) acc5: 68.0000 (69.4829) time: 0.4902 data: 0.4692 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7450 (2.6783) acc1: 41.6000 (44.2240) acc5: 67.2000 (69.4720) time: 0.4627 data: 0.4426 max mem: 2905
Test: Total time: 0:00:45 (0.9005 s / it)
* Acc@1 44.674 Acc@5 69.370 loss 2.661
Accuracy of the model on the 50000 test images: 44.7%
Max accuracy: 50.54%
Epoch: [71] [ 0/625] eta: 3:58:25 lr: 0.003681 min_lr: 0.003681 loss: 3.3352 (3.3352) class_acc: 0.4648 (0.4648) weight_decay: 0.0500 (0.0500) time: 22.8884 data: 20.4500 max mem: 2905
Epoch: [71] [200/625] eta: 0:13:31 lr: 0.003678 min_lr: 0.003678 loss: 3.2586 (3.2437) class_acc: 0.4609 (0.4738) weight_decay: 0.0500 (0.0500) grad_norm: 2.2543 (2.4048) time: 1.8935 data: 0.0062 max mem: 2905
Epoch: [71] [400/625] eta: 0:07:12 lr: 0.003674 min_lr: 0.003674 loss: 3.1917 (3.2518) class_acc: 0.4844 (0.4710) weight_decay: 0.0500 (0.0500) grad_norm: 2.3252 (2.2583) time: 1.9401 data: 0.0230 max mem: 2905
Epoch: [71] [600/625] eta: 0:00:48 lr: 0.003670 min_lr: 0.003670 loss: 3.2620 (3.2606) class_acc: 0.4570 (0.4691) weight_decay: 0.0500 (0.0500) grad_norm: 1.6146 (2.2707) time: 1.8943 data: 0.0006 max mem: 2905
Epoch: [71] [624/625] eta: 0:00:01 lr: 0.003669 min_lr: 0.003669 loss: 3.2440 (3.2606) class_acc: 0.4766 (0.4692) weight_decay: 0.0500 (0.0500) grad_norm: 1.9048 (2.2708) time: 0.3730 data: 0.0108 max mem: 2905
Epoch: [71] Total time: 0:19:59 (1.9188 s / it)
Averaged stats: lr: 0.003669 min_lr: 0.003669 loss: 3.2440 (3.2653) class_acc: 0.4766 (0.4688) weight_decay: 0.0500 (0.0500) grad_norm: 1.9048 (2.2708)
Test: [ 0/50] eta: 0:10:34 loss: 2.8029 (2.8029) acc1: 43.2000 (43.2000) acc5: 64.0000 (64.0000) time: 12.6890 data: 12.6506 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.5863 (2.6212) acc1: 45.6000 (46.2545) acc5: 69.6000 (69.9636) time: 2.0631 data: 2.0422 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.8255 (2.7735) acc1: 40.8000 (42.1333) acc5: 68.0000 (67.3905) time: 1.0703 data: 1.0511 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.8536 (2.7821) acc1: 38.4000 (41.6774) acc5: 66.4000 (67.2000) time: 1.1133 data: 1.0934 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.8018 (2.8022) acc1: 42.4000 (41.6585) acc5: 67.2000 (67.0829) time: 1.0214 data: 1.0019 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7924 (2.7989) acc1: 39.2000 (41.6000) acc5: 68.0000 (67.1360) time: 0.9269 data: 0.9072 max mem: 2905
Test: Total time: 0:00:57 (1.1533 s / it)
* Acc@1 42.048 Acc@5 67.582 loss 2.781
Accuracy of the model on the 50000 test images: 42.0%
Max accuracy: 50.54%
Epoch: [72] [ 0/625] eta: 4:45:59 lr: 0.003669 min_lr: 0.003669 loss: 3.2441 (3.2441) class_acc: 0.4609 (0.4609) weight_decay: 0.0500 (0.0500) time: 27.4554 data: 27.3179 max mem: 2905
Epoch: [72] [200/625] eta: 0:14:47 lr: 0.003665 min_lr: 0.003665 loss: 3.2518 (3.2295) class_acc: 0.4688 (0.4770) weight_decay: 0.0500 (0.0500) grad_norm: 2.1179 (2.3574) time: 1.9984 data: 0.6676 max mem: 2905
Epoch: [72] [400/625] eta: 0:07:29 lr: 0.003661 min_lr: 0.003661 loss: 3.2914 (3.2553) class_acc: 0.4609 (0.4712) weight_decay: 0.0500 (0.0500) grad_norm: 1.7572 (2.3537) time: 1.9639 data: 0.0008 max mem: 2905
Epoch: [72] [600/625] eta: 0:00:50 lr: 0.003657 min_lr: 0.003657 loss: 3.2522 (3.2603) class_acc: 0.4609 (0.4706) weight_decay: 0.0500 (0.0500) grad_norm: 1.5007 (2.3501) time: 2.0197 data: 0.0009 max mem: 2905
Epoch: [72] [624/625] eta: 0:00:01 lr: 0.003657 min_lr: 0.003657 loss: 3.2724 (3.2622) class_acc: 0.4609 (0.4700) weight_decay: 0.0500 (0.0500) grad_norm: 2.0141 (2.3422) time: 0.7504 data: 0.0107 max mem: 2905
Epoch: [72] Total time: 0:20:19 (1.9520 s / it)
Averaged stats: lr: 0.003657 min_lr: 0.003657 loss: 3.2724 (3.2626) class_acc: 0.4609 (0.4694) weight_decay: 0.0500 (0.0500) grad_norm: 2.0141 (2.3422)
Test: [ 0/50] eta: 0:10:50 loss: 2.9999 (2.9999) acc1: 39.2000 (39.2000) acc5: 68.0000 (68.0000) time: 13.0025 data: 12.9768 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.4491 (2.4742) acc1: 49.6000 (47.4909) acc5: 72.0000 (71.4909) time: 2.0189 data: 1.9978 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.5583 (2.6920) acc1: 40.0000 (42.8190) acc5: 68.8000 (68.5333) time: 0.9823 data: 0.9625 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.8917 (2.7317) acc1: 39.2000 (42.7097) acc5: 64.8000 (67.6387) time: 0.9727 data: 0.9534 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8917 (2.7469) acc1: 40.8000 (42.7902) acc5: 64.8000 (67.4732) time: 0.6816 data: 0.6608 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8702 (2.7662) acc1: 40.8000 (42.4800) acc5: 65.6000 (67.5520) time: 0.5656 data: 0.5451 max mem: 2905
Test: Total time: 0:00:48 (0.9613 s / it)
* Acc@1 42.372 Acc@5 67.640 loss 2.754
Accuracy of the model on the 50000 test images: 42.4%
Max accuracy: 50.54%
Epoch: [73] [ 0/625] eta: 3:41:34 lr: 0.003657 min_lr: 0.003657 loss: 3.3996 (3.3996) class_acc: 0.4336 (0.4336) weight_decay: 0.0500 (0.0500) time: 21.2717 data: 21.1469 max mem: 2905
Epoch: [73] [200/625] eta: 0:14:18 lr: 0.003653 min_lr: 0.003653 loss: 3.2344 (3.2404) class_acc: 0.4648 (0.4727) weight_decay: 0.0500 (0.0500) grad_norm: 1.8714 (2.3847) time: 2.1308 data: 0.2239 max mem: 2905
Epoch: [73] [400/625] eta: 0:07:24 lr: 0.003649 min_lr: 0.003649 loss: 3.2630 (3.2619) class_acc: 0.4609 (0.4695) weight_decay: 0.0500 (0.0500) grad_norm: 1.5318 (2.4013) time: 2.0914 data: 0.4550 max mem: 2905
Epoch: [73] [600/625] eta: 0:00:49 lr: 0.003645 min_lr: 0.003645 loss: 3.1777 (3.2662) class_acc: 0.4688 (0.4683) weight_decay: 0.0500 (0.0500) grad_norm: 1.5482 (2.4028) time: 1.9119 data: 0.0007 max mem: 2905
Epoch: [73] [624/625] eta: 0:00:01 lr: 0.003644 min_lr: 0.003644 loss: 3.2238 (3.2663) class_acc: 0.4688 (0.4683) weight_decay: 0.0500 (0.0500) grad_norm: 2.1276 (2.4070) time: 0.6894 data: 0.0195 max mem: 2905
Epoch: [73] Total time: 0:20:13 (1.9422 s / it)
Averaged stats: lr: 0.003644 min_lr: 0.003644 loss: 3.2238 (3.2615) class_acc: 0.4688 (0.4691) weight_decay: 0.0500 (0.0500) grad_norm: 2.1276 (2.4070)
Test: [ 0/50] eta: 0:10:19 loss: 2.4044 (2.4044) acc1: 48.0000 (48.0000) acc5: 73.6000 (73.6000) time: 12.3961 data: 12.3669 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.4581 (2.4322) acc1: 48.0000 (48.6545) acc5: 73.6000 (73.6727) time: 2.0979 data: 2.0774 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.5647 (2.5454) acc1: 44.0000 (45.7524) acc5: 72.0000 (71.5429) time: 1.1155 data: 1.0947 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.5929 (2.5454) acc1: 41.6000 (45.3419) acc5: 70.4000 (70.8129) time: 1.0391 data: 1.0181 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5929 (2.6031) acc1: 41.6000 (44.2146) acc5: 68.8000 (70.1854) time: 0.6100 data: 0.5895 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7911 (2.6439) acc1: 40.8000 (43.6000) acc5: 68.8000 (69.7120) time: 0.5337 data: 0.5132 max mem: 2905
Test: Total time: 0:00:47 (0.9540 s / it)
* Acc@1 44.012 Acc@5 69.468 loss 2.622
Accuracy of the model on the 50000 test images: 44.0%
Max accuracy: 50.54%
Epoch: [74] [ 0/625] eta: 3:22:34 lr: 0.003644 min_lr: 0.003644 loss: 3.2778 (3.2778) class_acc: 0.4297 (0.4297) weight_decay: 0.0500 (0.0500) time: 19.4468 data: 17.8797 max mem: 2905
Epoch: [74] [200/625] eta: 0:13:44 lr: 0.003640 min_lr: 0.003640 loss: 3.2522 (3.2521) class_acc: 0.4805 (0.4716) weight_decay: 0.0500 (0.0500) grad_norm: 1.9541 (inf) time: 1.7463 data: 0.9433 max mem: 2905
Epoch: [74] [400/625] eta: 0:06:59 lr: 0.003636 min_lr: 0.003636 loss: 3.2739 (3.2589) class_acc: 0.4688 (0.4695) weight_decay: 0.0500 (0.0500) grad_norm: 1.7812 (inf) time: 1.7296 data: 1.4300 max mem: 2905
Epoch: [74] [600/625] eta: 0:00:46 lr: 0.003632 min_lr: 0.003632 loss: 3.2757 (3.2619) class_acc: 0.4531 (0.4688) weight_decay: 0.0500 (0.0500) grad_norm: 2.6342 (inf) time: 1.8961 data: 1.7076 max mem: 2905
Epoch: [74] [624/625] eta: 0:00:01 lr: 0.003631 min_lr: 0.003631 loss: 3.2375 (3.2617) class_acc: 0.4688 (0.4686) weight_decay: 0.0500 (0.0500) grad_norm: 2.1911 (inf) time: 0.7458 data: 0.5926 max mem: 2905
Epoch: [74] Total time: 0:19:14 (1.8473 s / it)
Averaged stats: lr: 0.003631 min_lr: 0.003631 loss: 3.2375 (3.2609) class_acc: 0.4688 (0.4693) weight_decay: 0.0500 (0.0500) grad_norm: 2.1911 (inf)
Test: [ 0/50] eta: 0:10:40 loss: 2.7115 (2.7115) acc1: 40.0000 (40.0000) acc5: 67.2000 (67.2000) time: 12.8154 data: 12.7897 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.5669 (2.5203) acc1: 48.0000 (48.1455) acc5: 69.6000 (71.2727) time: 2.0346 data: 2.0141 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.5740 (2.6284) acc1: 47.2000 (45.4857) acc5: 69.6000 (69.9429) time: 0.9883 data: 0.9675 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.5641 (2.6008) acc1: 43.2000 (45.5226) acc5: 68.8000 (69.9355) time: 0.9944 data: 0.9739 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5385 (2.6105) acc1: 44.8000 (45.5805) acc5: 68.8000 (69.8732) time: 0.8006 data: 0.7817 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5839 (2.6133) acc1: 47.2000 (45.4240) acc5: 69.6000 (69.9040) time: 0.6633 data: 0.6441 max mem: 2905
Test: Total time: 0:00:52 (1.0472 s / it)
* Acc@1 45.582 Acc@5 70.386 loss 2.583
Accuracy of the model on the 50000 test images: 45.6%
Max accuracy: 50.54%
Epoch: [75] [ 0/625] eta: 3:14:11 lr: 0.003631 min_lr: 0.003631 loss: 3.3014 (3.3014) class_acc: 0.4688 (0.4688) weight_decay: 0.0500 (0.0500) time: 18.6430 data: 17.9810 max mem: 2905
Epoch: [75] [200/625] eta: 0:13:31 lr: 0.003627 min_lr: 0.003627 loss: 3.2851 (3.2233) class_acc: 0.4648 (0.4764) weight_decay: 0.0500 (0.0500) grad_norm: 1.4224 (2.2634) time: 1.9488 data: 1.1112 max mem: 2905
Epoch: [75] [400/625] eta: 0:07:00 lr: 0.003623 min_lr: 0.003623 loss: 3.2408 (3.2366) class_acc: 0.4727 (0.4742) weight_decay: 0.0500 (0.0500) grad_norm: 2.2368 (2.3292) time: 1.8082 data: 0.8952 max mem: 2905
Epoch: [75] [600/625] eta: 0:00:46 lr: 0.003619 min_lr: 0.003619 loss: 3.2120 (3.2422) class_acc: 0.4648 (0.4730) weight_decay: 0.0500 (0.0500) grad_norm: 1.9178 (2.2967) time: 1.7674 data: 1.3425 max mem: 2905
Epoch: [75] [624/625] eta: 0:00:01 lr: 0.003618 min_lr: 0.003618 loss: 3.2393 (3.2418) class_acc: 0.4648 (0.4730) weight_decay: 0.0500 (0.0500) grad_norm: 1.8435 (2.3120) time: 0.6558 data: 0.4252 max mem: 2905
Epoch: [75] Total time: 0:19:18 (1.8528 s / it)
Averaged stats: lr: 0.003618 min_lr: 0.003618 loss: 3.2393 (3.2509) class_acc: 0.4648 (0.4714) weight_decay: 0.0500 (0.0500) grad_norm: 1.8435 (2.3120)
Test: [ 0/50] eta: 0:09:24 loss: 3.1796 (3.1796) acc1: 31.2000 (31.2000) acc5: 55.2000 (55.2000) time: 11.2840 data: 11.2471 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.8774 (2.8887) acc1: 42.4000 (39.7091) acc5: 65.6000 (65.7455) time: 1.8980 data: 1.8789 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.9550 (2.9869) acc1: 36.0000 (37.8286) acc5: 64.0000 (63.7714) time: 1.0258 data: 1.0064 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.9550 (2.9381) acc1: 37.6000 (39.2516) acc5: 64.0000 (64.3355) time: 1.0165 data: 0.9962 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8604 (2.9443) acc1: 40.0000 (39.1024) acc5: 64.0000 (64.5268) time: 0.7972 data: 0.7755 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0163 (2.9519) acc1: 37.6000 (39.4400) acc5: 62.4000 (64.4640) time: 0.6565 data: 0.6338 max mem: 2905
Test: Total time: 0:00:51 (1.0202 s / it)
* Acc@1 39.926 Acc@5 65.324 loss 2.914
Accuracy of the model on the 50000 test images: 39.9%
Max accuracy: 50.54%
Epoch: [76] [ 0/625] eta: 3:19:10 lr: 0.003618 min_lr: 0.003618 loss: 3.2141 (3.2141) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 19.1208 data: 16.4793 max mem: 2905
Epoch: [76] [200/625] eta: 0:13:34 lr: 0.003614 min_lr: 0.003614 loss: 3.2017 (3.2141) class_acc: 0.4805 (0.4795) weight_decay: 0.0500 (0.0500) grad_norm: 1.7372 (2.1841) time: 1.9545 data: 0.0006 max mem: 2905
Epoch: [76] [400/625] eta: 0:07:08 lr: 0.003610 min_lr: 0.003610 loss: 3.2270 (3.2342) class_acc: 0.4727 (0.4757) weight_decay: 0.0500 (0.0500) grad_norm: 2.0285 (2.2051) time: 1.8680 data: 0.0011 max mem: 2905
Epoch: [76] [600/625] eta: 0:00:47 lr: 0.003605 min_lr: 0.003605 loss: 3.2251 (3.2480) class_acc: 0.4805 (0.4729) weight_decay: 0.0500 (0.0500) grad_norm: 1.9673 (2.2847) time: 1.9398 data: 0.0009 max mem: 2905
Epoch: [76] [624/625] eta: 0:00:01 lr: 0.003605 min_lr: 0.003605 loss: 3.1834 (3.2478) class_acc: 0.4648 (0.4727) weight_decay: 0.0500 (0.0500) grad_norm: 1.8908 (2.2698) time: 0.7790 data: 0.0017 max mem: 2905
Epoch: [76] Total time: 0:19:30 (1.8723 s / it)
Averaged stats: lr: 0.003605 min_lr: 0.003605 loss: 3.1834 (3.2505) class_acc: 0.4648 (0.4716) weight_decay: 0.0500 (0.0500) grad_norm: 1.8908 (2.2698)
Test: [ 0/50] eta: 0:09:56 loss: 2.8905 (2.8905) acc1: 40.8000 (40.8000) acc5: 68.0000 (68.0000) time: 11.9219 data: 11.8856 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.5949 (2.4659) acc1: 48.0000 (48.3636) acc5: 71.2000 (73.2364) time: 1.9684 data: 1.9461 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.6713 (2.6873) acc1: 44.8000 (44.1905) acc5: 70.4000 (70.1333) time: 0.9806 data: 0.9604 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.7028 (2.6572) acc1: 42.4000 (44.6194) acc5: 68.0000 (70.1161) time: 0.8688 data: 0.8497 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5708 (2.6823) acc1: 45.6000 (44.3512) acc5: 68.0000 (69.5805) time: 0.6970 data: 0.6789 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8549 (2.7078) acc1: 40.8000 (44.0320) acc5: 67.2000 (69.3440) time: 0.4666 data: 0.4481 max mem: 2905
Test: Total time: 0:00:48 (0.9704 s / it)
* Acc@1 44.416 Acc@5 68.872 loss 2.693
Accuracy of the model on the 50000 test images: 44.4%
Max accuracy: 50.54%
Epoch: [77] [ 0/625] eta: 3:23:02 lr: 0.003605 min_lr: 0.003605 loss: 3.1335 (3.1335) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 19.4922 data: 17.2747 max mem: 2905
Epoch: [77] [200/625] eta: 0:14:45 lr: 0.003601 min_lr: 0.003601 loss: 3.2871 (3.2283) class_acc: 0.4609 (0.4747) weight_decay: 0.0500 (0.0500) grad_norm: 2.2438 (2.3596) time: 2.2775 data: 0.0142 max mem: 2905
Epoch: [77] [400/625] eta: 0:07:39 lr: 0.003596 min_lr: 0.003596 loss: 3.1919 (3.2328) class_acc: 0.4766 (0.4741) weight_decay: 0.0500 (0.0500) grad_norm: 1.8472 (2.4034) time: 2.0814 data: 0.0009 max mem: 2905
Epoch: [77] [600/625] eta: 0:00:51 lr: 0.003592 min_lr: 0.003592 loss: 3.2658 (3.2416) class_acc: 0.4570 (0.4725) weight_decay: 0.0500 (0.0500) grad_norm: 1.4974 (2.3675) time: 2.0703 data: 0.0008 max mem: 2905
Epoch: [77] [624/625] eta: 0:00:01 lr: 0.003591 min_lr: 0.003591 loss: 3.2584 (3.2431) class_acc: 0.4609 (0.4724) weight_decay: 0.0500 (0.0500) grad_norm: 1.9946 (2.3511) time: 0.7879 data: 0.0014 max mem: 2905
Epoch: [77] Total time: 0:20:44 (1.9913 s / it)
Averaged stats: lr: 0.003591 min_lr: 0.003591 loss: 3.2584 (3.2502) class_acc: 0.4609 (0.4714) weight_decay: 0.0500 (0.0500) grad_norm: 1.9946 (2.3511)
Test: [ 0/50] eta: 0:10:26 loss: 2.8636 (2.8636) acc1: 39.2000 (39.2000) acc5: 59.2000 (59.2000) time: 12.5368 data: 12.5089 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.7026 (2.6826) acc1: 45.6000 (44.9455) acc5: 68.0000 (67.4909) time: 1.9725 data: 1.9526 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.9940 (2.9806) acc1: 37.6000 (39.1619) acc5: 63.2000 (63.7714) time: 0.9714 data: 0.9530 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.1087 (2.9214) acc1: 35.2000 (39.7161) acc5: 61.6000 (64.9290) time: 0.9758 data: 0.9571 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0187 (2.9601) acc1: 36.0000 (38.5561) acc5: 64.0000 (64.5854) time: 0.9415 data: 0.9221 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0477 (2.9741) acc1: 35.2000 (38.5760) acc5: 63.2000 (64.5120) time: 0.7431 data: 0.7246 max mem: 2905
Test: Total time: 0:00:55 (1.1072 s / it)
* Acc@1 39.770 Acc@5 65.256 loss 2.925
Accuracy of the model on the 50000 test images: 39.8%
Max accuracy: 50.54%
Epoch: [78] [ 0/625] eta: 3:59:48 lr: 0.003591 min_lr: 0.003591 loss: 3.0610 (3.0610) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 23.0222 data: 18.9652 max mem: 2905
Epoch: [78] [200/625] eta: 0:14:45 lr: 0.003587 min_lr: 0.003587 loss: 3.2425 (3.2196) class_acc: 0.4805 (0.4769) weight_decay: 0.0500 (0.0500) grad_norm: 1.9498 (2.1774) time: 1.8999 data: 0.0293 max mem: 2905
Epoch: [78] [400/625] eta: 0:07:37 lr: 0.003583 min_lr: 0.003583 loss: 3.2560 (3.2415) class_acc: 0.4766 (0.4729) weight_decay: 0.0500 (0.0500) grad_norm: 1.7879 (2.3186) time: 1.9466 data: 0.0009 max mem: 2905
Epoch: [78] [600/625] eta: 0:00:50 lr: 0.003578 min_lr: 0.003578 loss: 3.2755 (3.2462) class_acc: 0.4688 (0.4720) weight_decay: 0.0500 (0.0500) grad_norm: 2.4751 (2.3692) time: 2.1851 data: 0.0070 max mem: 2905
Epoch: [78] [624/625] eta: 0:00:01 lr: 0.003578 min_lr: 0.003578 loss: 3.2972 (3.2478) class_acc: 0.4570 (0.4716) weight_decay: 0.0500 (0.0500) grad_norm: 2.4421 (2.3766) time: 0.6817 data: 0.0013 max mem: 2905
Epoch: [78] Total time: 0:20:27 (1.9640 s / it)
Averaged stats: lr: 0.003578 min_lr: 0.003578 loss: 3.2972 (3.2454) class_acc: 0.4570 (0.4725) weight_decay: 0.0500 (0.0500) grad_norm: 2.4421 (2.3766)
Test: [ 0/50] eta: 0:10:01 loss: 2.6390 (2.6390) acc1: 40.0000 (40.0000) acc5: 71.2000 (71.2000) time: 12.0390 data: 12.0108 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.5639 (2.5345) acc1: 46.4000 (46.7636) acc5: 71.2000 (70.1818) time: 1.8892 data: 1.8698 max mem: 2905
Test: [20/50] eta: 0:00:40 loss: 2.7075 (2.6961) acc1: 41.6000 (42.9714) acc5: 68.8000 (69.2190) time: 0.8176 data: 0.7971 max mem: 2905
Test: [30/50] eta: 0:00:22 loss: 2.7879 (2.7247) acc1: 40.0000 (42.8129) acc5: 68.8000 (69.0065) time: 0.6937 data: 0.6735 max mem: 2905
Test: [40/50] eta: 0:00:09 loss: 2.6464 (2.7150) acc1: 41.6000 (43.1805) acc5: 69.6000 (69.0732) time: 0.6179 data: 0.5991 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8301 (2.7481) acc1: 41.6000 (43.0560) acc5: 66.4000 (68.5920) time: 0.5397 data: 0.5214 max mem: 2905
Test: Total time: 0:00:44 (0.8888 s / it)
* Acc@1 43.550 Acc@5 68.968 loss 2.711
Accuracy of the model on the 50000 test images: 43.6%
Max accuracy: 50.54%
Epoch: [79] [ 0/625] eta: 3:36:25 lr: 0.003578 min_lr: 0.003578 loss: 3.2456 (3.2456) class_acc: 0.4648 (0.4648) weight_decay: 0.0500 (0.0500) time: 20.7761 data: 17.3916 max mem: 2905
Epoch: [79] [200/625] eta: 0:13:49 lr: 0.003573 min_lr: 0.003573 loss: 3.2224 (3.2172) class_acc: 0.4648 (0.4782) weight_decay: 0.0500 (0.0500) grad_norm: 2.2990 (2.4234) time: 1.8416 data: 0.4426 max mem: 2905
Epoch: [79] [400/625] eta: 0:07:13 lr: 0.003569 min_lr: 0.003569 loss: 3.2415 (3.2276) class_acc: 0.4727 (0.4769) weight_decay: 0.0500 (0.0500) grad_norm: 1.7812 (2.2080) time: 1.9107 data: 0.6879 max mem: 2905
Epoch: [79] [600/625] eta: 0:00:47 lr: 0.003564 min_lr: 0.003564 loss: 3.2019 (3.2301) class_acc: 0.4805 (0.4759) weight_decay: 0.0500 (0.0500) grad_norm: 1.7941 (2.1904) time: 1.9262 data: 0.0038 max mem: 2905
Epoch: [79] [624/625] eta: 0:00:01 lr: 0.003564 min_lr: 0.003564 loss: 3.2196 (3.2311) class_acc: 0.4727 (0.4756) weight_decay: 0.0500 (0.0500) grad_norm: 2.3000 (2.2107) time: 0.6290 data: 0.0178 max mem: 2905
Epoch: [79] Total time: 0:19:36 (1.8820 s / it)
Averaged stats: lr: 0.003564 min_lr: 0.003564 loss: 3.2196 (3.2409) class_acc: 0.4727 (0.4737) weight_decay: 0.0500 (0.0500) grad_norm: 2.3000 (2.2107)
Test: [ 0/50] eta: 0:11:04 loss: 2.1923 (2.1923) acc1: 48.0000 (48.0000) acc5: 80.8000 (80.8000) time: 13.2909 data: 13.2682 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.5050 (2.4486) acc1: 49.6000 (48.0727) acc5: 74.4000 (72.1455) time: 2.1504 data: 2.1306 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.6015 (2.6043) acc1: 46.4000 (45.2952) acc5: 68.0000 (69.6381) time: 0.9575 data: 0.9380 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.5778 (2.5703) acc1: 44.8000 (45.7032) acc5: 68.8000 (70.4258) time: 0.7868 data: 0.7667 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5778 (2.5978) acc1: 44.8000 (45.4049) acc5: 71.2000 (70.0293) time: 0.6966 data: 0.6760 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6982 (2.6245) acc1: 43.2000 (44.8800) acc5: 68.8000 (69.8880) time: 0.5687 data: 0.5493 max mem: 2905
Test: Total time: 0:00:49 (0.9884 s / it)
* Acc@1 45.212 Acc@5 70.668 loss 2.593
Accuracy of the model on the 50000 test images: 45.2%
Max accuracy: 50.54%
Epoch: [80] [ 0/625] eta: 3:30:04 lr: 0.003564 min_lr: 0.003564 loss: 3.2491 (3.2491) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 20.1677 data: 17.6115 max mem: 2905
Epoch: [80] [200/625] eta: 0:13:41 lr: 0.003559 min_lr: 0.003559 loss: 3.2224 (3.2172) class_acc: 0.4844 (0.4787) weight_decay: 0.0500 (0.0500) grad_norm: 2.1317 (2.3878) time: 1.7900 data: 0.2065 max mem: 2905
Epoch: [80] [400/625] eta: 0:07:17 lr: 0.003555 min_lr: 0.003555 loss: 3.2049 (3.2250) class_acc: 0.4688 (0.4770) weight_decay: 0.0500 (0.0500) grad_norm: 1.9495 (2.4040) time: 2.0370 data: 0.0009 max mem: 2905
Epoch: [80] [600/625] eta: 0:00:48 lr: 0.003550 min_lr: 0.003550 loss: 3.2427 (3.2333) class_acc: 0.4805 (0.4759) weight_decay: 0.0500 (0.0500) grad_norm: 2.1451 (inf) time: 1.7995 data: 0.0720 max mem: 2905
Epoch: [80] [624/625] eta: 0:00:01 lr: 0.003550 min_lr: 0.003550 loss: 3.1767 (3.2322) class_acc: 0.4688 (0.4761) weight_decay: 0.0500 (0.0500) grad_norm: 1.7529 (inf) time: 0.4759 data: 0.0063 max mem: 2905
Epoch: [80] Total time: 0:19:53 (1.9102 s / it)
Averaged stats: lr: 0.003550 min_lr: 0.003550 loss: 3.1767 (3.2392) class_acc: 0.4688 (0.4742) weight_decay: 0.0500 (0.0500) grad_norm: 1.7529 (inf)
Test: [ 0/50] eta: 0:11:01 loss: 3.0755 (3.0755) acc1: 34.4000 (34.4000) acc5: 60.8000 (60.8000) time: 13.2301 data: 13.1962 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.9662 (3.0095) acc1: 40.0000 (39.4909) acc5: 63.2000 (62.9818) time: 2.0406 data: 2.0188 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.1298 (3.2012) acc1: 35.2000 (36.2286) acc5: 61.6000 (60.1905) time: 0.9685 data: 0.9486 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.3673 (3.2551) acc1: 31.2000 (35.0194) acc5: 56.0000 (59.2774) time: 0.9817 data: 0.9614 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.5322 (3.3293) acc1: 30.4000 (34.4000) acc5: 53.6000 (58.2049) time: 0.7766 data: 0.7562 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2877 (3.3151) acc1: 32.8000 (34.6560) acc5: 58.4000 (58.5920) time: 0.7002 data: 0.6775 max mem: 2905
Test: Total time: 0:00:50 (1.0199 s / it)
* Acc@1 34.898 Acc@5 59.410 loss 3.291
Accuracy of the model on the 50000 test images: 34.9%
Max accuracy: 50.54%
Epoch: [81] [ 0/625] eta: 3:42:44 lr: 0.003550 min_lr: 0.003550 loss: 3.3862 (3.3862) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 21.3832 data: 18.7355 max mem: 2905
Epoch: [81] [200/625] eta: 0:14:18 lr: 0.003545 min_lr: 0.003545 loss: 3.1863 (3.2257) class_acc: 0.4844 (0.4772) weight_decay: 0.0500 (0.0500) grad_norm: 2.1731 (2.5033) time: 2.0882 data: 0.0008 max mem: 2905
Epoch: [81] [400/625] eta: 0:07:24 lr: 0.003541 min_lr: 0.003541 loss: 3.2344 (3.2386) class_acc: 0.4609 (0.4743) weight_decay: 0.0500 (0.0500) grad_norm: 2.4606 (2.4067) time: 1.9807 data: 0.0108 max mem: 2905
Epoch: [81] [600/625] eta: 0:00:49 lr: 0.003536 min_lr: 0.003536 loss: 3.2765 (3.2428) class_acc: 0.4688 (0.4739) weight_decay: 0.0500 (0.0500) grad_norm: 2.3574 (2.4149) time: 2.0867 data: 0.0097 max mem: 2905
Epoch: [81] [624/625] eta: 0:00:01 lr: 0.003535 min_lr: 0.003535 loss: 3.2719 (3.2434) class_acc: 0.4570 (0.4738) weight_decay: 0.0500 (0.0500) grad_norm: 2.0595 (2.3979) time: 0.7564 data: 0.0017 max mem: 2905
Epoch: [81] Total time: 0:20:11 (1.9391 s / it)
Averaged stats: lr: 0.003535 min_lr: 0.003535 loss: 3.2719 (3.2360) class_acc: 0.4570 (0.4745) weight_decay: 0.0500 (0.0500) grad_norm: 2.0595 (2.3979)
Test: [ 0/50] eta: 0:10:16 loss: 2.5239 (2.5239) acc1: 42.4000 (42.4000) acc5: 71.2000 (71.2000) time: 12.3235 data: 12.2984 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.4535 (2.4161) acc1: 45.6000 (46.6909) acc5: 71.2000 (72.7273) time: 2.2552 data: 2.2328 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.5067 (2.5621) acc1: 45.6000 (44.9905) acc5: 71.2000 (71.0476) time: 1.2678 data: 1.2473 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.6581 (2.5426) acc1: 43.2000 (45.3161) acc5: 70.4000 (71.0452) time: 1.1491 data: 1.1294 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5609 (2.5721) acc1: 44.0000 (45.0927) acc5: 68.0000 (70.3610) time: 0.7466 data: 0.7267 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5889 (2.5729) acc1: 48.0000 (45.2000) acc5: 67.2000 (70.2080) time: 0.6185 data: 0.5982 max mem: 2905
Test: Total time: 0:00:53 (1.0656 s / it)
* Acc@1 45.720 Acc@5 70.684 loss 2.557
Accuracy of the model on the 50000 test images: 45.7%
Max accuracy: 50.54%
Epoch: [82] [ 0/625] eta: 3:50:20 lr: 0.003535 min_lr: 0.003535 loss: 3.2080 (3.2080) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 22.1127 data: 18.8227 max mem: 2905
Epoch: [82] [200/625] eta: 0:14:37 lr: 0.003531 min_lr: 0.003531 loss: 3.2304 (3.2092) class_acc: 0.4805 (0.4800) weight_decay: 0.0500 (0.0500) grad_norm: 2.3143 (2.3862) time: 2.0445 data: 0.1168 max mem: 2905
Epoch: [82] [400/625] eta: 0:07:32 lr: 0.003526 min_lr: 0.003526 loss: 3.1940 (3.2177) class_acc: 0.4844 (0.4777) weight_decay: 0.0500 (0.0500) grad_norm: 1.6919 (2.2969) time: 1.9838 data: 0.0009 max mem: 2905
Epoch: [82] [600/625] eta: 0:00:50 lr: 0.003521 min_lr: 0.003521 loss: 3.2946 (3.2266) class_acc: 0.4609 (0.4757) weight_decay: 0.0500 (0.0500) grad_norm: 2.1730 (2.2894) time: 2.0316 data: 0.0008 max mem: 2905
Epoch: [82] [624/625] eta: 0:00:01 lr: 0.003521 min_lr: 0.003521 loss: 3.2061 (3.2260) class_acc: 0.4844 (0.4760) weight_decay: 0.0500 (0.0500) grad_norm: 2.7533 (2.3053) time: 0.7690 data: 0.0012 max mem: 2905
Epoch: [82] Total time: 0:20:21 (1.9552 s / it)
Averaged stats: lr: 0.003521 min_lr: 0.003521 loss: 3.2061 (3.2329) class_acc: 0.4844 (0.4750) weight_decay: 0.0500 (0.0500) grad_norm: 2.7533 (2.3053)
Test: [ 0/50] eta: 0:09:23 loss: 3.6125 (3.6125) acc1: 28.0000 (28.0000) acc5: 55.2000 (55.2000) time: 11.2742 data: 11.2477 max mem: 2905
Test: [10/50] eta: 0:01:10 loss: 2.8790 (2.8529) acc1: 41.6000 (41.3818) acc5: 66.4000 (65.8182) time: 1.7549 data: 1.7351 max mem: 2905
Test: [20/50] eta: 0:00:39 loss: 2.8978 (2.9828) acc1: 38.4000 (38.8190) acc5: 65.6000 (64.8381) time: 0.8098 data: 0.7904 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 3.1137 (3.0085) acc1: 36.8000 (38.6323) acc5: 64.0000 (64.3097) time: 0.9049 data: 0.8851 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.1587 (3.1080) acc1: 35.2000 (37.4439) acc5: 61.6000 (62.7902) time: 0.7714 data: 0.7525 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1427 (3.0892) acc1: 35.2000 (37.3440) acc5: 61.6000 (63.0720) time: 0.4237 data: 0.4056 max mem: 2905
Test: Total time: 0:00:46 (0.9204 s / it)
* Acc@1 38.268 Acc@5 63.232 loss 3.059
Accuracy of the model on the 50000 test images: 38.3%
Max accuracy: 50.54%
Epoch: [83] [ 0/625] eta: 4:56:25 lr: 0.003521 min_lr: 0.003521 loss: 3.0757 (3.0757) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 28.4562 data: 16.8971 max mem: 2905
Epoch: [83] [200/625] eta: 0:14:12 lr: 0.003516 min_lr: 0.003516 loss: 3.1774 (3.2202) class_acc: 0.4766 (0.4809) weight_decay: 0.0500 (0.0500) grad_norm: 2.1348 (2.3409) time: 1.9362 data: 0.0006 max mem: 2905
Epoch: [83] [400/625] eta: 0:07:24 lr: 0.003512 min_lr: 0.003512 loss: 3.2603 (3.2223) class_acc: 0.4688 (0.4788) weight_decay: 0.0500 (0.0500) grad_norm: 2.2746 (2.3557) time: 1.9203 data: 0.0005 max mem: 2905
Epoch: [83] [600/625] eta: 0:00:49 lr: 0.003507 min_lr: 0.003507 loss: 3.2580 (3.2315) class_acc: 0.4727 (0.4765) weight_decay: 0.0500 (0.0500) grad_norm: 2.0450 (2.4248) time: 2.0852 data: 0.0006 max mem: 2905
Epoch: [83] [624/625] eta: 0:00:01 lr: 0.003506 min_lr: 0.003506 loss: 3.2510 (3.2318) class_acc: 0.4609 (0.4762) weight_decay: 0.0500 (0.0500) grad_norm: 2.0070 (2.4234) time: 1.0125 data: 0.0015 max mem: 2905
Epoch: [83] Total time: 0:20:08 (1.9342 s / it)
Averaged stats: lr: 0.003506 min_lr: 0.003506 loss: 3.2510 (3.2308) class_acc: 0.4609 (0.4760) weight_decay: 0.0500 (0.0500) grad_norm: 2.0070 (2.4234)
Test: [ 0/50] eta: 0:09:37 loss: 3.5897 (3.5897) acc1: 25.6000 (25.6000) acc5: 56.8000 (56.8000) time: 11.5551 data: 11.5257 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.8603 (2.8662) acc1: 44.8000 (42.2545) acc5: 67.2000 (66.5455) time: 2.1307 data: 2.1110 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.9121 (2.9307) acc1: 41.6000 (40.1905) acc5: 67.2000 (65.7524) time: 1.2357 data: 1.2163 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.9853 (2.9078) acc1: 39.2000 (40.7484) acc5: 64.8000 (65.8839) time: 1.2107 data: 1.1904 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9584 (2.9032) acc1: 41.6000 (40.8000) acc5: 64.8000 (66.1659) time: 0.7657 data: 0.7460 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9494 (2.9020) acc1: 43.2000 (41.2960) acc5: 68.0000 (66.2400) time: 0.6063 data: 0.5869 max mem: 2905
Test: Total time: 0:00:52 (1.0557 s / it)
* Acc@1 41.544 Acc@5 66.228 loss 2.859
Accuracy of the model on the 50000 test images: 41.5%
Max accuracy: 50.54%
Epoch: [84] [ 0/625] eta: 3:25:58 lr: 0.003506 min_lr: 0.003506 loss: 3.3097 (3.3097) class_acc: 0.4180 (0.4180) weight_decay: 0.0500 (0.0500) time: 19.7730 data: 18.7967 max mem: 2905
Epoch: [84] [200/625] eta: 0:13:43 lr: 0.003502 min_lr: 0.003502 loss: 3.2494 (3.2083) class_acc: 0.4844 (0.4841) weight_decay: 0.0500 (0.0500) grad_norm: 1.9331 (2.2869) time: 1.8785 data: 0.4316 max mem: 2905
Epoch: [84] [400/625] eta: 0:07:17 lr: 0.003497 min_lr: 0.003497 loss: 3.2281 (3.2140) class_acc: 0.4727 (0.4815) weight_decay: 0.0500 (0.0500) grad_norm: 2.0059 (2.2879) time: 2.2076 data: 0.0110 max mem: 2905
Epoch: [84] [600/625] eta: 0:00:49 lr: 0.003492 min_lr: 0.003492 loss: 3.2498 (3.2230) class_acc: 0.4688 (0.4793) weight_decay: 0.0500 (0.0500) grad_norm: 2.1001 (2.3016) time: 1.9811 data: 0.0083 max mem: 2905
Epoch: [84] [624/625] eta: 0:00:01 lr: 0.003491 min_lr: 0.003491 loss: 3.2252 (3.2234) class_acc: 0.4570 (0.4787) weight_decay: 0.0500 (0.0500) grad_norm: 2.2111 (2.3185) time: 0.6303 data: 0.0013 max mem: 2905
Epoch: [84] Total time: 0:20:20 (1.9534 s / it)
Averaged stats: lr: 0.003491 min_lr: 0.003491 loss: 3.2252 (3.2286) class_acc: 0.4570 (0.4766) weight_decay: 0.0500 (0.0500) grad_norm: 2.2111 (2.3185)
Test: [ 0/50] eta: 0:10:26 loss: 2.7145 (2.7145) acc1: 40.0000 (40.0000) acc5: 72.0000 (72.0000) time: 12.5353 data: 12.4993 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.7145 (2.6205) acc1: 44.0000 (45.0182) acc5: 68.8000 (70.8364) time: 2.0960 data: 2.0761 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.7654 (2.8310) acc1: 42.4000 (41.6000) acc5: 68.8000 (67.7714) time: 1.0872 data: 1.0679 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.8514 (2.8097) acc1: 39.2000 (42.2710) acc5: 64.8000 (67.5355) time: 1.0979 data: 1.0783 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7876 (2.8573) acc1: 41.6000 (41.6781) acc5: 63.2000 (66.7902) time: 0.9016 data: 0.8827 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7876 (2.8668) acc1: 41.6000 (41.3600) acc5: 66.4000 (66.8000) time: 0.8606 data: 0.8412 max mem: 2905
Test: Total time: 0:00:55 (1.1005 s / it)
* Acc@1 42.174 Acc@5 67.112 loss 2.809
Accuracy of the model on the 50000 test images: 42.2%
Max accuracy: 50.54%
Epoch: [85] [ 0/625] eta: 3:52:01 lr: 0.003491 min_lr: 0.003491 loss: 3.1183 (3.1183) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 22.2746 data: 15.9025 max mem: 2905
Epoch: [85] [200/625] eta: 0:14:20 lr: 0.003487 min_lr: 0.003487 loss: 3.1167 (3.1996) class_acc: 0.4961 (0.4793) weight_decay: 0.0500 (0.0500) grad_norm: 2.3758 (2.5677) time: 2.0041 data: 0.0007 max mem: 2905
Epoch: [85] [400/625] eta: 0:07:27 lr: 0.003482 min_lr: 0.003482 loss: 3.2080 (3.2102) class_acc: 0.4688 (0.4795) weight_decay: 0.0500 (0.0500) grad_norm: 2.5124 (2.3747) time: 2.0525 data: 0.0007 max mem: 2905
Epoch: [85] [600/625] eta: 0:00:49 lr: 0.003477 min_lr: 0.003477 loss: 3.1954 (3.2200) class_acc: 0.4727 (0.4782) weight_decay: 0.0500 (0.0500) grad_norm: 2.2954 (2.4543) time: 2.0844 data: 0.0007 max mem: 2905
Epoch: [85] [624/625] eta: 0:00:01 lr: 0.003476 min_lr: 0.003476 loss: 3.2326 (3.2217) class_acc: 0.4688 (0.4780) weight_decay: 0.0500 (0.0500) grad_norm: 1.9679 (2.4388) time: 0.7124 data: 0.0069 max mem: 2905
Epoch: [85] Total time: 0:20:33 (1.9735 s / it)
Averaged stats: lr: 0.003476 min_lr: 0.003476 loss: 3.2326 (3.2250) class_acc: 0.4688 (0.4770) weight_decay: 0.0500 (0.0500) grad_norm: 1.9679 (2.4388)
Test: [ 0/50] eta: 0:11:07 loss: 2.4857 (2.4857) acc1: 43.2000 (43.2000) acc5: 72.0000 (72.0000) time: 13.3551 data: 13.3195 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.6886 (2.6757) acc1: 43.2000 (44.5091) acc5: 68.8000 (69.3818) time: 2.2306 data: 2.2072 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.8033 (2.8298) acc1: 37.6000 (39.9238) acc5: 67.2000 (67.3524) time: 1.1288 data: 1.1077 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.9182 (2.8089) acc1: 36.8000 (40.6452) acc5: 65.6000 (66.9936) time: 0.9553 data: 0.9353 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9188 (2.8687) acc1: 36.8000 (39.7463) acc5: 64.0000 (66.4000) time: 0.5317 data: 0.5123 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8870 (2.8784) acc1: 39.2000 (40.1920) acc5: 64.8000 (66.0640) time: 0.4663 data: 0.4472 max mem: 2905
Test: Total time: 0:00:47 (0.9431 s / it)
* Acc@1 41.010 Acc@5 66.718 loss 2.847
Accuracy of the model on the 50000 test images: 41.0%
Max accuracy: 50.54%
Epoch: [86] [ 0/625] eta: 3:39:37 lr: 0.003476 min_lr: 0.003476 loss: 3.1238 (3.1238) class_acc: 0.5234 (0.5234) weight_decay: 0.0500 (0.0500) time: 21.0835 data: 16.5965 max mem: 2905
Epoch: [86] [200/625] eta: 0:14:14 lr: 0.003472 min_lr: 0.003472 loss: 3.2137 (3.2178) class_acc: 0.4727 (0.4777) weight_decay: 0.0500 (0.0500) grad_norm: 1.7342 (2.3453) time: 1.8250 data: 0.0258 max mem: 2905
Epoch: [86] [400/625] eta: 0:07:19 lr: 0.003467 min_lr: 0.003467 loss: 3.2722 (3.2242) class_acc: 0.4609 (0.4762) weight_decay: 0.0500 (0.0500) grad_norm: 2.1211 (2.3977) time: 1.9427 data: 0.0073 max mem: 2905
Epoch: [86] [600/625] eta: 0:00:48 lr: 0.003462 min_lr: 0.003462 loss: 3.2211 (3.2284) class_acc: 0.4844 (0.4764) weight_decay: 0.0500 (0.0500) grad_norm: 2.1045 (2.4542) time: 1.9558 data: 0.0008 max mem: 2905
Epoch: [86] [624/625] eta: 0:00:01 lr: 0.003461 min_lr: 0.003461 loss: 3.2572 (3.2290) class_acc: 0.4648 (0.4761) weight_decay: 0.0500 (0.0500) grad_norm: 2.0516 (2.4424) time: 0.7033 data: 0.0027 max mem: 2905
Epoch: [86] Total time: 0:19:43 (1.8941 s / it)
Averaged stats: lr: 0.003461 min_lr: 0.003461 loss: 3.2572 (3.2208) class_acc: 0.4648 (0.4785) weight_decay: 0.0500 (0.0500) grad_norm: 2.0516 (2.4424)
Test: [ 0/50] eta: 0:10:38 loss: 2.6977 (2.6977) acc1: 35.2000 (35.2000) acc5: 70.4000 (70.4000) time: 12.7678 data: 12.7387 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.6977 (2.6980) acc1: 44.0000 (43.5636) acc5: 68.8000 (69.1636) time: 2.2671 data: 2.2468 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.8232 (2.8710) acc1: 41.6000 (41.3714) acc5: 65.6000 (66.1714) time: 1.2520 data: 1.2306 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.9138 (2.8949) acc1: 39.2000 (41.1355) acc5: 64.0000 (65.3936) time: 1.1397 data: 1.1182 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9052 (2.9008) acc1: 40.0000 (41.3854) acc5: 64.8000 (65.3268) time: 0.6785 data: 0.6589 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8334 (2.8894) acc1: 41.6000 (41.5840) acc5: 66.4000 (65.7120) time: 0.6200 data: 0.6009 max mem: 2905
Test: Total time: 0:00:52 (1.0405 s / it)
* Acc@1 41.826 Acc@5 66.430 loss 2.853
Accuracy of the model on the 50000 test images: 41.8%
Max accuracy: 50.54%
Epoch: [87] [ 0/625] eta: 3:36:40 lr: 0.003461 min_lr: 0.003461 loss: 3.0572 (3.0572) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 20.8011 data: 20.5531 max mem: 2905
Epoch: [87] [200/625] eta: 0:13:55 lr: 0.003456 min_lr: 0.003456 loss: 3.1673 (3.1950) class_acc: 0.4883 (0.4839) weight_decay: 0.0500 (0.0500) grad_norm: 1.8527 (inf) time: 1.8162 data: 0.5992 max mem: 2905
Epoch: [87] [400/625] eta: 0:07:17 lr: 0.003451 min_lr: 0.003451 loss: 3.2221 (3.2059) class_acc: 0.4727 (0.4820) weight_decay: 0.0500 (0.0500) grad_norm: 2.4221 (inf) time: 1.8759 data: 0.0911 max mem: 2905
Epoch: [87] [600/625] eta: 0:00:49 lr: 0.003446 min_lr: 0.003446 loss: 3.2460 (3.2084) class_acc: 0.4766 (0.4806) weight_decay: 0.0500 (0.0500) grad_norm: 1.8130 (inf) time: 1.9575 data: 0.0009 max mem: 2905
Epoch: [87] [624/625] eta: 0:00:01 lr: 0.003446 min_lr: 0.003446 loss: 3.2013 (3.2090) class_acc: 0.4727 (0.4803) weight_decay: 0.0500 (0.0500) grad_norm: 1.5561 (inf) time: 0.7937 data: 0.0018 max mem: 2905
Epoch: [87] Total time: 0:20:05 (1.9285 s / it)
Averaged stats: lr: 0.003446 min_lr: 0.003446 loss: 3.2013 (3.2166) class_acc: 0.4727 (0.4791) weight_decay: 0.0500 (0.0500) grad_norm: 1.5561 (inf)
Test: [ 0/50] eta: 0:09:48 loss: 2.4731 (2.4731) acc1: 45.6000 (45.6000) acc5: 71.2000 (71.2000) time: 11.7673 data: 11.7303 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.5290 (2.6485) acc1: 45.6000 (45.3091) acc5: 69.6000 (69.6000) time: 2.1425 data: 2.1228 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.8579 (2.7854) acc1: 40.0000 (42.4000) acc5: 66.4000 (67.6952) time: 1.2531 data: 1.2341 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.9216 (2.8183) acc1: 39.2000 (41.9355) acc5: 65.6000 (66.8387) time: 1.2380 data: 1.2187 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9664 (2.8576) acc1: 39.2000 (41.2683) acc5: 64.8000 (66.1073) time: 0.7718 data: 0.7529 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8881 (2.8694) acc1: 39.2000 (41.2800) acc5: 65.6000 (66.2560) time: 0.6690 data: 0.6495 max mem: 2905
Test: Total time: 0:00:53 (1.0618 s / it)
* Acc@1 41.934 Acc@5 66.992 loss 2.814
Accuracy of the model on the 50000 test images: 41.9%
Max accuracy: 50.54%
Epoch: [88] [ 0/625] eta: 3:43:40 lr: 0.003446 min_lr: 0.003446 loss: 3.3014 (3.3014) class_acc: 0.4727 (0.4727) weight_decay: 0.0500 (0.0500) time: 21.4722 data: 21.3492 max mem: 2905
Epoch: [88] [200/625] eta: 0:14:12 lr: 0.003441 min_lr: 0.003441 loss: 3.2173 (3.2094) class_acc: 0.4609 (0.4784) weight_decay: 0.0500 (0.0500) grad_norm: 2.0129 (2.0891) time: 1.9693 data: 0.0255 max mem: 2905
Epoch: [88] [400/625] eta: 0:07:26 lr: 0.003436 min_lr: 0.003436 loss: 3.1962 (3.2177) class_acc: 0.4727 (0.4780) weight_decay: 0.0500 (0.0500) grad_norm: 2.4191 (2.2923) time: 1.8834 data: 0.0013 max mem: 2905
Epoch: [88] [600/625] eta: 0:00:49 lr: 0.003431 min_lr: 0.003431 loss: 3.2292 (3.2176) class_acc: 0.4648 (0.4785) weight_decay: 0.0500 (0.0500) grad_norm: 2.2967 (2.3386) time: 1.9979 data: 0.0007 max mem: 2905
Epoch: [88] [624/625] eta: 0:00:01 lr: 0.003430 min_lr: 0.003430 loss: 3.1960 (3.2165) class_acc: 0.4688 (0.4788) weight_decay: 0.0500 (0.0500) grad_norm: 1.6856 (2.3120) time: 0.7321 data: 0.0117 max mem: 2905
Epoch: [88] Total time: 0:20:32 (1.9727 s / it)
Averaged stats: lr: 0.003430 min_lr: 0.003430 loss: 3.1960 (3.2131) class_acc: 0.4688 (0.4796) weight_decay: 0.0500 (0.0500) grad_norm: 1.6856 (2.3120)
Test: [ 0/50] eta: 0:10:33 loss: 2.0753 (2.0753) acc1: 56.0000 (56.0000) acc5: 82.4000 (82.4000) time: 12.6689 data: 12.6069 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.2820 (2.3151) acc1: 54.4000 (51.2727) acc5: 74.4000 (74.2545) time: 2.0962 data: 2.0732 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.3739 (2.4700) acc1: 48.0000 (47.5429) acc5: 72.8000 (72.8762) time: 1.0980 data: 1.0784 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.5386 (2.4788) acc1: 44.0000 (47.4581) acc5: 71.2000 (72.3613) time: 1.1526 data: 1.1328 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.5386 (2.4966) acc1: 44.0000 (47.2000) acc5: 70.4000 (72.0781) time: 0.9380 data: 0.9169 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5597 (2.5066) acc1: 45.6000 (47.2160) acc5: 68.8000 (71.7920) time: 0.8518 data: 0.8287 max mem: 2905
Test: Total time: 0:00:54 (1.0840 s / it)
* Acc@1 47.640 Acc@5 72.136 loss 2.453
Accuracy of the model on the 50000 test images: 47.6%
Max accuracy: 50.54%
Epoch: [89] [ 0/625] eta: 3:27:03 lr: 0.003430 min_lr: 0.003430 loss: 3.2908 (3.2908) class_acc: 0.4609 (0.4609) weight_decay: 0.0500 (0.0500) time: 19.8774 data: 19.7514 max mem: 2905
Epoch: [89] [200/625] eta: 0:14:34 lr: 0.003425 min_lr: 0.003425 loss: 3.2745 (3.2132) class_acc: 0.4648 (0.4771) weight_decay: 0.0500 (0.0500) grad_norm: 2.5580 (2.5102) time: 2.0330 data: 0.0057 max mem: 2905
Epoch: [89] [400/625] eta: 0:07:23 lr: 0.003420 min_lr: 0.003420 loss: 3.1864 (3.2044) class_acc: 0.4766 (0.4797) weight_decay: 0.0500 (0.0500) grad_norm: 1.6811 (2.4566) time: 1.8411 data: 0.0225 max mem: 2905
Epoch: [89] [600/625] eta: 0:00:48 lr: 0.003415 min_lr: 0.003415 loss: 3.2201 (3.2147) class_acc: 0.4844 (0.4782) weight_decay: 0.0500 (0.0500) grad_norm: 1.9170 (2.4484) time: 1.9983 data: 0.0006 max mem: 2905
Epoch: [89] [624/625] eta: 0:00:01 lr: 0.003414 min_lr: 0.003414 loss: 3.2429 (3.2162) class_acc: 0.4648 (0.4780) weight_decay: 0.0500 (0.0500) grad_norm: 2.4656 (2.4715) time: 0.7636 data: 0.0013 max mem: 2905
Epoch: [89] Total time: 0:19:54 (1.9114 s / it)
Averaged stats: lr: 0.003414 min_lr: 0.003414 loss: 3.2429 (3.2108) class_acc: 0.4648 (0.4797) weight_decay: 0.0500 (0.0500) grad_norm: 2.4656 (2.4715)
Test: [ 0/50] eta: 0:10:53 loss: 2.9856 (2.9856) acc1: 36.0000 (36.0000) acc5: 58.4000 (58.4000) time: 13.0795 data: 13.0515 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.6802 (2.7531) acc1: 41.6000 (42.2545) acc5: 69.6000 (67.0545) time: 2.1886 data: 2.1679 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.7576 (2.8603) acc1: 40.0000 (41.1429) acc5: 66.4000 (66.4762) time: 1.1448 data: 1.1254 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.8800 (2.8196) acc1: 40.0000 (41.7548) acc5: 66.4000 (66.9161) time: 0.8852 data: 0.8659 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.7411 (2.8420) acc1: 39.2000 (41.1317) acc5: 66.4000 (66.5561) time: 0.4482 data: 0.4290 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8411 (2.8381) acc1: 39.2000 (41.1040) acc5: 67.2000 (66.6400) time: 0.4669 data: 0.4474 max mem: 2905
Test: Total time: 0:00:46 (0.9233 s / it)
* Acc@1 42.196 Acc@5 67.034 loss 2.805
Accuracy of the model on the 50000 test images: 42.2%
Max accuracy: 50.54%
Epoch: [90] [ 0/625] eta: 3:24:46 lr: 0.003414 min_lr: 0.003414 loss: 3.3031 (3.3031) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 19.6577 data: 18.9060 max mem: 2905
Epoch: [90] [200/625] eta: 0:14:05 lr: 0.003409 min_lr: 0.003409 loss: 3.1990 (3.1914) class_acc: 0.4727 (0.4836) weight_decay: 0.0500 (0.0500) grad_norm: 2.1038 (2.3971) time: 1.8025 data: 0.0006 max mem: 2905
Epoch: [90] [400/625] eta: 0:07:25 lr: 0.003404 min_lr: 0.003404 loss: 3.1987 (3.1960) class_acc: 0.4883 (0.4836) weight_decay: 0.0500 (0.0500) grad_norm: 2.0117 (2.3809) time: 2.0952 data: 0.0016 max mem: 2905
Epoch: [90] [600/625] eta: 0:00:49 lr: 0.003399 min_lr: 0.003399 loss: 3.2620 (3.2044) class_acc: 0.4609 (0.4809) weight_decay: 0.0500 (0.0500) grad_norm: 2.0815 (2.4661) time: 2.0855 data: 0.0242 max mem: 2905
Epoch: [90] [624/625] eta: 0:00:01 lr: 0.003398 min_lr: 0.003398 loss: 3.2434 (3.2058) class_acc: 0.4688 (0.4806) weight_decay: 0.0500 (0.0500) grad_norm: 2.5561 (2.4694) time: 0.9347 data: 0.0013 max mem: 2905
Epoch: [90] Total time: 0:20:16 (1.9462 s / it)
Averaged stats: lr: 0.003398 min_lr: 0.003398 loss: 3.2434 (3.2069) class_acc: 0.4688 (0.4805) weight_decay: 0.0500 (0.0500) grad_norm: 2.5561 (2.4694)
Test: [ 0/50] eta: 0:10:37 loss: 2.5887 (2.5887) acc1: 43.2000 (43.2000) acc5: 74.4000 (74.4000) time: 12.7457 data: 12.7187 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.5916 (2.7249) acc1: 44.0000 (44.7273) acc5: 70.4000 (68.5091) time: 1.9900 data: 1.9712 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.9536 (2.8927) acc1: 38.4000 (40.6476) acc5: 65.6000 (66.2857) time: 0.9816 data: 0.9635 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.9536 (2.8755) acc1: 37.6000 (40.1290) acc5: 64.8000 (66.0129) time: 1.0233 data: 1.0045 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8274 (2.8910) acc1: 38.4000 (40.0585) acc5: 64.8000 (65.6781) time: 0.9459 data: 0.9260 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8944 (2.8773) acc1: 39.2000 (40.0160) acc5: 66.4000 (65.8240) time: 0.9833 data: 0.9642 max mem: 2905
Test: Total time: 0:00:57 (1.1524 s / it)
* Acc@1 40.932 Acc@5 66.728 loss 2.844
Accuracy of the model on the 50000 test images: 40.9%
Max accuracy: 50.54%
Epoch: [91] [ 0/625] eta: 3:44:45 lr: 0.003398 min_lr: 0.003398 loss: 3.2703 (3.2703) class_acc: 0.4453 (0.4453) weight_decay: 0.0500 (0.0500) time: 21.5764 data: 19.1172 max mem: 2905
Epoch: [91] [200/625] eta: 0:15:07 lr: 0.003393 min_lr: 0.003393 loss: 3.1338 (3.1884) class_acc: 0.4805 (0.4838) weight_decay: 0.0500 (0.0500) grad_norm: 1.8080 (2.3157) time: 2.0237 data: 0.0512 max mem: 2905
Epoch: [91] [400/625] eta: 0:07:39 lr: 0.003388 min_lr: 0.003388 loss: 3.1891 (3.1953) class_acc: 0.4883 (0.4832) weight_decay: 0.0500 (0.0500) grad_norm: 1.6608 (2.4206) time: 1.9308 data: 0.0008 max mem: 2905
Epoch: [91] [600/625] eta: 0:00:50 lr: 0.003383 min_lr: 0.003383 loss: 3.2319 (3.2035) class_acc: 0.4766 (0.4809) weight_decay: 0.0500 (0.0500) grad_norm: 2.2538 (2.4311) time: 1.9758 data: 0.0007 max mem: 2905
Epoch: [91] [624/625] eta: 0:00:02 lr: 0.003382 min_lr: 0.003382 loss: 3.2151 (3.2043) class_acc: 0.4766 (0.4808) weight_decay: 0.0500 (0.0500) grad_norm: 1.7773 (2.4272) time: 0.3949 data: 0.0030 max mem: 2905
Epoch: [91] Total time: 0:20:53 (2.0049 s / it)
Averaged stats: lr: 0.003382 min_lr: 0.003382 loss: 3.2151 (3.2037) class_acc: 0.4766 (0.4814) weight_decay: 0.0500 (0.0500) grad_norm: 1.7773 (2.4272)
Test: [ 0/50] eta: 0:10:32 loss: 3.0355 (3.0355) acc1: 36.8000 (36.8000) acc5: 64.8000 (64.8000) time: 12.6457 data: 12.6177 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.8191 (2.7846) acc1: 40.8000 (42.3273) acc5: 65.6000 (68.2182) time: 2.0807 data: 2.0610 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.9319 (2.9184) acc1: 38.4000 (40.6095) acc5: 65.6000 (66.5905) time: 1.0990 data: 1.0793 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.0517 (2.9091) acc1: 39.2000 (40.2839) acc5: 65.6000 (66.3484) time: 1.1542 data: 1.1349 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9775 (2.9410) acc1: 39.2000 (40.1561) acc5: 65.6000 (65.8341) time: 0.9119 data: 0.8935 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9188 (2.9157) acc1: 39.2000 (40.4800) acc5: 64.8000 (66.1440) time: 0.8133 data: 0.7940 max mem: 2905
Test: Total time: 0:00:53 (1.0732 s / it)
* Acc@1 41.224 Acc@5 66.532 loss 2.869
Accuracy of the model on the 50000 test images: 41.2%
Max accuracy: 50.54%
Epoch: [92] [ 0/625] eta: 4:09:38 lr: 0.003382 min_lr: 0.003382 loss: 3.1955 (3.1955) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 23.9660 data: 19.1441 max mem: 2905
Epoch: [92] [200/625] eta: 0:15:13 lr: 0.003377 min_lr: 0.003377 loss: 3.0893 (3.1763) class_acc: 0.4805 (0.4867) weight_decay: 0.0500 (0.0500) grad_norm: 2.2637 (2.2168) time: 1.9879 data: 0.0055 max mem: 2905
Epoch: [92] [400/625] eta: 0:07:49 lr: 0.003372 min_lr: 0.003372 loss: 3.2306 (3.1971) class_acc: 0.4766 (0.4825) weight_decay: 0.0500 (0.0500) grad_norm: 2.6521 (2.4301) time: 2.1071 data: 0.0009 max mem: 2905
Epoch: [92] [600/625] eta: 0:00:51 lr: 0.003367 min_lr: 0.003367 loss: 3.1990 (3.2010) class_acc: 0.4766 (0.4817) weight_decay: 0.0500 (0.0500) grad_norm: 2.4445 (2.4852) time: 1.9103 data: 0.0009 max mem: 2905
Epoch: [92] [624/625] eta: 0:00:02 lr: 0.003366 min_lr: 0.003366 loss: 3.2054 (3.2019) class_acc: 0.4727 (0.4812) weight_decay: 0.0500 (0.0500) grad_norm: 1.7489 (2.4519) time: 0.5328 data: 0.0013 max mem: 2905
Epoch: [92] Total time: 0:20:51 (2.0020 s / it)
Averaged stats: lr: 0.003366 min_lr: 0.003366 loss: 3.2054 (3.2000) class_acc: 0.4727 (0.4825) weight_decay: 0.0500 (0.0500) grad_norm: 1.7489 (2.4519)
Test: [ 0/50] eta: 0:09:50 loss: 2.7334 (2.7334) acc1: 44.8000 (44.8000) acc5: 71.2000 (71.2000) time: 11.8003 data: 11.7737 max mem: 2905
Test: [10/50] eta: 0:01:12 loss: 2.5515 (2.4975) acc1: 48.0000 (47.2000) acc5: 69.6000 (70.4000) time: 1.8221 data: 1.8021 max mem: 2905
Test: [20/50] eta: 0:00:41 loss: 2.6460 (2.6877) acc1: 44.8000 (44.0000) acc5: 69.6000 (68.6857) time: 0.8466 data: 0.8265 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.6825 (2.6899) acc1: 39.2000 (43.5871) acc5: 68.0000 (68.8258) time: 0.9090 data: 0.8884 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.7232 (2.7105) acc1: 43.2000 (43.6293) acc5: 68.0000 (68.6439) time: 0.7509 data: 0.7314 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8209 (2.7290) acc1: 43.2000 (43.6000) acc5: 67.2000 (68.3200) time: 0.4287 data: 0.4102 max mem: 2905
Test: Total time: 0:00:46 (0.9390 s / it)
* Acc@1 44.554 Acc@5 69.216 loss 2.669
Accuracy of the model on the 50000 test images: 44.6%
Max accuracy: 50.54%
Epoch: [93] [ 0/625] eta: 3:57:51 lr: 0.003366 min_lr: 0.003366 loss: 3.1277 (3.1277) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 22.8344 data: 17.9874 max mem: 2905
Epoch: [93] [200/625] eta: 0:14:04 lr: 0.003361 min_lr: 0.003361 loss: 3.2206 (3.1788) class_acc: 0.4844 (0.4901) weight_decay: 0.0500 (0.0500) grad_norm: 2.1617 (2.3268) time: 1.9248 data: 0.0007 max mem: 2905
Epoch: [93] [400/625] eta: 0:07:21 lr: 0.003355 min_lr: 0.003355 loss: 3.1919 (3.1772) class_acc: 0.4883 (0.4884) weight_decay: 0.0500 (0.0500) grad_norm: 2.3519 (2.3898) time: 1.9549 data: 0.0008 max mem: 2905
Epoch: [93] [600/625] eta: 0:00:49 lr: 0.003350 min_lr: 0.003350 loss: 3.2102 (3.1923) class_acc: 0.4727 (0.4846) weight_decay: 0.0500 (0.0500) grad_norm: 1.9046 (inf) time: 2.0385 data: 0.0025 max mem: 2905
Epoch: [93] [624/625] eta: 0:00:01 lr: 0.003350 min_lr: 0.003350 loss: 3.1707 (3.1921) class_acc: 0.4844 (0.4845) weight_decay: 0.0500 (0.0500) grad_norm: 1.7101 (inf) time: 0.5544 data: 0.0012 max mem: 2905
Epoch: [93] Total time: 0:20:20 (1.9530 s / it)
Averaged stats: lr: 0.003350 min_lr: 0.003350 loss: 3.1707 (3.1948) class_acc: 0.4844 (0.4839) weight_decay: 0.0500 (0.0500) grad_norm: 1.7101 (inf)
Test: [ 0/50] eta: 0:10:15 loss: 2.7892 (2.7892) acc1: 38.4000 (38.4000) acc5: 67.2000 (67.2000) time: 12.3007 data: 12.2757 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.7373 (2.7327) acc1: 45.6000 (43.3455) acc5: 68.0000 (69.0182) time: 2.0597 data: 2.0408 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.8712 (2.8986) acc1: 40.0000 (40.1905) acc5: 67.2000 (65.9429) time: 1.1081 data: 1.0895 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.8939 (2.8708) acc1: 39.2000 (41.1097) acc5: 62.4000 (65.9355) time: 1.1571 data: 1.1383 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.8631 (2.9068) acc1: 41.6000 (40.8585) acc5: 64.8000 (65.6781) time: 1.0247 data: 1.0033 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8560 (2.8971) acc1: 40.0000 (40.7360) acc5: 65.6000 (65.9200) time: 0.9014 data: 0.8793 max mem: 2905
Test: Total time: 0:00:56 (1.1287 s / it)
* Acc@1 41.162 Acc@5 65.946 loss 2.865
Accuracy of the model on the 50000 test images: 41.2%
Max accuracy: 50.54%
Epoch: [94] [ 0/625] eta: 3:41:05 lr: 0.003350 min_lr: 0.003350 loss: 3.2145 (3.2145) class_acc: 0.4766 (0.4766) weight_decay: 0.0500 (0.0500) time: 21.2249 data: 19.2294 max mem: 2905
Epoch: [94] [200/625] eta: 0:14:09 lr: 0.003344 min_lr: 0.003344 loss: 3.1320 (3.1687) class_acc: 0.4844 (0.4890) weight_decay: 0.0500 (0.0500) grad_norm: 1.8061 (2.2527) time: 1.9739 data: 0.0007 max mem: 2905
Epoch: [94] [400/625] eta: 0:07:26 lr: 0.003339 min_lr: 0.003339 loss: 3.1528 (3.1754) class_acc: 0.5000 (0.4877) weight_decay: 0.0500 (0.0500) grad_norm: 1.7392 (2.3678) time: 1.9544 data: 0.0007 max mem: 2905
Epoch: [94] [600/625] eta: 0:00:49 lr: 0.003334 min_lr: 0.003334 loss: 3.2093 (3.1897) class_acc: 0.4727 (0.4851) weight_decay: 0.0500 (0.0500) grad_norm: 1.7204 (2.2605) time: 2.0430 data: 0.0007 max mem: 2905
Epoch: [94] [624/625] eta: 0:00:01 lr: 0.003333 min_lr: 0.003333 loss: 3.1833 (3.1901) class_acc: 0.4805 (0.4847) weight_decay: 0.0500 (0.0500) grad_norm: 2.1688 (2.2645) time: 0.9834 data: 0.0015 max mem: 2905
Epoch: [94] Total time: 0:20:10 (1.9365 s / it)
Averaged stats: lr: 0.003333 min_lr: 0.003333 loss: 3.1833 (3.1929) class_acc: 0.4805 (0.4840) weight_decay: 0.0500 (0.0500) grad_norm: 2.1688 (2.2645)
Test: [ 0/50] eta: 0:11:08 loss: 3.7219 (3.7219) acc1: 29.6000 (29.6000) acc5: 52.8000 (52.8000) time: 13.3745 data: 13.3477 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 3.7219 (3.7422) acc1: 28.0000 (29.0182) acc5: 52.8000 (53.0909) time: 2.0177 data: 1.9984 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 3.8554 (3.9058) acc1: 26.4000 (27.3905) acc5: 48.8000 (50.5143) time: 0.8972 data: 0.8786 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 4.0046 (3.9151) acc1: 22.4000 (26.6839) acc5: 46.4000 (50.2710) time: 0.8295 data: 0.8103 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 4.0415 (3.9787) acc1: 23.2000 (26.1659) acc5: 47.2000 (49.7951) time: 0.5859 data: 0.5670 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 4.1366 (4.0060) acc1: 25.6000 (25.9360) acc5: 48.0000 (49.4720) time: 0.5389 data: 0.5200 max mem: 2905
Test: Total time: 0:00:43 (0.8734 s / it)
* Acc@1 26.544 Acc@5 50.080 loss 3.954
Accuracy of the model on the 50000 test images: 26.5%
Max accuracy: 50.54%
Epoch: [95] [ 0/625] eta: 3:26:53 lr: 0.003333 min_lr: 0.003333 loss: 3.0119 (3.0119) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 19.8616 data: 19.6932 max mem: 2905
Epoch: [95] [200/625] eta: 0:14:18 lr: 0.003327 min_lr: 0.003327 loss: 3.1781 (3.1674) class_acc: 0.5000 (0.4885) weight_decay: 0.0500 (0.0500) grad_norm: 1.5392 (2.3899) time: 1.8658 data: 0.0010 max mem: 2905
Epoch: [95] [400/625] eta: 0:07:19 lr: 0.003322 min_lr: 0.003322 loss: 3.2103 (3.1855) class_acc: 0.4844 (0.4858) weight_decay: 0.0500 (0.0500) grad_norm: 2.5359 (2.4348) time: 1.9323 data: 0.0008 max mem: 2905
Epoch: [95] [600/625] eta: 0:00:48 lr: 0.003317 min_lr: 0.003317 loss: 3.1700 (3.1942) class_acc: 0.4844 (0.4839) weight_decay: 0.0500 (0.0500) grad_norm: 2.6757 (2.4888) time: 2.0826 data: 0.0356 max mem: 2905
Epoch: [95] [624/625] eta: 0:00:01 lr: 0.003316 min_lr: 0.003316 loss: 3.2174 (3.1947) class_acc: 0.4688 (0.4837) weight_decay: 0.0500 (0.0500) grad_norm: 2.6382 (2.4824) time: 0.8019 data: 0.0015 max mem: 2905
Epoch: [95] Total time: 0:19:56 (1.9137 s / it)
Averaged stats: lr: 0.003316 min_lr: 0.003316 loss: 3.2174 (3.1875) class_acc: 0.4688 (0.4846) weight_decay: 0.0500 (0.0500) grad_norm: 2.6382 (2.4824)
Test: [ 0/50] eta: 0:10:36 loss: 3.2388 (3.2388) acc1: 31.2000 (31.2000) acc5: 57.6000 (57.6000) time: 12.7226 data: 12.6828 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.7290 (2.8571) acc1: 40.0000 (40.3636) acc5: 66.4000 (67.0545) time: 2.1277 data: 2.1026 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.0270 (2.9852) acc1: 39.2000 (38.2857) acc5: 64.0000 (64.6095) time: 1.0855 data: 1.0627 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.0936 (3.0343) acc1: 33.6000 (37.3161) acc5: 62.4000 (63.3806) time: 1.0356 data: 1.0144 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.2047 (3.0753) acc1: 34.4000 (37.3463) acc5: 60.0000 (62.5756) time: 0.7524 data: 0.7319 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1869 (3.0730) acc1: 35.2000 (37.2160) acc5: 61.6000 (62.6400) time: 0.6658 data: 0.6445 max mem: 2905
Test: Total time: 0:00:50 (1.0064 s / it)
* Acc@1 38.344 Acc@5 63.160 loss 3.042
Accuracy of the model on the 50000 test images: 38.3%
Max accuracy: 50.54%
Epoch: [96] [ 0/625] eta: 4:32:47 lr: 0.003316 min_lr: 0.003316 loss: 3.3384 (3.3384) class_acc: 0.4531 (0.4531) weight_decay: 0.0500 (0.0500) time: 26.1884 data: 17.6749 max mem: 2905
Epoch: [96] [200/625] eta: 0:14:28 lr: 0.003311 min_lr: 0.003311 loss: 3.1485 (3.1673) class_acc: 0.4922 (0.4882) weight_decay: 0.0500 (0.0500) grad_norm: 1.9368 (2.4397) time: 1.9324 data: 0.0251 max mem: 2905
Epoch: [96] [400/625] eta: 0:07:27 lr: 0.003305 min_lr: 0.003305 loss: 3.1798 (3.1821) class_acc: 0.4766 (0.4848) weight_decay: 0.0500 (0.0500) grad_norm: 2.4185 (2.3982) time: 1.9602 data: 0.0455 max mem: 2905
Epoch: [96] [600/625] eta: 0:00:49 lr: 0.003300 min_lr: 0.003300 loss: 3.2417 (3.1865) class_acc: 0.4727 (0.4850) weight_decay: 0.0500 (0.0500) grad_norm: 1.8276 (2.3432) time: 1.9510 data: 0.0300 max mem: 2905
Epoch: [96] [624/625] eta: 0:00:01 lr: 0.003299 min_lr: 0.003299 loss: 3.2191 (3.1878) class_acc: 0.4805 (0.4846) weight_decay: 0.0500 (0.0500) grad_norm: 2.1578 (2.3450) time: 1.0348 data: 0.0176 max mem: 2905
Epoch: [96] Total time: 0:20:10 (1.9368 s / it)
Averaged stats: lr: 0.003299 min_lr: 0.003299 loss: 3.2191 (3.1868) class_acc: 0.4805 (0.4853) weight_decay: 0.0500 (0.0500) grad_norm: 2.1578 (2.3450)
Test: [ 0/50] eta: 0:10:37 loss: 3.1733 (3.1733) acc1: 42.4000 (42.4000) acc5: 61.6000 (61.6000) time: 12.7541 data: 12.7302 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.8607 (2.8430) acc1: 40.8000 (42.1818) acc5: 64.8000 (66.5455) time: 2.0897 data: 2.0693 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.0559 (3.0544) acc1: 37.6000 (38.6286) acc5: 63.2000 (64.0762) time: 1.0608 data: 1.0407 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.1425 (3.0267) acc1: 37.6000 (39.2774) acc5: 63.2000 (64.4645) time: 1.0541 data: 1.0342 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0507 (3.0390) acc1: 39.2000 (38.9854) acc5: 64.0000 (64.2537) time: 0.7882 data: 0.7690 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1531 (3.0842) acc1: 36.8000 (38.2240) acc5: 62.4000 (63.5360) time: 0.6698 data: 0.6511 max mem: 2905
Test: Total time: 0:00:50 (1.0071 s / it)
* Acc@1 38.232 Acc@5 63.632 loss 3.045
Accuracy of the model on the 50000 test images: 38.2%
Max accuracy: 50.54%
Epoch: [97] [ 0/625] eta: 3:34:01 lr: 0.003299 min_lr: 0.003299 loss: 3.0913 (3.0913) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 20.5468 data: 17.2245 max mem: 2905
Epoch: [97] [200/625] eta: 0:14:08 lr: 0.003294 min_lr: 0.003294 loss: 3.2464 (3.1633) class_acc: 0.4844 (0.4878) weight_decay: 0.0500 (0.0500) grad_norm: 3.2161 (2.4480) time: 1.8910 data: 0.0011 max mem: 2905
Epoch: [97] [400/625] eta: 0:07:19 lr: 0.003288 min_lr: 0.003288 loss: 3.1565 (3.1726) class_acc: 0.4766 (0.4880) weight_decay: 0.0500 (0.0500) grad_norm: 1.9234 (2.3524) time: 1.9757 data: 0.0007 max mem: 2905
Epoch: [97] [600/625] eta: 0:00:48 lr: 0.003283 min_lr: 0.003283 loss: 3.2316 (3.1810) class_acc: 0.4805 (0.4878) weight_decay: 0.0500 (0.0500) grad_norm: 2.0122 (2.3987) time: 1.9237 data: 0.0009 max mem: 2905
Epoch: [97] [624/625] eta: 0:00:01 lr: 0.003282 min_lr: 0.003282 loss: 3.1572 (3.1808) class_acc: 0.4805 (0.4876) weight_decay: 0.0500 (0.0500) grad_norm: 2.0122 (2.4072) time: 0.7896 data: 0.0016 max mem: 2905
Epoch: [97] Total time: 0:19:49 (1.9040 s / it)
Averaged stats: lr: 0.003282 min_lr: 0.003282 loss: 3.1572 (3.1837) class_acc: 0.4805 (0.4861) weight_decay: 0.0500 (0.0500) grad_norm: 2.0122 (2.4072)
Test: [ 0/50] eta: 0:10:40 loss: 2.8172 (2.8172) acc1: 43.2000 (43.2000) acc5: 69.6000 (69.6000) time: 12.8192 data: 12.7952 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.1881 (3.0690) acc1: 36.8000 (38.4727) acc5: 60.8000 (63.8545) time: 2.2419 data: 2.2226 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 3.2060 (3.1854) acc1: 34.4000 (36.0762) acc5: 60.0000 (61.1810) time: 1.2130 data: 1.1933 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.1958 (3.1799) acc1: 33.6000 (36.3355) acc5: 59.2000 (60.8000) time: 1.0999 data: 1.0799 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.1758 (3.2466) acc1: 34.4000 (35.2585) acc5: 58.4000 (59.4341) time: 0.6607 data: 0.6421 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1973 (3.2212) acc1: 34.4000 (35.4240) acc5: 58.4000 (59.9680) time: 0.5383 data: 0.5172 max mem: 2905
Test: Total time: 0:00:51 (1.0219 s / it)
* Acc@1 35.868 Acc@5 60.620 loss 3.195
Accuracy of the model on the 50000 test images: 35.9%
Max accuracy: 50.54%
Epoch: [98] [ 0/625] eta: 3:45:23 lr: 0.003282 min_lr: 0.003282 loss: 2.9842 (2.9842) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 21.6375 data: 18.5556 max mem: 2905
Epoch: [98] [200/625] eta: 0:14:19 lr: 0.003276 min_lr: 0.003276 loss: 3.1381 (3.1743) class_acc: 0.4883 (0.4863) weight_decay: 0.0500 (0.0500) grad_norm: 1.8925 (2.2741) time: 1.8770 data: 0.0333 max mem: 2905
Epoch: [98] [400/625] eta: 0:07:19 lr: 0.003271 min_lr: 0.003271 loss: 3.2362 (3.1812) class_acc: 0.4805 (0.4859) weight_decay: 0.0500 (0.0500) grad_norm: 1.9335 (2.2470) time: 1.9194 data: 0.0756 max mem: 2905
Epoch: [98] [600/625] eta: 0:00:48 lr: 0.003265 min_lr: 0.003265 loss: 3.1855 (3.1871) class_acc: 0.4844 (0.4852) weight_decay: 0.0500 (0.0500) grad_norm: 2.2260 (2.3422) time: 1.9480 data: 0.0007 max mem: 2905
Epoch: [98] [624/625] eta: 0:00:01 lr: 0.003265 min_lr: 0.003265 loss: 3.1908 (3.1877) class_acc: 0.4805 (0.4851) weight_decay: 0.0500 (0.0500) grad_norm: 1.8568 (2.3188) time: 0.9145 data: 0.0015 max mem: 2905
Epoch: [98] Total time: 0:19:52 (1.9085 s / it)
Averaged stats: lr: 0.003265 min_lr: 0.003265 loss: 3.1908 (3.1826) class_acc: 0.4805 (0.4861) weight_decay: 0.0500 (0.0500) grad_norm: 1.8568 (2.3188)
Test: [ 0/50] eta: 0:09:33 loss: 3.4754 (3.4754) acc1: 30.4000 (30.4000) acc5: 54.4000 (54.4000) time: 11.4651 data: 11.4382 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.8267 (2.9547) acc1: 40.8000 (40.2909) acc5: 62.4000 (62.6182) time: 1.9294 data: 1.9096 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 3.1309 (3.1653) acc1: 36.0000 (36.3429) acc5: 62.4000 (61.0286) time: 1.0109 data: 0.9920 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.3449 (3.2374) acc1: 32.0000 (35.6129) acc5: 57.6000 (60.0258) time: 0.9970 data: 0.9780 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.3578 (3.2939) acc1: 34.4000 (35.5707) acc5: 56.8000 (59.3756) time: 0.9319 data: 0.9104 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2328 (3.2922) acc1: 36.8000 (35.7440) acc5: 58.4000 (59.3280) time: 0.7029 data: 0.6817 max mem: 2905
Test: Total time: 0:00:54 (1.0803 s / it)
* Acc@1 36.370 Acc@5 60.698 loss 3.247
Accuracy of the model on the 50000 test images: 36.4%
Max accuracy: 50.54%
Epoch: [99] [ 0/625] eta: 3:36:58 lr: 0.003265 min_lr: 0.003265 loss: 3.0258 (3.0258) class_acc: 0.5117 (0.5117) weight_decay: 0.0500 (0.0500) time: 20.8301 data: 19.7825 max mem: 2905
Epoch: [99] [200/625] eta: 0:14:16 lr: 0.003259 min_lr: 0.003259 loss: 3.1685 (3.1541) class_acc: 0.4883 (0.4904) weight_decay: 0.0500 (0.0500) grad_norm: 1.8130 (2.3655) time: 1.7334 data: 0.0006 max mem: 2905
Epoch: [99] [400/625] eta: 0:07:23 lr: 0.003253 min_lr: 0.003253 loss: 3.2450 (3.1692) class_acc: 0.4805 (0.4886) weight_decay: 0.0500 (0.0500) grad_norm: 2.3855 (2.4884) time: 1.8393 data: 0.0006 max mem: 2905
Epoch: [99] [600/625] eta: 0:00:48 lr: 0.003248 min_lr: 0.003248 loss: 3.2114 (3.1752) class_acc: 0.4766 (0.4871) weight_decay: 0.0500 (0.0500) grad_norm: 1.7437 (2.4589) time: 1.9574 data: 0.0008 max mem: 2905
Epoch: [99] [624/625] eta: 0:00:01 lr: 0.003247 min_lr: 0.003247 loss: 3.2502 (3.1780) class_acc: 0.4805 (0.4867) weight_decay: 0.0500 (0.0500) grad_norm: 1.5675 (2.4343) time: 0.4238 data: 0.0022 max mem: 2905
Epoch: [99] Total time: 0:19:53 (1.9098 s / it)
Averaged stats: lr: 0.003247 min_lr: 0.003247 loss: 3.2502 (3.1779) class_acc: 0.4805 (0.4874) weight_decay: 0.0500 (0.0500) grad_norm: 1.5675 (2.4343)
Test: [ 0/50] eta: 0:10:29 loss: 2.9894 (2.9894) acc1: 42.4000 (42.4000) acc5: 64.8000 (64.8000) time: 12.5814 data: 12.5505 max mem: 2905
Test: [10/50] eta: 0:01:12 loss: 3.0311 (3.1152) acc1: 37.6000 (38.6182) acc5: 61.6000 (61.0182) time: 1.8156 data: 1.7956 max mem: 2905
Test: [20/50] eta: 0:00:41 loss: 3.0638 (3.2249) acc1: 37.6000 (36.8381) acc5: 60.8000 (60.4191) time: 0.8368 data: 0.8170 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 3.1582 (3.1819) acc1: 36.0000 (37.0065) acc5: 61.6000 (60.8774) time: 0.9198 data: 0.9005 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.1503 (3.2181) acc1: 36.0000 (36.3512) acc5: 60.8000 (60.4098) time: 0.7385 data: 0.7193 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.2797 (3.2487) acc1: 34.4000 (35.9200) acc5: 57.6000 (59.8080) time: 0.4536 data: 0.4338 max mem: 2905
Test: Total time: 0:00:47 (0.9532 s / it)
* Acc@1 36.152 Acc@5 60.362 loss 3.227
Accuracy of the model on the 50000 test images: 36.2%
Max accuracy: 50.54%
Epoch: [100] [ 0/625] eta: 3:39:06 lr: 0.003247 min_lr: 0.003247 loss: 3.1840 (3.1840) class_acc: 0.4805 (0.4805) weight_decay: 0.0500 (0.0500) time: 21.0347 data: 19.0746 max mem: 2905
Epoch: [100] [200/625] eta: 0:14:15 lr: 0.003242 min_lr: 0.003242 loss: 3.1730 (3.1509) class_acc: 0.4922 (0.4960) weight_decay: 0.0500 (0.0500) grad_norm: 1.9238 (inf) time: 1.9224 data: 0.0095 max mem: 2905
Epoch: [100] [400/625] eta: 0:07:18 lr: 0.003236 min_lr: 0.003236 loss: 3.1646 (3.1680) class_acc: 0.4922 (0.4901) weight_decay: 0.0500 (0.0500) grad_norm: 1.9940 (inf) time: 1.7774 data: 0.0140 max mem: 2905
Epoch: [100] [600/625] eta: 0:00:48 lr: 0.003230 min_lr: 0.003230 loss: 3.1874 (3.1789) class_acc: 0.4766 (0.4879) weight_decay: 0.0500 (0.0500) grad_norm: 2.8117 (inf) time: 1.8698 data: 0.0199 max mem: 2905
Epoch: [100] [624/625] eta: 0:00:01 lr: 0.003230 min_lr: 0.003230 loss: 3.1544 (3.1785) class_acc: 0.4805 (0.4877) weight_decay: 0.0500 (0.0500) grad_norm: 1.9590 (inf) time: 0.8299 data: 0.0016 max mem: 2905
Epoch: [100] Total time: 0:19:40 (1.8881 s / it)
Averaged stats: lr: 0.003230 min_lr: 0.003230 loss: 3.1544 (3.1749) class_acc: 0.4805 (0.4877) weight_decay: 0.0500 (0.0500) grad_norm: 1.9590 (inf)
Test: [ 0/50] eta: 0:10:30 loss: 3.2758 (3.2758) acc1: 32.0000 (32.0000) acc5: 60.0000 (60.0000) time: 12.6001 data: 12.5677 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.9321 (2.8179) acc1: 41.6000 (41.5273) acc5: 68.0000 (67.8545) time: 1.9304 data: 1.9080 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.9604 (2.9723) acc1: 37.6000 (38.7048) acc5: 65.6000 (65.4476) time: 0.9590 data: 0.9392 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.0349 (2.9716) acc1: 37.6000 (39.6645) acc5: 63.2000 (65.1871) time: 0.9637 data: 0.9449 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.9051 (2.9577) acc1: 40.8000 (39.9415) acc5: 65.6000 (65.1317) time: 0.5951 data: 0.5765 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7267 (2.9271) acc1: 40.8000 (39.8880) acc5: 68.0000 (65.5840) time: 0.5293 data: 0.5115 max mem: 2905
Test: Total time: 0:00:45 (0.9121 s / it)
* Acc@1 40.750 Acc@5 65.856 loss 2.890
Accuracy of the model on the 50000 test images: 40.8%
Max accuracy: 50.54%
Epoch: [101] [ 0/625] eta: 3:33:06 lr: 0.003230 min_lr: 0.003230 loss: 3.1376 (3.1376) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 20.4577 data: 20.0765 max mem: 2905
Epoch: [101] [200/625] eta: 0:14:02 lr: 0.003224 min_lr: 0.003224 loss: 3.1664 (3.1466) class_acc: 0.4922 (0.4948) weight_decay: 0.0500 (0.0500) grad_norm: 2.7249 (2.4280) time: 1.7851 data: 0.9487 max mem: 2905
Epoch: [101] [400/625] eta: 0:07:19 lr: 0.003218 min_lr: 0.003218 loss: 3.1468 (3.1597) class_acc: 0.4961 (0.4907) weight_decay: 0.0500 (0.0500) grad_norm: 2.0668 (2.2974) time: 1.9406 data: 0.0175 max mem: 2905
Epoch: [101] [600/625] eta: 0:00:49 lr: 0.003212 min_lr: 0.003212 loss: 3.1833 (3.1699) class_acc: 0.4844 (0.4885) weight_decay: 0.0500 (0.0500) grad_norm: 2.4687 (2.4344) time: 2.0241 data: 0.0007 max mem: 2905
Epoch: [101] [624/625] eta: 0:00:01 lr: 0.003212 min_lr: 0.003212 loss: 3.1917 (3.1714) class_acc: 0.4688 (0.4881) weight_decay: 0.0500 (0.0500) grad_norm: 2.6580 (2.4374) time: 0.7518 data: 0.0017 max mem: 2905
Epoch: [101] Total time: 0:19:59 (1.9193 s / it)
Averaged stats: lr: 0.003212 min_lr: 0.003212 loss: 3.1917 (3.1750) class_acc: 0.4688 (0.4878) weight_decay: 0.0500 (0.0500) grad_norm: 2.6580 (2.4374)
Test: [ 0/50] eta: 0:10:25 loss: 2.9443 (2.9443) acc1: 37.6000 (37.6000) acc5: 65.6000 (65.6000) time: 12.5021 data: 12.4728 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.9216 (2.9307) acc1: 41.6000 (41.5273) acc5: 64.8000 (64.5818) time: 2.1930 data: 2.1736 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.0859 (3.1313) acc1: 37.6000 (37.4857) acc5: 63.2000 (61.6000) time: 1.2344 data: 1.2157 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.1563 (3.1153) acc1: 34.4000 (37.4968) acc5: 60.8000 (61.8065) time: 1.2435 data: 1.2252 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.1352 (3.1408) acc1: 36.8000 (37.0341) acc5: 60.8000 (61.7171) time: 0.8088 data: 0.7905 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2198 (3.1685) acc1: 35.2000 (36.5760) acc5: 60.0000 (61.2320) time: 0.7152 data: 0.6960 max mem: 2905
Test: Total time: 0:00:54 (1.0895 s / it)
* Acc@1 37.208 Acc@5 62.040 loss 3.130
Accuracy of the model on the 50000 test images: 37.2%
Max accuracy: 50.54%
Epoch: [102] [ 0/625] eta: 3:51:59 lr: 0.003212 min_lr: 0.003212 loss: 3.0261 (3.0261) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 22.2708 data: 20.7913 max mem: 2905
Epoch: [102] [200/625] eta: 0:14:13 lr: 0.003206 min_lr: 0.003206 loss: 3.1967 (3.1444) class_acc: 0.4883 (0.4942) weight_decay: 0.0500 (0.0500) grad_norm: 2.2358 (2.3092) time: 1.8325 data: 1.3429 max mem: 2905
Epoch: [102] [400/625] eta: 0:07:29 lr: 0.003200 min_lr: 0.003200 loss: 3.2087 (3.1594) class_acc: 0.4648 (0.4911) weight_decay: 0.0500 (0.0500) grad_norm: 2.3602 (2.2920) time: 1.9981 data: 0.0009 max mem: 2905
Epoch: [102] [600/625] eta: 0:00:49 lr: 0.003195 min_lr: 0.003195 loss: 3.1908 (3.1655) class_acc: 0.5000 (0.4896) weight_decay: 0.0500 (0.0500) grad_norm: 2.5207 (2.3051) time: 2.0056 data: 0.0014 max mem: 2905
Epoch: [102] [624/625] eta: 0:00:01 lr: 0.003194 min_lr: 0.003194 loss: 3.2182 (3.1673) class_acc: 0.4688 (0.4891) weight_decay: 0.0500 (0.0500) grad_norm: 2.5357 (2.3254) time: 0.8929 data: 0.0032 max mem: 2905
Epoch: [102] Total time: 0:20:20 (1.9535 s / it)
Averaged stats: lr: 0.003194 min_lr: 0.003194 loss: 3.2182 (3.1719) class_acc: 0.4688 (0.4884) weight_decay: 0.0500 (0.0500) grad_norm: 2.5357 (2.3254)
Test: [ 0/50] eta: 0:11:13 loss: 3.2653 (3.2653) acc1: 35.2000 (35.2000) acc5: 60.8000 (60.8000) time: 13.4636 data: 13.4379 max mem: 2905
Test: [10/50] eta: 0:01:33 loss: 3.4063 (3.3121) acc1: 32.8000 (34.1818) acc5: 56.8000 (58.4727) time: 2.3348 data: 2.3139 max mem: 2905
Test: [20/50] eta: 0:00:55 loss: 3.4072 (3.3944) acc1: 32.8000 (33.0286) acc5: 58.4000 (58.1714) time: 1.2554 data: 1.2359 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.4000 (3.3726) acc1: 34.4000 (34.2194) acc5: 59.2000 (58.1677) time: 1.0599 data: 1.0410 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.3375 (3.3585) acc1: 35.2000 (34.3024) acc5: 56.8000 (58.1659) time: 0.6676 data: 0.6482 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.4321 (3.3680) acc1: 33.6000 (34.0480) acc5: 56.8000 (58.0480) time: 0.5880 data: 0.5689 max mem: 2905
Test: Total time: 0:00:53 (1.0638 s / it)
* Acc@1 34.996 Acc@5 58.714 loss 3.336
Accuracy of the model on the 50000 test images: 35.0%
Max accuracy: 50.54%
Epoch: [103] [ 0/625] eta: 3:43:39 lr: 0.003194 min_lr: 0.003194 loss: 3.0953 (3.0953) class_acc: 0.5117 (0.5117) weight_decay: 0.0500 (0.0500) time: 21.4712 data: 19.3919 max mem: 2905
Epoch: [103] [200/625] eta: 0:14:27 lr: 0.003188 min_lr: 0.003188 loss: 3.2309 (3.1527) class_acc: 0.4766 (0.4925) weight_decay: 0.0500 (0.0500) grad_norm: 1.6373 (2.3989) time: 1.8658 data: 0.0007 max mem: 2905
Epoch: [103] [400/625] eta: 0:07:25 lr: 0.003182 min_lr: 0.003182 loss: 3.1425 (3.1620) class_acc: 0.4805 (0.4906) weight_decay: 0.0500 (0.0500) grad_norm: 1.7598 (2.3464) time: 1.9866 data: 0.0007 max mem: 2905
Epoch: [103] [600/625] eta: 0:00:49 lr: 0.003176 min_lr: 0.003176 loss: 3.1077 (3.1660) class_acc: 0.4883 (0.4895) weight_decay: 0.0500 (0.0500) grad_norm: 2.0958 (2.3843) time: 1.9008 data: 0.0009 max mem: 2905
Epoch: [103] [624/625] eta: 0:00:01 lr: 0.003176 min_lr: 0.003176 loss: 3.1982 (3.1670) class_acc: 0.4805 (0.4893) weight_decay: 0.0500 (0.0500) grad_norm: 2.6730 (2.3986) time: 0.8285 data: 0.0019 max mem: 2905
Epoch: [103] Total time: 0:20:17 (1.9488 s / it)
Averaged stats: lr: 0.003176 min_lr: 0.003176 loss: 3.1982 (3.1690) class_acc: 0.4805 (0.4892) weight_decay: 0.0500 (0.0500) grad_norm: 2.6730 (2.3986)
Test: [ 0/50] eta: 0:10:09 loss: 2.9171 (2.9171) acc1: 38.4000 (38.4000) acc5: 64.8000 (64.8000) time: 12.1858 data: 12.1565 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.7624 (2.7191) acc1: 45.6000 (45.0182) acc5: 69.6000 (68.2182) time: 2.0918 data: 2.0732 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.8051 (2.8198) acc1: 40.8000 (42.3238) acc5: 65.6000 (66.7810) time: 1.1362 data: 1.1173 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.9191 (2.8257) acc1: 39.2000 (42.0903) acc5: 64.0000 (66.7355) time: 1.0936 data: 1.0734 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.6934 (2.8243) acc1: 40.8000 (42.3024) acc5: 67.2000 (66.9854) time: 0.7156 data: 0.6960 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6455 (2.8337) acc1: 41.6000 (42.0320) acc5: 67.2000 (66.9280) time: 0.7225 data: 0.7034 max mem: 2905
Test: Total time: 0:00:49 (0.9947 s / it)
* Acc@1 42.654 Acc@5 67.668 loss 2.807
Accuracy of the model on the 50000 test images: 42.7%
Max accuracy: 50.54%
Epoch: [104] [ 0/625] eta: 3:38:52 lr: 0.003176 min_lr: 0.003176 loss: 3.3443 (3.3443) class_acc: 0.4531 (0.4531) weight_decay: 0.0500 (0.0500) time: 21.0125 data: 20.8825 max mem: 2905
Epoch: [104] [200/625] eta: 0:14:28 lr: 0.003170 min_lr: 0.003170 loss: 3.1164 (3.1433) class_acc: 0.5039 (0.4987) weight_decay: 0.0500 (0.0500) grad_norm: 2.5210 (2.5574) time: 2.0468 data: 0.0043 max mem: 2905
Epoch: [104] [400/625] eta: 0:07:25 lr: 0.003164 min_lr: 0.003164 loss: 3.1662 (3.1526) class_acc: 0.4805 (0.4945) weight_decay: 0.0500 (0.0500) grad_norm: 2.3693 (2.5750) time: 2.0765 data: 0.0254 max mem: 2905
Epoch: [104] [600/625] eta: 0:00:49 lr: 0.003158 min_lr: 0.003158 loss: 3.2398 (3.1664) class_acc: 0.4805 (0.4916) weight_decay: 0.0500 (0.0500) grad_norm: 2.1587 (2.4623) time: 1.9450 data: 0.0044 max mem: 2905
Epoch: [104] [624/625] eta: 0:00:01 lr: 0.003158 min_lr: 0.003158 loss: 3.1528 (3.1661) class_acc: 0.5000 (0.4917) weight_decay: 0.0500 (0.0500) grad_norm: 1.8106 (2.4427) time: 0.8764 data: 0.0017 max mem: 2905
Epoch: [104] Total time: 0:20:02 (1.9237 s / it)
Averaged stats: lr: 0.003158 min_lr: 0.003158 loss: 3.1528 (3.1650) class_acc: 0.5000 (0.4900) weight_decay: 0.0500 (0.0500) grad_norm: 1.8106 (2.4427)
Test: [ 0/50] eta: 0:09:13 loss: 2.9056 (2.9056) acc1: 37.6000 (37.6000) acc5: 70.4000 (70.4000) time: 11.0689 data: 11.0371 max mem: 2905
Test: [10/50] eta: 0:01:16 loss: 2.7858 (2.7033) acc1: 44.8000 (44.5091) acc5: 70.4000 (68.2182) time: 1.9054 data: 1.8855 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.8192 (2.8607) acc1: 43.2000 (41.9048) acc5: 64.8000 (66.3619) time: 1.0375 data: 1.0188 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.9429 (2.8504) acc1: 38.4000 (41.7548) acc5: 64.8000 (66.6065) time: 1.0493 data: 1.0301 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9429 (2.8931) acc1: 38.4000 (40.8000) acc5: 65.6000 (66.1659) time: 0.8328 data: 0.8139 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9974 (2.8920) acc1: 38.4000 (40.7520) acc5: 64.0000 (66.2720) time: 0.7642 data: 0.7457 max mem: 2905
Test: Total time: 0:00:51 (1.0282 s / it)
* Acc@1 41.038 Acc@5 66.228 loss 2.870
Accuracy of the model on the 50000 test images: 41.0%
Max accuracy: 50.54%
Epoch: [105] [ 0/625] eta: 3:31:32 lr: 0.003158 min_lr: 0.003158 loss: 3.2014 (3.2014) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 20.3073 data: 20.1807 max mem: 2905
Epoch: [105] [200/625] eta: 0:14:14 lr: 0.003152 min_lr: 0.003152 loss: 3.1316 (3.1406) class_acc: 0.4961 (0.4958) weight_decay: 0.0500 (0.0500) grad_norm: 2.2005 (2.4949) time: 1.9919 data: 1.1444 max mem: 2905
Epoch: [105] [400/625] eta: 0:07:23 lr: 0.003146 min_lr: 0.003146 loss: 3.1241 (3.1512) class_acc: 0.4961 (0.4927) weight_decay: 0.0500 (0.0500) grad_norm: 2.2250 (2.3531) time: 2.1765 data: 0.0260 max mem: 2905
Epoch: [105] [600/625] eta: 0:00:49 lr: 0.003140 min_lr: 0.003140 loss: 3.2256 (3.1601) class_acc: 0.4727 (0.4906) weight_decay: 0.0500 (0.0500) grad_norm: 1.8167 (2.3134) time: 2.0227 data: 0.0117 max mem: 2905
Epoch: [105] [624/625] eta: 0:00:01 lr: 0.003139 min_lr: 0.003139 loss: 3.1774 (3.1607) class_acc: 0.4883 (0.4905) weight_decay: 0.0500 (0.0500) grad_norm: 2.0872 (2.3286) time: 0.8843 data: 0.0110 max mem: 2905
Epoch: [105] Total time: 0:20:13 (1.9421 s / it)
Averaged stats: lr: 0.003139 min_lr: 0.003139 loss: 3.1774 (3.1608) class_acc: 0.4883 (0.4910) weight_decay: 0.0500 (0.0500) grad_norm: 2.0872 (2.3286)
Test: [ 0/50] eta: 0:10:53 loss: 3.0985 (3.0985) acc1: 38.4000 (38.4000) acc5: 57.6000 (57.6000) time: 13.0717 data: 13.0460 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.7367 (2.7793) acc1: 42.4000 (42.3273) acc5: 68.0000 (67.3455) time: 2.1423 data: 2.1236 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.9104 (2.9556) acc1: 39.2000 (39.1619) acc5: 65.6000 (65.2191) time: 1.1294 data: 1.1112 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.0784 (2.9826) acc1: 36.8000 (38.8903) acc5: 64.0000 (64.8774) time: 1.1902 data: 1.1708 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0541 (3.0222) acc1: 36.8000 (38.3220) acc5: 62.4000 (64.1366) time: 0.8381 data: 0.8176 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0541 (3.0259) acc1: 37.6000 (38.4160) acc5: 62.4000 (64.2240) time: 0.8281 data: 0.8090 max mem: 2905
Test: Total time: 0:00:52 (1.0554 s / it)
* Acc@1 38.796 Acc@5 64.274 loss 2.991
Accuracy of the model on the 50000 test images: 38.8%
Max accuracy: 50.54%
Epoch: [106] [ 0/625] eta: 3:26:28 lr: 0.003139 min_lr: 0.003139 loss: 3.0712 (3.0712) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 19.8212 data: 16.6009 max mem: 2905
Epoch: [106] [200/625] eta: 0:14:34 lr: 0.003133 min_lr: 0.003133 loss: 3.0933 (3.1407) class_acc: 0.5000 (0.4959) weight_decay: 0.0500 (0.0500) grad_norm: 1.9702 (2.2952) time: 1.8725 data: 0.0009 max mem: 2905
Epoch: [106] [400/625] eta: 0:07:33 lr: 0.003127 min_lr: 0.003127 loss: 3.1157 (3.1477) class_acc: 0.5039 (0.4948) weight_decay: 0.0500 (0.0500) grad_norm: 2.2686 (2.2832) time: 1.9957 data: 0.0013 max mem: 2905
Epoch: [106] [600/625] eta: 0:00:50 lr: 0.003121 min_lr: 0.003121 loss: 3.1503 (3.1548) class_acc: 0.4883 (0.4930) weight_decay: 0.0500 (0.0500) grad_norm: 2.1014 (inf) time: 1.9333 data: 0.0008 max mem: 2905
Epoch: [106] [624/625] eta: 0:00:01 lr: 0.003121 min_lr: 0.003121 loss: 3.1527 (3.1560) class_acc: 0.4805 (0.4928) weight_decay: 0.0500 (0.0500) grad_norm: 1.7173 (inf) time: 0.7861 data: 0.0016 max mem: 2905
Epoch: [106] Total time: 0:20:23 (1.9584 s / it)
Averaged stats: lr: 0.003121 min_lr: 0.003121 loss: 3.1527 (3.1577) class_acc: 0.4805 (0.4914) weight_decay: 0.0500 (0.0500) grad_norm: 1.7173 (inf)
Test: [ 0/50] eta: 0:10:23 loss: 2.6026 (2.6026) acc1: 42.4000 (42.4000) acc5: 73.6000 (73.6000) time: 12.4737 data: 12.4500 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.6030 (2.6332) acc1: 47.2000 (47.5636) acc5: 72.0000 (70.1818) time: 2.0535 data: 2.0331 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.6467 (2.7372) acc1: 43.2000 (43.5810) acc5: 68.8000 (68.4952) time: 1.0799 data: 1.0594 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.8302 (2.7560) acc1: 40.0000 (43.1226) acc5: 67.2000 (67.9226) time: 1.0699 data: 1.0501 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8621 (2.7954) acc1: 41.6000 (42.9268) acc5: 67.2000 (67.3756) time: 0.7944 data: 0.7759 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8119 (2.8224) acc1: 42.4000 (42.8960) acc5: 66.4000 (66.9440) time: 0.7183 data: 0.6997 max mem: 2905
Test: Total time: 0:00:51 (1.0240 s / it)
* Acc@1 43.326 Acc@5 68.176 loss 2.768
Accuracy of the model on the 50000 test images: 43.3%
Max accuracy: 50.54%
Epoch: [107] [ 0/625] eta: 3:23:31 lr: 0.003121 min_lr: 0.003121 loss: 3.1516 (3.1516) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 19.5378 data: 17.3284 max mem: 2905
Epoch: [107] [200/625] eta: 0:13:57 lr: 0.003115 min_lr: 0.003115 loss: 3.2130 (3.1431) class_acc: 0.4766 (0.4946) weight_decay: 0.0500 (0.0500) grad_norm: 2.2384 (2.3360) time: 1.6691 data: 0.8265 max mem: 2905
Epoch: [107] [400/625] eta: 0:07:18 lr: 0.003109 min_lr: 0.003109 loss: 3.1817 (3.1498) class_acc: 0.4883 (0.4934) weight_decay: 0.0500 (0.0500) grad_norm: 2.0891 (2.3818) time: 2.0108 data: 0.3347 max mem: 2905
Epoch: [107] [600/625] eta: 0:00:49 lr: 0.003103 min_lr: 0.003103 loss: 3.1289 (3.1537) class_acc: 0.4922 (0.4930) weight_decay: 0.0500 (0.0500) grad_norm: 1.7743 (2.3908) time: 2.0010 data: 0.0373 max mem: 2905
Epoch: [107] [624/625] eta: 0:00:01 lr: 0.003102 min_lr: 0.003102 loss: 3.1967 (3.1549) class_acc: 0.4805 (0.4927) weight_decay: 0.0500 (0.0500) grad_norm: 1.4697 (2.3685) time: 0.8167 data: 0.0269 max mem: 2905
Epoch: [107] Total time: 0:20:02 (1.9234 s / it)
Averaged stats: lr: 0.003102 min_lr: 0.003102 loss: 3.1967 (3.1571) class_acc: 0.4805 (0.4912) weight_decay: 0.0500 (0.0500) grad_norm: 1.4697 (2.3685)
Test: [ 0/50] eta: 0:10:32 loss: 2.7467 (2.7467) acc1: 37.6000 (37.6000) acc5: 66.4000 (66.4000) time: 12.6427 data: 12.5997 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.7949 (2.7865) acc1: 44.8000 (43.9273) acc5: 68.0000 (68.4364) time: 2.0092 data: 1.9883 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.9228 (2.9221) acc1: 39.2000 (39.9619) acc5: 67.2000 (67.0476) time: 1.0199 data: 1.0005 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.0798 (2.9494) acc1: 36.8000 (40.2323) acc5: 64.0000 (66.2710) time: 1.0505 data: 1.0303 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0080 (2.9402) acc1: 40.8000 (40.4683) acc5: 62.4000 (66.2634) time: 0.8262 data: 0.8049 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8797 (2.9281) acc1: 40.8000 (40.6240) acc5: 67.2000 (66.3680) time: 0.7876 data: 0.7670 max mem: 2905
Test: Total time: 0:00:52 (1.0584 s / it)
* Acc@1 41.036 Acc@5 66.174 loss 2.896
Accuracy of the model on the 50000 test images: 41.0%
Max accuracy: 50.54%
Epoch: [108] [ 0/625] eta: 3:38:43 lr: 0.003102 min_lr: 0.003102 loss: 3.1909 (3.1909) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 20.9975 data: 16.6143 max mem: 2905
Epoch: [108] [200/625] eta: 0:14:29 lr: 0.003096 min_lr: 0.003096 loss: 3.1393 (3.1491) class_acc: 0.4844 (0.4938) weight_decay: 0.0500 (0.0500) grad_norm: 1.8122 (2.2510) time: 1.8571 data: 0.0008 max mem: 2905
Epoch: [108] [400/625] eta: 0:07:33 lr: 0.003090 min_lr: 0.003090 loss: 3.1372 (3.1548) class_acc: 0.4805 (0.4926) weight_decay: 0.0500 (0.0500) grad_norm: 2.3752 (2.4471) time: 1.9529 data: 0.0279 max mem: 2905
Epoch: [108] [600/625] eta: 0:00:50 lr: 0.003084 min_lr: 0.003084 loss: 3.1605 (3.1559) class_acc: 0.4922 (0.4923) weight_decay: 0.0500 (0.0500) grad_norm: 2.8046 (2.4023) time: 1.7287 data: 0.0342 max mem: 2905
Epoch: [108] [624/625] eta: 0:00:01 lr: 0.003083 min_lr: 0.003083 loss: 3.1924 (3.1573) class_acc: 0.4766 (0.4920) weight_decay: 0.0500 (0.0500) grad_norm: 2.0274 (2.3856) time: 1.0585 data: 0.0261 max mem: 2905
Epoch: [108] Total time: 0:20:32 (1.9719 s / it)
Averaged stats: lr: 0.003083 min_lr: 0.003083 loss: 3.1924 (3.1517) class_acc: 0.4766 (0.4928) weight_decay: 0.0500 (0.0500) grad_norm: 2.0274 (2.3856)
Test: [ 0/50] eta: 0:10:56 loss: 3.5392 (3.5392) acc1: 26.4000 (26.4000) acc5: 56.8000 (56.8000) time: 13.1332 data: 13.1083 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 3.4241 (3.4111) acc1: 33.6000 (33.1636) acc5: 59.2000 (58.8364) time: 2.1695 data: 2.1505 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.4225 (3.4542) acc1: 32.8000 (32.4571) acc5: 57.6000 (57.1810) time: 1.1366 data: 1.1185 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.4147 (3.4114) acc1: 33.6000 (33.8839) acc5: 56.8000 (57.9097) time: 1.1673 data: 1.1494 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.3768 (3.4268) acc1: 36.0000 (33.6585) acc5: 56.8000 (57.6390) time: 0.9507 data: 0.9320 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.4960 (3.4551) acc1: 32.8000 (33.2640) acc5: 55.2000 (57.1680) time: 0.8618 data: 0.8431 max mem: 2905
Test: Total time: 0:00:55 (1.1140 s / it)
* Acc@1 33.600 Acc@5 57.274 loss 3.427
Accuracy of the model on the 50000 test images: 33.6%
Max accuracy: 50.54%
Epoch: [109] [ 0/625] eta: 4:20:55 lr: 0.003083 min_lr: 0.003083 loss: 3.1236 (3.1236) class_acc: 0.5078 (0.5078) weight_decay: 0.0500 (0.0500) time: 25.0482 data: 20.8191 max mem: 2905
Epoch: [109] [200/625] eta: 0:15:00 lr: 0.003077 min_lr: 0.003077 loss: 3.1276 (3.1345) class_acc: 0.4922 (0.4954) weight_decay: 0.0500 (0.0500) grad_norm: 2.5944 (2.2594) time: 1.9012 data: 0.0019 max mem: 2905
Epoch: [109] [400/625] eta: 0:07:41 lr: 0.003071 min_lr: 0.003071 loss: 3.1508 (3.1457) class_acc: 0.4922 (0.4936) weight_decay: 0.0500 (0.0500) grad_norm: 1.6831 (2.2911) time: 1.8350 data: 0.0012 max mem: 2905
Epoch: [109] [600/625] eta: 0:00:50 lr: 0.003065 min_lr: 0.003065 loss: 3.1741 (3.1498) class_acc: 0.4766 (0.4923) weight_decay: 0.0500 (0.0500) grad_norm: 2.7819 (2.3950) time: 2.0337 data: 0.0013 max mem: 2905
Epoch: [109] [624/625] eta: 0:00:01 lr: 0.003064 min_lr: 0.003064 loss: 3.1565 (3.1510) class_acc: 0.4883 (0.4918) weight_decay: 0.0500 (0.0500) grad_norm: 1.8517 (2.3795) time: 0.7991 data: 0.0015 max mem: 2905
Epoch: [109] Total time: 0:20:40 (1.9851 s / it)
Averaged stats: lr: 0.003064 min_lr: 0.003064 loss: 3.1565 (3.1535) class_acc: 0.4883 (0.4926) weight_decay: 0.0500 (0.0500) grad_norm: 1.8517 (2.3795)
Test: [ 0/50] eta: 0:09:18 loss: 3.3506 (3.3506) acc1: 26.4000 (26.4000) acc5: 57.6000 (57.6000) time: 11.1705 data: 11.1417 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 3.2450 (3.1655) acc1: 38.4000 (37.2364) acc5: 61.6000 (61.5273) time: 2.1080 data: 2.0888 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.4150 (3.3660) acc1: 35.2000 (33.5619) acc5: 56.0000 (58.7429) time: 1.2460 data: 1.2276 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.5671 (3.3309) acc1: 32.0000 (34.0903) acc5: 56.0000 (59.0710) time: 1.0761 data: 1.0575 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.4460 (3.3640) acc1: 34.4000 (34.1268) acc5: 58.4000 (58.7707) time: 0.6133 data: 0.5941 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3549 (3.3470) acc1: 34.4000 (34.4320) acc5: 58.4000 (58.8640) time: 0.4925 data: 0.4732 max mem: 2905
Test: Total time: 0:00:49 (0.9945 s / it)
* Acc@1 34.692 Acc@5 59.638 loss 3.313
Accuracy of the model on the 50000 test images: 34.7%
Max accuracy: 50.54%
Epoch: [110] [ 0/625] eta: 3:27:59 lr: 0.003064 min_lr: 0.003064 loss: 3.2464 (3.2464) class_acc: 0.4609 (0.4609) weight_decay: 0.0500 (0.0500) time: 19.9674 data: 18.5506 max mem: 2905
Epoch: [110] [200/625] eta: 0:14:01 lr: 0.003058 min_lr: 0.003058 loss: 3.1178 (3.1293) class_acc: 0.5117 (0.4969) weight_decay: 0.0500 (0.0500) grad_norm: 2.5135 (2.3586) time: 1.6940 data: 0.6570 max mem: 2905
Epoch: [110] [400/625] eta: 0:07:17 lr: 0.003052 min_lr: 0.003052 loss: 3.1297 (3.1314) class_acc: 0.4883 (0.4965) weight_decay: 0.0500 (0.0500) grad_norm: 1.7539 (2.4547) time: 1.9405 data: 0.0007 max mem: 2905
Epoch: [110] [600/625] eta: 0:00:48 lr: 0.003046 min_lr: 0.003046 loss: 3.1310 (3.1387) class_acc: 0.4844 (0.4959) weight_decay: 0.0500 (0.0500) grad_norm: 2.0552 (2.4851) time: 1.8424 data: 0.0008 max mem: 2905
Epoch: [110] [624/625] eta: 0:00:01 lr: 0.003045 min_lr: 0.003045 loss: 3.1845 (3.1403) class_acc: 0.4844 (0.4954) weight_decay: 0.0500 (0.0500) grad_norm: 2.2768 (2.4909) time: 0.6790 data: 0.0013 max mem: 2905
Epoch: [110] Total time: 0:20:02 (1.9243 s / it)
Averaged stats: lr: 0.003045 min_lr: 0.003045 loss: 3.1845 (3.1448) class_acc: 0.4844 (0.4940) weight_decay: 0.0500 (0.0500) grad_norm: 2.2768 (2.4909)
Test: [ 0/50] eta: 0:10:39 loss: 3.3610 (3.3610) acc1: 32.8000 (32.8000) acc5: 56.0000 (56.0000) time: 12.7838 data: 12.7582 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 3.3610 (3.4506) acc1: 32.8000 (32.6545) acc5: 56.0000 (57.1636) time: 2.0904 data: 2.0713 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.5958 (3.6039) acc1: 28.0000 (30.7048) acc5: 52.8000 (54.5143) time: 1.0588 data: 1.0394 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.5662 (3.5653) acc1: 28.8000 (31.4065) acc5: 53.6000 (55.0194) time: 1.0114 data: 0.9916 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.4763 (3.5758) acc1: 31.2000 (31.0634) acc5: 54.4000 (54.8488) time: 0.6688 data: 0.6495 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.4508 (3.5631) acc1: 31.2000 (31.3280) acc5: 57.6000 (55.2320) time: 0.6077 data: 0.5879 max mem: 2905
Test: Total time: 0:00:48 (0.9734 s / it)
* Acc@1 32.310 Acc@5 56.064 loss 3.533
Accuracy of the model on the 50000 test images: 32.3%
Max accuracy: 50.54%
Epoch: [111] [ 0/625] eta: 3:25:03 lr: 0.003045 min_lr: 0.003045 loss: 3.3081 (3.3081) class_acc: 0.4609 (0.4609) weight_decay: 0.0500 (0.0500) time: 19.6849 data: 19.5456 max mem: 2905
Epoch: [111] [200/625] eta: 0:14:17 lr: 0.003039 min_lr: 0.003039 loss: 3.0880 (3.1228) class_acc: 0.5117 (0.5016) weight_decay: 0.0500 (0.0500) grad_norm: 2.1752 (2.3099) time: 1.7331 data: 0.0007 max mem: 2905
Epoch: [111] [400/625] eta: 0:07:20 lr: 0.003033 min_lr: 0.003033 loss: 3.1746 (3.1405) class_acc: 0.4883 (0.4963) weight_decay: 0.0500 (0.0500) grad_norm: 2.4413 (2.3929) time: 2.0589 data: 0.0402 max mem: 2905
Epoch: [111] [600/625] eta: 0:00:49 lr: 0.003027 min_lr: 0.003027 loss: 3.1327 (3.1485) class_acc: 0.4922 (0.4945) weight_decay: 0.0500 (0.0500) grad_norm: 2.2550 (2.5366) time: 2.0231 data: 0.0007 max mem: 2905
Epoch: [111] [624/625] eta: 0:00:01 lr: 0.003026 min_lr: 0.003026 loss: 3.1295 (3.1488) class_acc: 0.4805 (0.4942) weight_decay: 0.0500 (0.0500) grad_norm: 2.1352 (2.5109) time: 0.9173 data: 0.0012 max mem: 2905
Epoch: [111] Total time: 0:20:00 (1.9214 s / it)
Averaged stats: lr: 0.003026 min_lr: 0.003026 loss: 3.1295 (3.1461) class_acc: 0.4805 (0.4943) weight_decay: 0.0500 (0.0500) grad_norm: 2.1352 (2.5109)
Test: [ 0/50] eta: 0:10:21 loss: 3.0910 (3.0910) acc1: 38.4000 (38.4000) acc5: 60.8000 (60.8000) time: 12.4327 data: 12.4083 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.6630 (2.6809) acc1: 44.0000 (44.2909) acc5: 70.4000 (69.3091) time: 2.0591 data: 2.0373 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.8214 (2.9004) acc1: 41.6000 (40.9143) acc5: 67.2000 (66.4381) time: 1.0642 data: 1.0429 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.0150 (2.9705) acc1: 36.0000 (40.0000) acc5: 64.0000 (65.2129) time: 1.0654 data: 1.0453 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7833 (2.9176) acc1: 41.6000 (40.8585) acc5: 64.0000 (65.5220) time: 0.8258 data: 0.8070 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9047 (2.9343) acc1: 36.8000 (40.2880) acc5: 64.0000 (65.2000) time: 0.7696 data: 0.7499 max mem: 2905
Test: Total time: 0:00:50 (1.0166 s / it)
* Acc@1 41.010 Acc@5 65.934 loss 2.893
Accuracy of the model on the 50000 test images: 41.0%
Max accuracy: 50.54%
Epoch: [112] [ 0/625] eta: 3:35:00 lr: 0.003026 min_lr: 0.003026 loss: 2.9544 (2.9544) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.6410 data: 18.8036 max mem: 2905
Epoch: [112] [200/625] eta: 0:14:23 lr: 0.003020 min_lr: 0.003020 loss: 3.1282 (3.1449) class_acc: 0.4961 (0.4954) weight_decay: 0.0500 (0.0500) grad_norm: 1.8091 (2.3144) time: 1.9008 data: 0.0387 max mem: 2905
Epoch: [112] [400/625] eta: 0:07:12 lr: 0.003014 min_lr: 0.003014 loss: 3.1076 (3.1472) class_acc: 0.4844 (0.4954) weight_decay: 0.0500 (0.0500) grad_norm: 1.9001 (2.3687) time: 1.6005 data: 0.0007 max mem: 2905
Epoch: [112] [600/625] eta: 0:00:47 lr: 0.003007 min_lr: 0.003007 loss: 3.1915 (3.1477) class_acc: 0.4844 (0.4950) weight_decay: 0.0500 (0.0500) grad_norm: 2.2115 (2.3580) time: 1.8564 data: 0.0006 max mem: 2905
Epoch: [112] [624/625] eta: 0:00:01 lr: 0.003007 min_lr: 0.003007 loss: 3.1449 (3.1481) class_acc: 0.4961 (0.4949) weight_decay: 0.0500 (0.0500) grad_norm: 1.9117 (2.3456) time: 0.7710 data: 0.0015 max mem: 2905
Epoch: [112] Total time: 0:19:40 (1.8893 s / it)
Averaged stats: lr: 0.003007 min_lr: 0.003007 loss: 3.1449 (3.1430) class_acc: 0.4961 (0.4945) weight_decay: 0.0500 (0.0500) grad_norm: 1.9117 (2.3456)
Test: [ 0/50] eta: 0:09:59 loss: 3.3152 (3.3152) acc1: 32.0000 (32.0000) acc5: 60.0000 (60.0000) time: 11.9977 data: 11.9675 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 3.0412 (3.0686) acc1: 40.0000 (39.7818) acc5: 62.4000 (63.5636) time: 2.0132 data: 1.9925 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 3.1450 (3.2844) acc1: 37.6000 (36.8381) acc5: 60.0000 (60.8762) time: 1.0595 data: 1.0393 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.2180 (3.2495) acc1: 35.2000 (36.9806) acc5: 58.4000 (60.8258) time: 0.9231 data: 0.9036 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.2099 (3.2744) acc1: 36.0000 (36.8000) acc5: 57.6000 (60.5659) time: 0.5613 data: 0.5431 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3768 (3.2704) acc1: 36.8000 (36.6400) acc5: 57.6000 (60.7040) time: 0.5332 data: 0.5143 max mem: 2905
Test: Total time: 0:00:46 (0.9208 s / it)
* Acc@1 36.796 Acc@5 61.040 loss 3.241
Accuracy of the model on the 50000 test images: 36.8%
Max accuracy: 50.54%
Epoch: [113] [ 0/625] eta: 3:34:57 lr: 0.003007 min_lr: 0.003007 loss: 3.2775 (3.2775) class_acc: 0.4727 (0.4727) weight_decay: 0.0500 (0.0500) time: 20.6364 data: 17.7362 max mem: 2905
Epoch: [113] [200/625] eta: 0:13:44 lr: 0.003000 min_lr: 0.003000 loss: 3.1433 (3.1208) class_acc: 0.4961 (0.4994) weight_decay: 0.0500 (0.0500) grad_norm: 2.1387 (2.5876) time: 1.8093 data: 0.0008 max mem: 2905
Epoch: [113] [400/625] eta: 0:07:12 lr: 0.002994 min_lr: 0.002994 loss: 3.1086 (3.1297) class_acc: 0.4961 (0.4984) weight_decay: 0.0500 (0.0500) grad_norm: 2.1409 (inf) time: 1.8178 data: 0.0008 max mem: 2905
Epoch: [113] [600/625] eta: 0:00:48 lr: 0.002988 min_lr: 0.002988 loss: 3.1627 (3.1366) class_acc: 0.4922 (0.4969) weight_decay: 0.0500 (0.0500) grad_norm: 1.9516 (inf) time: 2.0145 data: 0.0008 max mem: 2905
Epoch: [113] [624/625] eta: 0:00:01 lr: 0.002987 min_lr: 0.002987 loss: 3.1529 (3.1368) class_acc: 0.4961 (0.4969) weight_decay: 0.0500 (0.0500) grad_norm: 2.2345 (inf) time: 0.7779 data: 0.0015 max mem: 2905
Epoch: [113] Total time: 0:19:54 (1.9114 s / it)
Averaged stats: lr: 0.002987 min_lr: 0.002987 loss: 3.1529 (3.1396) class_acc: 0.4961 (0.4951) weight_decay: 0.0500 (0.0500) grad_norm: 2.2345 (inf)
Test: [ 0/50] eta: 0:10:29 loss: 2.9229 (2.9229) acc1: 37.6000 (37.6000) acc5: 64.8000 (64.8000) time: 12.5963 data: 12.5672 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 3.0741 (3.0143) acc1: 39.2000 (39.0545) acc5: 63.2000 (64.0727) time: 2.1124 data: 2.0938 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.1591 (3.2334) acc1: 35.2000 (35.8857) acc5: 62.4000 (60.9905) time: 1.1058 data: 1.0871 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.4525 (3.2071) acc1: 33.6000 (36.6968) acc5: 58.4000 (61.2645) time: 1.1364 data: 1.1172 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.3710 (3.2426) acc1: 36.8000 (36.3707) acc5: 58.4000 (60.8390) time: 1.0408 data: 1.0216 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2432 (3.2648) acc1: 36.8000 (35.9200) acc5: 60.0000 (60.6400) time: 0.9112 data: 0.8911 max mem: 2905
Test: Total time: 0:00:58 (1.1690 s / it)
* Acc@1 36.038 Acc@5 60.556 loss 3.247
Accuracy of the model on the 50000 test images: 36.0%
Max accuracy: 50.54%
Epoch: [114] [ 0/625] eta: 3:33:10 lr: 0.002987 min_lr: 0.002987 loss: 3.3175 (3.3175) class_acc: 0.4336 (0.4336) weight_decay: 0.0500 (0.0500) time: 20.4642 data: 16.8792 max mem: 2905
Epoch: [114] [200/625] eta: 0:14:40 lr: 0.002981 min_lr: 0.002981 loss: 3.1523 (3.1309) class_acc: 0.5039 (0.4982) weight_decay: 0.0500 (0.0500) grad_norm: 2.1341 (2.2794) time: 1.8696 data: 0.0009 max mem: 2905
Epoch: [114] [400/625] eta: 0:07:30 lr: 0.002975 min_lr: 0.002975 loss: 3.1316 (3.1342) class_acc: 0.4961 (0.4968) weight_decay: 0.0500 (0.0500) grad_norm: 2.1540 (2.3170) time: 1.9479 data: 0.0317 max mem: 2905
Epoch: [114] [600/625] eta: 0:00:49 lr: 0.002968 min_lr: 0.002968 loss: 3.1213 (3.1373) class_acc: 0.5000 (0.4960) weight_decay: 0.0500 (0.0500) grad_norm: 2.1035 (2.4307) time: 1.9822 data: 0.0335 max mem: 2905
Epoch: [114] [624/625] eta: 0:00:01 lr: 0.002968 min_lr: 0.002968 loss: 3.1687 (3.1388) class_acc: 0.4766 (0.4956) weight_decay: 0.0500 (0.0500) grad_norm: 2.3021 (2.4420) time: 0.8579 data: 0.0066 max mem: 2905
Epoch: [114] Total time: 0:20:08 (1.9328 s / it)
Averaged stats: lr: 0.002968 min_lr: 0.002968 loss: 3.1687 (3.1394) class_acc: 0.4766 (0.4951) weight_decay: 0.0500 (0.0500) grad_norm: 2.3021 (2.4420)
Test: [ 0/50] eta: 0:09:14 loss: 3.5593 (3.5593) acc1: 36.8000 (36.8000) acc5: 54.4000 (54.4000) time: 11.0843 data: 11.0517 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 3.1926 (3.2685) acc1: 38.4000 (37.3818) acc5: 60.0000 (58.9818) time: 2.0692 data: 2.0476 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.3341 (3.4525) acc1: 29.6000 (33.1810) acc5: 57.6000 (57.3714) time: 1.1825 data: 1.1630 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.6556 (3.4761) acc1: 29.6000 (33.2129) acc5: 56.8000 (57.1097) time: 1.0764 data: 1.0585 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.5992 (3.5599) acc1: 32.0000 (32.6049) acc5: 56.8000 (56.0000) time: 0.6618 data: 0.6445 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.7008 (3.5789) acc1: 31.2000 (32.2400) acc5: 52.8000 (55.8240) time: 0.6254 data: 0.6076 max mem: 2905
Test: Total time: 0:00:48 (0.9657 s / it)
* Acc@1 32.702 Acc@5 56.232 loss 3.560
Accuracy of the model on the 50000 test images: 32.7%
Max accuracy: 50.54%
Epoch: [115] [ 0/625] eta: 3:36:38 lr: 0.002968 min_lr: 0.002968 loss: 3.4248 (3.4248) class_acc: 0.4531 (0.4531) weight_decay: 0.0500 (0.0500) time: 20.7979 data: 19.1501 max mem: 2905
Epoch: [115] [200/625] eta: 0:13:51 lr: 0.002961 min_lr: 0.002961 loss: 3.0897 (3.1254) class_acc: 0.5000 (0.4982) weight_decay: 0.0500 (0.0500) grad_norm: 1.6014 (2.2201) time: 1.9206 data: 0.2626 max mem: 2905
Epoch: [115] [400/625] eta: 0:07:14 lr: 0.002955 min_lr: 0.002955 loss: 3.1286 (3.1279) class_acc: 0.5000 (0.4982) weight_decay: 0.0500 (0.0500) grad_norm: 2.0474 (2.3980) time: 1.9727 data: 0.0938 max mem: 2905
Epoch: [115] [600/625] eta: 0:00:48 lr: 0.002949 min_lr: 0.002949 loss: 3.1684 (3.1346) class_acc: 0.4844 (0.4961) weight_decay: 0.0500 (0.0500) grad_norm: 2.0476 (2.5024) time: 1.9562 data: 0.0008 max mem: 2905
Epoch: [115] [624/625] eta: 0:00:01 lr: 0.002948 min_lr: 0.002948 loss: 3.1492 (3.1352) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) grad_norm: 2.0476 (2.4942) time: 0.7735 data: 0.0015 max mem: 2905
Epoch: [115] Total time: 0:19:50 (1.9051 s / it)
Averaged stats: lr: 0.002948 min_lr: 0.002948 loss: 3.1492 (3.1370) class_acc: 0.4961 (0.4958) weight_decay: 0.0500 (0.0500) grad_norm: 2.0476 (2.4942)
Test: [ 0/50] eta: 0:09:49 loss: 3.2632 (3.2632) acc1: 36.0000 (36.0000) acc5: 62.4000 (62.4000) time: 11.7911 data: 11.7616 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.8094 (2.8797) acc1: 42.4000 (41.7455) acc5: 64.8000 (65.8909) time: 1.9933 data: 1.9718 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.8652 (2.9311) acc1: 42.4000 (41.5619) acc5: 64.8000 (65.2571) time: 1.0551 data: 1.0359 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.8842 (2.9233) acc1: 42.4000 (42.0129) acc5: 66.4000 (65.1097) time: 1.0874 data: 1.0695 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8003 (2.9050) acc1: 42.4000 (42.1073) acc5: 67.2000 (65.7366) time: 0.9582 data: 0.9403 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8003 (2.9042) acc1: 42.4000 (42.4640) acc5: 67.2000 (65.7760) time: 0.8729 data: 0.8522 max mem: 2905
Test: Total time: 0:00:53 (1.0693 s / it)
* Acc@1 42.494 Acc@5 66.976 loss 2.840
Accuracy of the model on the 50000 test images: 42.5%
Max accuracy: 50.54%
Epoch: [116] [ 0/625] eta: 3:38:25 lr: 0.002948 min_lr: 0.002948 loss: 3.1878 (3.1878) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 20.9682 data: 19.3178 max mem: 2905
Epoch: [116] [200/625] eta: 0:14:33 lr: 0.002942 min_lr: 0.002942 loss: 3.1119 (3.1229) class_acc: 0.5000 (0.4999) weight_decay: 0.0500 (0.0500) grad_norm: 1.8911 (2.1056) time: 1.9544 data: 0.0007 max mem: 2905
Epoch: [116] [400/625] eta: 0:07:24 lr: 0.002935 min_lr: 0.002935 loss: 3.0774 (3.1305) class_acc: 0.4922 (0.4985) weight_decay: 0.0500 (0.0500) grad_norm: 1.8437 (2.2882) time: 1.9125 data: 0.3040 max mem: 2905
Epoch: [116] [600/625] eta: 0:00:49 lr: 0.002929 min_lr: 0.002929 loss: 3.1168 (3.1329) class_acc: 0.4961 (0.4968) weight_decay: 0.0500 (0.0500) grad_norm: 2.2304 (2.3359) time: 1.8592 data: 0.0549 max mem: 2905
Epoch: [116] [624/625] eta: 0:00:01 lr: 0.002928 min_lr: 0.002928 loss: 3.1823 (3.1342) class_acc: 0.4844 (0.4964) weight_decay: 0.0500 (0.0500) grad_norm: 2.3173 (2.3465) time: 0.8612 data: 0.0239 max mem: 2905
Epoch: [116] Total time: 0:19:59 (1.9199 s / it)
Averaged stats: lr: 0.002928 min_lr: 0.002928 loss: 3.1823 (3.1324) class_acc: 0.4844 (0.4966) weight_decay: 0.0500 (0.0500) grad_norm: 2.3173 (2.3465)
Test: [ 0/50] eta: 0:08:39 loss: 4.1406 (4.1406) acc1: 28.8000 (28.8000) acc5: 44.0000 (44.0000) time: 10.3957 data: 10.3660 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 3.5837 (3.5091) acc1: 34.4000 (33.0909) acc5: 57.6000 (56.0000) time: 1.8812 data: 1.8603 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 3.8511 (3.7375) acc1: 28.0000 (29.4857) acc5: 52.0000 (52.8762) time: 1.2523 data: 1.2323 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.8628 (3.7050) acc1: 28.0000 (29.5484) acc5: 52.8000 (53.4194) time: 1.1517 data: 1.1317 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.8783 (3.7928) acc1: 27.2000 (28.7805) acc5: 51.2000 (52.4878) time: 0.6668 data: 0.6456 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.9732 (3.8051) acc1: 25.6000 (28.6560) acc5: 51.2000 (52.4640) time: 0.4727 data: 0.4524 max mem: 2905
Test: Total time: 0:00:52 (1.0405 s / it)
* Acc@1 29.068 Acc@5 52.220 loss 3.771
Accuracy of the model on the 50000 test images: 29.1%
Max accuracy: 50.54%
Epoch: [117] [ 0/625] eta: 3:28:43 lr: 0.002928 min_lr: 0.002928 loss: 3.0983 (3.0983) class_acc: 0.5234 (0.5234) weight_decay: 0.0500 (0.0500) time: 20.0370 data: 19.3412 max mem: 2905
Epoch: [117] [200/625] eta: 0:14:07 lr: 0.002922 min_lr: 0.002922 loss: 3.1313 (3.1128) class_acc: 0.5000 (0.5043) weight_decay: 0.0500 (0.0500) grad_norm: 1.9268 (2.4443) time: 1.9684 data: 0.0007 max mem: 2905
Epoch: [117] [400/625] eta: 0:07:10 lr: 0.002915 min_lr: 0.002915 loss: 3.1125 (3.1223) class_acc: 0.4922 (0.4998) weight_decay: 0.0500 (0.0500) grad_norm: 1.9851 (2.4194) time: 1.9290 data: 0.0007 max mem: 2905
Epoch: [117] [600/625] eta: 0:00:47 lr: 0.002909 min_lr: 0.002909 loss: 3.2236 (3.1351) class_acc: 0.4844 (0.4972) weight_decay: 0.0500 (0.0500) grad_norm: 1.7876 (2.4818) time: 1.8074 data: 0.0006 max mem: 2905
Epoch: [117] [624/625] eta: 0:00:01 lr: 0.002908 min_lr: 0.002908 loss: 3.1674 (3.1353) class_acc: 0.4922 (0.4971) weight_decay: 0.0500 (0.0500) grad_norm: 2.3374 (2.4869) time: 0.9966 data: 0.0020 max mem: 2905
Epoch: [117] Total time: 0:19:42 (1.8925 s / it)
Averaged stats: lr: 0.002908 min_lr: 0.002908 loss: 3.1674 (3.1307) class_acc: 0.4922 (0.4975) weight_decay: 0.0500 (0.0500) grad_norm: 2.3374 (2.4869)
Test: [ 0/50] eta: 0:11:42 loss: 3.9953 (3.9953) acc1: 21.6000 (21.6000) acc5: 48.0000 (48.0000) time: 14.0416 data: 14.0124 max mem: 2905
Test: [10/50] eta: 0:01:39 loss: 4.0153 (4.0923) acc1: 25.6000 (25.7455) acc5: 47.2000 (47.2727) time: 2.4778 data: 2.4549 max mem: 2905
Test: [20/50] eta: 0:00:57 loss: 4.1811 (4.1390) acc1: 25.6000 (25.4857) acc5: 47.2000 (46.8952) time: 1.3048 data: 1.2837 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 4.1423 (4.0859) acc1: 25.6000 (26.1677) acc5: 48.0000 (48.2065) time: 1.1654 data: 1.1449 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 4.1545 (4.1688) acc1: 25.6000 (25.4634) acc5: 44.8000 (47.2585) time: 0.8025 data: 0.7828 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 4.1061 (4.1333) acc1: 25.6000 (25.8240) acc5: 46.4000 (47.6160) time: 0.7169 data: 0.6953 max mem: 2905
Test: Total time: 0:00:56 (1.1385 s / it)
* Acc@1 26.426 Acc@5 47.990 loss 4.078
Accuracy of the model on the 50000 test images: 26.4%
Max accuracy: 50.54%
Epoch: [118] [ 0/625] eta: 3:23:37 lr: 0.002908 min_lr: 0.002908 loss: 3.0896 (3.0896) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 19.5480 data: 19.3674 max mem: 2905
Epoch: [118] [200/625] eta: 0:14:21 lr: 0.002902 min_lr: 0.002902 loss: 3.1041 (3.1083) class_acc: 0.5117 (0.5010) weight_decay: 0.0500 (0.0500) grad_norm: 1.8109 (2.4007) time: 1.9184 data: 0.0011 max mem: 2905
Epoch: [118] [400/625] eta: 0:07:22 lr: 0.002895 min_lr: 0.002895 loss: 3.0830 (3.1199) class_acc: 0.5078 (0.4984) weight_decay: 0.0500 (0.0500) grad_norm: 1.9833 (2.3379) time: 1.7695 data: 0.0008 max mem: 2905
Epoch: [118] [600/625] eta: 0:00:49 lr: 0.002889 min_lr: 0.002889 loss: 3.1148 (3.1254) class_acc: 0.5039 (0.4970) weight_decay: 0.0500 (0.0500) grad_norm: 1.8102 (2.2762) time: 1.8758 data: 0.0676 max mem: 2905
Epoch: [118] [624/625] eta: 0:00:01 lr: 0.002888 min_lr: 0.002888 loss: 3.0968 (3.1248) class_acc: 0.4883 (0.4971) weight_decay: 0.0500 (0.0500) grad_norm: 1.8244 (2.3112) time: 0.6965 data: 0.0173 max mem: 2905
Epoch: [118] Total time: 0:20:04 (1.9268 s / it)
Averaged stats: lr: 0.002888 min_lr: 0.002888 loss: 3.0968 (3.1292) class_acc: 0.4883 (0.4974) weight_decay: 0.0500 (0.0500) grad_norm: 1.8244 (2.3112)
Test: [ 0/50] eta: 0:10:16 loss: 3.0749 (3.0749) acc1: 38.4000 (38.4000) acc5: 63.2000 (63.2000) time: 12.3364 data: 12.3054 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 3.0426 (3.0186) acc1: 38.4000 (39.1273) acc5: 63.2000 (63.2727) time: 2.0005 data: 1.9818 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 3.1473 (3.2063) acc1: 35.2000 (35.8857) acc5: 60.8000 (60.9143) time: 0.9837 data: 0.9651 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.3094 (3.2278) acc1: 33.6000 (35.7161) acc5: 59.2000 (61.1613) time: 0.9439 data: 0.9246 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.2905 (3.2561) acc1: 33.6000 (35.5122) acc5: 59.2000 (60.6244) time: 0.7544 data: 0.7334 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3631 (3.2862) acc1: 33.6000 (35.0720) acc5: 59.2000 (60.4960) time: 0.7094 data: 0.6879 max mem: 2905
Test: Total time: 0:00:48 (0.9624 s / it)
* Acc@1 36.162 Acc@5 61.122 loss 3.242
Accuracy of the model on the 50000 test images: 36.2%
Max accuracy: 50.54%
Epoch: [119] [ 0/625] eta: 3:32:10 lr: 0.002888 min_lr: 0.002888 loss: 3.1628 (3.1628) class_acc: 0.4883 (0.4883) weight_decay: 0.0500 (0.0500) time: 20.3688 data: 18.4151 max mem: 2905
Epoch: [119] [200/625] eta: 0:14:32 lr: 0.002882 min_lr: 0.002882 loss: 3.1986 (3.1052) class_acc: 0.4922 (0.5031) weight_decay: 0.0500 (0.0500) grad_norm: 1.9173 (2.2286) time: 2.1279 data: 1.5632 max mem: 2905
Epoch: [119] [400/625] eta: 0:07:27 lr: 0.002875 min_lr: 0.002875 loss: 3.1299 (3.1126) class_acc: 0.4883 (0.5006) weight_decay: 0.0500 (0.0500) grad_norm: 2.1469 (2.4205) time: 1.8948 data: 1.7302 max mem: 2905
Epoch: [119] [600/625] eta: 0:00:49 lr: 0.002869 min_lr: 0.002869 loss: 3.1275 (3.1199) class_acc: 0.4844 (0.4984) weight_decay: 0.0500 (0.0500) grad_norm: 2.5512 (inf) time: 1.8978 data: 1.7007 max mem: 2905
Epoch: [119] [624/625] eta: 0:00:01 lr: 0.002868 min_lr: 0.002868 loss: 3.1683 (3.1216) class_acc: 0.4922 (0.4982) weight_decay: 0.0500 (0.0500) grad_norm: 2.1216 (inf) time: 0.7417 data: 0.5816 max mem: 2905
Epoch: [119] Total time: 0:20:02 (1.9246 s / it)
Averaged stats: lr: 0.002868 min_lr: 0.002868 loss: 3.1683 (3.1224) class_acc: 0.4922 (0.4993) weight_decay: 0.0500 (0.0500) grad_norm: 2.1216 (inf)
Test: [ 0/50] eta: 0:09:59 loss: 4.2683 (4.2683) acc1: 24.0000 (24.0000) acc5: 44.0000 (44.0000) time: 11.9993 data: 11.9667 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 3.5573 (3.6179) acc1: 32.0000 (32.1455) acc5: 55.2000 (54.9818) time: 2.1551 data: 2.1356 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 3.8268 (3.7968) acc1: 28.8000 (28.4952) acc5: 52.8000 (52.8381) time: 1.1803 data: 1.1608 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.8980 (3.8041) acc1: 27.2000 (29.0839) acc5: 52.0000 (52.9290) time: 0.9089 data: 0.8884 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.7821 (3.7762) acc1: 29.6000 (29.1512) acc5: 52.0000 (53.2098) time: 0.4686 data: 0.4494 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.6374 (3.7386) acc1: 29.6000 (29.6000) acc5: 54.4000 (53.6000) time: 0.4130 data: 0.3949 max mem: 2905
Test: Total time: 0:00:46 (0.9211 s / it)
* Acc@1 30.378 Acc@5 54.038 loss 3.683
Accuracy of the model on the 50000 test images: 30.4%
Max accuracy: 50.54%
Epoch: [120] [ 0/625] eta: 3:22:12 lr: 0.002868 min_lr: 0.002868 loss: 3.2221 (3.2221) class_acc: 0.4883 (0.4883) weight_decay: 0.0500 (0.0500) time: 19.4124 data: 18.0886 max mem: 2905
Epoch: [120] [200/625] eta: 0:13:58 lr: 0.002862 min_lr: 0.002862 loss: 3.1484 (3.1049) class_acc: 0.4961 (0.5015) weight_decay: 0.0500 (0.0500) grad_norm: 2.2431 (2.4987) time: 1.8141 data: 0.0133 max mem: 2905
Epoch: [120] [400/625] eta: 0:07:12 lr: 0.002855 min_lr: 0.002855 loss: 3.1716 (3.1122) class_acc: 0.4883 (0.5001) weight_decay: 0.0500 (0.0500) grad_norm: 1.4651 (2.4838) time: 1.9277 data: 0.0008 max mem: 2905
Epoch: [120] [600/625] eta: 0:00:47 lr: 0.002849 min_lr: 0.002849 loss: 3.1559 (3.1190) class_acc: 0.4766 (0.4991) weight_decay: 0.0500 (0.0500) grad_norm: 1.9856 (2.4202) time: 1.9562 data: 0.0239 max mem: 2905
Epoch: [120] [624/625] eta: 0:00:01 lr: 0.002848 min_lr: 0.002848 loss: 3.1083 (3.1197) class_acc: 0.5039 (0.4989) weight_decay: 0.0500 (0.0500) grad_norm: 2.3195 (2.4117) time: 0.7088 data: 0.0017 max mem: 2905
Epoch: [120] Total time: 0:19:28 (1.8694 s / it)
Averaged stats: lr: 0.002848 min_lr: 0.002848 loss: 3.1083 (3.1193) class_acc: 0.5039 (0.4995) weight_decay: 0.0500 (0.0500) grad_norm: 2.3195 (2.4117)
Test: [ 0/50] eta: 0:10:14 loss: 3.2160 (3.2160) acc1: 35.2000 (35.2000) acc5: 58.4000 (58.4000) time: 12.2955 data: 12.2578 max mem: 2905
Test: [10/50] eta: 0:01:16 loss: 3.2160 (3.2369) acc1: 36.8000 (36.5091) acc5: 59.2000 (59.9273) time: 1.9078 data: 1.8871 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 3.4970 (3.4354) acc1: 33.6000 (33.7143) acc5: 57.6000 (57.3714) time: 0.8834 data: 0.8641 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.5129 (3.4275) acc1: 32.8000 (34.2710) acc5: 52.8000 (57.6000) time: 0.9770 data: 0.9563 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.4612 (3.4435) acc1: 34.4000 (34.4390) acc5: 58.4000 (57.8146) time: 0.9414 data: 0.9208 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.4612 (3.4346) acc1: 34.4000 (34.2400) acc5: 58.4000 (58.0800) time: 0.6356 data: 0.6153 max mem: 2905
Test: Total time: 0:00:53 (1.0685 s / it)
* Acc@1 35.024 Acc@5 58.718 loss 3.397
Accuracy of the model on the 50000 test images: 35.0%
Max accuracy: 50.54%
Epoch: [121] [ 0/625] eta: 3:46:09 lr: 0.002848 min_lr: 0.002848 loss: 3.1343 (3.1343) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 21.7105 data: 21.5833 max mem: 2905
Epoch: [121] [200/625] eta: 0:13:43 lr: 0.002841 min_lr: 0.002841 loss: 3.1020 (3.0979) class_acc: 0.5000 (0.5024) weight_decay: 0.0500 (0.0500) grad_norm: 2.0525 (2.4709) time: 1.8017 data: 0.0150 max mem: 2905
Epoch: [121] [400/625] eta: 0:07:16 lr: 0.002835 min_lr: 0.002835 loss: 3.0542 (3.1099) class_acc: 0.5156 (0.5016) weight_decay: 0.0500 (0.0500) grad_norm: 2.0552 (2.5925) time: 1.9216 data: 0.0010 max mem: 2905
Epoch: [121] [600/625] eta: 0:00:48 lr: 0.002828 min_lr: 0.002828 loss: 3.1089 (3.1144) class_acc: 0.5000 (0.5011) weight_decay: 0.0500 (0.0500) grad_norm: 2.2338 (2.4597) time: 2.0491 data: 0.0202 max mem: 2905
Epoch: [121] [624/625] eta: 0:00:01 lr: 0.002827 min_lr: 0.002827 loss: 3.0468 (3.1138) class_acc: 0.5078 (0.5012) weight_decay: 0.0500 (0.0500) grad_norm: 1.9030 (2.4433) time: 0.5510 data: 0.0064 max mem: 2905
Epoch: [121] Total time: 0:20:02 (1.9241 s / it)
Averaged stats: lr: 0.002827 min_lr: 0.002827 loss: 3.0468 (3.1194) class_acc: 0.5078 (0.4998) weight_decay: 0.0500 (0.0500) grad_norm: 1.9030 (2.4433)
Test: [ 0/50] eta: 0:11:05 loss: 2.9186 (2.9186) acc1: 40.0000 (40.0000) acc5: 67.2000 (67.2000) time: 13.3024 data: 13.2690 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.9789 (3.1045) acc1: 40.0000 (38.9818) acc5: 62.4000 (61.6727) time: 2.1324 data: 2.1116 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 3.1822 (3.2055) acc1: 36.8000 (36.8000) acc5: 60.8000 (60.6476) time: 0.8952 data: 0.8753 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 3.2810 (3.2318) acc1: 36.0000 (37.0065) acc5: 58.4000 (60.1032) time: 0.8147 data: 0.7945 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.2810 (3.2511) acc1: 33.6000 (36.4878) acc5: 58.4000 (60.3707) time: 0.9387 data: 0.9179 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1007 (3.2328) acc1: 38.4000 (36.5600) acc5: 62.4000 (60.6240) time: 0.7069 data: 0.6854 max mem: 2905
Test: Total time: 0:00:54 (1.0818 s / it)
* Acc@1 36.668 Acc@5 61.252 loss 3.199
Accuracy of the model on the 50000 test images: 36.7%
Max accuracy: 50.54%
Epoch: [122] [ 0/625] eta: 3:49:48 lr: 0.002827 min_lr: 0.002827 loss: 3.1040 (3.1040) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 22.0619 data: 19.6065 max mem: 2905
Epoch: [122] [200/625] eta: 0:14:54 lr: 0.002821 min_lr: 0.002821 loss: 3.1230 (3.1029) class_acc: 0.5000 (0.5030) weight_decay: 0.0500 (0.0500) grad_norm: 3.2911 (2.3558) time: 2.0302 data: 0.0065 max mem: 2905
Epoch: [122] [400/625] eta: 0:07:46 lr: 0.002814 min_lr: 0.002814 loss: 3.1109 (3.1038) class_acc: 0.4961 (0.5038) weight_decay: 0.0500 (0.0500) grad_norm: 1.9128 (2.4172) time: 2.0333 data: 0.0168 max mem: 2905
Epoch: [122] [600/625] eta: 0:00:51 lr: 0.002808 min_lr: 0.002808 loss: 3.1263 (3.1124) class_acc: 0.5039 (0.5025) weight_decay: 0.0500 (0.0500) grad_norm: 2.1427 (2.4045) time: 2.0243 data: 0.0320 max mem: 2905
Epoch: [122] [624/625] eta: 0:00:02 lr: 0.002807 min_lr: 0.002807 loss: 3.1364 (3.1136) class_acc: 0.4961 (0.5020) weight_decay: 0.0500 (0.0500) grad_norm: 1.8771 (2.4042) time: 0.8710 data: 0.0129 max mem: 2905
Epoch: [122] Total time: 0:20:56 (2.0110 s / it)
Averaged stats: lr: 0.002807 min_lr: 0.002807 loss: 3.1364 (3.1195) class_acc: 0.4961 (0.4998) weight_decay: 0.0500 (0.0500) grad_norm: 1.8771 (2.4042)
Test: [ 0/50] eta: 0:10:47 loss: 2.4031 (2.4031) acc1: 55.2000 (55.2000) acc5: 75.2000 (75.2000) time: 12.9530 data: 12.9225 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.5733 (2.6236) acc1: 45.6000 (45.3091) acc5: 69.6000 (70.9818) time: 2.2582 data: 2.2391 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.7623 (2.7106) acc1: 42.4000 (43.6952) acc5: 69.6000 (70.0190) time: 1.2297 data: 1.2101 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.7293 (2.6982) acc1: 42.4000 (44.0258) acc5: 70.4000 (69.9097) time: 1.1424 data: 1.1218 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7293 (2.7471) acc1: 43.2000 (43.6683) acc5: 68.8000 (69.3659) time: 0.6703 data: 0.6496 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7873 (2.7637) acc1: 43.2000 (43.5680) acc5: 67.2000 (68.9920) time: 0.6040 data: 0.5840 max mem: 2905
Test: Total time: 0:00:51 (1.0319 s / it)
* Acc@1 43.874 Acc@5 69.140 loss 2.704
Accuracy of the model on the 50000 test images: 43.9%
Max accuracy: 50.54%
Epoch: [123] [ 0/625] eta: 3:45:48 lr: 0.002807 min_lr: 0.002807 loss: 3.0981 (3.0981) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 21.6769 data: 19.5205 max mem: 2905
Epoch: [123] [200/625] eta: 0:13:57 lr: 0.002800 min_lr: 0.002800 loss: 3.1774 (3.0866) class_acc: 0.4922 (0.5060) weight_decay: 0.0500 (0.0500) grad_norm: 2.2369 (2.5515) time: 1.7764 data: 0.9033 max mem: 2905
Epoch: [123] [400/625] eta: 0:07:23 lr: 0.002794 min_lr: 0.002794 loss: 3.1246 (3.0986) class_acc: 0.4883 (0.5041) weight_decay: 0.0500 (0.0500) grad_norm: 1.8960 (2.6429) time: 1.9137 data: 0.0069 max mem: 2905
Epoch: [123] [600/625] eta: 0:00:49 lr: 0.002787 min_lr: 0.002787 loss: 3.0566 (3.1097) class_acc: 0.5117 (0.5018) weight_decay: 0.0500 (0.0500) grad_norm: 2.4988 (2.6974) time: 2.1439 data: 0.2413 max mem: 2905
Epoch: [123] [624/625] eta: 0:00:01 lr: 0.002786 min_lr: 0.002786 loss: 3.1921 (3.1112) class_acc: 0.4922 (0.5012) weight_decay: 0.0500 (0.0500) grad_norm: 2.2753 (2.6823) time: 0.3701 data: 0.0022 max mem: 2905
Epoch: [123] Total time: 0:20:23 (1.9578 s / it)
Averaged stats: lr: 0.002786 min_lr: 0.002786 loss: 3.1921 (3.1139) class_acc: 0.4922 (0.5014) weight_decay: 0.0500 (0.0500) grad_norm: 2.2753 (2.6823)
Test: [ 0/50] eta: 0:11:26 loss: 3.3095 (3.3095) acc1: 34.4000 (34.4000) acc5: 60.8000 (60.8000) time: 13.7278 data: 13.6841 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 3.2626 (3.3004) acc1: 36.0000 (34.9091) acc5: 57.6000 (59.4909) time: 2.2241 data: 2.2037 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 3.2967 (3.3582) acc1: 33.6000 (34.2095) acc5: 56.8000 (58.7048) time: 1.1608 data: 1.1428 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.3093 (3.3349) acc1: 34.4000 (34.7355) acc5: 56.8000 (59.2258) time: 1.2176 data: 1.1994 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.2431 (3.3348) acc1: 36.8000 (34.8683) acc5: 58.4000 (59.1220) time: 0.9768 data: 0.9574 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2611 (3.3471) acc1: 32.8000 (34.7840) acc5: 59.2000 (59.1360) time: 0.9445 data: 0.9251 max mem: 2905
Test: Total time: 0:00:56 (1.1373 s / it)
* Acc@1 34.816 Acc@5 59.766 loss 3.311
Accuracy of the model on the 50000 test images: 34.8%
Max accuracy: 50.54%
Epoch: [124] [ 0/625] eta: 4:07:18 lr: 0.002786 min_lr: 0.002786 loss: 2.8637 (2.8637) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 23.7417 data: 18.2486 max mem: 2905
Epoch: [124] [200/625] eta: 0:14:14 lr: 0.002780 min_lr: 0.002780 loss: 3.1413 (3.0922) class_acc: 0.4883 (0.5041) weight_decay: 0.0500 (0.0500) grad_norm: 2.1518 (2.3834) time: 1.8396 data: 0.0544 max mem: 2905
Epoch: [124] [400/625] eta: 0:07:29 lr: 0.002773 min_lr: 0.002773 loss: 3.1112 (3.1076) class_acc: 0.5039 (0.5026) weight_decay: 0.0500 (0.0500) grad_norm: 2.4841 (2.4374) time: 1.8704 data: 0.3943 max mem: 2905
Epoch: [124] [600/625] eta: 0:00:50 lr: 0.002766 min_lr: 0.002766 loss: 3.0802 (3.1127) class_acc: 0.4961 (0.5011) weight_decay: 0.0500 (0.0500) grad_norm: 2.0208 (2.3702) time: 2.0075 data: 0.0648 max mem: 2905
Epoch: [124] [624/625] eta: 0:00:01 lr: 0.002766 min_lr: 0.002766 loss: 3.1260 (3.1140) class_acc: 0.4961 (0.5007) weight_decay: 0.0500 (0.0500) grad_norm: 2.0441 (2.3610) time: 0.7640 data: 0.0173 max mem: 2905
Epoch: [124] Total time: 0:20:29 (1.9671 s / it)
Averaged stats: lr: 0.002766 min_lr: 0.002766 loss: 3.1260 (3.1093) class_acc: 0.4961 (0.5019) weight_decay: 0.0500 (0.0500) grad_norm: 2.0441 (2.3610)
Test: [ 0/50] eta: 0:10:06 loss: 3.5939 (3.5939) acc1: 32.8000 (32.8000) acc5: 52.8000 (52.8000) time: 12.1348 data: 12.1105 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 3.0858 (3.1422) acc1: 37.6000 (38.1818) acc5: 64.0000 (62.8364) time: 1.9732 data: 1.9545 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.2235 (3.3061) acc1: 34.4000 (35.1619) acc5: 57.6000 (59.4286) time: 1.0053 data: 0.9864 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.4119 (3.3379) acc1: 30.4000 (34.7355) acc5: 56.8000 (59.2516) time: 0.9775 data: 0.9579 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.3754 (3.3646) acc1: 32.8000 (34.1463) acc5: 58.4000 (58.9854) time: 0.6632 data: 0.6438 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3754 (3.3862) acc1: 32.8000 (34.0640) acc5: 57.6000 (58.6080) time: 0.5680 data: 0.5491 max mem: 2905
Test: Total time: 0:00:47 (0.9570 s / it)
* Acc@1 34.428 Acc@5 58.544 loss 3.367
Accuracy of the model on the 50000 test images: 34.4%
Max accuracy: 50.54%
Epoch: [125] [ 0/625] eta: 3:23:15 lr: 0.002766 min_lr: 0.002766 loss: 3.1610 (3.1610) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 19.5122 data: 16.7370 max mem: 2905
Epoch: [125] [200/625] eta: 0:13:57 lr: 0.002759 min_lr: 0.002759 loss: 3.0773 (3.0800) class_acc: 0.5000 (0.5088) weight_decay: 0.0500 (0.0500) grad_norm: 1.5427 (2.4667) time: 1.7531 data: 0.0006 max mem: 2905
Epoch: [125] [400/625] eta: 0:07:20 lr: 0.002752 min_lr: 0.002752 loss: 3.1404 (3.0988) class_acc: 0.4922 (0.5046) weight_decay: 0.0500 (0.0500) grad_norm: 2.1039 (2.5865) time: 2.0188 data: 0.0008 max mem: 2905
Epoch: [125] [600/625] eta: 0:00:49 lr: 0.002746 min_lr: 0.002746 loss: 3.1789 (3.1080) class_acc: 0.4922 (0.5020) weight_decay: 0.0500 (0.0500) grad_norm: 2.3566 (2.5312) time: 2.1738 data: 0.0006 max mem: 2905
Epoch: [125] [624/625] eta: 0:00:01 lr: 0.002745 min_lr: 0.002745 loss: 3.0558 (3.1074) class_acc: 0.5039 (0.5021) weight_decay: 0.0500 (0.0500) grad_norm: 2.6566 (2.5402) time: 0.6747 data: 0.0013 max mem: 2905
Epoch: [125] Total time: 0:20:05 (1.9282 s / it)
Averaged stats: lr: 0.002745 min_lr: 0.002745 loss: 3.0558 (3.1072) class_acc: 0.5039 (0.5025) weight_decay: 0.0500 (0.0500) grad_norm: 2.6566 (2.5402)
Test: [ 0/50] eta: 0:09:55 loss: 2.9795 (2.9795) acc1: 38.4000 (38.4000) acc5: 64.0000 (64.0000) time: 11.9024 data: 11.8643 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.8623 (2.8837) acc1: 40.0000 (41.0182) acc5: 64.8000 (65.1636) time: 2.1405 data: 2.1095 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.9883 (3.1113) acc1: 39.2000 (37.8286) acc5: 62.4000 (62.2476) time: 1.2308 data: 1.2063 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.2644 (3.1340) acc1: 36.8000 (37.4968) acc5: 60.8000 (62.0645) time: 1.2606 data: 1.2421 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.2380 (3.1791) acc1: 36.0000 (37.2293) acc5: 60.0000 (61.6390) time: 0.8953 data: 0.8763 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2380 (3.1937) acc1: 36.0000 (37.1840) acc5: 60.0000 (61.4400) time: 0.7498 data: 0.7303 max mem: 2905
Test: Total time: 0:00:56 (1.1222 s / it)
* Acc@1 37.044 Acc@5 62.218 loss 3.153
Accuracy of the model on the 50000 test images: 37.0%
Max accuracy: 50.54%
Epoch: [126] [ 0/625] eta: 3:38:53 lr: 0.002745 min_lr: 0.002745 loss: 3.1257 (3.1257) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 21.0129 data: 17.3480 max mem: 2905
Epoch: [126] [200/625] eta: 0:14:48 lr: 0.002738 min_lr: 0.002738 loss: 3.0536 (3.0802) class_acc: 0.5039 (0.5091) weight_decay: 0.0500 (0.0500) grad_norm: 3.0856 (2.4246) time: 1.9752 data: 0.0040 max mem: 2905
Epoch: [126] [400/625] eta: 0:07:39 lr: 0.002732 min_lr: 0.002732 loss: 3.0944 (3.0932) class_acc: 0.5078 (0.5060) weight_decay: 0.0500 (0.0500) grad_norm: 1.6031 (inf) time: 1.9729 data: 0.0014 max mem: 2905
Epoch: [126] [600/625] eta: 0:00:50 lr: 0.002725 min_lr: 0.002725 loss: 3.1125 (3.1016) class_acc: 0.4961 (0.5040) weight_decay: 0.0500 (0.0500) grad_norm: 2.7345 (inf) time: 1.9805 data: 0.0009 max mem: 2905
Epoch: [126] [624/625] eta: 0:00:01 lr: 0.002724 min_lr: 0.002724 loss: 3.1249 (3.1032) class_acc: 0.4961 (0.5036) weight_decay: 0.0500 (0.0500) grad_norm: 2.0445 (inf) time: 0.8589 data: 0.0016 max mem: 2905
Epoch: [126] Total time: 0:20:40 (1.9842 s / it)
Averaged stats: lr: 0.002724 min_lr: 0.002724 loss: 3.1249 (3.1052) class_acc: 0.4961 (0.5032) weight_decay: 0.0500 (0.0500) grad_norm: 2.0445 (inf)
Test: [ 0/50] eta: 0:10:02 loss: 3.7799 (3.7799) acc1: 27.2000 (27.2000) acc5: 55.2000 (55.2000) time: 12.0437 data: 12.0108 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 3.3612 (3.2897) acc1: 37.6000 (37.3091) acc5: 58.4000 (59.5636) time: 1.9675 data: 1.9478 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 3.3612 (3.4181) acc1: 35.2000 (35.0476) acc5: 57.6000 (57.5238) time: 1.0001 data: 0.9815 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.4702 (3.4304) acc1: 33.6000 (34.7355) acc5: 56.0000 (57.4710) time: 0.9542 data: 0.9352 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.4367 (3.4353) acc1: 32.8000 (34.4976) acc5: 56.0000 (57.4244) time: 0.6697 data: 0.6514 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.4229 (3.4391) acc1: 32.0000 (34.5120) acc5: 56.8000 (57.4080) time: 0.6718 data: 0.6541 max mem: 2905
Test: Total time: 0:00:47 (0.9592 s / it)
* Acc@1 35.066 Acc@5 58.574 loss 3.400
Accuracy of the model on the 50000 test images: 35.1%
Max accuracy: 50.54%
Epoch: [127] [ 0/625] eta: 3:23:21 lr: 0.002724 min_lr: 0.002724 loss: 3.0901 (3.0901) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 19.5219 data: 19.4014 max mem: 2905
Epoch: [127] [200/625] eta: 0:14:34 lr: 0.002717 min_lr: 0.002717 loss: 3.0676 (3.0913) class_acc: 0.5039 (0.5047) weight_decay: 0.0500 (0.0500) grad_norm: 2.2467 (2.4505) time: 2.0436 data: 0.0007 max mem: 2905
Epoch: [127] [400/625] eta: 0:07:30 lr: 0.002711 min_lr: 0.002711 loss: 3.0939 (3.1025) class_acc: 0.5156 (0.5019) weight_decay: 0.0500 (0.0500) grad_norm: 2.7163 (2.3695) time: 1.8498 data: 0.0006 max mem: 2905
Epoch: [127] [600/625] eta: 0:00:49 lr: 0.002704 min_lr: 0.002704 loss: 3.1401 (3.1065) class_acc: 0.4883 (0.5011) weight_decay: 0.0500 (0.0500) grad_norm: 1.8148 (2.4658) time: 2.1132 data: 0.0006 max mem: 2905
Epoch: [127] [624/625] eta: 0:00:01 lr: 0.002703 min_lr: 0.002703 loss: 3.0607 (3.1061) class_acc: 0.5117 (0.5012) weight_decay: 0.0500 (0.0500) grad_norm: 1.6179 (2.4430) time: 0.7911 data: 0.0019 max mem: 2905
Epoch: [127] Total time: 0:20:12 (1.9402 s / it)
Averaged stats: lr: 0.002703 min_lr: 0.002703 loss: 3.0607 (3.1065) class_acc: 0.5117 (0.5024) weight_decay: 0.0500 (0.0500) grad_norm: 1.6179 (2.4430)
Test: [ 0/50] eta: 0:10:37 loss: 3.6693 (3.6693) acc1: 27.2000 (27.2000) acc5: 52.8000 (52.8000) time: 12.7422 data: 12.7054 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 3.2945 (3.1750) acc1: 36.0000 (37.0182) acc5: 60.0000 (60.5091) time: 2.0804 data: 2.0589 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.3121 (3.4147) acc1: 35.2000 (33.3714) acc5: 58.4000 (57.1810) time: 1.0583 data: 1.0391 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.5898 (3.4077) acc1: 30.4000 (33.5742) acc5: 53.6000 (57.1097) time: 1.1034 data: 1.0851 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.3769 (3.4460) acc1: 32.8000 (33.1317) acc5: 57.6000 (56.9561) time: 0.9683 data: 0.9500 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.4949 (3.4829) acc1: 30.4000 (32.6400) acc5: 55.2000 (56.3680) time: 0.8643 data: 0.8457 max mem: 2905
Test: Total time: 0:00:56 (1.1320 s / it)
* Acc@1 33.440 Acc@5 57.588 loss 3.434
Accuracy of the model on the 50000 test images: 33.4%
Max accuracy: 50.54%
Epoch: [128] [ 0/625] eta: 3:44:39 lr: 0.002703 min_lr: 0.002703 loss: 3.1353 (3.1353) class_acc: 0.4531 (0.4531) weight_decay: 0.0500 (0.0500) time: 21.5670 data: 19.3903 max mem: 2905
Epoch: [128] [200/625] eta: 0:14:31 lr: 0.002696 min_lr: 0.002696 loss: 3.1144 (3.0818) class_acc: 0.4922 (0.5080) weight_decay: 0.0500 (0.0500) grad_norm: 1.7458 (2.5138) time: 2.1050 data: 0.8604 max mem: 2905
Epoch: [128] [400/625] eta: 0:07:29 lr: 0.002690 min_lr: 0.002690 loss: 3.0982 (3.0897) class_acc: 0.5000 (0.5065) weight_decay: 0.0500 (0.0500) grad_norm: 2.3505 (2.4784) time: 1.9242 data: 0.1620 max mem: 2905
Epoch: [128] [600/625] eta: 0:00:49 lr: 0.002683 min_lr: 0.002683 loss: 3.1106 (3.0972) class_acc: 0.5039 (0.5052) weight_decay: 0.0500 (0.0500) grad_norm: 2.0343 (2.5318) time: 1.9149 data: 0.0044 max mem: 2905
Epoch: [128] [624/625] eta: 0:00:01 lr: 0.002682 min_lr: 0.002682 loss: 3.1843 (3.0991) class_acc: 0.4883 (0.5047) weight_decay: 0.0500 (0.0500) grad_norm: 1.9398 (2.5042) time: 0.7570 data: 0.0106 max mem: 2905
Epoch: [128] Total time: 0:20:14 (1.9432 s / it)
Averaged stats: lr: 0.002682 min_lr: 0.002682 loss: 3.1843 (3.0996) class_acc: 0.4883 (0.5041) weight_decay: 0.0500 (0.0500) grad_norm: 1.9398 (2.5042)
Test: [ 0/50] eta: 0:10:17 loss: 3.1274 (3.1274) acc1: 38.4000 (38.4000) acc5: 60.0000 (60.0000) time: 12.3595 data: 12.3347 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.6262 (2.6298) acc1: 46.4000 (46.0364) acc5: 71.2000 (70.1818) time: 2.1689 data: 2.1489 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.7527 (2.7734) acc1: 41.6000 (42.3238) acc5: 67.2000 (68.2286) time: 1.2200 data: 1.2005 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.9066 (2.8009) acc1: 41.6000 (42.6839) acc5: 64.8000 (67.7936) time: 1.0967 data: 1.0756 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9212 (2.8277) acc1: 41.6000 (42.1073) acc5: 64.8000 (67.3756) time: 0.6433 data: 0.6215 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8017 (2.8137) acc1: 41.6000 (42.2880) acc5: 65.6000 (67.5360) time: 0.5492 data: 0.5301 max mem: 2905
Test: Total time: 0:00:50 (1.0037 s / it)
* Acc@1 42.296 Acc@5 67.754 loss 2.787
Accuracy of the model on the 50000 test images: 42.3%
Max accuracy: 50.54%
Epoch: [129] [ 0/625] eta: 3:45:57 lr: 0.002682 min_lr: 0.002682 loss: 3.0994 (3.0994) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 21.6926 data: 15.0788 max mem: 2905
Epoch: [129] [200/625] eta: 0:13:54 lr: 0.002675 min_lr: 0.002675 loss: 3.1086 (3.0957) class_acc: 0.5000 (0.5059) weight_decay: 0.0500 (0.0500) grad_norm: 2.1434 (2.2222) time: 1.7782 data: 0.0087 max mem: 2905
Epoch: [129] [400/625] eta: 0:07:12 lr: 0.002668 min_lr: 0.002668 loss: 3.1449 (3.1054) class_acc: 0.5000 (0.5021) weight_decay: 0.0500 (0.0500) grad_norm: 1.9761 (2.3517) time: 1.9512 data: 0.0006 max mem: 2905
Epoch: [129] [600/625] eta: 0:00:48 lr: 0.002662 min_lr: 0.002662 loss: 3.1047 (3.1075) class_acc: 0.4961 (0.5017) weight_decay: 0.0500 (0.0500) grad_norm: 2.2787 (2.3888) time: 2.1130 data: 0.0007 max mem: 2905
Epoch: [129] [624/625] eta: 0:00:01 lr: 0.002661 min_lr: 0.002661 loss: 3.1605 (3.1092) class_acc: 0.5000 (0.5015) weight_decay: 0.0500 (0.0500) grad_norm: 1.8851 (2.4000) time: 0.7090 data: 0.0018 max mem: 2905
Epoch: [129] Total time: 0:20:05 (1.9283 s / it)
Averaged stats: lr: 0.002661 min_lr: 0.002661 loss: 3.1605 (3.0982) class_acc: 0.5000 (0.5045) weight_decay: 0.0500 (0.0500) grad_norm: 1.8851 (2.4000)
Test: [ 0/50] eta: 0:10:03 loss: 2.2711 (2.2711) acc1: 52.0000 (52.0000) acc5: 76.0000 (76.0000) time: 12.0701 data: 12.0443 max mem: 2905
Test: [10/50] eta: 0:01:16 loss: 2.5456 (2.5777) acc1: 45.6000 (46.6909) acc5: 73.6000 (70.1818) time: 1.9042 data: 1.8842 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.7482 (2.6799) acc1: 43.2000 (44.1524) acc5: 66.4000 (68.9143) time: 0.9414 data: 0.9204 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.8096 (2.7056) acc1: 42.4000 (44.0258) acc5: 65.6000 (68.6710) time: 0.9571 data: 0.9368 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8800 (2.7397) acc1: 42.4000 (43.6488) acc5: 67.2000 (68.4488) time: 0.9694 data: 0.9504 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8305 (2.7510) acc1: 42.4000 (43.4720) acc5: 67.2000 (68.3520) time: 0.9179 data: 0.8981 max mem: 2905
Test: Total time: 0:00:55 (1.1118 s / it)
* Acc@1 44.050 Acc@5 69.110 loss 2.714
Accuracy of the model on the 50000 test images: 44.1%
Max accuracy: 50.54%
Epoch: [130] [ 0/625] eta: 3:42:07 lr: 0.002661 min_lr: 0.002661 loss: 2.9074 (2.9074) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 21.3248 data: 20.4725 max mem: 2905
Epoch: [130] [200/625] eta: 0:14:48 lr: 0.002654 min_lr: 0.002654 loss: 3.0971 (3.0695) class_acc: 0.5000 (0.5100) weight_decay: 0.0500 (0.0500) grad_norm: 3.7519 (2.7999) time: 2.0013 data: 0.0042 max mem: 2905
Epoch: [130] [400/625] eta: 0:07:36 lr: 0.002647 min_lr: 0.002647 loss: 3.0981 (3.0921) class_acc: 0.5000 (0.5066) weight_decay: 0.0500 (0.0500) grad_norm: 1.8191 (2.6758) time: 2.0823 data: 0.0007 max mem: 2905
Epoch: [130] [600/625] eta: 0:00:50 lr: 0.002640 min_lr: 0.002640 loss: 3.0461 (3.0995) class_acc: 0.5039 (0.5051) weight_decay: 0.0500 (0.0500) grad_norm: 2.4477 (2.5700) time: 2.0177 data: 0.0009 max mem: 2905
Epoch: [130] [624/625] eta: 0:00:01 lr: 0.002640 min_lr: 0.002640 loss: 3.0996 (3.1000) class_acc: 0.4922 (0.5051) weight_decay: 0.0500 (0.0500) grad_norm: 2.4012 (2.5652) time: 0.7370 data: 0.0013 max mem: 2905
Epoch: [130] Total time: 0:20:27 (1.9640 s / it)
Averaged stats: lr: 0.002640 min_lr: 0.002640 loss: 3.0996 (3.0959) class_acc: 0.4922 (0.5052) weight_decay: 0.0500 (0.0500) grad_norm: 2.4012 (2.5652)
Test: [ 0/50] eta: 0:09:59 loss: 2.6034 (2.6034) acc1: 43.2000 (43.2000) acc5: 75.2000 (75.2000) time: 11.9956 data: 11.9550 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 2.6750 (2.7373) acc1: 44.0000 (42.9818) acc5: 66.4000 (67.2000) time: 2.1179 data: 2.0949 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.8872 (2.8578) acc1: 40.0000 (40.4952) acc5: 64.8000 (65.6000) time: 1.1698 data: 1.1492 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.9208 (2.8683) acc1: 38.4000 (40.8774) acc5: 64.0000 (65.4968) time: 0.9990 data: 0.9794 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9052 (2.8803) acc1: 40.8000 (40.7805) acc5: 64.8000 (65.5220) time: 0.5554 data: 0.5356 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8313 (2.8828) acc1: 40.8000 (40.8160) acc5: 64.8000 (65.8720) time: 0.4990 data: 0.4795 max mem: 2905
Test: Total time: 0:00:46 (0.9368 s / it)
* Acc@1 41.654 Acc@5 66.666 loss 2.834
Accuracy of the model on the 50000 test images: 41.7%
Max accuracy: 50.54%
Epoch: [131] [ 0/625] eta: 3:48:14 lr: 0.002640 min_lr: 0.002640 loss: 3.0310 (3.0310) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 21.9107 data: 16.7253 max mem: 2905
Epoch: [131] [200/625] eta: 0:14:34 lr: 0.002633 min_lr: 0.002633 loss: 3.0468 (3.0699) class_acc: 0.5000 (0.5134) weight_decay: 0.0500 (0.0500) grad_norm: 1.9916 (2.5971) time: 1.9409 data: 0.0006 max mem: 2905
Epoch: [131] [400/625] eta: 0:07:30 lr: 0.002626 min_lr: 0.002626 loss: 3.0858 (3.0833) class_acc: 0.4922 (0.5101) weight_decay: 0.0500 (0.0500) grad_norm: 1.9900 (2.4006) time: 1.8726 data: 0.0008 max mem: 2905
Epoch: [131] [600/625] eta: 0:00:50 lr: 0.002619 min_lr: 0.002619 loss: 3.0437 (3.0933) class_acc: 0.5117 (0.5072) weight_decay: 0.0500 (0.0500) grad_norm: 1.8872 (2.4156) time: 2.1263 data: 0.0007 max mem: 2905
Epoch: [131] [624/625] eta: 0:00:01 lr: 0.002618 min_lr: 0.002618 loss: 3.1339 (3.0944) class_acc: 0.4922 (0.5068) weight_decay: 0.0500 (0.0500) grad_norm: 2.1859 (2.4276) time: 0.6524 data: 0.0017 max mem: 2905
Epoch: [131] Total time: 0:20:22 (1.9561 s / it)
Averaged stats: lr: 0.002618 min_lr: 0.002618 loss: 3.1339 (3.0918) class_acc: 0.4922 (0.5058) weight_decay: 0.0500 (0.0500) grad_norm: 2.1859 (2.4276)
Test: [ 0/50] eta: 0:10:49 loss: 3.8753 (3.8753) acc1: 24.0000 (24.0000) acc5: 53.6000 (53.6000) time: 12.9968 data: 12.9594 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 3.7036 (3.6966) acc1: 30.4000 (30.1091) acc5: 56.8000 (55.4909) time: 2.1387 data: 2.1133 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.7122 (3.8033) acc1: 29.6000 (28.3048) acc5: 54.4000 (53.7143) time: 1.0947 data: 1.0727 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.8895 (3.8641) acc1: 27.2000 (28.3355) acc5: 50.4000 (52.2839) time: 1.1186 data: 1.0983 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.8814 (3.8360) acc1: 28.0000 (28.5659) acc5: 50.4000 (52.4293) time: 0.8823 data: 0.8622 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.7503 (3.8345) acc1: 30.4000 (29.0240) acc5: 52.0000 (52.4320) time: 0.8088 data: 0.7881 max mem: 2905
Test: Total time: 0:00:53 (1.0619 s / it)
* Acc@1 29.670 Acc@5 52.654 loss 3.790
Accuracy of the model on the 50000 test images: 29.7%
Max accuracy: 50.54%
Epoch: [132] [ 0/625] eta: 4:00:51 lr: 0.002618 min_lr: 0.002618 loss: 2.9817 (2.9817) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 23.1219 data: 21.6330 max mem: 2905
Epoch: [132] [200/625] eta: 0:14:47 lr: 0.002612 min_lr: 0.002612 loss: 3.0838 (3.0779) class_acc: 0.5000 (0.5057) weight_decay: 0.0500 (0.0500) grad_norm: 2.0978 (2.6972) time: 1.8246 data: 1.2337 max mem: 2905
Epoch: [132] [400/625] eta: 0:07:35 lr: 0.002605 min_lr: 0.002605 loss: 3.1394 (3.0841) class_acc: 0.4961 (0.5042) weight_decay: 0.0500 (0.0500) grad_norm: 2.0463 (2.5991) time: 2.1092 data: 1.1952 max mem: 2905
Epoch: [132] [600/625] eta: 0:00:50 lr: 0.002598 min_lr: 0.002598 loss: 3.1011 (3.0896) class_acc: 0.5078 (0.5043) weight_decay: 0.0500 (0.0500) grad_norm: 2.2059 (2.5178) time: 1.9701 data: 1.7989 max mem: 2905
Epoch: [132] [624/625] eta: 0:00:01 lr: 0.002597 min_lr: 0.002597 loss: 3.1803 (3.0913) class_acc: 0.4961 (0.5041) weight_decay: 0.0500 (0.0500) grad_norm: 2.2541 (inf) time: 0.7571 data: 0.6053 max mem: 2905
Epoch: [132] Total time: 0:20:22 (1.9560 s / it)
Averaged stats: lr: 0.002597 min_lr: 0.002597 loss: 3.1803 (3.0907) class_acc: 0.4961 (0.5059) weight_decay: 0.0500 (0.0500) grad_norm: 2.2541 (inf)
Test: [ 0/50] eta: 0:10:41 loss: 3.3563 (3.3563) acc1: 34.4000 (34.4000) acc5: 59.2000 (59.2000) time: 12.8237 data: 12.7917 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 3.1198 (3.1489) acc1: 38.4000 (37.9636) acc5: 63.2000 (61.9636) time: 2.1598 data: 2.1406 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 3.1198 (3.2114) acc1: 38.4000 (37.2190) acc5: 60.8000 (60.8762) time: 1.1107 data: 1.0922 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.2023 (3.1852) acc1: 36.8000 (37.2645) acc5: 60.0000 (61.0065) time: 1.0764 data: 1.0572 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.2023 (3.1928) acc1: 38.4000 (37.5415) acc5: 60.8000 (60.7024) time: 0.7072 data: 0.6878 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1572 (3.1698) acc1: 38.4000 (37.6960) acc5: 60.8000 (61.2480) time: 0.6147 data: 0.5963 max mem: 2905
Test: Total time: 0:00:50 (1.0009 s / it)
* Acc@1 37.802 Acc@5 61.950 loss 3.152
Accuracy of the model on the 50000 test images: 37.8%
Max accuracy: 50.54%
Epoch: [133] [ 0/625] eta: 3:47:41 lr: 0.002597 min_lr: 0.002597 loss: 2.8834 (2.8834) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 21.8578 data: 18.2387 max mem: 2905
Epoch: [133] [200/625] eta: 0:13:49 lr: 0.002590 min_lr: 0.002590 loss: 3.0385 (3.0677) class_acc: 0.5234 (0.5119) weight_decay: 0.0500 (0.0500) grad_norm: 2.3796 (2.3078) time: 1.9566 data: 0.0010 max mem: 2905
Epoch: [133] [400/625] eta: 0:07:13 lr: 0.002583 min_lr: 0.002583 loss: 3.1106 (3.0745) class_acc: 0.4922 (0.5093) weight_decay: 0.0500 (0.0500) grad_norm: 1.9767 (2.4886) time: 1.9532 data: 0.0011 max mem: 2905
Epoch: [133] [600/625] eta: 0:00:49 lr: 0.002576 min_lr: 0.002576 loss: 3.0295 (3.0795) class_acc: 0.5078 (0.5072) weight_decay: 0.0500 (0.0500) grad_norm: 2.1936 (2.5566) time: 2.0735 data: 0.0018 max mem: 2905
Epoch: [133] [624/625] eta: 0:00:01 lr: 0.002576 min_lr: 0.002576 loss: 3.0714 (3.0792) class_acc: 0.5156 (0.5072) weight_decay: 0.0500 (0.0500) grad_norm: 1.9940 (2.5608) time: 0.7445 data: 0.0014 max mem: 2905
Epoch: [133] Total time: 0:19:57 (1.9164 s / it)
Averaged stats: lr: 0.002576 min_lr: 0.002576 loss: 3.0714 (3.0852) class_acc: 0.5156 (0.5070) weight_decay: 0.0500 (0.0500) grad_norm: 1.9940 (2.5608)
Test: [ 0/50] eta: 0:10:22 loss: 2.9289 (2.9289) acc1: 45.6000 (45.6000) acc5: 64.0000 (64.0000) time: 12.4445 data: 12.4172 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.9289 (2.9586) acc1: 44.8000 (41.8909) acc5: 64.8000 (65.2364) time: 2.2397 data: 2.2186 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 3.1046 (3.0914) acc1: 38.4000 (39.2762) acc5: 63.2000 (62.8191) time: 1.2882 data: 1.2673 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 3.1749 (3.1206) acc1: 34.4000 (38.1419) acc5: 60.8000 (62.2968) time: 1.2659 data: 1.2458 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.1749 (3.1419) acc1: 36.0000 (37.9512) acc5: 60.8000 (61.9707) time: 0.8894 data: 0.8710 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1066 (3.1548) acc1: 38.4000 (37.7920) acc5: 62.4000 (62.2400) time: 0.7562 data: 0.7356 max mem: 2905
Test: Total time: 0:00:57 (1.1431 s / it)
* Acc@1 37.562 Acc@5 62.364 loss 3.155
Accuracy of the model on the 50000 test images: 37.6%
Max accuracy: 50.54%
Epoch: [134] [ 0/625] eta: 3:55:08 lr: 0.002576 min_lr: 0.002576 loss: 3.1418 (3.1418) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 22.5737 data: 17.1659 max mem: 2905
Epoch: [134] [200/625] eta: 0:14:41 lr: 0.002569 min_lr: 0.002569 loss: 3.1136 (3.0636) class_acc: 0.4961 (0.5117) weight_decay: 0.0500 (0.0500) grad_norm: 2.0655 (2.7874) time: 2.0163 data: 0.0761 max mem: 2905
Epoch: [134] [400/625] eta: 0:07:31 lr: 0.002562 min_lr: 0.002562 loss: 3.0404 (3.0785) class_acc: 0.5234 (0.5092) weight_decay: 0.0500 (0.0500) grad_norm: 2.0991 (2.5392) time: 1.6611 data: 0.0552 max mem: 2905
Epoch: [134] [600/625] eta: 0:00:50 lr: 0.002555 min_lr: 0.002555 loss: 3.0568 (3.0820) class_acc: 0.5195 (0.5086) weight_decay: 0.0500 (0.0500) grad_norm: 2.8572 (2.5756) time: 2.0468 data: 0.0266 max mem: 2905
Epoch: [134] [624/625] eta: 0:00:01 lr: 0.002554 min_lr: 0.002554 loss: 3.1062 (3.0840) class_acc: 0.4961 (0.5082) weight_decay: 0.0500 (0.0500) grad_norm: 1.8271 (2.5438) time: 0.7714 data: 0.0016 max mem: 2905
Epoch: [134] Total time: 0:20:30 (1.9695 s / it)
Averaged stats: lr: 0.002554 min_lr: 0.002554 loss: 3.1062 (3.0853) class_acc: 0.4961 (0.5074) weight_decay: 0.0500 (0.0500) grad_norm: 1.8271 (2.5438)
Test: [ 0/50] eta: 0:10:34 loss: 2.8744 (2.8744) acc1: 44.0000 (44.0000) acc5: 64.0000 (64.0000) time: 12.6816 data: 12.6517 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.7979 (2.7414) acc1: 45.6000 (43.8545) acc5: 66.4000 (68.4364) time: 2.1864 data: 2.1664 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.8649 (2.8343) acc1: 42.4000 (41.2952) acc5: 66.4000 (67.8095) time: 1.1763 data: 1.1566 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.8933 (2.8228) acc1: 41.6000 (41.6000) acc5: 65.6000 (67.4839) time: 1.1369 data: 1.1173 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9831 (2.8686) acc1: 40.0000 (41.1512) acc5: 62.4000 (66.5561) time: 0.6994 data: 0.6796 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8160 (2.8491) acc1: 40.0000 (41.2000) acc5: 65.6000 (67.1040) time: 0.5741 data: 0.5543 max mem: 2905
Test: Total time: 0:00:51 (1.0231 s / it)
* Acc@1 42.118 Acc@5 67.348 loss 2.825
Accuracy of the model on the 50000 test images: 42.1%
Max accuracy: 50.54%
Epoch: [135] [ 0/625] eta: 3:30:14 lr: 0.002554 min_lr: 0.002554 loss: 2.9742 (2.9742) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 20.1824 data: 18.0960 max mem: 2905
Epoch: [135] [200/625] eta: 0:14:28 lr: 0.002547 min_lr: 0.002547 loss: 3.0613 (3.0633) class_acc: 0.5039 (0.5089) weight_decay: 0.0500 (0.0500) grad_norm: 1.6176 (2.4505) time: 1.8430 data: 0.0007 max mem: 2905
Epoch: [135] [400/625] eta: 0:07:28 lr: 0.002540 min_lr: 0.002540 loss: 3.0230 (3.0801) class_acc: 0.5195 (0.5067) weight_decay: 0.0500 (0.0500) grad_norm: 1.6732 (2.5765) time: 2.0788 data: 0.0113 max mem: 2905
Epoch: [135] [600/625] eta: 0:00:49 lr: 0.002533 min_lr: 0.002533 loss: 3.1288 (3.0886) class_acc: 0.4961 (0.5053) weight_decay: 0.0500 (0.0500) grad_norm: 2.2097 (2.5779) time: 1.9825 data: 0.0107 max mem: 2905
Epoch: [135] [624/625] eta: 0:00:01 lr: 0.002533 min_lr: 0.002533 loss: 3.1344 (3.0895) class_acc: 0.5039 (0.5050) weight_decay: 0.0500 (0.0500) grad_norm: 2.0216 (2.5654) time: 0.8351 data: 0.0013 max mem: 2905
Epoch: [135] Total time: 0:20:04 (1.9264 s / it)
Averaged stats: lr: 0.002533 min_lr: 0.002533 loss: 3.1344 (3.0839) class_acc: 0.5039 (0.5075) weight_decay: 0.0500 (0.0500) grad_norm: 2.0216 (2.5654)
Test: [ 0/50] eta: 0:09:08 loss: 2.6195 (2.6195) acc1: 48.8000 (48.8000) acc5: 72.8000 (72.8000) time: 10.9628 data: 10.9350 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.6195 (2.5478) acc1: 48.8000 (48.0727) acc5: 72.8000 (72.0000) time: 1.9565 data: 1.9366 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.7389 (2.7370) acc1: 42.4000 (44.4190) acc5: 67.2000 (68.8381) time: 1.0976 data: 1.0786 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.6896 (2.7006) acc1: 41.6000 (44.6452) acc5: 67.2000 (69.7290) time: 1.0168 data: 0.9965 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.6447 (2.6930) acc1: 45.6000 (45.0927) acc5: 68.8000 (69.7366) time: 0.6306 data: 0.6095 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6632 (2.6966) acc1: 44.0000 (44.6560) acc5: 68.8000 (69.6000) time: 0.5742 data: 0.5546 max mem: 2905
Test: Total time: 0:00:46 (0.9361 s / it)
* Acc@1 44.968 Acc@5 70.110 loss 2.668
Accuracy of the model on the 50000 test images: 45.0%
Max accuracy: 50.54%
Epoch: [136] [ 0/625] eta: 3:35:22 lr: 0.002532 min_lr: 0.002532 loss: 2.8887 (2.8887) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 20.6767 data: 17.5166 max mem: 2905
Epoch: [136] [200/625] eta: 0:13:44 lr: 0.002526 min_lr: 0.002526 loss: 3.0730 (3.0530) class_acc: 0.5039 (0.5150) weight_decay: 0.0500 (0.0500) grad_norm: 1.8825 (2.5561) time: 1.7947 data: 0.0351 max mem: 2905
Epoch: [136] [400/625] eta: 0:07:11 lr: 0.002519 min_lr: 0.002519 loss: 3.0894 (3.0723) class_acc: 0.5117 (0.5099) weight_decay: 0.0500 (0.0500) grad_norm: 2.1014 (2.5203) time: 1.9860 data: 0.0515 max mem: 2905
Epoch: [136] [600/625] eta: 0:00:47 lr: 0.002512 min_lr: 0.002512 loss: 3.0741 (3.0770) class_acc: 0.5078 (0.5089) weight_decay: 0.0500 (0.0500) grad_norm: 2.2991 (2.4746) time: 1.7775 data: 0.2486 max mem: 2905
Epoch: [136] [624/625] eta: 0:00:01 lr: 0.002511 min_lr: 0.002511 loss: 3.0776 (3.0774) class_acc: 0.5117 (0.5090) weight_decay: 0.0500 (0.0500) grad_norm: 2.5085 (2.4902) time: 0.7786 data: 0.3327 max mem: 2905
Epoch: [136] Total time: 0:19:24 (1.8627 s / it)
Averaged stats: lr: 0.002511 min_lr: 0.002511 loss: 3.0776 (3.0764) class_acc: 0.5117 (0.5091) weight_decay: 0.0500 (0.0500) grad_norm: 2.5085 (2.4902)
Test: [ 0/50] eta: 0:10:56 loss: 2.8796 (2.8796) acc1: 37.6000 (37.6000) acc5: 64.8000 (64.8000) time: 13.1252 data: 13.0958 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.6973 (2.6960) acc1: 43.2000 (43.5636) acc5: 67.2000 (69.0909) time: 2.0297 data: 2.0104 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.9880 (2.9792) acc1: 37.6000 (39.3524) acc5: 64.0000 (65.4095) time: 0.9670 data: 0.9483 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.1363 (3.0198) acc1: 35.2000 (39.1742) acc5: 60.8000 (64.7226) time: 0.9917 data: 0.9726 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.0299 (3.0329) acc1: 35.2000 (38.8293) acc5: 64.0000 (64.1171) time: 0.8095 data: 0.7905 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1343 (3.1084) acc1: 34.4000 (37.7760) acc5: 61.6000 (63.1680) time: 0.8178 data: 0.7969 max mem: 2905
Test: Total time: 0:00:51 (1.0226 s / it)
* Acc@1 38.236 Acc@5 63.560 loss 3.092
Accuracy of the model on the 50000 test images: 38.2%
Max accuracy: 50.54%
Epoch: [137] [ 0/625] eta: 3:33:32 lr: 0.002511 min_lr: 0.002511 loss: 3.1157 (3.1157) class_acc: 0.4766 (0.4766) weight_decay: 0.0500 (0.0500) time: 20.5002 data: 17.0115 max mem: 2905
Epoch: [137] [200/625] eta: 0:13:44 lr: 0.002504 min_lr: 0.002504 loss: 3.0574 (3.0515) class_acc: 0.5117 (0.5149) weight_decay: 0.0500 (0.0500) grad_norm: 2.4387 (2.4823) time: 1.8340 data: 0.0006 max mem: 2905
Epoch: [137] [400/625] eta: 0:07:15 lr: 0.002497 min_lr: 0.002497 loss: 3.1327 (3.0634) class_acc: 0.4922 (0.5108) weight_decay: 0.0500 (0.0500) grad_norm: 1.9285 (2.4755) time: 1.9271 data: 0.0007 max mem: 2905
Epoch: [137] [600/625] eta: 0:00:48 lr: 0.002490 min_lr: 0.002490 loss: 3.0988 (3.0693) class_acc: 0.5039 (0.5111) weight_decay: 0.0500 (0.0500) grad_norm: 2.7438 (2.5150) time: 1.9791 data: 0.0008 max mem: 2905
Epoch: [137] [624/625] eta: 0:00:01 lr: 0.002489 min_lr: 0.002489 loss: 3.1064 (3.0705) class_acc: 0.5078 (0.5111) weight_decay: 0.0500 (0.0500) grad_norm: 2.5806 (2.5282) time: 0.8149 data: 0.1237 max mem: 2905
Epoch: [137] Total time: 0:19:46 (1.8978 s / it)
Averaged stats: lr: 0.002489 min_lr: 0.002489 loss: 3.1064 (3.0754) class_acc: 0.5078 (0.5093) weight_decay: 0.0500 (0.0500) grad_norm: 2.5806 (2.5282)
Test: [ 0/50] eta: 0:10:17 loss: 3.4693 (3.4693) acc1: 35.2000 (35.2000) acc5: 60.8000 (60.8000) time: 12.3499 data: 12.3215 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.0785 (2.9895) acc1: 38.4000 (40.4364) acc5: 63.2000 (65.5273) time: 2.2368 data: 2.2153 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 3.0792 (3.1107) acc1: 36.8000 (38.2476) acc5: 62.4000 (63.4286) time: 1.2124 data: 1.1916 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.1595 (3.1155) acc1: 36.8000 (38.6581) acc5: 60.0000 (63.2774) time: 1.0249 data: 1.0048 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1201 (3.1650) acc1: 37.6000 (37.8927) acc5: 60.0000 (62.4781) time: 0.6029 data: 0.5837 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.9527 (3.1231) acc1: 38.4000 (38.6880) acc5: 64.0000 (63.2320) time: 0.5648 data: 0.5465 max mem: 2905
Test: Total time: 0:00:49 (0.9882 s / it)
* Acc@1 38.972 Acc@5 63.514 loss 3.087
Accuracy of the model on the 50000 test images: 39.0%
Max accuracy: 50.54%
Epoch: [138] [ 0/625] eta: 4:30:32 lr: 0.002489 min_lr: 0.002489 loss: 2.9704 (2.9704) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 25.9724 data: 19.9072 max mem: 2905
Epoch: [138] [200/625] eta: 0:14:23 lr: 0.002482 min_lr: 0.002482 loss: 3.0487 (3.0520) class_acc: 0.5117 (0.5141) weight_decay: 0.0500 (0.0500) grad_norm: 1.9061 (2.4300) time: 1.9446 data: 0.0229 max mem: 2905
Epoch: [138] [400/625] eta: 0:07:25 lr: 0.002475 min_lr: 0.002475 loss: 3.1225 (3.0603) class_acc: 0.4961 (0.5122) weight_decay: 0.0500 (0.0500) grad_norm: 2.1429 (2.5400) time: 1.9835 data: 0.0110 max mem: 2905
Epoch: [138] [600/625] eta: 0:00:49 lr: 0.002468 min_lr: 0.002468 loss: 3.0555 (3.0663) class_acc: 0.5000 (0.5112) weight_decay: 0.0500 (0.0500) grad_norm: 2.7596 (2.5965) time: 1.8843 data: 0.0007 max mem: 2905
Epoch: [138] [624/625] eta: 0:00:01 lr: 0.002467 min_lr: 0.002467 loss: 3.1084 (3.0688) class_acc: 0.5039 (0.5108) weight_decay: 0.0500 (0.0500) grad_norm: 2.4343 (2.5907) time: 0.9180 data: 0.0148 max mem: 2905
Epoch: [138] Total time: 0:20:07 (1.9324 s / it)
Averaged stats: lr: 0.002467 min_lr: 0.002467 loss: 3.1084 (3.0770) class_acc: 0.5039 (0.5087) weight_decay: 0.0500 (0.0500) grad_norm: 2.4343 (2.5907)
Test: [ 0/50] eta: 0:09:15 loss: 3.5574 (3.5574) acc1: 30.4000 (30.4000) acc5: 53.6000 (53.6000) time: 11.1032 data: 11.0800 max mem: 2905
Test: [10/50] eta: 0:01:13 loss: 3.3169 (3.3655) acc1: 36.0000 (35.5636) acc5: 57.6000 (57.8182) time: 1.8332 data: 1.8135 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 3.4039 (3.4427) acc1: 32.8000 (34.0952) acc5: 57.6000 (57.1810) time: 0.9577 data: 0.9368 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 3.5441 (3.5067) acc1: 30.4000 (33.2387) acc5: 56.0000 (57.0065) time: 0.9812 data: 0.9603 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.7328 (3.5416) acc1: 30.4000 (32.7220) acc5: 56.8000 (56.6829) time: 0.8310 data: 0.8121 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.5819 (3.5300) acc1: 31.2000 (33.0240) acc5: 56.8000 (56.9440) time: 0.7929 data: 0.7740 max mem: 2905
Test: Total time: 0:00:50 (1.0049 s / it)
* Acc@1 33.694 Acc@5 57.450 loss 3.479
Accuracy of the model on the 50000 test images: 33.7%
Max accuracy: 50.54%
Epoch: [139] [ 0/625] eta: 4:15:13 lr: 0.002467 min_lr: 0.002467 loss: 2.9521 (2.9521) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 24.5017 data: 18.5787 max mem: 2905
Epoch: [139] [200/625] eta: 0:14:36 lr: 0.002460 min_lr: 0.002460 loss: 3.1134 (3.0511) class_acc: 0.5117 (0.5145) weight_decay: 0.0500 (0.0500) grad_norm: 2.0421 (2.3328) time: 1.8900 data: 0.0009 max mem: 2905
Epoch: [139] [400/625] eta: 0:07:29 lr: 0.002453 min_lr: 0.002453 loss: 3.0990 (3.0677) class_acc: 0.4922 (0.5111) weight_decay: 0.0500 (0.0500) grad_norm: 2.5744 (2.4046) time: 2.0204 data: 0.0827 max mem: 2905
Epoch: [139] [600/625] eta: 0:00:50 lr: 0.002446 min_lr: 0.002446 loss: 3.0533 (3.0758) class_acc: 0.5000 (0.5089) weight_decay: 0.0500 (0.0500) grad_norm: 2.6337 (inf) time: 1.9939 data: 0.0009 max mem: 2905
Epoch: [139] [624/625] eta: 0:00:01 lr: 0.002446 min_lr: 0.002446 loss: 3.0955 (3.0768) class_acc: 0.5000 (0.5088) weight_decay: 0.0500 (0.0500) grad_norm: 2.3258 (inf) time: 0.7435 data: 0.0015 max mem: 2905
Epoch: [139] Total time: 0:20:22 (1.9560 s / it)
Averaged stats: lr: 0.002446 min_lr: 0.002446 loss: 3.0955 (3.0721) class_acc: 0.5000 (0.5104) weight_decay: 0.0500 (0.0500) grad_norm: 2.3258 (inf)
Test: [ 0/50] eta: 0:09:43 loss: 2.7553 (2.7553) acc1: 39.2000 (39.2000) acc5: 63.2000 (63.2000) time: 11.6684 data: 11.6374 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 3.2310 (3.2752) acc1: 38.4000 (37.7455) acc5: 60.8000 (60.0727) time: 2.1353 data: 2.1149 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.4882 (3.4361) acc1: 33.6000 (34.6667) acc5: 56.8000 (57.9429) time: 1.2273 data: 1.2081 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.7071 (3.4906) acc1: 31.2000 (34.0129) acc5: 55.2000 (57.1871) time: 1.1301 data: 1.1094 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.6503 (3.5432) acc1: 32.0000 (33.6195) acc5: 54.4000 (56.2341) time: 0.6680 data: 0.6462 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.5699 (3.5565) acc1: 32.0000 (33.1040) acc5: 55.2000 (56.2560) time: 0.6150 data: 0.5931 max mem: 2905
Test: Total time: 0:00:50 (1.0074 s / it)
* Acc@1 33.010 Acc@5 56.616 loss 3.518
Accuracy of the model on the 50000 test images: 33.0%
Max accuracy: 50.54%
Epoch: [140] [ 0/625] eta: 4:04:05 lr: 0.002445 min_lr: 0.002445 loss: 3.1129 (3.1129) class_acc: 0.4805 (0.4805) weight_decay: 0.0500 (0.0500) time: 23.4328 data: 20.6775 max mem: 2905
Epoch: [140] [200/625] eta: 0:14:33 lr: 0.002438 min_lr: 0.002438 loss: 3.1190 (3.0637) class_acc: 0.5039 (0.5121) weight_decay: 0.0500 (0.0500) grad_norm: 2.1149 (2.2286) time: 1.8908 data: 0.0006 max mem: 2905
Epoch: [140] [400/625] eta: 0:07:37 lr: 0.002431 min_lr: 0.002431 loss: 3.0820 (3.0676) class_acc: 0.5000 (0.5100) weight_decay: 0.0500 (0.0500) grad_norm: 3.1054 (2.4204) time: 1.9468 data: 0.0009 max mem: 2905
Epoch: [140] [600/625] eta: 0:00:50 lr: 0.002424 min_lr: 0.002424 loss: 3.1091 (3.0742) class_acc: 0.4883 (0.5089) weight_decay: 0.0500 (0.0500) grad_norm: 1.8624 (2.4511) time: 1.9886 data: 0.0008 max mem: 2905
Epoch: [140] [624/625] eta: 0:00:01 lr: 0.002424 min_lr: 0.002424 loss: 3.0889 (3.0742) class_acc: 0.5078 (0.5088) weight_decay: 0.0500 (0.0500) grad_norm: 2.0453 (2.4640) time: 0.8653 data: 0.0014 max mem: 2905
Epoch: [140] Total time: 0:20:34 (1.9744 s / it)
Averaged stats: lr: 0.002424 min_lr: 0.002424 loss: 3.0889 (3.0672) class_acc: 0.5078 (0.5112) weight_decay: 0.0500 (0.0500) grad_norm: 2.0453 (2.4640)
Test: [ 0/50] eta: 0:09:31 loss: 2.6982 (2.6982) acc1: 43.2000 (43.2000) acc5: 71.2000 (71.2000) time: 11.4348 data: 11.4081 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.8186 (2.8873) acc1: 42.4000 (42.3273) acc5: 64.8000 (65.6727) time: 1.9786 data: 1.9575 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 3.0463 (3.0889) acc1: 37.6000 (38.4000) acc5: 61.6000 (63.4667) time: 1.0915 data: 1.0718 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.2969 (3.1485) acc1: 34.4000 (37.1355) acc5: 60.8000 (62.3484) time: 1.0894 data: 1.0703 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.4500 (3.2531) acc1: 30.4000 (35.3366) acc5: 57.6000 (60.7610) time: 0.8052 data: 0.7859 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.3255 (3.2708) acc1: 31.2000 (34.9920) acc5: 58.4000 (60.6080) time: 0.7198 data: 0.7010 max mem: 2905
Test: Total time: 0:00:50 (1.0103 s / it)
* Acc@1 35.424 Acc@5 60.910 loss 3.244
Accuracy of the model on the 50000 test images: 35.4%
Max accuracy: 50.54%
Epoch: [141] [ 0/625] eta: 3:31:49 lr: 0.002424 min_lr: 0.002424 loss: 2.8883 (2.8883) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 20.3349 data: 18.4900 max mem: 2905
Epoch: [141] [200/625] eta: 0:14:15 lr: 0.002417 min_lr: 0.002417 loss: 3.0460 (3.0330) class_acc: 0.5156 (0.5191) weight_decay: 0.0500 (0.0500) grad_norm: 1.8336 (2.5539) time: 1.7892 data: 0.1157 max mem: 2905
Epoch: [141] [400/625] eta: 0:07:23 lr: 0.002409 min_lr: 0.002409 loss: 3.0470 (3.0547) class_acc: 0.5000 (0.5134) weight_decay: 0.0500 (0.0500) grad_norm: 2.1578 (2.4908) time: 1.8894 data: 0.0324 max mem: 2905
Epoch: [141] [600/625] eta: 0:00:48 lr: 0.002402 min_lr: 0.002402 loss: 3.0643 (3.0666) class_acc: 0.5039 (0.5114) weight_decay: 0.0500 (0.0500) grad_norm: 2.0136 (2.5000) time: 1.9136 data: 0.0007 max mem: 2905
Epoch: [141] [624/625] eta: 0:00:01 lr: 0.002402 min_lr: 0.002402 loss: 3.0441 (3.0652) class_acc: 0.5195 (0.5117) weight_decay: 0.0500 (0.0500) grad_norm: 2.3080 (2.5106) time: 0.8594 data: 0.0013 max mem: 2905
Epoch: [141] Total time: 0:19:55 (1.9133 s / it)
Averaged stats: lr: 0.002402 min_lr: 0.002402 loss: 3.0441 (3.0653) class_acc: 0.5195 (0.5119) weight_decay: 0.0500 (0.0500) grad_norm: 2.3080 (2.5106)
Test: [ 0/50] eta: 0:09:44 loss: 2.8425 (2.8425) acc1: 40.8000 (40.8000) acc5: 65.6000 (65.6000) time: 11.6981 data: 11.6530 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.7544 (2.7566) acc1: 42.4000 (43.4182) acc5: 66.4000 (68.2909) time: 2.0483 data: 2.0258 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.9078 (2.9245) acc1: 39.2000 (40.0381) acc5: 68.0000 (66.5143) time: 1.1187 data: 1.0991 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.9040 (2.9074) acc1: 39.2000 (40.6710) acc5: 66.4000 (66.1677) time: 1.1138 data: 1.0937 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8411 (2.9284) acc1: 40.0000 (40.5659) acc5: 62.4000 (65.8146) time: 0.7600 data: 0.7403 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.9588 (2.9424) acc1: 38.4000 (40.4960) acc5: 64.0000 (65.7280) time: 0.7342 data: 0.7158 max mem: 2905
Test: Total time: 0:00:49 (0.9977 s / it)
* Acc@1 40.812 Acc@5 66.092 loss 2.915
Accuracy of the model on the 50000 test images: 40.8%
Max accuracy: 50.54%
Epoch: [142] [ 0/625] eta: 3:39:07 lr: 0.002402 min_lr: 0.002402 loss: 3.0267 (3.0267) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 21.0363 data: 18.1838 max mem: 2905
Epoch: [142] [200/625] eta: 0:13:46 lr: 0.002395 min_lr: 0.002395 loss: 3.0032 (3.0524) class_acc: 0.5234 (0.5137) weight_decay: 0.0500 (0.0500) grad_norm: 2.4480 (2.6205) time: 1.8363 data: 0.0007 max mem: 2905
Epoch: [142] [400/625] eta: 0:07:10 lr: 0.002387 min_lr: 0.002387 loss: 3.0138 (3.0590) class_acc: 0.5234 (0.5129) weight_decay: 0.0500 (0.0500) grad_norm: 2.5212 (2.5400) time: 1.8549 data: 0.0008 max mem: 2905
Epoch: [142] [600/625] eta: 0:00:48 lr: 0.002380 min_lr: 0.002380 loss: 3.0526 (3.0600) class_acc: 0.5117 (0.5132) weight_decay: 0.0500 (0.0500) grad_norm: 2.4879 (2.5871) time: 2.0143 data: 0.0464 max mem: 2905
Epoch: [142] [624/625] eta: 0:00:01 lr: 0.002380 min_lr: 0.002380 loss: 3.0838 (3.0604) class_acc: 0.5117 (0.5131) weight_decay: 0.0500 (0.0500) grad_norm: 2.5098 (2.5811) time: 0.8553 data: 0.0324 max mem: 2905
Epoch: [142] Total time: 0:19:36 (1.8829 s / it)
Averaged stats: lr: 0.002380 min_lr: 0.002380 loss: 3.0838 (3.0624) class_acc: 0.5117 (0.5125) weight_decay: 0.0500 (0.0500) grad_norm: 2.5098 (2.5811)
Test: [ 0/50] eta: 0:10:35 loss: 2.5529 (2.5529) acc1: 45.6000 (45.6000) acc5: 76.0000 (76.0000) time: 12.7139 data: 12.6900 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.7295 (2.7750) acc1: 45.6000 (44.1455) acc5: 67.2000 (67.7818) time: 2.2323 data: 2.2109 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.9010 (2.9018) acc1: 40.8000 (40.8381) acc5: 65.6000 (66.6667) time: 1.2302 data: 1.2102 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 2.9971 (2.9226) acc1: 38.4000 (40.7484) acc5: 64.0000 (66.4258) time: 1.2898 data: 1.2706 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.9971 (2.9448) acc1: 37.6000 (40.3707) acc5: 64.0000 (65.9317) time: 0.8946 data: 0.8735 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9262 (2.9244) acc1: 38.4000 (40.7840) acc5: 64.8000 (66.1280) time: 0.8447 data: 0.8237 max mem: 2905
Test: Total time: 0:00:55 (1.1161 s / it)
* Acc@1 41.306 Acc@5 66.844 loss 2.901
Accuracy of the model on the 50000 test images: 41.3%
Max accuracy: 50.54%
Epoch: [143] [ 0/625] eta: 3:47:01 lr: 0.002380 min_lr: 0.002380 loss: 3.0527 (3.0527) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 21.7950 data: 19.6368 max mem: 2905
Epoch: [143] [200/625] eta: 0:14:21 lr: 0.002373 min_lr: 0.002373 loss: 3.0497 (3.0543) class_acc: 0.5039 (0.5142) weight_decay: 0.0500 (0.0500) grad_norm: 2.0981 (2.3357) time: 1.7320 data: 0.0229 max mem: 2905
Epoch: [143] [400/625] eta: 0:07:23 lr: 0.002365 min_lr: 0.002365 loss: 3.0871 (3.0539) class_acc: 0.5000 (0.5146) weight_decay: 0.0500 (0.0500) grad_norm: 2.3299 (2.3092) time: 1.9094 data: 0.0281 max mem: 2905
Epoch: [143] [600/625] eta: 0:00:48 lr: 0.002358 min_lr: 0.002358 loss: 3.0476 (3.0592) class_acc: 0.5078 (0.5131) weight_decay: 0.0500 (0.0500) grad_norm: 1.7280 (2.4605) time: 2.0905 data: 0.0351 max mem: 2905
Epoch: [143] [624/625] eta: 0:00:01 lr: 0.002358 min_lr: 0.002358 loss: 3.0777 (3.0605) class_acc: 0.5039 (0.5128) weight_decay: 0.0500 (0.0500) grad_norm: 1.7995 (2.4555) time: 0.8065 data: 0.0190 max mem: 2905
Epoch: [143] Total time: 0:19:56 (1.9137 s / it)
Averaged stats: lr: 0.002358 min_lr: 0.002358 loss: 3.0777 (3.0604) class_acc: 0.5039 (0.5126) weight_decay: 0.0500 (0.0500) grad_norm: 1.7995 (2.4555)
Test: [ 0/50] eta: 0:10:49 loss: 2.8117 (2.8117) acc1: 39.2000 (39.2000) acc5: 68.8000 (68.8000) time: 12.9843 data: 12.9552 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.8304 (2.8467) acc1: 41.6000 (41.5273) acc5: 67.2000 (67.1273) time: 2.0420 data: 2.0209 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.0952 (3.0441) acc1: 38.4000 (38.8190) acc5: 63.2000 (64.3429) time: 0.9826 data: 0.9632 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.1793 (3.0280) acc1: 37.6000 (39.4323) acc5: 63.2000 (64.7226) time: 0.9622 data: 0.9436 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1292 (3.0684) acc1: 37.6000 (38.7122) acc5: 63.2000 (63.9610) time: 0.6577 data: 0.6391 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1208 (3.0727) acc1: 37.6000 (38.7680) acc5: 63.2000 (63.9200) time: 0.5875 data: 0.5691 max mem: 2905
Test: Total time: 0:00:47 (0.9456 s / it)
* Acc@1 39.052 Acc@5 64.066 loss 3.030
Accuracy of the model on the 50000 test images: 39.1%
Max accuracy: 50.54%
Epoch: [144] [ 0/625] eta: 3:31:47 lr: 0.002358 min_lr: 0.002358 loss: 3.0349 (3.0349) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.3321 data: 18.2587 max mem: 2905
Epoch: [144] [200/625] eta: 0:14:12 lr: 0.002350 min_lr: 0.002350 loss: 2.9920 (3.0409) class_acc: 0.5156 (0.5182) weight_decay: 0.0500 (0.0500) grad_norm: 2.1870 (2.4454) time: 1.8720 data: 0.0057 max mem: 2905
Epoch: [144] [400/625] eta: 0:07:23 lr: 0.002343 min_lr: 0.002343 loss: 3.0716 (3.0474) class_acc: 0.5156 (0.5145) weight_decay: 0.0500 (0.0500) grad_norm: 2.0637 (2.5605) time: 2.0267 data: 0.0659 max mem: 2905
Epoch: [144] [600/625] eta: 0:00:49 lr: 0.002336 min_lr: 0.002336 loss: 3.0663 (3.0522) class_acc: 0.5195 (0.5142) weight_decay: 0.0500 (0.0500) grad_norm: 2.7558 (2.5100) time: 2.0571 data: 0.0037 max mem: 2905
Epoch: [144] [624/625] eta: 0:00:01 lr: 0.002335 min_lr: 0.002335 loss: 3.1179 (3.0541) class_acc: 0.4922 (0.5138) weight_decay: 0.0500 (0.0500) grad_norm: 2.7494 (2.5205) time: 0.9005 data: 0.0095 max mem: 2905
Epoch: [144] Total time: 0:20:13 (1.9423 s / it)
Averaged stats: lr: 0.002335 min_lr: 0.002335 loss: 3.1179 (3.0561) class_acc: 0.4922 (0.5136) weight_decay: 0.0500 (0.0500) grad_norm: 2.7494 (2.5205)
Test: [ 0/50] eta: 0:09:26 loss: 3.0690 (3.0690) acc1: 35.2000 (35.2000) acc5: 64.0000 (64.0000) time: 11.3384 data: 11.3129 max mem: 2905
Test: [10/50] eta: 0:01:13 loss: 3.0585 (2.9509) acc1: 40.0000 (40.3636) acc5: 64.8000 (64.5818) time: 1.8457 data: 1.8273 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 3.0100 (3.0383) acc1: 39.2000 (38.2857) acc5: 64.8000 (63.1238) time: 0.9509 data: 0.9328 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.0654 (3.0544) acc1: 36.8000 (38.6065) acc5: 63.2000 (63.2516) time: 1.0023 data: 0.9832 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0751 (3.0647) acc1: 39.2000 (38.4000) acc5: 64.0000 (63.2585) time: 1.0781 data: 1.0594 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0188 (3.0677) acc1: 36.0000 (38.1760) acc5: 63.2000 (63.3280) time: 0.8005 data: 0.7826 max mem: 2905
Test: Total time: 0:00:56 (1.1317 s / it)
* Acc@1 39.424 Acc@5 64.080 loss 3.020
Accuracy of the model on the 50000 test images: 39.4%
Max accuracy: 50.54%
Epoch: [145] [ 0/625] eta: 3:36:02 lr: 0.002335 min_lr: 0.002335 loss: 2.8599 (2.8599) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 20.7404 data: 17.3197 max mem: 2905
Epoch: [145] [200/625] eta: 0:14:27 lr: 0.002328 min_lr: 0.002328 loss: 2.9824 (3.0238) class_acc: 0.5234 (0.5212) weight_decay: 0.0500 (0.0500) grad_norm: 2.1198 (2.4618) time: 1.9230 data: 0.0068 max mem: 2905
Epoch: [145] [400/625] eta: 0:07:31 lr: 0.002321 min_lr: 0.002321 loss: 3.0434 (3.0445) class_acc: 0.5078 (0.5168) weight_decay: 0.0500 (0.0500) grad_norm: 2.5601 (2.5921) time: 2.1497 data: 0.0007 max mem: 2905
Epoch: [145] [600/625] eta: 0:00:50 lr: 0.002314 min_lr: 0.002314 loss: 3.0194 (3.0510) class_acc: 0.5195 (0.5156) weight_decay: 0.0500 (0.0500) grad_norm: 2.1357 (2.5381) time: 2.1454 data: 0.0150 max mem: 2905
Epoch: [145] [624/625] eta: 0:00:01 lr: 0.002313 min_lr: 0.002313 loss: 3.0428 (3.0504) class_acc: 0.5117 (0.5156) weight_decay: 0.0500 (0.0500) grad_norm: 2.3087 (2.5419) time: 0.6577 data: 0.0018 max mem: 2905
Epoch: [145] Total time: 0:20:18 (1.9490 s / it)
Averaged stats: lr: 0.002313 min_lr: 0.002313 loss: 3.0428 (3.0521) class_acc: 0.5117 (0.5148) weight_decay: 0.0500 (0.0500) grad_norm: 2.3087 (2.5419)
Test: [ 0/50] eta: 0:10:06 loss: 3.0126 (3.0126) acc1: 39.2000 (39.2000) acc5: 64.8000 (64.8000) time: 12.1301 data: 12.1020 max mem: 2905
Test: [10/50] eta: 0:01:10 loss: 3.0126 (3.0161) acc1: 39.2000 (40.4364) acc5: 64.8000 (64.2909) time: 1.7690 data: 1.7470 max mem: 2905
Test: [20/50] eta: 0:00:38 loss: 3.2252 (3.2251) acc1: 33.6000 (36.5333) acc5: 60.0000 (60.8762) time: 0.7394 data: 0.7179 max mem: 2905
Test: [30/50] eta: 0:00:23 loss: 3.2189 (3.1635) acc1: 33.6000 (37.1613) acc5: 60.8000 (61.9871) time: 0.8651 data: 0.8452 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.1708 (3.2238) acc1: 35.2000 (36.5463) acc5: 61.6000 (61.1707) time: 0.8513 data: 0.8336 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.2854 (3.2300) acc1: 33.6000 (36.6080) acc5: 60.0000 (61.1360) time: 0.5136 data: 0.4959 max mem: 2905
Test: Total time: 0:00:47 (0.9433 s / it)
* Acc@1 37.644 Acc@5 61.944 loss 3.189
Accuracy of the model on the 50000 test images: 37.6%
Max accuracy: 50.54%
Epoch: [146] [ 0/625] eta: 3:55:02 lr: 0.002313 min_lr: 0.002313 loss: 3.0128 (3.0128) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 22.5640 data: 21.0014 max mem: 2905
Epoch: [146] [200/625] eta: 0:14:47 lr: 0.002306 min_lr: 0.002306 loss: 3.0039 (3.0538) class_acc: 0.5117 (0.5141) weight_decay: 0.0500 (0.0500) grad_norm: 1.9791 (2.6369) time: 1.8616 data: 0.1726 max mem: 2905
Epoch: [146] [400/625] eta: 0:07:33 lr: 0.002299 min_lr: 0.002299 loss: 3.0296 (3.0546) class_acc: 0.5117 (0.5150) weight_decay: 0.0500 (0.0500) grad_norm: 2.5003 (inf) time: 1.9097 data: 0.0032 max mem: 2905
Epoch: [146] [600/625] eta: 0:00:50 lr: 0.002292 min_lr: 0.002292 loss: 3.0597 (3.0568) class_acc: 0.5039 (0.5142) weight_decay: 0.0500 (0.0500) grad_norm: 1.8707 (inf) time: 2.0692 data: 0.0098 max mem: 2905
Epoch: [146] [624/625] eta: 0:00:01 lr: 0.002291 min_lr: 0.002291 loss: 3.0740 (3.0579) class_acc: 0.5039 (0.5138) weight_decay: 0.0500 (0.0500) grad_norm: 1.6800 (inf) time: 0.7919 data: 0.0015 max mem: 2905
Epoch: [146] Total time: 0:20:29 (1.9674 s / it)
Averaged stats: lr: 0.002291 min_lr: 0.002291 loss: 3.0740 (3.0554) class_acc: 0.5039 (0.5138) weight_decay: 0.0500 (0.0500) grad_norm: 1.6800 (inf)
Test: [ 0/50] eta: 0:09:44 loss: 3.3146 (3.3146) acc1: 37.6000 (37.6000) acc5: 60.0000 (60.0000) time: 11.6938 data: 11.6670 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 3.5712 (3.6294) acc1: 32.8000 (32.9455) acc5: 53.6000 (54.9091) time: 1.9435 data: 1.9241 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.6901 (3.7306) acc1: 29.6000 (30.4381) acc5: 52.0000 (53.6381) time: 1.0347 data: 1.0151 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.7869 (3.7748) acc1: 27.2000 (29.9097) acc5: 50.4000 (52.5936) time: 1.0609 data: 1.0409 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.9946 (3.8499) acc1: 27.2000 (29.4439) acc5: 48.8000 (51.5122) time: 0.8709 data: 0.8508 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.7481 (3.7961) acc1: 29.6000 (29.9840) acc5: 51.2000 (52.3680) time: 0.7295 data: 0.7082 max mem: 2905
Test: Total time: 0:00:51 (1.0237 s / it)
* Acc@1 30.328 Acc@5 52.902 loss 3.751
Accuracy of the model on the 50000 test images: 30.3%
Max accuracy: 50.54%
Epoch: [147] [ 0/625] eta: 3:58:55 lr: 0.002291 min_lr: 0.002291 loss: 3.3517 (3.3517) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 22.9374 data: 18.6722 max mem: 2905
Epoch: [147] [200/625] eta: 0:14:45 lr: 0.002284 min_lr: 0.002284 loss: 3.0038 (3.0383) class_acc: 0.5156 (0.5182) weight_decay: 0.0500 (0.0500) grad_norm: 2.1349 (2.4594) time: 1.9008 data: 0.0013 max mem: 2905
Epoch: [147] [400/625] eta: 0:07:32 lr: 0.002277 min_lr: 0.002277 loss: 3.0456 (3.0355) class_acc: 0.5156 (0.5176) weight_decay: 0.0500 (0.0500) grad_norm: 3.4006 (2.5693) time: 1.9305 data: 0.0008 max mem: 2905
Epoch: [147] [600/625] eta: 0:00:50 lr: 0.002270 min_lr: 0.002270 loss: 3.0956 (3.0467) class_acc: 0.4961 (0.5152) weight_decay: 0.0500 (0.0500) grad_norm: 2.1714 (2.5219) time: 1.9798 data: 0.0009 max mem: 2905
Epoch: [147] [624/625] eta: 0:00:01 lr: 0.002269 min_lr: 0.002269 loss: 3.0867 (3.0477) class_acc: 0.5078 (0.5150) weight_decay: 0.0500 (0.0500) grad_norm: 1.8592 (2.5137) time: 0.7221 data: 0.0020 max mem: 2905
Epoch: [147] Total time: 0:20:22 (1.9562 s / it)
Averaged stats: lr: 0.002269 min_lr: 0.002269 loss: 3.0867 (3.0508) class_acc: 0.5078 (0.5149) weight_decay: 0.0500 (0.0500) grad_norm: 1.8592 (2.5137)
Test: [ 0/50] eta: 0:10:51 loss: 2.5123 (2.5123) acc1: 46.4000 (46.4000) acc5: 73.6000 (73.6000) time: 13.0329 data: 13.0048 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 2.5123 (2.5746) acc1: 47.2000 (46.3273) acc5: 73.6000 (71.5636) time: 2.1962 data: 2.1753 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.7364 (2.7569) acc1: 44.8000 (43.5429) acc5: 68.0000 (68.5333) time: 1.1338 data: 1.1129 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.9157 (2.8039) acc1: 40.8000 (42.8903) acc5: 66.4000 (68.1806) time: 1.0853 data: 1.0654 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.9157 (2.8110) acc1: 40.8000 (42.8293) acc5: 66.4000 (67.8829) time: 0.7223 data: 0.7038 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6534 (2.7788) acc1: 41.6000 (43.0880) acc5: 67.2000 (68.3200) time: 0.6628 data: 0.6435 max mem: 2905
Test: Total time: 0:00:50 (1.0193 s / it)
* Acc@1 43.650 Acc@5 68.506 loss 2.761
Accuracy of the model on the 50000 test images: 43.7%
Max accuracy: 50.54%
Epoch: [148] [ 0/625] eta: 4:09:23 lr: 0.002269 min_lr: 0.002269 loss: 3.1695 (3.1695) class_acc: 0.4688 (0.4688) weight_decay: 0.0500 (0.0500) time: 23.9415 data: 18.5702 max mem: 2905
Epoch: [148] [200/625] eta: 0:14:16 lr: 0.002262 min_lr: 0.002262 loss: 2.9880 (3.0238) class_acc: 0.5273 (0.5188) weight_decay: 0.0500 (0.0500) grad_norm: 2.4833 (2.5539) time: 1.8423 data: 0.5251 max mem: 2905
Epoch: [148] [400/625] eta: 0:07:19 lr: 0.002255 min_lr: 0.002255 loss: 3.0093 (3.0311) class_acc: 0.5352 (0.5189) weight_decay: 0.0500 (0.0500) grad_norm: 2.4527 (2.5450) time: 1.9003 data: 0.0109 max mem: 2905
Epoch: [148] [600/625] eta: 0:00:48 lr: 0.002248 min_lr: 0.002248 loss: 3.0577 (3.0411) class_acc: 0.5156 (0.5168) weight_decay: 0.0500 (0.0500) grad_norm: 2.7720 (2.5787) time: 1.9974 data: 0.3803 max mem: 2905
Epoch: [148] [624/625] eta: 0:00:01 lr: 0.002247 min_lr: 0.002247 loss: 3.0280 (3.0412) class_acc: 0.5273 (0.5168) weight_decay: 0.0500 (0.0500) grad_norm: 1.8916 (2.5543) time: 1.1760 data: 0.0593 max mem: 2905
Epoch: [148] Total time: 0:19:42 (1.8920 s / it)
Averaged stats: lr: 0.002247 min_lr: 0.002247 loss: 3.0280 (3.0448) class_acc: 0.5273 (0.5164) weight_decay: 0.0500 (0.0500) grad_norm: 1.8916 (2.5543)
Test: [ 0/50] eta: 0:09:52 loss: 2.4585 (2.4585) acc1: 41.6000 (41.6000) acc5: 74.4000 (74.4000) time: 11.8444 data: 11.8086 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.8436 (2.8866) acc1: 41.6000 (41.6727) acc5: 65.6000 (65.7455) time: 2.1555 data: 2.1350 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.0068 (2.9908) acc1: 40.8000 (40.2286) acc5: 62.4000 (64.2667) time: 1.2254 data: 1.2065 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.0115 (2.9515) acc1: 40.8000 (41.0839) acc5: 63.2000 (65.1097) time: 0.9994 data: 0.9798 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9922 (2.9965) acc1: 41.6000 (40.6439) acc5: 63.2000 (64.7610) time: 0.5585 data: 0.5398 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.8542 (2.9806) acc1: 41.6000 (40.6400) acc5: 65.6000 (65.0080) time: 0.4895 data: 0.4708 max mem: 2905
Test: Total time: 0:00:48 (0.9743 s / it)
* Acc@1 40.844 Acc@5 65.812 loss 2.945
Accuracy of the model on the 50000 test images: 40.8%
Max accuracy: 50.54%
Epoch: [149] [ 0/625] eta: 3:55:14 lr: 0.002247 min_lr: 0.002247 loss: 3.1467 (3.1467) class_acc: 0.4570 (0.4570) weight_decay: 0.0500 (0.0500) time: 22.5826 data: 18.4608 max mem: 2905
Epoch: [149] [200/625] eta: 0:13:40 lr: 0.002240 min_lr: 0.002240 loss: 3.0379 (3.0399) class_acc: 0.5195 (0.5166) weight_decay: 0.0500 (0.0500) grad_norm: 1.6151 (2.3642) time: 1.6730 data: 0.8320 max mem: 2905
Epoch: [149] [400/625] eta: 0:07:13 lr: 0.002232 min_lr: 0.002232 loss: 2.9688 (3.0396) class_acc: 0.5312 (0.5178) weight_decay: 0.0500 (0.0500) grad_norm: 2.3880 (2.6806) time: 1.9757 data: 0.0060 max mem: 2905
Epoch: [149] [600/625] eta: 0:00:49 lr: 0.002225 min_lr: 0.002225 loss: 3.0846 (3.0494) class_acc: 0.4961 (0.5148) weight_decay: 0.0500 (0.0500) grad_norm: 2.4737 (2.6573) time: 2.1015 data: 0.0458 max mem: 2905
Epoch: [149] [624/625] eta: 0:00:01 lr: 0.002224 min_lr: 0.002224 loss: 3.0867 (3.0503) class_acc: 0.5000 (0.5144) weight_decay: 0.0500 (0.0500) grad_norm: 2.3000 (2.6448) time: 0.7234 data: 0.0021 max mem: 2905
Epoch: [149] Total time: 0:20:16 (1.9466 s / it)
Averaged stats: lr: 0.002224 min_lr: 0.002224 loss: 3.0867 (3.0456) class_acc: 0.5000 (0.5158) weight_decay: 0.0500 (0.0500) grad_norm: 2.3000 (2.6448)
Test: [ 0/50] eta: 0:11:19 loss: 2.5075 (2.5075) acc1: 38.4000 (38.4000) acc5: 70.4000 (70.4000) time: 13.5836 data: 13.5596 max mem: 2905
Test: [10/50] eta: 0:01:34 loss: 2.6579 (2.7780) acc1: 40.0000 (42.9091) acc5: 68.8000 (67.1273) time: 2.3603 data: 2.3389 max mem: 2905
Test: [20/50] eta: 0:00:56 loss: 2.9266 (2.9420) acc1: 38.4000 (39.9238) acc5: 64.8000 (65.2191) time: 1.2932 data: 1.2714 max mem: 2905
Test: [30/50] eta: 0:00:33 loss: 3.0047 (2.9357) acc1: 36.8000 (40.4645) acc5: 64.8000 (65.4452) time: 1.2669 data: 1.2429 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.8491 (2.9547) acc1: 43.2000 (40.5659) acc5: 66.4000 (65.0341) time: 0.8628 data: 0.8400 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8092 (2.9274) acc1: 43.2000 (40.8800) acc5: 68.8000 (65.6800) time: 0.8604 data: 0.8399 max mem: 2905
Test: Total time: 0:00:57 (1.1413 s / it)
* Acc@1 41.758 Acc@5 66.238 loss 2.885
Accuracy of the model on the 50000 test images: 41.8%
Max accuracy: 50.54%
Epoch: [150] [ 0/625] eta: 3:48:53 lr: 0.002224 min_lr: 0.002224 loss: 3.0979 (3.0979) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 21.9732 data: 20.1845 max mem: 2905
Epoch: [150] [200/625] eta: 0:14:54 lr: 0.002217 min_lr: 0.002217 loss: 3.0001 (3.0312) class_acc: 0.5312 (0.5190) weight_decay: 0.0500 (0.0500) grad_norm: 2.2735 (2.5337) time: 2.1304 data: 1.9446 max mem: 2905
Epoch: [150] [400/625] eta: 0:07:34 lr: 0.002210 min_lr: 0.002210 loss: 3.0345 (3.0356) class_acc: 0.5039 (0.5184) weight_decay: 0.0500 (0.0500) grad_norm: 2.4768 (2.5608) time: 1.8880 data: 1.6795 max mem: 2905
Epoch: [150] [600/625] eta: 0:00:49 lr: 0.002203 min_lr: 0.002203 loss: 3.0437 (3.0402) class_acc: 0.5273 (0.5181) weight_decay: 0.0500 (0.0500) grad_norm: 1.7446 (2.5425) time: 1.9584 data: 1.7877 max mem: 2905
Epoch: [150] [624/625] eta: 0:00:01 lr: 0.002202 min_lr: 0.002202 loss: 3.0401 (3.0396) class_acc: 0.5156 (0.5183) weight_decay: 0.0500 (0.0500) grad_norm: 2.7496 (2.5655) time: 0.7883 data: 0.6352 max mem: 2905
Epoch: [150] Total time: 0:20:17 (1.9480 s / it)
Averaged stats: lr: 0.002202 min_lr: 0.002202 loss: 3.0401 (3.0419) class_acc: 0.5156 (0.5165) weight_decay: 0.0500 (0.0500) grad_norm: 2.7496 (2.5655)
Test: [ 0/50] eta: 0:09:40 loss: 2.9424 (2.9424) acc1: 36.8000 (36.8000) acc5: 60.0000 (60.0000) time: 11.6073 data: 11.5793 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.7706 (2.8128) acc1: 40.0000 (41.3091) acc5: 68.8000 (67.0545) time: 1.8958 data: 1.8751 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 3.1150 (3.0830) acc1: 37.6000 (37.9048) acc5: 64.0000 (63.7333) time: 0.9640 data: 0.9454 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 3.1970 (3.1249) acc1: 35.2000 (37.6516) acc5: 61.6000 (63.1484) time: 0.9309 data: 0.9132 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.1618 (3.1494) acc1: 36.8000 (37.5610) acc5: 60.0000 (62.3024) time: 0.6551 data: 0.6365 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1191 (3.1599) acc1: 36.8000 (37.3120) acc5: 60.0000 (62.2400) time: 0.5938 data: 0.5751 max mem: 2905
Test: Total time: 0:00:46 (0.9210 s / it)
* Acc@1 38.200 Acc@5 62.776 loss 3.120
Accuracy of the model on the 50000 test images: 38.2%
Max accuracy: 50.54%
Epoch: [151] [ 0/625] eta: 3:44:37 lr: 0.002202 min_lr: 0.002202 loss: 3.1651 (3.1651) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 21.5638 data: 18.6684 max mem: 2905
Epoch: [151] [200/625] eta: 0:13:39 lr: 0.002195 min_lr: 0.002195 loss: 3.0265 (3.0355) class_acc: 0.5039 (0.5178) weight_decay: 0.0500 (0.0500) grad_norm: 2.2292 (2.4239) time: 1.7984 data: 0.0006 max mem: 2905
Epoch: [151] [400/625] eta: 0:07:08 lr: 0.002188 min_lr: 0.002188 loss: 3.0176 (3.0361) class_acc: 0.5117 (0.5185) weight_decay: 0.0500 (0.0500) grad_norm: 3.0534 (2.6388) time: 1.7462 data: 0.0007 max mem: 2905
Epoch: [151] [600/625] eta: 0:00:47 lr: 0.002181 min_lr: 0.002181 loss: 3.0618 (3.0399) class_acc: 0.4961 (0.5178) weight_decay: 0.0500 (0.0500) grad_norm: 2.7182 (2.6694) time: 1.9069 data: 0.0006 max mem: 2905
Epoch: [151] [624/625] eta: 0:00:01 lr: 0.002180 min_lr: 0.002180 loss: 3.0361 (3.0408) class_acc: 0.5078 (0.5178) weight_decay: 0.0500 (0.0500) grad_norm: 2.2722 (2.6563) time: 0.6331 data: 0.0012 max mem: 2905
Epoch: [151] Total time: 0:19:34 (1.8794 s / it)
Averaged stats: lr: 0.002180 min_lr: 0.002180 loss: 3.0361 (3.0411) class_acc: 0.5078 (0.5170) weight_decay: 0.0500 (0.0500) grad_norm: 2.2722 (2.6563)
Test: [ 0/50] eta: 0:10:12 loss: 3.5451 (3.5451) acc1: 31.2000 (31.2000) acc5: 53.6000 (53.6000) time: 12.2476 data: 12.2141 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 3.2563 (3.2406) acc1: 37.6000 (36.5091) acc5: 60.8000 (60.5091) time: 1.9833 data: 1.9621 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 3.3185 (3.3582) acc1: 36.0000 (35.1619) acc5: 58.4000 (59.4286) time: 0.9639 data: 0.9432 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 3.3992 (3.3403) acc1: 36.8000 (35.8452) acc5: 56.8000 (59.6903) time: 0.9033 data: 0.8821 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.2801 (3.3732) acc1: 34.4000 (35.3561) acc5: 58.4000 (59.2976) time: 0.6404 data: 0.6196 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3992 (3.3611) acc1: 35.2000 (35.7280) acc5: 60.0000 (59.6800) time: 0.5934 data: 0.5734 max mem: 2905
Test: Total time: 0:00:47 (0.9515 s / it)
* Acc@1 35.794 Acc@5 60.112 loss 3.324
Accuracy of the model on the 50000 test images: 35.8%
Max accuracy: 50.54%
Epoch: [152] [ 0/625] eta: 3:22:29 lr: 0.002180 min_lr: 0.002180 loss: 3.0376 (3.0376) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 19.4386 data: 18.2439 max mem: 2905
Epoch: [152] [200/625] eta: 0:13:36 lr: 0.002173 min_lr: 0.002173 loss: 3.0593 (3.0118) class_acc: 0.5156 (0.5253) weight_decay: 0.0500 (0.0500) grad_norm: 1.7804 (2.2714) time: 1.7929 data: 0.0321 max mem: 2905
Epoch: [152] [400/625] eta: 0:07:03 lr: 0.002165 min_lr: 0.002165 loss: 3.0556 (3.0211) class_acc: 0.5117 (0.5226) weight_decay: 0.0500 (0.0500) grad_norm: 2.0346 (2.4357) time: 1.8247 data: 0.0008 max mem: 2905
Epoch: [152] [600/625] eta: 0:00:47 lr: 0.002158 min_lr: 0.002158 loss: 3.0546 (3.0280) class_acc: 0.5000 (0.5204) weight_decay: 0.0500 (0.0500) grad_norm: 2.0263 (inf) time: 2.1767 data: 0.0008 max mem: 2905
Epoch: [152] [624/625] eta: 0:00:01 lr: 0.002157 min_lr: 0.002157 loss: 2.9995 (3.0287) class_acc: 0.5234 (0.5204) weight_decay: 0.0500 (0.0500) grad_norm: 2.2997 (inf) time: 0.6602 data: 0.0016 max mem: 2905
Epoch: [152] Total time: 0:19:32 (1.8758 s / it)
Averaged stats: lr: 0.002157 min_lr: 0.002157 loss: 2.9995 (3.0349) class_acc: 0.5234 (0.5180) weight_decay: 0.0500 (0.0500) grad_norm: 2.2997 (inf)
Test: [ 0/50] eta: 0:10:09 loss: 3.5093 (3.5093) acc1: 28.8000 (28.8000) acc5: 54.4000 (54.4000) time: 12.1903 data: 12.1593 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.1881 (3.2337) acc1: 40.0000 (38.3273) acc5: 61.6000 (61.6727) time: 2.2366 data: 2.2156 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 3.3766 (3.3887) acc1: 35.2000 (35.4286) acc5: 60.8000 (59.1619) time: 1.3064 data: 1.2866 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 3.4955 (3.3873) acc1: 35.2000 (36.0774) acc5: 56.0000 (58.7613) time: 1.2908 data: 1.2724 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.4960 (3.4144) acc1: 36.0000 (35.8244) acc5: 56.0000 (58.3805) time: 0.8609 data: 0.8432 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.5112 (3.4276) acc1: 33.6000 (35.5840) acc5: 56.8000 (58.3200) time: 0.7874 data: 0.7694 max mem: 2905
Test: Total time: 0:00:56 (1.1271 s / it)
* Acc@1 35.734 Acc@5 58.714 loss 3.402
Accuracy of the model on the 50000 test images: 35.7%
Max accuracy: 50.54%
Epoch: [153] [ 0/625] eta: 3:46:03 lr: 0.002157 min_lr: 0.002157 loss: 2.9352 (2.9352) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 21.7019 data: 21.5852 max mem: 2905
Epoch: [153] [200/625] eta: 0:14:39 lr: 0.002150 min_lr: 0.002150 loss: 3.0729 (3.0122) class_acc: 0.5039 (0.5230) weight_decay: 0.0500 (0.0500) grad_norm: 2.2612 (2.4603) time: 1.9413 data: 0.0007 max mem: 2905
Epoch: [153] [400/625] eta: 0:07:28 lr: 0.002143 min_lr: 0.002143 loss: 2.9897 (3.0248) class_acc: 0.5234 (0.5201) weight_decay: 0.0500 (0.0500) grad_norm: 1.9767 (2.5491) time: 1.9214 data: 0.0009 max mem: 2905
Epoch: [153] [600/625] eta: 0:00:49 lr: 0.002136 min_lr: 0.002136 loss: 3.0572 (3.0379) class_acc: 0.5117 (0.5177) weight_decay: 0.0500 (0.0500) grad_norm: 2.6070 (2.5075) time: 1.9138 data: 0.0007 max mem: 2905
Epoch: [153] [624/625] eta: 0:00:01 lr: 0.002135 min_lr: 0.002135 loss: 3.0697 (3.0391) class_acc: 0.4922 (0.5171) weight_decay: 0.0500 (0.0500) grad_norm: 2.6070 (2.5045) time: 0.8963 data: 0.0018 max mem: 2905
Epoch: [153] Total time: 0:20:05 (1.9293 s / it)
Averaged stats: lr: 0.002135 min_lr: 0.002135 loss: 3.0697 (3.0323) class_acc: 0.4922 (0.5188) weight_decay: 0.0500 (0.0500) grad_norm: 2.6070 (2.5045)
Test: [ 0/50] eta: 0:09:12 loss: 3.2503 (3.2503) acc1: 37.6000 (37.6000) acc5: 57.6000 (57.6000) time: 11.0546 data: 11.0247 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 3.2267 (3.2357) acc1: 35.2000 (36.5818) acc5: 64.8000 (62.1818) time: 1.9732 data: 1.9530 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 3.3884 (3.4245) acc1: 34.4000 (33.8286) acc5: 57.6000 (58.7810) time: 1.1164 data: 1.0975 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.5025 (3.3792) acc1: 32.8000 (35.4839) acc5: 56.8000 (59.2000) time: 1.0619 data: 1.0435 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.3892 (3.3861) acc1: 36.0000 (35.7268) acc5: 57.6000 (59.4732) time: 0.6433 data: 0.6235 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3892 (3.3864) acc1: 34.4000 (35.3600) acc5: 59.2000 (59.5200) time: 0.5689 data: 0.5490 max mem: 2905
Test: Total time: 0:00:46 (0.9319 s / it)
* Acc@1 36.286 Acc@5 60.354 loss 3.330
Accuracy of the model on the 50000 test images: 36.3%
Max accuracy: 50.54%
Epoch: [154] [ 0/625] eta: 3:27:30 lr: 0.002135 min_lr: 0.002135 loss: 2.9735 (2.9735) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 19.9212 data: 18.6431 max mem: 2905
Epoch: [154] [200/625] eta: 0:13:48 lr: 0.002128 min_lr: 0.002128 loss: 2.9991 (3.0093) class_acc: 0.5195 (0.5235) weight_decay: 0.0500 (0.0500) grad_norm: 2.8436 (2.6363) time: 1.9476 data: 0.9274 max mem: 2905
Epoch: [154] [400/625] eta: 0:07:12 lr: 0.002121 min_lr: 0.002121 loss: 3.0310 (3.0143) class_acc: 0.5234 (0.5220) weight_decay: 0.0500 (0.0500) grad_norm: 2.5752 (2.5978) time: 1.8642 data: 0.3912 max mem: 2905
Epoch: [154] [600/625] eta: 0:00:48 lr: 0.002113 min_lr: 0.002113 loss: 3.0270 (3.0280) class_acc: 0.5117 (0.5194) weight_decay: 0.0500 (0.0500) grad_norm: 3.0360 (2.5763) time: 1.9887 data: 0.0515 max mem: 2905
Epoch: [154] [624/625] eta: 0:00:01 lr: 0.002113 min_lr: 0.002113 loss: 3.0285 (3.0280) class_acc: 0.5156 (0.5194) weight_decay: 0.0500 (0.0500) grad_norm: 2.8794 (2.5640) time: 0.9190 data: 0.0265 max mem: 2905
Epoch: [154] Total time: 0:19:35 (1.8808 s / it)
Averaged stats: lr: 0.002113 min_lr: 0.002113 loss: 3.0285 (3.0267) class_acc: 0.5156 (0.5197) weight_decay: 0.0500 (0.0500) grad_norm: 2.8794 (2.5640)
Test: [ 0/50] eta: 0:09:31 loss: 3.8801 (3.8801) acc1: 28.8000 (28.8000) acc5: 47.2000 (47.2000) time: 11.4247 data: 11.3942 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 3.2167 (3.3366) acc1: 35.2000 (34.9818) acc5: 60.0000 (58.6909) time: 2.1347 data: 2.1136 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.4895 (3.4943) acc1: 32.8000 (32.6476) acc5: 56.0000 (56.7238) time: 1.2394 data: 1.2195 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.6375 (3.4702) acc1: 29.6000 (33.0581) acc5: 55.2000 (56.9806) time: 1.0920 data: 1.0730 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.5073 (3.5147) acc1: 32.8000 (32.3902) acc5: 56.0000 (56.5463) time: 0.6334 data: 0.6154 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.4358 (3.4976) acc1: 33.6000 (32.8960) acc5: 57.6000 (56.8640) time: 0.5545 data: 0.5366 max mem: 2905
Test: Total time: 0:00:49 (0.9901 s / it)
* Acc@1 33.546 Acc@5 57.298 loss 3.466
Accuracy of the model on the 50000 test images: 33.5%
Max accuracy: 50.54%
Epoch: [155] [ 0/625] eta: 3:47:09 lr: 0.002113 min_lr: 0.002113 loss: 2.7752 (2.7752) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 21.8076 data: 18.6507 max mem: 2905
Epoch: [155] [200/625] eta: 0:13:55 lr: 0.002105 min_lr: 0.002105 loss: 3.0109 (3.0089) class_acc: 0.5156 (0.5243) weight_decay: 0.0500 (0.0500) grad_norm: 2.7185 (2.7366) time: 1.8816 data: 0.0012 max mem: 2905
Epoch: [155] [400/625] eta: 0:07:20 lr: 0.002098 min_lr: 0.002098 loss: 2.9925 (3.0268) class_acc: 0.5156 (0.5199) weight_decay: 0.0500 (0.0500) grad_norm: 2.5118 (2.6630) time: 1.9407 data: 0.0010 max mem: 2905
Epoch: [155] [600/625] eta: 0:00:49 lr: 0.002091 min_lr: 0.002091 loss: 3.0713 (3.0340) class_acc: 0.5117 (0.5181) weight_decay: 0.0500 (0.0500) grad_norm: 1.7920 (2.6555) time: 1.9477 data: 0.0483 max mem: 2905
Epoch: [155] [624/625] eta: 0:00:01 lr: 0.002090 min_lr: 0.002090 loss: 3.0200 (3.0344) class_acc: 0.5117 (0.5181) weight_decay: 0.0500 (0.0500) grad_norm: 2.0642 (2.6587) time: 0.5837 data: 0.0060 max mem: 2905
Epoch: [155] Total time: 0:20:13 (1.9418 s / it)
Averaged stats: lr: 0.002090 min_lr: 0.002090 loss: 3.0200 (3.0307) class_acc: 0.5117 (0.5197) weight_decay: 0.0500 (0.0500) grad_norm: 2.0642 (2.6587)
Test: [ 0/50] eta: 0:10:44 loss: 3.4774 (3.4774) acc1: 34.4000 (34.4000) acc5: 55.2000 (55.2000) time: 12.8984 data: 12.8685 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.8554 (2.9839) acc1: 40.8000 (40.2182) acc5: 68.0000 (65.0909) time: 2.1309 data: 2.1080 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.2216 (3.2589) acc1: 36.0000 (36.1143) acc5: 60.0000 (60.7619) time: 1.0896 data: 1.0685 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.3724 (3.2594) acc1: 34.4000 (35.7419) acc5: 56.8000 (60.3613) time: 1.1041 data: 1.0842 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.3724 (3.3082) acc1: 34.4000 (35.0244) acc5: 56.8000 (59.9610) time: 0.9255 data: 0.9064 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.3747 (3.3033) acc1: 32.0000 (35.2160) acc5: 56.8000 (60.0640) time: 0.8749 data: 0.8562 max mem: 2905
Test: Total time: 0:00:55 (1.1094 s / it)
* Acc@1 35.538 Acc@5 60.466 loss 3.269
Accuracy of the model on the 50000 test images: 35.5%
Max accuracy: 50.54%
Epoch: [156] [ 0/625] eta: 3:47:19 lr: 0.002090 min_lr: 0.002090 loss: 2.9601 (2.9601) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 21.8231 data: 18.2878 max mem: 2905
Epoch: [156] [200/625] eta: 0:14:40 lr: 0.002083 min_lr: 0.002083 loss: 2.9491 (3.0054) class_acc: 0.5195 (0.5225) weight_decay: 0.0500 (0.0500) grad_norm: 2.4504 (2.4339) time: 1.8751 data: 0.0008 max mem: 2905
Epoch: [156] [400/625] eta: 0:07:33 lr: 0.002076 min_lr: 0.002076 loss: 3.0714 (3.0159) class_acc: 0.5039 (0.5201) weight_decay: 0.0500 (0.0500) grad_norm: 2.3515 (2.4555) time: 1.9844 data: 0.0007 max mem: 2905
Epoch: [156] [600/625] eta: 0:00:49 lr: 0.002069 min_lr: 0.002069 loss: 3.0238 (3.0244) class_acc: 0.5195 (0.5186) weight_decay: 0.0500 (0.0500) grad_norm: 1.6113 (2.4710) time: 2.0847 data: 0.0007 max mem: 2905
Epoch: [156] [624/625] eta: 0:00:01 lr: 0.002068 min_lr: 0.002068 loss: 3.0534 (3.0250) class_acc: 0.5195 (0.5186) weight_decay: 0.0500 (0.0500) grad_norm: 2.3017 (2.4959) time: 0.7239 data: 0.0013 max mem: 2905
Epoch: [156] Total time: 0:20:12 (1.9398 s / it)
Averaged stats: lr: 0.002068 min_lr: 0.002068 loss: 3.0534 (3.0243) class_acc: 0.5195 (0.5201) weight_decay: 0.0500 (0.0500) grad_norm: 2.3017 (2.4959)
Test: [ 0/50] eta: 0:10:22 loss: 2.9412 (2.9412) acc1: 40.0000 (40.0000) acc5: 66.4000 (66.4000) time: 12.4532 data: 12.4195 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 3.0171 (3.1125) acc1: 40.0000 (39.7091) acc5: 62.4000 (61.3818) time: 2.0921 data: 2.0707 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.2026 (3.3091) acc1: 36.0000 (36.4952) acc5: 59.2000 (59.1238) time: 1.0749 data: 1.0554 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.5017 (3.3518) acc1: 33.6000 (35.9742) acc5: 58.4000 (58.7355) time: 1.0429 data: 1.0242 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.3821 (3.3553) acc1: 35.2000 (35.6683) acc5: 58.4000 (58.7902) time: 0.6676 data: 0.6497 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3477 (3.3675) acc1: 34.4000 (35.2480) acc5: 59.2000 (58.6880) time: 0.6143 data: 0.5965 max mem: 2905
Test: Total time: 0:00:48 (0.9630 s / it)
* Acc@1 35.434 Acc@5 58.678 loss 3.355
Accuracy of the model on the 50000 test images: 35.4%
Max accuracy: 50.54%
Epoch: [157] [ 0/625] eta: 3:36:36 lr: 0.002068 min_lr: 0.002068 loss: 3.1194 (3.1194) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 20.7952 data: 19.5208 max mem: 2905
Epoch: [157] [200/625] eta: 0:14:09 lr: 0.002061 min_lr: 0.002061 loss: 3.0206 (3.0142) class_acc: 0.5273 (0.5254) weight_decay: 0.0500 (0.0500) grad_norm: 2.5901 (3.0417) time: 1.8902 data: 0.0017 max mem: 2905
Epoch: [157] [400/625] eta: 0:07:21 lr: 0.002053 min_lr: 0.002053 loss: 3.0223 (3.0161) class_acc: 0.5195 (0.5234) weight_decay: 0.0500 (0.0500) grad_norm: 2.1142 (2.6565) time: 1.9916 data: 0.0007 max mem: 2905
Epoch: [157] [600/625] eta: 0:00:49 lr: 0.002046 min_lr: 0.002046 loss: 3.0597 (3.0238) class_acc: 0.5078 (0.5220) weight_decay: 0.0500 (0.0500) grad_norm: 1.8276 (2.7122) time: 2.1000 data: 0.0006 max mem: 2905
Epoch: [157] [624/625] eta: 0:00:01 lr: 0.002045 min_lr: 0.002045 loss: 3.0107 (3.0244) class_acc: 0.5156 (0.5217) weight_decay: 0.0500 (0.0500) grad_norm: 2.3966 (2.7331) time: 0.8472 data: 0.0020 max mem: 2905
Epoch: [157] Total time: 0:20:12 (1.9402 s / it)
Averaged stats: lr: 0.002045 min_lr: 0.002045 loss: 3.0107 (3.0219) class_acc: 0.5156 (0.5211) weight_decay: 0.0500 (0.0500) grad_norm: 2.3966 (2.7331)
Test: [ 0/50] eta: 0:10:56 loss: 2.8063 (2.8063) acc1: 40.8000 (40.8000) acc5: 65.6000 (65.6000) time: 13.1270 data: 13.0957 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 2.8591 (2.9528) acc1: 40.8000 (41.1636) acc5: 68.0000 (66.2545) time: 2.2014 data: 2.1818 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.0065 (3.0680) acc1: 38.4000 (38.7429) acc5: 64.8000 (63.8095) time: 1.2065 data: 1.1871 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.0294 (3.0176) acc1: 37.6000 (39.9484) acc5: 62.4000 (64.6194) time: 1.2598 data: 1.2409 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.0294 (3.0609) acc1: 36.8000 (39.0049) acc5: 62.4000 (63.9220) time: 0.9770 data: 0.9591 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0332 (3.0391) acc1: 36.8000 (38.9600) acc5: 63.2000 (64.4640) time: 0.9116 data: 0.8909 max mem: 2905
Test: Total time: 0:00:57 (1.1489 s / it)
* Acc@1 39.884 Acc@5 65.088 loss 2.998
Accuracy of the model on the 50000 test images: 39.9%
Max accuracy: 50.54%
Epoch: [158] [ 0/625] eta: 3:50:33 lr: 0.002045 min_lr: 0.002045 loss: 2.9035 (2.9035) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 22.1338 data: 22.0038 max mem: 2905
Epoch: [158] [200/625] eta: 0:14:31 lr: 0.002038 min_lr: 0.002038 loss: 3.0524 (3.0105) class_acc: 0.5078 (0.5228) weight_decay: 0.0500 (0.0500) grad_norm: 2.7281 (2.8381) time: 1.9043 data: 0.9828 max mem: 2905
Epoch: [158] [400/625] eta: 0:07:25 lr: 0.002031 min_lr: 0.002031 loss: 3.0553 (3.0218) class_acc: 0.5117 (0.5199) weight_decay: 0.0500 (0.0500) grad_norm: 2.0558 (2.7807) time: 2.0683 data: 0.0006 max mem: 2905
Epoch: [158] [600/625] eta: 0:00:49 lr: 0.002024 min_lr: 0.002024 loss: 3.0205 (3.0309) class_acc: 0.5273 (0.5187) weight_decay: 0.0500 (0.0500) grad_norm: 1.8804 (2.6360) time: 2.1269 data: 0.0008 max mem: 2905
Epoch: [158] [624/625] eta: 0:00:01 lr: 0.002023 min_lr: 0.002023 loss: 3.0348 (3.0309) class_acc: 0.5117 (0.5184) weight_decay: 0.0500 (0.0500) grad_norm: 2.6145 (2.6381) time: 0.8863 data: 0.0101 max mem: 2905
Epoch: [158] Total time: 0:20:18 (1.9494 s / it)
Averaged stats: lr: 0.002023 min_lr: 0.002023 loss: 3.0348 (3.0214) class_acc: 0.5117 (0.5216) weight_decay: 0.0500 (0.0500) grad_norm: 2.6145 (2.6381)
Test: [ 0/50] eta: 0:10:51 loss: 2.8790 (2.8790) acc1: 42.4000 (42.4000) acc5: 64.8000 (64.8000) time: 13.0309 data: 13.0038 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.8306 (2.7522) acc1: 44.8000 (43.3455) acc5: 64.8000 (66.9091) time: 2.2299 data: 2.2101 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 3.0101 (2.9438) acc1: 38.4000 (40.1524) acc5: 64.0000 (65.3714) time: 1.1820 data: 1.1611 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.1355 (2.9555) acc1: 37.6000 (40.7226) acc5: 64.8000 (65.1613) time: 1.1386 data: 1.1167 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.0745 (2.9753) acc1: 39.2000 (40.3707) acc5: 65.6000 (64.7024) time: 0.7614 data: 0.7422 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0783 (3.0112) acc1: 37.6000 (39.9200) acc5: 61.6000 (64.3520) time: 0.7104 data: 0.6923 max mem: 2905
Test: Total time: 0:00:52 (1.0555 s / it)
* Acc@1 40.394 Acc@5 64.784 loss 2.969
Accuracy of the model on the 50000 test images: 40.4%
Max accuracy: 50.54%
Epoch: [159] [ 0/625] eta: 3:47:21 lr: 0.002023 min_lr: 0.002023 loss: 3.0250 (3.0250) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 21.8270 data: 18.4449 max mem: 2905
Epoch: [159] [200/625] eta: 0:14:21 lr: 0.002016 min_lr: 0.002016 loss: 3.0319 (2.9930) class_acc: 0.5234 (0.5266) weight_decay: 0.0500 (0.0500) grad_norm: 2.3921 (2.4452) time: 1.8829 data: 0.3644 max mem: 2905
Epoch: [159] [400/625] eta: 0:07:29 lr: 0.002009 min_lr: 0.002009 loss: 3.0498 (3.0097) class_acc: 0.5156 (0.5233) weight_decay: 0.0500 (0.0500) grad_norm: 2.4559 (inf) time: 2.0620 data: 0.0007 max mem: 2905
Epoch: [159] [600/625] eta: 0:00:49 lr: 0.002001 min_lr: 0.002001 loss: 3.0302 (3.0178) class_acc: 0.5117 (0.5224) weight_decay: 0.0500 (0.0500) grad_norm: 3.4642 (inf) time: 1.9790 data: 0.0007 max mem: 2905
Epoch: [159] [624/625] eta: 0:00:01 lr: 0.002001 min_lr: 0.002001 loss: 3.0329 (3.0196) class_acc: 0.5000 (0.5218) weight_decay: 0.0500 (0.0500) grad_norm: 2.6362 (inf) time: 1.1082 data: 0.0014 max mem: 2905
Epoch: [159] Total time: 0:20:16 (1.9458 s / it)
Averaged stats: lr: 0.002001 min_lr: 0.002001 loss: 3.0329 (3.0174) class_acc: 0.5000 (0.5229) weight_decay: 0.0500 (0.0500) grad_norm: 2.6362 (inf)
Test: [ 0/50] eta: 0:10:02 loss: 2.7988 (2.7988) acc1: 43.2000 (43.2000) acc5: 69.6000 (69.6000) time: 12.0570 data: 12.0300 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 3.0871 (3.0594) acc1: 36.8000 (38.9091) acc5: 64.0000 (63.6364) time: 2.0502 data: 2.0272 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 3.1623 (3.1333) acc1: 36.8000 (37.0667) acc5: 62.4000 (62.7810) time: 1.1139 data: 1.0922 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.1623 (3.1303) acc1: 36.0000 (37.8323) acc5: 62.4000 (62.6065) time: 1.0101 data: 0.9899 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.0786 (3.1501) acc1: 38.4000 (37.6781) acc5: 61.6000 (62.1268) time: 0.5755 data: 0.5549 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1692 (3.1886) acc1: 36.0000 (37.1840) acc5: 61.6000 (61.7120) time: 0.4883 data: 0.4684 max mem: 2905
Test: Total time: 0:00:46 (0.9363 s / it)
* Acc@1 38.046 Acc@5 62.350 loss 3.146
Accuracy of the model on the 50000 test images: 38.0%
Max accuracy: 50.54%
Epoch: [160] [ 0/625] eta: 3:26:54 lr: 0.002001 min_lr: 0.002001 loss: 2.9900 (2.9900) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 19.8626 data: 19.5622 max mem: 2905
Epoch: [160] [200/625] eta: 0:13:38 lr: 0.001993 min_lr: 0.001993 loss: 2.9546 (2.9929) class_acc: 0.5273 (0.5270) weight_decay: 0.0500 (0.0500) grad_norm: 2.3242 (2.3992) time: 1.9188 data: 0.0005 max mem: 2905
Epoch: [160] [400/625] eta: 0:07:17 lr: 0.001986 min_lr: 0.001986 loss: 3.0093 (3.0115) class_acc: 0.5156 (0.5231) weight_decay: 0.0500 (0.0500) grad_norm: 1.7988 (2.5474) time: 1.9939 data: 0.0007 max mem: 2905
Epoch: [160] [600/625] eta: 0:00:49 lr: 0.001979 min_lr: 0.001979 loss: 3.0520 (3.0162) class_acc: 0.5312 (0.5226) weight_decay: 0.0500 (0.0500) grad_norm: 2.4405 (2.5678) time: 2.0670 data: 0.0278 max mem: 2905
Epoch: [160] [624/625] eta: 0:00:01 lr: 0.001978 min_lr: 0.001978 loss: 3.0520 (3.0175) class_acc: 0.5156 (0.5221) weight_decay: 0.0500 (0.0500) grad_norm: 2.0516 (2.5486) time: 0.7611 data: 0.0015 max mem: 2905
Epoch: [160] Total time: 0:19:59 (1.9190 s / it)
Averaged stats: lr: 0.001978 min_lr: 0.001978 loss: 3.0520 (3.0129) class_acc: 0.5156 (0.5229) weight_decay: 0.0500 (0.0500) grad_norm: 2.0516 (2.5486)
Test: [ 0/50] eta: 0:11:26 loss: 2.9394 (2.9394) acc1: 29.6000 (29.6000) acc5: 68.8000 (68.8000) time: 13.7376 data: 13.7101 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.0636 (3.0320) acc1: 40.8000 (38.4000) acc5: 61.6000 (63.6364) time: 2.2285 data: 2.2094 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 3.1471 (3.1398) acc1: 37.6000 (36.6095) acc5: 61.6000 (62.2095) time: 1.1312 data: 1.1122 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.1471 (3.1000) acc1: 35.2000 (37.5742) acc5: 60.0000 (62.2194) time: 1.1692 data: 1.1500 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.0694 (3.1393) acc1: 37.6000 (37.1122) acc5: 60.8000 (62.1463) time: 0.9834 data: 0.9645 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0694 (3.1121) acc1: 37.6000 (37.3440) acc5: 64.8000 (62.7520) time: 0.9017 data: 0.8804 max mem: 2905
Test: Total time: 0:00:56 (1.1384 s / it)
* Acc@1 38.436 Acc@5 63.624 loss 3.063
Accuracy of the model on the 50000 test images: 38.4%
Max accuracy: 50.54%
Epoch: [161] [ 0/625] eta: 3:32:26 lr: 0.001978 min_lr: 0.001978 loss: 2.9341 (2.9341) class_acc: 0.5586 (0.5586) weight_decay: 0.0500 (0.0500) time: 20.3938 data: 19.5079 max mem: 2905
Epoch: [161] [200/625] eta: 0:14:37 lr: 0.001971 min_lr: 0.001971 loss: 3.0436 (3.0027) class_acc: 0.5195 (0.5256) weight_decay: 0.0500 (0.0500) grad_norm: 2.0281 (2.5309) time: 1.8008 data: 0.0010 max mem: 2905
Epoch: [161] [400/625] eta: 0:07:30 lr: 0.001964 min_lr: 0.001964 loss: 3.0016 (3.0113) class_acc: 0.5195 (0.5233) weight_decay: 0.0500 (0.0500) grad_norm: 2.9704 (2.5799) time: 1.9437 data: 0.0008 max mem: 2905
Epoch: [161] [600/625] eta: 0:00:49 lr: 0.001956 min_lr: 0.001956 loss: 2.9813 (3.0121) class_acc: 0.5234 (0.5227) weight_decay: 0.0500 (0.0500) grad_norm: 2.4052 (2.5571) time: 2.0764 data: 0.0008 max mem: 2905
Epoch: [161] [624/625] eta: 0:00:01 lr: 0.001956 min_lr: 0.001956 loss: 3.0267 (3.0123) class_acc: 0.5156 (0.5226) weight_decay: 0.0500 (0.0500) grad_norm: 2.5296 (2.5531) time: 0.7451 data: 0.0042 max mem: 2905
Epoch: [161] Total time: 0:20:13 (1.9420 s / it)
Averaged stats: lr: 0.001956 min_lr: 0.001956 loss: 3.0267 (3.0114) class_acc: 0.5156 (0.5236) weight_decay: 0.0500 (0.0500) grad_norm: 2.5296 (2.5531)
Test: [ 0/50] eta: 0:10:19 loss: 3.7432 (3.7432) acc1: 31.2000 (31.2000) acc5: 55.2000 (55.2000) time: 12.3817 data: 12.3492 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 3.2487 (3.2970) acc1: 36.0000 (35.0545) acc5: 60.0000 (60.8000) time: 2.0615 data: 2.0393 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.2875 (3.4076) acc1: 34.4000 (33.6000) acc5: 59.2000 (58.9714) time: 1.0720 data: 1.0512 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.4490 (3.3851) acc1: 32.8000 (34.2194) acc5: 58.4000 (59.4839) time: 0.9984 data: 0.9786 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.3444 (3.4059) acc1: 32.0000 (34.3415) acc5: 57.6000 (58.9463) time: 0.6106 data: 0.5908 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.2676 (3.3755) acc1: 35.2000 (34.6880) acc5: 57.6000 (59.2800) time: 0.5106 data: 0.4912 max mem: 2905
Test: Total time: 0:00:47 (0.9414 s / it)
* Acc@1 34.820 Acc@5 59.346 loss 3.352
Accuracy of the model on the 50000 test images: 34.8%
Max accuracy: 50.54%
Epoch: [162] [ 0/625] eta: 3:39:08 lr: 0.001956 min_lr: 0.001956 loss: 2.9780 (2.9780) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 21.0377 data: 17.2221 max mem: 2905
Epoch: [162] [200/625] eta: 0:14:13 lr: 0.001948 min_lr: 0.001948 loss: 3.0291 (2.9952) class_acc: 0.5156 (0.5262) weight_decay: 0.0500 (0.0500) grad_norm: 1.6709 (2.4841) time: 1.8659 data: 0.0280 max mem: 2905
Epoch: [162] [400/625] eta: 0:07:18 lr: 0.001941 min_lr: 0.001941 loss: 2.9941 (2.9947) class_acc: 0.5273 (0.5274) weight_decay: 0.0500 (0.0500) grad_norm: 2.5552 (2.5308) time: 1.9128 data: 0.0053 max mem: 2905
Epoch: [162] [600/625] eta: 0:00:48 lr: 0.001934 min_lr: 0.001934 loss: 3.0022 (3.0015) class_acc: 0.5195 (0.5256) weight_decay: 0.0500 (0.0500) grad_norm: 2.0132 (2.5374) time: 1.9884 data: 0.0007 max mem: 2905
Epoch: [162] [624/625] eta: 0:00:01 lr: 0.001933 min_lr: 0.001933 loss: 2.9778 (3.0021) class_acc: 0.5234 (0.5255) weight_decay: 0.0500 (0.0500) grad_norm: 2.0642 (2.5320) time: 0.9402 data: 0.0023 max mem: 2905
Epoch: [162] Total time: 0:19:38 (1.8859 s / it)
Averaged stats: lr: 0.001933 min_lr: 0.001933 loss: 2.9778 (3.0061) class_acc: 0.5234 (0.5243) weight_decay: 0.0500 (0.0500) grad_norm: 2.0642 (2.5320)
Test: [ 0/50] eta: 0:10:10 loss: 2.2307 (2.2307) acc1: 56.8000 (56.8000) acc5: 75.2000 (75.2000) time: 12.2016 data: 12.1716 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.3625 (2.3458) acc1: 54.4000 (51.4182) acc5: 72.8000 (73.6727) time: 1.9343 data: 1.9151 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.4769 (2.4956) acc1: 48.0000 (48.3429) acc5: 72.0000 (72.4571) time: 0.9447 data: 0.9253 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.5498 (2.5496) acc1: 44.0000 (47.3032) acc5: 71.2000 (71.6645) time: 0.9321 data: 0.9123 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.6894 (2.5854) acc1: 45.6000 (47.2195) acc5: 68.0000 (71.0829) time: 0.7578 data: 0.7393 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7985 (2.6443) acc1: 44.0000 (46.0640) acc5: 68.0000 (70.5280) time: 0.7372 data: 0.7183 max mem: 2905
Test: Total time: 0:00:50 (1.0073 s / it)
* Acc@1 46.360 Acc@5 70.986 loss 2.623
Accuracy of the model on the 50000 test images: 46.4%
Max accuracy: 50.54%
Epoch: [163] [ 0/625] eta: 3:34:21 lr: 0.001933 min_lr: 0.001933 loss: 3.0274 (3.0274) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 20.5791 data: 19.3684 max mem: 2905
Epoch: [163] [200/625] eta: 0:13:28 lr: 0.001926 min_lr: 0.001926 loss: 2.9864 (3.0064) class_acc: 0.5234 (0.5262) weight_decay: 0.0500 (0.0500) grad_norm: 2.3154 (2.9247) time: 1.7898 data: 0.0006 max mem: 2905
Epoch: [163] [400/625] eta: 0:07:05 lr: 0.001919 min_lr: 0.001919 loss: 2.9711 (2.9996) class_acc: 0.5312 (0.5268) weight_decay: 0.0500 (0.0500) grad_norm: 2.1815 (2.7231) time: 1.7375 data: 0.0006 max mem: 2905
Epoch: [163] [600/625] eta: 0:00:47 lr: 0.001912 min_lr: 0.001912 loss: 3.0241 (3.0069) class_acc: 0.5234 (0.5252) weight_decay: 0.0500 (0.0500) grad_norm: 2.1012 (2.8291) time: 1.9009 data: 0.0026 max mem: 2905
Epoch: [163] [624/625] eta: 0:00:01 lr: 0.001911 min_lr: 0.001911 loss: 2.9689 (3.0057) class_acc: 0.5273 (0.5254) weight_decay: 0.0500 (0.0500) grad_norm: 2.5482 (2.8276) time: 0.8900 data: 0.0069 max mem: 2905
Epoch: [163] Total time: 0:19:20 (1.8569 s / it)
Averaged stats: lr: 0.001911 min_lr: 0.001911 loss: 2.9689 (3.0062) class_acc: 0.5273 (0.5245) weight_decay: 0.0500 (0.0500) grad_norm: 2.5482 (2.8276)
Test: [ 0/50] eta: 0:11:30 loss: 2.7785 (2.7785) acc1: 45.6000 (45.6000) acc5: 66.4000 (66.4000) time: 13.8183 data: 13.7918 max mem: 2905
Test: [10/50] eta: 0:01:32 loss: 2.7681 (2.8103) acc1: 44.8000 (45.2364) acc5: 66.4000 (66.3273) time: 2.3049 data: 2.2812 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.0242 (3.0300) acc1: 39.2000 (40.8000) acc5: 64.8000 (64.6857) time: 1.1824 data: 1.1603 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.1845 (3.0246) acc1: 38.4000 (40.6710) acc5: 64.0000 (64.4903) time: 1.0269 data: 1.0061 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.0399 (3.0763) acc1: 39.2000 (40.1951) acc5: 63.2000 (63.7463) time: 0.5740 data: 0.5538 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1764 (3.1003) acc1: 38.4000 (39.8080) acc5: 62.4000 (63.2960) time: 0.6283 data: 0.6077 max mem: 2905
Test: Total time: 0:00:50 (1.0091 s / it)
* Acc@1 39.894 Acc@5 63.868 loss 3.059
Accuracy of the model on the 50000 test images: 39.9%
Max accuracy: 50.54%
Epoch: [164] [ 0/625] eta: 3:48:40 lr: 0.001911 min_lr: 0.001911 loss: 3.0496 (3.0496) class_acc: 0.5117 (0.5117) weight_decay: 0.0500 (0.0500) time: 21.9520 data: 17.7404 max mem: 2905
Epoch: [164] [200/625] eta: 0:13:41 lr: 0.001904 min_lr: 0.001904 loss: 2.9989 (2.9934) class_acc: 0.5273 (0.5261) weight_decay: 0.0500 (0.0500) grad_norm: 2.5929 (2.7544) time: 1.8066 data: 0.0006 max mem: 2905
Epoch: [164] [400/625] eta: 0:07:08 lr: 0.001896 min_lr: 0.001896 loss: 2.9909 (3.0018) class_acc: 0.5312 (0.5253) weight_decay: 0.0500 (0.0500) grad_norm: 2.6663 (2.7015) time: 1.7716 data: 0.0005 max mem: 2905
Epoch: [164] [600/625] eta: 0:00:47 lr: 0.001889 min_lr: 0.001889 loss: 3.0058 (3.0014) class_acc: 0.5195 (0.5256) weight_decay: 0.0500 (0.0500) grad_norm: 2.1491 (2.6635) time: 1.9180 data: 0.0005 max mem: 2905
Epoch: [164] [624/625] eta: 0:00:01 lr: 0.001888 min_lr: 0.001888 loss: 3.0013 (3.0017) class_acc: 0.5273 (0.5258) weight_decay: 0.0500 (0.0500) grad_norm: 2.2426 (2.6582) time: 0.7014 data: 0.0012 max mem: 2905
Epoch: [164] Total time: 0:19:30 (1.8725 s / it)
Averaged stats: lr: 0.001888 min_lr: 0.001888 loss: 3.0013 (3.0016) class_acc: 0.5273 (0.5250) weight_decay: 0.0500 (0.0500) grad_norm: 2.2426 (2.6582)
Test: [ 0/50] eta: 0:10:39 loss: 2.9423 (2.9423) acc1: 40.0000 (40.0000) acc5: 64.0000 (64.0000) time: 12.7900 data: 12.7567 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 3.1255 (3.0971) acc1: 38.4000 (39.8545) acc5: 64.0000 (62.9818) time: 2.2071 data: 2.1855 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.3436 (3.2860) acc1: 36.0000 (35.8476) acc5: 60.0000 (60.6857) time: 1.2203 data: 1.2005 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.4506 (3.2799) acc1: 32.0000 (35.9226) acc5: 58.4000 (60.4129) time: 0.9896 data: 0.9700 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.3231 (3.3201) acc1: 34.4000 (35.4732) acc5: 58.4000 (59.3366) time: 0.4992 data: 0.4803 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.3841 (3.3178) acc1: 32.8000 (35.5360) acc5: 56.8000 (59.4880) time: 0.4767 data: 0.4587 max mem: 2905
Test: Total time: 0:00:47 (0.9560 s / it)
* Acc@1 35.890 Acc@5 60.088 loss 3.286
Accuracy of the model on the 50000 test images: 35.9%
Max accuracy: 50.54%
Epoch: [165] [ 0/625] eta: 3:33:41 lr: 0.001888 min_lr: 0.001888 loss: 2.9401 (2.9401) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 20.5142 data: 17.2189 max mem: 2905
Epoch: [165] [200/625] eta: 0:13:29 lr: 0.001881 min_lr: 0.001881 loss: 3.0212 (2.9828) class_acc: 0.5234 (0.5307) weight_decay: 0.0500 (0.0500) grad_norm: 2.7025 (3.0605) time: 1.6481 data: 0.0008 max mem: 2905
Epoch: [165] [400/625] eta: 0:07:02 lr: 0.001874 min_lr: 0.001874 loss: 2.9921 (2.9909) class_acc: 0.5156 (0.5281) weight_decay: 0.0500 (0.0500) grad_norm: 2.2314 (2.8759) time: 1.7111 data: 0.0006 max mem: 2905
Epoch: [165] [600/625] eta: 0:00:47 lr: 0.001867 min_lr: 0.001867 loss: 3.0458 (2.9946) class_acc: 0.5156 (0.5272) weight_decay: 0.0500 (0.0500) grad_norm: 2.9788 (inf) time: 1.9794 data: 0.0009 max mem: 2905
Epoch: [165] [624/625] eta: 0:00:01 lr: 0.001866 min_lr: 0.001866 loss: 3.0176 (2.9969) class_acc: 0.5039 (0.5267) weight_decay: 0.0500 (0.0500) grad_norm: 2.5893 (inf) time: 0.6704 data: 0.0013 max mem: 2905
Epoch: [165] Total time: 0:19:24 (1.8633 s / it)
Averaged stats: lr: 0.001866 min_lr: 0.001866 loss: 3.0176 (2.9999) class_acc: 0.5039 (0.5258) weight_decay: 0.0500 (0.0500) grad_norm: 2.5893 (inf)
Test: [ 0/50] eta: 0:10:24 loss: 3.0633 (3.0633) acc1: 35.2000 (35.2000) acc5: 62.4000 (62.4000) time: 12.4867 data: 12.4549 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.9873 (2.9387) acc1: 40.8000 (40.6545) acc5: 64.8000 (64.8000) time: 2.1271 data: 2.1046 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.9873 (3.0398) acc1: 39.2000 (38.9333) acc5: 64.8000 (63.8857) time: 1.1543 data: 1.1337 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.1121 (3.0090) acc1: 39.2000 (39.8710) acc5: 64.0000 (64.6194) time: 1.0377 data: 1.0176 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1345 (3.1053) acc1: 39.2000 (38.2049) acc5: 61.6000 (63.1220) time: 0.5805 data: 0.5603 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1253 (3.0828) acc1: 36.8000 (38.6720) acc5: 63.2000 (63.5680) time: 0.4977 data: 0.4786 max mem: 2905
Test: Total time: 0:00:48 (0.9611 s / it)
* Acc@1 39.370 Acc@5 63.908 loss 3.041
Accuracy of the model on the 50000 test images: 39.4%
Max accuracy: 50.54%
Epoch: [166] [ 0/625] eta: 3:32:05 lr: 0.001866 min_lr: 0.001866 loss: 2.9962 (2.9962) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 20.3603 data: 20.2128 max mem: 2905
Epoch: [166] [200/625] eta: 0:13:55 lr: 0.001859 min_lr: 0.001859 loss: 2.9943 (2.9773) class_acc: 0.5352 (0.5296) weight_decay: 0.0500 (0.0500) grad_norm: 2.4041 (2.7321) time: 1.8612 data: 0.0006 max mem: 2905
Epoch: [166] [400/625] eta: 0:07:08 lr: 0.001852 min_lr: 0.001852 loss: 2.9390 (2.9844) class_acc: 0.5234 (0.5279) weight_decay: 0.0500 (0.0500) grad_norm: 1.9155 (2.6368) time: 1.8199 data: 0.0006 max mem: 2905
Epoch: [166] [600/625] eta: 0:00:47 lr: 0.001844 min_lr: 0.001844 loss: 2.9537 (2.9877) class_acc: 0.5273 (0.5272) weight_decay: 0.0500 (0.0500) grad_norm: 1.6367 (2.6694) time: 1.9442 data: 0.0006 max mem: 2905
Epoch: [166] [624/625] eta: 0:00:01 lr: 0.001844 min_lr: 0.001844 loss: 2.9819 (2.9888) class_acc: 0.5195 (0.5271) weight_decay: 0.0500 (0.0500) grad_norm: 1.8814 (2.6580) time: 1.1992 data: 0.0013 max mem: 2905
Epoch: [166] Total time: 0:19:33 (1.8768 s / it)
Averaged stats: lr: 0.001844 min_lr: 0.001844 loss: 2.9819 (2.9984) class_acc: 0.5195 (0.5262) weight_decay: 0.0500 (0.0500) grad_norm: 1.8814 (2.6580)
Test: [ 0/50] eta: 0:10:06 loss: 3.1573 (3.1573) acc1: 39.2000 (39.2000) acc5: 63.2000 (63.2000) time: 12.1281 data: 12.0890 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 3.3033 (3.2254) acc1: 38.4000 (36.8000) acc5: 59.2000 (60.0727) time: 2.0505 data: 2.0303 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 3.4230 (3.3467) acc1: 36.0000 (34.8571) acc5: 56.0000 (58.6667) time: 1.1077 data: 1.0894 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 3.4230 (3.3456) acc1: 36.0000 (35.4065) acc5: 56.0000 (58.6839) time: 1.1326 data: 1.1125 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.4131 (3.3856) acc1: 33.6000 (35.0049) acc5: 57.6000 (58.3415) time: 0.7933 data: 0.7729 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.4591 (3.4062) acc1: 32.8000 (34.6400) acc5: 57.6000 (57.9840) time: 0.7385 data: 0.7203 max mem: 2905
Test: Total time: 0:00:50 (1.0188 s / it)
* Acc@1 35.154 Acc@5 59.214 loss 3.369
Accuracy of the model on the 50000 test images: 35.2%
Max accuracy: 50.54%
Epoch: [167] [ 0/625] eta: 4:44:39 lr: 0.001844 min_lr: 0.001844 loss: 3.0249 (3.0249) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 27.3280 data: 13.1028 max mem: 2905
Epoch: [167] [200/625] eta: 0:14:16 lr: 0.001836 min_lr: 0.001836 loss: 2.9187 (2.9747) class_acc: 0.5312 (0.5336) weight_decay: 0.0500 (0.0500) grad_norm: 2.6803 (2.3803) time: 1.9411 data: 0.0097 max mem: 2905
Epoch: [167] [400/625] eta: 0:07:16 lr: 0.001829 min_lr: 0.001829 loss: 2.9569 (2.9790) class_acc: 0.5391 (0.5320) weight_decay: 0.0500 (0.0500) grad_norm: 1.5310 (2.6824) time: 1.8691 data: 0.0007 max mem: 2905
Epoch: [167] [600/625] eta: 0:00:48 lr: 0.001822 min_lr: 0.001822 loss: 3.0398 (2.9900) class_acc: 0.5195 (0.5296) weight_decay: 0.0500 (0.0500) grad_norm: 2.1429 (2.6636) time: 1.8791 data: 0.0007 max mem: 2905
Epoch: [167] [624/625] eta: 0:00:01 lr: 0.001821 min_lr: 0.001821 loss: 3.0024 (2.9914) class_acc: 0.5117 (0.5293) weight_decay: 0.0500 (0.0500) grad_norm: 2.2141 (2.6628) time: 0.3843 data: 0.0014 max mem: 2905
Epoch: [167] Total time: 0:19:51 (1.9072 s / it)
Averaged stats: lr: 0.001821 min_lr: 0.001821 loss: 3.0024 (2.9929) class_acc: 0.5117 (0.5279) weight_decay: 0.0500 (0.0500) grad_norm: 2.2141 (2.6628)
Test: [ 0/50] eta: 0:11:17 loss: 2.2886 (2.2886) acc1: 51.2000 (51.2000) acc5: 75.2000 (75.2000) time: 13.5560 data: 13.5230 max mem: 2905
Test: [10/50] eta: 0:01:31 loss: 2.6449 (2.5244) acc1: 44.8000 (46.9091) acc5: 72.0000 (71.8545) time: 2.2959 data: 2.2761 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.7329 (2.7073) acc1: 42.4000 (43.5810) acc5: 68.0000 (68.8762) time: 1.2291 data: 1.2097 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.8250 (2.7163) acc1: 42.4000 (43.8452) acc5: 66.4000 (68.9032) time: 1.0924 data: 1.0724 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.6335 (2.7280) acc1: 44.0000 (43.7073) acc5: 68.0000 (68.8390) time: 0.6191 data: 0.5998 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7958 (2.7493) acc1: 41.6000 (43.2000) acc5: 67.2000 (68.4640) time: 0.5082 data: 0.4895 max mem: 2905
Test: Total time: 0:00:51 (1.0294 s / it)
* Acc@1 43.868 Acc@5 68.928 loss 2.713
Accuracy of the model on the 50000 test images: 43.9%
Max accuracy: 50.54%
Epoch: [168] [ 0/625] eta: 3:39:52 lr: 0.001821 min_lr: 0.001821 loss: 2.8138 (2.8138) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 21.1080 data: 20.8204 max mem: 2905
Epoch: [168] [200/625] eta: 0:13:50 lr: 0.001814 min_lr: 0.001814 loss: 2.9459 (2.9694) class_acc: 0.5391 (0.5325) weight_decay: 0.0500 (0.0500) grad_norm: 1.6658 (2.4502) time: 1.7739 data: 0.0009 max mem: 2905
Epoch: [168] [400/625] eta: 0:07:19 lr: 0.001807 min_lr: 0.001807 loss: 2.9447 (2.9821) class_acc: 0.5352 (0.5300) weight_decay: 0.0500 (0.0500) grad_norm: 2.2311 (2.5463) time: 1.9811 data: 0.0007 max mem: 2905
Epoch: [168] [600/625] eta: 0:00:49 lr: 0.001800 min_lr: 0.001800 loss: 2.9581 (2.9858) class_acc: 0.5312 (0.5300) weight_decay: 0.0500 (0.0500) grad_norm: 2.7950 (2.6243) time: 1.9608 data: 0.0007 max mem: 2905
Epoch: [168] [624/625] eta: 0:00:01 lr: 0.001799 min_lr: 0.001799 loss: 3.0626 (2.9881) class_acc: 0.5078 (0.5294) weight_decay: 0.0500 (0.0500) grad_norm: 2.1495 (2.6073) time: 0.6393 data: 0.0017 max mem: 2905
Epoch: [168] Total time: 0:20:25 (1.9602 s / it)
Averaged stats: lr: 0.001799 min_lr: 0.001799 loss: 3.0626 (2.9926) class_acc: 0.5078 (0.5281) weight_decay: 0.0500 (0.0500) grad_norm: 2.1495 (2.6073)
Test: [ 0/50] eta: 0:10:57 loss: 3.2454 (3.2454) acc1: 34.4000 (34.4000) acc5: 66.4000 (66.4000) time: 13.1534 data: 13.1248 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.2339 (3.1913) acc1: 37.6000 (38.2545) acc5: 62.4000 (60.5818) time: 2.2336 data: 2.2109 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 3.3253 (3.3924) acc1: 35.2000 (35.3143) acc5: 57.6000 (58.5905) time: 1.1850 data: 1.1644 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.4640 (3.3858) acc1: 35.2000 (35.4839) acc5: 56.8000 (58.8903) time: 1.1765 data: 1.1562 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.2923 (3.3466) acc1: 36.0000 (35.7659) acc5: 59.2000 (59.2976) time: 0.7691 data: 0.7467 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1683 (3.3408) acc1: 36.0000 (35.6480) acc5: 60.0000 (59.4720) time: 0.7373 data: 0.7171 max mem: 2905
Test: Total time: 0:00:52 (1.0576 s / it)
* Acc@1 36.334 Acc@5 59.828 loss 3.298
Accuracy of the model on the 50000 test images: 36.3%
Max accuracy: 50.54%
Epoch: [169] [ 0/625] eta: 3:39:41 lr: 0.001799 min_lr: 0.001799 loss: 2.9010 (2.9010) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 21.0897 data: 20.2607 max mem: 2905
Epoch: [169] [200/625] eta: 0:14:20 lr: 0.001792 min_lr: 0.001792 loss: 2.9941 (2.9758) class_acc: 0.5195 (0.5320) weight_decay: 0.0500 (0.0500) grad_norm: 2.8153 (2.4579) time: 2.0700 data: 0.0294 max mem: 2905
Epoch: [169] [400/625] eta: 0:07:27 lr: 0.001785 min_lr: 0.001785 loss: 2.9884 (2.9861) class_acc: 0.5195 (0.5298) weight_decay: 0.0500 (0.0500) grad_norm: 2.0978 (2.3877) time: 1.8974 data: 1.5291 max mem: 2905
Epoch: [169] [600/625] eta: 0:00:49 lr: 0.001777 min_lr: 0.001777 loss: 3.0000 (2.9893) class_acc: 0.5117 (0.5283) weight_decay: 0.0500 (0.0500) grad_norm: 1.9412 (2.4352) time: 1.8642 data: 1.4349 max mem: 2905
Epoch: [169] [624/625] eta: 0:00:01 lr: 0.001777 min_lr: 0.001777 loss: 2.9846 (2.9905) class_acc: 0.5391 (0.5283) weight_decay: 0.0500 (0.0500) grad_norm: 1.9362 (2.4144) time: 0.9707 data: 0.6309 max mem: 2905
Epoch: [169] Total time: 0:20:01 (1.9231 s / it)
Averaged stats: lr: 0.001777 min_lr: 0.001777 loss: 2.9846 (2.9867) class_acc: 0.5391 (0.5292) weight_decay: 0.0500 (0.0500) grad_norm: 1.9362 (2.4144)
Test: [ 0/50] eta: 0:10:32 loss: 3.3138 (3.3138) acc1: 31.2000 (31.2000) acc5: 59.2000 (59.2000) time: 12.6456 data: 12.6194 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 2.7917 (2.9317) acc1: 41.6000 (40.9455) acc5: 66.4000 (64.6545) time: 2.2689 data: 2.2476 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 2.9900 (3.0767) acc1: 38.4000 (39.2381) acc5: 63.2000 (62.4762) time: 1.2369 data: 1.2167 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 3.1523 (3.0585) acc1: 37.6000 (39.4323) acc5: 63.2000 (62.9677) time: 0.9834 data: 0.9627 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1308 (3.0979) acc1: 38.4000 (38.8878) acc5: 61.6000 (62.5561) time: 0.5155 data: 0.4930 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.1458 (3.1031) acc1: 36.0000 (38.6080) acc5: 61.6000 (62.8160) time: 0.4367 data: 0.4162 max mem: 2905
Test: Total time: 0:00:48 (0.9731 s / it)
* Acc@1 39.048 Acc@5 63.846 loss 3.065
Accuracy of the model on the 50000 test images: 39.0%
Max accuracy: 50.54%
Epoch: [170] [ 0/625] eta: 3:21:06 lr: 0.001777 min_lr: 0.001777 loss: 2.9403 (2.9403) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 19.3067 data: 19.1782 max mem: 2905
Epoch: [170] [200/625] eta: 0:13:46 lr: 0.001769 min_lr: 0.001769 loss: 3.0203 (2.9719) class_acc: 0.5117 (0.5329) weight_decay: 0.0500 (0.0500) grad_norm: 2.3207 (2.5694) time: 1.8361 data: 0.0007 max mem: 2905
Epoch: [170] [400/625] eta: 0:07:04 lr: 0.001762 min_lr: 0.001762 loss: 2.9403 (2.9849) class_acc: 0.5391 (0.5309) weight_decay: 0.0500 (0.0500) grad_norm: 2.1293 (2.8965) time: 1.8819 data: 0.0007 max mem: 2905
Epoch: [170] [600/625] eta: 0:00:46 lr: 0.001755 min_lr: 0.001755 loss: 3.0265 (2.9884) class_acc: 0.5039 (0.5291) weight_decay: 0.0500 (0.0500) grad_norm: 1.9512 (2.7736) time: 1.8766 data: 0.0681 max mem: 2905
Epoch: [170] [624/625] eta: 0:00:01 lr: 0.001754 min_lr: 0.001754 loss: 2.9997 (2.9893) class_acc: 0.5273 (0.5290) weight_decay: 0.0500 (0.0500) grad_norm: 2.4963 (2.7813) time: 0.6587 data: 0.0020 max mem: 2905
Epoch: [170] Total time: 0:19:25 (1.8642 s / it)
Averaged stats: lr: 0.001754 min_lr: 0.001754 loss: 2.9997 (2.9856) class_acc: 0.5273 (0.5297) weight_decay: 0.0500 (0.0500) grad_norm: 2.4963 (2.7813)
Test: [ 0/50] eta: 0:10:26 loss: 3.2164 (3.2164) acc1: 40.0000 (40.0000) acc5: 64.8000 (64.8000) time: 12.5359 data: 12.5076 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 3.0124 (2.9835) acc1: 41.6000 (41.6727) acc5: 65.6000 (65.8182) time: 2.1855 data: 2.1657 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.0124 (3.0668) acc1: 39.2000 (39.6952) acc5: 65.6000 (64.5333) time: 1.2393 data: 1.2196 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 3.1176 (3.0541) acc1: 39.2000 (39.6903) acc5: 62.4000 (64.1806) time: 1.2716 data: 1.2511 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.9376 (3.0710) acc1: 39.2000 (39.5902) acc5: 62.4000 (64.1561) time: 0.8478 data: 0.8267 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8790 (3.0384) acc1: 40.0000 (39.8720) acc5: 67.2000 (64.4320) time: 0.8235 data: 0.8031 max mem: 2905
Test: Total time: 0:00:54 (1.0928 s / it)
* Acc@1 40.450 Acc@5 65.534 loss 2.986
Accuracy of the model on the 50000 test images: 40.5%
Max accuracy: 50.54%
Epoch: [171] [ 0/625] eta: 3:38:26 lr: 0.001754 min_lr: 0.001754 loss: 3.0879 (3.0879) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 20.9705 data: 18.2474 max mem: 2905
Epoch: [171] [200/625] eta: 0:14:30 lr: 0.001747 min_lr: 0.001747 loss: 2.9826 (2.9559) class_acc: 0.5312 (0.5364) weight_decay: 0.0500 (0.0500) grad_norm: 1.8893 (2.7455) time: 1.9763 data: 0.4276 max mem: 2905
Epoch: [171] [400/625] eta: 0:07:22 lr: 0.001740 min_lr: 0.001740 loss: 2.9612 (2.9695) class_acc: 0.5312 (0.5340) weight_decay: 0.0500 (0.0500) grad_norm: 2.2606 (2.7927) time: 1.8029 data: 0.1976 max mem: 2905
Epoch: [171] [600/625] eta: 0:00:48 lr: 0.001733 min_lr: 0.001733 loss: 2.9750 (2.9818) class_acc: 0.5352 (0.5304) weight_decay: 0.0500 (0.0500) grad_norm: 2.6586 (2.7298) time: 1.9153 data: 0.0194 max mem: 2905
Epoch: [171] [624/625] eta: 0:00:01 lr: 0.001732 min_lr: 0.001732 loss: 2.9626 (2.9816) class_acc: 0.5352 (0.5304) weight_decay: 0.0500 (0.0500) grad_norm: 1.7193 (2.6867) time: 0.9668 data: 0.0252 max mem: 2905
Epoch: [171] Total time: 0:19:52 (1.9083 s / it)
Averaged stats: lr: 0.001732 min_lr: 0.001732 loss: 2.9626 (2.9838) class_acc: 0.5352 (0.5299) weight_decay: 0.0500 (0.0500) grad_norm: 1.7193 (2.6867)
Test: [ 0/50] eta: 0:09:43 loss: 3.2135 (3.2135) acc1: 38.4000 (38.4000) acc5: 63.2000 (63.2000) time: 11.6679 data: 11.6320 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 3.1183 (3.0763) acc1: 38.4000 (39.1273) acc5: 63.2000 (62.4727) time: 1.9262 data: 1.9045 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 3.1183 (3.1506) acc1: 37.6000 (37.6762) acc5: 63.2000 (62.2476) time: 0.9504 data: 0.9296 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 3.1413 (3.1643) acc1: 35.2000 (37.0323) acc5: 62.4000 (62.1419) time: 0.9341 data: 0.9131 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.2040 (3.1711) acc1: 36.8000 (37.3659) acc5: 62.4000 (62.3415) time: 0.6586 data: 0.6386 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.2941 (3.1729) acc1: 38.4000 (37.6320) acc5: 62.4000 (62.0000) time: 0.4853 data: 0.4672 max mem: 2905
Test: Total time: 0:00:44 (0.8985 s / it)
* Acc@1 37.836 Acc@5 62.512 loss 3.137
Accuracy of the model on the 50000 test images: 37.8%
Max accuracy: 50.54%
Epoch: [172] [ 0/625] eta: 3:44:31 lr: 0.001732 min_lr: 0.001732 loss: 2.9385 (2.9385) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 21.5539 data: 18.0907 max mem: 2905
Epoch: [172] [200/625] eta: 0:14:05 lr: 0.001725 min_lr: 0.001725 loss: 2.9877 (2.9806) class_acc: 0.5195 (0.5315) weight_decay: 0.0500 (0.0500) grad_norm: 2.1970 (2.5075) time: 1.8672 data: 0.5351 max mem: 2905
Epoch: [172] [400/625] eta: 0:07:16 lr: 0.001718 min_lr: 0.001718 loss: 2.9649 (2.9821) class_acc: 0.5273 (0.5293) weight_decay: 0.0500 (0.0500) grad_norm: 2.0467 (inf) time: 1.9091 data: 0.1682 max mem: 2905
Epoch: [172] [600/625] eta: 0:00:47 lr: 0.001711 min_lr: 0.001711 loss: 3.0100 (2.9793) class_acc: 0.5234 (0.5312) weight_decay: 0.0500 (0.0500) grad_norm: 2.0628 (inf) time: 1.9168 data: 0.0007 max mem: 2905
Epoch: [172] [624/625] eta: 0:00:01 lr: 0.001710 min_lr: 0.001710 loss: 2.9436 (2.9790) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) grad_norm: 3.1261 (inf) time: 0.5992 data: 0.0015 max mem: 2905
Epoch: [172] Total time: 0:19:46 (1.8983 s / it)
Averaged stats: lr: 0.001710 min_lr: 0.001710 loss: 2.9436 (2.9805) class_acc: 0.5312 (0.5308) weight_decay: 0.0500 (0.0500) grad_norm: 3.1261 (inf)
Test: [ 0/50] eta: 0:11:09 loss: 3.2755 (3.2755) acc1: 32.8000 (32.8000) acc5: 61.6000 (61.6000) time: 13.3831 data: 13.3492 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 2.8431 (2.9376) acc1: 40.8000 (40.3636) acc5: 67.2000 (65.3091) time: 2.2493 data: 2.2292 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.0852 (3.1209) acc1: 36.8000 (37.0667) acc5: 62.4000 (62.8952) time: 1.1907 data: 1.1717 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 3.1739 (3.1121) acc1: 36.8000 (38.0129) acc5: 60.8000 (62.8387) time: 1.1428 data: 1.1229 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.2810 (3.1978) acc1: 35.2000 (37.4049) acc5: 60.8000 (61.8732) time: 0.7285 data: 0.7089 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.2611 (3.1785) acc1: 35.2000 (37.6320) acc5: 60.8000 (62.1280) time: 0.7267 data: 0.7088 max mem: 2905
Test: Total time: 0:00:52 (1.0426 s / it)
* Acc@1 38.084 Acc@5 62.774 loss 3.138
Accuracy of the model on the 50000 test images: 38.1%
Max accuracy: 50.54%
Epoch: [173] [ 0/625] eta: 3:56:53 lr: 0.001710 min_lr: 0.001710 loss: 2.8024 (2.8024) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 22.7415 data: 20.1273 max mem: 2905
Epoch: [173] [200/625] eta: 0:14:24 lr: 0.001703 min_lr: 0.001703 loss: 2.9312 (2.9495) class_acc: 0.5352 (0.5378) weight_decay: 0.0500 (0.0500) grad_norm: 2.0731 (2.6881) time: 1.8458 data: 0.0161 max mem: 2905
Epoch: [173] [400/625] eta: 0:07:20 lr: 0.001696 min_lr: 0.001696 loss: 3.0127 (2.9510) class_acc: 0.5156 (0.5372) weight_decay: 0.0500 (0.0500) grad_norm: 2.1637 (2.5861) time: 1.8729 data: 0.0007 max mem: 2905
Epoch: [173] [600/625] eta: 0:00:48 lr: 0.001689 min_lr: 0.001689 loss: 2.9601 (2.9667) class_acc: 0.5195 (0.5341) weight_decay: 0.0500 (0.0500) grad_norm: 2.4811 (2.7430) time: 2.0259 data: 0.0015 max mem: 2905
Epoch: [173] [624/625] eta: 0:00:01 lr: 0.001688 min_lr: 0.001688 loss: 2.9370 (2.9675) class_acc: 0.5195 (0.5337) weight_decay: 0.0500 (0.0500) grad_norm: 2.4555 (2.7457) time: 0.5736 data: 0.0013 max mem: 2905
Epoch: [173] Total time: 0:20:08 (1.9336 s / it)
Averaged stats: lr: 0.001688 min_lr: 0.001688 loss: 2.9370 (2.9746) class_acc: 0.5195 (0.5317) weight_decay: 0.0500 (0.0500) grad_norm: 2.4555 (2.7457)
Test: [ 0/50] eta: 0:09:08 loss: 2.6666 (2.6666) acc1: 40.0000 (40.0000) acc5: 69.6000 (69.6000) time: 10.9764 data: 10.9503 max mem: 2905
Test: [10/50] eta: 0:01:08 loss: 2.7626 (2.8494) acc1: 40.8000 (40.8000) acc5: 66.4000 (65.6000) time: 1.7093 data: 1.6876 max mem: 2905
Test: [20/50] eta: 0:00:38 loss: 2.9249 (2.9672) acc1: 40.0000 (39.1619) acc5: 64.8000 (64.2667) time: 0.7943 data: 0.7713 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 3.1269 (3.0094) acc1: 36.8000 (39.0710) acc5: 61.6000 (63.1484) time: 0.9874 data: 0.9659 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1472 (3.0558) acc1: 36.8000 (38.8878) acc5: 60.8000 (62.4195) time: 1.0054 data: 0.9864 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1000 (3.0549) acc1: 36.8000 (39.1040) acc5: 62.4000 (62.7840) time: 0.5792 data: 0.5598 max mem: 2905
Test: Total time: 0:00:50 (1.0061 s / it)
* Acc@1 39.206 Acc@5 63.740 loss 3.025
Accuracy of the model on the 50000 test images: 39.2%
Max accuracy: 50.54%
Epoch: [174] [ 0/625] eta: 3:39:08 lr: 0.001688 min_lr: 0.001688 loss: 3.0976 (3.0976) class_acc: 0.5117 (0.5117) weight_decay: 0.0500 (0.0500) time: 21.0382 data: 19.9410 max mem: 2905
Epoch: [174] [200/625] eta: 0:14:11 lr: 0.001681 min_lr: 0.001681 loss: 2.9163 (2.9552) class_acc: 0.5312 (0.5368) weight_decay: 0.0500 (0.0500) grad_norm: 2.7128 (2.7697) time: 1.8328 data: 0.0419 max mem: 2905
Epoch: [174] [400/625] eta: 0:07:26 lr: 0.001674 min_lr: 0.001674 loss: 2.9233 (2.9677) class_acc: 0.5352 (0.5334) weight_decay: 0.0500 (0.0500) grad_norm: 2.4044 (2.6798) time: 1.9221 data: 0.0007 max mem: 2905
Epoch: [174] [600/625] eta: 0:00:49 lr: 0.001666 min_lr: 0.001666 loss: 2.9966 (2.9767) class_acc: 0.5273 (0.5307) weight_decay: 0.0500 (0.0500) grad_norm: 2.3354 (2.6386) time: 2.0869 data: 0.0007 max mem: 2905
Epoch: [174] [624/625] eta: 0:00:01 lr: 0.001666 min_lr: 0.001666 loss: 3.0110 (2.9780) class_acc: 0.5273 (0.5308) weight_decay: 0.0500 (0.0500) grad_norm: 2.3033 (2.6561) time: 0.7380 data: 0.0014 max mem: 2905
Epoch: [174] Total time: 0:20:08 (1.9341 s / it)
Averaged stats: lr: 0.001666 min_lr: 0.001666 loss: 3.0110 (2.9749) class_acc: 0.5273 (0.5315) weight_decay: 0.0500 (0.0500) grad_norm: 2.3033 (2.6561)
Test: [ 0/50] eta: 0:10:01 loss: 3.0384 (3.0384) acc1: 36.8000 (36.8000) acc5: 59.2000 (59.2000) time: 12.0253 data: 11.9985 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 3.0384 (2.9753) acc1: 37.6000 (40.3636) acc5: 62.4000 (63.2000) time: 1.9516 data: 1.9317 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 3.0979 (3.1068) acc1: 37.6000 (37.5619) acc5: 61.6000 (61.8667) time: 0.9101 data: 0.8908 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 3.0826 (3.0781) acc1: 37.6000 (38.4000) acc5: 62.4000 (62.6323) time: 0.8580 data: 0.8374 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1173 (3.1519) acc1: 40.0000 (38.0878) acc5: 60.8000 (61.7366) time: 0.8465 data: 0.8265 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1689 (3.1511) acc1: 32.8000 (37.4560) acc5: 59.2000 (61.7920) time: 0.7235 data: 0.7052 max mem: 2905
Test: Total time: 0:00:51 (1.0305 s / it)
* Acc@1 37.862 Acc@5 62.306 loss 3.122
Accuracy of the model on the 50000 test images: 37.9%
Max accuracy: 50.54%
Epoch: [175] [ 0/625] eta: 3:45:41 lr: 0.001666 min_lr: 0.001666 loss: 2.9276 (2.9276) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 21.6667 data: 18.7025 max mem: 2905
Epoch: [175] [200/625] eta: 0:14:08 lr: 0.001658 min_lr: 0.001658 loss: 2.9273 (2.9511) class_acc: 0.5352 (0.5380) weight_decay: 0.0500 (0.0500) grad_norm: 2.1802 (inf) time: 1.8904 data: 0.0972 max mem: 2905
Epoch: [175] [400/625] eta: 0:07:18 lr: 0.001651 min_lr: 0.001651 loss: 2.9357 (2.9673) class_acc: 0.5312 (0.5349) weight_decay: 0.0500 (0.0500) grad_norm: 2.2905 (inf) time: 1.7468 data: 0.0189 max mem: 2905
Epoch: [175] [600/625] eta: 0:00:49 lr: 0.001644 min_lr: 0.001644 loss: 2.9760 (2.9734) class_acc: 0.5117 (0.5335) weight_decay: 0.0500 (0.0500) grad_norm: 2.2182 (inf) time: 1.7601 data: 0.0112 max mem: 2905
Epoch: [175] [624/625] eta: 0:00:01 lr: 0.001644 min_lr: 0.001644 loss: 2.9708 (2.9740) class_acc: 0.5312 (0.5333) weight_decay: 0.0500 (0.0500) grad_norm: 1.9865 (inf) time: 0.8576 data: 0.0134 max mem: 2905
Epoch: [175] Total time: 0:19:59 (1.9195 s / it)
Averaged stats: lr: 0.001644 min_lr: 0.001644 loss: 2.9708 (2.9721) class_acc: 0.5312 (0.5322) weight_decay: 0.0500 (0.0500) grad_norm: 1.9865 (inf)
Test: [ 0/50] eta: 0:10:03 loss: 3.7714 (3.7714) acc1: 37.6000 (37.6000) acc5: 56.0000 (56.0000) time: 12.0751 data: 12.0456 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 3.8978 (3.9093) acc1: 32.8000 (29.9636) acc5: 51.2000 (50.9818) time: 2.0429 data: 2.0232 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.9049 (3.9608) acc1: 26.4000 (27.8476) acc5: 48.8000 (50.5905) time: 1.0085 data: 0.9883 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 4.0368 (3.9828) acc1: 25.6000 (27.5871) acc5: 48.8000 (50.2710) time: 0.8854 data: 0.8645 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 4.0735 (4.0297) acc1: 25.6000 (27.0049) acc5: 49.6000 (50.2049) time: 0.7169 data: 0.6953 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 4.1491 (4.0414) acc1: 25.6000 (27.1840) acc5: 49.6000 (49.9840) time: 0.6759 data: 0.6550 max mem: 2905
Test: Total time: 0:00:50 (1.0072 s / it)
* Acc@1 27.642 Acc@5 50.460 loss 4.006
Accuracy of the model on the 50000 test images: 27.6%
Max accuracy: 50.54%
Epoch: [176] [ 0/625] eta: 3:45:17 lr: 0.001643 min_lr: 0.001643 loss: 2.8093 (2.8093) class_acc: 0.5586 (0.5586) weight_decay: 0.0500 (0.0500) time: 21.6274 data: 21.4630 max mem: 2905
Epoch: [176] [200/625] eta: 0:14:25 lr: 0.001636 min_lr: 0.001636 loss: 2.9613 (2.9424) class_acc: 0.5352 (0.5377) weight_decay: 0.0500 (0.0500) grad_norm: 2.7073 (2.8015) time: 1.8253 data: 0.0006 max mem: 2905
Epoch: [176] [400/625] eta: 0:07:28 lr: 0.001629 min_lr: 0.001629 loss: 2.9726 (2.9564) class_acc: 0.5312 (0.5350) weight_decay: 0.0500 (0.0500) grad_norm: 2.1197 (2.8548) time: 2.0773 data: 0.0007 max mem: 2905
Epoch: [176] [600/625] eta: 0:00:49 lr: 0.001622 min_lr: 0.001622 loss: 2.9676 (2.9622) class_acc: 0.5312 (0.5343) weight_decay: 0.0500 (0.0500) grad_norm: 2.3538 (2.8196) time: 1.9686 data: 0.0006 max mem: 2905
Epoch: [176] [624/625] eta: 0:00:01 lr: 0.001621 min_lr: 0.001621 loss: 3.0047 (2.9636) class_acc: 0.5312 (0.5343) weight_decay: 0.0500 (0.0500) grad_norm: 2.3463 (2.8120) time: 0.7437 data: 0.0013 max mem: 2905
Epoch: [176] Total time: 0:20:16 (1.9468 s / it)
Averaged stats: lr: 0.001621 min_lr: 0.001621 loss: 3.0047 (2.9677) class_acc: 0.5312 (0.5332) weight_decay: 0.0500 (0.0500) grad_norm: 2.3463 (2.8120)
Test: [ 0/50] eta: 0:10:48 loss: 3.6266 (3.6266) acc1: 32.0000 (32.0000) acc5: 53.6000 (53.6000) time: 12.9700 data: 12.9368 max mem: 2905
Test: [10/50] eta: 0:01:29 loss: 3.4445 (3.3868) acc1: 33.6000 (34.4000) acc5: 57.6000 (57.2364) time: 2.2257 data: 2.2048 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 3.2033 (3.3318) acc1: 37.6000 (35.3524) acc5: 58.4000 (58.9714) time: 1.2245 data: 1.2056 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 3.3299 (3.3366) acc1: 36.8000 (36.1548) acc5: 59.2000 (59.2000) time: 1.2774 data: 1.2586 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 3.3606 (3.3586) acc1: 36.0000 (36.0390) acc5: 58.4000 (58.9659) time: 0.9229 data: 0.9035 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.3969 (3.3777) acc1: 34.4000 (35.8240) acc5: 58.4000 (58.6560) time: 0.8259 data: 0.8054 max mem: 2905
Test: Total time: 0:00:56 (1.1291 s / it)
* Acc@1 36.232 Acc@5 59.528 loss 3.320
Accuracy of the model on the 50000 test images: 36.2%
Max accuracy: 50.54%
Epoch: [177] [ 0/625] eta: 3:33:21 lr: 0.001621 min_lr: 0.001621 loss: 2.8986 (2.8986) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.4823 data: 18.8573 max mem: 2905
Epoch: [177] [200/625] eta: 0:14:06 lr: 0.001614 min_lr: 0.001614 loss: 2.9350 (2.9588) class_acc: 0.5469 (0.5372) weight_decay: 0.0500 (0.0500) grad_norm: 2.3800 (3.0986) time: 1.7903 data: 0.0164 max mem: 2905
Epoch: [177] [400/625] eta: 0:07:24 lr: 0.001607 min_lr: 0.001607 loss: 2.9678 (2.9623) class_acc: 0.5352 (0.5356) weight_decay: 0.0500 (0.0500) grad_norm: 2.5470 (3.0213) time: 2.1067 data: 0.0009 max mem: 2905
Epoch: [177] [600/625] eta: 0:00:48 lr: 0.001600 min_lr: 0.001600 loss: 2.9456 (2.9672) class_acc: 0.5273 (0.5340) weight_decay: 0.0500 (0.0500) grad_norm: 1.9555 (2.8579) time: 1.9036 data: 0.0007 max mem: 2905
Epoch: [177] [624/625] eta: 0:00:01 lr: 0.001599 min_lr: 0.001599 loss: 2.9981 (2.9696) class_acc: 0.5273 (0.5336) weight_decay: 0.0500 (0.0500) grad_norm: 2.2223 (2.8605) time: 0.7163 data: 0.0027 max mem: 2905
Epoch: [177] Total time: 0:20:07 (1.9326 s / it)
Averaged stats: lr: 0.001599 min_lr: 0.001599 loss: 2.9981 (2.9661) class_acc: 0.5273 (0.5336) weight_decay: 0.0500 (0.0500) grad_norm: 2.2223 (2.8605)
Test: [ 0/50] eta: 0:09:34 loss: 2.1445 (2.1445) acc1: 56.8000 (56.8000) acc5: 78.4000 (78.4000) time: 11.4892 data: 11.4387 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.0917 (2.1205) acc1: 56.8000 (54.9818) acc5: 77.6000 (78.3273) time: 1.8893 data: 1.8615 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.2997 (2.3323) acc1: 48.0000 (50.7048) acc5: 73.6000 (75.2000) time: 0.9762 data: 0.9528 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.5339 (2.3686) acc1: 46.4000 (49.9871) acc5: 72.0000 (74.4774) time: 0.8995 data: 0.8784 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.3273 (2.3865) acc1: 48.0000 (49.7561) acc5: 72.8000 (74.2049) time: 0.5945 data: 0.5745 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.3960 (2.3967) acc1: 48.8000 (49.6960) acc5: 73.6000 (74.0320) time: 0.5065 data: 0.4874 max mem: 2905
Test: Total time: 0:00:45 (0.9134 s / it)
* Acc@1 49.656 Acc@5 74.674 loss 2.365
Accuracy of the model on the 50000 test images: 49.7%
Max accuracy: 50.54%
Epoch: [178] [ 0/625] eta: 3:36:01 lr: 0.001599 min_lr: 0.001599 loss: 3.2124 (3.2124) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.7386 data: 20.3402 max mem: 2905
Epoch: [178] [200/625] eta: 0:14:44 lr: 0.001592 min_lr: 0.001592 loss: 2.9707 (2.9610) class_acc: 0.5156 (0.5361) weight_decay: 0.0500 (0.0500) grad_norm: 3.0166 (2.7647) time: 1.9292 data: 0.0094 max mem: 2905
Epoch: [178] [400/625] eta: 0:07:27 lr: 0.001585 min_lr: 0.001585 loss: 2.9206 (2.9634) class_acc: 0.5312 (0.5339) weight_decay: 0.0500 (0.0500) grad_norm: 2.9752 (2.8134) time: 2.0765 data: 0.0007 max mem: 2905
Epoch: [178] [600/625] eta: 0:00:49 lr: 0.001578 min_lr: 0.001578 loss: 2.9737 (2.9597) class_acc: 0.5352 (0.5343) weight_decay: 0.0500 (0.0500) grad_norm: 1.7499 (2.7650) time: 2.0173 data: 0.0012 max mem: 2905
Epoch: [178] [624/625] eta: 0:00:01 lr: 0.001578 min_lr: 0.001578 loss: 3.0011 (2.9596) class_acc: 0.5273 (0.5343) weight_decay: 0.0500 (0.0500) grad_norm: 2.1669 (2.7711) time: 0.7741 data: 0.0014 max mem: 2905
Epoch: [178] Total time: 0:20:12 (1.9405 s / it)
Averaged stats: lr: 0.001578 min_lr: 0.001578 loss: 3.0011 (2.9628) class_acc: 0.5273 (0.5338) weight_decay: 0.0500 (0.0500) grad_norm: 2.1669 (2.7711)
Test: [ 0/50] eta: 0:09:33 loss: 3.1583 (3.1583) acc1: 35.2000 (35.2000) acc5: 64.0000 (64.0000) time: 11.4778 data: 11.4509 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 2.6522 (2.6764) acc1: 44.8000 (43.2000) acc5: 68.0000 (69.7455) time: 2.1689 data: 2.1478 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.8025 (2.7664) acc1: 42.4000 (41.8286) acc5: 67.2000 (68.1143) time: 1.3218 data: 1.3005 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.8040 (2.7718) acc1: 40.8000 (42.9677) acc5: 67.2000 (68.3355) time: 1.2171 data: 1.1953 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8040 (2.8160) acc1: 40.8000 (42.6537) acc5: 67.2000 (67.7073) time: 0.7104 data: 0.6884 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8978 (2.8076) acc1: 40.8000 (42.8480) acc5: 67.2000 (67.6800) time: 0.6119 data: 0.5909 max mem: 2905
Test: Total time: 0:00:54 (1.0800 s / it)
* Acc@1 43.564 Acc@5 68.272 loss 2.770
Accuracy of the model on the 50000 test images: 43.6%
Max accuracy: 50.54%
Epoch: [179] [ 0/625] eta: 3:23:31 lr: 0.001577 min_lr: 0.001577 loss: 3.0459 (3.0459) class_acc: 0.5078 (0.5078) weight_decay: 0.0500 (0.0500) time: 19.5389 data: 18.4212 max mem: 2905
Epoch: [179] [200/625] eta: 0:15:29 lr: 0.001570 min_lr: 0.001570 loss: 2.9030 (2.9636) class_acc: 0.5391 (0.5331) weight_decay: 0.0500 (0.0500) grad_norm: 2.1817 (2.6462) time: 2.0051 data: 0.0005 max mem: 2905
Epoch: [179] [400/625] eta: 0:07:52 lr: 0.001563 min_lr: 0.001563 loss: 2.9813 (2.9597) class_acc: 0.5234 (0.5350) weight_decay: 0.0500 (0.0500) grad_norm: 3.1210 (2.7472) time: 1.8979 data: 0.0006 max mem: 2905
Epoch: [179] [600/625] eta: 0:00:51 lr: 0.001556 min_lr: 0.001556 loss: 2.9118 (2.9608) class_acc: 0.5312 (0.5347) weight_decay: 0.0500 (0.0500) grad_norm: 2.1375 (2.7450) time: 2.0920 data: 0.0006 max mem: 2905
Epoch: [179] [624/625] eta: 0:00:02 lr: 0.001556 min_lr: 0.001556 loss: 2.8960 (2.9600) class_acc: 0.5352 (0.5348) weight_decay: 0.0500 (0.0500) grad_norm: 2.3293 (2.7545) time: 1.0082 data: 0.0012 max mem: 2905
Epoch: [179] Total time: 0:20:55 (2.0094 s / it)
Averaged stats: lr: 0.001556 min_lr: 0.001556 loss: 2.8960 (2.9562) class_acc: 0.5352 (0.5361) weight_decay: 0.0500 (0.0500) grad_norm: 2.3293 (2.7545)
Test: [ 0/50] eta: 0:08:28 loss: 3.0974 (3.0974) acc1: 37.6000 (37.6000) acc5: 59.2000 (59.2000) time: 10.1621 data: 10.1307 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.6009 (2.6745) acc1: 44.0000 (43.4909) acc5: 69.6000 (68.4364) time: 1.9976 data: 1.9784 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.7106 (2.9242) acc1: 40.8000 (40.5714) acc5: 68.0000 (65.4857) time: 1.1709 data: 1.1520 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.0709 (2.9526) acc1: 40.0000 (40.8000) acc5: 62.4000 (65.3677) time: 0.9781 data: 0.9587 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9931 (2.9606) acc1: 40.8000 (41.0146) acc5: 64.0000 (65.2098) time: 0.7432 data: 0.7245 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9915 (2.9503) acc1: 40.8000 (40.9600) acc5: 64.8000 (65.4080) time: 0.6526 data: 0.6346 max mem: 2905
Test: Total time: 0:00:51 (1.0331 s / it)
* Acc@1 41.638 Acc@5 66.288 loss 2.904
Accuracy of the model on the 50000 test images: 41.6%
Max accuracy: 50.54%
Epoch: [180] [ 0/625] eta: 3:54:11 lr: 0.001556 min_lr: 0.001556 loss: 3.0688 (3.0688) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 22.4827 data: 19.3503 max mem: 2905
Epoch: [180] [200/625] eta: 0:16:56 lr: 0.001549 min_lr: 0.001549 loss: 2.9402 (2.9402) class_acc: 0.5469 (0.5408) weight_decay: 0.0500 (0.0500) grad_norm: 2.8623 (2.6197) time: 2.0191 data: 0.0006 max mem: 2905
Epoch: [180] [400/625] eta: 0:08:07 lr: 0.001542 min_lr: 0.001542 loss: 2.9974 (2.9514) class_acc: 0.5195 (0.5375) weight_decay: 0.0500 (0.0500) grad_norm: 2.0796 (2.5966) time: 1.8534 data: 0.0007 max mem: 2905
Epoch: [180] [600/625] eta: 0:00:52 lr: 0.001535 min_lr: 0.001535 loss: 2.9739 (2.9596) class_acc: 0.5430 (0.5363) weight_decay: 0.0500 (0.0500) grad_norm: 3.2341 (2.7636) time: 1.8660 data: 0.0006 max mem: 2905
Epoch: [180] [624/625] eta: 0:00:02 lr: 0.001534 min_lr: 0.001534 loss: 2.9460 (2.9591) class_acc: 0.5273 (0.5364) weight_decay: 0.0500 (0.0500) grad_norm: 2.0194 (2.7407) time: 0.7925 data: 0.0015 max mem: 2905
Epoch: [180] Total time: 0:21:14 (2.0386 s / it)
Averaged stats: lr: 0.001534 min_lr: 0.001534 loss: 2.9460 (2.9575) class_acc: 0.5273 (0.5356) weight_decay: 0.0500 (0.0500) grad_norm: 2.0194 (2.7407)
Test: [ 0/50] eta: 0:09:47 loss: 3.0501 (3.0501) acc1: 40.8000 (40.8000) acc5: 62.4000 (62.4000) time: 11.7423 data: 11.7172 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 2.5562 (2.6453) acc1: 42.4000 (44.0727) acc5: 68.8000 (69.0182) time: 2.0555 data: 2.0329 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.7856 (2.8169) acc1: 42.4000 (42.2857) acc5: 68.0000 (67.3524) time: 1.1817 data: 1.1594 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.8759 (2.8285) acc1: 40.0000 (42.4000) acc5: 66.4000 (67.5097) time: 1.1980 data: 1.1764 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8759 (2.8659) acc1: 40.8000 (42.3024) acc5: 66.4000 (66.8488) time: 0.7998 data: 0.7791 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8480 (2.8717) acc1: 40.8000 (42.1760) acc5: 66.4000 (66.7200) time: 0.6438 data: 0.6228 max mem: 2905
Test: Total time: 0:00:52 (1.0520 s / it)
* Acc@1 42.880 Acc@5 67.184 loss 2.819
Accuracy of the model on the 50000 test images: 42.9%
Max accuracy: 50.54%
Epoch: [181] [ 0/625] eta: 3:48:01 lr: 0.001534 min_lr: 0.001534 loss: 2.9576 (2.9576) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 21.8904 data: 19.8802 max mem: 2905
Epoch: [181] [200/625] eta: 0:13:35 lr: 0.001527 min_lr: 0.001527 loss: 2.9780 (2.9274) class_acc: 0.5312 (0.5427) weight_decay: 0.0500 (0.0500) grad_norm: 2.6591 (2.6912) time: 1.8400 data: 0.0272 max mem: 2905
Epoch: [181] [400/625] eta: 0:07:04 lr: 0.001520 min_lr: 0.001520 loss: 3.0450 (2.9455) class_acc: 0.5156 (0.5370) weight_decay: 0.0500 (0.0500) grad_norm: 3.0059 (2.8901) time: 1.7974 data: 0.0065 max mem: 2905
Epoch: [181] [600/625] eta: 0:00:47 lr: 0.001513 min_lr: 0.001513 loss: 2.9584 (2.9560) class_acc: 0.5352 (0.5359) weight_decay: 0.0500 (0.0500) grad_norm: 1.7979 (2.8036) time: 2.0169 data: 0.0010 max mem: 2905
Epoch: [181] [624/625] eta: 0:00:01 lr: 0.001512 min_lr: 0.001512 loss: 2.9850 (2.9582) class_acc: 0.5234 (0.5353) weight_decay: 0.0500 (0.0500) grad_norm: 2.0999 (2.8138) time: 0.6392 data: 0.0017 max mem: 2905
Epoch: [181] Total time: 0:19:33 (1.8784 s / it)
Averaged stats: lr: 0.001512 min_lr: 0.001512 loss: 2.9850 (2.9535) class_acc: 0.5234 (0.5368) weight_decay: 0.0500 (0.0500) grad_norm: 2.0999 (2.8138)
Test: [ 0/50] eta: 0:10:25 loss: 3.1346 (3.1346) acc1: 40.8000 (40.8000) acc5: 64.0000 (64.0000) time: 12.5050 data: 12.4746 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 3.0459 (3.0258) acc1: 42.4000 (40.7273) acc5: 64.0000 (64.3636) time: 1.9912 data: 1.9724 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.1529 (3.1968) acc1: 37.6000 (37.7905) acc5: 61.6000 (61.7905) time: 1.0173 data: 0.9992 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.1529 (3.1847) acc1: 37.6000 (37.8065) acc5: 61.6000 (61.9613) time: 1.0406 data: 1.0195 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.1476 (3.1930) acc1: 37.6000 (37.5415) acc5: 60.0000 (61.6976) time: 0.7097 data: 0.6879 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.2742 (3.2207) acc1: 36.8000 (37.2960) acc5: 60.0000 (61.4560) time: 0.5925 data: 0.5717 max mem: 2905
Test: Total time: 0:00:48 (0.9624 s / it)
* Acc@1 37.396 Acc@5 62.212 loss 3.169
Accuracy of the model on the 50000 test images: 37.4%
Max accuracy: 50.54%
Epoch: [182] [ 0/625] eta: 3:48:26 lr: 0.001512 min_lr: 0.001512 loss: 2.8528 (2.8528) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 21.9312 data: 17.4945 max mem: 2905
Epoch: [182] [200/625] eta: 0:14:16 lr: 0.001505 min_lr: 0.001505 loss: 2.8631 (2.9374) class_acc: 0.5469 (0.5404) weight_decay: 0.0500 (0.0500) grad_norm: 2.8294 (3.1070) time: 1.8812 data: 0.0701 max mem: 2905
Epoch: [182] [400/625] eta: 0:07:22 lr: 0.001498 min_lr: 0.001498 loss: 2.9454 (2.9441) class_acc: 0.5430 (0.5391) weight_decay: 0.0500 (0.0500) grad_norm: 2.1880 (2.9962) time: 1.8618 data: 0.0769 max mem: 2905
Epoch: [182] [600/625] eta: 0:00:49 lr: 0.001491 min_lr: 0.001491 loss: 2.9813 (2.9474) class_acc: 0.5391 (0.5387) weight_decay: 0.0500 (0.0500) grad_norm: 2.5305 (2.8895) time: 2.0511 data: 0.0008 max mem: 2905
Epoch: [182] [624/625] eta: 0:00:01 lr: 0.001490 min_lr: 0.001490 loss: 2.9624 (2.9483) class_acc: 0.5234 (0.5383) weight_decay: 0.0500 (0.0500) grad_norm: 2.9084 (2.9022) time: 0.9062 data: 0.0014 max mem: 2905
Epoch: [182] Total time: 0:20:07 (1.9318 s / it)
Averaged stats: lr: 0.001490 min_lr: 0.001490 loss: 2.9624 (2.9506) class_acc: 0.5234 (0.5369) weight_decay: 0.0500 (0.0500) grad_norm: 2.9084 (2.9022)
Test: [ 0/50] eta: 0:09:59 loss: 2.4990 (2.4990) acc1: 44.8000 (44.8000) acc5: 76.8000 (76.8000) time: 11.9823 data: 11.9593 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.4903 (2.5872) acc1: 44.8000 (46.1091) acc5: 72.0000 (70.3273) time: 2.0939 data: 2.0736 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.8230 (2.7645) acc1: 44.0000 (44.3048) acc5: 68.8000 (68.6476) time: 1.1701 data: 1.1502 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.9380 (2.7739) acc1: 41.6000 (44.1032) acc5: 67.2000 (68.4129) time: 1.1864 data: 1.1676 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.9707 (2.8075) acc1: 40.8000 (43.6878) acc5: 68.0000 (68.1561) time: 0.9251 data: 0.9052 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7791 (2.8076) acc1: 41.6000 (43.6800) acc5: 68.0000 (68.0960) time: 0.8109 data: 0.7904 max mem: 2905
Test: Total time: 0:00:55 (1.1001 s / it)
* Acc@1 43.998 Acc@5 68.302 loss 2.784
Accuracy of the model on the 50000 test images: 44.0%
Max accuracy: 50.54%
Epoch: [183] [ 0/625] eta: 3:22:52 lr: 0.001490 min_lr: 0.001490 loss: 3.0205 (3.0205) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 19.4760 data: 17.6965 max mem: 2905
Epoch: [183] [200/625] eta: 0:14:14 lr: 0.001483 min_lr: 0.001483 loss: 2.9576 (2.9469) class_acc: 0.5273 (0.5378) weight_decay: 0.0500 (0.0500) grad_norm: 2.8342 (2.7885) time: 1.9214 data: 0.0647 max mem: 2905
Epoch: [183] [400/625] eta: 0:07:25 lr: 0.001476 min_lr: 0.001476 loss: 2.9856 (2.9431) class_acc: 0.5312 (0.5392) weight_decay: 0.0500 (0.0500) grad_norm: 2.2166 (2.7283) time: 2.0967 data: 0.0007 max mem: 2905
Epoch: [183] [600/625] eta: 0:00:49 lr: 0.001469 min_lr: 0.001469 loss: 2.9512 (2.9428) class_acc: 0.5352 (0.5397) weight_decay: 0.0500 (0.0500) grad_norm: 2.1238 (2.6808) time: 1.9924 data: 0.0008 max mem: 2905
Epoch: [183] [624/625] eta: 0:00:01 lr: 0.001469 min_lr: 0.001469 loss: 2.9977 (2.9445) class_acc: 0.5195 (0.5391) weight_decay: 0.0500 (0.0500) grad_norm: 2.5755 (2.6919) time: 0.8826 data: 0.0062 max mem: 2905
Epoch: [183] Total time: 0:20:15 (1.9456 s / it)
Averaged stats: lr: 0.001469 min_lr: 0.001469 loss: 2.9977 (2.9468) class_acc: 0.5195 (0.5378) weight_decay: 0.0500 (0.0500) grad_norm: 2.5755 (2.6919)
Test: [ 0/50] eta: 0:10:03 loss: 4.2062 (4.2062) acc1: 27.2000 (27.2000) acc5: 45.6000 (45.6000) time: 12.0727 data: 12.0391 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 3.4823 (3.5067) acc1: 35.2000 (34.8364) acc5: 57.6000 (56.8000) time: 1.9913 data: 1.9704 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 3.4944 (3.6496) acc1: 32.8000 (32.3429) acc5: 56.0000 (54.7048) time: 1.0253 data: 1.0060 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.5570 (3.6233) acc1: 30.4000 (32.8516) acc5: 55.2000 (54.9161) time: 1.0616 data: 1.0405 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.4127 (3.5927) acc1: 34.4000 (33.2098) acc5: 56.8000 (55.6293) time: 0.9404 data: 0.9192 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.4998 (3.6264) acc1: 32.0000 (32.6880) acc5: 55.2000 (55.3280) time: 0.8402 data: 0.8208 max mem: 2905
Test: Total time: 0:00:56 (1.1233 s / it)
* Acc@1 32.940 Acc@5 55.742 loss 3.610
Accuracy of the model on the 50000 test images: 32.9%
Max accuracy: 50.54%
Epoch: [184] [ 0/625] eta: 3:36:10 lr: 0.001469 min_lr: 0.001469 loss: 2.9707 (2.9707) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 20.7529 data: 20.6271 max mem: 2905
Epoch: [184] [200/625] eta: 0:14:28 lr: 0.001462 min_lr: 0.001462 loss: 2.9288 (2.9270) class_acc: 0.5430 (0.5420) weight_decay: 0.0500 (0.0500) grad_norm: 2.2709 (3.1388) time: 2.0056 data: 0.0452 max mem: 2905
Epoch: [184] [400/625] eta: 0:07:29 lr: 0.001455 min_lr: 0.001455 loss: 2.9200 (2.9405) class_acc: 0.5391 (0.5378) weight_decay: 0.0500 (0.0500) grad_norm: 2.3547 (3.0991) time: 1.8497 data: 0.0008 max mem: 2905
Epoch: [184] [600/625] eta: 0:00:49 lr: 0.001448 min_lr: 0.001448 loss: 2.9480 (2.9479) class_acc: 0.5312 (0.5376) weight_decay: 0.0500 (0.0500) grad_norm: 2.2036 (2.9126) time: 2.0344 data: 0.0109 max mem: 2905
Epoch: [184] [624/625] eta: 0:00:01 lr: 0.001447 min_lr: 0.001447 loss: 2.9182 (2.9483) class_acc: 0.5352 (0.5376) weight_decay: 0.0500 (0.0500) grad_norm: 2.3840 (2.9224) time: 0.5753 data: 0.0020 max mem: 2905
Epoch: [184] Total time: 0:20:22 (1.9562 s / it)
Averaged stats: lr: 0.001447 min_lr: 0.001447 loss: 2.9182 (2.9463) class_acc: 0.5352 (0.5379) weight_decay: 0.0500 (0.0500) grad_norm: 2.3840 (2.9224)
Test: [ 0/50] eta: 0:09:24 loss: 2.3522 (2.3522) acc1: 46.4000 (46.4000) acc5: 77.6000 (77.6000) time: 11.2956 data: 11.2701 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.4138 (2.4112) acc1: 50.4000 (49.7455) acc5: 73.6000 (74.0364) time: 1.9634 data: 1.9442 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.5622 (2.5294) acc1: 45.6000 (47.2000) acc5: 72.0000 (72.0762) time: 1.0864 data: 1.0666 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.6801 (2.5933) acc1: 41.6000 (46.5806) acc5: 68.8000 (71.2000) time: 1.0454 data: 1.0250 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.6966 (2.6169) acc1: 44.0000 (46.5561) acc5: 68.8000 (71.0439) time: 0.6260 data: 0.6055 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6455 (2.6357) acc1: 44.0000 (46.3360) acc5: 71.2000 (70.7520) time: 0.5272 data: 0.5077 max mem: 2905
Test: Total time: 0:00:46 (0.9321 s / it)
* Acc@1 46.432 Acc@5 71.606 loss 2.580
Accuracy of the model on the 50000 test images: 46.4%
Max accuracy: 50.54%
Epoch: [185] [ 0/625] eta: 3:43:29 lr: 0.001447 min_lr: 0.001447 loss: 2.9341 (2.9341) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 21.4560 data: 21.3298 max mem: 2905
Epoch: [185] [200/625] eta: 0:14:25 lr: 0.001440 min_lr: 0.001440 loss: 2.9338 (2.9171) class_acc: 0.5430 (0.5453) weight_decay: 0.0500 (0.0500) grad_norm: 2.4908 (2.6441) time: 1.9272 data: 0.0011 max mem: 2905
Epoch: [185] [400/625] eta: 0:07:21 lr: 0.001433 min_lr: 0.001433 loss: 2.8433 (2.9245) class_acc: 0.5430 (0.5437) weight_decay: 0.0500 (0.0500) grad_norm: 2.4949 (2.6339) time: 1.9658 data: 0.6593 max mem: 2905
Epoch: [185] [600/625] eta: 0:00:48 lr: 0.001426 min_lr: 0.001426 loss: 2.9800 (2.9352) class_acc: 0.5352 (0.5412) weight_decay: 0.0500 (0.0500) grad_norm: 2.4372 (2.6900) time: 1.9224 data: 0.0364 max mem: 2905
Epoch: [185] [624/625] eta: 0:00:01 lr: 0.001426 min_lr: 0.001426 loss: 2.9000 (2.9343) class_acc: 0.5469 (0.5417) weight_decay: 0.0500 (0.0500) grad_norm: 3.7563 (2.7122) time: 0.7465 data: 0.0024 max mem: 2905
Epoch: [185] Total time: 0:19:51 (1.9071 s / it)
Averaged stats: lr: 0.001426 min_lr: 0.001426 loss: 2.9000 (2.9410) class_acc: 0.5469 (0.5393) weight_decay: 0.0500 (0.0500) grad_norm: 3.7563 (2.7122)
Test: [ 0/50] eta: 0:11:00 loss: 3.0293 (3.0293) acc1: 33.6000 (33.6000) acc5: 63.2000 (63.2000) time: 13.2031 data: 13.1694 max mem: 2905
Test: [10/50] eta: 0:01:35 loss: 2.5866 (2.6063) acc1: 46.4000 (45.4545) acc5: 72.0000 (69.3818) time: 2.3922 data: 2.3695 max mem: 2905
Test: [20/50] eta: 0:00:56 loss: 2.6123 (2.7243) acc1: 44.8000 (43.5048) acc5: 70.4000 (68.4191) time: 1.3224 data: 1.3021 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.7870 (2.7195) acc1: 42.4000 (43.7677) acc5: 68.0000 (68.8000) time: 1.1298 data: 1.1117 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7870 (2.7756) acc1: 42.4000 (43.3561) acc5: 67.2000 (67.8829) time: 0.6770 data: 0.6586 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.9163 (2.7990) acc1: 42.4000 (43.2480) acc5: 65.6000 (67.5040) time: 0.6251 data: 0.6053 max mem: 2905
Test: Total time: 0:00:53 (1.0721 s / it)
* Acc@1 44.076 Acc@5 68.814 loss 2.750
Accuracy of the model on the 50000 test images: 44.1%
Max accuracy: 50.54%
Epoch: [186] [ 0/625] eta: 4:09:10 lr: 0.001425 min_lr: 0.001425 loss: 2.8937 (2.8937) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 23.9201 data: 19.2024 max mem: 2905
Epoch: [186] [200/625] eta: 0:14:16 lr: 0.001419 min_lr: 0.001419 loss: 2.9270 (2.9184) class_acc: 0.5352 (0.5439) weight_decay: 0.0500 (0.0500) grad_norm: 2.7794 (2.8982) time: 1.9494 data: 0.1967 max mem: 2905
Epoch: [186] [400/625] eta: 0:07:22 lr: 0.001412 min_lr: 0.001412 loss: 2.9199 (2.9339) class_acc: 0.5430 (0.5404) weight_decay: 0.0500 (0.0500) grad_norm: 2.5156 (2.8205) time: 2.0454 data: 0.0007 max mem: 2905
Epoch: [186] [600/625] eta: 0:00:48 lr: 0.001405 min_lr: 0.001405 loss: 2.9077 (2.9397) class_acc: 0.5469 (0.5392) weight_decay: 0.0500 (0.0500) grad_norm: 1.7382 (2.7667) time: 1.8180 data: 0.0358 max mem: 2905
Epoch: [186] [624/625] eta: 0:00:01 lr: 0.001404 min_lr: 0.001404 loss: 2.9374 (2.9397) class_acc: 0.5391 (0.5391) weight_decay: 0.0500 (0.0500) grad_norm: 2.5795 (2.8222) time: 0.7186 data: 0.0361 max mem: 2905
Epoch: [186] Total time: 0:19:44 (1.8949 s / it)
Averaged stats: lr: 0.001404 min_lr: 0.001404 loss: 2.9374 (2.9394) class_acc: 0.5391 (0.5399) weight_decay: 0.0500 (0.0500) grad_norm: 2.5795 (2.8222)
Test: [ 0/50] eta: 0:09:55 loss: 3.7056 (3.7056) acc1: 29.6000 (29.6000) acc5: 53.6000 (53.6000) time: 11.9145 data: 11.8874 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 3.1112 (3.0873) acc1: 38.4000 (40.2182) acc5: 63.2000 (62.2545) time: 2.0080 data: 1.9893 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 3.2475 (3.3704) acc1: 36.8000 (35.0095) acc5: 61.6000 (59.3905) time: 1.0534 data: 1.0354 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.4997 (3.3361) acc1: 32.0000 (35.7677) acc5: 58.4000 (59.8710) time: 1.0561 data: 1.0359 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.3638 (3.3853) acc1: 35.2000 (35.5707) acc5: 60.0000 (59.1610) time: 0.7788 data: 0.7577 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.4239 (3.4182) acc1: 35.2000 (35.4720) acc5: 55.2000 (58.4640) time: 0.6766 data: 0.6555 max mem: 2905
Test: Total time: 0:00:49 (0.9874 s / it)
* Acc@1 35.878 Acc@5 58.962 loss 3.370
Accuracy of the model on the 50000 test images: 35.9%
Max accuracy: 50.54%
Epoch: [187] [ 0/625] eta: 3:44:01 lr: 0.001404 min_lr: 0.001404 loss: 2.9614 (2.9614) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 21.5067 data: 21.3693 max mem: 2905
Epoch: [187] [200/625] eta: 0:14:24 lr: 0.001397 min_lr: 0.001397 loss: 2.9544 (2.9256) class_acc: 0.5391 (0.5437) weight_decay: 0.0500 (0.0500) grad_norm: 2.9375 (3.1494) time: 1.7960 data: 0.0011 max mem: 2905
Epoch: [187] [400/625] eta: 0:07:23 lr: 0.001390 min_lr: 0.001390 loss: 2.9764 (2.9318) class_acc: 0.5312 (0.5414) weight_decay: 0.0500 (0.0500) grad_norm: 1.9406 (2.9901) time: 1.8576 data: 0.0146 max mem: 2905
Epoch: [187] [600/625] eta: 0:00:49 lr: 0.001383 min_lr: 0.001383 loss: 2.9485 (2.9383) class_acc: 0.5430 (0.5402) weight_decay: 0.0500 (0.0500) grad_norm: 2.0422 (inf) time: 1.9623 data: 0.0188 max mem: 2905
Epoch: [187] [624/625] eta: 0:00:01 lr: 0.001383 min_lr: 0.001383 loss: 2.9454 (2.9393) class_acc: 0.5391 (0.5401) weight_decay: 0.0500 (0.0500) grad_norm: 2.7637 (inf) time: 0.7093 data: 0.0014 max mem: 2905
Epoch: [187] Total time: 0:19:56 (1.9146 s / it)
Averaged stats: lr: 0.001383 min_lr: 0.001383 loss: 2.9454 (2.9338) class_acc: 0.5391 (0.5408) weight_decay: 0.0500 (0.0500) grad_norm: 2.7637 (inf)
Test: [ 0/50] eta: 0:10:12 loss: 3.0444 (3.0444) acc1: 36.0000 (36.0000) acc5: 65.6000 (65.6000) time: 12.2456 data: 12.2174 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.8346 (2.8365) acc1: 41.6000 (41.4545) acc5: 68.0000 (66.8364) time: 2.0431 data: 2.0224 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.8422 (2.8652) acc1: 40.0000 (41.3714) acc5: 68.0000 (66.6286) time: 1.0931 data: 1.0739 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.8550 (2.8851) acc1: 40.8000 (41.3419) acc5: 65.6000 (66.0129) time: 1.0729 data: 1.0530 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9609 (2.9396) acc1: 40.8000 (40.2927) acc5: 64.0000 (65.6585) time: 0.6442 data: 0.6217 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 3.0720 (2.9490) acc1: 40.0000 (40.1120) acc5: 64.0000 (65.7280) time: 0.5637 data: 0.5424 max mem: 2905
Test: Total time: 0:00:47 (0.9553 s / it)
* Acc@1 40.528 Acc@5 65.984 loss 2.923
Accuracy of the model on the 50000 test images: 40.5%
Max accuracy: 50.54%
Epoch: [188] [ 0/625] eta: 3:45:55 lr: 0.001383 min_lr: 0.001383 loss: 2.9962 (2.9962) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 21.6886 data: 18.3213 max mem: 2905
Epoch: [188] [200/625] eta: 0:13:36 lr: 0.001376 min_lr: 0.001376 loss: 2.9010 (2.9230) class_acc: 0.5469 (0.5462) weight_decay: 0.0500 (0.0500) grad_norm: 1.8185 (2.5813) time: 1.7251 data: 0.0006 max mem: 2905
Epoch: [188] [400/625] eta: 0:07:04 lr: 0.001369 min_lr: 0.001369 loss: 2.8965 (2.9250) class_acc: 0.5469 (0.5444) weight_decay: 0.0500 (0.0500) grad_norm: 2.8789 (2.7608) time: 1.8175 data: 0.0006 max mem: 2905
Epoch: [188] [600/625] eta: 0:00:47 lr: 0.001362 min_lr: 0.001362 loss: 2.9332 (2.9299) class_acc: 0.5312 (0.5440) weight_decay: 0.0500 (0.0500) grad_norm: 3.4110 (2.8955) time: 2.0495 data: 0.0008 max mem: 2905
Epoch: [188] [624/625] eta: 0:00:01 lr: 0.001361 min_lr: 0.001361 loss: 2.9181 (2.9307) class_acc: 0.5352 (0.5440) weight_decay: 0.0500 (0.0500) grad_norm: 3.0252 (2.8989) time: 0.6879 data: 0.0015 max mem: 2905
Epoch: [188] Total time: 0:19:35 (1.8806 s / it)
Averaged stats: lr: 0.001361 min_lr: 0.001361 loss: 2.9181 (2.9294) class_acc: 0.5352 (0.5421) weight_decay: 0.0500 (0.0500) grad_norm: 3.0252 (2.8989)
Test: [ 0/50] eta: 0:10:18 loss: 2.4576 (2.4576) acc1: 47.2000 (47.2000) acc5: 76.0000 (76.0000) time: 12.3720 data: 12.3444 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.4576 (2.5802) acc1: 47.2000 (45.9636) acc5: 72.0000 (70.1091) time: 2.0392 data: 2.0203 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.7938 (2.7306) acc1: 41.6000 (43.1238) acc5: 68.0000 (68.3048) time: 1.0518 data: 1.0338 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.7938 (2.7247) acc1: 43.2000 (43.8194) acc5: 68.0000 (68.3097) time: 0.9475 data: 0.9288 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.7286 (2.7323) acc1: 44.0000 (43.4732) acc5: 68.8000 (68.1171) time: 0.5953 data: 0.5743 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7033 (2.7306) acc1: 43.2000 (43.5360) acc5: 68.8000 (68.0640) time: 0.5090 data: 0.4883 max mem: 2905
Test: Total time: 0:00:46 (0.9286 s / it)
* Acc@1 44.302 Acc@5 68.832 loss 2.695
Accuracy of the model on the 50000 test images: 44.3%
Max accuracy: 50.54%
Epoch: [189] [ 0/625] eta: 3:21:22 lr: 0.001361 min_lr: 0.001361 loss: 2.8026 (2.8026) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 19.3315 data: 18.9044 max mem: 2905
Epoch: [189] [200/625] eta: 0:14:03 lr: 0.001355 min_lr: 0.001355 loss: 2.8865 (2.9203) class_acc: 0.5391 (0.5457) weight_decay: 0.0500 (0.0500) grad_norm: 2.3264 (2.8017) time: 1.8587 data: 1.5231 max mem: 2905
Epoch: [189] [400/625] eta: 0:07:12 lr: 0.001348 min_lr: 0.001348 loss: 2.9280 (2.9199) class_acc: 0.5352 (0.5448) weight_decay: 0.0500 (0.0500) grad_norm: 2.2555 (2.7308) time: 1.8627 data: 1.6581 max mem: 2905
Epoch: [189] [600/625] eta: 0:00:47 lr: 0.001341 min_lr: 0.001341 loss: 2.9145 (2.9225) class_acc: 0.5430 (0.5443) weight_decay: 0.0500 (0.0500) grad_norm: 2.2839 (2.7543) time: 1.8939 data: 1.7072 max mem: 2905
Epoch: [189] [624/625] eta: 0:00:01 lr: 0.001340 min_lr: 0.001340 loss: 2.8932 (2.9218) class_acc: 0.5469 (0.5444) weight_decay: 0.0500 (0.0500) grad_norm: 2.6119 (2.7563) time: 0.7162 data: 0.5508 max mem: 2905
Epoch: [189] Total time: 0:19:26 (1.8669 s / it)
Averaged stats: lr: 0.001340 min_lr: 0.001340 loss: 2.8932 (2.9291) class_acc: 0.5469 (0.5422) weight_decay: 0.0500 (0.0500) grad_norm: 2.6119 (2.7563)
Test: [ 0/50] eta: 0:09:56 loss: 2.4711 (2.4711) acc1: 40.8000 (40.8000) acc5: 74.4000 (74.4000) time: 11.9284 data: 11.9016 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.3305 (2.2734) acc1: 51.2000 (50.7636) acc5: 74.4000 (74.1818) time: 2.0462 data: 2.0261 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.3902 (2.3994) acc1: 48.8000 (48.4571) acc5: 73.6000 (74.0191) time: 1.1108 data: 1.0915 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.4373 (2.4211) acc1: 44.8000 (48.2065) acc5: 73.6000 (73.5742) time: 1.0868 data: 1.0677 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.4373 (2.4279) acc1: 48.8000 (48.8390) acc5: 74.4000 (73.7951) time: 0.6950 data: 0.6742 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.3989 (2.4369) acc1: 48.8000 (48.7520) acc5: 74.4000 (73.7280) time: 0.5423 data: 0.5211 max mem: 2905
Test: Total time: 0:00:49 (0.9888 s / it)
* Acc@1 49.300 Acc@5 74.036 loss 2.405
Accuracy of the model on the 50000 test images: 49.3%
Max accuracy: 50.54%
Epoch: [190] [ 0/625] eta: 3:32:30 lr: 0.001340 min_lr: 0.001340 loss: 2.9151 (2.9151) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 20.4012 data: 20.0372 max mem: 2905
Epoch: [190] [200/625] eta: 0:14:04 lr: 0.001333 min_lr: 0.001333 loss: 2.9531 (2.9229) class_acc: 0.5391 (0.5451) weight_decay: 0.0500 (0.0500) grad_norm: 2.5471 (2.8818) time: 2.0116 data: 0.1104 max mem: 2905
Epoch: [190] [400/625] eta: 0:07:22 lr: 0.001327 min_lr: 0.001327 loss: 2.9694 (2.9240) class_acc: 0.5430 (0.5434) weight_decay: 0.0500 (0.0500) grad_norm: 3.4676 (3.0408) time: 1.9912 data: 0.0081 max mem: 2905
Epoch: [190] [600/625] eta: 0:00:49 lr: 0.001320 min_lr: 0.001320 loss: 2.9507 (2.9288) class_acc: 0.5352 (0.5424) weight_decay: 0.0500 (0.0500) grad_norm: 2.2835 (2.9434) time: 2.0388 data: 0.0008 max mem: 2905
Epoch: [190] [624/625] eta: 0:00:01 lr: 0.001319 min_lr: 0.001319 loss: 2.9223 (2.9306) class_acc: 0.5352 (0.5420) weight_decay: 0.0500 (0.0500) grad_norm: 2.8739 (2.9507) time: 0.8172 data: 0.0081 max mem: 2905
Epoch: [190] Total time: 0:20:08 (1.9339 s / it)
Averaged stats: lr: 0.001319 min_lr: 0.001319 loss: 2.9223 (2.9253) class_acc: 0.5352 (0.5428) weight_decay: 0.0500 (0.0500) grad_norm: 2.8739 (2.9507)
Test: [ 0/50] eta: 0:10:18 loss: 2.5478 (2.5478) acc1: 45.6000 (45.6000) acc5: 72.0000 (72.0000) time: 12.3603 data: 12.3302 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.4530 (2.4388) acc1: 49.6000 (49.5273) acc5: 72.0000 (71.7818) time: 1.9423 data: 1.9225 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.5764 (2.6544) acc1: 46.4000 (45.2571) acc5: 69.6000 (70.0191) time: 0.9527 data: 0.9328 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.7053 (2.6843) acc1: 40.8000 (45.0323) acc5: 69.6000 (69.7032) time: 0.9586 data: 0.9387 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.5968 (2.6682) acc1: 44.0000 (44.9951) acc5: 71.2000 (70.0098) time: 0.9298 data: 0.9107 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5268 (2.6248) acc1: 45.6000 (45.6960) acc5: 72.8000 (70.8320) time: 0.8134 data: 0.7944 max mem: 2905
Test: Total time: 0:00:54 (1.0879 s / it)
* Acc@1 46.152 Acc@5 70.916 loss 2.604
Accuracy of the model on the 50000 test images: 46.2%
Max accuracy: 50.54%
Epoch: [191] [ 0/625] eta: 3:45:55 lr: 0.001319 min_lr: 0.001319 loss: 2.9978 (2.9978) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 21.6880 data: 20.7601 max mem: 2905
Epoch: [191] [200/625] eta: 0:13:59 lr: 0.001312 min_lr: 0.001312 loss: 2.9575 (2.9073) class_acc: 0.5273 (0.5475) weight_decay: 0.0500 (0.0500) grad_norm: 3.0668 (2.9456) time: 1.9654 data: 0.1931 max mem: 2905
Epoch: [191] [400/625] eta: 0:07:12 lr: 0.001305 min_lr: 0.001305 loss: 2.9726 (2.9155) class_acc: 0.5391 (0.5456) weight_decay: 0.0500 (0.0500) grad_norm: 3.9952 (3.0188) time: 1.8039 data: 0.0006 max mem: 2905
Epoch: [191] [600/625] eta: 0:00:48 lr: 0.001299 min_lr: 0.001299 loss: 2.9182 (2.9250) class_acc: 0.5391 (0.5435) weight_decay: 0.0500 (0.0500) grad_norm: 3.7701 (3.0389) time: 1.8028 data: 0.0008 max mem: 2905
Epoch: [191] [624/625] eta: 0:00:01 lr: 0.001298 min_lr: 0.001298 loss: 2.9384 (2.9255) class_acc: 0.5312 (0.5433) weight_decay: 0.0500 (0.0500) grad_norm: 3.3549 (3.0748) time: 0.6599 data: 0.0056 max mem: 2905
Epoch: [191] Total time: 0:19:49 (1.9032 s / it)
Averaged stats: lr: 0.001298 min_lr: 0.001298 loss: 2.9384 (2.9227) class_acc: 0.5312 (0.5436) weight_decay: 0.0500 (0.0500) grad_norm: 3.3549 (3.0748)
Test: [ 0/50] eta: 0:09:42 loss: 2.9738 (2.9738) acc1: 44.0000 (44.0000) acc5: 65.6000 (65.6000) time: 11.6522 data: 11.6238 max mem: 2905
Test: [10/50] eta: 0:01:16 loss: 2.6974 (2.6458) acc1: 45.6000 (46.8364) acc5: 71.2000 (71.0545) time: 1.9167 data: 1.8939 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 2.7606 (2.8378) acc1: 41.6000 (43.0476) acc5: 68.8000 (67.6571) time: 0.8942 data: 0.8732 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.8132 (2.8180) acc1: 40.0000 (43.1484) acc5: 64.8000 (67.9742) time: 0.8142 data: 0.7939 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.7896 (2.8379) acc1: 42.4000 (42.8293) acc5: 68.0000 (67.7268) time: 0.6545 data: 0.6347 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7911 (2.8367) acc1: 41.6000 (42.7040) acc5: 67.2000 (67.6960) time: 0.5751 data: 0.5550 max mem: 2905
Test: Total time: 0:00:45 (0.9002 s / it)
* Acc@1 43.686 Acc@5 67.946 loss 2.803
Accuracy of the model on the 50000 test images: 43.7%
Max accuracy: 50.54%
Epoch: [192] [ 0/625] eta: 3:29:30 lr: 0.001298 min_lr: 0.001298 loss: 2.7630 (2.7630) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 20.1135 data: 18.1851 max mem: 2905
Epoch: [192] [200/625] eta: 0:14:01 lr: 0.001291 min_lr: 0.001291 loss: 2.9235 (2.9231) class_acc: 0.5430 (0.5437) weight_decay: 0.0500 (0.0500) grad_norm: 2.9417 (2.8943) time: 1.9627 data: 0.0010 max mem: 2905
Epoch: [192] [400/625] eta: 0:07:28 lr: 0.001284 min_lr: 0.001284 loss: 2.9102 (2.9212) class_acc: 0.5430 (0.5442) weight_decay: 0.0500 (0.0500) grad_norm: 2.0723 (2.8096) time: 2.0547 data: 0.0008 max mem: 2905
Epoch: [192] [600/625] eta: 0:00:49 lr: 0.001278 min_lr: 0.001278 loss: 2.9390 (2.9262) class_acc: 0.5312 (0.5434) weight_decay: 0.0500 (0.0500) grad_norm: 2.0961 (2.7313) time: 1.9570 data: 0.0008 max mem: 2905
Epoch: [192] [624/625] eta: 0:00:01 lr: 0.001277 min_lr: 0.001277 loss: 2.8979 (2.9264) class_acc: 0.5391 (0.5431) weight_decay: 0.0500 (0.0500) grad_norm: 2.3891 (2.7369) time: 0.6761 data: 0.0016 max mem: 2905
Epoch: [192] Total time: 0:20:32 (1.9717 s / it)
Averaged stats: lr: 0.001277 min_lr: 0.001277 loss: 2.8979 (2.9222) class_acc: 0.5391 (0.5438) weight_decay: 0.0500 (0.0500) grad_norm: 2.3891 (2.7369)
Test: [ 0/50] eta: 0:10:33 loss: 3.0853 (3.0853) acc1: 36.0000 (36.0000) acc5: 66.4000 (66.4000) time: 12.6765 data: 12.6396 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 3.2519 (3.1364) acc1: 36.0000 (39.2000) acc5: 64.0000 (63.3455) time: 1.9456 data: 1.9255 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 3.3824 (3.2797) acc1: 35.2000 (36.8000) acc5: 59.2000 (61.7524) time: 0.9165 data: 0.8971 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 3.3651 (3.2682) acc1: 35.2000 (37.5742) acc5: 59.2000 (61.5742) time: 1.0402 data: 1.0209 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 3.1933 (3.2724) acc1: 35.2000 (36.9366) acc5: 60.8000 (61.6195) time: 1.1026 data: 1.0839 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.1658 (3.2495) acc1: 35.2000 (36.7840) acc5: 60.8000 (61.6320) time: 0.7949 data: 0.7734 max mem: 2905
Test: Total time: 0:00:58 (1.1661 s / it)
* Acc@1 37.428 Acc@5 62.086 loss 3.206
Accuracy of the model on the 50000 test images: 37.4%
Max accuracy: 50.54%
Epoch: [193] [ 0/625] eta: 3:25:24 lr: 0.001277 min_lr: 0.001277 loss: 2.8153 (2.8153) class_acc: 0.5586 (0.5586) weight_decay: 0.0500 (0.0500) time: 19.7195 data: 19.0394 max mem: 2905
Epoch: [193] [200/625] eta: 0:14:31 lr: 0.001270 min_lr: 0.001270 loss: 2.9840 (2.9127) class_acc: 0.5430 (0.5435) weight_decay: 0.0500 (0.0500) grad_norm: 2.8202 (3.0284) time: 1.9617 data: 0.0010 max mem: 2905
Epoch: [193] [400/625] eta: 0:07:32 lr: 0.001264 min_lr: 0.001264 loss: 2.8956 (2.9230) class_acc: 0.5508 (0.5431) weight_decay: 0.0500 (0.0500) grad_norm: 2.1415 (2.7949) time: 1.9404 data: 0.0012 max mem: 2905
Epoch: [193] [600/625] eta: 0:00:49 lr: 0.001257 min_lr: 0.001257 loss: 2.8523 (2.9272) class_acc: 0.5625 (0.5412) weight_decay: 0.0500 (0.0500) grad_norm: 2.4802 (2.9189) time: 2.0040 data: 0.0011 max mem: 2905
Epoch: [193] [624/625] eta: 0:00:01 lr: 0.001256 min_lr: 0.001256 loss: 2.9145 (2.9264) class_acc: 0.5352 (0.5415) weight_decay: 0.0500 (0.0500) grad_norm: 2.9187 (2.9353) time: 0.7411 data: 0.0022 max mem: 2905
Epoch: [193] Total time: 0:20:18 (1.9492 s / it)
Averaged stats: lr: 0.001256 min_lr: 0.001256 loss: 2.9145 (2.9196) class_acc: 0.5352 (0.5440) weight_decay: 0.0500 (0.0500) grad_norm: 2.9187 (2.9353)
Test: [ 0/50] eta: 0:10:23 loss: 3.2407 (3.2407) acc1: 31.2000 (31.2000) acc5: 63.2000 (63.2000) time: 12.4640 data: 12.4337 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.9510 (2.9637) acc1: 39.2000 (38.8364) acc5: 64.8000 (64.8000) time: 1.9627 data: 1.9430 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 3.0986 (3.1029) acc1: 37.6000 (36.6857) acc5: 63.2000 (63.1238) time: 0.9596 data: 0.9403 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 3.1853 (3.0552) acc1: 36.8000 (37.7548) acc5: 61.6000 (63.4065) time: 0.9639 data: 0.9449 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 3.0145 (3.0531) acc1: 39.2000 (38.3610) acc5: 61.6000 (63.6683) time: 0.6666 data: 0.6473 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.9667 (3.0554) acc1: 39.2000 (38.3680) acc5: 65.6000 (63.9040) time: 0.5243 data: 0.5049 max mem: 2905
Test: Total time: 0:00:47 (0.9570 s / it)
* Acc@1 39.556 Acc@5 64.008 loss 3.017
Accuracy of the model on the 50000 test images: 39.6%
Max accuracy: 50.54%
Epoch: [194] [ 0/625] eta: 3:21:16 lr: 0.001256 min_lr: 0.001256 loss: 2.8731 (2.8731) class_acc: 0.5391 (0.5391) weight_decay: 0.0500 (0.0500) time: 19.3229 data: 19.1959 max mem: 2905
Epoch: [194] [200/625] eta: 0:14:15 lr: 0.001249 min_lr: 0.001249 loss: 2.9252 (2.9092) class_acc: 0.5508 (0.5444) weight_decay: 0.0500 (0.0500) grad_norm: 2.5505 (inf) time: 1.9251 data: 0.1120 max mem: 2905
Epoch: [194] [400/625] eta: 0:07:20 lr: 0.001243 min_lr: 0.001243 loss: 2.9534 (2.9101) class_acc: 0.5312 (0.5454) weight_decay: 0.0500 (0.0500) grad_norm: 2.3721 (inf) time: 2.0637 data: 0.0744 max mem: 2905
Epoch: [194] [600/625] eta: 0:00:49 lr: 0.001236 min_lr: 0.001236 loss: 2.8972 (2.9143) class_acc: 0.5391 (0.5447) weight_decay: 0.0500 (0.0500) grad_norm: 1.9056 (inf) time: 2.0990 data: 0.0424 max mem: 2905
Epoch: [194] [624/625] eta: 0:00:01 lr: 0.001235 min_lr: 0.001235 loss: 2.9051 (2.9143) class_acc: 0.5352 (0.5445) weight_decay: 0.0500 (0.0500) grad_norm: 2.0697 (inf) time: 0.7116 data: 0.0019 max mem: 2905
Epoch: [194] Total time: 0:19:57 (1.9156 s / it)
Averaged stats: lr: 0.001235 min_lr: 0.001235 loss: 2.9051 (2.9127) class_acc: 0.5352 (0.5459) weight_decay: 0.0500 (0.0500) grad_norm: 2.0697 (inf)
Test: [ 0/50] eta: 0:10:12 loss: 2.5322 (2.5322) acc1: 49.6000 (49.6000) acc5: 72.8000 (72.8000) time: 12.2496 data: 12.2186 max mem: 2905
Test: [10/50] eta: 0:01:13 loss: 2.4090 (2.4054) acc1: 49.6000 (50.0364) acc5: 73.6000 (73.0182) time: 1.8260 data: 1.8060 max mem: 2905
Test: [20/50] eta: 0:00:40 loss: 2.5546 (2.5443) acc1: 46.4000 (46.9333) acc5: 70.4000 (71.0857) time: 0.8114 data: 0.7925 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.5546 (2.5469) acc1: 44.8000 (46.8903) acc5: 69.6000 (71.3290) time: 0.9007 data: 0.8816 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.4524 (2.5893) acc1: 47.2000 (46.6146) acc5: 69.6000 (70.5756) time: 0.8979 data: 0.8789 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4173 (2.5784) acc1: 47.2000 (46.7840) acc5: 71.2000 (70.8480) time: 0.5608 data: 0.5419 max mem: 2905
Test: Total time: 0:00:49 (0.9893 s / it)
* Acc@1 46.720 Acc@5 71.438 loss 2.548
Accuracy of the model on the 50000 test images: 46.7%
Max accuracy: 50.54%
Epoch: [195] [ 0/625] eta: 3:50:55 lr: 0.001235 min_lr: 0.001235 loss: 2.9635 (2.9635) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 22.1691 data: 17.6797 max mem: 2905
Epoch: [195] [200/625] eta: 0:14:16 lr: 0.001229 min_lr: 0.001229 loss: 2.9411 (2.8931) class_acc: 0.5352 (0.5508) weight_decay: 0.0500 (0.0500) grad_norm: 3.1960 (3.4029) time: 1.9323 data: 1.6963 max mem: 2905
Epoch: [195] [400/625] eta: 0:07:21 lr: 0.001222 min_lr: 0.001222 loss: 2.8517 (2.9087) class_acc: 0.5430 (0.5473) weight_decay: 0.0500 (0.0500) grad_norm: 2.4624 (3.3010) time: 1.9755 data: 1.1315 max mem: 2905
Epoch: [195] [600/625] eta: 0:00:51 lr: 0.001215 min_lr: 0.001215 loss: 2.9085 (2.9107) class_acc: 0.5391 (0.5471) weight_decay: 0.0500 (0.0500) grad_norm: 1.9755 (3.0550) time: 2.3877 data: 0.0006 max mem: 2905
Epoch: [195] [624/625] eta: 0:00:02 lr: 0.001215 min_lr: 0.001215 loss: 2.8800 (2.9105) class_acc: 0.5430 (0.5469) weight_decay: 0.0500 (0.0500) grad_norm: 2.7811 (3.0690) time: 1.3988 data: 0.0079 max mem: 2905
Epoch: [195] Total time: 0:21:17 (2.0439 s / it)
Averaged stats: lr: 0.001215 min_lr: 0.001215 loss: 2.8800 (2.9122) class_acc: 0.5430 (0.5458) weight_decay: 0.0500 (0.0500) grad_norm: 2.7811 (3.0690)
Test: [ 0/50] eta: 0:25:24 loss: 2.5202 (2.5202) acc1: 42.4000 (42.4000) acc5: 73.6000 (73.6000) time: 30.4993 data: 30.4583 max mem: 2905
Test: [10/50] eta: 0:02:49 loss: 2.5202 (2.4166) acc1: 47.2000 (48.2909) acc5: 73.6000 (73.9636) time: 4.2459 data: 4.2237 max mem: 2905
Test: [20/50] eta: 0:01:26 loss: 2.5572 (2.5614) acc1: 45.6000 (45.2190) acc5: 73.6000 (71.8476) time: 1.5077 data: 1.4876 max mem: 2905
Test: [30/50] eta: 0:00:47 loss: 2.6460 (2.5833) acc1: 45.6000 (45.5484) acc5: 68.8000 (71.2258) time: 1.3445 data: 1.3242 max mem: 2905
Test: [40/50] eta: 0:00:19 loss: 2.6460 (2.5963) acc1: 45.6000 (45.7171) acc5: 68.8000 (70.9073) time: 1.0184 data: 0.9981 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6124 (2.6062) acc1: 45.6000 (45.7760) acc5: 68.8000 (70.5920) time: 0.8597 data: 0.8377 max mem: 2905
Test: Total time: 0:01:22 (1.6425 s / it)
* Acc@1 46.208 Acc@5 71.248 loss 2.572
Accuracy of the model on the 50000 test images: 46.2%
Max accuracy: 50.54%
Epoch: [196] [ 0/625] eta: 3:31:25 lr: 0.001215 min_lr: 0.001215 loss: 2.7980 (2.7980) class_acc: 0.5938 (0.5938) weight_decay: 0.0500 (0.0500) time: 20.2969 data: 17.0873 max mem: 2905
Epoch: [196] [200/625] eta: 0:14:25 lr: 0.001208 min_lr: 0.001208 loss: 2.9126 (2.9048) class_acc: 0.5586 (0.5485) weight_decay: 0.0500 (0.0500) grad_norm: 2.0398 (3.0081) time: 1.7068 data: 0.0381 max mem: 2905
Epoch: [196] [400/625] eta: 0:07:22 lr: 0.001201 min_lr: 0.001201 loss: 2.9202 (2.9100) class_acc: 0.5430 (0.5459) weight_decay: 0.0500 (0.0500) grad_norm: 2.8460 (2.9066) time: 1.8193 data: 0.0122 max mem: 2905
Epoch: [196] [600/625] eta: 0:00:48 lr: 0.001195 min_lr: 0.001195 loss: 2.9258 (2.9145) class_acc: 0.5391 (0.5452) weight_decay: 0.0500 (0.0500) grad_norm: 2.9828 (inf) time: 1.9735 data: 0.0007 max mem: 2905
Epoch: [196] [624/625] eta: 0:00:01 lr: 0.001194 min_lr: 0.001194 loss: 2.9119 (2.9149) class_acc: 0.5391 (0.5452) weight_decay: 0.0500 (0.0500) grad_norm: 2.0852 (inf) time: 0.6276 data: 0.0013 max mem: 2905
Epoch: [196] Total time: 0:19:53 (1.9094 s / it)
Averaged stats: lr: 0.001194 min_lr: 0.001194 loss: 2.9119 (2.9102) class_acc: 0.5391 (0.5460) weight_decay: 0.0500 (0.0500) grad_norm: 2.0852 (inf)
Test: [ 0/50] eta: 0:09:40 loss: 2.9230 (2.9230) acc1: 43.2000 (43.2000) acc5: 64.0000 (64.0000) time: 11.6038 data: 11.5685 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.7042 (2.7395) acc1: 44.0000 (42.4727) acc5: 70.4000 (68.7273) time: 1.9352 data: 1.9158 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.8235 (2.8646) acc1: 42.4000 (41.6381) acc5: 68.8000 (67.4667) time: 1.0150 data: 0.9971 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.9258 (2.8937) acc1: 40.0000 (41.6000) acc5: 64.0000 (66.7871) time: 0.9958 data: 0.9764 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.9333 (2.9197) acc1: 40.0000 (41.5024) acc5: 64.0000 (66.2244) time: 0.7575 data: 0.7353 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.9671 (2.9522) acc1: 37.6000 (40.9600) acc5: 64.0000 (65.6320) time: 0.6845 data: 0.6635 max mem: 2905
Test: Total time: 0:00:48 (0.9646 s / it)
* Acc@1 42.232 Acc@5 65.968 loss 2.912
Accuracy of the model on the 50000 test images: 42.2%
Max accuracy: 50.54%
Epoch: [197] [ 0/625] eta: 3:29:50 lr: 0.001194 min_lr: 0.001194 loss: 2.9084 (2.9084) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 20.1449 data: 18.3367 max mem: 2905
Epoch: [197] [200/625] eta: 0:14:14 lr: 0.001187 min_lr: 0.001187 loss: 2.9193 (2.8950) class_acc: 0.5391 (0.5513) weight_decay: 0.0500 (0.0500) grad_norm: 4.0277 (3.1640) time: 2.3592 data: 0.0009 max mem: 2905
Epoch: [197] [400/625] eta: 0:07:22 lr: 0.001181 min_lr: 0.001181 loss: 2.9294 (2.9067) class_acc: 0.5312 (0.5476) weight_decay: 0.0500 (0.0500) grad_norm: 3.6728 (3.0156) time: 2.0081 data: 0.0439 max mem: 2905
Epoch: [197] [600/625] eta: 0:00:48 lr: 0.001174 min_lr: 0.001174 loss: 2.8982 (2.9108) class_acc: 0.5508 (0.5471) weight_decay: 0.0500 (0.0500) grad_norm: 2.9043 (2.9878) time: 1.9745 data: 0.0012 max mem: 2905
Epoch: [197] [624/625] eta: 0:00:01 lr: 0.001174 min_lr: 0.001174 loss: 2.8929 (2.9104) class_acc: 0.5469 (0.5472) weight_decay: 0.0500 (0.0500) grad_norm: 3.5255 (3.0085) time: 0.8097 data: 0.0018 max mem: 2905
Epoch: [197] Total time: 0:19:52 (1.9079 s / it)
Averaged stats: lr: 0.001174 min_lr: 0.001174 loss: 2.8929 (2.9062) class_acc: 0.5469 (0.5476) weight_decay: 0.0500 (0.0500) grad_norm: 3.5255 (3.0085)
Test: [ 0/50] eta: 0:10:39 loss: 2.6053 (2.6053) acc1: 48.0000 (48.0000) acc5: 71.2000 (71.2000) time: 12.7980 data: 12.7709 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 2.6053 (2.7062) acc1: 45.6000 (45.3818) acc5: 68.0000 (68.5818) time: 2.1443 data: 2.1248 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.7418 (2.7252) acc1: 42.4000 (43.6952) acc5: 68.0000 (69.1429) time: 1.0976 data: 1.0793 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.7651 (2.7647) acc1: 42.4000 (43.6645) acc5: 68.0000 (68.2839) time: 1.0932 data: 1.0741 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.8325 (2.7633) acc1: 42.4000 (43.5317) acc5: 65.6000 (68.2927) time: 0.8551 data: 0.8357 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7144 (2.7570) acc1: 44.8000 (43.8880) acc5: 68.0000 (68.1920) time: 0.7523 data: 0.7331 max mem: 2905
Test: Total time: 0:00:55 (1.1199 s / it)
* Acc@1 44.326 Acc@5 68.906 loss 2.720
Accuracy of the model on the 50000 test images: 44.3%
Max accuracy: 50.54%
Epoch: [198] [ 0/625] eta: 3:44:17 lr: 0.001174 min_lr: 0.001174 loss: 2.8976 (2.8976) class_acc: 0.5586 (0.5586) weight_decay: 0.0500 (0.0500) time: 21.5314 data: 21.1077 max mem: 2905
Epoch: [198] [200/625] eta: 0:14:15 lr: 0.001167 min_lr: 0.001167 loss: 2.9582 (2.8968) class_acc: 0.5352 (0.5524) weight_decay: 0.0500 (0.0500) grad_norm: 3.1012 (2.8716) time: 1.7730 data: 0.0092 max mem: 2905
Epoch: [198] [400/625] eta: 0:07:18 lr: 0.001161 min_lr: 0.001161 loss: 2.9451 (2.9057) class_acc: 0.5352 (0.5483) weight_decay: 0.0500 (0.0500) grad_norm: 2.2368 (2.9838) time: 1.8771 data: 0.0006 max mem: 2905
Epoch: [198] [600/625] eta: 0:00:48 lr: 0.001154 min_lr: 0.001154 loss: 2.9146 (2.9116) class_acc: 0.5508 (0.5468) weight_decay: 0.0500 (0.0500) grad_norm: 2.6784 (2.9849) time: 1.9182 data: 0.0316 max mem: 2905
Epoch: [198] [624/625] eta: 0:00:01 lr: 0.001153 min_lr: 0.001153 loss: 2.8847 (2.9107) class_acc: 0.5586 (0.5471) weight_decay: 0.0500 (0.0500) grad_norm: 2.2280 (2.9756) time: 0.5794 data: 0.0016 max mem: 2905
Epoch: [198] Total time: 0:20:02 (1.9242 s / it)
Averaged stats: lr: 0.001153 min_lr: 0.001153 loss: 2.8847 (2.9039) class_acc: 0.5586 (0.5482) weight_decay: 0.0500 (0.0500) grad_norm: 2.2280 (2.9756)
Test: [ 0/50] eta: 0:09:40 loss: 2.0337 (2.0337) acc1: 56.0000 (56.0000) acc5: 83.2000 (83.2000) time: 11.6086 data: 11.5774 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.1642 (2.2359) acc1: 50.4000 (51.4182) acc5: 76.8000 (77.1636) time: 1.9931 data: 1.9729 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.4039 (2.3975) acc1: 46.4000 (48.2667) acc5: 75.2000 (74.5905) time: 1.1121 data: 1.0927 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.5155 (2.3722) acc1: 45.6000 (48.5161) acc5: 72.8000 (74.5548) time: 1.1403 data: 1.1219 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.3333 (2.3794) acc1: 48.0000 (48.7610) acc5: 74.4000 (74.4585) time: 0.8752 data: 0.8564 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.3231 (2.3685) acc1: 48.0000 (48.6720) acc5: 75.2000 (74.6720) time: 0.7886 data: 0.7674 max mem: 2905
Test: Total time: 0:00:52 (1.0486 s / it)
* Acc@1 49.434 Acc@5 74.702 loss 2.352
Accuracy of the model on the 50000 test images: 49.4%
Max accuracy: 50.54%
Epoch: [199] [ 0/625] eta: 3:24:01 lr: 0.001153 min_lr: 0.001153 loss: 2.8251 (2.8251) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 19.5865 data: 19.1593 max mem: 2905
Epoch: [199] [200/625] eta: 0:14:12 lr: 0.001147 min_lr: 0.001147 loss: 2.9107 (2.8996) class_acc: 0.5508 (0.5489) weight_decay: 0.0500 (0.0500) grad_norm: 2.1446 (2.8866) time: 2.1043 data: 0.0006 max mem: 2905
Epoch: [199] [400/625] eta: 0:07:18 lr: 0.001140 min_lr: 0.001140 loss: 2.8977 (2.8954) class_acc: 0.5312 (0.5497) weight_decay: 0.0500 (0.0500) grad_norm: 2.2082 (2.9519) time: 1.9804 data: 0.0259 max mem: 2905
Epoch: [199] [600/625] eta: 0:00:47 lr: 0.001134 min_lr: 0.001134 loss: 2.8967 (2.9011) class_acc: 0.5430 (0.5485) weight_decay: 0.0500 (0.0500) grad_norm: 3.4360 (3.0290) time: 1.9518 data: 0.0006 max mem: 2905
Epoch: [199] [624/625] eta: 0:00:01 lr: 0.001133 min_lr: 0.001133 loss: 2.9155 (2.9021) class_acc: 0.5312 (0.5480) weight_decay: 0.0500 (0.0500) grad_norm: 2.6394 (3.0202) time: 1.1443 data: 0.0012 max mem: 2905
Epoch: [199] Total time: 0:19:38 (1.8855 s / it)
Averaged stats: lr: 0.001133 min_lr: 0.001133 loss: 2.9155 (2.9002) class_acc: 0.5312 (0.5487) weight_decay: 0.0500 (0.0500) grad_norm: 2.6394 (3.0202)
Test: [ 0/50] eta: 0:09:23 loss: 2.4074 (2.4074) acc1: 47.2000 (47.2000) acc5: 72.8000 (72.8000) time: 11.2634 data: 11.2332 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 2.4074 (2.3818) acc1: 50.4000 (48.3636) acc5: 72.8000 (72.8727) time: 1.8962 data: 1.8764 max mem: 2905
Test: [20/50] eta: 0:00:43 loss: 2.4303 (2.4626) acc1: 45.6000 (46.4000) acc5: 72.8000 (72.9905) time: 0.9518 data: 0.9322 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.5021 (2.5080) acc1: 44.8000 (46.3226) acc5: 72.0000 (72.1806) time: 0.8482 data: 0.8285 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.5666 (2.5450) acc1: 44.8000 (46.0878) acc5: 68.0000 (71.8439) time: 0.5616 data: 0.5422 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5415 (2.5402) acc1: 44.0000 (45.9840) acc5: 72.8000 (72.0960) time: 0.5596 data: 0.5396 max mem: 2905
Test: Total time: 0:00:44 (0.8948 s / it)
* Acc@1 47.292 Acc@5 72.470 loss 2.509
Accuracy of the model on the 50000 test images: 47.3%
Max accuracy: 50.54%
Epoch: [200] [ 0/625] eta: 3:43:06 lr: 0.001133 min_lr: 0.001133 loss: 3.0597 (3.0597) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 21.4189 data: 16.5883 max mem: 2905
Epoch: [200] [200/625] eta: 0:13:43 lr: 0.001126 min_lr: 0.001126 loss: 2.8482 (2.8782) class_acc: 0.5430 (0.5514) weight_decay: 0.0500 (0.0500) grad_norm: 2.4542 (2.9941) time: 1.8208 data: 0.0005 max mem: 2905
Epoch: [200] [400/625] eta: 0:07:14 lr: 0.001120 min_lr: 0.001120 loss: 2.9281 (2.8937) class_acc: 0.5469 (0.5493) weight_decay: 0.0500 (0.0500) grad_norm: 2.7431 (2.9786) time: 1.9645 data: 0.0006 max mem: 2905
Epoch: [200] [600/625] eta: 0:00:48 lr: 0.001114 min_lr: 0.001114 loss: 2.9436 (2.8957) class_acc: 0.5312 (0.5486) weight_decay: 0.0500 (0.0500) grad_norm: 2.2007 (3.0581) time: 1.9204 data: 0.0008 max mem: 2905
Epoch: [200] [624/625] eta: 0:00:01 lr: 0.001113 min_lr: 0.001113 loss: 2.9184 (2.8962) class_acc: 0.5469 (0.5488) weight_decay: 0.0500 (0.0500) grad_norm: 2.3034 (3.0387) time: 0.6089 data: 0.0015 max mem: 2905
Epoch: [200] Total time: 0:19:51 (1.9064 s / it)
Averaged stats: lr: 0.001113 min_lr: 0.001113 loss: 2.9184 (2.8964) class_acc: 0.5469 (0.5493) weight_decay: 0.0500 (0.0500) grad_norm: 2.3034 (3.0387)
Test: [ 0/50] eta: 0:09:40 loss: 2.3553 (2.3553) acc1: 46.4000 (46.4000) acc5: 77.6000 (77.6000) time: 11.6034 data: 11.5752 max mem: 2905
Test: [10/50] eta: 0:01:13 loss: 1.9268 (2.0257) acc1: 55.2000 (56.2182) acc5: 77.6000 (78.2545) time: 1.8302 data: 1.8099 max mem: 2905
Test: [20/50] eta: 0:00:41 loss: 2.3581 (2.2748) acc1: 51.2000 (51.3143) acc5: 74.4000 (75.0476) time: 0.8820 data: 0.8624 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.3483 (2.2302) acc1: 51.2000 (51.8968) acc5: 73.6000 (75.8968) time: 0.9574 data: 0.9384 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.1894 (2.2644) acc1: 52.0000 (51.5122) acc5: 74.4000 (75.2781) time: 0.8931 data: 0.8736 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.2296 (2.2747) acc1: 48.8000 (51.3280) acc5: 74.4000 (75.3280) time: 0.5499 data: 0.5294 max mem: 2905
Test: Total time: 0:00:50 (1.0084 s / it)
* Acc@1 52.052 Acc@5 76.100 loss 2.228
Accuracy of the model on the 50000 test images: 52.1%
Max accuracy: 52.05%
Epoch: [201] [ 0/625] eta: 4:03:00 lr: 0.001113 min_lr: 0.001113 loss: 2.9127 (2.9127) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 23.3292 data: 20.7101 max mem: 2905
Epoch: [201] [200/625] eta: 0:14:03 lr: 0.001106 min_lr: 0.001106 loss: 2.9559 (2.8961) class_acc: 0.5352 (0.5489) weight_decay: 0.0500 (0.0500) grad_norm: 3.1042 (3.2214) time: 2.0185 data: 0.2712 max mem: 2905
Epoch: [201] [400/625] eta: 0:07:24 lr: 0.001100 min_lr: 0.001100 loss: 2.8621 (2.9003) class_acc: 0.5547 (0.5484) weight_decay: 0.0500 (0.0500) grad_norm: 2.3592 (3.0595) time: 2.2133 data: 0.0951 max mem: 2905
Epoch: [201] [600/625] eta: 0:00:49 lr: 0.001094 min_lr: 0.001094 loss: 2.8697 (2.8973) class_acc: 0.5547 (0.5489) weight_decay: 0.0500 (0.0500) grad_norm: 4.9074 (3.2205) time: 1.9500 data: 0.0859 max mem: 2905
Epoch: [201] [624/625] eta: 0:00:01 lr: 0.001093 min_lr: 0.001093 loss: 2.9323 (2.8977) class_acc: 0.5547 (0.5490) weight_decay: 0.0500 (0.0500) grad_norm: 2.9555 (3.2072) time: 0.9491 data: 0.0527 max mem: 2905
Epoch: [201] Total time: 0:20:08 (1.9336 s / it)
Averaged stats: lr: 0.001093 min_lr: 0.001093 loss: 2.9323 (2.8937) class_acc: 0.5547 (0.5502) weight_decay: 0.0500 (0.0500) grad_norm: 2.9555 (3.2072)
Test: [ 0/50] eta: 0:10:14 loss: 2.6657 (2.6657) acc1: 46.4000 (46.4000) acc5: 66.4000 (66.4000) time: 12.2904 data: 12.2663 max mem: 2905
Test: [10/50] eta: 0:01:13 loss: 2.4544 (2.4948) acc1: 48.8000 (47.4909) acc5: 72.0000 (71.8545) time: 1.8485 data: 1.8266 max mem: 2905
Test: [20/50] eta: 0:00:39 loss: 2.5602 (2.6137) acc1: 47.2000 (46.0190) acc5: 71.2000 (71.0476) time: 0.7818 data: 0.7606 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.6539 (2.6283) acc1: 46.4000 (45.9613) acc5: 68.0000 (70.2194) time: 1.0187 data: 0.9984 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8115 (2.7271) acc1: 44.0000 (44.3122) acc5: 66.4000 (68.9756) time: 1.0284 data: 1.0087 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.8396 (2.7450) acc1: 40.8000 (44.0640) acc5: 67.2000 (68.5600) time: 0.5600 data: 0.5411 max mem: 2905
Test: Total time: 0:00:52 (1.0418 s / it)
* Acc@1 44.680 Acc@5 69.082 loss 2.720
Accuracy of the model on the 50000 test images: 44.7%
Max accuracy: 52.05%
Epoch: [202] [ 0/625] eta: 3:50:43 lr: 0.001093 min_lr: 0.001093 loss: 3.0313 (3.0313) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 22.1489 data: 20.6541 max mem: 2905
Epoch: [202] [200/625] eta: 0:14:41 lr: 0.001086 min_lr: 0.001086 loss: 2.8710 (2.8803) class_acc: 0.5508 (0.5523) weight_decay: 0.0500 (0.0500) grad_norm: 2.7238 (2.9651) time: 1.9658 data: 0.0009 max mem: 2905
Epoch: [202] [400/625] eta: 0:07:28 lr: 0.001080 min_lr: 0.001080 loss: 2.8897 (2.8892) class_acc: 0.5391 (0.5506) weight_decay: 0.0500 (0.0500) grad_norm: 3.0159 (3.1104) time: 1.8130 data: 0.0007 max mem: 2905
Epoch: [202] [600/625] eta: 0:00:49 lr: 0.001074 min_lr: 0.001074 loss: 2.8211 (2.8919) class_acc: 0.5547 (0.5506) weight_decay: 0.0500 (0.0500) grad_norm: 2.8445 (3.1244) time: 1.9175 data: 0.0006 max mem: 2905
Epoch: [202] [624/625] eta: 0:00:01 lr: 0.001073 min_lr: 0.001073 loss: 2.8772 (2.8920) class_acc: 0.5469 (0.5504) weight_decay: 0.0500 (0.0500) grad_norm: 2.6209 (3.0976) time: 0.6790 data: 0.0013 max mem: 2905
Epoch: [202] Total time: 0:20:08 (1.9335 s / it)
Averaged stats: lr: 0.001073 min_lr: 0.001073 loss: 2.8772 (2.8904) class_acc: 0.5469 (0.5508) weight_decay: 0.0500 (0.0500) grad_norm: 2.6209 (3.0976)
Test: [ 0/50] eta: 0:09:30 loss: 2.6640 (2.6640) acc1: 44.0000 (44.0000) acc5: 72.8000 (72.8000) time: 11.4009 data: 11.3727 max mem: 2905
Test: [10/50] eta: 0:01:11 loss: 2.4418 (2.5027) acc1: 45.6000 (46.4727) acc5: 72.8000 (71.5636) time: 1.7952 data: 1.7723 max mem: 2905
Test: [20/50] eta: 0:00:37 loss: 2.4775 (2.5617) acc1: 45.6000 (46.1333) acc5: 71.2000 (70.9714) time: 0.7547 data: 0.7341 max mem: 2905
Test: [30/50] eta: 0:00:21 loss: 2.5656 (2.5530) acc1: 45.6000 (46.3742) acc5: 70.4000 (70.8387) time: 0.7089 data: 0.6890 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.5656 (2.5590) acc1: 46.4000 (46.5366) acc5: 70.4000 (70.7902) time: 0.8996 data: 0.8797 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.4642 (2.5453) acc1: 47.2000 (46.7840) acc5: 72.0000 (71.0880) time: 0.6991 data: 0.6796 max mem: 2905
Test: Total time: 0:00:48 (0.9610 s / it)
* Acc@1 47.558 Acc@5 71.790 loss 2.507
Accuracy of the model on the 50000 test images: 47.6%
Max accuracy: 52.05%
Epoch: [203] [ 0/625] eta: 3:34:49 lr: 0.001073 min_lr: 0.001073 loss: 2.7870 (2.7870) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 20.6225 data: 18.6708 max mem: 2905
Epoch: [203] [200/625] eta: 0:14:22 lr: 0.001066 min_lr: 0.001066 loss: 2.9130 (2.8775) class_acc: 0.5469 (0.5538) weight_decay: 0.0500 (0.0500) grad_norm: 2.5122 (3.2295) time: 1.9272 data: 0.1290 max mem: 2905
Epoch: [203] [400/625] eta: 0:07:19 lr: 0.001060 min_lr: 0.001060 loss: 2.8503 (2.8790) class_acc: 0.5430 (0.5538) weight_decay: 0.0500 (0.0500) grad_norm: 2.8870 (inf) time: 1.9587 data: 0.1191 max mem: 2905
Epoch: [203] [600/625] eta: 0:00:48 lr: 0.001054 min_lr: 0.001054 loss: 2.8840 (2.8880) class_acc: 0.5469 (0.5518) weight_decay: 0.0500 (0.0500) grad_norm: 2.8902 (inf) time: 1.9373 data: 0.0192 max mem: 2905
Epoch: [203] [624/625] eta: 0:00:01 lr: 0.001053 min_lr: 0.001053 loss: 2.9153 (2.8881) class_acc: 0.5391 (0.5517) weight_decay: 0.0500 (0.0500) grad_norm: 2.8295 (inf) time: 0.4042 data: 0.0232 max mem: 2905
Epoch: [203] Total time: 0:19:49 (1.9037 s / it)
Averaged stats: lr: 0.001053 min_lr: 0.001053 loss: 2.9153 (2.8877) class_acc: 0.5391 (0.5515) weight_decay: 0.0500 (0.0500) grad_norm: 2.8295 (inf)
Test: [ 0/50] eta: 0:10:23 loss: 2.5038 (2.5038) acc1: 45.6000 (45.6000) acc5: 71.2000 (71.2000) time: 12.4686 data: 12.4340 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.3957 (2.3556) acc1: 48.8000 (50.0364) acc5: 72.8000 (73.6000) time: 2.0756 data: 2.0543 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.4901 (2.4844) acc1: 47.2000 (47.6571) acc5: 72.8000 (72.4571) time: 0.9925 data: 0.9721 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.5934 (2.4566) acc1: 48.0000 (48.8258) acc5: 73.6000 (72.8000) time: 0.9134 data: 0.8929 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5934 (2.5026) acc1: 48.8000 (48.0000) acc5: 72.0000 (72.2732) time: 0.7398 data: 0.7207 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6015 (2.5182) acc1: 44.8000 (47.5680) acc5: 69.6000 (71.9360) time: 0.6680 data: 0.6481 max mem: 2905
Test: Total time: 0:00:47 (0.9547 s / it)
* Acc@1 48.036 Acc@5 72.350 loss 2.484
Accuracy of the model on the 50000 test images: 48.0%
Max accuracy: 52.05%
Epoch: [204] [ 0/625] eta: 3:38:32 lr: 0.001053 min_lr: 0.001053 loss: 2.7783 (2.7783) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 20.9800 data: 20.8434 max mem: 2905
Epoch: [204] [200/625] eta: 0:13:57 lr: 0.001047 min_lr: 0.001047 loss: 2.8437 (2.8723) class_acc: 0.5586 (0.5548) weight_decay: 0.0500 (0.0500) grad_norm: 3.7670 (2.9545) time: 1.9487 data: 0.7485 max mem: 2905
Epoch: [204] [400/625] eta: 0:07:10 lr: 0.001040 min_lr: 0.001040 loss: 2.8148 (2.8789) class_acc: 0.5625 (0.5528) weight_decay: 0.0500 (0.0500) grad_norm: 2.6933 (2.9670) time: 1.8240 data: 0.0006 max mem: 2905
Epoch: [204] [600/625] eta: 0:00:47 lr: 0.001034 min_lr: 0.001034 loss: 2.9004 (2.8848) class_acc: 0.5352 (0.5519) weight_decay: 0.0500 (0.0500) grad_norm: 2.2490 (2.8898) time: 1.9440 data: 0.0170 max mem: 2905
Epoch: [204] [624/625] eta: 0:00:01 lr: 0.001033 min_lr: 0.001033 loss: 2.8976 (2.8862) class_acc: 0.5391 (0.5512) weight_decay: 0.0500 (0.0500) grad_norm: 2.2809 (2.8912) time: 0.7434 data: 0.0018 max mem: 2905
Epoch: [204] Total time: 0:19:40 (1.8890 s / it)
Averaged stats: lr: 0.001033 min_lr: 0.001033 loss: 2.8976 (2.8854) class_acc: 0.5391 (0.5524) weight_decay: 0.0500 (0.0500) grad_norm: 2.2809 (2.8912)
Test: [ 0/50] eta: 0:09:52 loss: 2.5162 (2.5162) acc1: 45.6000 (45.6000) acc5: 76.0000 (76.0000) time: 11.8526 data: 11.8252 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.3630 (2.3850) acc1: 48.8000 (49.7455) acc5: 74.4000 (73.8909) time: 2.0144 data: 1.9950 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 2.4599 (2.5727) acc1: 48.8000 (46.2857) acc5: 72.0000 (72.1143) time: 0.9101 data: 0.8908 max mem: 2905
Test: [30/50] eta: 0:00:23 loss: 2.6263 (2.5818) acc1: 45.6000 (46.4774) acc5: 72.0000 (72.0516) time: 0.7246 data: 0.7040 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.5700 (2.6056) acc1: 45.6000 (46.3610) acc5: 71.2000 (71.9220) time: 0.6557 data: 0.6354 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.6339 (2.6101) acc1: 44.0000 (46.4000) acc5: 71.2000 (71.6640) time: 0.4996 data: 0.4810 max mem: 2905
Test: Total time: 0:00:46 (0.9283 s / it)
* Acc@1 47.632 Acc@5 71.544 loss 2.566
Accuracy of the model on the 50000 test images: 47.6%
Max accuracy: 52.05%
Epoch: [205] [ 0/625] eta: 3:31:44 lr: 0.001033 min_lr: 0.001033 loss: 2.6924 (2.6924) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.3277 data: 14.7903 max mem: 2905
Epoch: [205] [200/625] eta: 0:14:10 lr: 0.001027 min_lr: 0.001027 loss: 2.8817 (2.8708) class_acc: 0.5391 (0.5558) weight_decay: 0.0500 (0.0500) grad_norm: 2.9875 (3.0523) time: 1.8813 data: 0.0006 max mem: 2905
Epoch: [205] [400/625] eta: 0:07:20 lr: 0.001021 min_lr: 0.001021 loss: 2.9028 (2.8806) class_acc: 0.5430 (0.5539) weight_decay: 0.0500 (0.0500) grad_norm: 2.5332 (3.1168) time: 1.8082 data: 0.0008 max mem: 2905
Epoch: [205] [600/625] eta: 0:00:48 lr: 0.001014 min_lr: 0.001014 loss: 2.8808 (2.8818) class_acc: 0.5508 (0.5539) weight_decay: 0.0500 (0.0500) grad_norm: 2.3462 (3.0097) time: 1.8975 data: 0.0005 max mem: 2905
Epoch: [205] [624/625] eta: 0:00:01 lr: 0.001014 min_lr: 0.001014 loss: 2.8942 (2.8826) class_acc: 0.5430 (0.5538) weight_decay: 0.0500 (0.0500) grad_norm: 2.3462 (2.9980) time: 0.7674 data: 0.0016 max mem: 2905
Epoch: [205] Total time: 0:19:49 (1.9025 s / it)
Averaged stats: lr: 0.001014 min_lr: 0.001014 loss: 2.8942 (2.8824) class_acc: 0.5430 (0.5529) weight_decay: 0.0500 (0.0500) grad_norm: 2.3462 (2.9980)
Test: [ 0/50] eta: 0:10:25 loss: 2.8214 (2.8214) acc1: 41.6000 (41.6000) acc5: 66.4000 (66.4000) time: 12.5016 data: 12.4749 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.8214 (2.8339) acc1: 43.2000 (41.9636) acc5: 68.0000 (67.7818) time: 2.0997 data: 2.0754 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.8672 (2.9050) acc1: 41.6000 (40.7238) acc5: 66.4000 (66.7810) time: 1.0246 data: 1.0028 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.9260 (2.9135) acc1: 39.2000 (40.7226) acc5: 66.4000 (66.7355) time: 0.9471 data: 0.9275 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 3.0545 (2.9637) acc1: 39.2000 (40.3707) acc5: 64.8000 (65.8732) time: 0.7521 data: 0.7315 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 3.0545 (2.9767) acc1: 40.8000 (40.4800) acc5: 64.0000 (65.5520) time: 0.7124 data: 0.6924 max mem: 2905
Test: Total time: 0:00:51 (1.0253 s / it)
* Acc@1 41.094 Acc@5 65.924 loss 2.938
Accuracy of the model on the 50000 test images: 41.1%
Max accuracy: 52.05%
Epoch: [206] [ 0/625] eta: 3:17:04 lr: 0.001014 min_lr: 0.001014 loss: 2.8358 (2.8358) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 18.9190 data: 16.1167 max mem: 2905
Epoch: [206] [200/625] eta: 0:13:54 lr: 0.001007 min_lr: 0.001007 loss: 2.8740 (2.8624) class_acc: 0.5586 (0.5565) weight_decay: 0.0500 (0.0500) grad_norm: 2.4453 (3.1968) time: 1.6688 data: 0.0011 max mem: 2905
Epoch: [206] [400/625] eta: 0:07:14 lr: 0.001001 min_lr: 0.001001 loss: 2.9083 (2.8765) class_acc: 0.5391 (0.5551) weight_decay: 0.0500 (0.0500) grad_norm: 2.6642 (3.3105) time: 1.7639 data: 0.0009 max mem: 2905
Epoch: [206] [600/625] eta: 0:00:48 lr: 0.000995 min_lr: 0.000995 loss: 2.9223 (2.8792) class_acc: 0.5469 (0.5543) weight_decay: 0.0500 (0.0500) grad_norm: 2.0321 (3.0616) time: 2.0394 data: 0.0008 max mem: 2905
Epoch: [206] [624/625] eta: 0:00:01 lr: 0.000994 min_lr: 0.000994 loss: 2.8726 (2.8799) class_acc: 0.5508 (0.5543) weight_decay: 0.0500 (0.0500) grad_norm: 2.4480 (3.0424) time: 0.7083 data: 0.0051 max mem: 2905
Epoch: [206] Total time: 0:20:02 (1.9233 s / it)
Averaged stats: lr: 0.000994 min_lr: 0.000994 loss: 2.8726 (2.8784) class_acc: 0.5508 (0.5535) weight_decay: 0.0500 (0.0500) grad_norm: 2.4480 (3.0424)
Test: [ 0/50] eta: 0:10:41 loss: 2.2146 (2.2146) acc1: 48.8000 (48.8000) acc5: 80.0000 (80.0000) time: 12.8215 data: 12.7984 max mem: 2905
Test: [10/50] eta: 0:01:31 loss: 2.3821 (2.4057) acc1: 48.8000 (51.4182) acc5: 73.6000 (73.6000) time: 2.2761 data: 2.2567 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.6524 (2.5958) acc1: 44.0000 (46.1333) acc5: 71.2000 (71.2762) time: 1.2719 data: 1.2519 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.7257 (2.6335) acc1: 40.8000 (45.2903) acc5: 69.6000 (70.7871) time: 1.1572 data: 1.1369 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.7443 (2.6591) acc1: 40.0000 (44.5463) acc5: 68.0000 (70.4000) time: 0.7079 data: 0.6886 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7054 (2.6617) acc1: 42.4000 (44.5600) acc5: 69.6000 (70.3360) time: 0.6224 data: 0.6037 max mem: 2905
Test: Total time: 0:00:53 (1.0603 s / it)
* Acc@1 45.560 Acc@5 70.562 loss 2.637
Accuracy of the model on the 50000 test images: 45.6%
Max accuracy: 52.05%
Epoch: [207] [ 0/625] eta: 4:09:27 lr: 0.000994 min_lr: 0.000994 loss: 2.6924 (2.6924) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 23.9479 data: 20.8220 max mem: 2905
Epoch: [207] [200/625] eta: 0:14:31 lr: 0.000988 min_lr: 0.000988 loss: 2.8869 (2.8617) class_acc: 0.5469 (0.5564) weight_decay: 0.0500 (0.0500) grad_norm: 2.7443 (2.7849) time: 2.0594 data: 0.9492 max mem: 2905
Epoch: [207] [400/625] eta: 0:07:35 lr: 0.000982 min_lr: 0.000982 loss: 2.8765 (2.8678) class_acc: 0.5469 (0.5544) weight_decay: 0.0500 (0.0500) grad_norm: 2.3188 (2.8626) time: 2.0815 data: 0.0008 max mem: 2905
Epoch: [207] [600/625] eta: 0:00:51 lr: 0.000976 min_lr: 0.000976 loss: 2.9320 (2.8744) class_acc: 0.5430 (0.5544) weight_decay: 0.0500 (0.0500) grad_norm: 2.4625 (2.9755) time: 2.2362 data: 0.0009 max mem: 2905
Epoch: [207] [624/625] eta: 0:00:01 lr: 0.000975 min_lr: 0.000975 loss: 2.8622 (2.8744) class_acc: 0.5469 (0.5543) weight_decay: 0.0500 (0.0500) grad_norm: 2.7302 (3.0051) time: 0.6907 data: 0.0013 max mem: 2905
Epoch: [207] Total time: 0:20:49 (1.9997 s / it)
Averaged stats: lr: 0.000975 min_lr: 0.000975 loss: 2.8622 (2.8756) class_acc: 0.5469 (0.5546) weight_decay: 0.0500 (0.0500) grad_norm: 2.7302 (3.0051)
Test: [ 0/50] eta: 0:11:58 loss: 2.6572 (2.6572) acc1: 39.2000 (39.2000) acc5: 69.6000 (69.6000) time: 14.3756 data: 14.3465 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 2.5540 (2.4685) acc1: 49.6000 (47.8545) acc5: 72.8000 (73.8909) time: 2.2004 data: 2.1810 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.5540 (2.5972) acc1: 47.2000 (46.6286) acc5: 72.0000 (71.4667) time: 1.0236 data: 1.0049 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.4939 (2.5700) acc1: 47.2000 (47.2258) acc5: 70.4000 (71.9742) time: 1.0658 data: 1.0470 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.6372 (2.6029) acc1: 46.4000 (46.6927) acc5: 71.2000 (71.3366) time: 0.9092 data: 0.8903 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.6372 (2.6001) acc1: 46.4000 (46.8960) acc5: 71.2000 (71.3600) time: 0.7616 data: 0.7423 max mem: 2905
Test: Total time: 0:00:56 (1.1338 s / it)
* Acc@1 47.810 Acc@5 71.930 loss 2.556
Accuracy of the model on the 50000 test images: 47.8%
Max accuracy: 52.05%
Epoch: [208] [ 0/625] eta: 3:56:25 lr: 0.000975 min_lr: 0.000975 loss: 3.0261 (3.0261) class_acc: 0.4922 (0.4922) weight_decay: 0.0500 (0.0500) time: 22.6962 data: 21.7468 max mem: 2905
Epoch: [208] [200/625] eta: 0:14:58 lr: 0.000969 min_lr: 0.000969 loss: 2.8062 (2.8636) class_acc: 0.5664 (0.5572) weight_decay: 0.0500 (0.0500) grad_norm: 2.1851 (3.0015) time: 1.8681 data: 0.0548 max mem: 2905
Epoch: [208] [400/625] eta: 0:07:38 lr: 0.000963 min_lr: 0.000963 loss: 2.8370 (2.8658) class_acc: 0.5547 (0.5562) weight_decay: 0.0500 (0.0500) grad_norm: 2.6270 (3.0561) time: 1.9013 data: 0.0008 max mem: 2905
Epoch: [208] [600/625] eta: 0:00:50 lr: 0.000956 min_lr: 0.000956 loss: 2.8803 (2.8694) class_acc: 0.5508 (0.5554) weight_decay: 0.0500 (0.0500) grad_norm: 3.1440 (3.0891) time: 2.0358 data: 0.0008 max mem: 2905
Epoch: [208] [624/625] eta: 0:00:01 lr: 0.000956 min_lr: 0.000956 loss: 2.8772 (2.8712) class_acc: 0.5430 (0.5550) weight_decay: 0.0500 (0.0500) grad_norm: 2.8043 (3.0737) time: 0.8078 data: 0.0017 max mem: 2905
Epoch: [208] Total time: 0:20:31 (1.9697 s / it)
Averaged stats: lr: 0.000956 min_lr: 0.000956 loss: 2.8772 (2.8711) class_acc: 0.5430 (0.5555) weight_decay: 0.0500 (0.0500) grad_norm: 2.8043 (3.0737)
Test: [ 0/50] eta: 0:10:05 loss: 2.2059 (2.2059) acc1: 54.4000 (54.4000) acc5: 76.8000 (76.8000) time: 12.1020 data: 12.0707 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 1.9689 (2.0178) acc1: 56.8000 (56.6545) acc5: 79.2000 (79.9273) time: 1.9765 data: 1.9570 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.1178 (2.1763) acc1: 52.8000 (53.6762) acc5: 78.4000 (78.4381) time: 0.9524 data: 0.9325 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.1953 (2.2158) acc1: 50.4000 (52.8516) acc5: 76.8000 (77.7290) time: 0.9173 data: 0.8972 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.1433 (2.2386) acc1: 50.4000 (52.7610) acc5: 76.0000 (77.5220) time: 0.6740 data: 0.6556 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.1799 (2.2359) acc1: 50.4000 (52.4640) acc5: 76.8000 (77.6000) time: 0.5923 data: 0.5743 max mem: 2905
Test: Total time: 0:00:46 (0.9310 s / it)
* Acc@1 53.338 Acc@5 77.504 loss 2.205
Accuracy of the model on the 50000 test images: 53.3%
Max accuracy: 53.34%
Epoch: [209] [ 0/625] eta: 3:27:12 lr: 0.000956 min_lr: 0.000956 loss: 2.9900 (2.9900) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 19.8919 data: 19.7616 max mem: 2905
Epoch: [209] [200/625] eta: 0:13:49 lr: 0.000950 min_lr: 0.000950 loss: 2.8705 (2.8713) class_acc: 0.5430 (0.5553) weight_decay: 0.0500 (0.0500) grad_norm: 5.3777 (3.5759) time: 1.7230 data: 0.0007 max mem: 2905
Epoch: [209] [400/625] eta: 0:07:14 lr: 0.000944 min_lr: 0.000944 loss: 2.8353 (2.8723) class_acc: 0.5508 (0.5546) weight_decay: 0.0500 (0.0500) grad_norm: 2.6873 (3.4149) time: 1.9223 data: 0.1811 max mem: 2905
Epoch: [209] [600/625] eta: 0:00:48 lr: 0.000937 min_lr: 0.000937 loss: 2.8240 (2.8762) class_acc: 0.5664 (0.5539) weight_decay: 0.0500 (0.0500) grad_norm: 2.4924 (3.4393) time: 2.0451 data: 0.0374 max mem: 2905
Epoch: [209] [624/625] eta: 0:00:01 lr: 0.000937 min_lr: 0.000937 loss: 2.8606 (2.8772) class_acc: 0.5469 (0.5537) weight_decay: 0.0500 (0.0500) grad_norm: 2.4966 (3.4208) time: 0.9492 data: 0.0124 max mem: 2905
Epoch: [209] Total time: 0:19:42 (1.8918 s / it)
Averaged stats: lr: 0.000937 min_lr: 0.000937 loss: 2.8606 (2.8691) class_acc: 0.5469 (0.5564) weight_decay: 0.0500 (0.0500) grad_norm: 2.4966 (3.4208)
Test: [ 0/50] eta: 0:10:40 loss: 2.4309 (2.4309) acc1: 46.4000 (46.4000) acc5: 71.2000 (71.2000) time: 12.8180 data: 12.7911 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.3890 (2.3352) acc1: 50.4000 (50.5455) acc5: 74.4000 (73.7455) time: 2.0322 data: 2.0108 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.4491 (2.5133) acc1: 46.4000 (46.8571) acc5: 73.6000 (72.1524) time: 1.0141 data: 0.9944 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.5529 (2.5310) acc1: 43.2000 (46.8903) acc5: 71.2000 (71.8452) time: 1.0555 data: 1.0372 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.5512 (2.5667) acc1: 45.6000 (47.0439) acc5: 71.2000 (71.3171) time: 0.8003 data: 0.7822 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.5972 (2.5768) acc1: 47.2000 (47.0400) acc5: 68.0000 (70.9760) time: 0.6907 data: 0.6723 max mem: 2905
Test: Total time: 0:00:51 (1.0250 s / it)
* Acc@1 47.846 Acc@5 71.524 loss 2.530
Accuracy of the model on the 50000 test images: 47.8%
Max accuracy: 53.34%
Epoch: [210] [ 0/625] eta: 3:14:37 lr: 0.000937 min_lr: 0.000937 loss: 2.8904 (2.8904) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) time: 18.6843 data: 17.7713 max mem: 2905
Epoch: [210] [200/625] eta: 0:14:23 lr: 0.000931 min_lr: 0.000931 loss: 2.8605 (2.8540) class_acc: 0.5586 (0.5611) weight_decay: 0.0500 (0.0500) grad_norm: 2.7377 (2.7882) time: 1.8635 data: 0.0008 max mem: 2905
Epoch: [210] [400/625] eta: 0:07:19 lr: 0.000925 min_lr: 0.000925 loss: 2.8761 (2.8583) class_acc: 0.5469 (0.5586) weight_decay: 0.0500 (0.0500) grad_norm: 3.1565 (2.8829) time: 1.8057 data: 0.0007 max mem: 2905
Epoch: [210] [600/625] eta: 0:00:48 lr: 0.000918 min_lr: 0.000918 loss: 2.8734 (2.8645) class_acc: 0.5625 (0.5572) weight_decay: 0.0500 (0.0500) grad_norm: 3.7189 (2.9967) time: 2.0221 data: 0.0009 max mem: 2905
Epoch: [210] [624/625] eta: 0:00:01 lr: 0.000918 min_lr: 0.000918 loss: 2.8697 (2.8653) class_acc: 0.5430 (0.5568) weight_decay: 0.0500 (0.0500) grad_norm: 4.3915 (3.0620) time: 0.7600 data: 0.0012 max mem: 2905
Epoch: [210] Total time: 0:19:45 (1.8974 s / it)
Averaged stats: lr: 0.000918 min_lr: 0.000918 loss: 2.8697 (2.8671) class_acc: 0.5430 (0.5565) weight_decay: 0.0500 (0.0500) grad_norm: 4.3915 (3.0620)
Test: [ 0/50] eta: 0:09:13 loss: 1.8271 (1.8271) acc1: 55.2000 (55.2000) acc5: 84.0000 (84.0000) time: 11.0732 data: 11.0440 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 1.9140 (1.9644) acc1: 58.4000 (58.6909) acc5: 81.6000 (80.2182) time: 2.0627 data: 2.0433 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.1921 (2.1230) acc1: 52.0000 (54.5524) acc5: 77.6000 (78.2095) time: 1.2114 data: 1.1929 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.2073 (2.1345) acc1: 52.0000 (54.8387) acc5: 76.0000 (77.9097) time: 1.1808 data: 1.1616 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.2189 (2.1653) acc1: 52.8000 (54.3024) acc5: 76.8000 (77.3659) time: 0.7509 data: 0.7302 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.2722 (2.1755) acc1: 51.2000 (53.6800) acc5: 76.8000 (77.4080) time: 0.6409 data: 0.6211 max mem: 2905
Test: Total time: 0:00:51 (1.0256 s / it)
* Acc@1 54.420 Acc@5 78.240 loss 2.133
Accuracy of the model on the 50000 test images: 54.4%
Max accuracy: 54.42%
Epoch: [211] [ 0/625] eta: 3:26:31 lr: 0.000918 min_lr: 0.000918 loss: 2.7751 (2.7751) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 19.8269 data: 17.6656 max mem: 2905
Epoch: [211] [200/625] eta: 0:13:29 lr: 0.000912 min_lr: 0.000912 loss: 2.8535 (2.8510) class_acc: 0.5469 (0.5608) weight_decay: 0.0500 (0.0500) grad_norm: 2.8003 (3.1526) time: 1.8632 data: 0.0007 max mem: 2905
Epoch: [211] [400/625] eta: 0:07:02 lr: 0.000906 min_lr: 0.000906 loss: 2.8203 (2.8542) class_acc: 0.5625 (0.5593) weight_decay: 0.0500 (0.0500) grad_norm: 2.7219 (3.1742) time: 1.9107 data: 0.0007 max mem: 2905
Epoch: [211] [600/625] eta: 0:00:47 lr: 0.000900 min_lr: 0.000900 loss: 2.9224 (2.8626) class_acc: 0.5508 (0.5574) weight_decay: 0.0500 (0.0500) grad_norm: 3.3338 (3.2063) time: 2.0623 data: 0.1461 max mem: 2905
Epoch: [211] [624/625] eta: 0:00:01 lr: 0.000899 min_lr: 0.000899 loss: 2.8189 (2.8629) class_acc: 0.5586 (0.5574) weight_decay: 0.0500 (0.0500) grad_norm: 3.4208 (3.2116) time: 0.8514 data: 0.2455 max mem: 2905
Epoch: [211] Total time: 0:19:18 (1.8535 s / it)
Averaged stats: lr: 0.000899 min_lr: 0.000899 loss: 2.8189 (2.8648) class_acc: 0.5586 (0.5573) weight_decay: 0.0500 (0.0500) grad_norm: 3.4208 (3.2116)
Test: [ 0/50] eta: 0:09:22 loss: 2.6556 (2.6556) acc1: 41.6000 (41.6000) acc5: 69.6000 (69.6000) time: 11.2446 data: 11.2208 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 2.4608 (2.4689) acc1: 47.2000 (48.0000) acc5: 72.0000 (72.5818) time: 1.9351 data: 1.9150 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 2.6005 (2.6847) acc1: 43.2000 (43.7714) acc5: 68.8000 (70.2095) time: 1.0431 data: 1.0233 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.7997 (2.6992) acc1: 40.8000 (43.7161) acc5: 67.2000 (69.0839) time: 0.9903 data: 0.9702 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.7758 (2.7548) acc1: 43.2000 (43.4732) acc5: 67.2000 (68.5268) time: 0.6178 data: 0.5963 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.7928 (2.7611) acc1: 42.4000 (43.4560) acc5: 64.8000 (68.2240) time: 0.5405 data: 0.5197 max mem: 2905
Test: Total time: 0:00:45 (0.9039 s / it)
* Acc@1 44.112 Acc@5 69.138 loss 2.721
Accuracy of the model on the 50000 test images: 44.1%
Max accuracy: 54.42%
Epoch: [212] [ 0/625] eta: 4:36:50 lr: 0.000899 min_lr: 0.000899 loss: 2.7424 (2.7424) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 26.5775 data: 17.8196 max mem: 2905
Epoch: [212] [200/625] eta: 0:13:50 lr: 0.000893 min_lr: 0.000893 loss: 2.9033 (2.8590) class_acc: 0.5586 (0.5614) weight_decay: 0.0500 (0.0500) grad_norm: 2.6885 (2.6922) time: 1.7908 data: 0.0008 max mem: 2905
Epoch: [212] [400/625] eta: 0:07:02 lr: 0.000887 min_lr: 0.000887 loss: 2.9157 (2.8567) class_acc: 0.5508 (0.5599) weight_decay: 0.0500 (0.0500) grad_norm: 2.6200 (2.9315) time: 1.8920 data: 0.0006 max mem: 2905
Epoch: [212] [600/625] eta: 0:00:46 lr: 0.000881 min_lr: 0.000881 loss: 2.9140 (2.8588) class_acc: 0.5547 (0.5582) weight_decay: 0.0500 (0.0500) grad_norm: 2.9470 (2.9191) time: 1.8899 data: 0.0007 max mem: 2905
Epoch: [212] [624/625] eta: 0:00:01 lr: 0.000880 min_lr: 0.000880 loss: 2.8315 (2.8584) class_acc: 0.5508 (0.5581) weight_decay: 0.0500 (0.0500) grad_norm: 2.7213 (2.9196) time: 0.8123 data: 0.0020 max mem: 2905
Epoch: [212] Total time: 0:19:15 (1.8490 s / it)
Averaged stats: lr: 0.000880 min_lr: 0.000880 loss: 2.8315 (2.8586) class_acc: 0.5508 (0.5583) weight_decay: 0.0500 (0.0500) grad_norm: 2.7213 (2.9196)
Test: [ 0/50] eta: 0:09:58 loss: 2.9251 (2.9251) acc1: 38.4000 (38.4000) acc5: 62.4000 (62.4000) time: 11.9609 data: 11.9297 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 2.4204 (2.3933) acc1: 51.2000 (49.9636) acc5: 73.6000 (72.7273) time: 2.1246 data: 2.1054 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.6036 (2.6218) acc1: 45.6000 (46.0571) acc5: 69.6000 (69.9810) time: 1.1581 data: 1.1389 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.7284 (2.6017) acc1: 44.0000 (46.7097) acc5: 68.0000 (70.1936) time: 1.1218 data: 1.1021 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.6970 (2.6432) acc1: 46.4000 (46.2049) acc5: 68.0000 (69.7561) time: 0.7653 data: 0.7451 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.7736 (2.6540) acc1: 45.6000 (45.8720) acc5: 68.0000 (69.6480) time: 0.6503 data: 0.6277 max mem: 2905
Test: Total time: 0:00:51 (1.0223 s / it)
* Acc@1 46.210 Acc@5 70.362 loss 2.627
Accuracy of the model on the 50000 test images: 46.2%
Max accuracy: 54.42%
Epoch: [213] [ 0/625] eta: 3:24:59 lr: 0.000880 min_lr: 0.000880 loss: 2.7151 (2.7151) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 19.6793 data: 19.5196 max mem: 2905
Epoch: [213] [200/625] eta: 0:14:13 lr: 0.000874 min_lr: 0.000874 loss: 2.8313 (2.8353) class_acc: 0.5547 (0.5618) weight_decay: 0.0500 (0.0500) grad_norm: 2.7075 (inf) time: 1.9866 data: 0.0007 max mem: 2905
Epoch: [213] [400/625] eta: 0:07:16 lr: 0.000868 min_lr: 0.000868 loss: 2.8347 (2.8456) class_acc: 0.5703 (0.5601) weight_decay: 0.0500 (0.0500) grad_norm: 3.2377 (inf) time: 1.8750 data: 0.4142 max mem: 2905
Epoch: [213] [600/625] eta: 0:00:48 lr: 0.000863 min_lr: 0.000863 loss: 2.8553 (2.8525) class_acc: 0.5547 (0.5593) weight_decay: 0.0500 (0.0500) grad_norm: 2.2295 (inf) time: 2.1345 data: 0.0007 max mem: 2905
Epoch: [213] [624/625] eta: 0:00:01 lr: 0.000862 min_lr: 0.000862 loss: 2.8048 (2.8519) class_acc: 0.5625 (0.5595) weight_decay: 0.0500 (0.0500) grad_norm: 3.2696 (inf) time: 0.8310 data: 0.0340 max mem: 2905
Epoch: [213] Total time: 0:19:46 (1.8984 s / it)
Averaged stats: lr: 0.000862 min_lr: 0.000862 loss: 2.8048 (2.8573) class_acc: 0.5625 (0.5587) weight_decay: 0.0500 (0.0500) grad_norm: 3.2696 (inf)
Test: [ 0/50] eta: 0:09:41 loss: 2.2315 (2.2315) acc1: 51.2000 (51.2000) acc5: 75.2000 (75.2000) time: 11.6324 data: 11.6045 max mem: 2905
Test: [10/50] eta: 0:01:32 loss: 1.8380 (1.8331) acc1: 61.6000 (59.1273) acc5: 80.8000 (80.4364) time: 2.3106 data: 2.2897 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 1.9574 (2.0150) acc1: 56.0000 (55.1238) acc5: 79.2000 (79.2000) time: 1.3276 data: 1.3074 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.1893 (2.0638) acc1: 51.2000 (54.1677) acc5: 76.8000 (78.2452) time: 1.1706 data: 1.1510 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.1573 (2.0808) acc1: 51.2000 (53.8927) acc5: 76.8000 (78.2634) time: 0.8058 data: 0.7857 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.1114 (2.0973) acc1: 53.6000 (53.7440) acc5: 79.2000 (78.3840) time: 0.7066 data: 0.6858 max mem: 2905
Test: Total time: 0:00:56 (1.1298 s / it)
* Acc@1 54.930 Acc@5 79.038 loss 2.047
Accuracy of the model on the 50000 test images: 54.9%
Max accuracy: 54.93%
Epoch: [214] [ 0/625] eta: 3:58:26 lr: 0.000862 min_lr: 0.000862 loss: 2.7778 (2.7778) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 22.8907 data: 20.8083 max mem: 2905
Epoch: [214] [200/625] eta: 0:14:42 lr: 0.000856 min_lr: 0.000856 loss: 2.8224 (2.8515) class_acc: 0.5586 (0.5594) weight_decay: 0.0500 (0.0500) grad_norm: 2.7554 (3.1999) time: 1.9666 data: 1.7635 max mem: 2905
Epoch: [214] [400/625] eta: 0:07:38 lr: 0.000850 min_lr: 0.000850 loss: 2.8676 (2.8529) class_acc: 0.5625 (0.5593) weight_decay: 0.0500 (0.0500) grad_norm: 3.2494 (3.0852) time: 1.9234 data: 0.0009 max mem: 2905
Epoch: [214] [600/625] eta: 0:00:50 lr: 0.000844 min_lr: 0.000844 loss: 2.8551 (2.8546) class_acc: 0.5547 (0.5593) weight_decay: 0.0500 (0.0500) grad_norm: 3.8018 (3.2174) time: 1.9554 data: 0.0437 max mem: 2905
Epoch: [214] [624/625] eta: 0:00:01 lr: 0.000844 min_lr: 0.000844 loss: 2.8616 (2.8548) class_acc: 0.5508 (0.5592) weight_decay: 0.0500 (0.0500) grad_norm: 3.4322 (3.2245) time: 0.8874 data: 0.0297 max mem: 2905
Epoch: [214] Total time: 0:20:36 (1.9784 s / it)
Averaged stats: lr: 0.000844 min_lr: 0.000844 loss: 2.8616 (2.8526) class_acc: 0.5508 (0.5597) weight_decay: 0.0500 (0.0500) grad_norm: 3.4322 (3.2245)
Test: [ 0/50] eta: 0:10:27 loss: 1.8443 (1.8443) acc1: 59.2000 (59.2000) acc5: 84.8000 (84.8000) time: 12.5563 data: 12.5030 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 1.8811 (1.9989) acc1: 58.4000 (56.5091) acc5: 82.4000 (80.5091) time: 2.1748 data: 2.1497 max mem: 2905
Test: [20/50] eta: 0:00:51 loss: 2.2505 (2.2000) acc1: 52.0000 (53.4857) acc5: 76.8000 (77.8286) time: 1.1660 data: 1.1448 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.2456 (2.2004) acc1: 49.6000 (52.8516) acc5: 74.4000 (77.2645) time: 1.0801 data: 1.0607 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.1910 (2.2062) acc1: 50.4000 (53.0732) acc5: 75.2000 (76.9171) time: 0.6585 data: 0.6403 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.1012 (2.1918) acc1: 52.0000 (52.9920) acc5: 76.8000 (77.1360) time: 0.5313 data: 0.5119 max mem: 2905
Test: Total time: 0:00:50 (1.0048 s / it)
* Acc@1 53.612 Acc@5 77.116 loss 2.167
Accuracy of the model on the 50000 test images: 53.6%
Max accuracy: 54.93%
Epoch: [215] [ 0/625] eta: 3:40:29 lr: 0.000843 min_lr: 0.000843 loss: 2.8950 (2.8950) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) time: 21.1671 data: 16.5604 max mem: 2905
Epoch: [215] [200/625] eta: 0:14:20 lr: 0.000838 min_lr: 0.000838 loss: 2.7813 (2.8414) class_acc: 0.5586 (0.5603) weight_decay: 0.0500 (0.0500) grad_norm: 3.6882 (3.5993) time: 1.8034 data: 0.1104 max mem: 2905
Epoch: [215] [400/625] eta: 0:07:16 lr: 0.000832 min_lr: 0.000832 loss: 2.8371 (2.8471) class_acc: 0.5586 (0.5594) weight_decay: 0.0500 (0.0500) grad_norm: 3.0763 (3.4911) time: 1.8987 data: 0.5416 max mem: 2905
Epoch: [215] [600/625] eta: 0:00:48 lr: 0.000826 min_lr: 0.000826 loss: 2.8327 (2.8493) class_acc: 0.5508 (0.5585) weight_decay: 0.0500 (0.0500) grad_norm: 2.0044 (3.2370) time: 1.9963 data: 0.5071 max mem: 2905
Epoch: [215] [624/625] eta: 0:00:01 lr: 0.000825 min_lr: 0.000825 loss: 2.8587 (2.8496) class_acc: 0.5625 (0.5585) weight_decay: 0.0500 (0.0500) grad_norm: 3.4475 (3.2905) time: 0.8059 data: 0.0362 max mem: 2905
Epoch: [215] Total time: 0:19:47 (1.8998 s / it)
Averaged stats: lr: 0.000825 min_lr: 0.000825 loss: 2.8587 (2.8514) class_acc: 0.5625 (0.5602) weight_decay: 0.0500 (0.0500) grad_norm: 3.4475 (3.2905)
Test: [ 0/50] eta: 0:10:27 loss: 2.0447 (2.0447) acc1: 52.0000 (52.0000) acc5: 80.0000 (80.0000) time: 12.5470 data: 12.5222 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 2.0447 (2.0176) acc1: 55.2000 (55.4909) acc5: 80.0000 (78.9091) time: 1.9625 data: 1.9415 max mem: 2905
Test: [20/50] eta: 0:00:44 loss: 2.1533 (2.2046) acc1: 50.4000 (51.8476) acc5: 76.8000 (76.8000) time: 0.9194 data: 0.8990 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 2.3237 (2.2326) acc1: 49.6000 (51.5355) acc5: 74.4000 (76.3097) time: 0.8371 data: 0.8167 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.2201 (2.2387) acc1: 49.6000 (51.4732) acc5: 76.0000 (76.1951) time: 0.5905 data: 0.5692 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.2400 (2.2682) acc1: 49.6000 (51.0720) acc5: 75.2000 (75.5360) time: 0.5230 data: 0.5001 max mem: 2905
Test: Total time: 0:00:45 (0.9122 s / it)
* Acc@1 52.106 Acc@5 76.104 loss 2.221
Accuracy of the model on the 50000 test images: 52.1%
Max accuracy: 54.93%
Epoch: [216] [ 0/625] eta: 3:40:31 lr: 0.000825 min_lr: 0.000825 loss: 3.0004 (3.0004) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 21.1708 data: 16.5791 max mem: 2905
Epoch: [216] [200/625] eta: 0:14:06 lr: 0.000819 min_lr: 0.000819 loss: 2.8070 (2.8458) class_acc: 0.5508 (0.5618) weight_decay: 0.0500 (0.0500) grad_norm: 4.0907 (3.2868) time: 1.9264 data: 0.0056 max mem: 2905
Epoch: [216] [400/625] eta: 0:07:26 lr: 0.000814 min_lr: 0.000814 loss: 2.8496 (2.8463) class_acc: 0.5742 (0.5619) weight_decay: 0.0500 (0.0500) grad_norm: 2.3506 (3.1835) time: 2.1112 data: 0.0007 max mem: 2905
Epoch: [216] [600/625] eta: 0:00:49 lr: 0.000808 min_lr: 0.000808 loss: 2.8175 (2.8458) class_acc: 0.5547 (0.5622) weight_decay: 0.0500 (0.0500) grad_norm: 2.6576 (3.0471) time: 2.1839 data: 0.0007 max mem: 2905
Epoch: [216] [624/625] eta: 0:00:01 lr: 0.000807 min_lr: 0.000807 loss: 2.8191 (2.8459) class_acc: 0.5664 (0.5621) weight_decay: 0.0500 (0.0500) grad_norm: 2.5842 (3.0354) time: 1.0618 data: 0.0014 max mem: 2905
Epoch: [216] Total time: 0:20:18 (1.9493 s / it)
Averaged stats: lr: 0.000807 min_lr: 0.000807 loss: 2.8191 (2.8505) class_acc: 0.5664 (0.5607) weight_decay: 0.0500 (0.0500) grad_norm: 2.5842 (3.0354)
Test: [ 0/50] eta: 0:10:39 loss: 2.2310 (2.2310) acc1: 47.2000 (47.2000) acc5: 75.2000 (75.2000) time: 12.7974 data: 12.7611 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 1.8731 (1.9213) acc1: 57.6000 (57.3091) acc5: 80.8000 (80.2909) time: 2.1791 data: 2.1569 max mem: 2905
Test: [20/50] eta: 0:00:52 loss: 2.0377 (2.0957) acc1: 53.6000 (53.5619) acc5: 77.6000 (78.1714) time: 1.1921 data: 1.1717 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.1861 (2.1070) acc1: 52.8000 (53.6774) acc5: 75.2000 (77.8065) time: 1.2187 data: 1.1984 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.0831 (2.1169) acc1: 52.8000 (53.6976) acc5: 77.6000 (77.7366) time: 0.8201 data: 0.7998 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.0815 (2.1230) acc1: 52.8000 (53.4240) acc5: 78.4000 (77.9200) time: 0.7780 data: 0.7587 max mem: 2905
Test: Total time: 0:00:53 (1.0745 s / it)
* Acc@1 54.462 Acc@5 78.244 loss 2.088
Accuracy of the model on the 50000 test images: 54.5%
Max accuracy: 54.93%
Epoch: [217] [ 0/625] eta: 3:53:24 lr: 0.000807 min_lr: 0.000807 loss: 2.9204 (2.9204) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 22.4080 data: 17.3745 max mem: 2905
Epoch: [217] [200/625] eta: 0:14:10 lr: 0.000801 min_lr: 0.000801 loss: 2.8442 (2.8423) class_acc: 0.5586 (0.5623) weight_decay: 0.0500 (0.0500) grad_norm: 3.1329 (3.0758) time: 1.9887 data: 0.0007 max mem: 2905
Epoch: [217] [400/625] eta: 0:07:18 lr: 0.000796 min_lr: 0.000796 loss: 2.8225 (2.8415) class_acc: 0.5547 (0.5631) weight_decay: 0.0500 (0.0500) grad_norm: 3.5125 (3.3279) time: 1.8608 data: 0.0007 max mem: 2905
Epoch: [217] [600/625] eta: 0:00:49 lr: 0.000790 min_lr: 0.000790 loss: 2.8185 (2.8459) class_acc: 0.5547 (0.5613) weight_decay: 0.0500 (0.0500) grad_norm: 2.7362 (3.2111) time: 2.0081 data: 0.0008 max mem: 2905
Epoch: [217] [624/625] eta: 0:00:01 lr: 0.000789 min_lr: 0.000789 loss: 2.7639 (2.8446) class_acc: 0.5820 (0.5618) weight_decay: 0.0500 (0.0500) grad_norm: 2.1391 (3.1896) time: 0.8480 data: 0.0017 max mem: 2905
Epoch: [217] Total time: 0:20:01 (1.9220 s / it)
Averaged stats: lr: 0.000789 min_lr: 0.000789 loss: 2.7639 (2.8447) class_acc: 0.5820 (0.5620) weight_decay: 0.0500 (0.0500) grad_norm: 2.1391 (3.1896)
Test: [ 0/50] eta: 0:10:44 loss: 2.9077 (2.9077) acc1: 40.8000 (40.8000) acc5: 62.4000 (62.4000) time: 12.8834 data: 12.8567 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 2.6843 (2.6003) acc1: 47.2000 (44.9455) acc5: 69.6000 (69.0909) time: 2.0829 data: 2.0624 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.7803 (2.7979) acc1: 41.6000 (42.4000) acc5: 67.2000 (66.8571) time: 1.0067 data: 0.9873 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.9680 (2.8100) acc1: 39.2000 (42.4000) acc5: 65.6000 (66.8129) time: 0.9720 data: 0.9532 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.8448 (2.8293) acc1: 39.2000 (42.2829) acc5: 66.4000 (67.1610) time: 0.7156 data: 0.6958 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.9717 (2.8423) acc1: 41.6000 (42.2560) acc5: 66.4000 (67.0080) time: 0.5978 data: 0.5772 max mem: 2905
Test: Total time: 0:00:49 (0.9960 s / it)
* Acc@1 42.578 Acc@5 67.498 loss 2.796
Accuracy of the model on the 50000 test images: 42.6%
Max accuracy: 54.93%
Epoch: [218] [ 0/625] eta: 3:51:37 lr: 0.000789 min_lr: 0.000789 loss: 2.8966 (2.8966) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 22.2367 data: 17.6602 max mem: 2905
Epoch: [218] [200/625] eta: 0:14:32 lr: 0.000784 min_lr: 0.000784 loss: 2.8269 (2.8308) class_acc: 0.5664 (0.5644) weight_decay: 0.0500 (0.0500) grad_norm: 2.1742 (3.6475) time: 1.9986 data: 0.0355 max mem: 2905
Epoch: [218] [400/625] eta: 0:07:24 lr: 0.000778 min_lr: 0.000778 loss: 2.8209 (2.8363) class_acc: 0.5586 (0.5632) weight_decay: 0.0500 (0.0500) grad_norm: 2.4928 (3.5446) time: 1.9136 data: 0.0086 max mem: 2905
Epoch: [218] [600/625] eta: 0:00:49 lr: 0.000772 min_lr: 0.000772 loss: 2.8345 (2.8411) class_acc: 0.5547 (0.5610) weight_decay: 0.0500 (0.0500) grad_norm: 2.4417 (3.4768) time: 2.1508 data: 0.0043 max mem: 2905
Epoch: [218] [624/625] eta: 0:00:01 lr: 0.000772 min_lr: 0.000772 loss: 2.8303 (2.8411) class_acc: 0.5547 (0.5610) weight_decay: 0.0500 (0.0500) grad_norm: 2.4417 (3.4883) time: 0.7725 data: 0.0016 max mem: 2905
Epoch: [218] Total time: 0:20:11 (1.9385 s / it)
Averaged stats: lr: 0.000772 min_lr: 0.000772 loss: 2.8303 (2.8420) class_acc: 0.5547 (0.5619) weight_decay: 0.0500 (0.0500) grad_norm: 2.4417 (3.4883)
Test: [ 0/50] eta: 0:12:11 loss: 2.1146 (2.1146) acc1: 52.8000 (52.8000) acc5: 80.8000 (80.8000) time: 14.6324 data: 14.5998 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 1.9208 (1.9196) acc1: 56.0000 (57.6727) acc5: 80.8000 (80.7273) time: 2.2126 data: 2.1911 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 2.0204 (2.1221) acc1: 55.2000 (55.3524) acc5: 78.4000 (78.6286) time: 1.0435 data: 1.0239 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 2.2861 (2.1400) acc1: 53.6000 (54.6065) acc5: 76.0000 (78.1419) time: 1.0841 data: 1.0650 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.0928 (2.1398) acc1: 52.0000 (54.3024) acc5: 76.0000 (77.5805) time: 0.8826 data: 0.8625 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.0954 (2.1521) acc1: 52.0000 (54.2240) acc5: 77.6000 (77.4080) time: 0.8566 data: 0.8348 max mem: 2905
Test: Total time: 0:00:53 (1.0720 s / it)
* Acc@1 55.472 Acc@5 78.604 loss 2.097
Accuracy of the model on the 50000 test images: 55.5%
Max accuracy: 55.47%
Epoch: [219] [ 0/625] eta: 3:25:59 lr: 0.000771 min_lr: 0.000771 loss: 2.8030 (2.8030) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 19.7753 data: 18.4014 max mem: 2905
Epoch: [219] [200/625] eta: 0:14:27 lr: 0.000766 min_lr: 0.000766 loss: 2.7904 (2.8293) class_acc: 0.5664 (0.5654) weight_decay: 0.0500 (0.0500) grad_norm: 2.4579 (3.1716) time: 2.0906 data: 0.3221 max mem: 2905
Epoch: [219] [400/625] eta: 0:07:31 lr: 0.000760 min_lr: 0.000760 loss: 2.8437 (2.8308) class_acc: 0.5586 (0.5645) weight_decay: 0.0500 (0.0500) grad_norm: 2.3487 (2.9924) time: 1.9062 data: 0.0012 max mem: 2905
Epoch: [219] [600/625] eta: 0:00:50 lr: 0.000755 min_lr: 0.000755 loss: 2.8752 (2.8355) class_acc: 0.5586 (0.5638) weight_decay: 0.0500 (0.0500) grad_norm: 2.6165 (3.0341) time: 2.0310 data: 0.0007 max mem: 2905
Epoch: [219] [624/625] eta: 0:00:01 lr: 0.000754 min_lr: 0.000754 loss: 2.8732 (2.8374) class_acc: 0.5508 (0.5633) weight_decay: 0.0500 (0.0500) grad_norm: 2.2060 (3.0129) time: 1.1016 data: 0.0078 max mem: 2905
Epoch: [219] Total time: 0:20:32 (1.9718 s / it)
Averaged stats: lr: 0.000754 min_lr: 0.000754 loss: 2.8732 (2.8364) class_acc: 0.5508 (0.5634) weight_decay: 0.0500 (0.0500) grad_norm: 2.2060 (3.0129)
Test: [ 0/50] eta: 0:10:31 loss: 2.1764 (2.1764) acc1: 48.8000 (48.8000) acc5: 79.2000 (79.2000) time: 12.6317 data: 12.5952 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 1.9786 (2.0772) acc1: 56.8000 (55.0545) acc5: 77.6000 (77.6727) time: 2.2674 data: 2.2435 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 2.1775 (2.1679) acc1: 54.4000 (53.4095) acc5: 76.8000 (77.2571) time: 1.2643 data: 1.2431 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.2345 (2.1959) acc1: 51.2000 (52.7226) acc5: 76.8000 (76.8516) time: 1.1012 data: 1.0819 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.3541 (2.2453) acc1: 50.4000 (51.9024) acc5: 74.4000 (76.1951) time: 0.6042 data: 0.5850 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.3377 (2.2435) acc1: 51.2000 (51.9360) acc5: 74.4000 (76.2400) time: 0.6075 data: 0.5885 max mem: 2905
Test: Total time: 0:00:51 (1.0258 s / it)
* Acc@1 52.896 Acc@5 76.974 loss 2.217
Accuracy of the model on the 50000 test images: 52.9%
Max accuracy: 55.47%
Epoch: [220] [ 0/625] eta: 3:49:14 lr: 0.000754 min_lr: 0.000754 loss: 2.8331 (2.8331) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 22.0069 data: 19.7209 max mem: 2905
Epoch: [220] [200/625] eta: 0:14:19 lr: 0.000748 min_lr: 0.000748 loss: 2.7866 (2.8227) class_acc: 0.5781 (0.5676) weight_decay: 0.0500 (0.0500) grad_norm: 2.5551 (3.4210) time: 2.0994 data: 0.0007 max mem: 2905
Epoch: [220] [400/625] eta: 0:07:23 lr: 0.000743 min_lr: 0.000743 loss: 2.8310 (2.8385) class_acc: 0.5625 (0.5629) weight_decay: 0.0500 (0.0500) grad_norm: 2.5217 (inf) time: 1.9486 data: 0.0495 max mem: 2905
Epoch: [220] [600/625] eta: 0:00:49 lr: 0.000737 min_lr: 0.000737 loss: 2.9002 (2.8402) class_acc: 0.5547 (0.5627) weight_decay: 0.0500 (0.0500) grad_norm: 3.1108 (inf) time: 2.0586 data: 0.0007 max mem: 2905
Epoch: [220] [624/625] eta: 0:00:01 lr: 0.000736 min_lr: 0.000736 loss: 2.8853 (2.8412) class_acc: 0.5508 (0.5624) weight_decay: 0.0500 (0.0500) grad_norm: 3.5105 (inf) time: 0.4050 data: 0.0015 max mem: 2905
Epoch: [220] Total time: 0:20:11 (1.9384 s / it)
Averaged stats: lr: 0.000736 min_lr: 0.000736 loss: 2.8853 (2.8362) class_acc: 0.5508 (0.5635) weight_decay: 0.0500 (0.0500) grad_norm: 3.5105 (inf)
Test: [ 0/50] eta: 0:10:08 loss: 2.5442 (2.5442) acc1: 48.8000 (48.8000) acc5: 73.6000 (73.6000) time: 12.1652 data: 12.1342 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 2.3319 (2.3187) acc1: 51.2000 (52.2182) acc5: 74.4000 (74.7636) time: 2.0241 data: 2.0037 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 2.4589 (2.4447) acc1: 49.6000 (49.0667) acc5: 72.8000 (73.6381) time: 1.0061 data: 0.9861 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.5015 (2.4221) acc1: 49.6000 (49.8839) acc5: 73.6000 (74.1936) time: 0.8924 data: 0.8722 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.4818 (2.4485) acc1: 50.4000 (49.2683) acc5: 72.8000 (73.2293) time: 0.5313 data: 0.5116 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.5120 (2.4597) acc1: 46.4000 (49.0880) acc5: 71.2000 (73.2160) time: 0.4708 data: 0.4508 max mem: 2905
Test: Total time: 0:00:44 (0.8869 s / it)
* Acc@1 49.028 Acc@5 73.226 loss 2.430
Accuracy of the model on the 50000 test images: 49.0%
Max accuracy: 55.47%
Epoch: [221] [ 0/625] eta: 3:24:02 lr: 0.000736 min_lr: 0.000736 loss: 2.7191 (2.7191) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 19.5875 data: 19.2361 max mem: 2905
Epoch: [221] [200/625] eta: 0:14:19 lr: 0.000731 min_lr: 0.000731 loss: 2.7883 (2.8130) class_acc: 0.5742 (0.5692) weight_decay: 0.0500 (0.0500) grad_norm: 2.8464 (3.3130) time: 1.8483 data: 0.0009 max mem: 2905
Epoch: [221] [400/625] eta: 0:07:25 lr: 0.000725 min_lr: 0.000725 loss: 2.8561 (2.8254) class_acc: 0.5625 (0.5677) weight_decay: 0.0500 (0.0500) grad_norm: 2.7967 (3.2403) time: 1.8415 data: 0.0009 max mem: 2905
Epoch: [221] [600/625] eta: 0:00:49 lr: 0.000720 min_lr: 0.000720 loss: 2.8826 (2.8304) class_acc: 0.5547 (0.5670) weight_decay: 0.0500 (0.0500) grad_norm: 2.7030 (3.1994) time: 2.0911 data: 0.0007 max mem: 2905
Epoch: [221] [624/625] eta: 0:00:01 lr: 0.000719 min_lr: 0.000719 loss: 2.8839 (2.8314) class_acc: 0.5508 (0.5664) weight_decay: 0.0500 (0.0500) grad_norm: 2.3692 (3.1862) time: 0.8245 data: 0.0014 max mem: 2905
Epoch: [221] Total time: 0:20:18 (1.9495 s / it)
Averaged stats: lr: 0.000719 min_lr: 0.000719 loss: 2.8839 (2.8317) class_acc: 0.5508 (0.5652) weight_decay: 0.0500 (0.0500) grad_norm: 2.3692 (3.1862)
Test: [ 0/50] eta: 0:11:05 loss: 2.1198 (2.1198) acc1: 48.8000 (48.8000) acc5: 80.0000 (80.0000) time: 13.3195 data: 13.2905 max mem: 2905
Test: [10/50] eta: 0:01:31 loss: 2.1198 (2.1117) acc1: 53.6000 (53.4545) acc5: 76.8000 (76.0000) time: 2.2865 data: 2.2674 max mem: 2905
Test: [20/50] eta: 0:00:55 loss: 2.2780 (2.2737) acc1: 50.4000 (50.5905) acc5: 74.4000 (75.0857) time: 1.2596 data: 1.2406 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 2.3160 (2.2701) acc1: 50.4000 (51.6645) acc5: 74.4000 (75.0710) time: 1.2884 data: 1.2691 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.2639 (2.2961) acc1: 52.0000 (51.2000) acc5: 73.6000 (74.8098) time: 0.9322 data: 0.9101 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.2748 (2.2995) acc1: 47.2000 (51.0880) acc5: 74.4000 (74.8960) time: 0.8987 data: 0.8758 max mem: 2905
Test: Total time: 0:00:58 (1.1621 s / it)
* Acc@1 51.604 Acc@5 75.564 loss 2.276
Accuracy of the model on the 50000 test images: 51.6%
Max accuracy: 55.47%
Epoch: [222] [ 0/625] eta: 3:33:08 lr: 0.000719 min_lr: 0.000719 loss: 2.7429 (2.7429) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 20.4617 data: 17.8072 max mem: 2905
Epoch: [222] [200/625] eta: 0:14:27 lr: 0.000714 min_lr: 0.000714 loss: 2.8334 (2.8090) class_acc: 0.5664 (0.5703) weight_decay: 0.0500 (0.0500) grad_norm: 4.6817 (3.6760) time: 2.0688 data: 0.0077 max mem: 2905
Epoch: [222] [400/625] eta: 0:07:22 lr: 0.000708 min_lr: 0.000708 loss: 2.8283 (2.8220) class_acc: 0.5586 (0.5670) weight_decay: 0.0500 (0.0500) grad_norm: 3.3044 (3.7010) time: 1.9727 data: 0.0009 max mem: 2905
Epoch: [222] [600/625] eta: 0:00:49 lr: 0.000703 min_lr: 0.000703 loss: 2.7872 (2.8256) class_acc: 0.5625 (0.5660) weight_decay: 0.0500 (0.0500) grad_norm: 2.5118 (3.6341) time: 1.9153 data: 0.0006 max mem: 2905
Epoch: [222] [624/625] eta: 0:00:01 lr: 0.000702 min_lr: 0.000702 loss: 2.8644 (2.8263) class_acc: 0.5625 (0.5660) weight_decay: 0.0500 (0.0500) grad_norm: 2.5995 (3.5931) time: 0.8339 data: 0.0014 max mem: 2905
Epoch: [222] Total time: 0:20:05 (1.9281 s / it)
Averaged stats: lr: 0.000702 min_lr: 0.000702 loss: 2.8644 (2.8276) class_acc: 0.5625 (0.5657) weight_decay: 0.0500 (0.0500) grad_norm: 2.5995 (3.5931)
Test: [ 0/50] eta: 0:10:24 loss: 2.0144 (2.0144) acc1: 52.8000 (52.8000) acc5: 79.2000 (79.2000) time: 12.4907 data: 12.4628 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 1.7840 (1.7650) acc1: 59.2000 (59.6364) acc5: 81.6000 (82.4727) time: 2.0255 data: 2.0065 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 1.8505 (1.9359) acc1: 56.0000 (56.8381) acc5: 81.6000 (80.4191) time: 1.0336 data: 1.0148 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.0733 (1.9769) acc1: 53.6000 (56.3097) acc5: 78.4000 (79.6903) time: 0.9340 data: 0.9128 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 2.0218 (1.9949) acc1: 54.4000 (56.0585) acc5: 78.4000 (79.2195) time: 0.5812 data: 0.5604 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.0308 (2.0174) acc1: 55.2000 (55.6160) acc5: 77.6000 (78.9440) time: 0.5745 data: 0.5565 max mem: 2905
Test: Total time: 0:00:46 (0.9377 s / it)
* Acc@1 56.632 Acc@5 79.536 loss 1.973
Accuracy of the model on the 50000 test images: 56.6%
Max accuracy: 56.63%
Epoch: [223] [ 0/625] eta: 3:22:34 lr: 0.000702 min_lr: 0.000702 loss: 2.9567 (2.9567) class_acc: 0.5391 (0.5391) weight_decay: 0.0500 (0.0500) time: 19.4466 data: 19.2668 max mem: 2905
Epoch: [223] [200/625] eta: 0:13:51 lr: 0.000696 min_lr: 0.000696 loss: 2.8644 (2.8272) class_acc: 0.5664 (0.5657) weight_decay: 0.0500 (0.0500) grad_norm: 3.6796 (3.6625) time: 1.7534 data: 0.2931 max mem: 2905
Epoch: [223] [400/625] eta: 0:07:07 lr: 0.000691 min_lr: 0.000691 loss: 2.8362 (2.8247) class_acc: 0.5547 (0.5659) weight_decay: 0.0500 (0.0500) grad_norm: 3.2313 (3.3947) time: 1.8213 data: 0.0668 max mem: 2905
Epoch: [223] [600/625] eta: 0:00:47 lr: 0.000686 min_lr: 0.000686 loss: 2.8040 (2.8253) class_acc: 0.5625 (0.5663) weight_decay: 0.0500 (0.0500) grad_norm: 2.9826 (3.3668) time: 1.7651 data: 0.3873 max mem: 2905
Epoch: [223] [624/625] eta: 0:00:01 lr: 0.000685 min_lr: 0.000685 loss: 2.8220 (2.8259) class_acc: 0.5625 (0.5662) weight_decay: 0.0500 (0.0500) grad_norm: 2.3721 (3.3306) time: 0.7192 data: 0.0981 max mem: 2905
Epoch: [223] Total time: 0:19:33 (1.8772 s / it)
Averaged stats: lr: 0.000685 min_lr: 0.000685 loss: 2.8220 (2.8278) class_acc: 0.5625 (0.5659) weight_decay: 0.0500 (0.0500) grad_norm: 2.3721 (3.3306)
Test: [ 0/50] eta: 0:09:39 loss: 2.2917 (2.2917) acc1: 54.4000 (54.4000) acc5: 72.0000 (72.0000) time: 11.5949 data: 11.5598 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 2.1529 (2.0980) acc1: 53.6000 (54.4727) acc5: 76.8000 (76.8000) time: 1.9888 data: 1.9659 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.1606 (2.1968) acc1: 51.2000 (52.8000) acc5: 76.8000 (76.6857) time: 1.0899 data: 1.0689 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.3104 (2.2169) acc1: 51.2000 (52.5419) acc5: 76.8000 (76.1806) time: 1.0420 data: 1.0200 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.3458 (2.2673) acc1: 50.4000 (51.8439) acc5: 75.2000 (75.3561) time: 0.6473 data: 0.6240 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.3437 (2.2792) acc1: 50.4000 (51.7920) acc5: 73.6000 (75.1040) time: 0.6055 data: 0.5842 max mem: 2905
Test: Total time: 0:00:46 (0.9381 s / it)
* Acc@1 52.020 Acc@5 76.140 loss 2.233
Accuracy of the model on the 50000 test images: 52.0%
Max accuracy: 56.63%
Epoch: [224] [ 0/625] eta: 3:18:55 lr: 0.000685 min_lr: 0.000685 loss: 2.7990 (2.7990) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 19.0963 data: 18.9780 max mem: 2905
Epoch: [224] [200/625] eta: 0:13:42 lr: 0.000680 min_lr: 0.000680 loss: 2.7993 (2.8212) class_acc: 0.5625 (0.5712) weight_decay: 0.0500 (0.0500) grad_norm: 2.4578 (3.1635) time: 1.9367 data: 0.0007 max mem: 2905
Epoch: [224] [400/625] eta: 0:07:12 lr: 0.000674 min_lr: 0.000674 loss: 2.8184 (2.8224) class_acc: 0.5664 (0.5691) weight_decay: 0.0500 (0.0500) grad_norm: 2.6766 (3.3197) time: 1.9716 data: 0.0663 max mem: 2905
Epoch: [224] [600/625] eta: 0:00:48 lr: 0.000669 min_lr: 0.000669 loss: 2.8155 (2.8274) class_acc: 0.5664 (0.5676) weight_decay: 0.0500 (0.0500) grad_norm: 2.1413 (3.2610) time: 1.8458 data: 0.0329 max mem: 2905
Epoch: [224] [624/625] eta: 0:00:01 lr: 0.000668 min_lr: 0.000668 loss: 2.8499 (2.8284) class_acc: 0.5430 (0.5673) weight_decay: 0.0500 (0.0500) grad_norm: 3.3804 (3.2899) time: 0.8596 data: 0.0259 max mem: 2905
Epoch: [224] Total time: 0:19:38 (1.8864 s / it)
Averaged stats: lr: 0.000668 min_lr: 0.000668 loss: 2.8499 (2.8211) class_acc: 0.5430 (0.5675) weight_decay: 0.0500 (0.0500) grad_norm: 3.3804 (3.2899)
Test: [ 0/50] eta: 0:09:54 loss: 2.1635 (2.1635) acc1: 52.8000 (52.8000) acc5: 80.0000 (80.0000) time: 11.8913 data: 11.8653 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 1.9494 (1.9558) acc1: 56.0000 (56.5091) acc5: 80.8000 (80.5818) time: 2.0800 data: 2.0612 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 2.0537 (2.1261) acc1: 51.2000 (53.2191) acc5: 77.6000 (78.5524) time: 1.1533 data: 1.1341 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 2.3219 (2.1630) acc1: 51.2000 (53.0323) acc5: 76.0000 (77.7290) time: 1.1852 data: 1.1658 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.1732 (2.1829) acc1: 51.2000 (52.8195) acc5: 76.0000 (77.2293) time: 0.9708 data: 0.9524 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.1732 (2.1901) acc1: 50.4000 (52.4960) acc5: 76.0000 (76.9920) time: 0.8942 data: 0.8750 max mem: 2905
Test: Total time: 0:00:55 (1.1077 s / it)
* Acc@1 53.376 Acc@5 77.586 loss 2.148
Accuracy of the model on the 50000 test images: 53.4%
Max accuracy: 56.63%
Epoch: [225] [ 0/625] eta: 3:50:12 lr: 0.000668 min_lr: 0.000668 loss: 2.7869 (2.7869) class_acc: 0.5391 (0.5391) weight_decay: 0.0500 (0.0500) time: 22.1005 data: 16.8768 max mem: 2905
Epoch: [225] [200/625] eta: 0:13:42 lr: 0.000663 min_lr: 0.000663 loss: 2.8162 (2.8060) class_acc: 0.5625 (0.5691) weight_decay: 0.0500 (0.0500) grad_norm: 2.5242 (2.9538) time: 1.6414 data: 0.1694 max mem: 2905
Epoch: [225] [400/625] eta: 0:07:11 lr: 0.000657 min_lr: 0.000657 loss: 2.8392 (2.8119) class_acc: 0.5586 (0.5688) weight_decay: 0.0500 (0.0500) grad_norm: 3.2505 (3.1525) time: 1.9047 data: 0.0820 max mem: 2905
Epoch: [225] [600/625] eta: 0:00:47 lr: 0.000652 min_lr: 0.000652 loss: 2.8350 (2.8188) class_acc: 0.5625 (0.5675) weight_decay: 0.0500 (0.0500) grad_norm: 2.4554 (3.0685) time: 1.8824 data: 0.8664 max mem: 2905
Epoch: [225] [624/625] eta: 0:00:01 lr: 0.000652 min_lr: 0.000652 loss: 2.8253 (2.8200) class_acc: 0.5742 (0.5674) weight_decay: 0.0500 (0.0500) grad_norm: 2.1150 (3.0519) time: 0.7363 data: 0.2715 max mem: 2905
Epoch: [225] Total time: 0:19:44 (1.8953 s / it)
Averaged stats: lr: 0.000652 min_lr: 0.000652 loss: 2.8253 (2.8192) class_acc: 0.5742 (0.5679) weight_decay: 0.0500 (0.0500) grad_norm: 2.1150 (3.0519)
Test: [ 0/50] eta: 0:09:34 loss: 1.6075 (1.6075) acc1: 57.6000 (57.6000) acc5: 88.8000 (88.8000) time: 11.4850 data: 11.4609 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 1.7576 (1.7273) acc1: 60.0000 (60.3636) acc5: 84.0000 (83.7818) time: 1.9831 data: 1.9645 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 1.8631 (1.8731) acc1: 59.2000 (58.2857) acc5: 80.8000 (81.6000) time: 1.0916 data: 1.0735 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 1.9401 (1.8816) acc1: 56.0000 (58.1677) acc5: 80.8000 (81.6000) time: 1.0413 data: 1.0228 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.9401 (1.8972) acc1: 56.0000 (57.9122) acc5: 81.6000 (81.1317) time: 0.6668 data: 0.6452 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.9292 (1.9028) acc1: 55.2000 (57.6160) acc5: 80.8000 (81.0560) time: 0.6076 data: 0.5857 max mem: 2905
Test: Total time: 0:00:47 (0.9564 s / it)
* Acc@1 58.326 Acc@5 81.262 loss 1.876
Accuracy of the model on the 50000 test images: 58.3%
Max accuracy: 58.33%
Epoch: [226] [ 0/625] eta: 3:19:33 lr: 0.000651 min_lr: 0.000651 loss: 2.7994 (2.7994) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 19.1581 data: 18.9488 max mem: 2905
Epoch: [226] [200/625] eta: 0:13:54 lr: 0.000646 min_lr: 0.000646 loss: 2.7681 (2.7975) class_acc: 0.5742 (0.5737) weight_decay: 0.0500 (0.0500) grad_norm: 3.0465 (3.2634) time: 1.9344 data: 0.0008 max mem: 2905
Epoch: [226] [400/625] eta: 0:07:16 lr: 0.000641 min_lr: 0.000641 loss: 2.7806 (2.8149) class_acc: 0.5742 (0.5704) weight_decay: 0.0500 (0.0500) grad_norm: 3.1574 (3.2118) time: 1.9394 data: 0.0457 max mem: 2905
Epoch: [226] [600/625] eta: 0:00:48 lr: 0.000636 min_lr: 0.000636 loss: 2.8242 (2.8189) class_acc: 0.5586 (0.5685) weight_decay: 0.0500 (0.0500) grad_norm: 3.6197 (3.4137) time: 1.7290 data: 0.0008 max mem: 2905
Epoch: [226] [624/625] eta: 0:00:01 lr: 0.000635 min_lr: 0.000635 loss: 2.8306 (2.8193) class_acc: 0.5625 (0.5685) weight_decay: 0.0500 (0.0500) grad_norm: 2.4763 (3.3777) time: 0.8559 data: 0.0014 max mem: 2905
Epoch: [226] Total time: 0:19:46 (1.8983 s / it)
Averaged stats: lr: 0.000635 min_lr: 0.000635 loss: 2.8306 (2.8159) class_acc: 0.5625 (0.5684) weight_decay: 0.0500 (0.0500) grad_norm: 2.4763 (3.3777)
Test: [ 0/50] eta: 0:10:12 loss: 2.1853 (2.1853) acc1: 49.6000 (49.6000) acc5: 77.6000 (77.6000) time: 12.2435 data: 12.2151 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 2.0433 (2.0694) acc1: 53.6000 (54.6182) acc5: 77.6000 (77.4545) time: 2.0473 data: 2.0272 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 2.2561 (2.2374) acc1: 51.2000 (52.0000) acc5: 75.2000 (75.6571) time: 1.0616 data: 1.0413 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.3767 (2.2360) acc1: 50.4000 (52.3613) acc5: 75.2000 (75.8194) time: 1.0745 data: 1.0535 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.2327 (2.2545) acc1: 51.2000 (51.6488) acc5: 75.2000 (75.3951) time: 0.8368 data: 0.8164 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.2977 (2.2851) acc1: 47.2000 (50.8000) acc5: 74.4000 (75.0240) time: 0.7598 data: 0.7395 max mem: 2905
Test: Total time: 0:00:51 (1.0249 s / it)
* Acc@1 51.768 Acc@5 75.880 loss 2.233
Accuracy of the model on the 50000 test images: 51.8%
Max accuracy: 58.33%
Epoch: [227] [ 0/625] eta: 4:23:18 lr: 0.000635 min_lr: 0.000635 loss: 2.7508 (2.7508) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 25.2776 data: 21.0737 max mem: 2905
Epoch: [227] [200/625] eta: 0:14:30 lr: 0.000630 min_lr: 0.000630 loss: 2.8346 (2.8022) class_acc: 0.5586 (0.5686) weight_decay: 0.0500 (0.0500) grad_norm: 2.8335 (3.4276) time: 2.0562 data: 0.0695 max mem: 2905
Epoch: [227] [400/625] eta: 0:07:23 lr: 0.000625 min_lr: 0.000625 loss: 2.8212 (2.8047) class_acc: 0.5664 (0.5695) weight_decay: 0.0500 (0.0500) grad_norm: 2.7345 (3.3682) time: 1.8246 data: 0.0554 max mem: 2905
Epoch: [227] [600/625] eta: 0:00:49 lr: 0.000619 min_lr: 0.000619 loss: 2.7757 (2.8087) class_acc: 0.5742 (0.5687) weight_decay: 0.0500 (0.0500) grad_norm: 2.4021 (3.4449) time: 2.0238 data: 0.0033 max mem: 2905
Epoch: [227] [624/625] eta: 0:00:01 lr: 0.000619 min_lr: 0.000619 loss: 2.7762 (2.8077) class_acc: 0.5703 (0.5690) weight_decay: 0.0500 (0.0500) grad_norm: 2.3825 (3.4170) time: 0.7313 data: 0.0013 max mem: 2905
Epoch: [227] Total time: 0:19:56 (1.9137 s / it)
Averaged stats: lr: 0.000619 min_lr: 0.000619 loss: 2.7762 (2.8117) class_acc: 0.5703 (0.5693) weight_decay: 0.0500 (0.0500) grad_norm: 2.3825 (3.4170)
Test: [ 0/50] eta: 0:09:40 loss: 1.9122 (1.9122) acc1: 58.4000 (58.4000) acc5: 82.4000 (82.4000) time: 11.6133 data: 11.5741 max mem: 2905
Test: [10/50] eta: 0:01:12 loss: 1.8697 (1.7995) acc1: 58.4000 (59.2727) acc5: 80.8000 (81.3818) time: 1.8095 data: 1.7879 max mem: 2905
Test: [20/50] eta: 0:00:41 loss: 1.9306 (1.9705) acc1: 56.0000 (56.4191) acc5: 78.4000 (79.7714) time: 0.8581 data: 0.8385 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 2.1099 (2.0106) acc1: 52.8000 (55.5355) acc5: 78.4000 (79.2774) time: 1.0661 data: 1.0466 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.0536 (2.0259) acc1: 52.8000 (55.3561) acc5: 78.4000 (78.8878) time: 0.9853 data: 0.9665 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.0485 (2.0400) acc1: 54.4000 (54.9600) acc5: 78.4000 (78.8160) time: 0.5242 data: 0.5047 max mem: 2905
Test: Total time: 0:00:51 (1.0377 s / it)
* Acc@1 55.902 Acc@5 79.566 loss 1.993
Accuracy of the model on the 50000 test images: 55.9%
Max accuracy: 58.33%
Epoch: [228] [ 0/625] eta: 3:33:30 lr: 0.000619 min_lr: 0.000619 loss: 2.8542 (2.8542) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 20.4963 data: 19.0074 max mem: 2905
Epoch: [228] [200/625] eta: 0:14:04 lr: 0.000614 min_lr: 0.000614 loss: 2.8009 (2.8050) class_acc: 0.5781 (0.5716) weight_decay: 0.0500 (0.0500) grad_norm: 3.0872 (3.1895) time: 1.8658 data: 1.6852 max mem: 2905
Epoch: [228] [400/625] eta: 0:07:20 lr: 0.000608 min_lr: 0.000608 loss: 2.7718 (2.8085) class_acc: 0.5742 (0.5702) weight_decay: 0.0500 (0.0500) grad_norm: 3.0361 (3.2611) time: 1.8774 data: 0.0008 max mem: 2905
Epoch: [228] [600/625] eta: 0:00:49 lr: 0.000603 min_lr: 0.000603 loss: 2.7492 (2.8101) class_acc: 0.5820 (0.5692) weight_decay: 0.0500 (0.0500) grad_norm: 3.3052 (3.2620) time: 1.8297 data: 0.0008 max mem: 2905
Epoch: [228] [624/625] eta: 0:00:01 lr: 0.000603 min_lr: 0.000603 loss: 2.8008 (2.8095) class_acc: 0.5703 (0.5694) weight_decay: 0.0500 (0.0500) grad_norm: 3.4762 (3.2899) time: 0.8699 data: 0.0187 max mem: 2905
Epoch: [228] Total time: 0:20:07 (1.9318 s / it)
Averaged stats: lr: 0.000603 min_lr: 0.000603 loss: 2.8008 (2.8107) class_acc: 0.5703 (0.5699) weight_decay: 0.0500 (0.0500) grad_norm: 3.4762 (3.2899)
Test: [ 0/50] eta: 0:10:58 loss: 1.8527 (1.8527) acc1: 57.6000 (57.6000) acc5: 83.2000 (83.2000) time: 13.1633 data: 13.1382 max mem: 2905
Test: [10/50] eta: 0:01:33 loss: 1.7647 (1.7371) acc1: 60.0000 (60.5091) acc5: 81.6000 (82.6909) time: 2.3490 data: 2.3300 max mem: 2905
Test: [20/50] eta: 0:00:56 loss: 1.9968 (1.9506) acc1: 56.0000 (56.6476) acc5: 79.2000 (80.8000) time: 1.3055 data: 1.2864 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 2.1427 (1.9518) acc1: 55.2000 (57.1871) acc5: 79.2000 (80.5936) time: 1.2016 data: 1.1829 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 1.8928 (1.9603) acc1: 56.0000 (57.2878) acc5: 80.8000 (80.2342) time: 0.7832 data: 0.7653 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.9211 (1.9747) acc1: 56.0000 (56.7840) acc5: 79.2000 (79.9680) time: 0.6976 data: 0.6799 max mem: 2905
Test: Total time: 0:00:56 (1.1202 s / it)
* Acc@1 57.990 Acc@5 80.910 loss 1.928
Accuracy of the model on the 50000 test images: 58.0%
Max accuracy: 58.33%
Epoch: [229] [ 0/625] eta: 4:17:54 lr: 0.000603 min_lr: 0.000603 loss: 2.6692 (2.6692) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 24.7595 data: 19.2615 max mem: 2905
Epoch: [229] [200/625] eta: 0:14:13 lr: 0.000597 min_lr: 0.000597 loss: 2.7499 (2.7962) class_acc: 0.5703 (0.5730) weight_decay: 0.0500 (0.0500) grad_norm: 2.7308 (3.5185) time: 1.8759 data: 0.0009 max mem: 2905
Epoch: [229] [400/625] eta: 0:07:20 lr: 0.000592 min_lr: 0.000592 loss: 2.8252 (2.7999) class_acc: 0.5703 (0.5718) weight_decay: 0.0500 (0.0500) grad_norm: 4.5798 (3.5899) time: 1.9614 data: 0.0145 max mem: 2905
Epoch: [229] [600/625] eta: 0:00:49 lr: 0.000587 min_lr: 0.000587 loss: 2.8627 (2.8065) class_acc: 0.5547 (0.5707) weight_decay: 0.0500 (0.0500) grad_norm: 3.3104 (3.6582) time: 2.0219 data: 0.0376 max mem: 2905
Epoch: [229] [624/625] eta: 0:00:01 lr: 0.000587 min_lr: 0.000587 loss: 2.8191 (2.8063) class_acc: 0.5742 (0.5709) weight_decay: 0.0500 (0.0500) grad_norm: 2.8051 (3.6175) time: 0.7761 data: 0.0014 max mem: 2905
Epoch: [229] Total time: 0:20:15 (1.9449 s / it)
Averaged stats: lr: 0.000587 min_lr: 0.000587 loss: 2.8191 (2.8083) class_acc: 0.5742 (0.5704) weight_decay: 0.0500 (0.0500) grad_norm: 2.8051 (3.6175)
Test: [ 0/50] eta: 0:11:08 loss: 2.1459 (2.1459) acc1: 52.8000 (52.8000) acc5: 77.6000 (77.6000) time: 13.3742 data: 13.3357 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 1.9254 (1.9042) acc1: 59.2000 (59.4182) acc5: 80.8000 (81.0182) time: 2.2651 data: 2.2440 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 1.9960 (2.0638) acc1: 56.8000 (55.1619) acc5: 79.2000 (78.9333) time: 1.2070 data: 1.1861 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 2.1468 (2.0659) acc1: 52.0000 (54.9161) acc5: 77.6000 (78.7613) time: 1.2320 data: 1.2117 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 2.1743 (2.1175) acc1: 52.8000 (53.8341) acc5: 76.8000 (78.0878) time: 0.8127 data: 0.7917 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.2134 (2.1298) acc1: 52.8000 (53.7920) acc5: 76.8000 (78.0000) time: 0.7434 data: 0.7218 max mem: 2905
Test: Total time: 0:00:55 (1.1021 s / it)
* Acc@1 54.580 Acc@5 78.262 loss 2.089
Accuracy of the model on the 50000 test images: 54.6%
Max accuracy: 58.33%
Epoch: [230] [ 0/625] eta: 4:16:09 lr: 0.000587 min_lr: 0.000587 loss: 3.1090 (3.1090) class_acc: 0.4883 (0.4883) weight_decay: 0.0500 (0.0500) time: 24.5911 data: 18.7988 max mem: 2905
Epoch: [230] [200/625] eta: 0:14:42 lr: 0.000582 min_lr: 0.000582 loss: 2.7681 (2.7908) class_acc: 0.5820 (0.5754) weight_decay: 0.0500 (0.0500) grad_norm: 2.9388 (3.5806) time: 2.0226 data: 0.0008 max mem: 2905
Epoch: [230] [400/625] eta: 0:07:32 lr: 0.000577 min_lr: 0.000577 loss: 2.7967 (2.8021) class_acc: 0.5742 (0.5725) weight_decay: 0.0500 (0.0500) grad_norm: 3.0681 (3.2917) time: 2.0449 data: 0.0006 max mem: 2905
Epoch: [230] [600/625] eta: 0:00:50 lr: 0.000571 min_lr: 0.000571 loss: 2.8194 (2.8068) class_acc: 0.5703 (0.5708) weight_decay: 0.0500 (0.0500) grad_norm: 4.1883 (3.3142) time: 2.1214 data: 0.0008 max mem: 2905
Epoch: [230] [624/625] eta: 0:00:01 lr: 0.000571 min_lr: 0.000571 loss: 2.8307 (2.8076) class_acc: 0.5664 (0.5707) weight_decay: 0.0500 (0.0500) grad_norm: 3.9054 (3.3572) time: 0.7775 data: 0.0017 max mem: 2905
Epoch: [230] Total time: 0:20:21 (1.9552 s / it)
Averaged stats: lr: 0.000571 min_lr: 0.000571 loss: 2.8307 (2.8048) class_acc: 0.5664 (0.5709) weight_decay: 0.0500 (0.0500) grad_norm: 3.9054 (3.3572)
Test: [ 0/50] eta: 0:10:41 loss: 1.8700 (1.8700) acc1: 55.2000 (55.2000) acc5: 82.4000 (82.4000) time: 12.8291 data: 12.8047 max mem: 2905
Test: [10/50] eta: 0:01:23 loss: 1.8156 (1.8103) acc1: 61.6000 (60.8000) acc5: 82.4000 (82.2545) time: 2.0979 data: 2.0777 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 2.0411 (1.9976) acc1: 57.6000 (56.3048) acc5: 78.4000 (80.0000) time: 1.0680 data: 1.0481 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.1375 (2.0355) acc1: 52.8000 (55.9226) acc5: 77.6000 (78.9677) time: 1.0494 data: 1.0295 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.0447 (2.0545) acc1: 53.6000 (55.9024) acc5: 76.0000 (78.4000) time: 0.6936 data: 0.6748 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.0692 (2.0548) acc1: 55.2000 (55.9360) acc5: 76.8000 (78.4320) time: 0.6029 data: 0.5848 max mem: 2905
Test: Total time: 0:00:49 (0.9855 s / it)
* Acc@1 56.292 Acc@5 79.016 loss 2.032
Accuracy of the model on the 50000 test images: 56.3%
Max accuracy: 58.33%
Epoch: [231] [ 0/625] eta: 3:19:52 lr: 0.000571 min_lr: 0.000571 loss: 2.8739 (2.8739) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 19.1873 data: 17.8375 max mem: 2905
Epoch: [231] [200/625] eta: 0:14:09 lr: 0.000566 min_lr: 0.000566 loss: 2.7872 (2.7965) class_acc: 0.5703 (0.5742) weight_decay: 0.0500 (0.0500) grad_norm: 2.9912 (3.2443) time: 1.9974 data: 0.9620 max mem: 2905
Epoch: [231] [400/625] eta: 0:07:15 lr: 0.000561 min_lr: 0.000561 loss: 2.8260 (2.8033) class_acc: 0.5664 (0.5726) weight_decay: 0.0500 (0.0500) grad_norm: 3.0155 (inf) time: 1.8714 data: 1.6883 max mem: 2905
Epoch: [231] [600/625] eta: 0:00:48 lr: 0.000556 min_lr: 0.000556 loss: 2.7760 (2.8042) class_acc: 0.5664 (0.5721) weight_decay: 0.0500 (0.0500) grad_norm: 2.7760 (inf) time: 1.7572 data: 1.5709 max mem: 2905
Epoch: [231] [624/625] eta: 0:00:01 lr: 0.000555 min_lr: 0.000555 loss: 2.7955 (2.8046) class_acc: 0.5703 (0.5721) weight_decay: 0.0500 (0.0500) grad_norm: 2.2778 (inf) time: 0.6970 data: 0.5400 max mem: 2905
Epoch: [231] Total time: 0:19:46 (1.8979 s / it)
Averaged stats: lr: 0.000555 min_lr: 0.000555 loss: 2.7955 (2.8014) class_acc: 0.5703 (0.5715) weight_decay: 0.0500 (0.0500) grad_norm: 2.2778 (inf)
Test: [ 0/50] eta: 0:10:16 loss: 1.6774 (1.6774) acc1: 63.2000 (63.2000) acc5: 85.6000 (85.6000) time: 12.3347 data: 12.3111 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 1.7365 (1.7937) acc1: 61.6000 (60.4364) acc5: 83.2000 (82.9091) time: 2.0328 data: 2.0111 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 1.9281 (1.9538) acc1: 55.2000 (57.0667) acc5: 80.8000 (81.0286) time: 1.0184 data: 0.9977 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 2.0676 (1.9567) acc1: 54.4000 (57.1097) acc5: 80.0000 (80.6452) time: 0.8577 data: 0.8387 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.9374 (1.9935) acc1: 55.2000 (56.5463) acc5: 78.4000 (79.5902) time: 0.4824 data: 0.4633 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.1219 (2.0049) acc1: 54.4000 (56.5920) acc5: 76.8000 (79.2640) time: 0.4467 data: 0.4272 max mem: 2905
Test: Total time: 0:00:42 (0.8586 s / it)
* Acc@1 56.626 Acc@5 79.948 loss 1.972
Accuracy of the model on the 50000 test images: 56.6%
Max accuracy: 58.33%
Epoch: [232] [ 0/625] eta: 4:07:03 lr: 0.000555 min_lr: 0.000555 loss: 2.8731 (2.8731) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 23.7178 data: 17.3007 max mem: 2905
Epoch: [232] [200/625] eta: 0:13:42 lr: 0.000550 min_lr: 0.000550 loss: 2.7129 (2.7817) class_acc: 0.5781 (0.5741) weight_decay: 0.0500 (0.0500) grad_norm: 2.9729 (3.1949) time: 1.4717 data: 0.0010 max mem: 2905
Epoch: [232] [400/625] eta: 0:07:13 lr: 0.000545 min_lr: 0.000545 loss: 2.7285 (2.7850) class_acc: 0.5820 (0.5737) weight_decay: 0.0500 (0.0500) grad_norm: 3.5392 (3.4313) time: 2.1543 data: 0.0008 max mem: 2905
Epoch: [232] [600/625] eta: 0:00:48 lr: 0.000540 min_lr: 0.000540 loss: 2.7790 (2.7916) class_acc: 0.5742 (0.5729) weight_decay: 0.0500 (0.0500) grad_norm: 3.4367 (3.3541) time: 1.9816 data: 0.0152 max mem: 2905
Epoch: [232] [624/625] eta: 0:00:01 lr: 0.000540 min_lr: 0.000540 loss: 2.7984 (2.7925) class_acc: 0.5625 (0.5727) weight_decay: 0.0500 (0.0500) grad_norm: 2.3837 (3.3191) time: 0.7159 data: 0.0018 max mem: 2905
Epoch: [232] Total time: 0:19:57 (1.9160 s / it)
Averaged stats: lr: 0.000540 min_lr: 0.000540 loss: 2.7984 (2.7976) class_acc: 0.5625 (0.5725) weight_decay: 0.0500 (0.0500) grad_norm: 2.3837 (3.3191)
Test: [ 0/50] eta: 0:09:21 loss: 1.8810 (1.8810) acc1: 56.8000 (56.8000) acc5: 80.0000 (80.0000) time: 11.2215 data: 11.1916 max mem: 2905
Test: [10/50] eta: 0:01:12 loss: 1.8810 (1.8578) acc1: 59.2000 (59.2727) acc5: 80.8000 (81.4545) time: 1.8122 data: 1.7916 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 1.9956 (2.0134) acc1: 54.4000 (56.0000) acc5: 79.2000 (79.6191) time: 0.9124 data: 0.8930 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 2.1664 (2.0434) acc1: 52.8000 (55.8710) acc5: 77.6000 (79.0194) time: 1.1048 data: 1.0865 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 2.0152 (2.0610) acc1: 54.4000 (55.2195) acc5: 79.2000 (78.5951) time: 1.0898 data: 1.0721 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 2.0864 (2.0574) acc1: 54.4000 (55.4400) acc5: 78.4000 (78.6240) time: 0.6653 data: 0.6466 max mem: 2905
Test: Total time: 0:00:55 (1.1098 s / it)
* Acc@1 56.114 Acc@5 78.878 loss 2.028
Accuracy of the model on the 50000 test images: 56.1%
Max accuracy: 58.33%
Epoch: [233] [ 0/625] eta: 4:17:12 lr: 0.000540 min_lr: 0.000540 loss: 2.6610 (2.6610) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 24.6915 data: 18.3156 max mem: 2905
Epoch: [233] [200/625] eta: 0:14:28 lr: 0.000535 min_lr: 0.000535 loss: 2.7795 (2.7874) class_acc: 0.5742 (0.5741) weight_decay: 0.0500 (0.0500) grad_norm: 3.2357 (3.3048) time: 1.8983 data: 0.0005 max mem: 2905
Epoch: [233] [400/625] eta: 0:07:18 lr: 0.000530 min_lr: 0.000530 loss: 2.7779 (2.7918) class_acc: 0.5742 (0.5739) weight_decay: 0.0500 (0.0500) grad_norm: 3.0097 (3.3251) time: 1.8607 data: 1.4458 max mem: 2905
Epoch: [233] [600/625] eta: 0:00:48 lr: 0.000525 min_lr: 0.000525 loss: 2.7850 (2.7951) class_acc: 0.5664 (0.5739) weight_decay: 0.0500 (0.0500) grad_norm: 2.8634 (3.4273) time: 2.0629 data: 0.0242 max mem: 2905
Epoch: [233] [624/625] eta: 0:00:01 lr: 0.000525 min_lr: 0.000525 loss: 2.8057 (2.7955) class_acc: 0.5742 (0.5740) weight_decay: 0.0500 (0.0500) grad_norm: 2.6490 (3.3994) time: 0.7271 data: 0.0012 max mem: 2905
Epoch: [233] Total time: 0:19:52 (1.9080 s / it)
Averaged stats: lr: 0.000525 min_lr: 0.000525 loss: 2.8057 (2.7961) class_acc: 0.5742 (0.5733) weight_decay: 0.0500 (0.0500) grad_norm: 2.6490 (3.3994)
Test: [ 0/50] eta: 0:10:07 loss: 1.7354 (1.7354) acc1: 62.4000 (62.4000) acc5: 86.4000 (86.4000) time: 12.1419 data: 12.1121 max mem: 2905
Test: [10/50] eta: 0:01:07 loss: 1.7092 (1.6220) acc1: 64.0000 (64.2182) acc5: 86.4000 (85.7455) time: 1.6821 data: 1.6618 max mem: 2905
Test: [20/50] eta: 0:00:41 loss: 1.7951 (1.8257) acc1: 57.6000 (59.4667) acc5: 82.4000 (82.5143) time: 0.8459 data: 0.8256 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 1.9529 (1.8486) acc1: 55.2000 (59.1484) acc5: 80.0000 (82.0129) time: 1.0870 data: 1.0675 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.8771 (1.8581) acc1: 56.8000 (59.0244) acc5: 80.0000 (81.6000) time: 0.8831 data: 0.8656 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.8965 (1.8725) acc1: 56.8000 (58.8640) acc5: 80.0000 (81.4880) time: 0.4732 data: 0.4559 max mem: 2905
Test: Total time: 0:00:49 (0.9974 s / it)
* Acc@1 59.628 Acc@5 82.046 loss 1.835
Accuracy of the model on the 50000 test images: 59.6%
Max accuracy: 59.63%
Epoch: [234] [ 0/625] eta: 4:11:41 lr: 0.000525 min_lr: 0.000525 loss: 2.7511 (2.7511) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 24.1620 data: 19.2059 max mem: 2905
Epoch: [234] [200/625] eta: 0:14:28 lr: 0.000520 min_lr: 0.000520 loss: 2.7253 (2.7788) class_acc: 0.5898 (0.5775) weight_decay: 0.0500 (0.0500) grad_norm: 2.6878 (3.1516) time: 1.9309 data: 0.0015 max mem: 2905
Epoch: [234] [400/625] eta: 0:07:26 lr: 0.000515 min_lr: 0.000515 loss: 2.7892 (2.7906) class_acc: 0.5703 (0.5753) weight_decay: 0.0500 (0.0500) grad_norm: 3.2308 (3.3925) time: 1.8564 data: 0.0011 max mem: 2905
Epoch: [234] [600/625] eta: 0:00:49 lr: 0.000510 min_lr: 0.000510 loss: 2.8073 (2.7955) class_acc: 0.5703 (0.5738) weight_decay: 0.0500 (0.0500) grad_norm: 3.2033 (3.2646) time: 1.9608 data: 0.0011 max mem: 2905
Epoch: [234] [624/625] eta: 0:00:01 lr: 0.000510 min_lr: 0.000510 loss: 2.7745 (2.7959) class_acc: 0.5703 (0.5737) weight_decay: 0.0500 (0.0500) grad_norm: 2.8156 (3.2592) time: 0.7573 data: 0.0014 max mem: 2905
Epoch: [234] Total time: 0:20:10 (1.9363 s / it)
Averaged stats: lr: 0.000510 min_lr: 0.000510 loss: 2.7745 (2.7935) class_acc: 0.5703 (0.5739) weight_decay: 0.0500 (0.0500) grad_norm: 2.8156 (3.2592)
Test: [ 0/50] eta: 0:09:36 loss: 1.7962 (1.7962) acc1: 63.2000 (63.2000) acc5: 83.2000 (83.2000) time: 11.5244 data: 11.4893 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 1.6752 (1.7176) acc1: 63.2000 (62.9091) acc5: 83.2000 (84.0000) time: 1.9734 data: 1.9518 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 1.8667 (1.8785) acc1: 56.0000 (59.3524) acc5: 81.6000 (82.0952) time: 1.0771 data: 1.0573 max mem: 2905
Test: [30/50] eta: 0:00:27 loss: 1.8704 (1.8811) acc1: 58.4000 (59.4065) acc5: 80.0000 (82.0903) time: 1.0683 data: 1.0492 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.7973 (1.9103) acc1: 59.2000 (58.6927) acc5: 81.6000 (81.6390) time: 0.7439 data: 0.7244 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.9504 (1.9216) acc1: 56.8000 (58.3680) acc5: 79.2000 (81.1840) time: 0.6244 data: 0.6043 max mem: 2905
Test: Total time: 0:00:49 (0.9824 s / it)
* Acc@1 59.092 Acc@5 81.518 loss 1.887
Accuracy of the model on the 50000 test images: 59.1%
Max accuracy: 59.63%
Epoch: [235] [ 0/625] eta: 3:42:40 lr: 0.000510 min_lr: 0.000510 loss: 2.7276 (2.7276) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 21.3775 data: 21.2518 max mem: 2905
Epoch: [235] [200/625] eta: 0:14:05 lr: 0.000505 min_lr: 0.000505 loss: 2.7797 (2.7879) class_acc: 0.5742 (0.5741) weight_decay: 0.0500 (0.0500) grad_norm: 2.6321 (3.5089) time: 1.8146 data: 1.4181 max mem: 2905
Epoch: [235] [400/625] eta: 0:07:19 lr: 0.000500 min_lr: 0.000500 loss: 2.8162 (2.7937) class_acc: 0.5586 (0.5730) weight_decay: 0.0500 (0.0500) grad_norm: 3.1491 (3.4641) time: 1.9852 data: 1.6353 max mem: 2905
Epoch: [235] [600/625] eta: 0:00:48 lr: 0.000495 min_lr: 0.000495 loss: 2.8075 (2.7920) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) grad_norm: 2.8011 (3.3297) time: 1.9821 data: 1.1157 max mem: 2905
Epoch: [235] [624/625] eta: 0:00:01 lr: 0.000495 min_lr: 0.000495 loss: 2.8226 (2.7932) class_acc: 0.5664 (0.5739) weight_decay: 0.0500 (0.0500) grad_norm: 2.5230 (3.3320) time: 0.4779 data: 0.1769 max mem: 2905
Epoch: [235] Total time: 0:20:00 (1.9213 s / it)
Averaged stats: lr: 0.000495 min_lr: 0.000495 loss: 2.8226 (2.7872) class_acc: 0.5664 (0.5753) weight_decay: 0.0500 (0.0500) grad_norm: 2.5230 (3.3320)
Test: [ 0/50] eta: 0:10:39 loss: 1.8610 (1.8610) acc1: 63.2000 (63.2000) acc5: 83.2000 (83.2000) time: 12.7932 data: 12.7570 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 1.8463 (1.9009) acc1: 56.8000 (58.0364) acc5: 80.8000 (81.6727) time: 2.1231 data: 2.1025 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.9796 (2.0586) acc1: 54.4000 (55.1238) acc5: 80.0000 (79.3143) time: 1.1169 data: 1.0968 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 2.1496 (2.0463) acc1: 55.2000 (55.5355) acc5: 77.6000 (79.0710) time: 1.0608 data: 1.0413 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 2.0426 (2.0689) acc1: 56.0000 (55.3171) acc5: 77.6000 (78.4390) time: 0.6191 data: 0.5990 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 2.0900 (2.0753) acc1: 54.4000 (55.1840) acc5: 76.0000 (78.1760) time: 0.6094 data: 0.5881 max mem: 2905
Test: Total time: 0:00:48 (0.9618 s / it)
* Acc@1 55.816 Acc@5 78.690 loss 2.041
Accuracy of the model on the 50000 test images: 55.8%
Max accuracy: 59.63%
Epoch: [236] [ 0/625] eta: 3:41:12 lr: 0.000495 min_lr: 0.000495 loss: 2.6949 (2.6949) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 21.2365 data: 16.8783 max mem: 2905
Epoch: [236] [200/625] eta: 0:13:55 lr: 0.000490 min_lr: 0.000490 loss: 2.7732 (2.7767) class_acc: 0.5664 (0.5787) weight_decay: 0.0500 (0.0500) grad_norm: 3.0168 (3.3288) time: 1.9275 data: 0.9510 max mem: 2905
Epoch: [236] [400/625] eta: 0:07:15 lr: 0.000485 min_lr: 0.000485 loss: 2.7939 (2.7811) class_acc: 0.5820 (0.5781) weight_decay: 0.0500 (0.0500) grad_norm: 2.4715 (3.2278) time: 1.9014 data: 0.0238 max mem: 2905
Epoch: [236] [600/625] eta: 0:00:48 lr: 0.000481 min_lr: 0.000481 loss: 2.8003 (2.7817) class_acc: 0.5703 (0.5774) weight_decay: 0.0500 (0.0500) grad_norm: 2.9899 (3.2114) time: 2.0740 data: 0.0666 max mem: 2905
Epoch: [236] [624/625] eta: 0:00:01 lr: 0.000480 min_lr: 0.000480 loss: 2.8196 (2.7836) class_acc: 0.5742 (0.5770) weight_decay: 0.0500 (0.0500) grad_norm: 3.2263 (3.2220) time: 0.8521 data: 0.0015 max mem: 2905
Epoch: [236] Total time: 0:19:52 (1.9083 s / it)
Averaged stats: lr: 0.000480 min_lr: 0.000480 loss: 2.8196 (2.7873) class_acc: 0.5742 (0.5756) weight_decay: 0.0500 (0.0500) grad_norm: 3.2263 (3.2220)
Test: [ 0/50] eta: 0:10:38 loss: 1.7233 (1.7233) acc1: 59.2000 (59.2000) acc5: 84.8000 (84.8000) time: 12.7606 data: 12.7294 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 1.6955 (1.6432) acc1: 62.4000 (62.7636) acc5: 81.6000 (83.7818) time: 2.1201 data: 2.1006 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.8052 (1.8365) acc1: 59.2000 (58.8571) acc5: 81.6000 (81.7524) time: 1.1204 data: 1.1022 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.9595 (1.8762) acc1: 56.0000 (58.3484) acc5: 81.6000 (81.1871) time: 1.1416 data: 1.1236 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.9595 (1.9142) acc1: 55.2000 (57.6000) acc5: 80.8000 (80.5854) time: 0.8884 data: 0.8701 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.9773 (1.9414) acc1: 53.6000 (57.0240) acc5: 78.4000 (80.3520) time: 0.8351 data: 0.8158 max mem: 2905
Test: Total time: 0:00:53 (1.0796 s / it)
* Acc@1 58.618 Acc@5 81.202 loss 1.891
Accuracy of the model on the 50000 test images: 58.6%
Max accuracy: 59.63%
Epoch: [237] [ 0/625] eta: 3:56:52 lr: 0.000480 min_lr: 0.000480 loss: 2.5955 (2.5955) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 22.7404 data: 18.9484 max mem: 2905
Epoch: [237] [200/625] eta: 0:14:19 lr: 0.000475 min_lr: 0.000475 loss: 2.7488 (2.7758) class_acc: 0.5742 (0.5772) weight_decay: 0.0500 (0.0500) grad_norm: 3.0349 (3.4789) time: 1.8228 data: 0.0007 max mem: 2905
Epoch: [237] [400/625] eta: 0:07:27 lr: 0.000471 min_lr: 0.000471 loss: 2.7498 (2.7823) class_acc: 0.5820 (0.5761) weight_decay: 0.0500 (0.0500) grad_norm: 3.0190 (3.5008) time: 2.0332 data: 0.0007 max mem: 2905
Epoch: [237] [600/625] eta: 0:00:49 lr: 0.000466 min_lr: 0.000466 loss: 2.8107 (2.7830) class_acc: 0.5742 (0.5766) weight_decay: 0.0500 (0.0500) grad_norm: 2.9216 (3.4565) time: 1.9514 data: 0.0007 max mem: 2905
Epoch: [237] [624/625] eta: 0:00:01 lr: 0.000466 min_lr: 0.000466 loss: 2.7838 (2.7839) class_acc: 0.5703 (0.5763) weight_decay: 0.0500 (0.0500) grad_norm: 3.5205 (3.4671) time: 0.5957 data: 0.0020 max mem: 2905
Epoch: [237] Total time: 0:20:31 (1.9701 s / it)
Averaged stats: lr: 0.000466 min_lr: 0.000466 loss: 2.7838 (2.7830) class_acc: 0.5703 (0.5761) weight_decay: 0.0500 (0.0500) grad_norm: 3.5205 (3.4671)
Test: [ 0/50] eta: 0:10:47 loss: 1.9279 (1.9279) acc1: 54.4000 (54.4000) acc5: 84.0000 (84.0000) time: 12.9461 data: 12.9142 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 1.7044 (1.6345) acc1: 61.6000 (62.4000) acc5: 84.8000 (85.0182) time: 2.1507 data: 2.1309 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.7869 (1.8063) acc1: 60.0000 (59.7714) acc5: 82.4000 (83.2381) time: 1.1073 data: 1.0890 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.9130 (1.8073) acc1: 57.6000 (60.0774) acc5: 82.4000 (82.9161) time: 1.1184 data: 1.0984 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.7461 (1.8187) acc1: 59.2000 (59.8829) acc5: 83.2000 (82.6342) time: 0.8932 data: 0.8735 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.7483 (1.8228) acc1: 59.2000 (60.0160) acc5: 81.6000 (82.3680) time: 0.8634 data: 0.8441 max mem: 2905
Test: Total time: 0:00:53 (1.0758 s / it)
* Acc@1 61.060 Acc@5 82.848 loss 1.782
Accuracy of the model on the 50000 test images: 61.1%
Max accuracy: 61.06%
Epoch: [238] [ 0/625] eta: 3:30:29 lr: 0.000466 min_lr: 0.000466 loss: 2.8227 (2.8227) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.2075 data: 16.8635 max mem: 2905
Epoch: [238] [200/625] eta: 0:14:45 lr: 0.000461 min_lr: 0.000461 loss: 2.7656 (2.7678) class_acc: 0.5742 (0.5817) weight_decay: 0.0500 (0.0500) grad_norm: 2.9293 (3.2914) time: 1.9695 data: 0.0738 max mem: 2905
Epoch: [238] [400/625] eta: 0:07:31 lr: 0.000456 min_lr: 0.000456 loss: 2.8005 (2.7783) class_acc: 0.5625 (0.5786) weight_decay: 0.0500 (0.0500) grad_norm: 2.6832 (3.1466) time: 1.9229 data: 0.0505 max mem: 2905
Epoch: [238] [600/625] eta: 0:00:50 lr: 0.000452 min_lr: 0.000452 loss: 2.7760 (2.7828) class_acc: 0.5742 (0.5782) weight_decay: 0.0500 (0.0500) grad_norm: 3.4062 (3.3167) time: 1.9473 data: 0.0008 max mem: 2905
Epoch: [238] [624/625] eta: 0:00:01 lr: 0.000451 min_lr: 0.000451 loss: 2.7654 (2.7826) class_acc: 0.5781 (0.5784) weight_decay: 0.0500 (0.0500) grad_norm: 3.0902 (3.2842) time: 0.8888 data: 0.0016 max mem: 2905
Epoch: [238] Total time: 0:20:29 (1.9672 s / it)
Averaged stats: lr: 0.000451 min_lr: 0.000451 loss: 2.7654 (2.7806) class_acc: 0.5781 (0.5774) weight_decay: 0.0500 (0.0500) grad_norm: 3.0902 (3.2842)
Test: [ 0/50] eta: 0:10:04 loss: 1.7888 (1.7888) acc1: 58.4000 (58.4000) acc5: 85.6000 (85.6000) time: 12.0896 data: 12.0571 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 1.5926 (1.5937) acc1: 65.6000 (64.1455) acc5: 86.4000 (84.9455) time: 2.0619 data: 2.0416 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 1.7535 (1.7433) acc1: 60.8000 (61.3333) acc5: 84.8000 (83.7714) time: 1.0883 data: 1.0691 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 1.8361 (1.7455) acc1: 59.2000 (61.5742) acc5: 82.4000 (83.5097) time: 1.0532 data: 1.0339 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.6893 (1.7610) acc1: 61.6000 (61.5415) acc5: 83.2000 (83.2390) time: 0.7425 data: 0.7237 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.8629 (1.7830) acc1: 57.6000 (60.7040) acc5: 82.4000 (82.9440) time: 0.6398 data: 0.6208 max mem: 2905
Test: Total time: 0:00:49 (0.9951 s / it)
* Acc@1 61.324 Acc@5 83.114 loss 1.754
Accuracy of the model on the 50000 test images: 61.3%
Max accuracy: 61.32%
Epoch: [239] [ 0/625] eta: 3:38:04 lr: 0.000451 min_lr: 0.000451 loss: 2.7719 (2.7719) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 20.9347 data: 20.6136 max mem: 2905
Epoch: [239] [200/625] eta: 0:14:35 lr: 0.000447 min_lr: 0.000447 loss: 2.7616 (2.7669) class_acc: 0.5859 (0.5805) weight_decay: 0.0500 (0.0500) grad_norm: 3.1448 (3.2851) time: 1.8818 data: 0.0009 max mem: 2905
Epoch: [239] [400/625] eta: 0:07:30 lr: 0.000442 min_lr: 0.000442 loss: 2.7580 (2.7741) class_acc: 0.5703 (0.5780) weight_decay: 0.0500 (0.0500) grad_norm: 3.5243 (inf) time: 1.9657 data: 0.0012 max mem: 2905
Epoch: [239] [600/625] eta: 0:00:50 lr: 0.000438 min_lr: 0.000438 loss: 2.7749 (2.7753) class_acc: 0.5703 (0.5788) weight_decay: 0.0500 (0.0500) grad_norm: 2.6752 (inf) time: 2.1050 data: 0.0008 max mem: 2905
Epoch: [239] [624/625] eta: 0:00:01 lr: 0.000437 min_lr: 0.000437 loss: 2.7535 (2.7757) class_acc: 0.5781 (0.5788) weight_decay: 0.0500 (0.0500) grad_norm: 2.3334 (inf) time: 1.0655 data: 0.0013 max mem: 2905
Epoch: [239] Total time: 0:20:30 (1.9687 s / it)
Averaged stats: lr: 0.000437 min_lr: 0.000437 loss: 2.7535 (2.7768) class_acc: 0.5781 (0.5779) weight_decay: 0.0500 (0.0500) grad_norm: 2.3334 (inf)
Test: [ 0/50] eta: 0:10:40 loss: 1.5106 (1.5106) acc1: 68.8000 (68.8000) acc5: 88.0000 (88.0000) time: 12.8081 data: 12.7803 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 1.5905 (1.5839) acc1: 66.4000 (66.4727) acc5: 87.2000 (86.1091) time: 2.0068 data: 1.9859 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 1.7188 (1.7741) acc1: 61.6000 (61.7524) acc5: 84.0000 (83.5810) time: 0.9711 data: 0.9520 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 1.8920 (1.7872) acc1: 60.0000 (61.6000) acc5: 82.4000 (83.1226) time: 0.8914 data: 0.8730 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.7891 (1.8236) acc1: 60.8000 (61.0146) acc5: 82.4000 (82.2439) time: 0.7275 data: 0.7084 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.7935 (1.8260) acc1: 60.8000 (60.9280) acc5: 84.0000 (82.3360) time: 0.6560 data: 0.6364 max mem: 2905
Test: Total time: 0:00:51 (1.0203 s / it)
* Acc@1 60.740 Acc@5 82.674 loss 1.801
Accuracy of the model on the 50000 test images: 60.7%
Max accuracy: 61.32%
Epoch: [240] [ 0/625] eta: 3:43:30 lr: 0.000437 min_lr: 0.000437 loss: 2.7879 (2.7879) class_acc: 0.5938 (0.5938) weight_decay: 0.0500 (0.0500) time: 21.4568 data: 20.2205 max mem: 2905
Epoch: [240] [200/625] eta: 0:14:16 lr: 0.000433 min_lr: 0.000433 loss: 2.7671 (2.7668) class_acc: 0.5742 (0.5787) weight_decay: 0.0500 (0.0500) grad_norm: 1.9119 (3.5169) time: 1.8851 data: 0.2731 max mem: 2905
Epoch: [240] [400/625] eta: 0:07:26 lr: 0.000428 min_lr: 0.000428 loss: 2.7443 (2.7701) class_acc: 0.5703 (0.5779) weight_decay: 0.0500 (0.0500) grad_norm: 3.4075 (3.4219) time: 2.1454 data: 0.8892 max mem: 2905
Epoch: [240] [600/625] eta: 0:00:49 lr: 0.000424 min_lr: 0.000424 loss: 2.7571 (2.7697) class_acc: 0.5781 (0.5780) weight_decay: 0.0500 (0.0500) grad_norm: 3.3776 (3.3400) time: 2.0084 data: 1.1536 max mem: 2905
Epoch: [240] [624/625] eta: 0:00:01 lr: 0.000423 min_lr: 0.000423 loss: 2.7996 (2.7713) class_acc: 0.5742 (0.5778) weight_decay: 0.0500 (0.0500) grad_norm: 2.9011 (3.3337) time: 0.9011 data: 0.2608 max mem: 2905
Epoch: [240] Total time: 0:20:10 (1.9369 s / it)
Averaged stats: lr: 0.000423 min_lr: 0.000423 loss: 2.7996 (2.7737) class_acc: 0.5742 (0.5782) weight_decay: 0.0500 (0.0500) grad_norm: 2.9011 (3.3337)
Test: [ 0/50] eta: 0:10:26 loss: 1.6511 (1.6511) acc1: 61.6000 (61.6000) acc5: 87.2000 (87.2000) time: 12.5385 data: 12.4803 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 1.5726 (1.5675) acc1: 64.0000 (65.1636) acc5: 86.4000 (85.8909) time: 2.0290 data: 2.0062 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 1.7784 (1.7399) acc1: 62.4000 (61.5238) acc5: 84.8000 (84.0000) time: 0.9967 data: 0.9775 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 1.8981 (1.7510) acc1: 57.6000 (61.3419) acc5: 80.8000 (83.3806) time: 0.9635 data: 0.9442 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.8026 (1.7835) acc1: 57.6000 (60.9756) acc5: 80.0000 (83.0829) time: 0.6764 data: 0.6574 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.7653 (1.7831) acc1: 59.2000 (60.8000) acc5: 81.6000 (82.9440) time: 0.6009 data: 0.5824 max mem: 2905
Test: Total time: 0:00:46 (0.9382 s / it)
* Acc@1 61.128 Acc@5 83.264 loss 1.761
Accuracy of the model on the 50000 test images: 61.1%
Max accuracy: 61.32%
Epoch: [241] [ 0/625] eta: 3:22:33 lr: 0.000423 min_lr: 0.000423 loss: 2.5817 (2.5817) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 19.4459 data: 19.2134 max mem: 2905
Epoch: [241] [200/625] eta: 0:13:46 lr: 0.000419 min_lr: 0.000419 loss: 2.7714 (2.7652) class_acc: 0.5625 (0.5800) weight_decay: 0.0500 (0.0500) grad_norm: 4.1017 (3.8157) time: 1.6421 data: 0.2635 max mem: 2905
Epoch: [241] [400/625] eta: 0:07:10 lr: 0.000415 min_lr: 0.000415 loss: 2.7636 (2.7750) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) grad_norm: 2.5308 (3.4547) time: 1.8221 data: 0.0007 max mem: 2905
Epoch: [241] [600/625] eta: 0:00:48 lr: 0.000410 min_lr: 0.000410 loss: 2.7548 (2.7759) class_acc: 0.5820 (0.5779) weight_decay: 0.0500 (0.0500) grad_norm: 2.6641 (3.5108) time: 1.9151 data: 0.0011 max mem: 2905
Epoch: [241] [624/625] eta: 0:00:01 lr: 0.000410 min_lr: 0.000410 loss: 2.7937 (2.7759) class_acc: 0.5742 (0.5780) weight_decay: 0.0500 (0.0500) grad_norm: 2.5352 (3.4761) time: 0.9106 data: 0.0184 max mem: 2905
Epoch: [241] Total time: 0:19:41 (1.8903 s / it)
Averaged stats: lr: 0.000410 min_lr: 0.000410 loss: 2.7937 (2.7716) class_acc: 0.5742 (0.5789) weight_decay: 0.0500 (0.0500) grad_norm: 2.5352 (3.4761)
Test: [ 0/50] eta: 0:10:59 loss: 1.7449 (1.7449) acc1: 58.4000 (58.4000) acc5: 84.0000 (84.0000) time: 13.1914 data: 13.1631 max mem: 2905
Test: [10/50] eta: 0:01:34 loss: 1.6031 (1.6393) acc1: 62.4000 (62.6909) acc5: 85.6000 (84.5818) time: 2.3723 data: 2.3519 max mem: 2905
Test: [20/50] eta: 0:00:53 loss: 1.7851 (1.7686) acc1: 60.0000 (60.0000) acc5: 83.2000 (82.9714) time: 1.2280 data: 1.2090 max mem: 2905
Test: [30/50] eta: 0:00:30 loss: 1.8498 (1.7780) acc1: 58.4000 (60.2581) acc5: 81.6000 (82.6323) time: 1.0776 data: 1.0579 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.7058 (1.7914) acc1: 60.8000 (59.8439) acc5: 81.6000 (82.2829) time: 0.7644 data: 0.7430 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.7043 (1.7920) acc1: 57.6000 (59.8080) acc5: 82.4000 (82.2240) time: 0.6467 data: 0.6240 max mem: 2905
Test: Total time: 0:00:53 (1.0745 s / it)
* Acc@1 60.942 Acc@5 82.952 loss 1.751
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 61.32%
Epoch: [242] [ 0/625] eta: 3:57:18 lr: 0.000410 min_lr: 0.000410 loss: 2.7184 (2.7184) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 22.7819 data: 15.3494 max mem: 2905
Epoch: [242] [200/625] eta: 0:14:22 lr: 0.000405 min_lr: 0.000405 loss: 2.7780 (2.7539) class_acc: 0.5859 (0.5825) weight_decay: 0.0500 (0.0500) grad_norm: 2.8875 (3.0289) time: 1.9559 data: 0.0013 max mem: 2905
Epoch: [242] [400/625] eta: 0:07:23 lr: 0.000401 min_lr: 0.000401 loss: 2.7109 (2.7539) class_acc: 0.5898 (0.5812) weight_decay: 0.0500 (0.0500) grad_norm: 3.2530 (3.1185) time: 1.9582 data: 0.0008 max mem: 2905
Epoch: [242] [600/625] eta: 0:00:49 lr: 0.000397 min_lr: 0.000397 loss: 2.7970 (2.7612) class_acc: 0.5703 (0.5806) weight_decay: 0.0500 (0.0500) grad_norm: 2.0874 (3.2829) time: 2.1660 data: 0.0134 max mem: 2905
Epoch: [242] [624/625] eta: 0:00:01 lr: 0.000396 min_lr: 0.000396 loss: 2.8043 (2.7616) class_acc: 0.5703 (0.5803) weight_decay: 0.0500 (0.0500) grad_norm: 2.5360 (3.2674) time: 0.7304 data: 0.0017 max mem: 2905
Epoch: [242] Total time: 0:20:06 (1.9296 s / it)
Averaged stats: lr: 0.000396 min_lr: 0.000396 loss: 2.8043 (2.7708) class_acc: 0.5703 (0.5793) weight_decay: 0.0500 (0.0500) grad_norm: 2.5360 (3.2674)
Test: [ 0/50] eta: 0:10:53 loss: 1.7314 (1.7314) acc1: 58.4000 (58.4000) acc5: 85.6000 (85.6000) time: 13.0795 data: 13.0457 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 1.6015 (1.5680) acc1: 64.8000 (63.9273) acc5: 86.4000 (85.6727) time: 2.0291 data: 2.0077 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 1.6664 (1.6840) acc1: 62.4000 (62.7429) acc5: 84.8000 (84.4952) time: 0.9548 data: 0.9347 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 1.7986 (1.7048) acc1: 60.8000 (62.2452) acc5: 83.2000 (84.2065) time: 0.9521 data: 0.9315 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.7986 (1.7233) acc1: 58.4000 (61.5610) acc5: 82.4000 (83.6488) time: 0.8802 data: 0.8601 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.7420 (1.7317) acc1: 58.4000 (61.5360) acc5: 82.4000 (83.4400) time: 0.6336 data: 0.6143 max mem: 2905
Test: Total time: 0:00:53 (1.0720 s / it)
* Acc@1 62.014 Acc@5 83.670 loss 1.703
Accuracy of the model on the 50000 test images: 62.0%
Max accuracy: 62.01%
Epoch: [243] [ 0/625] eta: 3:40:31 lr: 0.000396 min_lr: 0.000396 loss: 2.7409 (2.7409) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 21.1696 data: 17.5611 max mem: 2905
Epoch: [243] [200/625] eta: 0:14:28 lr: 0.000392 min_lr: 0.000392 loss: 2.7529 (2.7562) class_acc: 0.5859 (0.5865) weight_decay: 0.0500 (0.0500) grad_norm: 2.3311 (3.4300) time: 1.7437 data: 0.0007 max mem: 2905
Epoch: [243] [400/625] eta: 0:07:31 lr: 0.000388 min_lr: 0.000388 loss: 2.7159 (2.7591) class_acc: 0.5898 (0.5845) weight_decay: 0.0500 (0.0500) grad_norm: 4.4367 (3.3550) time: 1.9394 data: 0.0007 max mem: 2905
Epoch: [243] [600/625] eta: 0:00:50 lr: 0.000383 min_lr: 0.000383 loss: 2.7774 (2.7662) class_acc: 0.5742 (0.5817) weight_decay: 0.0500 (0.0500) grad_norm: 3.1733 (3.3561) time: 2.1701 data: 0.0007 max mem: 2905
Epoch: [243] [624/625] eta: 0:00:01 lr: 0.000383 min_lr: 0.000383 loss: 2.7374 (2.7669) class_acc: 0.5664 (0.5813) weight_decay: 0.0500 (0.0500) grad_norm: 4.5450 (3.4273) time: 0.8327 data: 0.0014 max mem: 2905
Epoch: [243] Total time: 0:20:22 (1.9557 s / it)
Averaged stats: lr: 0.000383 min_lr: 0.000383 loss: 2.7374 (2.7680) class_acc: 0.5664 (0.5799) weight_decay: 0.0500 (0.0500) grad_norm: 4.5450 (3.4273)
Test: [ 0/50] eta: 0:10:31 loss: 1.7137 (1.7137) acc1: 60.8000 (60.8000) acc5: 84.0000 (84.0000) time: 12.6229 data: 12.5925 max mem: 2905
Test: [10/50] eta: 0:01:15 loss: 1.5407 (1.6163) acc1: 63.2000 (62.8364) acc5: 84.0000 (84.7273) time: 1.8877 data: 1.8658 max mem: 2905
Test: [20/50] eta: 0:00:39 loss: 1.7987 (1.7814) acc1: 58.4000 (59.7714) acc5: 83.2000 (82.8191) time: 0.7564 data: 0.7356 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 1.8556 (1.7868) acc1: 57.6000 (59.9484) acc5: 81.6000 (82.9419) time: 0.9324 data: 0.9128 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.8156 (1.8131) acc1: 58.4000 (59.7463) acc5: 82.4000 (82.1463) time: 0.8742 data: 0.8544 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.7923 (1.8212) acc1: 58.4000 (59.6800) acc5: 82.4000 (82.2400) time: 0.4421 data: 0.4222 max mem: 2905
Test: Total time: 0:00:48 (0.9680 s / it)
* Acc@1 60.508 Acc@5 82.950 loss 1.777
Accuracy of the model on the 50000 test images: 60.5%
Max accuracy: 62.01%
Epoch: [244] [ 0/625] eta: 3:50:23 lr: 0.000383 min_lr: 0.000383 loss: 2.6059 (2.6059) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 22.1176 data: 21.7752 max mem: 2905
Epoch: [244] [200/625] eta: 0:14:16 lr: 0.000379 min_lr: 0.000379 loss: 2.7288 (2.7459) class_acc: 0.5938 (0.5829) weight_decay: 0.0500 (0.0500) grad_norm: 3.0239 (3.3966) time: 1.8827 data: 1.2112 max mem: 2905
Epoch: [244] [400/625] eta: 0:07:21 lr: 0.000374 min_lr: 0.000374 loss: 2.7790 (2.7590) class_acc: 0.5898 (0.5818) weight_decay: 0.0500 (0.0500) grad_norm: 3.4928 (3.4304) time: 1.8949 data: 1.5867 max mem: 2905
Epoch: [244] [600/625] eta: 0:00:48 lr: 0.000370 min_lr: 0.000370 loss: 2.7377 (2.7632) class_acc: 0.5898 (0.5819) weight_decay: 0.0500 (0.0500) grad_norm: 2.4940 (3.5859) time: 1.6860 data: 0.2579 max mem: 2905
Epoch: [244] [624/625] eta: 0:00:01 lr: 0.000370 min_lr: 0.000370 loss: 2.7756 (2.7630) class_acc: 0.5859 (0.5818) weight_decay: 0.0500 (0.0500) grad_norm: 2.9518 (3.5723) time: 0.6657 data: 0.0643 max mem: 2905
Epoch: [244] Total time: 0:19:54 (1.9111 s / it)
Averaged stats: lr: 0.000370 min_lr: 0.000370 loss: 2.7756 (2.7649) class_acc: 0.5859 (0.5808) weight_decay: 0.0500 (0.0500) grad_norm: 2.9518 (3.5723)
Test: [ 0/50] eta: 0:10:16 loss: 1.7577 (1.7577) acc1: 63.2000 (63.2000) acc5: 84.8000 (84.8000) time: 12.3346 data: 12.2949 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 1.5378 (1.5262) acc1: 67.2000 (66.6909) acc5: 84.8000 (86.2545) time: 2.1813 data: 2.1592 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 1.6097 (1.7092) acc1: 61.6000 (61.8286) acc5: 84.8000 (84.0381) time: 1.0442 data: 1.0246 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 1.8465 (1.7113) acc1: 58.4000 (61.8065) acc5: 84.8000 (84.2065) time: 0.7915 data: 0.7728 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.7804 (1.7387) acc1: 58.4000 (61.3854) acc5: 84.8000 (83.7073) time: 0.4864 data: 0.4680 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.7521 (1.7475) acc1: 58.4000 (61.2640) acc5: 83.2000 (83.5040) time: 0.3836 data: 0.3635 max mem: 2905
Test: Total time: 0:00:45 (0.9047 s / it)
* Acc@1 61.934 Acc@5 83.588 loss 1.710
Accuracy of the model on the 50000 test images: 61.9%
Max accuracy: 62.01%
Epoch: [245] [ 0/625] eta: 3:25:52 lr: 0.000370 min_lr: 0.000370 loss: 2.7690 (2.7690) class_acc: 0.5938 (0.5938) weight_decay: 0.0500 (0.0500) time: 19.7643 data: 19.2677 max mem: 2905
Epoch: [245] [200/625] eta: 0:14:06 lr: 0.000366 min_lr: 0.000366 loss: 2.7916 (2.7665) class_acc: 0.5820 (0.5796) weight_decay: 0.0500 (0.0500) grad_norm: 3.2291 (3.4862) time: 2.0710 data: 0.0008 max mem: 2905
Epoch: [245] [400/625] eta: 0:07:19 lr: 0.000362 min_lr: 0.000362 loss: 2.7396 (2.7614) class_acc: 0.5820 (0.5812) weight_decay: 0.0500 (0.0500) grad_norm: 3.0495 (3.4060) time: 2.0380 data: 0.0008 max mem: 2905
Epoch: [245] [600/625] eta: 0:00:48 lr: 0.000357 min_lr: 0.000357 loss: 2.7622 (2.7639) class_acc: 0.5859 (0.5810) weight_decay: 0.0500 (0.0500) grad_norm: 2.7053 (3.4730) time: 2.1348 data: 0.0333 max mem: 2905
Epoch: [245] [624/625] eta: 0:00:01 lr: 0.000357 min_lr: 0.000357 loss: 2.7437 (2.7637) class_acc: 0.5781 (0.5809) weight_decay: 0.0500 (0.0500) grad_norm: 2.7557 (3.4413) time: 0.7689 data: 0.0091 max mem: 2905
Epoch: [245] Total time: 0:19:55 (1.9122 s / it)
Averaged stats: lr: 0.000357 min_lr: 0.000357 loss: 2.7437 (2.7610) class_acc: 0.5781 (0.5813) weight_decay: 0.0500 (0.0500) grad_norm: 2.7557 (3.4413)
Test: [ 0/50] eta: 0:10:08 loss: 1.7890 (1.7890) acc1: 58.4000 (58.4000) acc5: 83.2000 (83.2000) time: 12.1649 data: 12.1250 max mem: 2905
Test: [10/50] eta: 0:01:21 loss: 1.5829 (1.5946) acc1: 64.0000 (64.2909) acc5: 87.2000 (85.4546) time: 2.0292 data: 2.0075 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 1.7567 (1.7307) acc1: 60.8000 (61.3714) acc5: 84.0000 (83.7714) time: 1.0730 data: 1.0526 max mem: 2905
Test: [30/50] eta: 0:00:28 loss: 1.7886 (1.7307) acc1: 59.2000 (60.9290) acc5: 82.4000 (83.4323) time: 1.1025 data: 1.0820 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.7137 (1.7667) acc1: 58.4000 (60.2927) acc5: 82.4000 (82.7707) time: 0.9474 data: 0.9271 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.8086 (1.7832) acc1: 58.4000 (59.9520) acc5: 82.4000 (82.6240) time: 0.8457 data: 0.8251 max mem: 2905
Test: Total time: 0:00:56 (1.1220 s / it)
* Acc@1 60.762 Acc@5 82.960 loss 1.757
Accuracy of the model on the 50000 test images: 60.8%
Max accuracy: 62.01%
Epoch: [246] [ 0/625] eta: 3:23:36 lr: 0.000357 min_lr: 0.000357 loss: 2.6648 (2.6648) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 19.5465 data: 19.2392 max mem: 2905
Epoch: [246] [200/625] eta: 0:13:50 lr: 0.000353 min_lr: 0.000353 loss: 2.7346 (2.7542) class_acc: 0.5781 (0.5838) weight_decay: 0.0500 (0.0500) grad_norm: 3.4239 (3.1977) time: 1.8317 data: 0.0006 max mem: 2905
Epoch: [246] [400/625] eta: 0:07:10 lr: 0.000349 min_lr: 0.000349 loss: 2.7236 (2.7589) class_acc: 0.5859 (0.5829) weight_decay: 0.0500 (0.0500) grad_norm: 3.0873 (3.3556) time: 1.8041 data: 0.0007 max mem: 2905
Epoch: [246] [600/625] eta: 0:00:47 lr: 0.000345 min_lr: 0.000345 loss: 2.7883 (2.7603) class_acc: 0.5742 (0.5825) weight_decay: 0.0500 (0.0500) grad_norm: 5.4175 (inf) time: 1.8969 data: 0.8910 max mem: 2905
Epoch: [246] [624/625] eta: 0:00:01 lr: 0.000344 min_lr: 0.000344 loss: 2.7428 (2.7597) class_acc: 0.5820 (0.5827) weight_decay: 0.0500 (0.0500) grad_norm: 5.7122 (inf) time: 0.6722 data: 0.3728 max mem: 2905
Epoch: [246] Total time: 0:19:39 (1.8868 s / it)
Averaged stats: lr: 0.000344 min_lr: 0.000344 loss: 2.7428 (2.7590) class_acc: 0.5820 (0.5821) weight_decay: 0.0500 (0.0500) grad_norm: 5.7122 (inf)
Test: [ 0/50] eta: 0:10:12 loss: 1.6667 (1.6667) acc1: 62.4000 (62.4000) acc5: 86.4000 (86.4000) time: 12.2585 data: 12.2244 max mem: 2905
Test: [10/50] eta: 0:01:16 loss: 1.4992 (1.5292) acc1: 65.6000 (66.1091) acc5: 86.4000 (85.6727) time: 1.9172 data: 1.8927 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 1.6921 (1.7046) acc1: 64.0000 (62.4762) acc5: 83.2000 (83.5429) time: 0.8749 data: 0.8537 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 1.8057 (1.7293) acc1: 57.6000 (61.5484) acc5: 81.6000 (83.2258) time: 0.8376 data: 0.8178 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.7670 (1.7485) acc1: 57.6000 (61.1707) acc5: 82.4000 (83.0634) time: 0.6940 data: 0.6742 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.7713 (1.7519) acc1: 58.4000 (60.9760) acc5: 80.8000 (82.8960) time: 0.5623 data: 0.5435 max mem: 2905
Test: Total time: 0:00:46 (0.9349 s / it)
* Acc@1 61.762 Acc@5 83.206 loss 1.718
Accuracy of the model on the 50000 test images: 61.8%
Max accuracy: 62.01%
Epoch: [247] [ 0/625] eta: 3:26:35 lr: 0.000344 min_lr: 0.000344 loss: 2.8539 (2.8539) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 19.8328 data: 17.3093 max mem: 2905
Epoch: [247] [200/625] eta: 0:13:45 lr: 0.000340 min_lr: 0.000340 loss: 2.7122 (2.7429) class_acc: 0.5977 (0.5877) weight_decay: 0.0500 (0.0500) grad_norm: 2.1684 (2.9796) time: 1.9026 data: 0.0481 max mem: 2905
Epoch: [247] [400/625] eta: 0:07:10 lr: 0.000336 min_lr: 0.000336 loss: 2.7419 (2.7444) class_acc: 0.5781 (0.5859) weight_decay: 0.0500 (0.0500) grad_norm: 2.9002 (3.2321) time: 1.7504 data: 0.0677 max mem: 2905
Epoch: [247] [600/625] eta: 0:00:48 lr: 0.000332 min_lr: 0.000332 loss: 2.7652 (2.7500) class_acc: 0.5703 (0.5845) weight_decay: 0.0500 (0.0500) grad_norm: 3.4353 (3.2385) time: 1.9152 data: 0.0176 max mem: 2905
Epoch: [247] [624/625] eta: 0:00:01 lr: 0.000332 min_lr: 0.000332 loss: 2.7836 (2.7508) class_acc: 0.5938 (0.5845) weight_decay: 0.0500 (0.0500) grad_norm: 3.7755 (3.2782) time: 0.7040 data: 0.0314 max mem: 2905
Epoch: [247] Total time: 0:19:51 (1.9058 s / it)
Averaged stats: lr: 0.000332 min_lr: 0.000332 loss: 2.7836 (2.7552) class_acc: 0.5938 (0.5830) weight_decay: 0.0500 (0.0500) grad_norm: 3.7755 (3.2782)
Test: [ 0/50] eta: 0:10:14 loss: 1.7678 (1.7678) acc1: 60.0000 (60.0000) acc5: 86.4000 (86.4000) time: 12.2941 data: 12.2687 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 1.6385 (1.6347) acc1: 63.2000 (63.5636) acc5: 85.6000 (84.7273) time: 2.1481 data: 2.1265 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.7778 (1.8048) acc1: 60.0000 (59.7333) acc5: 83.2000 (83.0857) time: 1.1696 data: 1.1496 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.8623 (1.7954) acc1: 59.2000 (60.1806) acc5: 83.2000 (82.9936) time: 1.1259 data: 1.1068 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.7402 (1.8095) acc1: 60.0000 (59.8049) acc5: 81.6000 (82.4195) time: 0.7296 data: 0.7103 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.7686 (1.8017) acc1: 60.8000 (60.1120) acc5: 81.6000 (82.2560) time: 0.6405 data: 0.6197 max mem: 2905
Test: Total time: 0:00:51 (1.0276 s / it)
* Acc@1 60.984 Acc@5 82.372 loss 1.780
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 62.01%
Epoch: [248] [ 0/625] eta: 3:36:18 lr: 0.000332 min_lr: 0.000332 loss: 2.7000 (2.7000) class_acc: 0.6133 (0.6133) weight_decay: 0.0500 (0.0500) time: 20.7662 data: 18.4724 max mem: 2905
Epoch: [248] [200/625] eta: 0:14:37 lr: 0.000328 min_lr: 0.000328 loss: 2.7880 (2.7437) class_acc: 0.5820 (0.5872) weight_decay: 0.0500 (0.0500) grad_norm: 2.7727 (3.2098) time: 1.7260 data: 0.0006 max mem: 2905
Epoch: [248] [400/625] eta: 0:07:41 lr: 0.000324 min_lr: 0.000324 loss: 2.7056 (2.7480) class_acc: 0.5820 (0.5846) weight_decay: 0.0500 (0.0500) grad_norm: 2.9931 (3.2920) time: 2.1214 data: 0.0006 max mem: 2905
Epoch: [248] [600/625] eta: 0:00:51 lr: 0.000320 min_lr: 0.000320 loss: 2.7750 (2.7493) class_acc: 0.5742 (0.5833) weight_decay: 0.0500 (0.0500) grad_norm: 2.8891 (3.2853) time: 2.0283 data: 0.0006 max mem: 2905
Epoch: [248] [624/625] eta: 0:00:01 lr: 0.000320 min_lr: 0.000320 loss: 2.7426 (2.7486) class_acc: 0.5820 (0.5834) weight_decay: 0.0500 (0.0500) grad_norm: 3.0953 (3.2853) time: 0.8414 data: 0.0013 max mem: 2905
Epoch: [248] Total time: 0:20:48 (1.9982 s / it)
Averaged stats: lr: 0.000320 min_lr: 0.000320 loss: 2.7426 (2.7526) class_acc: 0.5820 (0.5836) weight_decay: 0.0500 (0.0500) grad_norm: 3.0953 (3.2853)
Test: [ 0/50] eta: 0:09:00 loss: 1.5439 (1.5439) acc1: 60.8000 (60.8000) acc5: 86.4000 (86.4000) time: 10.8025 data: 10.7716 max mem: 2905
Test: [10/50] eta: 0:01:18 loss: 1.4529 (1.4601) acc1: 66.4000 (67.8545) acc5: 88.0000 (87.3455) time: 1.9718 data: 1.9526 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 1.6206 (1.6176) acc1: 63.2000 (63.8857) acc5: 85.6000 (85.4857) time: 1.1564 data: 1.1389 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.7389 (1.6359) acc1: 59.2000 (63.1742) acc5: 84.0000 (84.8774) time: 1.1676 data: 1.1490 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.7198 (1.6502) acc1: 62.4000 (63.0244) acc5: 82.4000 (84.5463) time: 0.8768 data: 0.8564 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.7108 (1.6633) acc1: 61.6000 (62.5760) acc5: 83.2000 (84.3840) time: 0.7465 data: 0.7260 max mem: 2905
Test: Total time: 0:00:51 (1.0378 s / it)
* Acc@1 63.404 Acc@5 84.742 loss 1.627
Accuracy of the model on the 50000 test images: 63.4%
Max accuracy: 63.40%
Epoch: [249] [ 0/625] eta: 3:27:44 lr: 0.000320 min_lr: 0.000320 loss: 2.8956 (2.8956) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 19.9438 data: 19.8201 max mem: 2905
Epoch: [249] [200/625] eta: 0:14:48 lr: 0.000316 min_lr: 0.000316 loss: 2.6596 (2.7379) class_acc: 0.5977 (0.5869) weight_decay: 0.0500 (0.0500) grad_norm: 2.8069 (3.5533) time: 2.1266 data: 0.0007 max mem: 2905
Epoch: [249] [400/625] eta: 0:07:38 lr: 0.000312 min_lr: 0.000312 loss: 2.7756 (2.7489) class_acc: 0.5820 (0.5852) weight_decay: 0.0500 (0.0500) grad_norm: 4.6658 (3.7656) time: 2.0893 data: 0.0006 max mem: 2905
Epoch: [249] [600/625] eta: 0:00:50 lr: 0.000308 min_lr: 0.000308 loss: 2.7614 (2.7504) class_acc: 0.5820 (0.5845) weight_decay: 0.0500 (0.0500) grad_norm: 2.7433 (3.7822) time: 1.9091 data: 0.0006 max mem: 2905
Epoch: [249] [624/625] eta: 0:00:01 lr: 0.000308 min_lr: 0.000308 loss: 2.7108 (2.7491) class_acc: 0.5898 (0.5848) weight_decay: 0.0500 (0.0500) grad_norm: 2.8314 (3.7907) time: 0.8130 data: 0.0109 max mem: 2905
Epoch: [249] Total time: 0:20:43 (1.9895 s / it)
Averaged stats: lr: 0.000308 min_lr: 0.000308 loss: 2.7108 (2.7496) class_acc: 0.5898 (0.5843) weight_decay: 0.0500 (0.0500) grad_norm: 2.8314 (3.7907)
Test: [ 0/50] eta: 0:09:41 loss: 1.7079 (1.7079) acc1: 61.6000 (61.6000) acc5: 83.2000 (83.2000) time: 11.6289 data: 11.5878 max mem: 2905
Test: [10/50] eta: 0:01:20 loss: 1.5484 (1.5617) acc1: 64.0000 (65.4545) acc5: 84.8000 (84.6546) time: 2.0162 data: 1.9944 max mem: 2905
Test: [20/50] eta: 0:00:48 loss: 1.7695 (1.7363) acc1: 60.8000 (61.2191) acc5: 82.4000 (82.8952) time: 1.1250 data: 1.1055 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.7879 (1.7272) acc1: 59.2000 (61.3419) acc5: 82.4000 (83.2258) time: 1.1621 data: 1.1432 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 1.7082 (1.7476) acc1: 60.8000 (61.1902) acc5: 83.2000 (82.9073) time: 0.9970 data: 0.9781 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.6476 (1.7471) acc1: 59.2000 (61.1200) acc5: 83.2000 (82.8960) time: 0.8783 data: 0.8599 max mem: 2905
Test: Total time: 0:00:55 (1.1118 s / it)
* Acc@1 61.994 Acc@5 83.418 loss 1.715
Accuracy of the model on the 50000 test images: 62.0%
Max accuracy: 63.40%
Epoch: [250] [ 0/625] eta: 3:50:13 lr: 0.000307 min_lr: 0.000307 loss: 2.5409 (2.5409) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 22.1023 data: 18.4403 max mem: 2905
Epoch: [250] [200/625] eta: 0:14:38 lr: 0.000304 min_lr: 0.000304 loss: 2.7303 (2.7308) class_acc: 0.5742 (0.5875) weight_decay: 0.0500 (0.0500) grad_norm: 2.6854 (3.1862) time: 1.9584 data: 0.0009 max mem: 2905
Epoch: [250] [400/625] eta: 0:07:35 lr: 0.000300 min_lr: 0.000300 loss: 2.7362 (2.7350) class_acc: 0.5898 (0.5861) weight_decay: 0.0500 (0.0500) grad_norm: 2.6315 (3.2845) time: 2.0521 data: 0.0197 max mem: 2905
Epoch: [250] [600/625] eta: 0:00:50 lr: 0.000296 min_lr: 0.000296 loss: 2.7671 (2.7396) class_acc: 0.5664 (0.5857) weight_decay: 0.0500 (0.0500) grad_norm: 2.9016 (3.3349) time: 2.0908 data: 0.0007 max mem: 2905
Epoch: [250] [624/625] eta: 0:00:01 lr: 0.000296 min_lr: 0.000296 loss: 2.7524 (2.7401) class_acc: 0.5859 (0.5855) weight_decay: 0.0500 (0.0500) grad_norm: 2.9613 (3.3660) time: 0.8337 data: 0.0016 max mem: 2905
Epoch: [250] Total time: 0:20:34 (1.9759 s / it)
Averaged stats: lr: 0.000296 min_lr: 0.000296 loss: 2.7524 (2.7462) class_acc: 0.5859 (0.5850) weight_decay: 0.0500 (0.0500) grad_norm: 2.9613 (3.3660)
Test: [ 0/50] eta: 0:08:56 loss: 1.7887 (1.7887) acc1: 58.4000 (58.4000) acc5: 85.6000 (85.6000) time: 10.7376 data: 10.7108 max mem: 2905
Test: [10/50] eta: 0:01:00 loss: 1.5495 (1.5564) acc1: 66.4000 (66.1091) acc5: 85.6000 (85.8909) time: 1.5227 data: 1.5022 max mem: 2905
Test: [20/50] eta: 0:00:39 loss: 1.6848 (1.7118) acc1: 62.4000 (62.7048) acc5: 83.2000 (84.0381) time: 0.8385 data: 0.8184 max mem: 2905
Test: [30/50] eta: 0:00:23 loss: 1.8307 (1.7295) acc1: 59.2000 (62.1936) acc5: 81.6000 (83.4323) time: 1.0025 data: 0.9827 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.7892 (1.7399) acc1: 59.2000 (62.0683) acc5: 82.4000 (83.0829) time: 0.7374 data: 0.7183 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.7649 (1.7478) acc1: 59.2000 (61.7120) acc5: 81.6000 (82.7360) time: 0.4111 data: 0.3928 max mem: 2905
Test: Total time: 0:00:45 (0.9034 s / it)
* Acc@1 62.198 Acc@5 83.376 loss 1.706
Accuracy of the model on the 50000 test images: 62.2%
Max accuracy: 63.40%
Epoch: [251] [ 0/625] eta: 3:37:07 lr: 0.000296 min_lr: 0.000296 loss: 2.8539 (2.8539) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 20.8443 data: 17.5066 max mem: 2905
Epoch: [251] [200/625] eta: 0:14:30 lr: 0.000292 min_lr: 0.000292 loss: 2.7129 (2.7388) class_acc: 0.5938 (0.5871) weight_decay: 0.0500 (0.0500) grad_norm: 2.8670 (3.3793) time: 2.0413 data: 0.0007 max mem: 2905
Epoch: [251] [400/625] eta: 0:07:29 lr: 0.000288 min_lr: 0.000288 loss: 2.7485 (2.7504) class_acc: 0.5938 (0.5842) weight_decay: 0.0500 (0.0500) grad_norm: 2.4842 (3.3045) time: 1.9685 data: 0.0007 max mem: 2905
Epoch: [251] [600/625] eta: 0:00:50 lr: 0.000284 min_lr: 0.000284 loss: 2.7451 (2.7488) class_acc: 0.5625 (0.5844) weight_decay: 0.0500 (0.0500) grad_norm: 2.5057 (3.2126) time: 2.0468 data: 0.0742 max mem: 2905
Epoch: [251] [624/625] eta: 0:00:01 lr: 0.000284 min_lr: 0.000284 loss: 2.7682 (2.7501) class_acc: 0.5703 (0.5840) weight_decay: 0.0500 (0.0500) grad_norm: 2.6611 (3.2146) time: 1.0430 data: 0.0038 max mem: 2905
Epoch: [251] Total time: 0:20:44 (1.9906 s / it)
Averaged stats: lr: 0.000284 min_lr: 0.000284 loss: 2.7682 (2.7443) class_acc: 0.5703 (0.5854) weight_decay: 0.0500 (0.0500) grad_norm: 2.6611 (3.2146)
Test: [ 0/50] eta: 0:10:06 loss: 1.6305 (1.6305) acc1: 62.4000 (62.4000) acc5: 84.8000 (84.8000) time: 12.1314 data: 12.0983 max mem: 2905
Test: [10/50] eta: 0:01:19 loss: 1.5044 (1.5755) acc1: 66.4000 (66.5455) acc5: 86.4000 (85.6727) time: 1.9828 data: 1.9625 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.7115 (1.7187) acc1: 62.4000 (63.0095) acc5: 84.0000 (84.2667) time: 1.1436 data: 1.1240 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 1.7853 (1.7146) acc1: 60.8000 (63.3806) acc5: 82.4000 (83.9484) time: 1.3518 data: 1.3296 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 1.6920 (1.7279) acc1: 63.2000 (62.7317) acc5: 83.2000 (83.7659) time: 1.0631 data: 1.0414 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.6563 (1.7186) acc1: 63.2000 (62.9760) acc5: 84.8000 (84.0000) time: 0.7976 data: 0.7785 max mem: 2905
Test: Total time: 0:00:58 (1.1614 s / it)
* Acc@1 63.504 Acc@5 84.526 loss 1.685
Accuracy of the model on the 50000 test images: 63.5%
Max accuracy: 63.50%
Epoch: [252] [ 0/625] eta: 3:54:36 lr: 0.000284 min_lr: 0.000284 loss: 3.0420 (3.0420) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 22.5231 data: 18.5886 max mem: 2905
Epoch: [252] [200/625] eta: 0:15:02 lr: 0.000280 min_lr: 0.000280 loss: 2.7254 (2.7393) class_acc: 0.5742 (0.5848) weight_decay: 0.0500 (0.0500) grad_norm: 2.9925 (3.4662) time: 2.2164 data: 0.8763 max mem: 2905
Epoch: [252] [400/625] eta: 0:07:33 lr: 0.000277 min_lr: 0.000277 loss: 2.7541 (2.7445) class_acc: 0.5742 (0.5846) weight_decay: 0.0500 (0.0500) grad_norm: 3.7114 (3.6017) time: 2.0685 data: 0.0007 max mem: 2905
Epoch: [252] [600/625] eta: 0:00:49 lr: 0.000273 min_lr: 0.000273 loss: 2.6950 (2.7407) class_acc: 0.5938 (0.5863) weight_decay: 0.0500 (0.0500) grad_norm: 2.8804 (3.4904) time: 1.8444 data: 0.0006 max mem: 2905
Epoch: [252] [624/625] eta: 0:00:01 lr: 0.000273 min_lr: 0.000273 loss: 2.7226 (2.7408) class_acc: 0.5781 (0.5862) weight_decay: 0.0500 (0.0500) grad_norm: 3.1792 (3.5127) time: 0.7131 data: 0.0014 max mem: 2905
Epoch: [252] Total time: 0:20:29 (1.9672 s / it)
Averaged stats: lr: 0.000273 min_lr: 0.000273 loss: 2.7226 (2.7406) class_acc: 0.5781 (0.5862) weight_decay: 0.0500 (0.0500) grad_norm: 3.1792 (3.5127)
Test: [ 0/50] eta: 0:10:37 loss: 1.6181 (1.6181) acc1: 57.6000 (57.6000) acc5: 88.8000 (88.8000) time: 12.7449 data: 12.7117 max mem: 2905
Test: [10/50] eta: 0:01:25 loss: 1.3761 (1.3977) acc1: 68.8000 (68.2182) acc5: 88.0000 (88.3636) time: 2.1367 data: 2.1169 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.5155 (1.5729) acc1: 63.2000 (64.4191) acc5: 85.6000 (86.2095) time: 1.1292 data: 1.1090 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.6948 (1.5814) acc1: 62.4000 (64.5936) acc5: 84.8000 (85.7290) time: 1.1339 data: 1.1134 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.6002 (1.6103) acc1: 63.2000 (64.0390) acc5: 84.8000 (85.1707) time: 0.8725 data: 0.8520 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5257 (1.6124) acc1: 64.0000 (64.0480) acc5: 84.8000 (85.1360) time: 0.8463 data: 0.8264 max mem: 2905
Test: Total time: 0:00:53 (1.0719 s / it)
* Acc@1 64.746 Acc@5 85.416 loss 1.572
Accuracy of the model on the 50000 test images: 64.7%
Max accuracy: 64.75%
Epoch: [253] [ 0/625] eta: 3:44:35 lr: 0.000273 min_lr: 0.000273 loss: 2.7552 (2.7552) class_acc: 0.5938 (0.5938) weight_decay: 0.0500 (0.0500) time: 21.5616 data: 19.5654 max mem: 2905
Epoch: [253] [200/625] eta: 0:14:03 lr: 0.000269 min_lr: 0.000269 loss: 2.7445 (2.7441) class_acc: 0.5820 (0.5851) weight_decay: 0.0500 (0.0500) grad_norm: 3.0631 (3.2907) time: 2.0001 data: 0.0007 max mem: 2905
Epoch: [253] [400/625] eta: 0:07:21 lr: 0.000265 min_lr: 0.000265 loss: 2.7319 (2.7394) class_acc: 0.5781 (0.5864) weight_decay: 0.0500 (0.0500) grad_norm: 3.4387 (inf) time: 1.8362 data: 0.0007 max mem: 2905
Epoch: [253] [600/625] eta: 0:00:49 lr: 0.000262 min_lr: 0.000262 loss: 2.7257 (2.7411) class_acc: 0.6055 (0.5868) weight_decay: 0.0500 (0.0500) grad_norm: 2.4727 (inf) time: 2.0141 data: 0.0007 max mem: 2905
Epoch: [253] [624/625] eta: 0:00:01 lr: 0.000261 min_lr: 0.000261 loss: 2.7982 (2.7432) class_acc: 0.5703 (0.5863) weight_decay: 0.0500 (0.0500) grad_norm: 2.2750 (inf) time: 0.6853 data: 0.0014 max mem: 2905
Epoch: [253] Total time: 0:20:27 (1.9635 s / it)
Averaged stats: lr: 0.000261 min_lr: 0.000261 loss: 2.7982 (2.7398) class_acc: 0.5703 (0.5866) weight_decay: 0.0500 (0.0500) grad_norm: 2.2750 (inf)
Test: [ 0/50] eta: 0:10:12 loss: 1.5349 (1.5349) acc1: 63.2000 (63.2000) acc5: 89.6000 (89.6000) time: 12.2600 data: 12.2342 max mem: 2905
Test: [10/50] eta: 0:01:11 loss: 1.4989 (1.4906) acc1: 67.2000 (67.0545) acc5: 85.6000 (86.1091) time: 1.7872 data: 1.7648 max mem: 2905
Test: [20/50] eta: 0:00:40 loss: 1.5942 (1.6474) acc1: 62.4000 (62.8952) acc5: 84.0000 (84.5714) time: 0.8206 data: 0.8002 max mem: 2905
Test: [30/50] eta: 0:00:25 loss: 1.7724 (1.6507) acc1: 60.0000 (62.8645) acc5: 83.2000 (84.4387) time: 0.9579 data: 0.9394 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.7413 (1.6688) acc1: 61.6000 (62.8488) acc5: 83.2000 (84.2342) time: 0.9932 data: 0.9747 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5952 (1.6775) acc1: 61.6000 (62.8000) acc5: 82.4000 (84.0960) time: 0.7428 data: 0.7238 max mem: 2905
Test: Total time: 0:00:53 (1.0765 s / it)
* Acc@1 63.346 Acc@5 84.782 loss 1.643
Accuracy of the model on the 50000 test images: 63.3%
Max accuracy: 64.75%
Epoch: [254] [ 0/625] eta: 3:44:14 lr: 0.000261 min_lr: 0.000261 loss: 2.7654 (2.7654) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 21.5264 data: 19.3171 max mem: 2905
Epoch: [254] [200/625] eta: 0:14:53 lr: 0.000258 min_lr: 0.000258 loss: 2.7006 (2.7169) class_acc: 0.5820 (0.5924) weight_decay: 0.0500 (0.0500) grad_norm: 3.4051 (3.5633) time: 2.1138 data: 0.3264 max mem: 2905
Epoch: [254] [400/625] eta: 0:07:36 lr: 0.000254 min_lr: 0.000254 loss: 2.6711 (2.7241) class_acc: 0.5859 (0.5898) weight_decay: 0.0500 (0.0500) grad_norm: 2.5500 (3.4284) time: 1.9595 data: 0.0126 max mem: 2905
Epoch: [254] [600/625] eta: 0:00:50 lr: 0.000251 min_lr: 0.000251 loss: 2.7435 (2.7343) class_acc: 0.5820 (0.5884) weight_decay: 0.0500 (0.0500) grad_norm: 2.5560 (3.4639) time: 2.1111 data: 0.0007 max mem: 2905
Epoch: [254] [624/625] eta: 0:00:01 lr: 0.000251 min_lr: 0.000251 loss: 2.7208 (2.7339) class_acc: 0.5898 (0.5887) weight_decay: 0.0500 (0.0500) grad_norm: 2.7047 (3.4401) time: 0.7457 data: 0.0072 max mem: 2905
Epoch: [254] Total time: 0:20:31 (1.9702 s / it)
Averaged stats: lr: 0.000251 min_lr: 0.000251 loss: 2.7208 (2.7364) class_acc: 0.5898 (0.5874) weight_decay: 0.0500 (0.0500) grad_norm: 2.7047 (3.4401)
Test: [ 0/50] eta: 0:10:10 loss: 1.4909 (1.4909) acc1: 64.0000 (64.0000) acc5: 88.0000 (88.0000) time: 12.2019 data: 12.1770 max mem: 2905
Test: [10/50] eta: 0:01:24 loss: 1.3238 (1.3467) acc1: 69.6000 (70.0364) acc5: 88.0000 (88.2182) time: 2.1171 data: 2.0980 max mem: 2905
Test: [20/50] eta: 0:00:46 loss: 1.4950 (1.5084) acc1: 66.4000 (66.5143) acc5: 86.4000 (86.2476) time: 1.0215 data: 1.0032 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 1.6170 (1.5169) acc1: 64.0000 (65.9355) acc5: 84.8000 (86.3226) time: 0.8513 data: 0.8334 max mem: 2905
Test: [40/50] eta: 0:00:11 loss: 1.5418 (1.5407) acc1: 62.4000 (65.4634) acc5: 84.8000 (85.9122) time: 0.7529 data: 0.7343 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5919 (1.5532) acc1: 63.2000 (64.9600) acc5: 84.8000 (85.6960) time: 0.6818 data: 0.6626 max mem: 2905
Test: Total time: 0:00:51 (1.0252 s / it)
* Acc@1 65.522 Acc@5 86.162 loss 1.520
Accuracy of the model on the 50000 test images: 65.5%
Max accuracy: 65.52%
Epoch: [255] [ 0/625] eta: 3:26:30 lr: 0.000250 min_lr: 0.000250 loss: 2.8605 (2.8605) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 19.8255 data: 19.4243 max mem: 2905
Epoch: [255] [200/625] eta: 0:14:09 lr: 0.000247 min_lr: 0.000247 loss: 2.7077 (2.7239) class_acc: 0.5977 (0.5927) weight_decay: 0.0500 (0.0500) grad_norm: 3.8561 (3.1735) time: 1.9528 data: 0.3191 max mem: 2905
Epoch: [255] [400/625] eta: 0:07:20 lr: 0.000244 min_lr: 0.000244 loss: 2.6997 (2.7316) class_acc: 0.6094 (0.5896) weight_decay: 0.0500 (0.0500) grad_norm: 2.6714 (3.2675) time: 1.9264 data: 0.0013 max mem: 2905
Epoch: [255] [600/625] eta: 0:00:48 lr: 0.000240 min_lr: 0.000240 loss: 2.7242 (2.7341) class_acc: 0.5938 (0.5891) weight_decay: 0.0500 (0.0500) grad_norm: 3.7159 (3.2027) time: 1.7563 data: 0.0066 max mem: 2905
Epoch: [255] [624/625] eta: 0:00:01 lr: 0.000240 min_lr: 0.000240 loss: 2.7479 (2.7336) class_acc: 0.5820 (0.5893) weight_decay: 0.0500 (0.0500) grad_norm: 3.4637 (3.1991) time: 1.1358 data: 0.0260 max mem: 2905
Epoch: [255] Total time: 0:20:04 (1.9264 s / it)
Averaged stats: lr: 0.000240 min_lr: 0.000240 loss: 2.7479 (2.7322) class_acc: 0.5820 (0.5884) weight_decay: 0.0500 (0.0500) grad_norm: 3.4637 (3.1991)
Test: [ 0/50] eta: 0:08:20 loss: 1.5699 (1.5699) acc1: 62.4000 (62.4000) acc5: 88.0000 (88.0000) time: 10.0079 data: 9.9798 max mem: 2905
Test: [10/50] eta: 0:01:09 loss: 1.4135 (1.4245) acc1: 67.2000 (67.9273) acc5: 87.2000 (87.3455) time: 1.7276 data: 1.7048 max mem: 2905
Test: [20/50] eta: 0:00:38 loss: 1.5388 (1.5779) acc1: 65.6000 (64.9905) acc5: 85.6000 (85.1810) time: 0.8387 data: 0.8168 max mem: 2905
Test: [30/50] eta: 0:00:22 loss: 1.7482 (1.5958) acc1: 61.6000 (64.3871) acc5: 85.6000 (85.2645) time: 0.8141 data: 0.7946 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.6335 (1.6109) acc1: 61.6000 (64.1756) acc5: 85.6000 (84.8585) time: 0.8951 data: 0.8764 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.6059 (1.6096) acc1: 64.8000 (64.3200) acc5: 85.6000 (84.9760) time: 0.6413 data: 0.6216 max mem: 2905
Test: Total time: 0:00:48 (0.9653 s / it)
* Acc@1 65.094 Acc@5 85.712 loss 1.567
Accuracy of the model on the 50000 test images: 65.1%
Max accuracy: 65.52%
Epoch: [256] [ 0/625] eta: 3:33:46 lr: 0.000240 min_lr: 0.000240 loss: 2.8370 (2.8370) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 20.5227 data: 18.6952 max mem: 2905
Epoch: [256] [200/625] eta: 0:13:50 lr: 0.000236 min_lr: 0.000236 loss: 2.6545 (2.7142) class_acc: 0.5859 (0.5924) weight_decay: 0.0500 (0.0500) grad_norm: 2.5862 (3.1751) time: 1.6284 data: 0.0050 max mem: 2905
Epoch: [256] [400/625] eta: 0:07:14 lr: 0.000233 min_lr: 0.000233 loss: 2.7374 (2.7253) class_acc: 0.5859 (0.5902) weight_decay: 0.0500 (0.0500) grad_norm: 2.7179 (3.0621) time: 1.9977 data: 0.0008 max mem: 2905
Epoch: [256] [600/625] eta: 0:00:47 lr: 0.000230 min_lr: 0.000230 loss: 2.7958 (2.7376) class_acc: 0.5703 (0.5874) weight_decay: 0.0500 (0.0500) grad_norm: 4.5650 (3.5334) time: 1.8449 data: 0.0008 max mem: 2905
Epoch: [256] [624/625] eta: 0:00:01 lr: 0.000229 min_lr: 0.000229 loss: 2.7231 (2.7376) class_acc: 0.5781 (0.5872) weight_decay: 0.0500 (0.0500) grad_norm: 4.7181 (3.5818) time: 0.7537 data: 0.0015 max mem: 2905
Epoch: [256] Total time: 0:19:44 (1.8956 s / it)
Averaged stats: lr: 0.000229 min_lr: 0.000229 loss: 2.7231 (2.7325) class_acc: 0.5781 (0.5886) weight_decay: 0.0500 (0.0500) grad_norm: 4.7181 (3.5818)
Test: [ 0/50] eta: 0:10:22 loss: 1.4078 (1.4078) acc1: 64.0000 (64.0000) acc5: 91.2000 (91.2000) time: 12.4423 data: 12.4060 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 1.4078 (1.3778) acc1: 68.8000 (69.7455) acc5: 88.8000 (88.1455) time: 2.0697 data: 2.0498 max mem: 2905
Test: [20/50] eta: 0:00:45 loss: 1.5660 (1.5334) acc1: 64.8000 (65.7905) acc5: 86.4000 (86.4381) time: 0.9580 data: 0.9385 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 1.6332 (1.5371) acc1: 63.2000 (65.4194) acc5: 85.6000 (86.1677) time: 0.7938 data: 0.7741 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.5259 (1.5596) acc1: 64.0000 (65.0732) acc5: 85.6000 (85.7561) time: 0.6158 data: 0.5961 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.5176 (1.5672) acc1: 64.0000 (64.8960) acc5: 84.8000 (85.5520) time: 0.5330 data: 0.5138 max mem: 2905
Test: Total time: 0:00:46 (0.9385 s / it)
* Acc@1 65.690 Acc@5 86.250 loss 1.534
Accuracy of the model on the 50000 test images: 65.7%
Max accuracy: 65.69%
Epoch: [257] [ 0/625] eta: 3:45:18 lr: 0.000229 min_lr: 0.000229 loss: 2.5982 (2.5982) class_acc: 0.6133 (0.6133) weight_decay: 0.0500 (0.0500) time: 21.6303 data: 16.3646 max mem: 2905
Epoch: [257] [200/625] eta: 0:14:25 lr: 0.000226 min_lr: 0.000226 loss: 2.7106 (2.7174) class_acc: 0.5820 (0.5894) weight_decay: 0.0500 (0.0500) grad_norm: 3.2095 (3.2202) time: 1.7910 data: 0.0011 max mem: 2905
Epoch: [257] [400/625] eta: 0:07:27 lr: 0.000223 min_lr: 0.000223 loss: 2.7566 (2.7293) class_acc: 0.5742 (0.5893) weight_decay: 0.0500 (0.0500) grad_norm: 3.3751 (3.3419) time: 1.9253 data: 0.0008 max mem: 2905
Epoch: [257] [600/625] eta: 0:00:49 lr: 0.000219 min_lr: 0.000219 loss: 2.6901 (2.7289) class_acc: 0.5859 (0.5891) weight_decay: 0.0500 (0.0500) grad_norm: 2.6949 (3.3159) time: 2.0633 data: 0.0008 max mem: 2905
Epoch: [257] [624/625] eta: 0:00:01 lr: 0.000219 min_lr: 0.000219 loss: 2.7255 (2.7292) class_acc: 0.5742 (0.5889) weight_decay: 0.0500 (0.0500) grad_norm: 2.3787 (3.2976) time: 0.8807 data: 0.0018 max mem: 2905
Epoch: [257] Total time: 0:20:15 (1.9455 s / it)
Averaged stats: lr: 0.000219 min_lr: 0.000219 loss: 2.7255 (2.7289) class_acc: 0.5742 (0.5894) weight_decay: 0.0500 (0.0500) grad_norm: 2.3787 (3.2976)
Test: [ 0/50] eta: 0:11:05 loss: 1.5713 (1.5713) acc1: 64.0000 (64.0000) acc5: 87.2000 (87.2000) time: 13.3129 data: 13.2860 max mem: 2905
Test: [10/50] eta: 0:01:27 loss: 1.3897 (1.4096) acc1: 69.6000 (69.9636) acc5: 88.8000 (88.0727) time: 2.1845 data: 2.1655 max mem: 2905
Test: [20/50] eta: 0:00:47 loss: 1.5689 (1.5329) acc1: 66.4000 (66.7048) acc5: 87.2000 (87.0095) time: 1.0032 data: 0.9843 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 1.6281 (1.5558) acc1: 64.8000 (66.3484) acc5: 86.4000 (86.3742) time: 0.8768 data: 0.8575 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.5405 (1.5741) acc1: 64.8000 (65.6390) acc5: 85.6000 (85.9512) time: 0.8397 data: 0.8204 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5772 (1.5809) acc1: 64.0000 (65.5520) acc5: 85.6000 (85.8080) time: 0.7819 data: 0.7610 max mem: 2905
Test: Total time: 0:00:54 (1.0885 s / it)
* Acc@1 65.784 Acc@5 86.180 loss 1.553
Accuracy of the model on the 50000 test images: 65.8%
Max accuracy: 65.78%
Epoch: [258] [ 0/625] eta: 4:14:33 lr: 0.000219 min_lr: 0.000219 loss: 2.7456 (2.7456) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 24.4371 data: 19.5015 max mem: 2905
Epoch: [258] [200/625] eta: 0:14:19 lr: 0.000216 min_lr: 0.000216 loss: 2.6921 (2.7116) class_acc: 0.5938 (0.5936) weight_decay: 0.0500 (0.0500) grad_norm: 3.3964 (3.2895) time: 1.8386 data: 0.0007 max mem: 2905
Epoch: [258] [400/625] eta: 0:07:22 lr: 0.000212 min_lr: 0.000212 loss: 2.6857 (2.7084) class_acc: 0.5938 (0.5940) weight_decay: 0.0500 (0.0500) grad_norm: 3.9483 (3.4547) time: 1.7934 data: 0.0007 max mem: 2905
Epoch: [258] [600/625] eta: 0:00:48 lr: 0.000209 min_lr: 0.000209 loss: 2.6951 (2.7124) class_acc: 0.5938 (0.5930) weight_decay: 0.0500 (0.0500) grad_norm: 2.6143 (3.3969) time: 1.9270 data: 0.0007 max mem: 2905
Epoch: [258] [624/625] eta: 0:00:01 lr: 0.000209 min_lr: 0.000209 loss: 2.7552 (2.7144) class_acc: 0.5820 (0.5926) weight_decay: 0.0500 (0.0500) grad_norm: 2.7592 (3.3967) time: 0.7888 data: 0.0016 max mem: 2905
Epoch: [258] Total time: 0:20:11 (1.9381 s / it)
Averaged stats: lr: 0.000209 min_lr: 0.000209 loss: 2.7552 (2.7240) class_acc: 0.5820 (0.5906) weight_decay: 0.0500 (0.0500) grad_norm: 2.7592 (3.3967)
Test: [ 0/50] eta: 0:09:24 loss: 1.5589 (1.5589) acc1: 64.0000 (64.0000) acc5: 84.0000 (84.0000) time: 11.2899 data: 11.2559 max mem: 2905
Test: [10/50] eta: 0:01:08 loss: 1.4893 (1.4482) acc1: 65.6000 (67.0545) acc5: 87.2000 (87.2000) time: 1.7222 data: 1.7018 max mem: 2905
Test: [20/50] eta: 0:00:36 loss: 1.5755 (1.6072) acc1: 63.2000 (64.2286) acc5: 86.4000 (85.8286) time: 0.7101 data: 0.6907 max mem: 2905
Test: [30/50] eta: 0:00:23 loss: 1.7574 (1.6328) acc1: 60.8000 (63.5097) acc5: 85.6000 (85.4710) time: 0.8587 data: 0.8384 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.7542 (1.6624) acc1: 60.8000 (63.0439) acc5: 84.0000 (84.7610) time: 0.8250 data: 0.8035 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.6714 (1.6670) acc1: 60.8000 (62.8000) acc5: 83.2000 (84.6400) time: 0.4191 data: 0.3987 max mem: 2905
Test: Total time: 0:00:44 (0.8928 s / it)
* Acc@1 63.704 Acc@5 84.766 loss 1.631
Accuracy of the model on the 50000 test images: 63.7%
Max accuracy: 65.78%
Epoch: [259] [ 0/625] eta: 3:26:00 lr: 0.000209 min_lr: 0.000209 loss: 2.9102 (2.9102) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 19.7768 data: 18.1604 max mem: 2905
Epoch: [259] [200/625] eta: 0:13:43 lr: 0.000206 min_lr: 0.000206 loss: 2.7059 (2.7221) class_acc: 0.5859 (0.5896) weight_decay: 0.0500 (0.0500) grad_norm: 2.9464 (3.4944) time: 1.8997 data: 0.0118 max mem: 2905
Epoch: [259] [400/625] eta: 0:07:13 lr: 0.000203 min_lr: 0.000203 loss: 2.6808 (2.7276) class_acc: 0.5977 (0.5881) weight_decay: 0.0500 (0.0500) grad_norm: 2.7679 (3.5310) time: 1.9214 data: 0.0136 max mem: 2905
Epoch: [259] [600/625] eta: 0:00:48 lr: 0.000199 min_lr: 0.000199 loss: 2.7244 (2.7279) class_acc: 0.5742 (0.5881) weight_decay: 0.0500 (0.0500) grad_norm: 2.8087 (3.4387) time: 2.1257 data: 0.0089 max mem: 2905
Epoch: [259] [624/625] eta: 0:00:01 lr: 0.000199 min_lr: 0.000199 loss: 2.7092 (2.7275) class_acc: 0.5859 (0.5883) weight_decay: 0.0500 (0.0500) grad_norm: 2.6717 (3.4193) time: 0.8008 data: 0.0017 max mem: 2905
Epoch: [259] Total time: 0:20:03 (1.9258 s / it)
Averaged stats: lr: 0.000199 min_lr: 0.000199 loss: 2.7092 (2.7235) class_acc: 0.5859 (0.5904) weight_decay: 0.0500 (0.0500) grad_norm: 2.6717 (3.4193)
Test: [ 0/50] eta: 0:10:44 loss: 1.3960 (1.3960) acc1: 69.6000 (69.6000) acc5: 87.2000 (87.2000) time: 12.8839 data: 12.8585 max mem: 2905
Test: [10/50] eta: 0:01:30 loss: 1.3465 (1.3265) acc1: 69.6000 (71.5636) acc5: 88.0000 (88.2182) time: 2.2662 data: 2.2435 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 1.4350 (1.4891) acc1: 68.0000 (67.7333) acc5: 87.2000 (86.7810) time: 1.2611 data: 1.2393 max mem: 2905
Test: [30/50] eta: 0:00:32 loss: 1.6341 (1.4996) acc1: 63.2000 (66.6323) acc5: 85.6000 (86.5032) time: 1.2560 data: 1.2345 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 1.4941 (1.5167) acc1: 62.4000 (65.9902) acc5: 85.6000 (86.1268) time: 0.8591 data: 0.8385 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.4805 (1.5255) acc1: 62.4000 (65.5520) acc5: 85.6000 (85.9360) time: 0.7805 data: 0.7609 max mem: 2905
Test: Total time: 0:00:56 (1.1286 s / it)
* Acc@1 66.380 Acc@5 86.542 loss 1.490
Accuracy of the model on the 50000 test images: 66.4%
Max accuracy: 66.38%
Epoch: [260] [ 0/625] eta: 3:19:49 lr: 0.000199 min_lr: 0.000199 loss: 2.7660 (2.7660) class_acc: 0.5391 (0.5391) weight_decay: 0.0500 (0.0500) time: 19.1827 data: 17.9289 max mem: 2905
Epoch: [260] [200/625] eta: 0:14:19 lr: 0.000196 min_lr: 0.000196 loss: 2.7087 (2.7025) class_acc: 0.5898 (0.5976) weight_decay: 0.0500 (0.0500) grad_norm: 3.3598 (3.7958) time: 1.9344 data: 0.0006 max mem: 2905
Epoch: [260] [400/625] eta: 0:07:22 lr: 0.000193 min_lr: 0.000193 loss: 2.7273 (2.7128) class_acc: 0.5898 (0.5934) weight_decay: 0.0500 (0.0500) grad_norm: 2.5857 (inf) time: 1.8016 data: 0.0007 max mem: 2905
Epoch: [260] [600/625] eta: 0:00:49 lr: 0.000190 min_lr: 0.000190 loss: 2.7195 (2.7200) class_acc: 0.5859 (0.5914) weight_decay: 0.0500 (0.0500) grad_norm: 3.6396 (inf) time: 2.0866 data: 0.0006 max mem: 2905
Epoch: [260] [624/625] eta: 0:00:01 lr: 0.000189 min_lr: 0.000189 loss: 2.7451 (2.7209) class_acc: 0.5859 (0.5914) weight_decay: 0.0500 (0.0500) grad_norm: 3.5698 (inf) time: 0.5684 data: 0.0017 max mem: 2905
Epoch: [260] Total time: 0:20:25 (1.9616 s / it)
Averaged stats: lr: 0.000189 min_lr: 0.000189 loss: 2.7451 (2.7205) class_acc: 0.5859 (0.5916) weight_decay: 0.0500 (0.0500) grad_norm: 3.5698 (inf)
Test: [ 0/50] eta: 0:10:06 loss: 1.4237 (1.4237) acc1: 69.6000 (69.6000) acc5: 88.8000 (88.8000) time: 12.1305 data: 12.0700 max mem: 2905
Test: [10/50] eta: 0:01:22 loss: 1.3327 (1.3290) acc1: 71.2000 (70.7636) acc5: 90.4000 (89.8182) time: 2.0737 data: 2.0507 max mem: 2905
Test: [20/50] eta: 0:00:49 loss: 1.4568 (1.4776) acc1: 66.4000 (66.8571) acc5: 88.8000 (88.2286) time: 1.1268 data: 1.1078 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.5571 (1.5026) acc1: 64.8000 (66.4000) acc5: 86.4000 (87.4581) time: 1.1486 data: 1.1300 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.5411 (1.5282) acc1: 64.8000 (65.7951) acc5: 85.6000 (86.8488) time: 0.9195 data: 0.9002 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5188 (1.5343) acc1: 64.8000 (65.6320) acc5: 85.6000 (86.8160) time: 0.8881 data: 0.8682 max mem: 2905
Test: Total time: 0:00:53 (1.0711 s / it)
* Acc@1 66.532 Acc@5 86.990 loss 1.497
Accuracy of the model on the 50000 test images: 66.5%
Max accuracy: 66.53%
Epoch: [261] [ 0/625] eta: 4:28:55 lr: 0.000189 min_lr: 0.000189 loss: 2.5993 (2.5993) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 25.8162 data: 18.7189 max mem: 2905
Epoch: [261] [200/625] eta: 0:14:47 lr: 0.000186 min_lr: 0.000186 loss: 2.7165 (2.7027) class_acc: 0.5742 (0.5928) weight_decay: 0.0500 (0.0500) grad_norm: 3.2815 (3.5401) time: 1.8274 data: 0.0007 max mem: 2905
Epoch: [261] [400/625] eta: 0:07:43 lr: 0.000183 min_lr: 0.000183 loss: 2.6623 (2.7087) class_acc: 0.5938 (0.5927) weight_decay: 0.0500 (0.0500) grad_norm: 2.9501 (3.4142) time: 2.0727 data: 0.0006 max mem: 2905
Epoch: [261] [600/625] eta: 0:00:51 lr: 0.000180 min_lr: 0.000180 loss: 2.7268 (2.7143) class_acc: 0.5859 (0.5919) weight_decay: 0.0500 (0.0500) grad_norm: 3.2833 (3.3762) time: 2.0241 data: 0.0007 max mem: 2905
Epoch: [261] [624/625] eta: 0:00:01 lr: 0.000180 min_lr: 0.000180 loss: 2.7475 (2.7152) class_acc: 0.5859 (0.5917) weight_decay: 0.0500 (0.0500) grad_norm: 2.7709 (3.3768) time: 0.8703 data: 0.0013 max mem: 2905
Epoch: [261] Total time: 0:20:50 (2.0001 s / it)
Averaged stats: lr: 0.000180 min_lr: 0.000180 loss: 2.7475 (2.7206) class_acc: 0.5859 (0.5913) weight_decay: 0.0500 (0.0500) grad_norm: 2.7709 (3.3768)
Test: [ 0/50] eta: 0:10:15 loss: 1.4869 (1.4869) acc1: 63.2000 (63.2000) acc5: 86.4000 (86.4000) time: 12.3170 data: 12.2876 max mem: 2905
Test: [10/50] eta: 0:01:28 loss: 1.3026 (1.3153) acc1: 71.2000 (70.5455) acc5: 88.8000 (88.2909) time: 2.2218 data: 2.2016 max mem: 2905
Test: [20/50] eta: 0:00:54 loss: 1.4055 (1.4615) acc1: 67.2000 (67.3905) acc5: 87.2000 (86.8952) time: 1.2754 data: 1.2548 max mem: 2905
Test: [30/50] eta: 0:00:31 loss: 1.5319 (1.4809) acc1: 63.2000 (66.5032) acc5: 87.2000 (86.8387) time: 1.2323 data: 1.2110 max mem: 2905
Test: [40/50] eta: 0:00:13 loss: 1.4787 (1.5068) acc1: 63.2000 (66.1073) acc5: 84.8000 (86.3220) time: 0.8046 data: 0.7844 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5062 (1.5128) acc1: 64.8000 (65.7920) acc5: 84.8000 (86.2240) time: 0.7535 data: 0.7338 max mem: 2905
Test: Total time: 0:00:54 (1.0987 s / it)
* Acc@1 66.388 Acc@5 86.686 loss 1.484
Accuracy of the model on the 50000 test images: 66.4%
Max accuracy: 66.53%
Epoch: [262] [ 0/625] eta: 3:32:44 lr: 0.000180 min_lr: 0.000180 loss: 2.6006 (2.6006) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.4236 data: 18.8559 max mem: 2905
Epoch: [262] [200/625] eta: 0:14:25 lr: 0.000177 min_lr: 0.000177 loss: 2.6758 (2.7199) class_acc: 0.6094 (0.5920) weight_decay: 0.0500 (0.0500) grad_norm: 3.0267 (3.4221) time: 1.9948 data: 0.0271 max mem: 2905
Epoch: [262] [400/625] eta: 0:07:32 lr: 0.000174 min_lr: 0.000174 loss: 2.7035 (2.7177) class_acc: 0.5977 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 2.5609 (3.1984) time: 2.0504 data: 0.0015 max mem: 2905
Epoch: [262] [600/625] eta: 0:00:50 lr: 0.000171 min_lr: 0.000171 loss: 2.6726 (2.7195) class_acc: 0.5898 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 2.9992 (3.2388) time: 2.0665 data: 0.0009 max mem: 2905
Epoch: [262] [624/625] eta: 0:00:01 lr: 0.000171 min_lr: 0.000171 loss: 2.6545 (2.7181) class_acc: 0.5938 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 2.9992 (3.2367) time: 0.8305 data: 0.0014 max mem: 2905
Epoch: [262] Total time: 0:20:28 (1.9650 s / it)
Averaged stats: lr: 0.000171 min_lr: 0.000171 loss: 2.6545 (2.7193) class_acc: 0.5938 (0.5919) weight_decay: 0.0500 (0.0500) grad_norm: 2.9992 (3.2367)
Test: [ 0/50] eta: 0:10:59 loss: 1.3657 (1.3657) acc1: 64.0000 (64.0000) acc5: 89.6000 (89.6000) time: 13.1889 data: 13.1503 max mem: 2905
Test: [10/50] eta: 0:01:26 loss: 1.3222 (1.3373) acc1: 70.4000 (70.9818) acc5: 89.6000 (88.9455) time: 2.1616 data: 2.1409 max mem: 2905
Test: [20/50] eta: 0:00:50 loss: 1.4960 (1.4961) acc1: 65.6000 (67.0476) acc5: 87.2000 (87.1619) time: 1.1172 data: 1.0983 max mem: 2905
Test: [30/50] eta: 0:00:29 loss: 1.5735 (1.5061) acc1: 63.2000 (66.6581) acc5: 86.4000 (87.0194) time: 1.1156 data: 1.0956 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.4596 (1.5319) acc1: 64.0000 (66.1073) acc5: 86.4000 (86.4000) time: 0.8460 data: 0.8255 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.5491 (1.5417) acc1: 64.0000 (65.6480) acc5: 86.4000 (86.1920) time: 0.8027 data: 0.7828 max mem: 2905
Test: Total time: 0:00:53 (1.0635 s / it)
* Acc@1 66.068 Acc@5 86.440 loss 1.514
Accuracy of the model on the 50000 test images: 66.1%
Max accuracy: 66.53%
Epoch: [263] [ 0/625] eta: 4:08:20 lr: 0.000171 min_lr: 0.000171 loss: 2.5159 (2.5159) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 23.8410 data: 18.6652 max mem: 2905
Epoch: [263] [200/625] eta: 0:14:31 lr: 0.000168 min_lr: 0.000168 loss: 2.7019 (2.7243) class_acc: 0.5938 (0.5909) weight_decay: 0.0500 (0.0500) grad_norm: 3.6454 (3.5457) time: 1.9884 data: 0.8526 max mem: 2905
Epoch: [263] [400/625] eta: 0:07:26 lr: 0.000165 min_lr: 0.000165 loss: 2.6986 (2.7227) class_acc: 0.5977 (0.5907) weight_decay: 0.0500 (0.0500) grad_norm: 3.0232 (3.3814) time: 2.0851 data: 0.0008 max mem: 2905
Epoch: [263] [600/625] eta: 0:00:49 lr: 0.000162 min_lr: 0.000162 loss: 2.6886 (2.7212) class_acc: 0.5859 (0.5907) weight_decay: 0.0500 (0.0500) grad_norm: 2.7652 (3.3005) time: 1.8322 data: 0.0061 max mem: 2905
Epoch: [263] [624/625] eta: 0:00:01 lr: 0.000162 min_lr: 0.000162 loss: 2.7270 (2.7223) class_acc: 0.5781 (0.5904) weight_decay: 0.0500 (0.0500) grad_norm: 2.8724 (3.2828) time: 0.7270 data: 0.0014 max mem: 2905
Epoch: [263] Total time: 0:20:11 (1.9385 s / it)
Averaged stats: lr: 0.000162 min_lr: 0.000162 loss: 2.7270 (2.7147) class_acc: 0.5781 (0.5925) weight_decay: 0.0500 (0.0500) grad_norm: 2.8724 (3.2828)
Test: [ 0/50] eta: 0:09:35 loss: 1.4544 (1.4544) acc1: 65.6000 (65.6000) acc5: 88.8000 (88.8000) time: 11.5030 data: 11.4767 max mem: 2905
Test: [10/50] eta: 0:01:11 loss: 1.3297 (1.3302) acc1: 68.8000 (70.6182) acc5: 89.6000 (89.5273) time: 1.7841 data: 1.7630 max mem: 2905
Test: [20/50] eta: 0:00:40 loss: 1.4719 (1.4732) acc1: 66.4000 (67.0857) acc5: 88.0000 (87.8095) time: 0.8488 data: 0.8290 max mem: 2905
Test: [30/50] eta: 0:00:24 loss: 1.5536 (1.4999) acc1: 64.0000 (66.5806) acc5: 86.4000 (87.0452) time: 0.9526 data: 0.9320 max mem: 2905
Test: [40/50] eta: 0:00:10 loss: 1.4940 (1.5238) acc1: 64.8000 (65.9122) acc5: 85.6000 (86.4781) time: 0.7851 data: 0.7650 max mem: 2905
Test: [49/50] eta: 0:00:00 loss: 1.5341 (1.5228) acc1: 65.6000 (65.6800) acc5: 85.6000 (86.4640) time: 0.4189 data: 0.4007 max mem: 2905
Test: Total time: 0:00:47 (0.9438 s / it)
* Acc@1 66.454 Acc@5 86.734 loss 1.496
Accuracy of the model on the 50000 test images: 66.5%
Max accuracy: 66.53%
Epoch: [264] [ 0/625] eta: 3:24:32 lr: 0.000162 min_lr: 0.000162 loss: 2.7634 (2.7634) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 19.6367 data: 17.2441 max mem: 2905
Epoch: [264] [200/625] eta: 0:13:58 lr: 0.000159 min_lr: 0.000159 loss: 2.7183 (2.7173) class_acc: 0.5820 (0.5933) weight_decay: 0.0500 (0.0500) grad_norm: 2.4330 (3.4661) time: 1.8096 data: 0.6035 max mem: 2905
Epoch: [264] [400/625] eta: 0:07:17 lr: 0.000156 min_lr: 0.000156 loss: 2.7310 (2.7119) class_acc: 0.5859 (0.5933) weight_decay: 0.0500 (0.0500) grad_norm: 2.8214 (3.4802) time: 1.9498 data: 0.0954 max mem: 2905
Epoch: [264] [600/625] eta: 0:00:48 lr: 0.000154 min_lr: 0.000154 loss: 2.6837 (2.7124) class_acc: 0.5977 (0.5935) weight_decay: 0.0500 (0.0500) grad_norm: 2.8499 (3.5127) time: 2.0400 data: 0.1734 max mem: 2905
Epoch: [264] [624/625] eta: 0:00:01 lr: 0.000153 min_lr: 0.000153 loss: 2.6487 (2.7113) class_acc: 0.6016 (0.5938) weight_decay: 0.0500 (0.0500) grad_norm: 2.7755 (3.4780) time: 0.7590 data: 0.0324 max mem: 2905
Epoch: [264] Total time: 0:19:40 (1.8882 s / it)
Averaged stats: lr: 0.000153 min_lr: 0.000153 loss: 2.6487 (2.7131) class_acc: 0.6016 (0.5927) weight_decay: 0.0500 (0.0500) grad_norm: 2.7755 (3.4780)
Test: [ 0/50] eta: 0:10:46 loss: 1.4714 (1.4714) acc1: 65.6000 (65.6000) acc5: 87.2000 (87.2000) time: 12.9237 data: 12.8982 max mem: 2905
Test: [10/50] eta: 0:01:17 loss: 1.3041 (1.2939) acc1: 69.6000 (71.0545) acc5: 89.6000 (88.6546) time: 1.9316 data: 1.9128 max mem: 2905
Test: [20/50] eta: 0:00:42 loss: 1.4484 (1.4556) acc1: 68.0000 (67.6952) acc5: 87.2000 (86.9714) time: 0.8514 data: 0.8328 max mem: 2905
Test: [30/50] eta: 0:00:26 loss: 1.5503 (1.4833) acc1: 64.8000 (66.9677) acc5: 85.6000 (86.7613) time: 0.9835 data: 0.9645 max mem: 2905
Test: [40/50] eta: 0:00:12 loss: 1.5289 (1.5031) acc1: 64.0000 (66.4585) acc5: 85.6000 (86.2439) time: 0.9746 data: 0.9549 max mem: 2905
Test: [49/50] eta: 0:00:01 loss: 1.4980 (1.5058) acc1: 65.6000 (66.4160) acc5: 85.6000 (86.1120) time: 0.6406 data: 0.6217 max mem: 2905
Test: Total time: 0:00:53 (1.0784 s / it)
* Acc@1 66.778 Acc@5 86.864 loss 1.475
Accuracy of the model on the 50000 test images: 66.8%
Max accuracy: 66.78%
Epoch: [265] [ 0/625] eta: 3:31:23 lr: 0.000153 min_lr: 0.000153 loss: 2.7851 (2.7851) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.2935 data: 19.8328 max mem: 2905
WARNING:__main__:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
| distributed init (rank 0): env://, gpu 0
| distributed init (rank 5): env://, gpu 5
| distributed init (rank 6): env://, gpu 6
| distributed init (rank 3): env://, gpu 3
| distributed init (rank 7): env://, gpu 7
| distributed init (rank 1): env://, gpu 1
| distributed init (rank 4): env://, gpu 4
| distributed init (rank 2): env://, gpu 2
Namespace(aa='rand-m9-mstd0.5-inc1', auto_resume=True, batch_size=256, clip_grad=None, color_jitter=0.4, crop_pct=None, cutmix=0.0, cutmix_minmax=None, data_path='/data/benchmarks/ILSVRC2012_LMDB', data_set='IMNET_LMDB', device='cuda', disable_eval=False, dist_backend='nccl', dist_eval=True, dist_on_itp=False, dist_url='env://', distributed=True, drop_path=0.2, enable_wandb=False, epochs=300, eval=False, eval_data_path=None, finetune='', gpu=0, head_init_scale=1.0, imagenet_default_mean_and_std=True, input_size=224, layer_decay=1.0, layer_scale_init_value=1e-06, local_rank=-1, log_dir=None, lr=0.004, min_lr=1e-06, mixup=0.0, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='convnext_tiny', model_ema=False, model_ema_decay=0.9999, model_ema_eval=False, model_ema_force_cpu=False, model_key='model|module', model_prefix='', momentum=0.9, nb_classes=1000, num_workers=10, opt='adamw', opt_betas=None, opt_eps=1e-08, output_dir='./checkpoint', pin_mem=True, project='convnext', rank=0, recount=1, remode='pixel', reprob=0.25, resplit=False, resume='', save_ckpt=True, save_ckpt_freq=1, save_ckpt_num=3, seed=0, smoothing=0.1, start_epoch=0, train_interpolation='bicubic', update_freq=2, use_amp=True, wandb_ckpt=False, warmup_epochs=20, warmup_steps=-1, weight_decay=0.05, weight_decay_end=None, world_size=8)
Transform =
RandomResizedCropAndInterpolation(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=bicubic)
RandomHorizontalFlip(p=0.5)
RandAugment(n=2, ops=
    AugmentOp(name=AutoContrast, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=Equalize, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=Invert, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=Rotate, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=PosterizeIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=SolarizeIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=SolarizeAdd, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ColorIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ContrastIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=BrightnessIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=SharpnessIncreasing, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ShearX, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=ShearY, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=TranslateXRel, p=0.5, m=9, mstd=0.5)
    AugmentOp(name=TranslateYRel, p=0.5, m=9, mstd=0.5))
ToTensor()
Normalize(mean=tensor([0.4850, 0.4560, 0.4060]), std=tensor([0.2290, 0.2240, 0.2250]))
RandomErasing(p=0.25, mode=pixel, count=(1, 1))
---------------------------
reading from datapath /data/benchmarks/ILSVRC2012_LMDB
Number of the class = 1000
Transform =
Resize(size=256, interpolation=bicubic, max_size=None, antialias=None)
CenterCrop(size=(224, 224))
ToTensor()
Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
---------------------------
reading from datapath /data/benchmarks/ILSVRC2012_LMDB
Number of the class = 1000
Sampler_train = <torch.utils.data.distributed.DistributedSampler object at 0x7f763c0a9d60>
Model = MobileNetV3_Small(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
  (bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (hs1): Hardswish()
  (bneck): Sequential(
    (0): Block(
      (conv1): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (act1): ReLU(inplace=True)
      (conv2): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
      (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (act2): ReLU(inplace=True)
      (se): SeModule(
        (se): Sequential(
          (0): AdaptiveAvgPool2d(output_size=1)
          (1): Conv2d(16, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (2): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
          (4): Conv2d(8, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (5): Hardsigmoid()
        )
      )
      (conv3): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (act3): ReLU(inplace=True)
      (skip): Sequential(
        (0): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
        (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Block(
      (conv1): Conv2d(16, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (act1): ReLU(inplace=True)
      (conv2): Conv2d(72, 72, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=72, bias=False)
      (bn2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (act2): ReLU(inplace=True)
      (se): Identity()
      (conv3): Conv2d(72, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (act3): ReLU(inplace=True)
      (skip): Sequential(
        (0): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
        (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): Conv2d(16, 24, kernel_size=(1, 1), stride=(1, 1))
        (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
)
(2): Block(
(conv1): Conv2d(24, 88, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(88, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(88, 88, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=88, bias=False)
(bn2): BatchNorm2d(88, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv3): Conv2d(88, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
)
(3): Block(
(conv1): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(96, 96, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=96, bias=False)
(bn2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(24, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(96, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(24, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=24, bias=False)
(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(24, 40, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(4): Block(
(conv1): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(240, 240, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=240, bias=False)
(bn2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(240, 60, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(60, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(60, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(5): Block(
(conv1): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(240, 240, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=240, bias=False)
(bn2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(240, 60, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(60, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(60, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(240, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(6): Block(
(conv1): Conv2d(40, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(120, 120, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=120, bias=False)
(bn2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(120, 30, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(30, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(30, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(120, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(40, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Block(
(conv1): Conv2d(48, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(144, 144, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=144, bias=False)
(bn2): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(144, 36, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(36, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(36, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(144, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(8): Block(
(conv1): Conv2d(48, 288, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(288, 288, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=288, bias=False)
(bn2): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(288, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(72, 288, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(288, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(48, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=48, bias=False)
(1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(48, 96, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(9): Block(
(conv1): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(576, 576, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=576, bias=False)
(bn2): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(576, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(144, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(10): Block(
(conv1): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(576, 576, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=576, bias=False)
(bn2): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(576, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(144, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
)
(conv2): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn2): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs2): Hardswish()
(gap): AdaptiveAvgPool2d(output_size=1)
(linear3): Linear(in_features=576, out_features=1280, bias=False)
(bn3): BatchNorm1d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs3): Hardswish()
(drop): Dropout(p=0.2, inplace=False)
(linear4): Linear(in_features=1280, out_features=1000, bias=True)
)
number of params: 2950524
LR = 0.00400000
Batch size = 4096
Update frequency = 2
Number of training examples = 1281167
Number of training steps per epoch = 312
Param groups = {
"decay": {
"weight_decay": 0.05,
"params": [
"conv1.weight",
"bneck.0.conv1.weight",
"bneck.0.conv2.weight",
"bneck.0.se.se.1.weight",
"bneck.0.se.se.4.weight",
"bneck.0.conv3.weight",
"bneck.0.skip.0.weight",
"bneck.1.conv1.weight",
"bneck.1.conv2.weight",
"bneck.1.conv3.weight",
"bneck.1.skip.0.weight",
"bneck.1.skip.2.weight",
"bneck.2.conv1.weight",
"bneck.2.conv2.weight",
"bneck.2.conv3.weight",
"bneck.3.conv1.weight",
"bneck.3.conv2.weight",
"bneck.3.se.se.1.weight",
"bneck.3.se.se.4.weight",
"bneck.3.conv3.weight",
"bneck.3.skip.0.weight",
"bneck.3.skip.2.weight",
"bneck.4.conv1.weight",
"bneck.4.conv2.weight",
"bneck.4.se.se.1.weight",
"bneck.4.se.se.4.weight",
"bneck.4.conv3.weight",
"bneck.5.conv1.weight",
"bneck.5.conv2.weight",
"bneck.5.se.se.1.weight",
"bneck.5.se.se.4.weight",
"bneck.5.conv3.weight",
"bneck.6.conv1.weight",
"bneck.6.conv2.weight",
"bneck.6.se.se.1.weight",
"bneck.6.se.se.4.weight",
"bneck.6.conv3.weight",
"bneck.6.skip.0.weight",
"bneck.7.conv1.weight",
"bneck.7.conv2.weight",
"bneck.7.se.se.1.weight",
"bneck.7.se.se.4.weight",
"bneck.7.conv3.weight",
"bneck.8.conv1.weight",
"bneck.8.conv2.weight",
"bneck.8.se.se.1.weight",
"bneck.8.se.se.4.weight",
"bneck.8.conv3.weight",
"bneck.8.skip.0.weight",
"bneck.8.skip.2.weight",
"bneck.9.conv1.weight",
"bneck.9.conv2.weight",
"bneck.9.se.se.1.weight",
"bneck.9.se.se.4.weight",
"bneck.9.conv3.weight",
"bneck.10.conv1.weight",
"bneck.10.conv2.weight",
"bneck.10.se.se.1.weight",
"bneck.10.se.se.4.weight",
"bneck.10.conv3.weight",
"conv2.weight",
"linear3.weight",
"linear4.weight"
],
"lr_scale": 1.0
},
"no_decay": {
"weight_decay": 0.0,
"params": [
"bn1.weight",
"bn1.bias",
"bneck.0.bn1.weight",
"bneck.0.bn1.bias",
"bneck.0.bn2.weight",
"bneck.0.bn2.bias",
"bneck.0.se.se.2.weight",
"bneck.0.se.se.2.bias",
"bneck.0.bn3.weight",
"bneck.0.bn3.bias",
"bneck.0.skip.1.weight",
"bneck.0.skip.1.bias",
"bneck.1.bn1.weight",
"bneck.1.bn1.bias",
"bneck.1.bn2.weight",
"bneck.1.bn2.bias",
"bneck.1.bn3.weight",
"bneck.1.bn3.bias",
"bneck.1.skip.1.weight",
"bneck.1.skip.1.bias",
"bneck.1.skip.2.bias",
"bneck.1.skip.3.weight",
"bneck.1.skip.3.bias",
"bneck.2.bn1.weight",
"bneck.2.bn1.bias",
"bneck.2.bn2.weight",
"bneck.2.bn2.bias",
"bneck.2.bn3.weight",
"bneck.2.bn3.bias",
"bneck.3.bn1.weight",
"bneck.3.bn1.bias",
"bneck.3.bn2.weight",
"bneck.3.bn2.bias",
"bneck.3.se.se.2.weight",
"bneck.3.se.se.2.bias",
"bneck.3.bn3.weight",
"bneck.3.bn3.bias",
"bneck.3.skip.1.weight",
"bneck.3.skip.1.bias",
"bneck.3.skip.2.bias",
"bneck.3.skip.3.weight",
"bneck.3.skip.3.bias",
"bneck.4.bn1.weight",
"bneck.4.bn1.bias",
"bneck.4.bn2.weight",
"bneck.4.bn2.bias",
"bneck.4.se.se.2.weight",
"bneck.4.se.se.2.bias",
"bneck.4.bn3.weight",
"bneck.4.bn3.bias",
"bneck.5.bn1.weight",
"bneck.5.bn1.bias",
"bneck.5.bn2.weight",
"bneck.5.bn2.bias",
"bneck.5.se.se.2.weight",
"bneck.5.se.se.2.bias",
"bneck.5.bn3.weight",
"bneck.5.bn3.bias",
"bneck.6.bn1.weight",
"bneck.6.bn1.bias",
"bneck.6.bn2.weight",
"bneck.6.bn2.bias",
"bneck.6.se.se.2.weight",
"bneck.6.se.se.2.bias",
"bneck.6.bn3.weight",
"bneck.6.bn3.bias",
"bneck.6.skip.1.weight",
"bneck.6.skip.1.bias",
"bneck.7.bn1.weight",
"bneck.7.bn1.bias",
"bneck.7.bn2.weight",
"bneck.7.bn2.bias",
"bneck.7.se.se.2.weight",
"bneck.7.se.se.2.bias",
"bneck.7.bn3.weight",
"bneck.7.bn3.bias",
"bneck.8.bn1.weight",
"bneck.8.bn1.bias",
"bneck.8.bn2.weight",
"bneck.8.bn2.bias",
"bneck.8.se.se.2.weight",
"bneck.8.se.se.2.bias",
"bneck.8.bn3.weight",
"bneck.8.bn3.bias",
"bneck.8.skip.1.weight",
"bneck.8.skip.1.bias",
"bneck.8.skip.2.bias",
"bneck.8.skip.3.weight",
"bneck.8.skip.3.bias",
"bneck.9.bn1.weight",
"bneck.9.bn1.bias",
"bneck.9.bn2.weight",
"bneck.9.bn2.bias",
"bneck.9.se.se.2.weight",
"bneck.9.se.se.2.bias",
"bneck.9.bn3.weight",
"bneck.9.bn3.bias",
"bneck.10.bn1.weight",
"bneck.10.bn1.bias",
"bneck.10.bn2.weight",
"bneck.10.bn2.bias",
"bneck.10.se.se.2.weight",
"bneck.10.se.se.2.bias",
"bneck.10.bn3.weight",
"bneck.10.bn3.bias",
"bn2.weight",
"bn2.bias",
"bn3.weight",
"bn3.bias",
"linear4.bias"
],
"lr_scale": 1.0
}
}
Use Cosine LR scheduler
Set warmup steps = 6240
Set warmup steps = 0
Max WD = 0.0500000, Min WD = 0.0500000
criterion = LabelSmoothingCrossEntropy()
Auto resume checkpoint: checkpoint/checkpoint-264.pth
Resume checkpoint checkpoint/checkpoint-264.pth
With optim & sched!
Start training for 300 epochs
Epoch: [265] [ 0/625] eta: 5:12:11 lr: 0.000153 min_lr: 0.000153 loss: 2.6982 (2.6982) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 29.9703 data: 22.2179 max mem: 2928
Epoch: [265] [200/625] eta: 0:15:33 lr: 0.000150 min_lr: 0.000150 loss: 2.7303 (2.7090) class_acc: 0.5938 (0.5956) weight_decay: 0.0500 (0.0500) grad_norm: 3.8810 (3.1684) time: 1.8843 data: 0.0438 max mem: 2928
Epoch: [265] [400/625] eta: 0:08:04 lr: 0.000148 min_lr: 0.000148 loss: 2.7196 (2.7135) class_acc: 0.5859 (0.5945) weight_decay: 0.0500 (0.0500) grad_norm: 3.0067 (3.2041) time: 2.1437 data: 0.3882 max mem: 2928
Epoch: [265] [600/625] eta: 0:00:52 lr: 0.000145 min_lr: 0.000145 loss: 2.7335 (2.7158) class_acc: 0.5898 (0.5935) weight_decay: 0.0500 (0.0500) grad_norm: 2.7748 (3.3163) time: 1.9821 data: 0.0006 max mem: 2928
Epoch: [265] [624/625] eta: 0:00:02 lr: 0.000145 min_lr: 0.000145 loss: 2.7468 (2.7158) class_acc: 0.5977 (0.5937) weight_decay: 0.0500 (0.0500) grad_norm: 2.4024 (3.2918) time: 0.7514 data: 0.0016 max mem: 2928
Epoch: [265] Total time: 0:21:43 (2.0851 s / it)
Averaged stats: lr: 0.000145 min_lr: 0.000145 loss: 2.7468 (2.7100) class_acc: 0.5977 (0.5937) weight_decay: 0.0500 (0.0500) grad_norm: 2.4024 (3.2918)
Test: [ 0/50] eta: 0:10:16 loss: 1.4208 (1.4208) acc1: 64.8000 (64.8000) acc5: 89.6000 (89.6000) time: 12.3266 data: 11.6150 max mem: 2928
Test: [10/50] eta: 0:01:23 loss: 1.3219 (1.3229) acc1: 70.4000 (71.0545) acc5: 89.6000 (89.3091) time: 2.0756 data: 1.9935 max mem: 2928
Test: [20/50] eta: 0:00:50 loss: 1.4699 (1.4817) acc1: 66.4000 (67.2381) acc5: 88.0000 (87.8476) time: 1.1575 data: 1.1392 max mem: 2928
Test: [30/50] eta: 0:00:30 loss: 1.6121 (1.4965) acc1: 64.0000 (66.7097) acc5: 87.2000 (87.3806) time: 1.2436 data: 1.2263 max mem: 2928
Test: [40/50] eta: 0:00:13 loss: 1.5226 (1.5095) acc1: 64.0000 (66.2634) acc5: 86.4000 (87.1220) time: 1.0960 data: 1.0781 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4883 (1.5159) acc1: 64.8000 (66.1120) acc5: 86.4000 (86.9440) time: 0.9563 data: 0.9368 max mem: 2928
Test: Total time: 0:00:58 (1.1724 s / it)
* Acc@1 67.028 Acc@5 86.856 loss 1.483
Accuracy of the model on the 50000 test images: 67.0%
Max accuracy: 67.03%
Epoch: [266] [ 0/625] eta: 3:14:14 lr: 0.000145 min_lr: 0.000145 loss: 2.6366 (2.6366) class_acc: 0.6133 (0.6133) weight_decay: 0.0500 (0.0500) time: 18.6478 data: 17.9495 max mem: 2928
Epoch: [266] [200/625] eta: 0:13:43 lr: 0.000142 min_lr: 0.000142 loss: 2.7148 (2.7098) class_acc: 0.5898 (0.5929) weight_decay: 0.0500 (0.0500) grad_norm: 3.2582 (3.2946) time: 1.7768 data: 0.0008 max mem: 2928
Epoch: [266] [400/625] eta: 0:07:09 lr: 0.000139 min_lr: 0.000139 loss: 2.6743 (2.7173) class_acc: 0.5938 (0.5913) weight_decay: 0.0500 (0.0500) grad_norm: 2.8287 (3.3699) time: 1.7315 data: 0.0008 max mem: 2928
Epoch: [266] [600/625] eta: 0:00:47 lr: 0.000137 min_lr: 0.000137 loss: 2.6792 (2.7112) class_acc: 0.5938 (0.5925) weight_decay: 0.0500 (0.0500) grad_norm: 2.1729 (3.4514) time: 1.9200 data: 0.0011 max mem: 2928
Epoch: [266] [624/625] eta: 0:00:01 lr: 0.000137 min_lr: 0.000137 loss: 2.7016 (2.7115) class_acc: 0.6055 (0.5925) weight_decay: 0.0500 (0.0500) grad_norm: 3.8838 (3.4796) time: 0.8098 data: 0.0014 max mem: 2928
Epoch: [266] Total time: 0:19:27 (1.8684 s / it)
Averaged stats: lr: 0.000137 min_lr: 0.000137 loss: 2.7016 (2.7090) class_acc: 0.6055 (0.5940) weight_decay: 0.0500 (0.0500) grad_norm: 3.8838 (3.4796)
Test: [ 0/50] eta: 0:09:26 loss: 1.4158 (1.4158) acc1: 68.0000 (68.0000) acc5: 92.0000 (92.0000) time: 11.3276 data: 11.2999 max mem: 2928
Test: [10/50] eta: 0:01:20 loss: 1.2840 (1.2876) acc1: 72.0000 (72.6545) acc5: 90.4000 (89.6000) time: 2.0166 data: 1.9942 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.4544 (1.4495) acc1: 68.0000 (68.4190) acc5: 88.0000 (88.1524) time: 1.1289 data: 1.1081 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.6032 (1.4796) acc1: 64.0000 (67.0194) acc5: 87.2000 (87.2516) time: 1.0116 data: 0.9930 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.5103 (1.4976) acc1: 63.2000 (66.4781) acc5: 85.6000 (86.6146) time: 0.5873 data: 0.5700 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4662 (1.5046) acc1: 63.2000 (65.9200) acc5: 86.4000 (86.5280) time: 0.4541 data: 0.4365 max mem: 2928
Test: Total time: 0:00:46 (0.9389 s / it)
* Acc@1 67.022 Acc@5 87.080 loss 1.470
Accuracy of the model on the 50000 test images: 67.0%
Max accuracy: 67.03%
Epoch: [267] [ 0/625] eta: 3:20:25 lr: 0.000136 min_lr: 0.000136 loss: 2.8081 (2.8081) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 19.2412 data: 16.5344 max mem: 2928
Epoch: [267] [200/625] eta: 0:13:44 lr: 0.000134 min_lr: 0.000134 loss: 2.7135 (2.7086) class_acc: 0.6016 (0.5941) weight_decay: 0.0500 (0.0500) grad_norm: 2.2558 (inf) time: 1.9102 data: 0.0138 max mem: 2928
Epoch: [267] [400/625] eta: 0:07:04 lr: 0.000131 min_lr: 0.000131 loss: 2.6886 (2.7125) class_acc: 0.5898 (0.5919) weight_decay: 0.0500 (0.0500) grad_norm: 3.9867 (inf) time: 1.5592 data: 0.0006 max mem: 2928
Epoch: [267] [600/625] eta: 0:00:47 lr: 0.000129 min_lr: 0.000129 loss: 2.7519 (2.7126) class_acc: 0.5898 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 2.4030 (inf) time: 1.8698 data: 0.0006 max mem: 2928
Epoch: [267] [624/625] eta: 0:00:01 lr: 0.000129 min_lr: 0.000129 loss: 2.7224 (2.7128) class_acc: 0.5859 (0.5922) weight_decay: 0.0500 (0.0500) grad_norm: 3.1920 (inf) time: 0.7403 data: 0.0016 max mem: 2928
Epoch: [267] Total time: 0:19:13 (1.8462 s / it)
Averaged stats: lr: 0.000129 min_lr: 0.000129 loss: 2.7224 (2.7063) class_acc: 0.5859 (0.5946) weight_decay: 0.0500 (0.0500) grad_norm: 3.1920 (inf)
Test: [ 0/50] eta: 0:09:05 loss: 1.4205 (1.4205) acc1: 66.4000 (66.4000) acc5: 91.2000 (91.2000) time: 10.9186 data: 10.8928 max mem: 2928
Test: [10/50] eta: 0:01:11 loss: 1.2860 (1.2785) acc1: 71.2000 (71.7091) acc5: 89.6000 (89.4545) time: 1.7946 data: 1.7749 max mem: 2928
Test: [20/50] eta: 0:00:42 loss: 1.3355 (1.4082) acc1: 68.0000 (68.6095) acc5: 88.8000 (88.1143) time: 0.9544 data: 0.9355 max mem: 2928
Test: [30/50] eta: 0:00:25 loss: 1.5094 (1.4340) acc1: 65.6000 (67.7419) acc5: 87.2000 (87.9226) time: 0.9651 data: 0.9461 max mem: 2928
Test: [40/50] eta: 0:00:10 loss: 1.4308 (1.4614) acc1: 65.6000 (67.2585) acc5: 86.4000 (87.3171) time: 0.7517 data: 0.7315 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4840 (1.4681) acc1: 66.4000 (67.0240) acc5: 85.6000 (87.0720) time: 0.5676 data: 0.5475 max mem: 2928
Test: Total time: 0:00:48 (0.9617 s / it)
* Acc@1 67.826 Acc@5 87.606 loss 1.430
Accuracy of the model on the 50000 test images: 67.8%
Max accuracy: 67.83%
Epoch: [268] [ 0/625] eta: 3:28:20 lr: 0.000128 min_lr: 0.000128 loss: 2.5826 (2.5826) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 20.0007 data: 17.7944 max mem: 2928
Epoch: [268] [200/625] eta: 0:13:31 lr: 0.000126 min_lr: 0.000126 loss: 2.7327 (2.7133) class_acc: 0.5820 (0.5907) weight_decay: 0.0500 (0.0500) grad_norm: 3.5326 (3.3510) time: 1.7522 data: 0.0090 max mem: 2928
Epoch: [268] [400/625] eta: 0:07:00 lr: 0.000123 min_lr: 0.000123 loss: 2.6934 (2.7126) class_acc: 0.6016 (0.5922) weight_decay: 0.0500 (0.0500) grad_norm: 3.3752 (3.5178) time: 1.9288 data: 0.0007 max mem: 2928
Epoch: [268] [600/625] eta: 0:00:47 lr: 0.000121 min_lr: 0.000121 loss: 2.7205 (2.7108) class_acc: 0.5898 (0.5931) weight_decay: 0.0500 (0.0500) grad_norm: 3.2686 (3.5203) time: 2.0563 data: 0.0007 max mem: 2928
Epoch: [268] [624/625] eta: 0:00:01 lr: 0.000121 min_lr: 0.000121 loss: 2.6252 (2.7082) class_acc: 0.6055 (0.5937) weight_decay: 0.0500 (0.0500) grad_norm: 3.1555 (3.5092) time: 0.9236 data: 0.0013 max mem: 2928
Epoch: [268] Total time: 0:19:12 (1.8445 s / it)
Averaged stats: lr: 0.000121 min_lr: 0.000121 loss: 2.6252 (2.7043) class_acc: 0.6055 (0.5952) weight_decay: 0.0500 (0.0500) grad_norm: 3.1555 (3.5092)
Test: [ 0/50] eta: 0:09:11 loss: 1.3064 (1.3064) acc1: 64.8000 (64.8000) acc5: 92.0000 (92.0000) time: 11.0248 data: 10.9935 max mem: 2928
Test: [10/50] eta: 0:01:24 loss: 1.2414 (1.2534) acc1: 72.8000 (72.3636) acc5: 89.6000 (89.0909) time: 2.1043 data: 2.0834 max mem: 2928
Test: [20/50] eta: 0:00:52 loss: 1.3990 (1.3931) acc1: 68.0000 (68.3810) acc5: 88.0000 (88.1524) time: 1.2923 data: 1.2730 max mem: 2928
Test: [30/50] eta: 0:00:30 loss: 1.4889 (1.4157) acc1: 65.6000 (67.8194) acc5: 87.2000 (87.7677) time: 1.2075 data: 1.1883 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.3815 (1.4375) acc1: 65.6000 (67.4341) acc5: 87.2000 (87.2976) time: 0.7894 data: 0.7697 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4162 (1.4396) acc1: 65.6000 (67.1840) acc5: 85.6000 (87.0240) time: 0.5964 data: 0.5765 max mem: 2928
Test: Total time: 0:00:53 (1.0714 s / it)
* Acc@1 67.870 Acc@5 87.568 loss 1.408
Accuracy of the model on the 50000 test images: 67.9%
Max accuracy: 67.87%
Epoch: [269] [ 0/625] eta: 3:56:16 lr: 0.000121 min_lr: 0.000121 loss: 2.5830 (2.5830) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 22.6824 data: 15.8010 max mem: 2928
Epoch: [269] [200/625] eta: 0:14:00 lr: 0.000118 min_lr: 0.000118 loss: 2.7129 (2.7029) class_acc: 0.6133 (0.5974) weight_decay: 0.0500 (0.0500) grad_norm: 2.7937 (3.0471) time: 1.9056 data: 0.0009 max mem: 2928
Epoch: [269] [400/625] eta: 0:07:15 lr: 0.000116 min_lr: 0.000116 loss: 2.7001 (2.7024) class_acc: 0.5898 (0.5967) weight_decay: 0.0500 (0.0500) grad_norm: 4.0177 (3.2485) time: 1.9665 data: 0.0010 max mem: 2928
Epoch: [269] [600/625] eta: 0:00:48 lr: 0.000113 min_lr: 0.000113 loss: 2.7637 (2.7064) class_acc: 0.5820 (0.5955) weight_decay: 0.0500 (0.0500) grad_norm: 3.1701 (3.4919) time: 1.9778 data: 0.0008 max mem: 2928
Epoch: [269] [624/625] eta: 0:00:01 lr: 0.000113 min_lr: 0.000113 loss: 2.7322 (2.7067) class_acc: 0.5898 (0.5953) weight_decay: 0.0500 (0.0500) grad_norm: 3.7104 (3.5100) time: 0.7780 data: 0.0014 max mem: 2928
Epoch: [269] Total time: 0:19:35 (1.8811 s / it)
Averaged stats: lr: 0.000113 min_lr: 0.000113 loss: 2.7322 (2.7040) class_acc: 0.5898 (0.5953) weight_decay: 0.0500 (0.0500) grad_norm: 3.7104 (3.5100)
Test: [ 0/50] eta: 0:10:13 loss: 1.4209 (1.4209) acc1: 67.2000 (67.2000) acc5: 88.0000 (88.0000) time: 12.2786 data: 12.2494 max mem: 2928
Test: [10/50] eta: 0:01:16 loss: 1.2628 (1.2641) acc1: 71.2000 (72.1455) acc5: 90.4000 (89.3818) time: 1.9220 data: 1.9020 max mem: 2928
Test: [20/50] eta: 0:00:45 loss: 1.4259 (1.4152) acc1: 68.8000 (69.0667) acc5: 88.0000 (88.0381) time: 0.9817 data: 0.9617 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5233 (1.4392) acc1: 64.0000 (68.0258) acc5: 87.2000 (87.7419) time: 1.0510 data: 1.0297 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4580 (1.4621) acc1: 64.0000 (67.5122) acc5: 87.2000 (87.2781) time: 0.7457 data: 0.7250 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4548 (1.4639) acc1: 64.0000 (67.2320) acc5: 87.2000 (87.1200) time: 0.6696 data: 0.6486 max mem: 2928
Test: Total time: 0:00:48 (0.9767 s / it)
* Acc@1 67.794 Acc@5 87.504 loss 1.426
Accuracy of the model on the 50000 test images: 67.8%
Max accuracy: 67.87%
Epoch: [270] [ 0/625] eta: 3:36:47 lr: 0.000113 min_lr: 0.000113 loss: 2.6511 (2.6511) class_acc: 0.6211 (0.6211) weight_decay: 0.0500 (0.0500) time: 20.8115 data: 19.1630 max mem: 2928
Epoch: [270] [200/625] eta: 0:13:58 lr: 0.000111 min_lr: 0.000111 loss: 2.6846 (2.7009) class_acc: 0.5898 (0.5976) weight_decay: 0.0500 (0.0500) grad_norm: 2.5810 (2.8145) time: 1.9970 data: 0.0596 max mem: 2928
Epoch: [270] [400/625] eta: 0:07:13 lr: 0.000109 min_lr: 0.000109 loss: 2.7432 (2.7050) class_acc: 0.5938 (0.5966) weight_decay: 0.0500 (0.0500) grad_norm: 2.2566 (2.9775) time: 1.7794 data: 0.0006 max mem: 2928
Epoch: [270] [600/625] eta: 0:00:48 lr: 0.000106 min_lr: 0.000106 loss: 2.6925 (2.7024) class_acc: 0.5898 (0.5963) weight_decay: 0.0500 (0.0500) grad_norm: 2.7629 (3.0018) time: 1.8598 data: 0.0007 max mem: 2928
Epoch: [270] [624/625] eta: 0:00:01 lr: 0.000106 min_lr: 0.000106 loss: 2.6738 (2.7020) class_acc: 0.6016 (0.5965) weight_decay: 0.0500 (0.0500) grad_norm: 3.2650 (3.0529) time: 0.7898 data: 0.0013 max mem: 2928
Epoch: [270] Total time: 0:19:39 (1.8865 s / it)
Averaged stats: lr: 0.000106 min_lr: 0.000106 loss: 2.6738 (2.7029) class_acc: 0.6016 (0.5959) weight_decay: 0.0500 (0.0500) grad_norm: 3.2650 (3.0529)
Test: [ 0/50] eta: 0:09:59 loss: 1.3246 (1.3246) acc1: 68.0000 (68.0000) acc5: 88.8000 (88.8000) time: 11.9856 data: 11.9541 max mem: 2928
Test: [10/50] eta: 0:01:22 loss: 1.2602 (1.2566) acc1: 72.0000 (71.9273) acc5: 90.4000 (89.9636) time: 2.0617 data: 2.0425 max mem: 2928
Test: [20/50] eta: 0:00:50 loss: 1.4536 (1.4100) acc1: 67.2000 (68.3429) acc5: 88.0000 (88.3048) time: 1.1678 data: 1.1491 max mem: 2928
Test: [30/50] eta: 0:00:28 loss: 1.5538 (1.4361) acc1: 64.8000 (67.6903) acc5: 86.4000 (87.6645) time: 1.0898 data: 1.0703 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4390 (1.4523) acc1: 64.8000 (67.4146) acc5: 86.4000 (87.3561) time: 0.6133 data: 0.5933 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4204 (1.4527) acc1: 64.8000 (67.0720) acc5: 87.2000 (87.2160) time: 0.4896 data: 0.4698 max mem: 2928
Test: Total time: 0:00:48 (0.9752 s / it)
* Acc@1 67.814 Acc@5 87.684 loss 1.412
Accuracy of the model on the 50000 test images: 67.8%
Max accuracy: 67.87%
Epoch: [271] [ 0/625] eta: 3:39:25 lr: 0.000106 min_lr: 0.000106 loss: 2.7391 (2.7391) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 21.0640 data: 16.7909 max mem: 2928
Epoch: [271] [200/625] eta: 0:14:18 lr: 0.000104 min_lr: 0.000104 loss: 2.7004 (2.6886) class_acc: 0.5898 (0.5973) weight_decay: 0.0500 (0.0500) grad_norm: 2.9814 (3.3375) time: 1.9515 data: 0.0007 max mem: 2928
Epoch: [271] [400/625] eta: 0:07:17 lr: 0.000101 min_lr: 0.000101 loss: 2.7393 (2.6974) class_acc: 0.5820 (0.5966) weight_decay: 0.0500 (0.0500) grad_norm: 3.8755 (3.3907) time: 1.8603 data: 0.0008 max mem: 2928
Epoch: [271] [600/625] eta: 0:00:48 lr: 0.000099 min_lr: 0.000099 loss: 2.6825 (2.6988) class_acc: 0.5859 (0.5960) weight_decay: 0.0500 (0.0500) grad_norm: 3.3325 (3.2279) time: 2.0133 data: 0.0007 max mem: 2928
Epoch: [271] [624/625] eta: 0:00:01 lr: 0.000099 min_lr: 0.000099 loss: 2.6910 (2.6987) class_acc: 0.5898 (0.5962) weight_decay: 0.0500 (0.0500) grad_norm: 2.6265 (3.2198) time: 0.6447 data: 0.0015 max mem: 2928
Epoch: [271] Total time: 0:19:36 (1.8828 s / it)
Averaged stats: lr: 0.000099 min_lr: 0.000099 loss: 2.6910 (2.6998) class_acc: 0.5898 (0.5961) weight_decay: 0.0500 (0.0500) grad_norm: 2.6265 (3.2198)
Test: [ 0/50] eta: 0:09:45 loss: 1.4222 (1.4222) acc1: 65.6000 (65.6000) acc5: 89.6000 (89.6000) time: 11.7017 data: 11.6789 max mem: 2928
Test: [10/50] eta: 0:01:18 loss: 1.2510 (1.2580) acc1: 72.8000 (71.8545) acc5: 90.4000 (90.4000) time: 1.9592 data: 1.9403 max mem: 2928
Test: [20/50] eta: 0:00:46 loss: 1.3864 (1.4104) acc1: 65.6000 (68.1524) acc5: 88.8000 (88.8000) time: 1.0353 data: 1.0157 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.4963 (1.4314) acc1: 65.6000 (67.8710) acc5: 88.0000 (88.3613) time: 1.0415 data: 1.0215 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4401 (1.4553) acc1: 65.6000 (67.2000) acc5: 86.4000 (87.7073) time: 0.7603 data: 0.7414 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4635 (1.4612) acc1: 65.6000 (66.8960) acc5: 85.6000 (87.4400) time: 0.6726 data: 0.6535 max mem: 2928
Test: Total time: 0:00:48 (0.9737 s / it)
* Acc@1 67.910 Acc@5 87.666 loss 1.427
Accuracy of the model on the 50000 test images: 67.9%
Max accuracy: 67.91%
Epoch: [272] [ 0/625] eta: 3:11:09 lr: 0.000099 min_lr: 0.000099 loss: 2.7360 (2.7360) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 18.3508 data: 18.0127 max mem: 2928
Epoch: [272] [200/625] eta: 0:13:39 lr: 0.000097 min_lr: 0.000097 loss: 2.6547 (2.7008) class_acc: 0.6016 (0.5970) weight_decay: 0.0500 (0.0500) grad_norm: 2.5486 (3.1072) time: 1.8958 data: 1.1827 max mem: 2928
Epoch: [272] [400/625] eta: 0:07:00 lr: 0.000094 min_lr: 0.000094 loss: 2.6620 (2.6965) class_acc: 0.5938 (0.5983) weight_decay: 0.0500 (0.0500) grad_norm: 2.3547 (3.1378) time: 1.8289 data: 0.0006 max mem: 2928
Epoch: [272] [600/625] eta: 0:00:47 lr: 0.000092 min_lr: 0.000092 loss: 2.6817 (2.6969) class_acc: 0.5859 (0.5973) weight_decay: 0.0500 (0.0500) grad_norm: 2.2615 (3.1892) time: 1.9939 data: 0.0005 max mem: 2928
Epoch: [272] [624/625] eta: 0:00:01 lr: 0.000092 min_lr: 0.000092 loss: 2.6530 (2.6974) class_acc: 0.5898 (0.5969) weight_decay: 0.0500 (0.0500) grad_norm: 2.5169 (3.1918) time: 0.6706 data: 0.0014 max mem: 2928
Epoch: [272] Total time: 0:19:23 (1.8611 s / it)
Averaged stats: lr: 0.000092 min_lr: 0.000092 loss: 2.6530 (2.6984) class_acc: 0.5898 (0.5969) weight_decay: 0.0500 (0.0500) grad_norm: 2.5169 (3.1918)
Test: [ 0/50] eta: 0:09:22 loss: 1.3835 (1.3835) acc1: 64.0000 (64.0000) acc5: 89.6000 (89.6000) time: 11.2520 data: 11.2204 max mem: 2928
Test: [10/50] eta: 0:01:15 loss: 1.2347 (1.2630) acc1: 72.8000 (72.2909) acc5: 89.6000 (89.8909) time: 1.8942 data: 1.8740 max mem: 2928
Test: [20/50] eta: 0:00:44 loss: 1.4162 (1.3927) acc1: 68.8000 (68.9905) acc5: 88.8000 (88.5333) time: 1.0050 data: 0.9854 max mem: 2928
Test: [30/50] eta: 0:00:26 loss: 1.5001 (1.4099) acc1: 66.4000 (68.4903) acc5: 86.4000 (88.1548) time: 1.0020 data: 0.9819 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3811 (1.4346) acc1: 66.4000 (67.7268) acc5: 86.4000 (87.5317) time: 0.7838 data: 0.7626 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4691 (1.4392) acc1: 65.6000 (67.3600) acc5: 86.4000 (87.3440) time: 0.7103 data: 0.6904 max mem: 2928
Test: Total time: 0:00:50 (1.0171 s / it)
* Acc@1 68.182 Acc@5 87.624 loss 1.409
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.18%
Epoch: [273] [ 0/625] eta: 3:22:31 lr: 0.000092 min_lr: 0.000092 loss: 2.5506 (2.5506) class_acc: 0.6211 (0.6211) weight_decay: 0.0500 (0.0500) time: 19.4418 data: 19.2721 max mem: 2928
Epoch: [273] [200/625] eta: 0:14:02 lr: 0.000090 min_lr: 0.000090 loss: 2.6592 (2.6863) class_acc: 0.6055 (0.6004) weight_decay: 0.0500 (0.0500) grad_norm: 2.7570 (3.0974) time: 1.7972 data: 0.0008 max mem: 2928
Epoch: [273] [400/625] eta: 0:07:12 lr: 0.000088 min_lr: 0.000088 loss: 2.6879 (2.6899) class_acc: 0.6016 (0.5990) weight_decay: 0.0500 (0.0500) grad_norm: 2.9832 (3.2100) time: 1.8369 data: 0.0007 max mem: 2928
Epoch: [273] [600/625] eta: 0:00:48 lr: 0.000086 min_lr: 0.000086 loss: 2.6857 (2.6927) class_acc: 0.5938 (0.5985) weight_decay: 0.0500 (0.0500) grad_norm: 2.8768 (inf) time: 1.8753 data: 0.8175 max mem: 2928
Epoch: [273] [624/625] eta: 0:00:01 lr: 0.000085 min_lr: 0.000085 loss: 2.6485 (2.6922) class_acc: 0.6094 (0.5987) weight_decay: 0.0500 (0.0500) grad_norm: 2.4341 (inf) time: 0.9818 data: 0.5894 max mem: 2928
Epoch: [273] Total time: 0:19:38 (1.8858 s / it)
Averaged stats: lr: 0.000085 min_lr: 0.000085 loss: 2.6485 (2.6966) class_acc: 0.6094 (0.5969) weight_decay: 0.0500 (0.0500) grad_norm: 2.4341 (inf)
Test: [ 0/50] eta: 0:10:09 loss: 1.3885 (1.3885) acc1: 68.0000 (68.0000) acc5: 89.6000 (89.6000) time: 12.1982 data: 12.1731 max mem: 2928
Test: [10/50] eta: 0:01:21 loss: 1.2585 (1.2638) acc1: 72.8000 (72.7273) acc5: 90.4000 (89.9636) time: 2.0263 data: 2.0053 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.3941 (1.4129) acc1: 67.2000 (69.0667) acc5: 88.0000 (88.3810) time: 1.0749 data: 1.0555 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5349 (1.4345) acc1: 65.6000 (68.3097) acc5: 88.0000 (87.8452) time: 1.0214 data: 1.0026 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.4251 (1.4469) acc1: 65.6000 (67.8829) acc5: 87.2000 (87.4537) time: 0.7823 data: 0.7636 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4251 (1.4510) acc1: 64.8000 (67.4560) acc5: 87.2000 (87.3600) time: 0.7817 data: 0.7629 max mem: 2928
Test: Total time: 0:00:51 (1.0345 s / it)
* Acc@1 68.180 Acc@5 87.878 loss 1.417
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.18%
Epoch: [274] [ 0/625] eta: 4:00:35 lr: 0.000085 min_lr: 0.000085 loss: 2.6856 (2.6856) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 23.0969 data: 19.3993 max mem: 2928
Epoch: [274] [200/625] eta: 0:13:41 lr: 0.000083 min_lr: 0.000083 loss: 2.6918 (2.6992) class_acc: 0.5859 (0.5967) weight_decay: 0.0500 (0.0500) grad_norm: 2.7592 (3.1120) time: 1.9756 data: 0.0820 max mem: 2928
Epoch: [274] [400/625] eta: 0:07:03 lr: 0.000081 min_lr: 0.000081 loss: 2.6703 (2.6947) class_acc: 0.6133 (0.5979) weight_decay: 0.0500 (0.0500) grad_norm: 2.6652 (3.0480) time: 1.8408 data: 0.4873 max mem: 2928
Epoch: [274] [600/625] eta: 0:00:47 lr: 0.000079 min_lr: 0.000079 loss: 2.7090 (2.6946) class_acc: 0.6016 (0.5976) weight_decay: 0.0500 (0.0500) grad_norm: 2.6876 (3.1544) time: 1.9572 data: 0.0007 max mem: 2928
Epoch: [274] [624/625] eta: 0:00:01 lr: 0.000079 min_lr: 0.000079 loss: 2.6884 (2.6948) class_acc: 0.5938 (0.5977) weight_decay: 0.0500 (0.0500) grad_norm: 2.2495 (3.1419) time: 0.7670 data: 0.0015 max mem: 2928
Epoch: [274] Total time: 0:19:33 (1.8773 s / it)
Averaged stats: lr: 0.000079 min_lr: 0.000079 loss: 2.6884 (2.6941) class_acc: 0.5938 (0.5972) weight_decay: 0.0500 (0.0500) grad_norm: 2.2495 (3.1419)
Test: [ 0/50] eta: 0:10:05 loss: 1.3563 (1.3563) acc1: 67.2000 (67.2000) acc5: 88.8000 (88.8000) time: 12.1057 data: 12.0772 max mem: 2928
Test: [10/50] eta: 0:01:23 loss: 1.2730 (1.2644) acc1: 70.4000 (71.9273) acc5: 90.4000 (89.6000) time: 2.0841 data: 2.0629 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.3975 (1.4028) acc1: 68.8000 (68.7238) acc5: 89.6000 (88.4952) time: 1.1051 data: 1.0849 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5271 (1.4267) acc1: 66.4000 (67.8968) acc5: 87.2000 (88.0000) time: 1.0038 data: 0.9830 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4284 (1.4394) acc1: 66.4000 (67.6293) acc5: 86.4000 (87.5902) time: 0.6486 data: 0.6277 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4388 (1.4441) acc1: 65.6000 (67.2000) acc5: 86.4000 (87.3920) time: 0.6400 data: 0.6200 max mem: 2928
Test: Total time: 0:00:48 (0.9751 s / it)
* Acc@1 68.150 Acc@5 87.822 loss 1.410
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.18%
Epoch: [275] [ 0/625] eta: 3:53:07 lr: 0.000079 min_lr: 0.000079 loss: 2.8161 (2.8161) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 22.3797 data: 13.6702 max mem: 2928
Epoch: [275] [200/625] eta: 0:14:00 lr: 0.000077 min_lr: 0.000077 loss: 2.7157 (2.6953) class_acc: 0.5898 (0.5984) weight_decay: 0.0500 (0.0500) grad_norm: 3.4296 (3.4390) time: 1.8427 data: 0.0006 max mem: 2928
Epoch: [275] [400/625] eta: 0:07:12 lr: 0.000075 min_lr: 0.000075 loss: 2.7074 (2.6958) class_acc: 0.5898 (0.5973) weight_decay: 0.0500 (0.0500) grad_norm: 2.5290 (3.3259) time: 1.8275 data: 0.0006 max mem: 2928
Epoch: [275] [600/625] eta: 0:00:47 lr: 0.000073 min_lr: 0.000073 loss: 2.7033 (2.7010) class_acc: 0.5820 (0.5961) weight_decay: 0.0500 (0.0500) grad_norm: 2.9741 (3.2677) time: 2.0943 data: 0.0006 max mem: 2928
Epoch: [275] [624/625] eta: 0:00:01 lr: 0.000073 min_lr: 0.000073 loss: 2.6872 (2.7010) class_acc: 0.5859 (0.5957) weight_decay: 0.0500 (0.0500) grad_norm: 3.2051 (3.2667) time: 0.8218 data: 0.0013 max mem: 2928
Epoch: [275] Total time: 0:19:29 (1.8706 s / it)
Averaged stats: lr: 0.000073 min_lr: 0.000073 loss: 2.6872 (2.6949) class_acc: 0.5859 (0.5975) weight_decay: 0.0500 (0.0500) grad_norm: 3.2051 (3.2667)
Test: [ 0/50] eta: 0:10:11 loss: 1.3697 (1.3697) acc1: 64.0000 (64.0000) acc5: 90.4000 (90.4000) time: 12.2342 data: 12.2045 max mem: 2928
Test: [10/50] eta: 0:01:20 loss: 1.2814 (1.2650) acc1: 72.8000 (71.8545) acc5: 90.4000 (89.3818) time: 2.0091 data: 1.9897 max mem: 2928
Test: [20/50] eta: 0:00:46 loss: 1.3902 (1.4106) acc1: 68.0000 (68.6857) acc5: 88.0000 (88.1143) time: 1.0152 data: 0.9965 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5326 (1.4249) acc1: 66.4000 (68.1806) acc5: 87.2000 (87.7936) time: 1.0058 data: 0.9864 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3861 (1.4402) acc1: 66.4000 (67.6683) acc5: 87.2000 (87.5317) time: 0.8079 data: 0.7866 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4318 (1.4417) acc1: 65.6000 (67.5040) acc5: 87.2000 (87.3280) time: 0.7010 data: 0.6800 max mem: 2928
Test: Total time: 0:00:51 (1.0329 s / it)
* Acc@1 68.200 Acc@5 87.736 loss 1.410
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.20%
Epoch: [276] [ 0/625] eta: 3:17:13 lr: 0.000073 min_lr: 0.000073 loss: 2.6518 (2.6518) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 18.9342 data: 17.9040 max mem: 2928
Epoch: [276] [200/625] eta: 0:14:00 lr: 0.000071 min_lr: 0.000071 loss: 2.6926 (2.6846) class_acc: 0.6016 (0.5994) weight_decay: 0.0500 (0.0500) grad_norm: 3.7429 (3.3839) time: 1.8723 data: 0.0054 max mem: 2928
Epoch: [276] [400/625] eta: 0:07:13 lr: 0.000069 min_lr: 0.000069 loss: 2.6608 (2.6845) class_acc: 0.6094 (0.6008) weight_decay: 0.0500 (0.0500) grad_norm: 3.1121 (3.4721) time: 1.8420 data: 0.0010 max mem: 2928
Epoch: [276] [600/625] eta: 0:00:48 lr: 0.000067 min_lr: 0.000067 loss: 2.6647 (2.6884) class_acc: 0.5977 (0.5993) weight_decay: 0.0500 (0.0500) grad_norm: 2.9250 (3.2657) time: 1.9726 data: 0.0007 max mem: 2928
Epoch: [276] [624/625] eta: 0:00:01 lr: 0.000067 min_lr: 0.000067 loss: 2.7245 (2.6897) class_acc: 0.5938 (0.5994) weight_decay: 0.0500 (0.0500) grad_norm: 2.9789 (3.2508) time: 0.7098 data: 0.0016 max mem: 2928
Epoch: [276] Total time: 0:19:38 (1.8863 s / it)
Averaged stats: lr: 0.000067 min_lr: 0.000067 loss: 2.7245 (2.6929) class_acc: 0.5938 (0.5978) weight_decay: 0.0500 (0.0500) grad_norm: 2.9789 (3.2508)
Test: [ 0/50] eta: 0:10:29 loss: 1.3327 (1.3327) acc1: 68.0000 (68.0000) acc5: 92.0000 (92.0000) time: 12.5927 data: 12.5655 max mem: 2928
Test: [10/50] eta: 0:01:23 loss: 1.2554 (1.2507) acc1: 72.0000 (72.5091) acc5: 90.4000 (90.1091) time: 2.0772 data: 2.0560 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.4097 (1.4020) acc1: 68.0000 (68.7619) acc5: 89.6000 (88.7238) time: 1.0580 data: 1.0379 max mem: 2928
Test: [30/50] eta: 0:00:28 loss: 1.5303 (1.4193) acc1: 66.4000 (68.0774) acc5: 88.0000 (88.2581) time: 1.0705 data: 1.0514 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.4000 (1.4330) acc1: 66.4000 (67.6683) acc5: 87.2000 (87.8829) time: 0.8149 data: 0.7966 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4106 (1.4365) acc1: 66.4000 (67.4400) acc5: 87.2000 (87.6800) time: 0.7385 data: 0.7200 max mem: 2928
Test: Total time: 0:00:50 (1.0160 s / it)
* Acc@1 68.428 Acc@5 87.984 loss 1.402
Accuracy of the model on the 50000 test images: 68.4%
Max accuracy: 68.43%
Epoch: [277] [ 0/625] eta: 3:31:49 lr: 0.000067 min_lr: 0.000067 loss: 2.6082 (2.6082) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.3358 data: 17.3456 max mem: 2928
Epoch: [277] [200/625] eta: 0:14:25 lr: 0.000065 min_lr: 0.000065 loss: 2.6370 (2.6730) class_acc: 0.5938 (0.5995) weight_decay: 0.0500 (0.0500) grad_norm: 3.3387 (3.1867) time: 1.9612 data: 0.0012 max mem: 2928
Epoch: [277] [400/625] eta: 0:07:20 lr: 0.000064 min_lr: 0.000064 loss: 2.6703 (2.6893) class_acc: 0.5977 (0.5964) weight_decay: 0.0500 (0.0500) grad_norm: 2.7297 (3.4342) time: 1.8914 data: 0.0014 max mem: 2928
Epoch: [277] [600/625] eta: 0:00:48 lr: 0.000062 min_lr: 0.000062 loss: 2.7367 (2.6900) class_acc: 0.5820 (0.5967) weight_decay: 0.0500 (0.0500) grad_norm: 2.8013 (3.2965) time: 1.9486 data: 0.0009 max mem: 2928
Epoch: [277] [624/625] eta: 0:00:01 lr: 0.000062 min_lr: 0.000062 loss: 2.7087 (2.6901) class_acc: 0.5898 (0.5967) weight_decay: 0.0500 (0.0500) grad_norm: 2.8374 (3.3087) time: 0.7608 data: 0.0020 max mem: 2928
Epoch: [277] Total time: 0:19:45 (1.8971 s / it)
Averaged stats: lr: 0.000062 min_lr: 0.000062 loss: 2.7087 (2.6914) class_acc: 0.5898 (0.5981) weight_decay: 0.0500 (0.0500) grad_norm: 2.8374 (3.3087)
Test: [ 0/50] eta: 0:10:28 loss: 1.4060 (1.4060) acc1: 65.6000 (65.6000) acc5: 88.8000 (88.8000) time: 12.5743 data: 12.5466 max mem: 2928
Test: [10/50] eta: 0:01:24 loss: 1.2714 (1.2757) acc1: 73.6000 (72.6545) acc5: 90.4000 (89.6727) time: 2.1007 data: 2.0786 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.4380 (1.4259) acc1: 68.0000 (68.7619) acc5: 88.0000 (88.2286) time: 1.0704 data: 1.0505 max mem: 2928
Test: [30/50] eta: 0:00:28 loss: 1.5226 (1.4390) acc1: 65.6000 (68.0774) acc5: 87.2000 (87.8194) time: 1.0493 data: 1.0311 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4300 (1.4517) acc1: 66.4000 (67.4927) acc5: 88.0000 (87.6488) time: 0.7532 data: 0.7349 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4326 (1.4545) acc1: 64.8000 (67.2800) acc5: 87.2000 (87.4720) time: 0.7197 data: 0.7012 max mem: 2928
Test: Total time: 0:00:50 (1.0183 s / it)
* Acc@1 68.194 Acc@5 87.870 loss 1.414
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.43%
Epoch: [278] [ 0/625] eta: 4:17:04 lr: 0.000062 min_lr: 0.000062 loss: 2.7227 (2.7227) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 24.6787 data: 16.5646 max mem: 2928
Epoch: [278] [200/625] eta: 0:14:11 lr: 0.000060 min_lr: 0.000060 loss: 2.6780 (2.7018) class_acc: 0.5859 (0.5976) weight_decay: 0.0500 (0.0500) grad_norm: 3.2220 (3.2738) time: 1.8835 data: 0.0058 max mem: 2928
Epoch: [278] [400/625] eta: 0:07:18 lr: 0.000058 min_lr: 0.000058 loss: 2.6360 (2.6924) class_acc: 0.6016 (0.6000) weight_decay: 0.0500 (0.0500) grad_norm: 2.7605 (3.2574) time: 1.9273 data: 0.0011 max mem: 2928
Epoch: [278] [600/625] eta: 0:00:48 lr: 0.000056 min_lr: 0.000056 loss: 2.6787 (2.6932) class_acc: 0.5938 (0.5991) weight_decay: 0.0500 (0.0500) grad_norm: 3.0789 (3.1963) time: 2.0496 data: 0.0011 max mem: 2928
Epoch: [278] [624/625] eta: 0:00:01 lr: 0.000056 min_lr: 0.000056 loss: 2.6510 (2.6931) class_acc: 0.6055 (0.5991) weight_decay: 0.0500 (0.0500) grad_norm: 2.8214 (3.2008) time: 0.7820 data: 0.0018 max mem: 2928
Epoch: [278] Total time: 0:19:50 (1.9055 s / it)
Averaged stats: lr: 0.000056 min_lr: 0.000056 loss: 2.6510 (2.6903) class_acc: 0.6055 (0.5988) weight_decay: 0.0500 (0.0500) grad_norm: 2.8214 (3.2008)
Test: [ 0/50] eta: 0:10:03 loss: 1.3870 (1.3870) acc1: 67.2000 (67.2000) acc5: 90.4000 (90.4000) time: 12.0789 data: 12.0518 max mem: 2928
Test: [10/50] eta: 0:01:22 loss: 1.2373 (1.2620) acc1: 72.8000 (72.8727) acc5: 90.4000 (89.9636) time: 2.0694 data: 2.0496 max mem: 2928
Test: [20/50] eta: 0:00:49 loss: 1.4269 (1.4167) acc1: 68.0000 (68.9524) acc5: 88.0000 (88.5333) time: 1.1394 data: 1.1205 max mem: 2928
Test: [30/50] eta: 0:00:29 loss: 1.5423 (1.4308) acc1: 66.4000 (68.1806) acc5: 87.2000 (88.2323) time: 1.1569 data: 1.1388 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.4451 (1.4495) acc1: 66.4000 (67.7073) acc5: 87.2000 (87.8244) time: 0.8331 data: 0.8144 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4349 (1.4516) acc1: 66.4000 (67.6160) acc5: 87.2000 (87.6000) time: 0.7899 data: 0.7698 max mem: 2928
Test: Total time: 0:00:52 (1.0488 s / it)
* Acc@1 68.216 Acc@5 87.900 loss 1.417
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.43%
Epoch: [279] [ 0/625] eta: 3:28:48 lr: 0.000056 min_lr: 0.000056 loss: 2.6166 (2.6166) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 20.0462 data: 16.6315 max mem: 2928
Epoch: [279] [200/625] eta: 0:14:19 lr: 0.000055 min_lr: 0.000055 loss: 2.7202 (2.6871) class_acc: 0.5898 (0.6010) weight_decay: 0.0500 (0.0500) grad_norm: 2.4478 (2.9410) time: 1.9121 data: 0.6348 max mem: 2928
Epoch: [279] [400/625] eta: 0:07:24 lr: 0.000053 min_lr: 0.000053 loss: 2.7356 (2.6906) class_acc: 0.5938 (0.6001) weight_decay: 0.0500 (0.0500) grad_norm: 3.9114 (3.1683) time: 2.0300 data: 0.0007 max mem: 2928
Epoch: [279] [600/625] eta: 0:00:48 lr: 0.000051 min_lr: 0.000051 loss: 2.6522 (2.6903) class_acc: 0.5977 (0.5992) weight_decay: 0.0500 (0.0500) grad_norm: 2.3418 (3.1071) time: 2.0551 data: 0.0293 max mem: 2928
Epoch: [279] [624/625] eta: 0:00:01 lr: 0.000051 min_lr: 0.000051 loss: 2.6741 (2.6903) class_acc: 0.5859 (0.5991) weight_decay: 0.0500 (0.0500) grad_norm: 2.5262 (3.0888) time: 0.7456 data: 0.0014 max mem: 2928
Epoch: [279] Total time: 0:19:48 (1.9011 s / it)
Averaged stats: lr: 0.000051 min_lr: 0.000051 loss: 2.6741 (2.6876) class_acc: 0.5859 (0.5993) weight_decay: 0.0500 (0.0500) grad_norm: 2.5262 (3.0888)
Test: [ 0/50] eta: 0:10:15 loss: 1.3327 (1.3327) acc1: 69.6000 (69.6000) acc5: 90.4000 (90.4000) time: 12.3071 data: 12.2686 max mem: 2928
Test: [10/50] eta: 0:01:22 loss: 1.2232 (1.2342) acc1: 72.8000 (72.4364) acc5: 91.2000 (90.3273) time: 2.0663 data: 2.0443 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.4040 (1.3779) acc1: 68.0000 (68.8000) acc5: 89.6000 (88.8381) time: 1.0731 data: 1.0531 max mem: 2928
Test: [30/50] eta: 0:00:26 loss: 1.5038 (1.3973) acc1: 66.4000 (68.3097) acc5: 87.2000 (88.3613) time: 0.9030 data: 0.8841 max mem: 2928
Test: [40/50] eta: 0:00:10 loss: 1.3802 (1.4175) acc1: 66.4000 (67.8244) acc5: 87.2000 (87.8829) time: 0.5052 data: 0.4863 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4298 (1.4204) acc1: 67.2000 (67.6480) acc5: 87.2000 (87.7280) time: 0.4532 data: 0.4344 max mem: 2928
Test: Total time: 0:00:44 (0.8894 s / it)
* Acc@1 68.532 Acc@5 88.082 loss 1.387
Accuracy of the model on the 50000 test images: 68.5%
Max accuracy: 68.53%
Epoch: [280] [ 0/625] eta: 3:08:12 lr: 0.000051 min_lr: 0.000051 loss: 2.6742 (2.6742) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 18.0682 data: 15.8064 max mem: 2928
Epoch: [280] [200/625] eta: 0:13:46 lr: 0.000050 min_lr: 0.000050 loss: 2.6397 (2.6768) class_acc: 0.6094 (0.6026) weight_decay: 0.0500 (0.0500) grad_norm: 3.2799 (inf) time: 1.7707 data: 0.0006 max mem: 2928
Epoch: [280] [400/625] eta: 0:07:06 lr: 0.000048 min_lr: 0.000048 loss: 2.6548 (2.6809) class_acc: 0.6055 (0.6001) weight_decay: 0.0500 (0.0500) grad_norm: 3.2054 (inf) time: 1.7875 data: 0.0007 max mem: 2928
Epoch: [280] [600/625] eta: 0:00:47 lr: 0.000046 min_lr: 0.000046 loss: 2.6748 (2.6851) class_acc: 0.5938 (0.5990) weight_decay: 0.0500 (0.0500) grad_norm: 2.6952 (inf) time: 1.9275 data: 0.0006 max mem: 2928
Epoch: [280] [624/625] eta: 0:00:01 lr: 0.000046 min_lr: 0.000046 loss: 2.6492 (2.6852) class_acc: 0.6016 (0.5991) weight_decay: 0.0500 (0.0500) grad_norm: 2.5849 (inf) time: 0.9609 data: 0.0014 max mem: 2928
Epoch: [280] Total time: 0:19:16 (1.8505 s / it)
Averaged stats: lr: 0.000046 min_lr: 0.000046 loss: 2.6492 (2.6893) class_acc: 0.6016 (0.5988) weight_decay: 0.0500 (0.0500) grad_norm: 2.5849 (inf)
Test: [ 0/50] eta: 0:10:14 loss: 1.3498 (1.3498) acc1: 67.2000 (67.2000) acc5: 92.0000 (92.0000) time: 12.2866 data: 12.2563 max mem: 2928
Test: [10/50] eta: 0:01:19 loss: 1.2368 (1.2383) acc1: 72.8000 (72.5818) acc5: 91.2000 (90.2546) time: 1.9794 data: 1.9596 max mem: 2928
Test: [20/50] eta: 0:00:46 loss: 1.4004 (1.3801) acc1: 68.8000 (69.4476) acc5: 88.8000 (88.8000) time: 1.0071 data: 0.9884 max mem: 2928
Test: [30/50] eta: 0:00:26 loss: 1.4768 (1.4000) acc1: 66.4000 (68.6968) acc5: 88.0000 (88.3871) time: 0.9872 data: 0.9680 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4075 (1.4208) acc1: 66.4000 (68.1951) acc5: 88.0000 (87.9610) time: 0.6854 data: 0.6657 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4243 (1.4253) acc1: 65.6000 (67.9040) acc5: 87.2000 (87.8080) time: 0.5648 data: 0.5456 max mem: 2928
Test: Total time: 0:00:49 (0.9838 s / it)
* Acc@1 68.594 Acc@5 88.064 loss 1.391
Accuracy of the model on the 50000 test images: 68.6%
Max accuracy: 68.59%
Epoch: [281] [ 0/625] eta: 3:13:44 lr: 0.000046 min_lr: 0.000046 loss: 2.8213 (2.8213) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 18.5992 data: 18.4075 max mem: 2928
Epoch: [281] [200/625] eta: 0:13:50 lr: 0.000045 min_lr: 0.000045 loss: 2.7363 (2.6928) class_acc: 0.5820 (0.5982) weight_decay: 0.0500 (0.0500) grad_norm: 2.7569 (2.9812) time: 1.8773 data: 0.0008 max mem: 2928
Epoch: [281] [400/625] eta: 0:07:11 lr: 0.000043 min_lr: 0.000043 loss: 2.6870 (2.6889) class_acc: 0.5898 (0.5991) weight_decay: 0.0500 (0.0500) grad_norm: 2.4591 (3.0920) time: 2.0355 data: 0.0008 max mem: 2928
Epoch: [281] [600/625] eta: 0:00:48 lr: 0.000042 min_lr: 0.000042 loss: 2.6828 (2.6882) class_acc: 0.6055 (0.5985) weight_decay: 0.0500 (0.0500) grad_norm: 2.3855 (3.0058) time: 2.0052 data: 0.0008 max mem: 2928
Epoch: [281] [624/625] eta: 0:00:01 lr: 0.000042 min_lr: 0.000042 loss: 2.7301 (2.6895) class_acc: 0.5977 (0.5985) weight_decay: 0.0500 (0.0500) grad_norm: 2.9938 (3.0082) time: 0.6779 data: 0.0020 max mem: 2928
Epoch: [281] Total time: 0:19:38 (1.8858 s / it)
Averaged stats: lr: 0.000042 min_lr: 0.000042 loss: 2.7301 (2.6852) class_acc: 0.5977 (0.5997) weight_decay: 0.0500 (0.0500) grad_norm: 2.9938 (3.0082)
Test: [ 0/50] eta: 0:10:12 loss: 1.3764 (1.3764) acc1: 68.8000 (68.8000) acc5: 90.4000 (90.4000) time: 12.2577 data: 12.2286 max mem: 2928
Test: [10/50] eta: 0:01:29 loss: 1.2316 (1.2398) acc1: 72.0000 (72.6545) acc5: 90.4000 (90.1818) time: 2.2326 data: 2.2137 max mem: 2928
Test: [20/50] eta: 0:00:54 loss: 1.3995 (1.3890) acc1: 68.8000 (69.0667) acc5: 88.8000 (88.6857) time: 1.2890 data: 1.2701 max mem: 2928
Test: [30/50] eta: 0:00:31 loss: 1.5015 (1.4098) acc1: 66.4000 (68.2581) acc5: 88.0000 (88.0774) time: 1.1875 data: 1.1672 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.4007 (1.4277) acc1: 66.4000 (67.9024) acc5: 87.2000 (87.7268) time: 0.7521 data: 0.7328 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4124 (1.4320) acc1: 66.4000 (67.6800) acc5: 87.2000 (87.5200) time: 0.7510 data: 0.7327 max mem: 2928
Test: Total time: 0:00:53 (1.0675 s / it)
* Acc@1 68.550 Acc@5 88.018 loss 1.398
Accuracy of the model on the 50000 test images: 68.6%
Max accuracy: 68.59%
Epoch: [282] [ 0/625] eta: 3:37:33 lr: 0.000042 min_lr: 0.000042 loss: 2.6832 (2.6832) class_acc: 0.6211 (0.6211) weight_decay: 0.0500 (0.0500) time: 20.8848 data: 16.3276 max mem: 2928
Epoch: [282] [200/625] eta: 0:14:03 lr: 0.000040 min_lr: 0.000040 loss: 2.6873 (2.6818) class_acc: 0.6055 (0.6021) weight_decay: 0.0500 (0.0500) grad_norm: 2.4936 (2.8106) time: 1.7879 data: 0.0006 max mem: 2928
Epoch: [282] [400/625] eta: 0:07:13 lr: 0.000039 min_lr: 0.000039 loss: 2.6466 (2.6852) class_acc: 0.6055 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 2.6044 (3.0977) time: 1.8668 data: 0.0150 max mem: 2928
Epoch: [282] [600/625] eta: 0:00:48 lr: 0.000037 min_lr: 0.000037 loss: 2.7311 (2.6832) class_acc: 0.5898 (0.6019) weight_decay: 0.0500 (0.0500) grad_norm: 2.8431 (3.1425) time: 2.1169 data: 0.0180 max mem: 2928
Epoch: [282] [624/625] eta: 0:00:01 lr: 0.000037 min_lr: 0.000037 loss: 2.7014 (2.6830) class_acc: 0.5898 (0.6019) weight_decay: 0.0500 (0.0500) grad_norm: 3.0306 (3.1807) time: 0.7332 data: 0.0193 max mem: 2928
Epoch: [282] Total time: 0:19:59 (1.9196 s / it)
Averaged stats: lr: 0.000037 min_lr: 0.000037 loss: 2.7014 (2.6850) class_acc: 0.5898 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 3.0306 (3.1807)
Test: [ 0/50] eta: 0:10:20 loss: 1.3544 (1.3544) acc1: 67.2000 (67.2000) acc5: 90.4000 (90.4000) time: 12.4064 data: 12.3627 max mem: 2928
Test: [10/50] eta: 0:01:21 loss: 1.2201 (1.2334) acc1: 73.6000 (72.8727) acc5: 91.2000 (90.3273) time: 2.0299 data: 2.0079 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.3980 (1.3812) acc1: 69.6000 (69.4095) acc5: 89.6000 (88.8762) time: 1.0432 data: 1.0233 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5038 (1.4016) acc1: 66.4000 (68.6968) acc5: 88.0000 (88.4645) time: 1.0084 data: 0.9883 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4008 (1.4199) acc1: 67.2000 (68.0390) acc5: 87.2000 (87.9610) time: 0.6561 data: 0.6362 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4060 (1.4226) acc1: 65.6000 (67.8240) acc5: 87.2000 (87.8080) time: 0.6545 data: 0.6361 max mem: 2928
Test: Total time: 0:00:46 (0.9336 s / it)
* Acc@1 68.676 Acc@5 88.124 loss 1.389
Accuracy of the model on the 50000 test images: 68.7%
Max accuracy: 68.68%
Epoch: [283] [ 0/625] eta: 3:30:01 lr: 0.000037 min_lr: 0.000037 loss: 2.5470 (2.5470) class_acc: 0.6680 (0.6680) weight_decay: 0.0500 (0.0500) time: 20.1618 data: 17.7754 max mem: 2928
Epoch: [283] [200/625] eta: 0:14:36 lr: 0.000036 min_lr: 0.000036 loss: 2.6979 (2.6966) class_acc: 0.6094 (0.5982) weight_decay: 0.0500 (0.0500) grad_norm: 2.3781 (2.9006) time: 1.7888 data: 0.0010 max mem: 2928
Epoch: [283] [400/625] eta: 0:07:29 lr: 0.000035 min_lr: 0.000035 loss: 2.6904 (2.6868) class_acc: 0.5977 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 2.5846 (2.9859) time: 1.9854 data: 0.0008 max mem: 2928
Epoch: [283] [600/625] eta: 0:00:49 lr: 0.000033 min_lr: 0.000033 loss: 2.6247 (2.6860) class_acc: 0.5938 (0.6000) weight_decay: 0.0500 (0.0500) grad_norm: 2.6765 (2.9870) time: 1.9493 data: 0.0019 max mem: 2928
Epoch: [283] [624/625] eta: 0:00:01 lr: 0.000033 min_lr: 0.000033 loss: 2.6755 (2.6857) class_acc: 0.5938 (0.5999) weight_decay: 0.0500 (0.0500) grad_norm: 2.7766 (2.9764) time: 0.5867 data: 0.0023 max mem: 2928
Epoch: [283] Total time: 0:20:31 (1.9708 s / it)
Averaged stats: lr: 0.000033 min_lr: 0.000033 loss: 2.6755 (2.6852) class_acc: 0.5938 (0.5996) weight_decay: 0.0500 (0.0500) grad_norm: 2.7766 (2.9764)
Test: [ 0/50] eta: 0:10:44 loss: 1.3511 (1.3511) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.8905 data: 12.8601 max mem: 2928
Test: [10/50] eta: 0:01:22 loss: 1.2415 (1.2416) acc1: 72.8000 (73.0182) acc5: 91.2000 (90.3273) time: 2.0600 data: 2.0403 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.3866 (1.3816) acc1: 68.8000 (69.1429) acc5: 88.8000 (88.9143) time: 1.0117 data: 0.9918 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.4887 (1.4011) acc1: 66.4000 (68.6194) acc5: 88.0000 (88.3355) time: 1.0095 data: 0.9902 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4002 (1.4201) acc1: 66.4000 (68.1171) acc5: 87.2000 (87.9610) time: 0.7426 data: 0.7232 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4248 (1.4233) acc1: 65.6000 (67.7120) acc5: 87.2000 (87.7600) time: 0.7130 data: 0.6932 max mem: 2928
Test: Total time: 0:00:50 (1.0134 s / it)
* Acc@1 68.714 Acc@5 88.108 loss 1.392
Accuracy of the model on the 50000 test images: 68.7%
Max accuracy: 68.71%
Epoch: [284] [ 0/625] eta: 3:34:22 lr: 0.000033 min_lr: 0.000033 loss: 2.7053 (2.7053) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 20.5796 data: 17.8878 max mem: 2928
Epoch: [284] [200/625] eta: 0:14:12 lr: 0.000032 min_lr: 0.000032 loss: 2.6803 (2.6917) class_acc: 0.5938 (0.5967) weight_decay: 0.0500 (0.0500) grad_norm: 2.5237 (3.0243) time: 1.8779 data: 0.0005 max mem: 2928
Epoch: [284] [400/625] eta: 0:07:11 lr: 0.000031 min_lr: 0.000031 loss: 2.6899 (2.6887) class_acc: 0.5938 (0.5986) weight_decay: 0.0500 (0.0500) grad_norm: 2.7147 (3.0954) time: 1.7720 data: 0.0006 max mem: 2928
Epoch: [284] [600/625] eta: 0:00:47 lr: 0.000029 min_lr: 0.000029 loss: 2.7134 (2.6882) class_acc: 0.5781 (0.5985) weight_decay: 0.0500 (0.0500) grad_norm: 2.6319 (3.0402) time: 2.0573 data: 0.0008 max mem: 2928
Epoch: [284] [624/625] eta: 0:00:01 lr: 0.000029 min_lr: 0.000029 loss: 2.7230 (2.6890) class_acc: 0.5781 (0.5982) weight_decay: 0.0500 (0.0500) grad_norm: 2.8797 (3.0550) time: 0.7659 data: 0.0012 max mem: 2928
Epoch: [284] Total time: 0:19:44 (1.8960 s / it)
Averaged stats: lr: 0.000029 min_lr: 0.000029 loss: 2.7230 (2.6841) class_acc: 0.5781 (0.6003) weight_decay: 0.0500 (0.0500) grad_norm: 2.8797 (3.0550)
Test: [ 0/50] eta: 0:10:30 loss: 1.3430 (1.3430) acc1: 65.6000 (65.6000) acc5: 92.0000 (92.0000) time: 12.6097 data: 12.5836 max mem: 2928
Test: [10/50] eta: 0:01:22 loss: 1.2264 (1.2395) acc1: 74.4000 (72.8727) acc5: 91.2000 (90.4000) time: 2.0679 data: 2.0481 max mem: 2928
Test: [20/50] eta: 0:00:49 loss: 1.4008 (1.3874) acc1: 68.8000 (69.2952) acc5: 89.6000 (88.9143) time: 1.0865 data: 1.0676 max mem: 2928
Test: [30/50] eta: 0:00:28 loss: 1.5308 (1.4051) acc1: 66.4000 (68.6194) acc5: 87.2000 (88.4903) time: 1.0803 data: 1.0617 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.4155 (1.4214) acc1: 66.4000 (68.0585) acc5: 88.0000 (88.0000) time: 0.7159 data: 0.6965 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.3998 (1.4254) acc1: 65.6000 (67.8400) acc5: 87.2000 (87.7440) time: 0.6736 data: 0.6530 max mem: 2928
Test: Total time: 0:00:49 (0.9871 s / it)
* Acc@1 68.628 Acc@5 88.056 loss 1.392
Accuracy of the model on the 50000 test images: 68.6%
Max accuracy: 68.71%
Epoch: [285] [ 0/625] eta: 3:37:54 lr: 0.000029 min_lr: 0.000029 loss: 2.6885 (2.6885) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 20.9187 data: 17.2003 max mem: 2928
Epoch: [285] [200/625] eta: 0:14:08 lr: 0.000028 min_lr: 0.000028 loss: 2.6788 (2.6842) class_acc: 0.6016 (0.5993) weight_decay: 0.0500 (0.0500) grad_norm: 2.3959 (2.7854) time: 1.9613 data: 0.0010 max mem: 2928
Epoch: [285] [400/625] eta: 0:07:19 lr: 0.000027 min_lr: 0.000027 loss: 2.6458 (2.6777) class_acc: 0.5977 (0.6010) weight_decay: 0.0500 (0.0500) grad_norm: 2.6664 (2.9210) time: 2.0289 data: 0.0008 max mem: 2928
Epoch: [285] [600/625] eta: 0:00:49 lr: 0.000026 min_lr: 0.000026 loss: 2.6542 (2.6781) class_acc: 0.5977 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 2.6703 (2.9270) time: 2.1352 data: 0.0014 max mem: 2928
Epoch: [285] [624/625] eta: 0:00:01 lr: 0.000026 min_lr: 0.000026 loss: 2.6575 (2.6782) class_acc: 0.5938 (0.6005) weight_decay: 0.0500 (0.0500) grad_norm: 2.4929 (2.9171) time: 0.7751 data: 0.0018 max mem: 2928
Epoch: [285] Total time: 0:20:04 (1.9270 s / it)
Averaged stats: lr: 0.000026 min_lr: 0.000026 loss: 2.6575 (2.6807) class_acc: 0.5938 (0.6005) weight_decay: 0.0500 (0.0500) grad_norm: 2.4929 (2.9171)
Test: [ 0/50] eta: 0:10:50 loss: 1.3308 (1.3308) acc1: 66.4000 (66.4000) acc5: 91.2000 (91.2000) time: 13.0177 data: 12.9864 max mem: 2928
Test: [10/50] eta: 0:01:25 loss: 1.2131 (1.2304) acc1: 73.6000 (73.7455) acc5: 91.2000 (90.6182) time: 2.1352 data: 2.1147 max mem: 2928
Test: [20/50] eta: 0:00:50 loss: 1.4042 (1.3776) acc1: 68.8000 (69.7524) acc5: 88.8000 (89.0667) time: 1.1125 data: 1.0941 max mem: 2928
Test: [30/50] eta: 0:00:30 loss: 1.4835 (1.3992) acc1: 67.2000 (69.0065) acc5: 88.8000 (88.5677) time: 1.1653 data: 1.1471 max mem: 2928
Test: [40/50] eta: 0:00:13 loss: 1.3978 (1.4179) acc1: 66.4000 (68.3512) acc5: 88.0000 (88.0390) time: 1.0266 data: 1.0079 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4277 (1.4210) acc1: 67.2000 (68.0640) acc5: 87.2000 (87.8400) time: 0.9427 data: 0.9205 max mem: 2928
Test: Total time: 0:00:56 (1.1364 s / it)
* Acc@1 68.716 Acc@5 88.110 loss 1.389
Accuracy of the model on the 50000 test images: 68.7%
Max accuracy: 68.72%
Epoch: [286] [ 0/625] eta: 3:26:44 lr: 0.000026 min_lr: 0.000026 loss: 2.8813 (2.8813) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 19.8472 data: 18.6266 max mem: 2928
Epoch: [286] [200/625] eta: 0:14:02 lr: 0.000025 min_lr: 0.000025 loss: 2.6691 (2.6869) class_acc: 0.5977 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 3.2221 (3.1651) time: 1.8964 data: 0.0006 max mem: 2928
Epoch: [286] [400/625] eta: 0:07:23 lr: 0.000023 min_lr: 0.000023 loss: 2.6892 (2.6783) class_acc: 0.5977 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 2.3608 (3.0207) time: 1.8217 data: 0.0319 max mem: 2928
Epoch: [286] [600/625] eta: 0:00:48 lr: 0.000022 min_lr: 0.000022 loss: 2.7525 (2.6787) class_acc: 0.5820 (0.6025) weight_decay: 0.0500 (0.0500) grad_norm: 2.6531 (inf) time: 1.9520 data: 0.1765 max mem: 2928
Epoch: [286] [624/625] eta: 0:00:01 lr: 0.000022 min_lr: 0.000022 loss: 2.6599 (2.6794) class_acc: 0.6133 (0.6026) weight_decay: 0.0500 (0.0500) grad_norm: 2.6172 (inf) time: 0.9064 data: 0.1038 max mem: 2928
Epoch: [286] Total time: 0:19:57 (1.9163 s / it)
Averaged stats: lr: 0.000022 min_lr: 0.000022 loss: 2.6599 (2.6829) class_acc: 0.6133 (0.6003) weight_decay: 0.0500 (0.0500) grad_norm: 2.6172 (inf)
Test: [ 0/50] eta: 0:10:05 loss: 1.3468 (1.3468) acc1: 64.0000 (64.0000) acc5: 90.4000 (90.4000) time: 12.1065 data: 12.0783 max mem: 2928
Test: [10/50] eta: 0:01:21 loss: 1.2118 (1.2286) acc1: 73.6000 (73.2364) acc5: 91.2000 (90.3273) time: 2.0423 data: 2.0221 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.3857 (1.3784) acc1: 69.6000 (69.6381) acc5: 88.0000 (88.8000) time: 1.0810 data: 1.0599 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5108 (1.3973) acc1: 66.4000 (68.8258) acc5: 87.2000 (88.3613) time: 1.0048 data: 0.9822 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3972 (1.4171) acc1: 66.4000 (68.1561) acc5: 88.0000 (87.8244) time: 0.6052 data: 0.5806 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4134 (1.4198) acc1: 67.2000 (67.9680) acc5: 87.2000 (87.6640) time: 0.5323 data: 0.5102 max mem: 2928
Test: Total time: 0:00:46 (0.9324 s / it)
* Acc@1 68.756 Acc@5 88.172 loss 1.385
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.76%
Epoch: [287] [ 0/625] eta: 3:20:48 lr: 0.000022 min_lr: 0.000022 loss: 2.5877 (2.5877) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 19.2779 data: 15.3315 max mem: 2928
Epoch: [287] [200/625] eta: 0:13:59 lr: 0.000021 min_lr: 0.000021 loss: 2.6888 (2.6895) class_acc: 0.5938 (0.6009) weight_decay: 0.0500 (0.0500) grad_norm: 2.6288 (2.8720) time: 1.8534 data: 0.0007 max mem: 2928
Epoch: [287] [400/625] eta: 0:07:19 lr: 0.000020 min_lr: 0.000020 loss: 2.6429 (2.6867) class_acc: 0.6016 (0.6020) weight_decay: 0.0500 (0.0500) grad_norm: 2.7867 (2.8569) time: 2.0474 data: 0.0007 max mem: 2928
Epoch: [287] [600/625] eta: 0:00:49 lr: 0.000019 min_lr: 0.000019 loss: 2.6753 (2.6856) class_acc: 0.5938 (0.6017) weight_decay: 0.0500 (0.0500) grad_norm: 3.0845 (2.9335) time: 1.8874 data: 0.0007 max mem: 2928
Epoch: [287] [624/625] eta: 0:00:01 lr: 0.000019 min_lr: 0.000019 loss: 2.6694 (2.6847) class_acc: 0.6094 (0.6020) weight_decay: 0.0500 (0.0500) grad_norm: 2.2659 (2.9257) time: 0.8055 data: 0.0014 max mem: 2928
Epoch: [287] Total time: 0:20:00 (1.9207 s / it)
Averaged stats: lr: 0.000019 min_lr: 0.000019 loss: 2.6694 (2.6800) class_acc: 0.6094 (0.6011) weight_decay: 0.0500 (0.0500) grad_norm: 2.2659 (2.9257)
Test: [ 0/50] eta: 0:09:49 loss: 1.3451 (1.3451) acc1: 66.4000 (66.4000) acc5: 91.2000 (91.2000) time: 11.7954 data: 11.7609 max mem: 2928
Test: [10/50] eta: 0:01:21 loss: 1.2081 (1.2299) acc1: 72.8000 (73.0182) acc5: 90.4000 (90.2545) time: 2.0269 data: 2.0067 max mem: 2928
Test: [20/50] eta: 0:00:48 loss: 1.3801 (1.3794) acc1: 68.8000 (69.5238) acc5: 88.8000 (89.0286) time: 1.1215 data: 1.0998 max mem: 2928
Test: [30/50] eta: 0:00:29 loss: 1.5131 (1.3977) acc1: 67.2000 (69.0323) acc5: 88.8000 (88.4645) time: 1.1378 data: 1.1160 max mem: 2928
Test: [40/50] eta: 0:00:13 loss: 1.3969 (1.4160) acc1: 67.2000 (68.3317) acc5: 88.0000 (88.0195) time: 1.0370 data: 1.0182 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4230 (1.4204) acc1: 66.4000 (67.9680) acc5: 86.4000 (87.7760) time: 0.9709 data: 0.9500 max mem: 2928
Test: Total time: 0:00:56 (1.1316 s / it)
* Acc@1 68.722 Acc@5 88.126 loss 1.386
Accuracy of the model on the 50000 test images: 68.7%
Max accuracy: 68.76%
Epoch: [288] [ 0/625] eta: 3:52:42 lr: 0.000019 min_lr: 0.000019 loss: 2.5230 (2.5230) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 22.3404 data: 18.7238 max mem: 2928
Epoch: [288] [200/625] eta: 0:14:18 lr: 0.000018 min_lr: 0.000018 loss: 2.6787 (2.6751) class_acc: 0.5938 (0.6003) weight_decay: 0.0500 (0.0500) grad_norm: 2.5655 (2.8107) time: 1.8712 data: 0.0005 max mem: 2928
Epoch: [288] [400/625] eta: 0:07:24 lr: 0.000017 min_lr: 0.000017 loss: 2.7081 (2.6745) class_acc: 0.6016 (0.6017) weight_decay: 0.0500 (0.0500) grad_norm: 2.8706 (2.8939) time: 1.7816 data: 0.0005 max mem: 2928
Epoch: [288] [600/625] eta: 0:00:49 lr: 0.000016 min_lr: 0.000016 loss: 2.6662 (2.6764) class_acc: 0.5977 (0.6011) weight_decay: 0.0500 (0.0500) grad_norm: 2.7887 (2.8693) time: 2.0209 data: 0.0006 max mem: 2928
Epoch: [288] [624/625] eta: 0:00:01 lr: 0.000016 min_lr: 0.000016 loss: 2.7278 (2.6776) class_acc: 0.5898 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 2.4043 (2.8600) time: 0.7411 data: 0.0018 max mem: 2928
Epoch: [288] Total time: 0:20:07 (1.9319 s / it)
Averaged stats: lr: 0.000016 min_lr: 0.000016 loss: 2.7278 (2.6808) class_acc: 0.5898 (0.6011) weight_decay: 0.0500 (0.0500) grad_norm: 2.4043 (2.8600)
Test: [ 0/50] eta: 0:10:05 loss: 1.3328 (1.3328) acc1: 66.4000 (66.4000) acc5: 91.2000 (91.2000) time: 12.1174 data: 12.0820 max mem: 2928
Test: [10/50] eta: 0:01:21 loss: 1.2014 (1.2246) acc1: 73.6000 (73.3818) acc5: 91.2000 (90.4727) time: 2.0342 data: 2.0128 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.3817 (1.3747) acc1: 68.8000 (69.3714) acc5: 89.6000 (88.9905) time: 1.0731 data: 1.0534 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.4977 (1.3945) acc1: 67.2000 (68.8516) acc5: 88.0000 (88.5161) time: 1.0252 data: 1.0042 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3696 (1.4138) acc1: 67.2000 (68.1951) acc5: 88.0000 (88.0195) time: 0.6460 data: 0.6245 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4113 (1.4168) acc1: 67.2000 (67.9040) acc5: 87.2000 (87.8720) time: 0.5907 data: 0.5707 max mem: 2928
Test: Total time: 0:00:46 (0.9374 s / it)
* Acc@1 68.806 Acc@5 88.178 loss 1.385
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.81%
Epoch: [289] [ 0/625] eta: 3:26:39 lr: 0.000016 min_lr: 0.000016 loss: 2.8413 (2.8413) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 19.8397 data: 19.2889 max mem: 2928
Epoch: [289] [200/625] eta: 0:14:42 lr: 0.000015 min_lr: 0.000015 loss: 2.6746 (2.6934) class_acc: 0.6016 (0.5988) weight_decay: 0.0500 (0.0500) grad_norm: 2.4621 (2.8588) time: 1.9523 data: 0.0178 max mem: 2928
Epoch: [289] [400/625] eta: 0:07:31 lr: 0.000014 min_lr: 0.000014 loss: 2.6586 (2.6925) class_acc: 0.6055 (0.5993) weight_decay: 0.0500 (0.0500) grad_norm: 2.5300 (2.8845) time: 1.9432 data: 0.0043 max mem: 2928
Epoch: [289] [600/625] eta: 0:00:49 lr: 0.000014 min_lr: 0.000014 loss: 2.6777 (2.6879) class_acc: 0.5898 (0.5999) weight_decay: 0.0500 (0.0500) grad_norm: 2.4942 (2.8763) time: 1.9701 data: 0.0007 max mem: 2928
Epoch: [289] [624/625] eta: 0:00:01 lr: 0.000014 min_lr: 0.000014 loss: 2.6521 (2.6882) class_acc: 0.5977 (0.5996) weight_decay: 0.0500 (0.0500) grad_norm: 2.7526 (2.8755) time: 0.7331 data: 0.0120 max mem: 2928
Epoch: [289] Total time: 0:20:11 (1.9389 s / it)
Averaged stats: lr: 0.000014 min_lr: 0.000014 loss: 2.6521 (2.6813) class_acc: 0.5977 (0.6010) weight_decay: 0.0500 (0.0500) grad_norm: 2.7526 (2.8755)
Test: [ 0/50] eta: 0:09:09 loss: 1.3287 (1.3287) acc1: 66.4000 (66.4000) acc5: 92.0000 (92.0000) time: 10.9822 data: 10.9572 max mem: 2928
Test: [10/50] eta: 0:01:13 loss: 1.2106 (1.2306) acc1: 73.6000 (73.3818) acc5: 91.2000 (90.6182) time: 1.8441 data: 1.8242 max mem: 2928
Test: [20/50] eta: 0:00:43 loss: 1.3901 (1.3761) acc1: 69.6000 (69.7905) acc5: 89.6000 (89.2191) time: 0.9655 data: 0.9463 max mem: 2928
Test: [30/50] eta: 0:00:25 loss: 1.5018 (1.3950) acc1: 66.4000 (68.9806) acc5: 88.8000 (88.6968) time: 0.9590 data: 0.9380 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3728 (1.4133) acc1: 66.4000 (68.2732) acc5: 87.2000 (88.1366) time: 0.8484 data: 0.8241 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4242 (1.4170) acc1: 67.2000 (67.9680) acc5: 87.2000 (87.9520) time: 0.6265 data: 0.6049 max mem: 2928
Test: Total time: 0:00:50 (1.0103 s / it)
* Acc@1 68.856 Acc@5 88.246 loss 1.382
Accuracy of the model on the 50000 test images: 68.9%
Max accuracy: 68.86%
Epoch: [290] [ 0/625] eta: 3:40:22 lr: 0.000014 min_lr: 0.000014 loss: 2.7858 (2.7858) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 21.1559 data: 17.8987 max mem: 2928
Epoch: [290] [200/625] eta: 0:14:25 lr: 0.000013 min_lr: 0.000013 loss: 2.6827 (2.6674) class_acc: 0.6094 (0.6013) weight_decay: 0.0500 (0.0500) grad_norm: 2.6508 (2.8644) time: 2.0875 data: 0.0151 max mem: 2928
Epoch: [290] [400/625] eta: 0:07:26 lr: 0.000012 min_lr: 0.000012 loss: 2.6567 (2.6753) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 2.8170 (2.8953) time: 1.9420 data: 0.0007 max mem: 2928
Epoch: [290] [600/625] eta: 0:00:49 lr: 0.000011 min_lr: 0.000011 loss: 2.6216 (2.6785) class_acc: 0.5977 (0.6004) weight_decay: 0.0500 (0.0500) grad_norm: 2.3132 (2.8436) time: 2.0309 data: 0.0008 max mem: 2928
Epoch: [290] [624/625] eta: 0:00:01 lr: 0.000011 min_lr: 0.000011 loss: 2.6949 (2.6797) class_acc: 0.6016 (0.6002) weight_decay: 0.0500 (0.0500) grad_norm: 2.6061 (2.8500) time: 0.7422 data: 0.0020 max mem: 2928
Epoch: [290] Total time: 0:20:05 (1.9286 s / it)
Averaged stats: lr: 0.000011 min_lr: 0.000011 loss: 2.6949 (2.6799) class_acc: 0.6016 (0.6009) weight_decay: 0.0500 (0.0500) grad_norm: 2.6061 (2.8500)
Test: [ 0/50] eta: 0:08:30 loss: 1.3376 (1.3376) acc1: 65.6000 (65.6000) acc5: 91.2000 (91.2000) time: 10.2041 data: 10.1787 max mem: 2928
Test: [10/50] eta: 0:01:15 loss: 1.2093 (1.2255) acc1: 72.8000 (73.5273) acc5: 91.2000 (90.5455) time: 1.8853 data: 1.8659 max mem: 2928
Test: [20/50] eta: 0:00:43 loss: 1.3743 (1.3709) acc1: 69.6000 (70.0191) acc5: 89.6000 (89.2191) time: 1.0109 data: 0.9911 max mem: 2928
Test: [30/50] eta: 0:00:25 loss: 1.4948 (1.3915) acc1: 67.2000 (69.1613) acc5: 88.8000 (88.7484) time: 0.9406 data: 0.9205 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3753 (1.4111) acc1: 65.6000 (68.4293) acc5: 88.0000 (88.1951) time: 0.8308 data: 0.8118 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4205 (1.4145) acc1: 66.4000 (68.1600) acc5: 87.2000 (87.9680) time: 0.7311 data: 0.7104 max mem: 2928
Test: Total time: 0:00:50 (1.0103 s / it)
* Acc@1 68.848 Acc@5 88.184 loss 1.381
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [291] [ 0/625] eta: 3:42:43 lr: 0.000011 min_lr: 0.000011 loss: 2.8587 (2.8587) class_acc: 0.5469 (0.5469) weight_decay: 0.0500 (0.0500) time: 21.3814 data: 17.1549 max mem: 2928
Epoch: [291] [200/625] eta: 0:13:48 lr: 0.000010 min_lr: 0.000010 loss: 2.7152 (2.6814) class_acc: 0.5938 (0.5983) weight_decay: 0.0500 (0.0500) grad_norm: 2.3529 (2.8354) time: 1.8143 data: 0.0006 max mem: 2928
Epoch: [291] [400/625] eta: 0:07:14 lr: 0.000010 min_lr: 0.000010 loss: 2.6895 (2.6801) class_acc: 0.5977 (0.5992) weight_decay: 0.0500 (0.0500) grad_norm: 2.4969 (2.8554) time: 1.9893 data: 0.0006 max mem: 2928
Epoch: [291] [600/625] eta: 0:00:48 lr: 0.000009 min_lr: 0.000009 loss: 2.6592 (2.6761) class_acc: 0.6016 (0.6012) weight_decay: 0.0500 (0.0500) grad_norm: 3.0971 (2.9058) time: 1.9543 data: 0.0007 max mem: 2928
Epoch: [291] [624/625] eta: 0:00:01 lr: 0.000009 min_lr: 0.000009 loss: 2.6887 (2.6766) class_acc: 0.5977 (0.6012) weight_decay: 0.0500 (0.0500) grad_norm: 2.7518 (2.8979) time: 0.7669 data: 0.0016 max mem: 2928
Epoch: [291] Total time: 0:20:08 (1.9331 s / it)
Averaged stats: lr: 0.000009 min_lr: 0.000009 loss: 2.6887 (2.6786) class_acc: 0.5977 (0.6013) weight_decay: 0.0500 (0.0500) grad_norm: 2.7518 (2.8979)
Test: [ 0/50] eta: 0:10:01 loss: 1.3369 (1.3369) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.0279 data: 11.9919 max mem: 2928
Test: [10/50] eta: 0:01:20 loss: 1.2133 (1.2291) acc1: 72.8000 (73.3091) acc5: 91.2000 (90.2546) time: 2.0021 data: 1.9807 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.3838 (1.3757) acc1: 68.8000 (69.6000) acc5: 88.8000 (89.0667) time: 1.0465 data: 1.0272 max mem: 2928
Test: [30/50] eta: 0:00:28 loss: 1.5077 (1.3953) acc1: 66.4000 (69.0065) acc5: 88.8000 (88.6194) time: 1.0694 data: 1.0496 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.3790 (1.4147) acc1: 66.4000 (68.2927) acc5: 87.2000 (88.1171) time: 0.9021 data: 0.8814 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4154 (1.4183) acc1: 66.4000 (68.0480) acc5: 87.2000 (87.9520) time: 0.8812 data: 0.8615 max mem: 2928
Test: Total time: 0:00:55 (1.1099 s / it)
* Acc@1 68.828 Acc@5 88.200 loss 1.384
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [292] [ 0/625] eta: 3:25:38 lr: 0.000009 min_lr: 0.000009 loss: 2.6794 (2.6794) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 19.7421 data: 19.6194 max mem: 2928
Epoch: [292] [200/625] eta: 0:14:35 lr: 0.000008 min_lr: 0.000008 loss: 2.6573 (2.6768) class_acc: 0.6094 (0.6014) weight_decay: 0.0500 (0.0500) grad_norm: 2.4938 (2.6780) time: 1.8099 data: 0.0156 max mem: 2928
Epoch: [292] [400/625] eta: 0:07:24 lr: 0.000008 min_lr: 0.000008 loss: 2.6593 (2.6772) class_acc: 0.5938 (0.6006) weight_decay: 0.0500 (0.0500) grad_norm: 2.7928 (2.7951) time: 2.0499 data: 0.0006 max mem: 2928
Epoch: [292] [600/625] eta: 0:00:49 lr: 0.000007 min_lr: 0.000007 loss: 2.6311 (2.6752) class_acc: 0.5898 (0.6014) weight_decay: 0.0500 (0.0500) grad_norm: 2.4127 (2.7912) time: 2.1412 data: 0.0005 max mem: 2928
Epoch: [292] [624/625] eta: 0:00:01 lr: 0.000007 min_lr: 0.000007 loss: 2.6551 (2.6758) class_acc: 0.6016 (0.6013) weight_decay: 0.0500 (0.0500) grad_norm: 2.2329 (2.7846) time: 0.7125 data: 0.0013 max mem: 2928
Epoch: [292] Total time: 0:20:08 (1.9339 s / it)
Averaged stats: lr: 0.000007 min_lr: 0.000007 loss: 2.6551 (2.6780) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 2.2329 (2.7846)
Test: [ 0/50] eta: 0:09:29 loss: 1.3263 (1.3263) acc1: 65.6000 (65.6000) acc5: 91.2000 (91.2000) time: 11.3945 data: 11.3689 max mem: 2928
Test: [10/50] eta: 0:01:19 loss: 1.2212 (1.2353) acc1: 73.6000 (73.0909) acc5: 91.2000 (90.3273) time: 1.9804 data: 1.9588 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.4004 (1.3811) acc1: 68.8000 (69.5238) acc5: 89.6000 (89.0667) time: 1.0940 data: 1.0726 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.4957 (1.3997) acc1: 68.0000 (68.9290) acc5: 88.8000 (88.5419) time: 1.0265 data: 1.0051 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3843 (1.4179) acc1: 67.2000 (68.2927) acc5: 88.0000 (88.0976) time: 0.6181 data: 0.5965 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4216 (1.4216) acc1: 65.6000 (67.9200) acc5: 87.2000 (87.9200) time: 0.5584 data: 0.5383 max mem: 2928
Test: Total time: 0:00:46 (0.9350 s / it)
* Acc@1 68.770 Acc@5 88.210 loss 1.389
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [293] [ 0/625] eta: 3:54:21 lr: 0.000007 min_lr: 0.000007 loss: 2.7472 (2.7472) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 22.4990 data: 16.0056 max mem: 2928
Epoch: [293] [200/625] eta: 0:14:25 lr: 0.000007 min_lr: 0.000007 loss: 2.6708 (2.6846) class_acc: 0.6094 (0.6008) weight_decay: 0.0500 (0.0500) grad_norm: 2.5527 (2.8190) time: 1.8553 data: 0.0007 max mem: 2928
Epoch: [293] [400/625] eta: 0:07:25 lr: 0.000006 min_lr: 0.000006 loss: 2.6712 (2.6860) class_acc: 0.6211 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 2.6632 (inf) time: 1.9945 data: 0.0006 max mem: 2928
Epoch: [293] [600/625] eta: 0:00:49 lr: 0.000006 min_lr: 0.000006 loss: 2.7119 (2.6856) class_acc: 0.5820 (0.6008) weight_decay: 0.0500 (0.0500) grad_norm: 2.1901 (inf) time: 2.0840 data: 0.0007 max mem: 2928
Epoch: [293] [624/625] eta: 0:00:01 lr: 0.000006 min_lr: 0.000006 loss: 2.6658 (2.6851) class_acc: 0.5898 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 2.2412 (inf) time: 0.7882 data: 0.0016 max mem: 2928
Epoch: [293] Total time: 0:20:03 (1.9263 s / it)
Averaged stats: lr: 0.000006 min_lr: 0.000006 loss: 2.6658 (2.6794) class_acc: 0.5898 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 2.2412 (inf)
Test: [ 0/50] eta: 0:10:36 loss: 1.3297 (1.3297) acc1: 65.6000 (65.6000) acc5: 91.2000 (91.2000) time: 12.7391 data: 12.7130 max mem: 2928
Test: [10/50] eta: 0:01:28 loss: 1.2089 (1.2266) acc1: 72.8000 (73.0909) acc5: 91.2000 (90.4000) time: 2.2219 data: 2.2026 max mem: 2928
Test: [20/50] eta: 0:00:53 loss: 1.3835 (1.3735) acc1: 68.0000 (69.3714) acc5: 88.8000 (89.1429) time: 1.2317 data: 1.2132 max mem: 2928
Test: [30/50] eta: 0:00:32 loss: 1.4962 (1.3934) acc1: 67.2000 (68.7484) acc5: 88.8000 (88.6194) time: 1.2789 data: 1.2600 max mem: 2928
Test: [40/50] eta: 0:00:13 loss: 1.3787 (1.4113) acc1: 67.2000 (68.1561) acc5: 88.0000 (88.1951) time: 0.9704 data: 0.9517 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4079 (1.4145) acc1: 67.2000 (67.9840) acc5: 87.2000 (88.0000) time: 0.8822 data: 0.8625 max mem: 2928
Test: Total time: 0:00:57 (1.1511 s / it)
* Acc@1 68.782 Acc@5 88.236 loss 1.382
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [294] [ 0/625] eta: 3:38:43 lr: 0.000006 min_lr: 0.000006 loss: 2.8759 (2.8759) class_acc: 0.5391 (0.5391) weight_decay: 0.0500 (0.0500) time: 20.9980 data: 19.6595 max mem: 2928
Epoch: [294] [200/625] eta: 0:14:27 lr: 0.000005 min_lr: 0.000005 loss: 2.6856 (2.6840) class_acc: 0.5977 (0.6011) weight_decay: 0.0500 (0.0500) grad_norm: 2.4792 (2.7672) time: 1.8199 data: 0.0011 max mem: 2928
Epoch: [294] [400/625] eta: 0:07:25 lr: 0.000005 min_lr: 0.000005 loss: 2.6811 (2.6879) class_acc: 0.5977 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 2.6034 (2.7596) time: 1.8411 data: 0.0008 max mem: 2928
Epoch: [294] [600/625] eta: 0:00:49 lr: 0.000004 min_lr: 0.000004 loss: 2.6740 (2.6836) class_acc: 0.6016 (0.6002) weight_decay: 0.0500 (0.0500) grad_norm: 2.4229 (2.7472) time: 1.8682 data: 0.0011 max mem: 2928
Epoch: [294] [624/625] eta: 0:00:01 lr: 0.000004 min_lr: 0.000004 loss: 2.6834 (2.6821) class_acc: 0.6016 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 2.5511 (2.7357) time: 0.7413 data: 0.0016 max mem: 2928
Epoch: [294] Total time: 0:19:59 (1.9187 s / it)
Averaged stats: lr: 0.000004 min_lr: 0.000004 loss: 2.6834 (2.6774) class_acc: 0.6016 (0.6013) weight_decay: 0.0500 (0.0500) grad_norm: 2.5511 (2.7357)
Test: [ 0/50] eta: 0:10:29 loss: 1.3391 (1.3391) acc1: 66.4000 (66.4000) acc5: 92.0000 (92.0000) time: 12.5841 data: 12.5476 max mem: 2928
Test: [10/50] eta: 0:01:12 loss: 1.2171 (1.2311) acc1: 72.8000 (73.0909) acc5: 91.2000 (90.2546) time: 1.8086 data: 1.7875 max mem: 2928
Test: [20/50] eta: 0:00:38 loss: 1.3790 (1.3766) acc1: 69.6000 (69.6381) acc5: 88.8000 (88.9905) time: 0.7248 data: 0.7048 max mem: 2928
Test: [30/50] eta: 0:00:23 loss: 1.5001 (1.3958) acc1: 68.0000 (69.0839) acc5: 88.8000 (88.5161) time: 0.8392 data: 0.8194 max mem: 2928
Test: [40/50] eta: 0:00:10 loss: 1.3794 (1.4129) acc1: 67.2000 (68.3122) acc5: 88.0000 (88.0781) time: 0.8178 data: 0.7981 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4099 (1.4162) acc1: 66.4000 (67.9680) acc5: 87.2000 (87.8720) time: 0.4890 data: 0.4699 max mem: 2928
Test: Total time: 0:00:46 (0.9344 s / it)
* Acc@1 68.786 Acc@5 88.232 loss 1.382
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [295] [ 0/625] eta: 3:28:53 lr: 0.000004 min_lr: 0.000004 loss: 2.8544 (2.8544) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 20.0535 data: 17.1230 max mem: 2928
Epoch: [295] [200/625] eta: 0:14:22 lr: 0.000004 min_lr: 0.000004 loss: 2.6610 (2.6748) class_acc: 0.5977 (0.6009) weight_decay: 0.0500 (0.0500) grad_norm: 2.7520 (2.8132) time: 1.8582 data: 0.0006 max mem: 2928
Epoch: [295] [400/625] eta: 0:07:25 lr: 0.000003 min_lr: 0.000003 loss: 2.7026 (2.6805) class_acc: 0.5859 (0.5997) weight_decay: 0.0500 (0.0500) grad_norm: 2.4308 (2.7594) time: 1.9179 data: 0.0006 max mem: 2928
Epoch: [295] [600/625] eta: 0:00:49 lr: 0.000003 min_lr: 0.000003 loss: 2.6588 (2.6791) class_acc: 0.5977 (0.6001) weight_decay: 0.0500 (0.0500) grad_norm: 2.4203 (2.7308) time: 2.0897 data: 0.0007 max mem: 2928
Epoch: [295] [624/625] eta: 0:00:01 lr: 0.000003 min_lr: 0.000003 loss: 2.6758 (2.6793) class_acc: 0.5938 (0.6002) weight_decay: 0.0500 (0.0500) grad_norm: 2.4045 (2.7226) time: 0.6885 data: 0.0015 max mem: 2928
Epoch: [295] Total time: 0:20:11 (1.9390 s / it)
Averaged stats: lr: 0.000003 min_lr: 0.000003 loss: 2.6758 (2.6792) class_acc: 0.5938 (0.6014) weight_decay: 0.0500 (0.0500) grad_norm: 2.4045 (2.7226)
Test: [ 0/50] eta: 0:10:24 loss: 1.3216 (1.3216) acc1: 67.2000 (67.2000) acc5: 89.6000 (89.6000) time: 12.4948 data: 12.4627 max mem: 2928
Test: [10/50] eta: 0:01:24 loss: 1.2071 (1.2243) acc1: 72.8000 (73.1636) acc5: 90.4000 (90.1091) time: 2.1116 data: 2.0889 max mem: 2928
Test: [20/50] eta: 0:00:50 loss: 1.3845 (1.3711) acc1: 68.8000 (69.8286) acc5: 88.8000 (88.9905) time: 1.1300 data: 1.1098 max mem: 2928
Test: [30/50] eta: 0:00:28 loss: 1.4981 (1.3908) acc1: 67.2000 (69.2129) acc5: 88.8000 (88.3871) time: 1.0090 data: 0.9900 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.3737 (1.4089) acc1: 67.2000 (68.4683) acc5: 87.2000 (87.8829) time: 0.7443 data: 0.7237 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4152 (1.4127) acc1: 67.2000 (68.2080) acc5: 87.2000 (87.7440) time: 0.9513 data: 0.9306 max mem: 2928
Test: Total time: 0:00:57 (1.1432 s / it)
* Acc@1 68.778 Acc@5 88.162 loss 1.380
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [296] [ 0/625] eta: 4:00:00 lr: 0.000003 min_lr: 0.000003 loss: 2.5667 (2.5667) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 23.0405 data: 16.2490 max mem: 2928
Epoch: [296] [200/625] eta: 0:14:42 lr: 0.000003 min_lr: 0.000003 loss: 2.6616 (2.6685) class_acc: 0.5977 (0.6031) weight_decay: 0.0500 (0.0500) grad_norm: 2.3834 (2.7222) time: 1.9687 data: 0.0088 max mem: 2928
Epoch: [296] [400/625] eta: 0:07:32 lr: 0.000002 min_lr: 0.000002 loss: 2.6974 (2.6692) class_acc: 0.5977 (0.6028) weight_decay: 0.0500 (0.0500) grad_norm: 2.4679 (2.6991) time: 2.0311 data: 0.0284 max mem: 2928
Epoch: [296] [600/625] eta: 0:00:49 lr: 0.000002 min_lr: 0.000002 loss: 2.6595 (2.6766) class_acc: 0.6016 (0.6010) weight_decay: 0.0500 (0.0500) grad_norm: 2.3063 (2.7455) time: 2.0483 data: 0.0102 max mem: 2928
Epoch: [296] [624/625] eta: 0:00:01 lr: 0.000002 min_lr: 0.000002 loss: 2.6517 (2.6753) class_acc: 0.6016 (0.6013) weight_decay: 0.0500 (0.0500) grad_norm: 2.3670 (2.7370) time: 0.7539 data: 0.0016 max mem: 2928
Epoch: [296] Total time: 0:20:11 (1.9383 s / it)
Averaged stats: lr: 0.000002 min_lr: 0.000002 loss: 2.6517 (2.6778) class_acc: 0.6016 (0.6014) weight_decay: 0.0500 (0.0500) grad_norm: 2.3670 (2.7370)
Test: [ 0/50] eta: 0:08:50 loss: 1.3327 (1.3327) acc1: 66.4000 (66.4000) acc5: 92.0000 (92.0000) time: 10.6021 data: 10.5676 max mem: 2928
Test: [10/50] eta: 0:01:18 loss: 1.2059 (1.2268) acc1: 72.8000 (73.2364) acc5: 91.2000 (90.5455) time: 1.9548 data: 1.9344 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.3788 (1.3739) acc1: 69.6000 (69.7143) acc5: 88.8000 (89.1429) time: 1.1154 data: 1.0961 max mem: 2928
Test: [30/50] eta: 0:00:25 loss: 1.4987 (1.3939) acc1: 67.2000 (69.0065) acc5: 88.8000 (88.7226) time: 0.9344 data: 0.9151 max mem: 2928
Test: [40/50] eta: 0:00:10 loss: 1.3849 (1.4118) acc1: 67.2000 (68.3512) acc5: 88.0000 (88.1951) time: 0.5183 data: 0.4998 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4114 (1.4146) acc1: 66.4000 (68.0320) acc5: 87.2000 (88.0480) time: 0.4720 data: 0.4534 max mem: 2928
Test: Total time: 0:00:45 (0.9118 s / it)
* Acc@1 68.840 Acc@5 88.248 loss 1.380
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [297] [ 0/625] eta: 3:10:45 lr: 0.000002 min_lr: 0.000002 loss: 2.7148 (2.7148) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 18.3130 data: 16.6817 max mem: 2928
Epoch: [297] [200/625] eta: 0:13:56 lr: 0.000002 min_lr: 0.000002 loss: 2.6638 (2.6785) class_acc: 0.5977 (0.6011) weight_decay: 0.0500 (0.0500) grad_norm: 2.4251 (2.6761) time: 2.1036 data: 0.4350 max mem: 2928
Epoch: [297] [400/625] eta: 0:07:22 lr: 0.000002 min_lr: 0.000002 loss: 2.7218 (2.6803) class_acc: 0.5898 (0.6005) weight_decay: 0.0500 (0.0500) grad_norm: 2.3476 (2.7105) time: 1.9519 data: 0.0009 max mem: 2928
Epoch: [297] [600/625] eta: 0:00:49 lr: 0.000002 min_lr: 0.000002 loss: 2.6509 (2.6761) class_acc: 0.6016 (0.6020) weight_decay: 0.0500 (0.0500) grad_norm: 2.9869 (2.7638) time: 2.1211 data: 0.0010 max mem: 2928
Epoch: [297] [624/625] eta: 0:00:01 lr: 0.000002 min_lr: 0.000002 loss: 2.6145 (2.6755) class_acc: 0.6055 (0.6022) weight_decay: 0.0500 (0.0500) grad_norm: 2.6785 (2.7563) time: 0.7132 data: 0.0037 max mem: 2928
Epoch: [297] Total time: 0:20:02 (1.9242 s / it)
Averaged stats: lr: 0.000002 min_lr: 0.000002 loss: 2.6145 (2.6766) class_acc: 0.6055 (0.6021) weight_decay: 0.0500 (0.0500) grad_norm: 2.6785 (2.7563)
Test: [ 0/50] eta: 0:10:29 loss: 1.3225 (1.3225) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.5910 data: 12.5570 max mem: 2928
Test: [10/50] eta: 0:01:28 loss: 1.2055 (1.2277) acc1: 72.8000 (73.0182) acc5: 91.2000 (90.5455) time: 2.2065 data: 2.1870 max mem: 2928
Test: [20/50] eta: 0:00:53 loss: 1.3894 (1.3742) acc1: 69.6000 (69.5619) acc5: 90.4000 (89.1810) time: 1.2540 data: 1.2355 max mem: 2928
Test: [30/50] eta: 0:00:32 loss: 1.4979 (1.3939) acc1: 68.0000 (68.9290) acc5: 88.8000 (88.6194) time: 1.2952 data: 1.2756 max mem: 2928
Test: [40/50] eta: 0:00:13 loss: 1.3860 (1.4120) acc1: 68.0000 (68.2341) acc5: 88.0000 (88.1561) time: 0.9179 data: 0.8975 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4148 (1.4150) acc1: 66.4000 (67.9840) acc5: 87.2000 (87.9680) time: 0.8327 data: 0.8112 max mem: 2928
Test: Total time: 0:00:56 (1.1392 s / it)
* Acc@1 68.774 Acc@5 88.224 loss 1.382
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [298] [ 0/625] eta: 3:39:39 lr: 0.000002 min_lr: 0.000002 loss: 2.6401 (2.6401) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 21.0868 data: 18.8524 max mem: 2928
Epoch: [298] [200/625] eta: 0:13:59 lr: 0.000001 min_lr: 0.000001 loss: 2.6615 (2.6802) class_acc: 0.6055 (0.6003) weight_decay: 0.0500 (0.0500) grad_norm: 2.6550 (2.6390) time: 1.9461 data: 0.4480 max mem: 2928
Epoch: [298] [400/625] eta: 0:07:17 lr: 0.000001 min_lr: 0.000001 loss: 2.6597 (2.6818) class_acc: 0.6055 (0.5996) weight_decay: 0.0500 (0.0500) grad_norm: 2.5365 (2.7021) time: 2.0119 data: 0.0005 max mem: 2928
Epoch: [298] [600/625] eta: 0:00:48 lr: 0.000001 min_lr: 0.000001 loss: 2.5863 (2.6811) class_acc: 0.6250 (0.5999) weight_decay: 0.0500 (0.0500) grad_norm: 2.6566 (2.6847) time: 2.0652 data: 0.0006 max mem: 2928
Epoch: [298] [624/625] eta: 0:00:01 lr: 0.000001 min_lr: 0.000001 loss: 2.6654 (2.6816) class_acc: 0.5977 (0.5996) weight_decay: 0.0500 (0.0500) grad_norm: 2.6008 (2.6869) time: 0.4461 data: 0.0017 max mem: 2928
Epoch: [298] Total time: 0:19:59 (1.9197 s / it)
Averaged stats: lr: 0.000001 min_lr: 0.000001 loss: 2.6654 (2.6778) class_acc: 0.5977 (0.6015) weight_decay: 0.0500 (0.0500) grad_norm: 2.6008 (2.6869)
Test: [ 0/50] eta: 0:10:08 loss: 1.3397 (1.3397) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.1776 data: 12.1515 max mem: 2928
Test: [10/50] eta: 0:01:20 loss: 1.2115 (1.2323) acc1: 72.8000 (73.3818) acc5: 91.2000 (90.4727) time: 2.0163 data: 1.9936 max mem: 2928
Test: [20/50] eta: 0:00:47 loss: 1.3832 (1.3786) acc1: 70.4000 (69.9429) acc5: 89.6000 (89.1429) time: 1.0541 data: 1.0322 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.4972 (1.3980) acc1: 68.0000 (69.2903) acc5: 88.8000 (88.6194) time: 1.0158 data: 0.9955 max mem: 2928
Test: [40/50] eta: 0:00:11 loss: 1.3818 (1.4153) acc1: 68.0000 (68.5659) acc5: 88.0000 (88.1756) time: 0.6660 data: 0.6472 max mem: 2928
Test: [49/50] eta: 0:00:00 loss: 1.4170 (1.4186) acc1: 67.2000 (68.2880) acc5: 87.2000 (87.9520) time: 0.6053 data: 0.5834 max mem: 2928
Test: Total time: 0:00:47 (0.9452 s / it)
* Acc@1 68.832 Acc@5 88.186 loss 1.385
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Epoch: [299] [ 0/625] eta: 3:18:11 lr: 0.000001 min_lr: 0.000001 loss: 2.3921 (2.3921) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 19.0265 data: 18.4969 max mem: 2928
Epoch: [299] [200/625] eta: 0:13:50 lr: 0.000001 min_lr: 0.000001 loss: 2.7380 (2.6751) class_acc: 0.5898 (0.6017) weight_decay: 0.0500 (0.0500) grad_norm: 2.5679 (2.6599) time: 1.7346 data: 0.0006 max mem: 2928
Epoch: [299] [400/625] eta: 0:07:18 lr: 0.000001 min_lr: 0.000001 loss: 2.7213 (2.6777) class_acc: 0.5977 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 2.6202 (2.7620) time: 1.8885 data: 0.0006 max mem: 2928
Epoch: [299] [600/625] eta: 0:00:48 lr: 0.000001 min_lr: 0.000001 loss: 2.6587 (2.6767) class_acc: 0.6172 (0.6021) weight_decay: 0.0500 (0.0500) grad_norm: 2.4148 (2.7151) time: 1.9842 data: 0.0005 max mem: 2928
Epoch: [299] [624/625] eta: 0:00:01 lr: 0.000001 min_lr: 0.000001 loss: 2.7043 (2.6765) class_acc: 0.6016 (0.6023) weight_decay: 0.0500 (0.0500) grad_norm: 2.5488 (2.7135) time: 0.9958 data: 0.0012 max mem: 2928
Epoch: [299] Total time: 0:19:54 (1.9115 s / it)
Averaged stats: lr: 0.000001 min_lr: 0.000001 loss: 2.7043 (2.6769) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 2.5488 (2.7135)
Test: [ 0/50] eta: 0:10:15 loss: 1.3242 (1.3242) acc1: 67.2000 (67.2000) acc5: 92.0000 (92.0000) time: 12.3172 data: 12.2858 max mem: 2928
Test: [10/50] eta: 0:01:20 loss: 1.2083 (1.2299) acc1: 72.8000 (73.3091) acc5: 91.2000 (90.4000) time: 2.0181 data: 1.9960 max mem: 2928
Test: [20/50] eta: 0:00:46 loss: 1.3931 (1.3753) acc1: 68.8000 (69.6000) acc5: 90.4000 (89.0667) time: 1.0217 data: 1.0017 max mem: 2928
Test: [30/50] eta: 0:00:27 loss: 1.5069 (1.3959) acc1: 67.2000 (68.8774) acc5: 88.8000 (88.6452) time: 1.0302 data: 1.0113 max mem: 2928
Test: [40/50] eta: 0:00:12 loss: 1.3897 (1.4142) acc1: 67.2000 (68.1756) acc5: 88.0000 (88.1366) time: 0.9157 data: 0.8972 max mem: 2928
Test: [49/50] eta: 0:00:01 loss: 1.4096 (1.4175) acc1: 67.2000 (67.9520) acc5: 87.2000 (87.9520) time: 0.8629 data: 0.8446 max mem: 2928
Test: Total time: 0:00:56 (1.1277 s / it)
* Acc@1 68.808 Acc@5 88.270 loss 1.382
Accuracy of the model on the 50000 test images: 68.8%
Max accuracy: 68.86%
Training time 12:06:45