WARNING:__main__:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
| distributed init (rank 1): env://, gpu 1
| distributed init (rank 2): env://, gpu 2
| distributed init (rank 0): env://, gpu 0
| distributed init (rank 7): env://, gpu 7
| distributed init (rank 6): env://, gpu 6
| distributed init (rank 4): env://, gpu 4
| distributed init (rank 5): env://, gpu 5
| distributed init (rank 3): env://, gpu 3
Namespace(aa='rand-m9-mstd0.5-inc1', auto_resume=True, batch_size=256, clip_grad=None, color_jitter=0.4, crop_pct=None, cutmix=0.0, cutmix_minmax=None, data_path='/data/benchmarks/ILSVRC2012_LMDB', data_set='IMNET_LMDB', device='cuda', disable_eval=False, dist_backend='nccl', dist_eval=True, dist_on_itp=False, dist_url='env://', distributed=True, drop_path=0.2, enable_wandb=False, epochs=300, eval=False, eval_data_path=None, finetune='', gpu=0, head_init_scale=1.0, imagenet_default_mean_and_std=True, input_size=224, layer_decay=1.0, layer_scale_init_value=1e-06, local_rank=-1, log_dir=None, lr=0.004, min_lr=1e-06, mixup=0.0, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='convnext_tiny', model_ema=False, model_ema_decay=0.9999, model_ema_eval=False, model_ema_force_cpu=False, model_key='model|module', model_prefix='', momentum=0.9, nb_classes=1000, num_workers=10, opt='adamw', opt_betas=None, opt_eps=1e-08, output_dir='./checkpoint', pin_mem=True, project='convnext', rank=0, recount=1, remode='pixel', reprob=0.25, resplit=False, resume='', save_ckpt=True, save_ckpt_freq=1, save_ckpt_num=3, seed=0, smoothing=0.1, start_epoch=0, train_interpolation='bicubic', update_freq=2, use_amp=True, wandb_ckpt=False, warmup_epochs=20, warmup_steps=-1, weight_decay=0.05, weight_decay_end=None, world_size=8)
Transform =
RandomResizedCropAndInterpolation(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=bicubic)
RandomHorizontalFlip(p=0.5)
RandAugment(n=2, ops=
AugmentOp(name=AutoContrast, p=0.5, m=9, mstd=0.5)
AugmentOp(name=Equalize, p=0.5, m=9, mstd=0.5)
AugmentOp(name=Invert, p=0.5, m=9, mstd=0.5)
AugmentOp(name=Rotate, p=0.5, m=9, mstd=0.5)
AugmentOp(name=PosterizeIncreasing, p=0.5, m=9, mstd=0.5)
AugmentOp(name=SolarizeIncreasing, p=0.5, m=9, mstd=0.5)
AugmentOp(name=SolarizeAdd, p=0.5, m=9, mstd=0.5)
AugmentOp(name=ColorIncreasing, p=0.5, m=9, mstd=0.5)
AugmentOp(name=ContrastIncreasing, p=0.5, m=9, mstd=0.5)
AugmentOp(name=BrightnessIncreasing, p=0.5, m=9, mstd=0.5)
AugmentOp(name=SharpnessIncreasing, p=0.5, m=9, mstd=0.5)
AugmentOp(name=ShearX, p=0.5, m=9, mstd=0.5)
AugmentOp(name=ShearY, p=0.5, m=9, mstd=0.5)
AugmentOp(name=TranslateXRel, p=0.5, m=9, mstd=0.5)
AugmentOp(name=TranslateYRel, p=0.5, m=9, mstd=0.5))
ToTensor()
Normalize(mean=tensor([0.4850, 0.4560, 0.4060]), std=tensor([0.2290, 0.2240, 0.2250]))
RandomErasing(p=0.25, mode=pixel, count=(1, 1))
---------------------------
reading from datapath /data/benchmarks/ILSVRC2012_LMDB
Number of the class = 1000
Transform =
Resize(size=256, interpolation=bicubic, max_size=None, antialias=None)
CenterCrop(size=(224, 224))
ToTensor()
Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
---------------------------
reading from datapath /data/benchmarks/ILSVRC2012_LMDB
Number of the class = 1000
Sampler_train = <torch.utils.data.distributed.DistributedSampler object at 0x7fea802e9c40>
Model = MobileNetV3_Large(
(conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs1): Hardswish()
(bneck): Sequential(
(0): Block(
(conv1): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=16, bias=False)
(bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv3): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
)
(1): Block(
(conv1): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=64, bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv3): Conv2d(64, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
(skip): Sequential(
(0): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=16, bias=False)
(1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(16, 24, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(2): Block(
(conv1): Conv2d(24, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(72, 72, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=72, bias=False)
(bn2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv3): Conv2d(72, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
)
(3): Block(
(conv1): Conv2d(24, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(72, 72, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=72, bias=False)
(bn2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(72, 18, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(18, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(18, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(72, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
(skip): Sequential(
(0): Conv2d(24, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=24, bias=False)
(1): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(24, 40, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(4): Block(
(conv1): Conv2d(40, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(120, 120, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=120, bias=False)
(bn2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(120, 30, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(30, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(30, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(120, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
)
(5): Block(
(conv1): Conv2d(40, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv2): Conv2d(120, 120, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=120, bias=False)
(bn2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(120, 30, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(30, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(30, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(120, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): ReLU(inplace=True)
)
(6): Block(
(conv1): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(240, 240, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=240, bias=False)
(bn2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv3): Conv2d(240, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(40, 40, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=40, bias=False)
(1): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(40, 80, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Block(
(conv1): Conv2d(80, 200, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=200, bias=False)
(bn2): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv3): Conv2d(200, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(8): Block(
(conv1): Conv2d(80, 184, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(184, 184, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=184, bias=False)
(bn2): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv3): Conv2d(184, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(9): Block(
(conv1): Conv2d(80, 184, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(184, 184, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=184, bias=False)
(bn2): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv3): Conv2d(184, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(10): Block(
(conv1): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(480, 480, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=480, bias=False)
(bn2): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(480, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(120, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(480, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(80, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(11): Block(
(conv1): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(672, 672, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=672, bias=False)
(bn2): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(672, 168, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(168, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(168, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(672, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(12): Block(
(conv1): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(672, 672, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=672, bias=False)
(bn2): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(672, 168, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(168, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(168, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(672, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
(skip): Sequential(
(0): Conv2d(112, 112, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=112, bias=False)
(1): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): Conv2d(112, 160, kernel_size=(1, 1), stride=(1, 1))
(3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(13): Block(
(conv1): Conv2d(160, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(672, 672, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=672, bias=False)
(bn2): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(672, 168, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(168, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(168, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(672, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
(14): Block(
(conv1): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv2): Conv2d(960, 960, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=960, bias=False)
(bn2): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SeModule(
(se): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2d(960, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace=True)
(4): Conv2d(240, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(5): Hardsigmoid()
)
)
(conv3): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act3): Hardswish()
)
)
(conv2): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn2): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs2): Hardswish()
(gap): AdaptiveAvgPool2d(output_size=1)
(linear3): Linear(in_features=960, out_features=1280, bias=False)
(bn3): BatchNorm1d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(hs3): Hardswish()
(drop): Dropout(p=0.2, inplace=False)
(linear4): Linear(in_features=1280, out_features=1000, bias=True)
)
number of params: 5178732
LR = 0.00400000
Batch size = 4096
Update frequent = 2
Number of training examples = 1281167
Number of training training per epoch = 312
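Note: the batch size and step counts reported above are consistent with the arguments in the Namespace line. The following is a minimal sketch of that arithmetic, assuming the effective batch size is per-GPU batch size x world size x update frequency and that the final partial batch is dropped. The variable names are illustrative and are not taken from the training script.

```python
# Values copied from the Namespace line and the summary lines in this log.
batch_size_per_gpu = 256     # batch_size=256
world_size = 8               # world_size=8
update_freq = 2              # update_freq=2 (gradient accumulation)
num_train_examples = 1281167

# Samples consumed per optimizer update across all GPUs.
total_batch_size = batch_size_per_gpu * world_size * update_freq

# Optimizer updates per epoch (assumes the last partial batch is dropped).
steps_per_epoch = num_train_examples // total_batch_size

# Data-loader iterations per epoch, i.e. before gradient accumulation.
iters_per_epoch = num_train_examples // (batch_size_per_gpu * world_size)

print(total_batch_size, steps_per_epoch, iters_per_epoch)
```

Under these assumptions the three numbers agree with "Batch size = 4096", the 312 steps per epoch, and the 625 iterations shown in each "Epoch: [n] [ i/625]" line.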
Param groups = {
"decay": {
"weight_decay": 0.05,
"params": [
"conv1.weight",
"bneck.0.conv1.weight",
"bneck.0.conv2.weight",
"bneck.0.conv3.weight",
"bneck.1.conv1.weight",
"bneck.1.conv2.weight",
"bneck.1.conv3.weight",
"bneck.1.skip.0.weight",
"bneck.1.skip.2.weight",
"bneck.2.conv1.weight",
"bneck.2.conv2.weight",
"bneck.2.conv3.weight",
"bneck.3.conv1.weight",
"bneck.3.conv2.weight",
"bneck.3.se.se.1.weight",
"bneck.3.se.se.4.weight",
"bneck.3.conv3.weight",
"bneck.3.skip.0.weight",
"bneck.3.skip.2.weight",
"bneck.4.conv1.weight",
"bneck.4.conv2.weight",
"bneck.4.se.se.1.weight",
"bneck.4.se.se.4.weight",
"bneck.4.conv3.weight",
"bneck.5.conv1.weight",
"bneck.5.conv2.weight",
"bneck.5.se.se.1.weight",
"bneck.5.se.se.4.weight",
"bneck.5.conv3.weight",
"bneck.6.conv1.weight",
"bneck.6.conv2.weight",
"bneck.6.conv3.weight",
"bneck.6.skip.0.weight",
"bneck.6.skip.2.weight",
"bneck.7.conv1.weight",
"bneck.7.conv2.weight",
"bneck.7.conv3.weight",
"bneck.8.conv1.weight",
"bneck.8.conv2.weight",
"bneck.8.conv3.weight",
"bneck.9.conv1.weight",
"bneck.9.conv2.weight",
"bneck.9.conv3.weight",
"bneck.10.conv1.weight",
"bneck.10.conv2.weight",
"bneck.10.se.se.1.weight",
"bneck.10.se.se.4.weight",
"bneck.10.conv3.weight",
"bneck.10.skip.0.weight",
"bneck.11.conv1.weight",
"bneck.11.conv2.weight",
"bneck.11.se.se.1.weight",
"bneck.11.se.se.4.weight",
"bneck.11.conv3.weight",
"bneck.12.conv1.weight",
"bneck.12.conv2.weight",
"bneck.12.se.se.1.weight",
"bneck.12.se.se.4.weight",
"bneck.12.conv3.weight",
"bneck.12.skip.0.weight",
"bneck.12.skip.2.weight",
"bneck.13.conv1.weight",
"bneck.13.conv2.weight",
"bneck.13.se.se.1.weight",
"bneck.13.se.se.4.weight",
"bneck.13.conv3.weight",
"bneck.14.conv1.weight",
"bneck.14.conv2.weight",
"bneck.14.se.se.1.weight",
"bneck.14.se.se.4.weight",
"bneck.14.conv3.weight",
"conv2.weight",
"linear3.weight",
"linear4.weight"
],
"lr_scale": 1.0
},
"no_decay": {
"weight_decay": 0.0,
"params": [
"bn1.weight",
"bn1.bias",
"bneck.0.bn1.weight",
"bneck.0.bn1.bias",
"bneck.0.bn2.weight",
"bneck.0.bn2.bias",
"bneck.0.bn3.weight",
"bneck.0.bn3.bias",
"bneck.1.bn1.weight",
"bneck.1.bn1.bias",
"bneck.1.bn2.weight",
"bneck.1.bn2.bias",
"bneck.1.bn3.weight",
"bneck.1.bn3.bias",
"bneck.1.skip.1.weight",
"bneck.1.skip.1.bias",
"bneck.1.skip.2.bias",
"bneck.1.skip.3.weight",
"bneck.1.skip.3.bias",
"bneck.2.bn1.weight",
"bneck.2.bn1.bias",
"bneck.2.bn2.weight",
"bneck.2.bn2.bias",
"bneck.2.bn3.weight",
"bneck.2.bn3.bias",
"bneck.3.bn1.weight",
"bneck.3.bn1.bias",
"bneck.3.bn2.weight",
"bneck.3.bn2.bias",
"bneck.3.se.se.2.weight",
"bneck.3.se.se.2.bias",
"bneck.3.bn3.weight",
"bneck.3.bn3.bias",
"bneck.3.skip.1.weight",
"bneck.3.skip.1.bias",
"bneck.3.skip.2.bias",
"bneck.3.skip.3.weight",
"bneck.3.skip.3.bias",
"bneck.4.bn1.weight",
"bneck.4.bn1.bias",
"bneck.4.bn2.weight",
"bneck.4.bn2.bias",
"bneck.4.se.se.2.weight",
"bneck.4.se.se.2.bias",
"bneck.4.bn3.weight",
"bneck.4.bn3.bias",
"bneck.5.bn1.weight",
"bneck.5.bn1.bias",
"bneck.5.bn2.weight",
"bneck.5.bn2.bias",
"bneck.5.se.se.2.weight",
"bneck.5.se.se.2.bias",
"bneck.5.bn3.weight",
"bneck.5.bn3.bias",
"bneck.6.bn1.weight",
"bneck.6.bn1.bias",
"bneck.6.bn2.weight",
"bneck.6.bn2.bias",
"bneck.6.bn3.weight",
"bneck.6.bn3.bias",
"bneck.6.skip.1.weight",
"bneck.6.skip.1.bias",
"bneck.6.skip.2.bias",
"bneck.6.skip.3.weight",
"bneck.6.skip.3.bias",
"bneck.7.bn1.weight",
"bneck.7.bn1.bias",
"bneck.7.bn2.weight",
"bneck.7.bn2.bias",
"bneck.7.bn3.weight",
"bneck.7.bn3.bias",
"bneck.8.bn1.weight",
"bneck.8.bn1.bias",
"bneck.8.bn2.weight",
"bneck.8.bn2.bias",
"bneck.8.bn3.weight",
"bneck.8.bn3.bias",
"bneck.9.bn1.weight",
"bneck.9.bn1.bias",
"bneck.9.bn2.weight",
"bneck.9.bn2.bias",
"bneck.9.bn3.weight",
"bneck.9.bn3.bias",
"bneck.10.bn1.weight",
"bneck.10.bn1.bias",
"bneck.10.bn2.weight",
"bneck.10.bn2.bias",
"bneck.10.se.se.2.weight",
"bneck.10.se.se.2.bias",
"bneck.10.bn3.weight",
"bneck.10.bn3.bias",
"bneck.10.skip.1.weight",
"bneck.10.skip.1.bias",
"bneck.11.bn1.weight",
"bneck.11.bn1.bias",
"bneck.11.bn2.weight",
"bneck.11.bn2.bias",
"bneck.11.se.se.2.weight",
"bneck.11.se.se.2.bias",
"bneck.11.bn3.weight",
"bneck.11.bn3.bias",
"bneck.12.bn1.weight",
"bneck.12.bn1.bias",
"bneck.12.bn2.weight",
"bneck.12.bn2.bias",
"bneck.12.se.se.2.weight",
"bneck.12.se.se.2.bias",
"bneck.12.bn3.weight",
"bneck.12.bn3.bias",
"bneck.12.skip.1.weight",
"bneck.12.skip.1.bias",
"bneck.12.skip.2.bias",
"bneck.12.skip.3.weight",
"bneck.12.skip.3.bias",
"bneck.13.bn1.weight",
"bneck.13.bn1.bias",
"bneck.13.bn2.weight",
"bneck.13.bn2.bias",
"bneck.13.se.se.2.weight",
"bneck.13.se.se.2.bias",
"bneck.13.bn3.weight",
"bneck.13.bn3.bias",
"bneck.14.bn1.weight",
"bneck.14.bn1.bias",
"bneck.14.bn2.weight",
"bneck.14.bn2.bias",
"bneck.14.se.se.2.weight",
"bneck.14.se.se.2.bias",
"bneck.14.bn3.weight",
"bneck.14.bn3.bias",
"bn2.weight",
"bn2.bias",
"bn3.weight",
"bn3.bias",
"linear4.bias"
],
"lr_scale": 1.0
}
}
Use Cosine LR scheduler
Set warmup steps = 6240
Set warmup steps = 0
Max WD = 0.0500000, Min WD = 0.0500000
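Note: the first "Set warmup steps" value matches warmup_epochs=20 times the 312 optimizer steps per epoch; the second line (0) presumably belongs to the weight-decay schedule, which stays constant at 0.05 in this run. Below is a rough sketch of the learning-rate warmup, assuming a linear ramp from 0 to the peak LR of 0.004. The helper name `warmup_lr` is hypothetical, and the exact interpolation used by the training script may differ slightly.

```python
# Values copied from this log.
warmup_epochs = 20
steps_per_epoch = 312
base_lr = 0.004
update_freq = 2

warmup_steps = warmup_epochs * steps_per_epoch


def warmup_lr(data_iter, epoch=0):
    """Approximate LR during linear warmup for a data-loader iteration.

    Assumption: one optimizer step is taken every `update_freq` iterations,
    and the LR grows linearly with the optimizer step count.
    """
    step = epoch * steps_per_epoch + data_iter // update_freq
    return base_lr * min(step, warmup_steps) / warmup_steps


print(warmup_steps)
print(f"{warmup_lr(200):.6f}")           # compare with epoch 0, iteration 200
print(f"{warmup_lr(200, epoch=1):.6f}")  # compare with epoch 1, iteration 200
```

The values this sketch produces can be compared against the lr column of the training lines that follow; small differences in the last digit are expected if the script interpolates over a slightly different number of points.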
criterion = LabelSmoothingCrossEntropy()
Auto resume checkpoint:
Start training for 300 epochs
Epoch: [0] [ 0/625] eta: 5:56:07 lr: 0.000000 min_lr: 0.000000 loss: 6.9095 (6.9095) class_acc: 0.0000 (0.0000) weight_decay: 0.0500 (0.0500) time: 34.1885 data: 19.7940 max mem: 6925
Epoch: [0] [200/625] eta: 0:15:18 lr: 0.000064 min_lr: 0.000064 loss: 6.8804 (6.8977) class_acc: 0.0000 (0.0015) weight_decay: 0.0500 (0.0500) grad_norm: 0.5480 (0.5217) time: 2.0561 data: 0.0005 max mem: 6925
Epoch: [0] [400/625] eta: 0:07:56 lr: 0.000128 min_lr: 0.000128 loss: 6.7886 (6.8663) class_acc: 0.0039 (0.0020) weight_decay: 0.0500 (0.0500) grad_norm: 0.9387 (0.6578) time: 2.1968 data: 0.0007 max mem: 6925
Epoch: [0] [600/625] eta: 0:00:52 lr: 0.000192 min_lr: 0.000192 loss: 6.6098 (6.8063) class_acc: 0.0078 (0.0031) weight_decay: 0.0500 (0.0500) grad_norm: 1.2534 (0.8297) time: 1.8978 data: 0.0007 max mem: 6925
Epoch: [0] [624/625] eta: 0:00:02 lr: 0.000199 min_lr: 0.000199 loss: 6.5938 (6.7985) class_acc: 0.0078 (0.0033) weight_decay: 0.0500 (0.0500) grad_norm: 1.2840 (0.8490) time: 1.0091 data: 0.0020 max mem: 6925
Epoch: [0] Total time: 0:21:35 (2.0729 s / it)
Averaged stats: lr: 0.000199 min_lr: 0.000199 loss: 6.5938 (6.7976) class_acc: 0.0078 (0.0035) weight_decay: 0.0500 (0.0500) grad_norm: 1.2840 (0.8490)
Test: [ 0/50] eta: 0:11:57 loss: 6.2253 (6.2253) acc1: 0.8000 (0.8000) acc5: 1.6000 (1.6000) time: 14.3473 data: 12.3008 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 6.2876 (6.2817) acc1: 0.8000 (1.7455) acc5: 4.0000 (4.3636) time: 2.1955 data: 1.9836 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 6.3267 (6.3014) acc1: 0.8000 (1.7524) acc5: 4.0000 (4.8000) time: 1.1468 data: 1.1177 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 6.2951 (6.2799) acc1: 0.8000 (1.7548) acc5: 6.4000 (5.4710) time: 1.2733 data: 1.2435 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 6.2815 (6.2820) acc1: 0.8000 (1.7561) acc5: 5.6000 (5.3854) time: 0.9462 data: 0.9165 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 6.2896 (6.2860) acc1: 0.8000 (1.7120) acc5: 4.0000 (5.3280) time: 0.8818 data: 0.8531 max mem: 6925
Test: Total time: 0:00:57 (1.1422 s / it)
* Acc@1 1.482 Acc@5 5.464 loss 6.276
Accuracy of the model on the 50000 test images: 1.5%
Max accuracy: 1.48%
Epoch: [1] [ 0/625] eta: 3:12:22 lr: 0.000200 min_lr: 0.000200 loss: 6.5556 (6.5556) class_acc: 0.0078 (0.0078) weight_decay: 0.0500 (0.0500) time: 18.4673 data: 15.9101 max mem: 6925
Epoch: [1] [200/625] eta: 0:13:36 lr: 0.000264 min_lr: 0.000264 loss: 6.3951 (6.4803) class_acc: 0.0195 (0.0132) weight_decay: 0.0500 (0.0500) grad_norm: 1.3826 (1.3894) time: 1.7949 data: 0.1612 max mem: 6925
Epoch: [1] [400/625] eta: 0:07:10 lr: 0.000328 min_lr: 0.000328 loss: 6.2116 (6.3916) class_acc: 0.0234 (0.0178) weight_decay: 0.0500 (0.0500) grad_norm: 1.4349 (1.3960) time: 1.9017 data: 0.0011 max mem: 6925
Epoch: [1] [600/625] eta: 0:00:48 lr: 0.000392 min_lr: 0.000392 loss: 6.0291 (6.3028) class_acc: 0.0430 (0.0230) weight_decay: 0.0500 (0.0500) grad_norm: 1.4847 (1.4249) time: 1.7438 data: 0.0008 max mem: 6925
Epoch: [1] [624/625] eta: 0:00:01 lr: 0.000399 min_lr: 0.000399 loss: 6.0516 (6.2934) class_acc: 0.0391 (0.0236) weight_decay: 0.0500 (0.0500) grad_norm: 1.5434 (1.4315) time: 0.5599 data: 0.0019 max mem: 6925
Epoch: [1] Total time: 0:19:46 (1.8979 s / it)
Averaged stats: lr: 0.000399 min_lr: 0.000399 loss: 6.0516 (6.2915) class_acc: 0.0391 (0.0231) weight_decay: 0.0500 (0.0500) grad_norm: 1.5434 (1.4315)
Test: [ 0/50] eta: 0:10:03 loss: 5.2781 (5.2781) acc1: 7.2000 (7.2000) acc5: 16.8000 (16.8000) time: 12.0729 data: 12.0404 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 5.2781 (5.2251) acc1: 8.0000 (8.6545) acc5: 22.4000 (22.9818) time: 2.2596 data: 2.2302 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 5.2521 (5.2607) acc1: 6.4000 (7.5048) acc5: 20.8000 (20.9524) time: 1.2508 data: 1.2220 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 5.2521 (5.2644) acc1: 6.4000 (7.5613) acc5: 17.6000 (20.9032) time: 1.0913 data: 1.0628 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 5.2966 (5.2780) acc1: 8.0000 (7.4927) acc5: 19.2000 (20.3317) time: 0.7001 data: 0.6713 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 5.3171 (5.2957) acc1: 6.4000 (7.2320) acc5: 19.2000 (20.0480) time: 0.6051 data: 0.5763 max mem: 6925
Test: Total time: 0:00:52 (1.0436 s / it)
* Acc@1 7.512 Acc@5 20.642 loss 5.264
Accuracy of the model on the 50000 test images: 7.5%
Max accuracy: 7.51%
Epoch: [2] [ 0/625] eta: 3:50:46 lr: 0.000400 min_lr: 0.000400 loss: 6.0365 (6.0365) class_acc: 0.0352 (0.0352) weight_decay: 0.0500 (0.0500) time: 22.1549 data: 16.8647 max mem: 6925
Epoch: [2] [200/625] eta: 0:13:42 lr: 0.000464 min_lr: 0.000464 loss: 5.8730 (5.9224) class_acc: 0.0547 (0.0480) weight_decay: 0.0500 (0.0500) grad_norm: 1.5818 (1.5650) time: 1.7536 data: 0.0009 max mem: 6925
Epoch: [2] [400/625] eta: 0:07:08 lr: 0.000528 min_lr: 0.000528 loss: 5.6966 (5.8369) class_acc: 0.0703 (0.0567) weight_decay: 0.0500 (0.0500) grad_norm: 1.5028 (1.5539) time: 1.8044 data: 0.0007 max mem: 6925
Epoch: [2] [600/625] eta: 0:00:47 lr: 0.000592 min_lr: 0.000592 loss: 5.5131 (5.7568) class_acc: 0.0859 (0.0653) weight_decay: 0.0500 (0.0500) grad_norm: 1.6248 (1.5674) time: 1.8818 data: 0.0010 max mem: 6925
Epoch: [2] [624/625] eta: 0:00:01 lr: 0.000599 min_lr: 0.000599 loss: 5.5190 (5.7481) class_acc: 0.0898 (0.0662) weight_decay: 0.0500 (0.0500) grad_norm: 1.5775 (1.5696) time: 1.1627 data: 0.0016 max mem: 6925
Epoch: [2] Total time: 0:19:30 (1.8721 s / it)
Averaged stats: lr: 0.000599 min_lr: 0.000599 loss: 5.5190 (5.7496) class_acc: 0.0898 (0.0660) weight_decay: 0.0500 (0.0500) grad_norm: 1.5775 (1.5696)
Test: [ 0/50] eta: 0:10:10 loss: 4.4641 (4.4641) acc1: 16.8000 (16.8000) acc5: 36.8000 (36.8000) time: 12.2086 data: 12.1381 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 4.4238 (4.4051) acc1: 17.6000 (17.4545) acc5: 36.8000 (37.3818) time: 2.0823 data: 2.0499 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 4.5066 (4.4858) acc1: 13.6000 (14.9714) acc5: 32.8000 (34.5905) time: 1.0871 data: 1.0582 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 4.5066 (4.4635) acc1: 13.6000 (15.3290) acc5: 35.2000 (35.5613) time: 1.0285 data: 0.9997 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 4.5031 (4.4885) acc1: 15.2000 (15.2195) acc5: 35.2000 (34.7902) time: 0.6810 data: 0.6509 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 4.5565 (4.5106) acc1: 13.6000 (15.0240) acc5: 33.6000 (34.2880) time: 0.6760 data: 0.6447 max mem: 6925
Test: Total time: 0:00:48 (0.9725 s / it)
* Acc@1 15.342 Acc@5 34.758 loss 4.479
Accuracy of the model on the 50000 test images: 15.3%
Max accuracy: 15.34%
Epoch: [3] [ 0/625] eta: 3:26:32 lr: 0.000600 min_lr: 0.000600 loss: 5.4816 (5.4816) class_acc: 0.0898 (0.0898) weight_decay: 0.0500 (0.0500) time: 19.8282 data: 19.5995 max mem: 6925
Epoch: [3] [200/625] eta: 0:13:27 lr: 0.000664 min_lr: 0.000664 loss: 5.2927 (5.4067) class_acc: 0.1055 (0.1020) weight_decay: 0.0500 (0.0500) grad_norm: 1.5359 (1.6177) time: 1.7715 data: 0.0016 max mem: 6925
Epoch: [3] [400/625] eta: 0:07:04 lr: 0.000728 min_lr: 0.000728 loss: 5.2244 (5.3416) class_acc: 0.1328 (0.1121) weight_decay: 0.0500 (0.0500) grad_norm: 1.5293 (inf) time: 1.8600 data: 0.0120 max mem: 6925
Epoch: [3] [600/625] eta: 0:00:47 lr: 0.000792 min_lr: 0.000792 loss: 5.1330 (5.2767) class_acc: 0.1406 (0.1202) weight_decay: 0.0500 (0.0500) grad_norm: 1.4537 (inf) time: 2.0165 data: 0.0015 max mem: 6925
Epoch: [3] [624/625] eta: 0:00:01 lr: 0.000799 min_lr: 0.000799 loss: 5.0808 (5.2703) class_acc: 0.1328 (0.1208) weight_decay: 0.0500 (0.0500) grad_norm: 1.4591 (inf) time: 0.6042 data: 0.0032 max mem: 6925
Epoch: [3] Total time: 0:19:34 (1.8785 s / it)
Averaged stats: lr: 0.000799 min_lr: 0.000799 loss: 5.0808 (5.2733) class_acc: 0.1328 (0.1203) weight_decay: 0.0500 (0.0500) grad_norm: 1.4591 (inf)
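Note: from iteration 400 of epoch 3 onward, the averaged grad_norm (the value in parentheses) is reported as inf while the windowed value stays finite. One plausible reading, given use_amp=True, is that a single mixed-precision step overflowed and produced an infinite gradient norm; a global mean never recovers from one such value, whereas a windowed median does. The sketch below illustrates that effect. The `RunningStat` class is illustrative and is not the script's logger.

```python
import math
from collections import deque
from statistics import median


class RunningStat:
    """Tracks a windowed median and a global average of a scalar series."""

    def __init__(self, window_size=20):
        self.window = deque(maxlen=window_size)
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.window.append(value)
        self.total += value
        self.count += 1

    @property
    def windowed_median(self):
        return median(self.window)

    @property
    def global_avg(self):
        return self.total / self.count


stat = RunningStat()
# One overflowed step (inf) among otherwise ordinary gradient norms.
for grad_norm in [1.5, 1.6, math.inf, 1.4, 1.5]:
    stat.update(grad_norm)

print(stat.windowed_median, stat.global_avg)
```

The average is finite again in epoch 4, which is consistent with the statistics being reset at the start of each epoch.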
Test: [ 0/50] eta: 0:10:05 loss: 3.9517 (3.9517) acc1: 17.6000 (17.6000) acc5: 40.8000 (40.8000) time: 12.1128 data: 12.0784 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 3.8810 (3.7958) acc1: 24.8000 (24.1455) acc5: 48.8000 (48.5818) time: 2.1531 data: 2.1218 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 3.8732 (3.8766) acc1: 22.4000 (21.7143) acc5: 46.4000 (46.6286) time: 1.1844 data: 1.1546 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 3.8732 (3.8733) acc1: 20.8000 (22.2710) acc5: 45.6000 (46.4258) time: 1.1598 data: 1.1308 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 3.8971 (3.8955) acc1: 21.6000 (22.0683) acc5: 44.8000 (45.5805) time: 0.7770 data: 0.7478 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 4.0065 (3.9263) acc1: 19.2000 (21.5520) acc5: 40.8000 (44.7360) time: 0.6816 data: 0.6513 max mem: 6925
Test: Total time: 0:00:52 (1.0439 s / it)
* Acc@1 22.670 Acc@5 45.312 loss 3.893
Accuracy of the model on the 50000 test images: 22.7%
Max accuracy: 22.67%
Epoch: [4] [ 0/625] eta: 3:07:33 lr: 0.000800 min_lr: 0.000800 loss: 5.1065 (5.1065) class_acc: 0.1250 (0.1250) weight_decay: 0.0500 (0.0500) time: 18.0048 data: 17.7750 max mem: 6925
Epoch: [4] [200/625] eta: 0:13:52 lr: 0.000864 min_lr: 0.000864 loss: 4.9621 (5.0058) class_acc: 0.1641 (0.1548) weight_decay: 0.0500 (0.0500) grad_norm: 1.6045 (1.6961) time: 2.0155 data: 0.1118 max mem: 6925
Epoch: [4] [400/625] eta: 0:07:14 lr: 0.000928 min_lr: 0.000928 loss: 4.8520 (4.9529) class_acc: 0.1797 (0.1624) weight_decay: 0.0500 (0.0500) grad_norm: 1.4766 (1.6356) time: 2.1786 data: 0.0447 max mem: 6925
Epoch: [4] [600/625] eta: 0:00:47 lr: 0.000992 min_lr: 0.000992 loss: 4.7885 (4.9018) class_acc: 0.1797 (0.1695) weight_decay: 0.0500 (0.0500) grad_norm: 1.5506 (1.6296) time: 1.8418 data: 0.0143 max mem: 6925
Epoch: [4] [624/625] eta: 0:00:01 lr: 0.001000 min_lr: 0.001000 loss: 4.7369 (4.8964) class_acc: 0.2031 (0.1706) weight_decay: 0.0500 (0.0500) grad_norm: 1.3972 (1.6202) time: 0.7306 data: 0.0019 max mem: 6925
Epoch: [4] Total time: 0:19:43 (1.8935 s / it)
Averaged stats: lr: 0.001000 min_lr: 0.001000 loss: 4.7369 (4.8939) class_acc: 0.2031 (0.1713) weight_decay: 0.0500 (0.0500) grad_norm: 1.3972 (1.6202)
Test: [ 0/50] eta: 0:10:02 loss: 3.4860 (3.4860) acc1: 32.8000 (32.8000) acc5: 48.0000 (48.0000) time: 12.0533 data: 12.0156 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 3.3917 (3.3568) acc1: 32.0000 (31.7818) acc5: 56.8000 (55.9273) time: 2.0370 data: 2.0075 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 3.5504 (3.4944) acc1: 27.2000 (29.0286) acc5: 50.4000 (52.9524) time: 1.0812 data: 1.0525 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 3.6201 (3.4817) acc1: 25.6000 (29.3161) acc5: 49.6000 (53.0065) time: 1.1072 data: 1.0779 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 3.5172 (3.5017) acc1: 28.0000 (28.9171) acc5: 51.2000 (52.6244) time: 0.8841 data: 0.8544 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 3.6958 (3.5341) acc1: 25.6000 (28.3840) acc5: 51.2000 (52.1760) time: 0.8263 data: 0.7963 max mem: 6925
Test: Total time: 0:00:53 (1.0631 s / it)
* Acc@1 28.870 Acc@5 53.312 loss 3.489
Accuracy of the model on the 50000 test images: 28.9%
Max accuracy: 28.87%
Epoch: [5] [ 0/625] eta: 3:29:39 lr: 0.001000 min_lr: 0.001000 loss: 4.8149 (4.8149) class_acc: 0.1836 (0.1836) weight_decay: 0.0500 (0.0500) time: 20.1276 data: 16.7215 max mem: 6925
Epoch: [5] [200/625] eta: 0:13:53 lr: 0.001064 min_lr: 0.001064 loss: 4.6243 (4.6814) class_acc: 0.2070 (0.2051) weight_decay: 0.0500 (0.0500) grad_norm: 1.4361 (1.6153) time: 1.5607 data: 0.1304 max mem: 6925
Epoch: [5] [400/625] eta: 0:07:14 lr: 0.001128 min_lr: 0.001128 loss: 4.5867 (4.6457) class_acc: 0.2188 (0.2094) weight_decay: 0.0500 (0.0500) grad_norm: 1.4751 (1.5614) time: 1.9145 data: 0.0009 max mem: 6925
Epoch: [5] [600/625] eta: 0:00:48 lr: 0.001192 min_lr: 0.001192 loss: 4.4963 (4.6113) class_acc: 0.2266 (0.2151) weight_decay: 0.0500 (0.0500) grad_norm: 1.4929 (1.5797) time: 1.9518 data: 0.0007 max mem: 6925
Epoch: [5] [624/625] eta: 0:00:01 lr: 0.001200 min_lr: 0.001200 loss: 4.5370 (4.6068) class_acc: 0.2227 (0.2157) weight_decay: 0.0500 (0.0500) grad_norm: 1.7646 (1.5858) time: 1.2191 data: 0.0015 max mem: 6925
Epoch: [5] Total time: 0:19:54 (1.9105 s / it)
Averaged stats: lr: 0.001200 min_lr: 0.001200 loss: 4.5370 (4.6027) class_acc: 0.2227 (0.2161) weight_decay: 0.0500 (0.0500) grad_norm: 1.7646 (1.5858)
Test: [ 0/50] eta: 0:10:35 loss: 3.1712 (3.1712) acc1: 36.0000 (36.0000) acc5: 56.8000 (56.8000) time: 12.7045 data: 12.6276 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 3.1367 (3.0492) acc1: 37.6000 (38.9091) acc5: 59.2000 (61.5273) time: 1.9477 data: 1.9129 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 3.2614 (3.1718) acc1: 35.2000 (34.5143) acc5: 58.4000 (59.1238) time: 0.8792 data: 0.8490 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 3.3595 (3.1679) acc1: 30.4000 (34.1677) acc5: 56.0000 (58.8903) time: 0.8330 data: 0.8024 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 3.1905 (3.1928) acc1: 32.0000 (33.6781) acc5: 56.8000 (58.3220) time: 0.8307 data: 0.8005 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 3.2527 (3.2158) acc1: 30.4000 (33.2960) acc5: 56.8000 (58.1440) time: 0.6131 data: 0.5827 max mem: 6925
Test: Total time: 0:00:50 (1.0101 s / it)
* Acc@1 33.714 Acc@5 58.854 loss 3.178
Accuracy of the model on the 50000 test images: 33.7%
Max accuracy: 33.71%
Epoch: [6] [ 0/625] eta: 3:15:14 lr: 0.001200 min_lr: 0.001200 loss: 4.5212 (4.5212) class_acc: 0.2383 (0.2383) weight_decay: 0.0500 (0.0500) time: 18.7426 data: 18.5114 max mem: 6925
Epoch: [6] [200/625] eta: 0:13:40 lr: 0.001264 min_lr: 0.001264 loss: 4.4352 (4.4509) class_acc: 0.2383 (0.2428) weight_decay: 0.0500 (0.0500) grad_norm: 1.3908 (1.5878) time: 1.9018 data: 0.2037 max mem: 6925
Epoch: [6] [400/625] eta: 0:07:06 lr: 0.001328 min_lr: 0.001328 loss: 4.3493 (4.4172) class_acc: 0.2695 (0.2481) weight_decay: 0.0500 (0.0500) grad_norm: 1.4002 (1.5686) time: 1.9066 data: 0.4408 max mem: 6925
Epoch: [6] [600/625] eta: 0:00:48 lr: 0.001393 min_lr: 0.001393 loss: 4.2470 (4.3861) class_acc: 0.2539 (0.2532) weight_decay: 0.0500 (0.0500) grad_norm: 1.4963 (1.5648) time: 2.2510 data: 0.0806 max mem: 6925
Epoch: [6] [624/625] eta: 0:00:01 lr: 0.001400 min_lr: 0.001400 loss: 4.3315 (4.3834) class_acc: 0.2578 (0.2537) weight_decay: 0.0500 (0.0500) grad_norm: 1.4046 (1.5620) time: 0.7345 data: 0.0015 max mem: 6925
Epoch: [6] Total time: 0:19:34 (1.8791 s / it)
Averaged stats: lr: 0.001400 min_lr: 0.001400 loss: 4.3315 (4.3758) class_acc: 0.2578 (0.2540) weight_decay: 0.0500 (0.0500) grad_norm: 1.4046 (1.5620)
Test: [ 0/50] eta: 0:10:07 loss: 2.7027 (2.7027) acc1: 40.0000 (40.0000) acc5: 67.2000 (67.2000) time: 12.1527 data: 12.1185 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 2.7564 (2.7849) acc1: 40.0000 (41.7455) acc5: 66.4000 (65.8182) time: 1.9603 data: 1.9309 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 2.9449 (2.9259) acc1: 37.6000 (38.3619) acc5: 62.4000 (63.3524) time: 0.8751 data: 0.8456 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 2.9736 (2.9032) acc1: 36.0000 (38.7355) acc5: 62.4000 (63.8194) time: 0.8991 data: 0.8694 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 3.0083 (2.9283) acc1: 36.0000 (38.2829) acc5: 62.4000 (63.4927) time: 0.9614 data: 0.9322 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 3.0386 (2.9546) acc1: 35.2000 (37.7440) acc5: 60.8000 (63.1360) time: 0.6520 data: 0.6226 max mem: 6925
Test: Total time: 0:00:52 (1.0549 s / it)
* Acc@1 37.696 Acc@5 63.402 loss 2.937
Accuracy of the model on the 50000 test images: 37.7%
Max accuracy: 37.70%
Epoch: [7] [ 0/625] eta: 3:49:25 lr: 0.001400 min_lr: 0.001400 loss: 4.3410 (4.3410) class_acc: 0.2422 (0.2422) weight_decay: 0.0500 (0.0500) time: 22.0254 data: 16.2290 max mem: 6925
Epoch: [7] [200/625] eta: 0:14:22 lr: 0.001464 min_lr: 0.001464 loss: 4.2663 (4.2318) class_acc: 0.2812 (0.2802) weight_decay: 0.0500 (0.0500) grad_norm: 1.3494 (1.5012) time: 2.1662 data: 0.0013 max mem: 6925
Epoch: [7] [400/625] eta: 0:07:25 lr: 0.001528 min_lr: 0.001528 loss: 4.1935 (4.2155) class_acc: 0.2773 (0.2825) weight_decay: 0.0500 (0.0500) grad_norm: 1.4383 (1.5125) time: 1.9888 data: 0.0010 max mem: 6925
Epoch: [7] [600/625] eta: 0:00:49 lr: 0.001593 min_lr: 0.001593 loss: 4.0658 (4.1929) class_acc: 0.2812 (0.2859) weight_decay: 0.0500 (0.0500) grad_norm: 1.5424 (1.5101) time: 1.9448 data: 0.0010 max mem: 6925
Epoch: [7] [624/625] eta: 0:00:01 lr: 0.001600 min_lr: 0.001600 loss: 4.1433 (4.1897) class_acc: 0.3047 (0.2865) weight_decay: 0.0500 (0.0500) grad_norm: 1.4047 (1.5040) time: 0.8216 data: 0.0016 max mem: 6925
Epoch: [7] Total time: 0:19:58 (1.9184 s / it)
Averaged stats: lr: 0.001600 min_lr: 0.001600 loss: 4.1433 (4.1987) class_acc: 0.3047 (0.2852) weight_decay: 0.0500 (0.0500) grad_norm: 1.4047 (1.5040)
Test: [ 0/50] eta: 0:10:28 loss: 2.5643 (2.5643) acc1: 40.0000 (40.0000) acc5: 72.0000 (72.0000) time: 12.5791 data: 12.5481 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 2.5643 (2.6200) acc1: 43.2000 (44.6545) acc5: 72.0000 (68.4364) time: 2.1506 data: 2.1209 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 2.8511 (2.7905) acc1: 40.8000 (41.6000) acc5: 64.8000 (66.0952) time: 1.0637 data: 1.0346 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 2.9363 (2.7697) acc1: 39.2000 (41.6516) acc5: 65.6000 (66.6065) time: 0.9124 data: 0.8824 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 2.7805 (2.7972) acc1: 40.0000 (41.1902) acc5: 65.6000 (66.1268) time: 0.6348 data: 0.6038 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 2.8003 (2.8071) acc1: 39.2000 (40.8320) acc5: 64.8000 (66.1440) time: 0.5118 data: 0.4824 max mem: 6925
Test: Total time: 0:00:49 (0.9877 s / it)
* Acc@1 41.030 Acc@5 66.562 loss 2.771
Accuracy of the model on the 50000 test images: 41.0%
Max accuracy: 41.03%
Epoch: [8] [ 0/625] eta: 3:31:16 lr: 0.001600 min_lr: 0.001600 loss: 4.1364 (4.1364) class_acc: 0.2734 (0.2734) weight_decay: 0.0500 (0.0500) time: 20.2826 data: 18.7704 max mem: 6925
Epoch: [8] [200/625] eta: 0:14:18 lr: 0.001664 min_lr: 0.001664 loss: 4.0123 (4.0864) class_acc: 0.3125 (0.3047) weight_decay: 0.0500 (0.0500) grad_norm: 1.3290 (1.5126) time: 1.9716 data: 0.0476 max mem: 6925
Epoch: [8] [400/625] eta: 0:07:25 lr: 0.001728 min_lr: 0.001728 loss: 4.0074 (4.0690) class_acc: 0.3125 (0.3070) weight_decay: 0.0500 (0.0500) grad_norm: 1.3430 (1.5013) time: 1.9832 data: 0.0089 max mem: 6925
Epoch: [8] [600/625] eta: 0:00:49 lr: 0.001793 min_lr: 0.001793 loss: 4.0096 (4.0575) class_acc: 0.3125 (0.3095) weight_decay: 0.0500 (0.0500) grad_norm: 1.5426 (1.5261) time: 1.9662 data: 0.0159 max mem: 6925
Epoch: [8] [624/625] eta: 0:00:01 lr: 0.001800 min_lr: 0.001800 loss: 3.9937 (4.0541) class_acc: 0.3320 (0.3104) weight_decay: 0.0500 (0.0500) grad_norm: 1.2533 (1.5163) time: 0.8105 data: 0.0281 max mem: 6925
Epoch: [8] Total time: 0:20:06 (1.9301 s / it)
Averaged stats: lr: 0.001800 min_lr: 0.001800 loss: 3.9937 (4.0527) class_acc: 0.3320 (0.3116) weight_decay: 0.0500 (0.0500) grad_norm: 1.2533 (1.5163)
Test: [ 0/50] eta: 0:10:12 loss: 2.4263 (2.4263) acc1: 46.4000 (46.4000) acc5: 76.0000 (76.0000) time: 12.2491 data: 12.2181 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 2.4263 (2.4293) acc1: 46.4000 (46.6909) acc5: 72.8000 (72.2182) time: 1.9683 data: 1.9380 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 2.5984 (2.5959) acc1: 40.8000 (43.2762) acc5: 67.2000 (69.5619) time: 1.0158 data: 0.9863 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 2.7424 (2.5964) acc1: 40.8000 (43.7936) acc5: 67.2000 (69.3936) time: 1.0757 data: 1.0469 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 2.7092 (2.6361) acc1: 42.4000 (43.4146) acc5: 67.2000 (68.6439) time: 0.8801 data: 0.8512 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.6926 (2.6537) acc1: 42.4000 (43.1360) acc5: 68.0000 (68.4000) time: 0.7090 data: 0.6800 max mem: 6925
Test: Total time: 0:00:52 (1.0429 s / it)
* Acc@1 43.824 Acc@5 68.946 loss 2.614
Accuracy of the model on the 50000 test images: 43.8%
Max accuracy: 43.82%
Epoch: [9] [ 0/625] eta: 3:20:24 lr: 0.001800 min_lr: 0.001800 loss: 4.0534 (4.0534) class_acc: 0.2891 (0.2891) weight_decay: 0.0500 (0.0500) time: 19.2384 data: 18.2822 max mem: 6925
Epoch: [9] [200/625] eta: 0:13:39 lr: 0.001864 min_lr: 0.001864 loss: 3.9425 (3.9574) class_acc: 0.3320 (0.3284) weight_decay: 0.0500 (0.0500) grad_norm: 1.3525 (1.4977) time: 1.8314 data: 0.0011 max mem: 6925
Epoch: [9] [400/625] eta: 0:07:06 lr: 0.001929 min_lr: 0.001929 loss: 3.9690 (3.9474) class_acc: 0.3203 (0.3301) weight_decay: 0.0500 (0.0500) grad_norm: 1.6176 (1.4755) time: 2.0909 data: 0.0009 max mem: 6925
Epoch: [9] [600/625] eta: 0:00:47 lr: 0.001993 min_lr: 0.001993 loss: 3.9572 (3.9319) class_acc: 0.3320 (0.3334) weight_decay: 0.0500 (0.0500) grad_norm: 1.2477 (1.4769) time: 1.8500 data: 0.0145 max mem: 6925
Epoch: [9] [624/625] eta: 0:00:01 lr: 0.002000 min_lr: 0.002000 loss: 3.9135 (3.9309) class_acc: 0.3359 (0.3338) weight_decay: 0.0500 (0.0500) grad_norm: 1.3179 (1.4755) time: 0.9920 data: 0.0021 max mem: 6925
Epoch: [9] Total time: 0:19:29 (1.8710 s / it)
Averaged stats: lr: 0.002000 min_lr: 0.002000 loss: 3.9135 (3.9263) class_acc: 0.3359 (0.3354) weight_decay: 0.0500 (0.0500) grad_norm: 1.3179 (1.4755)
Test: [ 0/50] eta: 0:10:30 loss: 2.3571 (2.3571) acc1: 45.6000 (45.6000) acc5: 75.2000 (75.2000) time: 12.6090 data: 12.5540 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 2.3571 (2.3315) acc1: 48.8000 (48.5818) acc5: 74.4000 (74.1818) time: 2.0735 data: 2.0409 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 2.4959 (2.4849) acc1: 45.6000 (45.2190) acc5: 72.0000 (71.7714) time: 1.0378 data: 1.0068 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 2.6075 (2.4851) acc1: 43.2000 (45.6516) acc5: 69.6000 (71.4839) time: 1.0034 data: 0.9726 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 2.6013 (2.5173) acc1: 44.0000 (45.0927) acc5: 69.6000 (71.0244) time: 0.7082 data: 0.6790 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 2.5379 (2.5305) acc1: 43.2000 (44.8160) acc5: 70.4000 (70.8320) time: 0.5766 data: 0.5469 max mem: 6925
Test: Total time: 0:00:48 (0.9756 s / it)
* Acc@1 45.736 Acc@5 71.226 loss 2.497
Accuracy of the model on the 50000 test images: 45.7%
Max accuracy: 45.74%
Epoch: [10] [ 0/625] eta: 3:35:06 lr: 0.002000 min_lr: 0.002000 loss: 3.7979 (3.7979) class_acc: 0.3633 (0.3633) weight_decay: 0.0500 (0.0500) time: 20.6510 data: 18.7331 max mem: 6925
Epoch: [10] [200/625] eta: 0:13:40 lr: 0.002064 min_lr: 0.002064 loss: 3.8849 (3.8497) class_acc: 0.3398 (0.3498) weight_decay: 0.0500 (0.0500) grad_norm: 1.3145 (1.4611) time: 1.9134 data: 0.0010 max mem: 6925
Epoch: [10] [400/625] eta: 0:07:08 lr: 0.002129 min_lr: 0.002129 loss: 3.8394 (3.8339) class_acc: 0.3516 (0.3543) weight_decay: 0.0500 (0.0500) grad_norm: 1.3298 (inf) time: 2.0682 data: 0.0660 max mem: 6925
Epoch: [10] [600/625] eta: 0:00:47 lr: 0.002193 min_lr: 0.002193 loss: 3.7669 (3.8240) class_acc: 0.3633 (0.3557) weight_decay: 0.0500 (0.0500) grad_norm: 1.5628 (inf) time: 2.0368 data: 0.0069 max mem: 6925
Epoch: [10] [624/625] eta: 0:00:01 lr: 0.002200 min_lr: 0.002200 loss: 3.7446 (3.8211) class_acc: 0.3633 (0.3559) weight_decay: 0.0500 (0.0500) grad_norm: 1.5540 (inf) time: 0.7914 data: 0.0152 max mem: 6925
Epoch: [10] Total time: 0:19:25 (1.8641 s / it)
Averaged stats: lr: 0.002200 min_lr: 0.002200 loss: 3.7446 (3.8183) class_acc: 0.3633 (0.3560) weight_decay: 0.0500 (0.0500) grad_norm: 1.5540 (inf)
Test: [ 0/50] eta: 0:09:35 loss: 2.3656 (2.3656) acc1: 51.2000 (51.2000) acc5: 74.4000 (74.4000) time: 11.5067 data: 11.4707 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 2.3600 (2.2959) acc1: 52.0000 (51.2727) acc5: 74.4000 (75.6364) time: 1.9374 data: 1.9069 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 2.4139 (2.4155) acc1: 47.2000 (47.8476) acc5: 72.8000 (74.2095) time: 1.0475 data: 1.0167 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 2.5498 (2.4178) acc1: 44.0000 (47.7677) acc5: 72.0000 (73.4194) time: 1.0458 data: 1.0157 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 2.5523 (2.4372) acc1: 45.6000 (47.4732) acc5: 69.6000 (72.8195) time: 0.6576 data: 0.6289 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 2.4461 (2.4430) acc1: 46.4000 (47.6160) acc5: 70.4000 (72.8000) time: 0.5457 data: 0.5163 max mem: 6925
Test: Total time: 0:00:46 (0.9336 s / it)
* Acc@1 48.118 Acc@5 73.062 loss 2.402
Accuracy of the model on the 50000 test images: 48.1%
Max accuracy: 48.12%
Epoch: [11] [ 0/625] eta: 3:31:44 lr: 0.002200 min_lr: 0.002200 loss: 3.6978 (3.6978) class_acc: 0.3867 (0.3867) weight_decay: 0.0500 (0.0500) time: 20.3274 data: 18.0206 max mem: 6925
Epoch: [11] [200/625] eta: 0:13:35 lr: 0.002264 min_lr: 0.002264 loss: 3.7348 (3.7438) class_acc: 0.3633 (0.3695) weight_decay: 0.0500 (0.0500) grad_norm: 1.5929 (1.5562) time: 1.8127 data: 0.0009 max mem: 6925
Epoch: [11] [400/625] eta: 0:07:07 lr: 0.002329 min_lr: 0.002329 loss: 3.6450 (3.7349) class_acc: 0.3672 (0.3719) weight_decay: 0.0500 (0.0500) grad_norm: 1.5047 (1.5681) time: 1.8559 data: 0.0292 max mem: 6925
Epoch: [11] [600/625] eta: 0:00:47 lr: 0.002393 min_lr: 0.002393 loss: 3.6886 (3.7292) class_acc: 0.3789 (0.3733) weight_decay: 0.0500 (0.0500) grad_norm: 1.4215 (1.5687) time: 1.9081 data: 0.0014 max mem: 6925
Epoch: [11] [624/625] eta: 0:00:01 lr: 0.002400 min_lr: 0.002400 loss: 3.6711 (3.7270) class_acc: 0.3906 (0.3738) weight_decay: 0.0500 (0.0500) grad_norm: 1.4339 (1.5639) time: 0.8405 data: 0.0151 max mem: 6925
Epoch: [11] Total time: 0:19:29 (1.8712 s / it)
Averaged stats: lr: 0.002400 min_lr: 0.002400 loss: 3.6711 (3.7264) class_acc: 0.3906 (0.3737) weight_decay: 0.0500 (0.0500) grad_norm: 1.4339 (1.5639)
Test: [ 0/50] eta: 0:10:34 loss: 2.2988 (2.2988) acc1: 48.0000 (48.0000) acc5: 73.6000 (73.6000) time: 12.6837 data: 12.6409 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 2.2122 (2.1660) acc1: 52.0000 (53.3091) acc5: 75.2000 (76.4364) time: 2.1524 data: 2.1216 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 2.2729 (2.2869) acc1: 50.4000 (49.7905) acc5: 75.2000 (75.4667) time: 1.1458 data: 1.1166 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 2.3732 (2.2890) acc1: 47.2000 (49.2645) acc5: 74.4000 (75.2774) time: 1.2337 data: 1.2048 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 2.4234 (2.3161) acc1: 47.2000 (48.9951) acc5: 72.0000 (74.7122) time: 1.0500 data: 1.0209 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.3700 (2.3096) acc1: 49.6000 (49.2480) acc5: 72.8000 (74.6880) time: 0.8589 data: 0.8297 max mem: 6925
Test: Total time: 0:00:57 (1.1552 s / it)
* Acc@1 50.052 Acc@5 75.208 loss 2.262
Accuracy of the model on the 50000 test images: 50.1%
Max accuracy: 50.05%
Epoch: [12] [ 0/625] eta: 3:40:25 lr: 0.002400 min_lr: 0.002400 loss: 3.9719 (3.9719) class_acc: 0.3242 (0.3242) weight_decay: 0.0500 (0.0500) time: 21.1601 data: 20.8835 max mem: 6925
Epoch: [12] [200/625] eta: 0:14:21 lr: 0.002464 min_lr: 0.002464 loss: 3.6442 (3.6511) class_acc: 0.3789 (0.3897) weight_decay: 0.0500 (0.0500) grad_norm: 1.7213 (1.6294) time: 1.8270 data: 1.4725 max mem: 6925
Epoch: [12] [400/625] eta: 0:07:18 lr: 0.002529 min_lr: 0.002529 loss: 3.6375 (3.6514) class_acc: 0.3828 (0.3896) weight_decay: 0.0500 (0.0500) grad_norm: 1.4712 (1.5392) time: 1.9212 data: 1.6315 max mem: 6925
Epoch: [12] [600/625] eta: 0:00:49 lr: 0.002593 min_lr: 0.002593 loss: 3.5887 (3.6482) class_acc: 0.3984 (0.3896) weight_decay: 0.0500 (0.0500) grad_norm: 1.1837 (1.5223) time: 2.1414 data: 1.8369 max mem: 6925
Epoch: [12] [624/625] eta: 0:00:01 lr: 0.002600 min_lr: 0.002600 loss: 3.6115 (3.6468) class_acc: 0.3906 (0.3897) weight_decay: 0.0500 (0.0500) grad_norm: 1.1411 (1.5169) time: 0.7565 data: 0.4857 max mem: 6925
Epoch: [12] Total time: 0:20:14 (1.9430 s / it)
Averaged stats: lr: 0.002600 min_lr: 0.002600 loss: 3.6115 (3.6467) class_acc: 0.3906 (0.3895) weight_decay: 0.0500 (0.0500) grad_norm: 1.1411 (1.5169)
Test: [ 0/50] eta: 0:11:11 loss: 1.9926 (1.9926) acc1: 59.2000 (59.2000) acc5: 80.8000 (80.8000) time: 13.4302 data: 13.3990 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.9990 (2.0311) acc1: 56.8000 (56.5091) acc5: 80.0000 (78.9091) time: 2.2330 data: 2.2019 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 2.2662 (2.1680) acc1: 52.0000 (52.1143) acc5: 77.6000 (77.4476) time: 1.1790 data: 1.1491 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 2.2676 (2.1699) acc1: 48.8000 (52.2581) acc5: 76.0000 (77.1613) time: 1.0674 data: 1.0388 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 2.1295 (2.2074) acc1: 50.4000 (51.4341) acc5: 76.0000 (76.5463) time: 0.6198 data: 0.5911 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.2974 (2.2271) acc1: 47.2000 (51.1040) acc5: 73.6000 (76.1920) time: 0.5375 data: 0.5090 max mem: 6925
Test: Total time: 0:00:50 (1.0134 s / it)
* Acc@1 52.120 Acc@5 76.486 loss 2.192
Accuracy of the model on the 50000 test images: 52.1%
Max accuracy: 52.12%
Epoch: [13] [ 0/625] eta: 3:51:03 lr: 0.002600 min_lr: 0.002600 loss: 3.6038 (3.6038) class_acc: 0.4023 (0.4023) weight_decay: 0.0500 (0.0500) time: 22.1810 data: 20.5355 max mem: 6925
Epoch: [13] [200/625] eta: 0:16:01 lr: 0.002665 min_lr: 0.002665 loss: 3.5499 (3.5709) class_acc: 0.4141 (0.4057) weight_decay: 0.0500 (0.0500) grad_norm: 1.1951 (1.4398) time: 1.9438 data: 1.3023 max mem: 6925
Epoch: [13] [400/625] eta: 0:08:02 lr: 0.002729 min_lr: 0.002729 loss: 3.6085 (3.5725) class_acc: 0.4023 (0.4056) weight_decay: 0.0500 (0.0500) grad_norm: 1.4695 (1.5375) time: 2.1444 data: 0.0007 max mem: 6925
Epoch: [13] [600/625] eta: 0:00:55 lr: 0.002793 min_lr: 0.002793 loss: 3.5015 (3.5724) class_acc: 0.4141 (0.4055) weight_decay: 0.0500 (0.0500) grad_norm: 1.2768 (1.4835) time: 2.5757 data: 0.4782 max mem: 6925
Epoch: [13] [624/625] eta: 0:00:02 lr: 0.002800 min_lr: 0.002800 loss: 3.5638 (3.5717) class_acc: 0.4102 (0.4055) weight_decay: 0.0500 (0.0500) grad_norm: 1.1903 (1.4767) time: 1.1197 data: 0.2401 max mem: 6925
Epoch: [13] Total time: 0:23:10 (2.2246 s / it)
Averaged stats: lr: 0.002800 min_lr: 0.002800 loss: 3.5638 (3.5824) class_acc: 0.4102 (0.4023) weight_decay: 0.0500 (0.0500) grad_norm: 1.1903 (1.4767)
Test: [ 0/50] eta: 0:12:59 loss: 1.9250 (1.9250) acc1: 58.4000 (58.4000) acc5: 81.6000 (81.6000) time: 15.5897 data: 15.5557 max mem: 6925
Test: [10/50] eta: 0:01:46 loss: 2.0562 (2.0035) acc1: 56.0000 (56.8727) acc5: 79.2000 (79.9273) time: 2.6703 data: 2.6403 max mem: 6925
Test: [20/50] eta: 0:01:08 loss: 2.1759 (2.1147) acc1: 55.2000 (53.8667) acc5: 78.4000 (78.1333) time: 1.6251 data: 1.5961 max mem: 6925
Test: [30/50] eta: 0:00:39 loss: 2.2028 (2.1041) acc1: 51.2000 (53.8581) acc5: 76.0000 (77.8839) time: 1.5841 data: 1.5552 max mem: 6925
Test: [40/50] eta: 0:00:16 loss: 2.0854 (2.1390) acc1: 51.2000 (52.8390) acc5: 76.0000 (77.2878) time: 1.0469 data: 1.0180 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1602 (2.1576) acc1: 49.6000 (52.5600) acc5: 76.0000 (76.9760) time: 0.9021 data: 0.8735 max mem: 6925
Test: Total time: 0:01:10 (1.4139 s / it)
* Acc@1 53.068 Acc@5 77.256 loss 2.128
Accuracy of the model on the 50000 test images: 53.1%
Max accuracy: 53.07%
Epoch: [14] [ 0/625] eta: 3:47:45 lr: 0.002800 min_lr: 0.002800 loss: 3.3681 (3.3681) class_acc: 0.4570 (0.4570) weight_decay: 0.0500 (0.0500) time: 21.8653 data: 21.6336 max mem: 6925
Epoch: [14] [200/625] eta: 0:14:40 lr: 0.002865 min_lr: 0.002865 loss: 3.4788 (3.5440) class_acc: 0.4062 (0.4102) weight_decay: 0.0500 (0.0500) grad_norm: 1.1076 (1.5876) time: 1.9578 data: 0.0554 max mem: 6925
Epoch: [14] [400/625] eta: 0:07:47 lr: 0.002929 min_lr: 0.002929 loss: 3.5163 (3.5376) class_acc: 0.4062 (0.4108) weight_decay: 0.0500 (0.0500) grad_norm: 1.3727 (1.6105) time: 1.9791 data: 0.0576 max mem: 6925
Epoch: [14] [600/625] eta: 0:00:51 lr: 0.002993 min_lr: 0.002993 loss: 3.6315 (3.5367) class_acc: 0.4023 (0.4113) weight_decay: 0.0500 (0.0500) grad_norm: 1.3977 (1.5698) time: 2.3133 data: 0.0103 max mem: 6925
Epoch: [14] [624/625] eta: 0:00:02 lr: 0.003000 min_lr: 0.003000 loss: 3.5076 (3.5361) class_acc: 0.4258 (0.4117) weight_decay: 0.0500 (0.0500) grad_norm: 1.5024 (1.5632) time: 0.7930 data: 0.0253 max mem: 6925
Epoch: [14] Total time: 0:21:12 (2.0363 s / it)
Averaged stats: lr: 0.003000 min_lr: 0.003000 loss: 3.5076 (3.5297) class_acc: 0.4258 (0.4124) weight_decay: 0.0500 (0.0500) grad_norm: 1.5024 (1.5632)
Test: [ 0/50] eta: 0:10:54 loss: 1.8116 (1.8116) acc1: 58.4000 (58.4000) acc5: 81.6000 (81.6000) time: 13.0956 data: 13.0580 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 2.0015 (1.9533) acc1: 57.6000 (57.6727) acc5: 80.8000 (79.7818) time: 2.0780 data: 2.0478 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 2.0676 (2.0900) acc1: 54.4000 (54.2095) acc5: 79.2000 (78.5524) time: 1.0446 data: 1.0158 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 2.2031 (2.0872) acc1: 52.0000 (54.1419) acc5: 78.4000 (78.3742) time: 1.0641 data: 1.0356 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 2.0475 (2.1102) acc1: 53.6000 (53.9317) acc5: 76.8000 (77.6585) time: 1.0009 data: 0.9718 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1816 (2.1282) acc1: 52.8000 (53.5520) acc5: 75.2000 (77.4080) time: 0.8804 data: 0.8503 max mem: 6925
Test: Total time: 0:00:59 (1.1811 s / it)
* Acc@1 53.402 Acc@5 77.840 loss 2.109
Accuracy of the model on the 50000 test images: 53.4%
Max accuracy: 53.40%
Epoch: [15] [ 0/625] eta: 3:14:10 lr: 0.003000 min_lr: 0.003000 loss: 3.4767 (3.4767) class_acc: 0.3906 (0.3906) weight_decay: 0.0500 (0.0500) time: 18.6402 data: 16.2598 max mem: 6925
Epoch: [15] [200/625] eta: 0:14:41 lr: 0.003065 min_lr: 0.003065 loss: 3.5513 (3.4757) class_acc: 0.3984 (0.4224) weight_decay: 0.0500 (0.0500) grad_norm: 1.3221 (1.4427) time: 2.0642 data: 0.0009 max mem: 6925
Epoch: [15] [400/625] eta: 0:07:41 lr: 0.003129 min_lr: 0.003129 loss: 3.5201 (3.4865) class_acc: 0.4141 (0.4212) weight_decay: 0.0500 (0.0500) grad_norm: 1.6657 (1.5378) time: 2.1972 data: 0.0009 max mem: 6925
Epoch: [15] [600/625] eta: 0:00:50 lr: 0.003193 min_lr: 0.003193 loss: 3.4235 (3.4870) class_acc: 0.4258 (0.4212) weight_decay: 0.0500 (0.0500) grad_norm: 1.4227 (1.5087) time: 2.1383 data: 1.0790 max mem: 6925
Epoch: [15] [624/625] eta: 0:00:01 lr: 0.003200 min_lr: 0.003200 loss: 3.5013 (3.4865) class_acc: 0.4258 (0.4212) weight_decay: 0.0500 (0.0500) grad_norm: 1.2106 (1.5000) time: 0.8800 data: 0.2174 max mem: 6925
Epoch: [15] Total time: 0:20:46 (1.9951 s / it)
Averaged stats: lr: 0.003200 min_lr: 0.003200 loss: 3.5013 (3.4805) class_acc: 0.4258 (0.4226) weight_decay: 0.0500 (0.0500) grad_norm: 1.2106 (1.5000)
Test: [ 0/50] eta: 0:10:38 loss: 2.2376 (2.2376) acc1: 47.2000 (47.2000) acc5: 79.2000 (79.2000) time: 12.7720 data: 12.7301 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 1.9107 (1.9385) acc1: 57.6000 (57.7455) acc5: 81.6000 (80.9455) time: 1.9306 data: 1.9005 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 2.1752 (2.1111) acc1: 54.4000 (53.6762) acc5: 76.8000 (78.2857) time: 0.8144 data: 0.7853 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 2.2672 (2.1060) acc1: 50.4000 (53.5742) acc5: 75.2000 (78.0129) time: 0.9745 data: 0.9448 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 2.1741 (2.1416) acc1: 52.8000 (53.0146) acc5: 76.8000 (77.6585) time: 0.9834 data: 0.9534 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1741 (2.1489) acc1: 53.6000 (53.0080) acc5: 78.4000 (77.5360) time: 0.5609 data: 0.5316 max mem: 6925
Test: Total time: 0:00:52 (1.0414 s / it)
* Acc@1 53.838 Acc@5 77.888 loss 2.097
Accuracy of the model on the 50000 test images: 53.8%
Max accuracy: 53.84%
Epoch: [16] [ 0/625] eta: 3:57:39 lr: 0.003201 min_lr: 0.003201 loss: 3.5308 (3.5308) class_acc: 0.4141 (0.4141) weight_decay: 0.0500 (0.0500) time: 22.8149 data: 22.5833 max mem: 6925
Epoch: [16] [200/625] eta: 0:14:33 lr: 0.003265 min_lr: 0.003265 loss: 3.4309 (3.4520) class_acc: 0.4297 (0.4285) weight_decay: 0.0500 (0.0500) grad_norm: 1.2463 (1.5067) time: 1.9697 data: 0.4155 max mem: 6925
Epoch: [16] [400/625] eta: 0:07:36 lr: 0.003329 min_lr: 0.003329 loss: 3.4118 (3.4434) class_acc: 0.4414 (0.4300) weight_decay: 0.0500 (0.0500) grad_norm: 1.1861 (1.5175) time: 2.1271 data: 0.5679 max mem: 6925
Epoch: [16] [600/625] eta: 0:00:51 lr: 0.003393 min_lr: 0.003393 loss: 3.4307 (3.4385) class_acc: 0.4219 (0.4307) weight_decay: 0.0500 (0.0500) grad_norm: 1.6104 (1.5296) time: 2.1493 data: 0.0236 max mem: 6925
Epoch: [16] [624/625] eta: 0:00:02 lr: 0.003400 min_lr: 0.003400 loss: 3.3894 (3.4380) class_acc: 0.4297 (0.4306) weight_decay: 0.0500 (0.0500) grad_norm: 1.4334 (1.5319) time: 0.7540 data: 0.1656 max mem: 6925
Epoch: [16] Total time: 0:20:58 (2.0140 s / it)
Averaged stats: lr: 0.003400 min_lr: 0.003400 loss: 3.3894 (3.4382) class_acc: 0.4297 (0.4314) weight_decay: 0.0500 (0.0500) grad_norm: 1.4334 (1.5319)
Test: [ 0/50] eta: 0:10:56 loss: 1.8150 (1.8150) acc1: 59.2000 (59.2000) acc5: 78.4000 (78.4000) time: 13.1257 data: 13.0902 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.9224 (1.8670) acc1: 59.2000 (59.2727) acc5: 80.0000 (81.4545) time: 2.2420 data: 2.2119 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.9845 (1.9906) acc1: 56.0000 (56.5333) acc5: 79.2000 (80.0000) time: 1.2224 data: 1.1931 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 2.1207 (2.0110) acc1: 51.2000 (55.1226) acc5: 79.2000 (79.3548) time: 1.2673 data: 1.2386 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 2.1001 (2.0459) acc1: 51.2000 (54.6341) acc5: 76.8000 (78.7122) time: 1.0646 data: 1.0362 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1001 (2.0460) acc1: 52.0000 (54.5600) acc5: 76.8000 (78.6720) time: 1.0002 data: 0.9714 max mem: 6925
Test: Total time: 0:00:59 (1.1861 s / it)
* Acc@1 55.056 Acc@5 79.126 loss 2.011
Accuracy of the model on the 50000 test images: 55.1%
Max accuracy: 55.06%
Epoch: [17] [ 0/625] eta: 3:10:19 lr: 0.003401 min_lr: 0.003401 loss: 3.2761 (3.2761) class_acc: 0.4414 (0.4414) weight_decay: 0.0500 (0.0500) time: 18.2719 data: 17.3292 max mem: 6925
Epoch: [17] [200/625] eta: 0:14:52 lr: 0.003465 min_lr: 0.003465 loss: 3.3909 (3.3878) class_acc: 0.4414 (0.4426) weight_decay: 0.0500 (0.0500) grad_norm: 1.3092 (1.4510) time: 2.1927 data: 0.5247 max mem: 6925
Epoch: [17] [400/625] eta: 0:07:47 lr: 0.003529 min_lr: 0.003529 loss: 3.3460 (3.4010) class_acc: 0.4414 (0.4403) weight_decay: 0.0500 (0.0500) grad_norm: 1.2241 (1.4742) time: 1.9340 data: 0.4094 max mem: 6925
Epoch: [17] [600/625] eta: 0:00:51 lr: 0.003593 min_lr: 0.003593 loss: 3.4586 (3.4046) class_acc: 0.4336 (0.4393) weight_decay: 0.0500 (0.0500) grad_norm: 1.3688 (1.4467) time: 2.0656 data: 0.0114 max mem: 6925
Epoch: [17] [624/625] eta: 0:00:02 lr: 0.003600 min_lr: 0.003600 loss: 3.3960 (3.4054) class_acc: 0.4375 (0.4394) weight_decay: 0.0500 (0.0500) grad_norm: 1.2734 (1.4532) time: 0.5433 data: 0.0205 max mem: 6925
Epoch: [17] Total time: 0:21:18 (2.0461 s / it)
Averaged stats: lr: 0.003600 min_lr: 0.003600 loss: 3.3960 (3.4033) class_acc: 0.4375 (0.4385) weight_decay: 0.0500 (0.0500) grad_norm: 1.2734 (1.4532)
Test: [ 0/50] eta: 0:10:16 loss: 1.9363 (1.9363) acc1: 56.8000 (56.8000) acc5: 80.0000 (80.0000) time: 12.3355 data: 12.2976 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.9575 (1.9601) acc1: 56.8000 (56.5091) acc5: 80.0000 (79.4909) time: 1.9600 data: 1.9299 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 2.1019 (2.0714) acc1: 53.6000 (54.3619) acc5: 79.2000 (78.3619) time: 0.9590 data: 0.9302 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 2.2022 (2.0754) acc1: 51.2000 (54.0645) acc5: 76.8000 (77.7806) time: 0.9812 data: 0.9522 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 2.1551 (2.0982) acc1: 50.4000 (53.0732) acc5: 75.2000 (77.2878) time: 1.0598 data: 1.0301 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1114 (2.0945) acc1: 50.4000 (53.0400) acc5: 76.0000 (77.2480) time: 0.8446 data: 0.8154 max mem: 6925
Test: Total time: 0:00:58 (1.1641 s / it)
* Acc@1 53.734 Acc@5 77.882 loss 2.064
Accuracy of the model on the 50000 test images: 53.7%
Max accuracy: 55.06%
Epoch: [18] [ 0/625] eta: 3:39:41 lr: 0.003601 min_lr: 0.003601 loss: 3.4790 (3.4790) class_acc: 0.4375 (0.4375) weight_decay: 0.0500 (0.0500) time: 21.0904 data: 20.8600 max mem: 6925
Epoch: [18] [200/625] eta: 0:14:13 lr: 0.003665 min_lr: 0.003665 loss: 3.2935 (3.3646) class_acc: 0.4570 (0.4477) weight_decay: 0.0500 (0.0500) grad_norm: 1.3893 (1.5764) time: 1.9814 data: 0.0838 max mem: 6925
Epoch: [18] [400/625] eta: 0:07:34 lr: 0.003729 min_lr: 0.003729 loss: 3.3851 (3.3706) class_acc: 0.4570 (0.4459) weight_decay: 0.0500 (0.0500) grad_norm: 1.2128 (1.5045) time: 2.1540 data: 0.0468 max mem: 6925
Epoch: [18] [600/625] eta: 0:00:51 lr: 0.003793 min_lr: 0.003793 loss: 3.3546 (3.3771) class_acc: 0.4414 (0.4439) weight_decay: 0.0500 (0.0500) grad_norm: 1.4672 (1.4971) time: 2.2075 data: 0.0008 max mem: 6925
Epoch: [18] [624/625] eta: 0:00:02 lr: 0.003800 min_lr: 0.003800 loss: 3.3689 (3.3785) class_acc: 0.4336 (0.4437) weight_decay: 0.0500 (0.0500) grad_norm: 1.4598 (1.5079) time: 1.1720 data: 0.0202 max mem: 6925
Epoch: [18] Total time: 0:20:54 (2.0068 s / it)
Averaged stats: lr: 0.003800 min_lr: 0.003800 loss: 3.3689 (3.3757) class_acc: 0.4336 (0.4444) weight_decay: 0.0500 (0.0500) grad_norm: 1.4598 (1.5079)
Test: [ 0/50] eta: 0:10:37 loss: 1.8819 (1.8819) acc1: 56.0000 (56.0000) acc5: 80.0000 (80.0000) time: 12.7470 data: 12.7039 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.8819 (1.8590) acc1: 56.8000 (58.2545) acc5: 80.0000 (81.6000) time: 2.1094 data: 2.0790 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 2.0360 (2.0109) acc1: 54.4000 (55.4286) acc5: 78.4000 (79.5048) time: 1.1217 data: 1.0928 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 2.1274 (2.0095) acc1: 52.8000 (54.9419) acc5: 78.4000 (79.5355) time: 1.1937 data: 1.1644 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 2.0620 (2.0297) acc1: 51.2000 (54.8098) acc5: 78.4000 (78.7317) time: 1.1275 data: 1.0978 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1176 (2.0345) acc1: 52.0000 (54.5760) acc5: 76.8000 (78.7680) time: 1.1179 data: 1.0888 max mem: 6925
Test: Total time: 0:00:59 (1.1876 s / it)
* Acc@1 55.540 Acc@5 79.248 loss 1.995
Accuracy of the model on the 50000 test images: 55.5%
Max accuracy: 55.54%
Epoch: [19] [ 0/625] eta: 3:54:41 lr: 0.003801 min_lr: 0.003801 loss: 3.3794 (3.3794) class_acc: 0.4609 (0.4609) weight_decay: 0.0500 (0.0500) time: 22.5298 data: 22.0592 max mem: 6925
Epoch: [19] [200/625] eta: 0:14:26 lr: 0.003865 min_lr: 0.003865 loss: 3.3368 (3.3375) class_acc: 0.4531 (0.4518) weight_decay: 0.0500 (0.0500) grad_norm: 1.2369 (1.3929) time: 2.0328 data: 0.0007 max mem: 6925
Epoch: [19] [400/625] eta: 0:07:42 lr: 0.003929 min_lr: 0.003929 loss: 3.2767 (3.3476) class_acc: 0.4531 (0.4494) weight_decay: 0.0500 (0.0500) grad_norm: 1.0596 (1.3889) time: 2.1770 data: 0.0007 max mem: 6925
Epoch: [19] [600/625] eta: 0:00:51 lr: 0.003993 min_lr: 0.003993 loss: 3.2959 (3.3459) class_acc: 0.4531 (0.4502) weight_decay: 0.0500 (0.0500) grad_norm: 1.3254 (1.4013) time: 2.2302 data: 0.0009 max mem: 6925
Epoch: [19] [624/625] eta: 0:00:02 lr: 0.004000 min_lr: 0.004000 loss: 3.3472 (3.3482) class_acc: 0.4414 (0.4499) weight_decay: 0.0500 (0.0500) grad_norm: 1.0930 (1.3911) time: 1.0664 data: 0.0018 max mem: 6925
Epoch: [19] Total time: 0:20:58 (2.0137 s / it)
Averaged stats: lr: 0.004000 min_lr: 0.004000 loss: 3.3472 (3.3489) class_acc: 0.4414 (0.4496) weight_decay: 0.0500 (0.0500) grad_norm: 1.0930 (1.3911)
Test: [ 0/50] eta: 0:09:59 loss: 1.8480 (1.8480) acc1: 59.2000 (59.2000) acc5: 83.2000 (83.2000) time: 11.9827 data: 11.9450 max mem: 6925
Test: [10/50] eta: 0:01:10 loss: 1.8480 (1.7980) acc1: 59.2000 (60.5818) acc5: 83.2000 (82.6909) time: 1.7686 data: 1.7380 max mem: 6925
Test: [20/50] eta: 0:00:39 loss: 1.9844 (1.9459) acc1: 56.0000 (56.8000) acc5: 80.8000 (80.6476) time: 0.7819 data: 0.7528 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 2.0674 (1.9816) acc1: 51.2000 (55.6387) acc5: 78.4000 (79.6129) time: 0.8224 data: 0.7941 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9983 (1.9848) acc1: 52.0000 (55.6683) acc5: 77.6000 (79.3756) time: 0.9966 data: 0.9679 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.0363 (1.9971) acc1: 54.4000 (55.4240) acc5: 77.6000 (79.1520) time: 0.8041 data: 0.7754 max mem: 6925
Test: Total time: 0:00:52 (1.0425 s / it)
* Acc@1 55.980 Acc@5 80.052 loss 1.959
Accuracy of the model on the 50000 test images: 56.0%
Max accuracy: 55.98%
Epoch: [20] [ 0/625] eta: 3:58:38 lr: 0.004000 min_lr: 0.004000 loss: 3.3307 (3.3307) class_acc: 0.4375 (0.4375) weight_decay: 0.0500 (0.0500) time: 22.9096 data: 15.4595 max mem: 6925
Epoch: [20] [200/625] eta: 0:14:50 lr: 0.004000 min_lr: 0.004000 loss: 3.3444 (3.3180) class_acc: 0.4453 (0.4563) weight_decay: 0.0500 (0.0500) grad_norm: 1.6040 (1.4375) time: 1.9376 data: 0.0009 max mem: 6925
Epoch: [20] [400/625] eta: 0:07:40 lr: 0.004000 min_lr: 0.004000 loss: 3.3124 (3.3227) class_acc: 0.4609 (0.4550) weight_decay: 0.0500 (0.0500) grad_norm: 1.3311 (1.4387) time: 1.8448 data: 0.1129 max mem: 6925
Epoch: [20] [600/625] eta: 0:00:51 lr: 0.004000 min_lr: 0.004000 loss: 3.3709 (3.3262) class_acc: 0.4375 (0.4544) weight_decay: 0.0500 (0.0500) grad_norm: 1.2990 (1.4154) time: 2.0352 data: 0.0007 max mem: 6925
Epoch: [20] [624/625] eta: 0:00:02 lr: 0.004000 min_lr: 0.004000 loss: 3.4000 (3.3276) class_acc: 0.4336 (0.4540) weight_decay: 0.0500 (0.0500) grad_norm: 1.2902 (1.4217) time: 0.9161 data: 0.0014 max mem: 6925
Epoch: [20] Total time: 0:20:55 (2.0085 s / it)
Averaged stats: lr: 0.004000 min_lr: 0.004000 loss: 3.4000 (3.3211) class_acc: 0.4336 (0.4556) weight_decay: 0.0500 (0.0500) grad_norm: 1.2902 (1.4217)
Test: [ 0/50] eta: 0:08:17 loss: 1.8120 (1.8120) acc1: 58.4000 (58.4000) acc5: 80.8000 (80.8000) time: 9.9593 data: 9.9265 max mem: 6925
Test: [10/50] eta: 0:01:13 loss: 1.9912 (1.8961) acc1: 58.4000 (57.8909) acc5: 80.8000 (81.0182) time: 1.8483 data: 1.8188 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.9951 (2.0200) acc1: 54.4000 (55.3143) acc5: 79.2000 (79.2000) time: 1.0773 data: 1.0478 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 2.1099 (2.0241) acc1: 52.0000 (54.7097) acc5: 78.4000 (79.3290) time: 1.0992 data: 1.0686 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 2.0292 (2.0326) acc1: 51.2000 (54.1854) acc5: 79.2000 (79.2000) time: 0.9739 data: 0.9440 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.0373 (2.0390) acc1: 52.0000 (54.1760) acc5: 80.0000 (78.9760) time: 0.9869 data: 0.9567 max mem: 6925
Test: Total time: 0:00:57 (1.1445 s / it)
* Acc@1 55.662 Acc@5 79.310 loss 1.985
Accuracy of the model on the 50000 test images: 55.7%
Max accuracy: 55.98%
Epoch: [21] [ 0/625] eta: 3:46:41 lr: 0.004000 min_lr: 0.004000 loss: 3.1756 (3.1756) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 21.7626 data: 18.8851 max mem: 6925
Epoch: [21] [200/625] eta: 0:14:44 lr: 0.004000 min_lr: 0.004000 loss: 3.2297 (3.2715) class_acc: 0.4727 (0.4635) weight_decay: 0.0500 (0.0500) grad_norm: 1.0984 (1.3901) time: 2.0127 data: 0.0343 max mem: 6925
Epoch: [21] [400/625] eta: 0:07:39 lr: 0.004000 min_lr: 0.004000 loss: 3.2596 (3.2742) class_acc: 0.4609 (0.4627) weight_decay: 0.0500 (0.0500) grad_norm: 1.2856 (1.3879) time: 2.0071 data: 0.0008 max mem: 6925
Epoch: [21] [600/625] eta: 0:00:50 lr: 0.004000 min_lr: 0.004000 loss: 3.3081 (3.2786) class_acc: 0.4609 (0.4629) weight_decay: 0.0500 (0.0500) grad_norm: 1.1094 (1.4098) time: 1.9889 data: 0.0006 max mem: 6925
Epoch: [21] [624/625] eta: 0:00:01 lr: 0.003999 min_lr: 0.003999 loss: 3.3010 (3.2800) class_acc: 0.4570 (0.4628) weight_decay: 0.0500 (0.0500) grad_norm: 1.1884 (1.4073) time: 1.0288 data: 0.0013 max mem: 6925
Epoch: [21] Total time: 0:20:46 (1.9948 s / it)
Averaged stats: lr: 0.003999 min_lr: 0.003999 loss: 3.3010 (3.2879) class_acc: 0.4570 (0.4623) weight_decay: 0.0500 (0.0500) grad_norm: 1.1884 (1.4073)
Test: [ 0/50] eta: 0:09:17 loss: 1.9069 (1.9069) acc1: 56.8000 (56.8000) acc5: 83.2000 (83.2000) time: 11.1503 data: 11.1189 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.7605 (1.7950) acc1: 59.2000 (60.1455) acc5: 81.6000 (82.1818) time: 1.9124 data: 1.8827 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.9519 (1.9149) acc1: 57.6000 (57.2571) acc5: 80.8000 (80.7619) time: 1.0093 data: 0.9802 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 2.0325 (1.9291) acc1: 55.2000 (57.0323) acc5: 78.4000 (80.2581) time: 1.0197 data: 0.9910 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 2.0043 (1.9758) acc1: 55.2000 (55.8049) acc5: 78.4000 (79.3756) time: 0.8673 data: 0.8384 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.0043 (1.9855) acc1: 55.2000 (55.7760) acc5: 80.0000 (79.3280) time: 0.8046 data: 0.7753 max mem: 6925
Test: Total time: 0:00:53 (1.0607 s / it)
* Acc@1 56.702 Acc@5 80.214 loss 1.936
Accuracy of the model on the 50000 test images: 56.7%
Max accuracy: 56.70%
Epoch: [22] [ 0/625] eta: 3:49:06 lr: 0.003999 min_lr: 0.003999 loss: 3.2059 (3.2059) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 21.9939 data: 21.2263 max mem: 6925
Epoch: [22] [200/625] eta: 0:14:21 lr: 0.003999 min_lr: 0.003999 loss: 3.2852 (3.2531) class_acc: 0.4531 (0.4717) weight_decay: 0.0500 (0.0500) grad_norm: 1.2180 (1.3728) time: 2.0918 data: 0.0159 max mem: 6925
Epoch: [22] [400/625] eta: 0:07:31 lr: 0.003999 min_lr: 0.003999 loss: 3.2529 (3.2568) class_acc: 0.4492 (0.4689) weight_decay: 0.0500 (0.0500) grad_norm: 1.3509 (inf) time: 2.0125 data: 0.0489 max mem: 6925
Epoch: [22] [600/625] eta: 0:00:50 lr: 0.003999 min_lr: 0.003999 loss: 3.2755 (3.2551) class_acc: 0.4648 (0.4697) weight_decay: 0.0500 (0.0500) grad_norm: 1.0178 (inf) time: 1.8202 data: 0.0008 max mem: 6925
Epoch: [22] [624/625] eta: 0:00:01 lr: 0.003999 min_lr: 0.003999 loss: 3.2056 (3.2551) class_acc: 0.4727 (0.4698) weight_decay: 0.0500 (0.0500) grad_norm: 1.2569 (inf) time: 0.8582 data: 0.0016 max mem: 6925
Epoch: [22] Total time: 0:20:39 (1.9827 s / it)
Averaged stats: lr: 0.003999 min_lr: 0.003999 loss: 3.2056 (3.2486) class_acc: 0.4727 (0.4708) weight_decay: 0.0500 (0.0500) grad_norm: 1.2569 (inf)
Test: [ 0/50] eta: 0:12:16 loss: 2.0468 (2.0468) acc1: 53.6000 (53.6000) acc5: 82.4000 (82.4000) time: 14.7328 data: 14.6967 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.9067 (1.8810) acc1: 57.6000 (57.8182) acc5: 82.4000 (82.4727) time: 2.2356 data: 2.2056 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 2.0120 (2.0484) acc1: 53.6000 (54.8190) acc5: 77.6000 (80.0000) time: 1.0170 data: 0.9882 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 2.1720 (2.0652) acc1: 51.2000 (53.9871) acc5: 76.0000 (79.2516) time: 1.0110 data: 0.9825 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 2.1720 (2.0886) acc1: 52.0000 (53.6585) acc5: 76.0000 (78.7707) time: 0.7643 data: 0.7354 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.1234 (2.0790) acc1: 52.0000 (54.0000) acc5: 78.4000 (78.9280) time: 0.8242 data: 0.7951 max mem: 6925
Test: Total time: 0:00:54 (1.0948 s / it)
* Acc@1 54.728 Acc@5 79.098 loss 2.033
Accuracy of the model on the 50000 test images: 54.7%
Max accuracy: 56.70%
Epoch: [23] [ 0/625] eta: 3:30:03 lr: 0.003999 min_lr: 0.003999 loss: 3.2045 (3.2045) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 20.1658 data: 18.6996 max mem: 6925
Epoch: [23] [200/625] eta: 0:14:24 lr: 0.003999 min_lr: 0.003999 loss: 3.2049 (3.2234) class_acc: 0.4766 (0.4746) weight_decay: 0.0500 (0.0500) grad_norm: 1.5337 (1.4239) time: 1.9802 data: 0.5958 max mem: 6925
Epoch: [23] [400/625] eta: 0:07:23 lr: 0.003998 min_lr: 0.003998 loss: 3.2479 (3.2242) class_acc: 0.4688 (0.4761) weight_decay: 0.0500 (0.0500) grad_norm: 0.9697 (1.3550) time: 1.9830 data: 0.1801 max mem: 6925
Epoch: [23] [600/625] eta: 0:00:49 lr: 0.003998 min_lr: 0.003998 loss: 3.2030 (3.2257) class_acc: 0.4688 (0.4752) weight_decay: 0.0500 (0.0500) grad_norm: 1.0974 (1.3447) time: 1.9075 data: 1.4613 max mem: 6925
Epoch: [23] [624/625] eta: 0:00:01 lr: 0.003998 min_lr: 0.003998 loss: 3.2291 (3.2256) class_acc: 0.4648 (0.4752) weight_decay: 0.0500 (0.0500) grad_norm: 1.2201 (1.3421) time: 0.8960 data: 0.5650 max mem: 6925
Epoch: [23] Total time: 0:20:22 (1.9552 s / it)
Averaged stats: lr: 0.003998 min_lr: 0.003998 loss: 3.2291 (3.2218) class_acc: 0.4648 (0.4766) weight_decay: 0.0500 (0.0500) grad_norm: 1.2201 (1.3421)
Test: [ 0/50] eta: 0:09:36 loss: 1.7914 (1.7914) acc1: 58.4000 (58.4000) acc5: 83.2000 (83.2000) time: 11.5359 data: 11.5002 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.8038 (1.8272) acc1: 57.6000 (58.4000) acc5: 82.4000 (81.7455) time: 2.1750 data: 2.1458 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.8913 (1.9583) acc1: 55.2000 (55.9238) acc5: 80.0000 (80.3048) time: 1.2886 data: 1.2600 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 2.0884 (1.9825) acc1: 53.6000 (55.1226) acc5: 78.4000 (79.5871) time: 1.1919 data: 1.1630 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 2.0884 (2.0074) acc1: 53.6000 (54.6146) acc5: 76.8000 (79.1610) time: 0.7206 data: 0.6906 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 2.0101 (2.0100) acc1: 52.0000 (54.6240) acc5: 80.0000 (79.2640) time: 0.6200 data: 0.5897 max mem: 6925
Test: Total time: 0:00:53 (1.0756 s / it)
* Acc@1 56.242 Acc@5 80.122 loss 1.954
Accuracy of the model on the 50000 test images: 56.2%
Max accuracy: 56.70%
Epoch: [24] [ 0/625] eta: 3:50:36 lr: 0.003998 min_lr: 0.003998 loss: 3.0273 (3.0273) class_acc: 0.5234 (0.5234) weight_decay: 0.0500 (0.0500) time: 22.1390 data: 14.7003 max mem: 6925
Epoch: [24] [200/625] eta: 0:14:00 lr: 0.003998 min_lr: 0.003998 loss: 3.2235 (3.1959) class_acc: 0.4727 (0.4837) weight_decay: 0.0500 (0.0500) grad_norm: 1.4420 (1.4856) time: 1.7901 data: 0.0006 max mem: 6925
Epoch: [24] [400/625] eta: 0:07:20 lr: 0.003997 min_lr: 0.003997 loss: 3.2135 (3.1972) class_acc: 0.4766 (0.4818) weight_decay: 0.0500 (0.0500) grad_norm: 1.3066 (1.3565) time: 2.0085 data: 0.0007 max mem: 6925
Epoch: [24] [600/625] eta: 0:00:49 lr: 0.003997 min_lr: 0.003997 loss: 3.1898 (3.1984) class_acc: 0.4844 (0.4812) weight_decay: 0.0500 (0.0500) grad_norm: 1.0375 (1.3528) time: 2.2258 data: 0.0007 max mem: 6925
Epoch: [24] [624/625] eta: 0:00:01 lr: 0.003997 min_lr: 0.003997 loss: 3.1570 (3.1984) class_acc: 0.4961 (0.4814) weight_decay: 0.0500 (0.0500) grad_norm: 1.3195 (1.3615) time: 1.1225 data: 0.0014 max mem: 6925
Epoch: [24] Total time: 0:20:07 (1.9321 s / it)
Averaged stats: lr: 0.003997 min_lr: 0.003997 loss: 3.1570 (3.1975) class_acc: 0.4961 (0.4819) weight_decay: 0.0500 (0.0500) grad_norm: 1.3195 (1.3615)
Test: [ 0/50] eta: 0:09:34 loss: 1.7549 (1.7549) acc1: 59.2000 (59.2000) acc5: 86.4000 (86.4000) time: 11.4959 data: 11.4576 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 1.7945 (1.8123) acc1: 58.4000 (59.4182) acc5: 82.4000 (82.1091) time: 1.8246 data: 1.7944 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.9018 (1.9572) acc1: 56.0000 (56.6476) acc5: 80.8000 (80.9524) time: 0.8900 data: 0.8611 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 2.1054 (1.9758) acc1: 53.6000 (56.1548) acc5: 79.2000 (80.2839) time: 1.0503 data: 1.0217 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9926 (1.9906) acc1: 54.4000 (55.8439) acc5: 78.4000 (79.6878) time: 0.9374 data: 0.9076 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9616 (1.9823) acc1: 55.2000 (56.0000) acc5: 78.4000 (79.7920) time: 0.5062 data: 0.4757 max mem: 6925
Test: Total time: 0:00:51 (1.0266 s / it)
* Acc@1 56.562 Acc@5 80.216 loss 1.941
Accuracy of the model on the 50000 test images: 56.6%
Max accuracy: 56.70%
Epoch: [25] [ 0/625] eta: 4:08:36 lr: 0.003997 min_lr: 0.003997 loss: 3.0749 (3.0749) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 23.8660 data: 23.6365 max mem: 6925
Epoch: [25] [200/625] eta: 0:14:25 lr: 0.003996 min_lr: 0.003996 loss: 3.0886 (3.1568) class_acc: 0.5000 (0.4892) weight_decay: 0.0500 (0.0500) grad_norm: 1.2347 (1.3301) time: 2.0893 data: 0.0009 max mem: 6925
Epoch: [25] [400/625] eta: 0:07:32 lr: 0.003996 min_lr: 0.003996 loss: 3.1980 (3.1706) class_acc: 0.4844 (0.4866) weight_decay: 0.0500 (0.0500) grad_norm: 1.0122 (1.3960) time: 2.1727 data: 0.0009 max mem: 6925
Epoch: [25] [600/625] eta: 0:00:49 lr: 0.003996 min_lr: 0.003996 loss: 3.1453 (3.1737) class_acc: 0.4805 (0.4863) weight_decay: 0.0500 (0.0500) grad_norm: 1.0835 (1.3699) time: 2.0219 data: 0.0009 max mem: 6925
Epoch: [25] [624/625] eta: 0:00:01 lr: 0.003995 min_lr: 0.003995 loss: 3.1621 (3.1738) class_acc: 0.4844 (0.4862) weight_decay: 0.0500 (0.0500) grad_norm: 0.9240 (1.3670) time: 0.9681 data: 0.0020 max mem: 6925
Epoch: [25] Total time: 0:20:22 (1.9562 s / it)
Averaged stats: lr: 0.003995 min_lr: 0.003995 loss: 3.1621 (3.1714) class_acc: 0.4844 (0.4873) weight_decay: 0.0500 (0.0500) grad_norm: 0.9240 (1.3670)
Test: [ 0/50] eta: 0:10:02 loss: 1.5453 (1.5453) acc1: 64.8000 (64.8000) acc5: 88.0000 (88.0000) time: 12.0511 data: 12.0175 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.7171 (1.6813) acc1: 63.2000 (62.6909) acc5: 85.6000 (84.5091) time: 2.1548 data: 2.1247 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.8320 (1.8279) acc1: 60.0000 (59.7333) acc5: 82.4000 (82.6667) time: 1.1454 data: 1.1162 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9867 (1.8528) acc1: 56.8000 (58.6839) acc5: 80.8000 (82.0129) time: 0.9950 data: 0.9663 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.9944 (1.8836) acc1: 55.2000 (57.9902) acc5: 78.4000 (81.3463) time: 0.7843 data: 0.7554 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9133 (1.8936) acc1: 56.8000 (58.1760) acc5: 79.2000 (81.0560) time: 0.5835 data: 0.5547 max mem: 6925
Test: Total time: 0:00:54 (1.0857 s / it)
* Acc@1 58.712 Acc@5 81.886 loss 1.850
Accuracy of the model on the 50000 test images: 58.7%
Max accuracy: 58.71%
Epoch: [26] [ 0/625] eta: 3:38:05 lr: 0.003995 min_lr: 0.003995 loss: 3.1172 (3.1172) class_acc: 0.4844 (0.4844) weight_decay: 0.0500 (0.0500) time: 20.9370 data: 20.7106 max mem: 6925
Epoch: [26] [200/625] eta: 0:14:23 lr: 0.003995 min_lr: 0.003995 loss: 3.0715 (3.1217) class_acc: 0.5078 (0.4958) weight_decay: 0.0500 (0.0500) grad_norm: 1.0979 (1.2930) time: 1.9535 data: 0.5706 max mem: 6925
Epoch: [26] [400/625] eta: 0:07:34 lr: 0.003994 min_lr: 0.003994 loss: 3.1118 (3.1366) class_acc: 0.4961 (0.4947) weight_decay: 0.0500 (0.0500) grad_norm: 1.3315 (1.3479) time: 2.0355 data: 0.0669 max mem: 6925
Epoch: [26] [600/625] eta: 0:00:50 lr: 0.003994 min_lr: 0.003994 loss: 3.1576 (3.1447) class_acc: 0.4844 (0.4934) weight_decay: 0.0500 (0.0500) grad_norm: 1.1143 (1.2935) time: 2.1260 data: 0.0008 max mem: 6925
Epoch: [26] [624/625] eta: 0:00:01 lr: 0.003994 min_lr: 0.003994 loss: 3.1863 (3.1465) class_acc: 0.4648 (0.4930) weight_decay: 0.0500 (0.0500) grad_norm: 1.0098 (1.2845) time: 0.9036 data: 0.0907 max mem: 6925
Epoch: [26] Total time: 0:20:30 (1.9684 s / it)
Averaged stats: lr: 0.003994 min_lr: 0.003994 loss: 3.1863 (3.1510) class_acc: 0.4648 (0.4920) weight_decay: 0.0500 (0.0500) grad_norm: 1.0098 (1.2845)
Test: [ 0/50] eta: 0:10:37 loss: 1.7435 (1.7435) acc1: 57.6000 (57.6000) acc5: 83.2000 (83.2000) time: 12.7491 data: 12.7147 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.8094 (1.8027) acc1: 58.4000 (59.2727) acc5: 83.2000 (82.9091) time: 1.9918 data: 1.9620 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.9481 (1.9070) acc1: 56.8000 (57.7905) acc5: 80.8000 (81.8667) time: 0.8867 data: 0.8576 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 2.0187 (1.9206) acc1: 55.2000 (56.8774) acc5: 80.0000 (81.1613) time: 0.7580 data: 0.7291 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 2.0609 (1.9766) acc1: 52.0000 (55.6098) acc5: 79.2000 (80.2342) time: 0.7924 data: 0.7630 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 2.0274 (1.9863) acc1: 52.0000 (55.2640) acc5: 79.2000 (80.1120) time: 0.6184 data: 0.5893 max mem: 6925
Test: Total time: 0:00:49 (0.9917 s / it)
* Acc@1 56.552 Acc@5 80.450 loss 1.953
Accuracy of the model on the 50000 test images: 56.6%
Max accuracy: 58.71%
Epoch: [27] [ 0/625] eta: 4:06:18 lr: 0.003994 min_lr: 0.003994 loss: 3.1902 (3.1902) class_acc: 0.5000 (0.5000) weight_decay: 0.0500 (0.0500) time: 23.6460 data: 14.9123 max mem: 6925
Epoch: [27] [200/625] eta: 0:14:50 lr: 0.003993 min_lr: 0.003993 loss: 3.1160 (3.1246) class_acc: 0.4922 (0.4966) weight_decay: 0.0500 (0.0500) grad_norm: 1.1239 (1.4555) time: 1.9603 data: 0.0009 max mem: 6925
Epoch: [27] [400/625] eta: 0:07:40 lr: 0.003993 min_lr: 0.003993 loss: 3.1534 (3.1285) class_acc: 0.4805 (0.4964) weight_decay: 0.0500 (0.0500) grad_norm: 1.1407 (1.3914) time: 2.0332 data: 0.0009 max mem: 6925
Epoch: [27] [600/625] eta: 0:00:50 lr: 0.003992 min_lr: 0.003992 loss: 3.1599 (3.1302) class_acc: 0.4922 (0.4969) weight_decay: 0.0500 (0.0500) grad_norm: 1.5402 (1.3986) time: 2.0503 data: 0.8212 max mem: 6925
Epoch: [27] [624/625] eta: 0:00:01 lr: 0.003992 min_lr: 0.003992 loss: 3.1753 (3.1318) class_acc: 0.4727 (0.4965) weight_decay: 0.0500 (0.0500) grad_norm: 1.7747 (1.4205) time: 0.9449 data: 0.4140 max mem: 6925
Epoch: [27] Total time: 0:20:43 (1.9892 s / it)
Averaged stats: lr: 0.003992 min_lr: 0.003992 loss: 3.1753 (3.1296) class_acc: 0.4727 (0.4962) weight_decay: 0.0500 (0.0500) grad_norm: 1.7747 (1.4205)
Test: [ 0/50] eta: 0:11:08 loss: 1.5979 (1.5979) acc1: 68.8000 (68.8000) acc5: 88.0000 (88.0000) time: 13.3797 data: 13.3470 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.6581 (1.6987) acc1: 62.4000 (62.4727) acc5: 83.2000 (84.0000) time: 1.9945 data: 1.9636 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.8228 (1.8282) acc1: 59.2000 (59.8095) acc5: 82.4000 (82.1714) time: 0.8776 data: 0.8470 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.9179 (1.8512) acc1: 56.0000 (59.1226) acc5: 80.0000 (81.4452) time: 0.9735 data: 0.9440 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8941 (1.8795) acc1: 55.2000 (58.2049) acc5: 79.2000 (80.8390) time: 1.0038 data: 0.9746 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8686 (1.8853) acc1: 56.0000 (57.7440) acc5: 81.6000 (80.9440) time: 0.7075 data: 0.6783 max mem: 6925
Test: Total time: 0:00:55 (1.1170 s / it)
* Acc@1 58.526 Acc@5 81.220 loss 1.852
Accuracy of the model on the 50000 test images: 58.5%
Max accuracy: 58.71%
Epoch: [28] [ 0/625] eta: 3:27:56 lr: 0.003992 min_lr: 0.003992 loss: 2.9874 (2.9874) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 19.9629 data: 19.7191 max mem: 6925
Epoch: [28] [200/625] eta: 0:14:10 lr: 0.003991 min_lr: 0.003991 loss: 3.0706 (3.1036) class_acc: 0.5156 (0.5014) weight_decay: 0.0500 (0.0500) grad_norm: 1.3169 (1.3402) time: 2.0495 data: 0.0726 max mem: 6925
Epoch: [28] [400/625] eta: 0:07:33 lr: 0.003991 min_lr: 0.003991 loss: 3.1552 (3.1116) class_acc: 0.4961 (0.4995) weight_decay: 0.0500 (0.0500) grad_norm: 1.0909 (1.3234) time: 2.1717 data: 0.0008 max mem: 6925
Epoch: [28] [600/625] eta: 0:00:50 lr: 0.003990 min_lr: 0.003990 loss: 3.1325 (3.1130) class_acc: 0.5039 (0.4991) weight_decay: 0.0500 (0.0500) grad_norm: 1.0661 (1.3014) time: 1.8675 data: 0.0006 max mem: 6925
Epoch: [28] [624/625] eta: 0:00:01 lr: 0.003990 min_lr: 0.003990 loss: 3.0940 (3.1124) class_acc: 0.5000 (0.4994) weight_decay: 0.0500 (0.0500) grad_norm: 1.1751 (1.3190) time: 0.7728 data: 0.0050 max mem: 6925
Epoch: [28] Total time: 0:20:48 (1.9980 s / it)
Averaged stats: lr: 0.003990 min_lr: 0.003990 loss: 3.0940 (3.1115) class_acc: 0.5000 (0.4995) weight_decay: 0.0500 (0.0500) grad_norm: 1.1751 (1.3190)
Test: [ 0/50] eta: 0:10:20 loss: 1.4528 (1.4528) acc1: 68.0000 (68.0000) acc5: 91.2000 (91.2000) time: 12.4128 data: 12.3422 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.5719 (1.6005) acc1: 65.6000 (65.0182) acc5: 85.6000 (85.3091) time: 2.0834 data: 2.0507 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6982 (1.7459) acc1: 60.8000 (61.2190) acc5: 83.2000 (83.5429) time: 1.0309 data: 1.0020 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8487 (1.7573) acc1: 56.8000 (60.5677) acc5: 80.8000 (82.9936) time: 0.9711 data: 0.9420 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7848 (1.7854) acc1: 57.6000 (60.0585) acc5: 80.8000 (82.1073) time: 0.7860 data: 0.7549 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7519 (1.7904) acc1: 57.6000 (59.8720) acc5: 81.6000 (82.1440) time: 0.7474 data: 0.7158 max mem: 6925
Test: Total time: 0:00:53 (1.0656 s / it)
* Acc@1 60.176 Acc@5 82.872 loss 1.754
Accuracy of the model on the 50000 test images: 60.2%
Max accuracy: 60.18%
Epoch: [29] [ 0/625] eta: 3:34:47 lr: 0.003990 min_lr: 0.003990 loss: 2.8921 (2.8921) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 20.6196 data: 19.6735 max mem: 6925
Epoch: [29] [200/625] eta: 0:13:54 lr: 0.003989 min_lr: 0.003989 loss: 3.1102 (3.0778) class_acc: 0.4922 (0.5077) weight_decay: 0.0500 (0.0500) grad_norm: 1.0316 (1.3358) time: 1.9148 data: 0.0007 max mem: 6925
Epoch: [29] [400/625] eta: 0:07:22 lr: 0.003988 min_lr: 0.003988 loss: 3.1392 (3.0914) class_acc: 0.4922 (0.5053) weight_decay: 0.0500 (0.0500) grad_norm: 1.0474 (inf) time: 1.9521 data: 0.0941 max mem: 6925
Epoch: [29] [600/625] eta: 0:00:49 lr: 0.003988 min_lr: 0.003988 loss: 3.1169 (3.0947) class_acc: 0.4883 (0.5045) weight_decay: 0.0500 (0.0500) grad_norm: 1.0542 (inf) time: 1.9093 data: 0.0932 max mem: 6925
Epoch: [29] [624/625] eta: 0:00:01 lr: 0.003987 min_lr: 0.003987 loss: 3.0418 (3.0936) class_acc: 0.5117 (0.5046) weight_decay: 0.0500 (0.0500) grad_norm: 1.1980 (inf) time: 0.9616 data: 0.0411 max mem: 6925
Epoch: [29] Total time: 0:19:59 (1.9199 s / it)
Averaged stats: lr: 0.003987 min_lr: 0.003987 loss: 3.0418 (3.0920) class_acc: 0.5117 (0.5040) weight_decay: 0.0500 (0.0500) grad_norm: 1.1980 (inf)
Test: [ 0/50] eta: 0:09:44 loss: 1.7239 (1.7239) acc1: 61.6000 (61.6000) acc5: 83.2000 (83.2000) time: 11.6800 data: 11.6341 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.6253 (1.6117) acc1: 62.4000 (63.6364) acc5: 83.2000 (84.1455) time: 1.8683 data: 1.8371 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.7728 (1.7782) acc1: 61.6000 (60.4571) acc5: 83.2000 (82.5905) time: 0.8528 data: 0.8236 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.8544 (1.7819) acc1: 57.6000 (59.6645) acc5: 83.2000 (82.6839) time: 0.9098 data: 0.8807 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8442 (1.7948) acc1: 57.6000 (59.4146) acc5: 80.8000 (82.2634) time: 0.8810 data: 0.8515 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8000 (1.8056) acc1: 56.8000 (59.1200) acc5: 81.6000 (82.4160) time: 0.5303 data: 0.5014 max mem: 6925
Test: Total time: 0:00:49 (0.9901 s / it)
* Acc@1 59.868 Acc@5 82.754 loss 1.771
Accuracy of the model on the 50000 test images: 59.9%
Max accuracy: 60.18%
Epoch: [30] [ 0/625] eta: 3:21:49 lr: 0.003987 min_lr: 0.003987 loss: 2.8488 (2.8488) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 19.3754 data: 16.8572 max mem: 6925
Epoch: [30] [200/625] eta: 0:14:00 lr: 0.003987 min_lr: 0.003987 loss: 3.0085 (3.0552) class_acc: 0.5117 (0.5130) weight_decay: 0.0500 (0.0500) grad_norm: 1.0938 (1.4296) time: 1.8245 data: 0.0011 max mem: 6925
Epoch: [30] [400/625] eta: 0:07:22 lr: 0.003986 min_lr: 0.003986 loss: 3.1038 (3.0623) class_acc: 0.5000 (0.5119) weight_decay: 0.0500 (0.0500) grad_norm: 1.4223 (1.4073) time: 1.9596 data: 0.0009 max mem: 6925
Epoch: [30] [600/625] eta: 0:00:48 lr: 0.003985 min_lr: 0.003985 loss: 3.0707 (3.0757) class_acc: 0.5039 (0.5090) weight_decay: 0.0500 (0.0500) grad_norm: 1.1808 (1.4029) time: 2.0432 data: 0.0007 max mem: 6925
Epoch: [30] [624/625] eta: 0:00:01 lr: 0.003985 min_lr: 0.003985 loss: 3.0440 (3.0745) class_acc: 0.5000 (0.5089) weight_decay: 0.0500 (0.0500) grad_norm: 1.0728 (1.3907) time: 0.4455 data: 0.0020 max mem: 6925
Epoch: [30] Total time: 0:20:06 (1.9301 s / it)
Averaged stats: lr: 0.003985 min_lr: 0.003985 loss: 3.0440 (3.0765) class_acc: 0.5000 (0.5076) weight_decay: 0.0500 (0.0500) grad_norm: 1.0728 (1.3907)
Test: [ 0/50] eta: 0:11:13 loss: 1.6353 (1.6353) acc1: 59.2000 (59.2000) acc5: 91.2000 (91.2000) time: 13.4753 data: 13.4311 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.6353 (1.6690) acc1: 60.0000 (61.6000) acc5: 84.8000 (85.4545) time: 2.0560 data: 2.0252 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.8183 (1.7773) acc1: 59.2000 (59.7333) acc5: 83.2000 (83.8095) time: 0.8502 data: 0.8213 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8960 (1.7972) acc1: 58.4000 (59.5613) acc5: 81.6000 (82.9419) time: 0.9026 data: 0.8732 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9060 (1.8167) acc1: 56.0000 (59.0439) acc5: 80.0000 (82.4195) time: 0.9249 data: 0.8950 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9060 (1.8286) acc1: 56.0000 (58.8480) acc5: 80.8000 (82.3040) time: 0.5855 data: 0.5565 max mem: 6925
Test: Total time: 0:00:52 (1.0503 s / it)
* Acc@1 59.972 Acc@5 82.842 loss 1.784
Accuracy of the model on the 50000 test images: 60.0%
Max accuracy: 60.18%
Epoch: [31] [ 0/625] eta: 3:57:11 lr: 0.003985 min_lr: 0.003985 loss: 3.0598 (3.0598) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 22.7701 data: 20.8511 max mem: 6925
Epoch: [31] [200/625] eta: 0:14:24 lr: 0.003984 min_lr: 0.003984 loss: 3.0848 (3.0588) class_acc: 0.5039 (0.5106) weight_decay: 0.0500 (0.0500) grad_norm: 1.2662 (1.2878) time: 1.8552 data: 0.1841 max mem: 6925
Epoch: [31] [400/625] eta: 0:07:23 lr: 0.003983 min_lr: 0.003983 loss: 3.0693 (3.0739) class_acc: 0.4922 (0.5074) weight_decay: 0.0500 (0.0500) grad_norm: 0.9494 (1.3058) time: 1.8745 data: 0.3572 max mem: 6925
Epoch: [31] [600/625] eta: 0:00:49 lr: 0.003982 min_lr: 0.003982 loss: 3.0930 (3.0758) class_acc: 0.4883 (0.5079) weight_decay: 0.0500 (0.0500) grad_norm: 1.1417 (1.3246) time: 1.9043 data: 0.0010 max mem: 6925
Epoch: [31] [624/625] eta: 0:00:01 lr: 0.003982 min_lr: 0.003982 loss: 3.0803 (3.0763) class_acc: 0.5078 (0.5077) weight_decay: 0.0500 (0.0500) grad_norm: 1.0807 (1.3167) time: 0.7558 data: 0.0028 max mem: 6925
Epoch: [31] Total time: 0:20:07 (1.9312 s / it)
Averaged stats: lr: 0.003982 min_lr: 0.003982 loss: 3.0803 (3.0635) class_acc: 0.5078 (0.5099) weight_decay: 0.0500 (0.0500) grad_norm: 1.0807 (1.3167)
Test: [ 0/50] eta: 0:10:57 loss: 1.6904 (1.6904) acc1: 62.4000 (62.4000) acc5: 84.8000 (84.8000) time: 13.1417 data: 13.1086 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 1.6904 (1.6463) acc1: 63.2000 (63.7091) acc5: 84.8000 (84.7273) time: 2.2853 data: 2.2560 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.8049 (1.8079) acc1: 56.8000 (59.0476) acc5: 83.2000 (82.9714) time: 1.1129 data: 1.0826 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9112 (1.8092) acc1: 55.2000 (59.0194) acc5: 81.6000 (82.5806) time: 0.9757 data: 0.9458 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.9044 (1.8450) acc1: 58.4000 (58.1073) acc5: 80.0000 (82.0098) time: 0.7017 data: 0.6729 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9060 (1.8381) acc1: 56.0000 (58.5920) acc5: 80.8000 (82.1120) time: 0.6394 data: 0.6094 max mem: 6925
Test: Total time: 0:00:52 (1.0509 s / it)
* Acc@1 59.796 Acc@5 82.828 loss 1.785
Accuracy of the model on the 50000 test images: 59.8%
Max accuracy: 60.18%
Epoch: [32] [ 0/625] eta: 3:26:25 lr: 0.003982 min_lr: 0.003982 loss: 2.9716 (2.9716) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 19.8171 data: 17.7958 max mem: 6925
Epoch: [32] [200/625] eta: 0:13:49 lr: 0.003981 min_lr: 0.003981 loss: 3.0186 (3.0264) class_acc: 0.5156 (0.5173) weight_decay: 0.0500 (0.0500) grad_norm: 1.3384 (1.3608) time: 1.7023 data: 0.0010 max mem: 6925
Epoch: [32] [400/625] eta: 0:07:07 lr: 0.003980 min_lr: 0.003980 loss: 2.9921 (3.0409) class_acc: 0.5078 (0.5155) weight_decay: 0.0500 (0.0500) grad_norm: 1.0036 (1.3758) time: 1.7798 data: 0.0008 max mem: 6925
Epoch: [32] [600/625] eta: 0:00:48 lr: 0.003979 min_lr: 0.003979 loss: 3.1072 (3.0487) class_acc: 0.5039 (0.5139) weight_decay: 0.0500 (0.0500) grad_norm: 1.4377 (1.3678) time: 1.8787 data: 0.0009 max mem: 6925
Epoch: [32] [624/625] eta: 0:00:01 lr: 0.003979 min_lr: 0.003979 loss: 3.0713 (3.0491) class_acc: 0.5117 (0.5138) weight_decay: 0.0500 (0.0500) grad_norm: 1.0305 (1.3523) time: 0.7409 data: 0.0027 max mem: 6925
Epoch: [32] Total time: 0:19:46 (1.8992 s / it)
Averaged stats: lr: 0.003979 min_lr: 0.003979 loss: 3.0713 (3.0490) class_acc: 0.5117 (0.5132) weight_decay: 0.0500 (0.0500) grad_norm: 1.0305 (1.3523)
Test: [ 0/50] eta: 0:09:27 loss: 1.6074 (1.6074) acc1: 60.8000 (60.8000) acc5: 84.0000 (84.0000) time: 11.3549 data: 11.3234 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.6793 (1.6531) acc1: 62.4000 (63.7818) acc5: 84.0000 (84.0000) time: 2.0436 data: 2.0131 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.8475 (1.8027) acc1: 61.6000 (60.3048) acc5: 82.4000 (82.7429) time: 1.1542 data: 1.1244 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9418 (1.8218) acc1: 56.0000 (59.8968) acc5: 80.8000 (82.1161) time: 1.0605 data: 1.0309 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8380 (1.8405) acc1: 57.6000 (59.3756) acc5: 80.0000 (81.8146) time: 0.6106 data: 0.5805 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8380 (1.8464) acc1: 56.8000 (59.0880) acc5: 80.8000 (81.8720) time: 0.5468 data: 0.5168 max mem: 6925
Test: Total time: 0:00:48 (0.9714 s / it)
* Acc@1 59.554 Acc@5 82.442 loss 1.793
Accuracy of the model on the 50000 test images: 59.6%
Max accuracy: 60.18%
Epoch: [33] [ 0/625] eta: 3:27:48 lr: 0.003979 min_lr: 0.003979 loss: 3.1188 (3.1188) class_acc: 0.4961 (0.4961) weight_decay: 0.0500 (0.0500) time: 19.9494 data: 19.3073 max mem: 6925
Epoch: [33] [200/625] eta: 0:14:31 lr: 0.003978 min_lr: 0.003978 loss: 3.0146 (3.0149) class_acc: 0.5117 (0.5188) weight_decay: 0.0500 (0.0500) grad_norm: 1.0547 (1.3360) time: 1.9507 data: 0.2380 max mem: 6925
Epoch: [33] [400/625] eta: 0:07:36 lr: 0.003977 min_lr: 0.003977 loss: 3.0240 (3.0281) class_acc: 0.5195 (0.5169) weight_decay: 0.0500 (0.0500) grad_norm: 1.5689 (1.3891) time: 2.1891 data: 0.0008 max mem: 6925
Epoch: [33] [600/625] eta: 0:00:50 lr: 0.003976 min_lr: 0.003976 loss: 3.0428 (3.0301) class_acc: 0.5117 (0.5166) weight_decay: 0.0500 (0.0500) grad_norm: 1.2916 (1.3928) time: 1.9063 data: 0.0009 max mem: 6925
Epoch: [33] [624/625] eta: 0:00:01 lr: 0.003975 min_lr: 0.003975 loss: 3.0648 (3.0322) class_acc: 0.5000 (0.5162) weight_decay: 0.0500 (0.0500) grad_norm: 1.2610 (1.3926) time: 0.9191 data: 0.0014 max mem: 6925
Epoch: [33] Total time: 0:20:26 (1.9626 s / it)
Averaged stats: lr: 0.003975 min_lr: 0.003975 loss: 3.0648 (3.0365) class_acc: 0.5000 (0.5156) weight_decay: 0.0500 (0.0500) grad_norm: 1.2610 (1.3926)
Test: [ 0/50] eta: 0:10:10 loss: 1.6203 (1.6203) acc1: 68.8000 (68.8000) acc5: 86.4000 (86.4000) time: 12.2026 data: 12.1372 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.6148 (1.5565) acc1: 65.6000 (64.9455) acc5: 85.6000 (85.9636) time: 2.1494 data: 2.1146 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.6279 (1.7126) acc1: 62.4000 (61.3333) acc5: 84.0000 (83.3524) time: 1.1743 data: 1.1443 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.8473 (1.7185) acc1: 59.2000 (61.1097) acc5: 81.6000 (83.2774) time: 1.1631 data: 1.1341 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8426 (1.7396) acc1: 60.0000 (60.7220) acc5: 81.6000 (83.0634) time: 0.7836 data: 0.7543 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7947 (1.7503) acc1: 58.4000 (60.4160) acc5: 82.4000 (82.9920) time: 0.7579 data: 0.7293 max mem: 6925
Test: Total time: 0:00:52 (1.0489 s / it)
* Acc@1 60.598 Acc@5 83.254 loss 1.743
Accuracy of the model on the 50000 test images: 60.6%
Max accuracy: 60.60%
Epoch: [34] [ 0/625] eta: 3:21:34 lr: 0.003975 min_lr: 0.003975 loss: 3.1289 (3.1289) class_acc: 0.4766 (0.4766) weight_decay: 0.0500 (0.0500) time: 19.3517 data: 15.3806 max mem: 6925
Epoch: [34] [200/625] eta: 0:14:04 lr: 0.003974 min_lr: 0.003974 loss: 2.9991 (3.0064) class_acc: 0.5117 (0.5228) weight_decay: 0.0500 (0.0500) grad_norm: 1.1835 (1.2491) time: 1.7967 data: 0.0007 max mem: 6925
Epoch: [34] [400/625] eta: 0:07:19 lr: 0.003973 min_lr: 0.003973 loss: 3.0045 (3.0161) class_acc: 0.5156 (0.5213) weight_decay: 0.0500 (0.0500) grad_norm: 1.2055 (1.2802) time: 1.9031 data: 0.0008 max mem: 6925
Epoch: [34] [600/625] eta: 0:00:49 lr: 0.003972 min_lr: 0.003972 loss: 3.0105 (3.0164) class_acc: 0.5117 (0.5212) weight_decay: 0.0500 (0.0500) grad_norm: 1.3915 (1.3253) time: 2.2255 data: 0.0008 max mem: 6925
Epoch: [34] [624/625] eta: 0:00:01 lr: 0.003972 min_lr: 0.003972 loss: 3.0178 (3.0175) class_acc: 0.5117 (0.5206) weight_decay: 0.0500 (0.0500) grad_norm: 1.3386 (1.3254) time: 0.8350 data: 0.0019 max mem: 6925
Epoch: [34] Total time: 0:20:00 (1.9212 s / it)
Averaged stats: lr: 0.003972 min_lr: 0.003972 loss: 3.0178 (3.0214) class_acc: 0.5117 (0.5197) weight_decay: 0.0500 (0.0500) grad_norm: 1.3386 (1.3254)
Test: [ 0/50] eta: 0:11:05 loss: 1.8300 (1.8300) acc1: 56.0000 (56.0000) acc5: 81.6000 (81.6000) time: 13.3070 data: 13.2454 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.6220 (1.6088) acc1: 64.0000 (63.8545) acc5: 86.4000 (85.4546) time: 2.0737 data: 2.0402 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.7592 (1.7939) acc1: 58.4000 (59.5810) acc5: 83.2000 (82.9714) time: 1.0210 data: 0.9910 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9259 (1.8280) acc1: 54.4000 (58.7613) acc5: 80.0000 (82.0387) time: 1.0467 data: 1.0179 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8867 (1.8375) acc1: 56.0000 (58.5171) acc5: 80.0000 (81.7951) time: 0.8617 data: 0.8309 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7565 (1.8432) acc1: 56.0000 (58.4480) acc5: 80.8000 (81.5040) time: 0.7372 data: 0.7061 max mem: 6925
Test: Total time: 0:00:54 (1.0976 s / it)
* Acc@1 59.290 Acc@5 82.336 loss 1.803
Accuracy of the model on the 50000 test images: 59.3%
Max accuracy: 60.60%
Epoch: [35] [ 0/625] eta: 3:47:54 lr: 0.003972 min_lr: 0.003972 loss: 3.0919 (3.0919) class_acc: 0.5039 (0.5039) weight_decay: 0.0500 (0.0500) time: 21.8798 data: 19.5551 max mem: 6925
Epoch: [35] [200/625] eta: 0:14:30 lr: 0.003971 min_lr: 0.003971 loss: 2.9648 (2.9780) class_acc: 0.5312 (0.5291) weight_decay: 0.0500 (0.0500) grad_norm: 1.0617 (1.2949) time: 1.7498 data: 0.1176 max mem: 6925
Epoch: [35] [400/625] eta: 0:07:20 lr: 0.003969 min_lr: 0.003969 loss: 2.9842 (2.9898) class_acc: 0.5156 (0.5255) weight_decay: 0.0500 (0.0500) grad_norm: 1.0676 (1.2975) time: 1.7379 data: 0.0135 max mem: 6925
Epoch: [35] [600/625] eta: 0:00:48 lr: 0.003968 min_lr: 0.003968 loss: 3.0046 (2.9996) class_acc: 0.5156 (0.5237) weight_decay: 0.0500 (0.0500) grad_norm: 1.1115 (1.2905) time: 2.1973 data: 0.0062 max mem: 6925
Epoch: [35] [624/625] eta: 0:00:01 lr: 0.003968 min_lr: 0.003968 loss: 3.0309 (3.0003) class_acc: 0.5117 (0.5235) weight_decay: 0.0500 (0.0500) grad_norm: 1.1767 (1.3022) time: 0.8135 data: 0.0014 max mem: 6925
Epoch: [35] Total time: 0:20:01 (1.9226 s / it)
Averaged stats: lr: 0.003968 min_lr: 0.003968 loss: 3.0309 (3.0080) class_acc: 0.5117 (0.5221) weight_decay: 0.0500 (0.0500) grad_norm: 1.1767 (1.3022)
Test: [ 0/50] eta: 0:10:24 loss: 1.6680 (1.6680) acc1: 54.4000 (54.4000) acc5: 84.8000 (84.8000) time: 12.4835 data: 12.4341 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.6422 (1.6425) acc1: 61.6000 (63.4909) acc5: 84.8000 (84.0727) time: 1.9774 data: 1.9456 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.8165 (1.7880) acc1: 60.0000 (60.0381) acc5: 81.6000 (82.5143) time: 0.9427 data: 0.9126 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.8787 (1.7920) acc1: 56.8000 (59.6903) acc5: 80.0000 (82.4000) time: 0.9014 data: 0.8711 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8084 (1.8230) acc1: 57.6000 (59.3756) acc5: 82.4000 (81.6781) time: 0.7070 data: 0.6775 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8412 (1.8310) acc1: 56.8000 (59.0400) acc5: 81.6000 (81.6960) time: 0.6704 data: 0.6413 max mem: 6925
Test: Total time: 0:00:46 (0.9260 s / it)
* Acc@1 59.762 Acc@5 82.406 loss 1.796
Accuracy of the model on the 50000 test images: 59.8%
Max accuracy: 60.60%
Epoch: [36] [ 0/625] eta: 4:13:18 lr: 0.003968 min_lr: 0.003968 loss: 3.1832 (3.1832) class_acc: 0.4492 (0.4492) weight_decay: 0.0500 (0.0500) time: 24.3173 data: 16.3657 max mem: 6925
Epoch: [36] [200/625] eta: 0:14:32 lr: 0.003967 min_lr: 0.003967 loss: 2.9427 (2.9847) class_acc: 0.5234 (0.5258) weight_decay: 0.0500 (0.0500) grad_norm: 1.2821 (1.4306) time: 1.8793 data: 0.0010 max mem: 6925
Epoch: [36] [400/625] eta: 0:07:30 lr: 0.003965 min_lr: 0.003965 loss: 3.0511 (2.9989) class_acc: 0.5195 (0.5234) weight_decay: 0.0500 (0.0500) grad_norm: 0.9982 (inf) time: 2.0168 data: 0.0009 max mem: 6925
Epoch: [36] [600/625] eta: 0:00:49 lr: 0.003964 min_lr: 0.003964 loss: 2.9943 (3.0044) class_acc: 0.5234 (0.5231) weight_decay: 0.0500 (0.0500) grad_norm: 1.2028 (inf) time: 2.0276 data: 0.0009 max mem: 6925
Epoch: [36] [624/625] eta: 0:00:01 lr: 0.003964 min_lr: 0.003964 loss: 3.0103 (3.0033) class_acc: 0.5391 (0.5234) weight_decay: 0.0500 (0.0500) grad_norm: 0.8543 (inf) time: 0.7824 data: 0.0023 max mem: 6925
Epoch: [36] Total time: 0:20:13 (1.9423 s / it)
Averaged stats: lr: 0.003964 min_lr: 0.003964 loss: 3.0103 (2.9970) class_acc: 0.5391 (0.5247) weight_decay: 0.0500 (0.0500) grad_norm: 0.8543 (inf)
Test: [ 0/50] eta: 0:09:59 loss: 1.5805 (1.5805) acc1: 66.4000 (66.4000) acc5: 83.2000 (83.2000) time: 11.9902 data: 11.9493 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.6186 (1.6432) acc1: 63.2000 (63.6364) acc5: 84.8000 (84.0727) time: 2.0130 data: 1.9827 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7689 (1.7796) acc1: 60.8000 (60.7619) acc5: 83.2000 (82.7810) time: 1.0636 data: 1.0328 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8657 (1.8014) acc1: 57.6000 (60.1290) acc5: 81.6000 (82.2968) time: 1.0146 data: 0.9819 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7912 (1.7918) acc1: 57.6000 (60.3707) acc5: 83.2000 (82.4585) time: 0.6624 data: 0.6316 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7862 (1.8020) acc1: 57.6000 (59.7920) acc5: 83.2000 (82.1920) time: 0.6074 data: 0.5786 max mem: 6925
Test: Total time: 0:00:47 (0.9474 s / it)
* Acc@1 60.494 Acc@5 83.248 loss 1.754
Accuracy of the model on the 50000 test images: 60.5%
Max accuracy: 60.60%
Epoch: [37] [ 0/625] eta: 3:42:34 lr: 0.003964 min_lr: 0.003964 loss: 2.9751 (2.9751) class_acc: 0.5156 (0.5156) weight_decay: 0.0500 (0.0500) time: 21.3667 data: 18.7211 max mem: 6925
Epoch: [37] [200/625] eta: 0:14:32 lr: 0.003962 min_lr: 0.003962 loss: 2.9673 (2.9783) class_acc: 0.5312 (0.5292) weight_decay: 0.0500 (0.0500) grad_norm: 1.1550 (1.3942) time: 1.9387 data: 0.0815 max mem: 6925
Epoch: [37] [400/625] eta: 0:07:28 lr: 0.003961 min_lr: 0.003961 loss: 3.0020 (2.9770) class_acc: 0.5117 (0.5299) weight_decay: 0.0500 (0.0500) grad_norm: 1.1054 (1.3760) time: 2.0408 data: 0.0009 max mem: 6925
Epoch: [37] [600/625] eta: 0:00:49 lr: 0.003960 min_lr: 0.003960 loss: 3.0230 (2.9894) class_acc: 0.5156 (0.5274) weight_decay: 0.0500 (0.0500) grad_norm: 0.9780 (1.3503) time: 1.9158 data: 0.0009 max mem: 6925
Epoch: [37] [624/625] eta: 0:00:01 lr: 0.003959 min_lr: 0.003959 loss: 3.0487 (2.9907) class_acc: 0.5156 (0.5269) weight_decay: 0.0500 (0.0500) grad_norm: 1.2969 (1.3457) time: 0.7548 data: 0.0018 max mem: 6925
Epoch: [37] Total time: 0:20:21 (1.9540 s / it)
Averaged stats: lr: 0.003959 min_lr: 0.003959 loss: 3.0487 (2.9852) class_acc: 0.5156 (0.5269) weight_decay: 0.0500 (0.0500) grad_norm: 1.2969 (1.3457)
Test: [ 0/50] eta: 0:09:51 loss: 1.4146 (1.4146) acc1: 66.4000 (66.4000) acc5: 87.2000 (87.2000) time: 11.8307 data: 11.7956 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.5390 (1.5148) acc1: 65.6000 (65.1636) acc5: 86.4000 (86.6182) time: 2.1718 data: 2.1422 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.6498 (1.6795) acc1: 60.8000 (61.3714) acc5: 84.8000 (84.4952) time: 1.2804 data: 1.2515 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.7543 (1.6823) acc1: 60.0000 (61.4968) acc5: 81.6000 (84.2581) time: 1.1731 data: 1.1436 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6435 (1.7137) acc1: 60.8000 (60.5854) acc5: 84.0000 (83.8439) time: 0.7404 data: 0.7100 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6435 (1.7176) acc1: 57.6000 (60.3040) acc5: 84.0000 (83.7600) time: 0.6069 data: 0.5772 max mem: 6925
Test: Total time: 0:00:53 (1.0671 s / it)
* Acc@1 61.838 Acc@5 84.170 loss 1.675
Accuracy of the model on the 50000 test images: 61.8%
Max accuracy: 61.84%
Epoch: [38] [ 0/625] eta: 3:47:28 lr: 0.003959 min_lr: 0.003959 loss: 3.0991 (3.0991) class_acc: 0.5234 (0.5234) weight_decay: 0.0500 (0.0500) time: 21.8370 data: 19.0070 max mem: 6925
Epoch: [38] [200/625] eta: 0:14:28 lr: 0.003958 min_lr: 0.003958 loss: 2.9794 (2.9765) class_acc: 0.5039 (0.5276) weight_decay: 0.0500 (0.0500) grad_norm: 1.3432 (1.4792) time: 1.8726 data: 0.3899 max mem: 6925
Epoch: [38] [400/625] eta: 0:07:25 lr: 0.003956 min_lr: 0.003956 loss: 3.0027 (2.9773) class_acc: 0.5117 (0.5273) weight_decay: 0.0500 (0.0500) grad_norm: 0.9936 (1.4375) time: 1.8665 data: 0.0011 max mem: 6925
Epoch: [38] [600/625] eta: 0:00:49 lr: 0.003955 min_lr: 0.003955 loss: 3.0331 (2.9845) class_acc: 0.5234 (0.5262) weight_decay: 0.0500 (0.0500) grad_norm: 1.4557 (1.4420) time: 1.9562 data: 0.0011 max mem: 6925
Epoch: [38] [624/625] eta: 0:00:01 lr: 0.003955 min_lr: 0.003955 loss: 2.9720 (2.9851) class_acc: 0.5273 (0.5262) weight_decay: 0.0500 (0.0500) grad_norm: 1.1137 (1.4407) time: 0.7535 data: 0.0015 max mem: 6925
Epoch: [38] Total time: 0:20:01 (1.9231 s / it)
Averaged stats: lr: 0.003955 min_lr: 0.003955 loss: 2.9720 (2.9786) class_acc: 0.5273 (0.5287) weight_decay: 0.0500 (0.0500) grad_norm: 1.1137 (1.4407)
Test: [ 0/50] eta: 0:09:47 loss: 1.3946 (1.3946) acc1: 65.6000 (65.6000) acc5: 92.0000 (92.0000) time: 11.7556 data: 11.7216 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.6177 (1.5870) acc1: 63.2000 (63.7091) acc5: 83.2000 (84.2182) time: 2.1204 data: 2.0889 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.6833 (1.7140) acc1: 60.8000 (60.8000) acc5: 83.2000 (83.3143) time: 1.2103 data: 1.1794 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8157 (1.7392) acc1: 57.6000 (60.4129) acc5: 81.6000 (82.7871) time: 1.0593 data: 1.0297 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8157 (1.7646) acc1: 56.8000 (59.8049) acc5: 80.8000 (82.4195) time: 0.5825 data: 0.5538 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8507 (1.7790) acc1: 58.4000 (59.7280) acc5: 81.6000 (82.4480) time: 0.4423 data: 0.4126 max mem: 6925
Test: Total time: 0:00:48 (0.9771 s / it)
* Acc@1 60.642 Acc@5 83.098 loss 1.745
Accuracy of the model on the 50000 test images: 60.6%
Max accuracy: 61.84%
Epoch: [39] [ 0/625] eta: 3:35:28 lr: 0.003955 min_lr: 0.003955 loss: 2.9076 (2.9076) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.6857 data: 18.9015 max mem: 6925
Epoch: [39] [200/625] eta: 0:14:11 lr: 0.003953 min_lr: 0.003953 loss: 3.0048 (2.9745) class_acc: 0.5352 (0.5294) weight_decay: 0.0500 (0.0500) grad_norm: 1.2914 (1.4326) time: 1.9225 data: 0.2660 max mem: 6925
Epoch: [39] [400/625] eta: 0:07:18 lr: 0.003952 min_lr: 0.003952 loss: 2.9674 (2.9692) class_acc: 0.5391 (0.5299) weight_decay: 0.0500 (0.0500) grad_norm: 1.0632 (1.3885) time: 1.7775 data: 0.0007 max mem: 6925
Epoch: [39] [600/625] eta: 0:00:48 lr: 0.003950 min_lr: 0.003950 loss: 2.9898 (2.9746) class_acc: 0.5352 (0.5285) weight_decay: 0.0500 (0.0500) grad_norm: 1.1028 (1.3699) time: 2.0738 data: 0.0009 max mem: 6925
Epoch: [39] [624/625] eta: 0:00:01 lr: 0.003950 min_lr: 0.003950 loss: 2.9349 (2.9738) class_acc: 0.5391 (0.5286) weight_decay: 0.0500 (0.0500) grad_norm: 1.3465 (1.4037) time: 0.8167 data: 0.0014 max mem: 6925
Epoch: [39] Total time: 0:19:52 (1.9083 s / it)
Averaged stats: lr: 0.003950 min_lr: 0.003950 loss: 2.9349 (2.9684) class_acc: 0.5391 (0.5302) weight_decay: 0.0500 (0.0500) grad_norm: 1.3465 (1.4037)
Test: [ 0/50] eta: 0:09:38 loss: 1.5985 (1.5985) acc1: 62.4000 (62.4000) acc5: 84.8000 (84.8000) time: 11.5728 data: 11.5392 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.5913 (1.6317) acc1: 63.2000 (64.3636) acc5: 84.8000 (85.0909) time: 2.0339 data: 2.0034 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7650 (1.7356) acc1: 61.6000 (61.5238) acc5: 84.0000 (84.0000) time: 1.0781 data: 1.0485 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.7960 (1.7438) acc1: 57.6000 (60.9806) acc5: 82.4000 (83.5871) time: 0.9758 data: 0.9463 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7390 (1.7739) acc1: 60.0000 (60.2732) acc5: 81.6000 (82.9463) time: 0.6220 data: 0.5913 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8156 (1.7854) acc1: 57.6000 (59.8880) acc5: 81.6000 (82.8640) time: 0.5858 data: 0.5555 max mem: 6925
Test: Total time: 0:00:46 (0.9217 s / it)
* Acc@1 60.726 Acc@5 83.074 loss 1.747
Accuracy of the model on the 50000 test images: 60.7%
Max accuracy: 61.84%
Epoch: [40] [ 0/625] eta: 3:40:37 lr: 0.003950 min_lr: 0.003950 loss: 3.0365 (3.0365) class_acc: 0.5078 (0.5078) weight_decay: 0.0500 (0.0500) time: 21.1798 data: 18.7488 max mem: 6925
Epoch: [40] [200/625] eta: 0:13:53 lr: 0.003948 min_lr: 0.003948 loss: 2.9437 (2.9492) class_acc: 0.5273 (0.5346) weight_decay: 0.0500 (0.0500) grad_norm: 1.2824 (1.1953) time: 2.1175 data: 0.0399 max mem: 6925
Epoch: [40] [400/625] eta: 0:07:15 lr: 0.003947 min_lr: 0.003947 loss: 2.9503 (2.9561) class_acc: 0.5312 (0.5330) weight_decay: 0.0500 (0.0500) grad_norm: 1.0089 (1.2700) time: 1.9525 data: 0.0098 max mem: 6925
Epoch: [40] [600/625] eta: 0:00:48 lr: 0.003945 min_lr: 0.003945 loss: 3.0061 (2.9670) class_acc: 0.5352 (0.5314) weight_decay: 0.0500 (0.0500) grad_norm: 1.2142 (1.2437) time: 2.1728 data: 0.0009 max mem: 6925
Epoch: [40] [624/625] eta: 0:00:01 lr: 0.003945 min_lr: 0.003945 loss: 2.9822 (2.9687) class_acc: 0.5156 (0.5309) weight_decay: 0.0500 (0.0500) grad_norm: 1.7418 (1.2755) time: 0.6421 data: 0.0015 max mem: 6925
Epoch: [40] Total time: 0:20:06 (1.9296 s / it)
Averaged stats: lr: 0.003945 min_lr: 0.003945 loss: 2.9822 (2.9612) class_acc: 0.5156 (0.5322) weight_decay: 0.0500 (0.0500) grad_norm: 1.7418 (1.2755)
Test: [ 0/50] eta: 0:10:39 loss: 1.5253 (1.5253) acc1: 70.4000 (70.4000) acc5: 86.4000 (86.4000) time: 12.7817 data: 12.7345 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.5253 (1.5545) acc1: 66.4000 (65.1636) acc5: 86.4000 (86.3273) time: 2.2292 data: 2.1969 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.7084 (1.6900) acc1: 62.4000 (61.8286) acc5: 83.2000 (84.5714) time: 1.1971 data: 1.1676 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.7853 (1.6990) acc1: 59.2000 (61.6774) acc5: 82.4000 (83.7936) time: 1.0191 data: 0.9905 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7505 (1.7336) acc1: 59.2000 (61.1512) acc5: 82.4000 (83.3171) time: 0.5510 data: 0.5217 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7505 (1.7271) acc1: 58.4000 (61.1040) acc5: 83.2000 (83.4880) time: 0.5302 data: 0.5012 max mem: 6925
Test: Total time: 0:00:48 (0.9636 s / it)
* Acc@1 62.146 Acc@5 84.132 loss 1.679
Accuracy of the model on the 50000 test images: 62.1%
Max accuracy: 62.15%
Epoch: [41] [ 0/625] eta: 3:29:34 lr: 0.003945 min_lr: 0.003945 loss: 2.8549 (2.8549) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.1186 data: 19.7306 max mem: 6925
Epoch: [41] [200/625] eta: 0:13:59 lr: 0.003943 min_lr: 0.003943 loss: 2.8993 (2.9200) class_acc: 0.5352 (0.5399) weight_decay: 0.0500 (0.0500) grad_norm: 0.9368 (1.2817) time: 1.8780 data: 0.0489 max mem: 6925
Epoch: [41] [400/625] eta: 0:07:20 lr: 0.003941 min_lr: 0.003941 loss: 2.9270 (2.9338) class_acc: 0.5273 (0.5371) weight_decay: 0.0500 (0.0500) grad_norm: 0.9171 (1.2928) time: 2.0374 data: 0.0118 max mem: 6925
Epoch: [41] [600/625] eta: 0:00:50 lr: 0.003940 min_lr: 0.003940 loss: 2.9921 (2.9413) class_acc: 0.5117 (0.5362) weight_decay: 0.0500 (0.0500) grad_norm: 1.1260 (1.3142) time: 2.1887 data: 0.0009 max mem: 6925
Epoch: [41] [624/625] eta: 0:00:01 lr: 0.003939 min_lr: 0.003939 loss: 2.9550 (2.9420) class_acc: 0.5273 (0.5358) weight_decay: 0.0500 (0.0500) grad_norm: 1.0763 (1.3110) time: 0.9130 data: 0.0196 max mem: 6925
Epoch: [41] Total time: 0:20:43 (1.9901 s / it)
Averaged stats: lr: 0.003939 min_lr: 0.003939 loss: 2.9550 (2.9509) class_acc: 0.5273 (0.5344) weight_decay: 0.0500 (0.0500) grad_norm: 1.0763 (1.3110)
Test: [ 0/50] eta: 0:12:12 loss: 1.5334 (1.5334) acc1: 62.4000 (62.4000) acc5: 85.6000 (85.6000) time: 14.6507 data: 14.6077 max mem: 6925
Test: [10/50] eta: 0:01:44 loss: 1.5483 (1.5781) acc1: 64.0000 (64.0727) acc5: 84.8000 (84.6546) time: 2.6135 data: 2.5826 max mem: 6925
Test: [20/50] eta: 0:01:00 loss: 1.6989 (1.7423) acc1: 60.0000 (59.8476) acc5: 82.4000 (82.4381) time: 1.3935 data: 1.3643 max mem: 6925
Test: [30/50] eta: 0:00:34 loss: 1.8089 (1.7814) acc1: 55.2000 (58.9936) acc5: 81.6000 (82.1936) time: 1.2130 data: 1.1845 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.8523 (1.7948) acc1: 56.0000 (59.1024) acc5: 81.6000 (82.4390) time: 0.7489 data: 0.7200 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8627 (1.7980) acc1: 59.2000 (59.3280) acc5: 82.4000 (82.4640) time: 0.6419 data: 0.6130 max mem: 6925
Test: Total time: 0:00:58 (1.1664 s / it)
* Acc@1 60.004 Acc@5 82.804 loss 1.768
Accuracy of the model on the 50000 test images: 60.0%
Max accuracy: 62.15%
Epoch: [42] [ 0/625] eta: 4:06:33 lr: 0.003939 min_lr: 0.003939 loss: 2.9416 (2.9416) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 23.6698 data: 19.7571 max mem: 6925
Epoch: [42] [200/625] eta: 0:14:28 lr: 0.003938 min_lr: 0.003938 loss: 2.8703 (2.9302) class_acc: 0.5430 (0.5384) weight_decay: 0.0500 (0.0500) grad_norm: 1.2939 (1.2431) time: 1.9222 data: 0.1358 max mem: 6925
Epoch: [42] [400/625] eta: 0:07:31 lr: 0.003936 min_lr: 0.003936 loss: 2.9717 (2.9398) class_acc: 0.5234 (0.5356) weight_decay: 0.0500 (0.0500) grad_norm: 1.2096 (1.3209) time: 2.1267 data: 0.1531 max mem: 6925
Epoch: [42] [600/625] eta: 0:00:50 lr: 0.003934 min_lr: 0.003934 loss: 2.9941 (2.9494) class_acc: 0.5273 (0.5340) weight_decay: 0.0500 (0.0500) grad_norm: 0.9148 (1.2969) time: 2.1248 data: 0.0013 max mem: 6925
Epoch: [42] [624/625] eta: 0:00:01 lr: 0.003934 min_lr: 0.003934 loss: 2.9674 (2.9509) class_acc: 0.5312 (0.5338) weight_decay: 0.0500 (0.0500) grad_norm: 1.0655 (1.2913) time: 0.5660 data: 0.0097 max mem: 6925
Epoch: [42] Total time: 0:20:39 (1.9830 s / it)
Averaged stats: lr: 0.003934 min_lr: 0.003934 loss: 2.9674 (2.9420) class_acc: 0.5312 (0.5362) weight_decay: 0.0500 (0.0500) grad_norm: 1.0655 (1.2913)
Test: [ 0/50] eta: 0:10:17 loss: 1.5484 (1.5484) acc1: 67.2000 (67.2000) acc5: 87.2000 (87.2000) time: 12.3576 data: 12.3261 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.5578 (1.5549) acc1: 67.2000 (66.4727) acc5: 87.2000 (85.6000) time: 2.0061 data: 1.9760 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7774 (1.7504) acc1: 58.4000 (61.2191) acc5: 83.2000 (83.5429) time: 1.0091 data: 0.9799 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8510 (1.7540) acc1: 57.6000 (61.0065) acc5: 80.8000 (83.2000) time: 0.9905 data: 0.9617 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8106 (1.7842) acc1: 59.2000 (60.2537) acc5: 80.0000 (82.5951) time: 0.8087 data: 0.7801 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8106 (1.7795) acc1: 58.4000 (60.3680) acc5: 80.8000 (82.7680) time: 0.6153 data: 0.5869 max mem: 6925
Test: Total time: 0:00:52 (1.0408 s / it)
* Acc@1 60.666 Acc@5 83.434 loss 1.744
Accuracy of the model on the 50000 test images: 60.7%
Max accuracy: 62.15%
Epoch: [43] [ 0/625] eta: 3:41:32 lr: 0.003934 min_lr: 0.003934 loss: 2.9558 (2.9558) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 21.2676 data: 19.5988 max mem: 6925
Epoch: [43] [200/625] eta: 0:14:49 lr: 0.003932 min_lr: 0.003932 loss: 2.9184 (2.9245) class_acc: 0.5430 (0.5417) weight_decay: 0.0500 (0.0500) grad_norm: 1.2432 (inf) time: 2.0464 data: 0.1432 max mem: 6925
Epoch: [43] [400/625] eta: 0:07:36 lr: 0.003930 min_lr: 0.003930 loss: 2.8636 (2.9332) class_acc: 0.5430 (0.5377) weight_decay: 0.0500 (0.0500) grad_norm: 0.9361 (inf) time: 1.8242 data: 0.0008 max mem: 6925
Epoch: [43] [600/625] eta: 0:00:50 lr: 0.003928 min_lr: 0.003928 loss: 2.9895 (2.9381) class_acc: 0.5156 (0.5362) weight_decay: 0.0500 (0.0500) grad_norm: 1.5201 (inf) time: 1.9380 data: 0.0010 max mem: 6925
Epoch: [43] [624/625] eta: 0:00:01 lr: 0.003928 min_lr: 0.003928 loss: 2.9331 (2.9384) class_acc: 0.5312 (0.5361) weight_decay: 0.0500 (0.0500) grad_norm: 0.9749 (inf) time: 0.7960 data: 0.0016 max mem: 6925
Epoch: [43] Total time: 0:20:35 (1.9763 s / it)
Averaged stats: lr: 0.003928 min_lr: 0.003928 loss: 2.9331 (2.9334) class_acc: 0.5312 (0.5385) weight_decay: 0.0500 (0.0500) grad_norm: 0.9749 (inf)
Test: [ 0/50] eta: 0:09:27 loss: 1.6145 (1.6145) acc1: 61.6000 (61.6000) acc5: 88.8000 (88.8000) time: 11.3486 data: 11.2842 max mem: 6925
Test: [10/50] eta: 0:01:15 loss: 1.6022 (1.5991) acc1: 61.6000 (63.2727) acc5: 84.8000 (85.2364) time: 1.8857 data: 1.8505 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.7309 (1.7516) acc1: 60.0000 (59.8857) acc5: 83.2000 (83.4286) time: 0.9808 data: 0.9488 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.8539 (1.7392) acc1: 56.8000 (60.0774) acc5: 80.8000 (83.5355) time: 0.9458 data: 0.9157 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7768 (1.7732) acc1: 57.6000 (59.6293) acc5: 84.0000 (83.2781) time: 0.7238 data: 0.6944 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7932 (1.7772) acc1: 57.6000 (59.5680) acc5: 82.4000 (83.2160) time: 0.5418 data: 0.5123 max mem: 6925
Test: Total time: 0:00:48 (0.9703 s / it)
* Acc@1 60.740 Acc@5 83.434 loss 1.743
Accuracy of the model on the 50000 test images: 60.7%
Max accuracy: 62.15%
Epoch: [44] [ 0/625] eta: 3:42:01 lr: 0.003928 min_lr: 0.003928 loss: 2.7135 (2.7135) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 21.3143 data: 19.4399 max mem: 6925
Epoch: [44] [200/625] eta: 0:14:24 lr: 0.003926 min_lr: 0.003926 loss: 3.0140 (2.9044) class_acc: 0.5234 (0.5441) weight_decay: 0.0500 (0.0500) grad_norm: 1.2869 (1.3972) time: 2.0904 data: 0.0113 max mem: 6925
Epoch: [44] [400/625] eta: 0:07:25 lr: 0.003924 min_lr: 0.003924 loss: 2.9050 (2.9186) class_acc: 0.5430 (0.5419) weight_decay: 0.0500 (0.0500) grad_norm: 0.8778 (1.3094) time: 1.8901 data: 0.0246 max mem: 6925
Epoch: [44] [600/625] eta: 0:00:49 lr: 0.003922 min_lr: 0.003922 loss: 2.8956 (2.9286) class_acc: 0.5352 (0.5394) weight_decay: 0.0500 (0.0500) grad_norm: 0.8106 (1.3358) time: 1.9013 data: 0.0013 max mem: 6925
Epoch: [44] [624/625] eta: 0:00:01 lr: 0.003922 min_lr: 0.003922 loss: 2.9731 (2.9301) class_acc: 0.5312 (0.5392) weight_decay: 0.0500 (0.0500) grad_norm: 1.2090 (1.3324) time: 0.9229 data: 0.0015 max mem: 6925
Epoch: [44] Total time: 0:20:12 (1.9405 s / it)
Averaged stats: lr: 0.003922 min_lr: 0.003922 loss: 2.9731 (2.9274) class_acc: 0.5312 (0.5397) weight_decay: 0.0500 (0.0500) grad_norm: 1.2090 (1.3324)
Test: [ 0/50] eta: 0:10:30 loss: 1.4943 (1.4943) acc1: 66.4000 (66.4000) acc5: 88.0000 (88.0000) time: 12.6161 data: 12.5806 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.4967 (1.5486) acc1: 66.4000 (66.1091) acc5: 86.4000 (85.6727) time: 2.2015 data: 2.1715 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.7834 (1.6990) acc1: 60.8000 (61.9810) acc5: 84.8000 (84.3810) time: 1.2395 data: 1.2103 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.8006 (1.7374) acc1: 57.6000 (61.0065) acc5: 82.4000 (83.8452) time: 1.2861 data: 1.2573 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.7829 (1.7725) acc1: 57.6000 (60.2537) acc5: 81.6000 (83.2390) time: 0.9021 data: 0.8729 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8397 (1.7726) acc1: 56.8000 (60.3680) acc5: 81.6000 (83.2000) time: 0.8476 data: 0.8185 max mem: 6925
Test: Total time: 0:00:56 (1.1276 s / it)
* Acc@1 61.468 Acc@5 83.610 loss 1.737
Accuracy of the model on the 50000 test images: 61.5%
Max accuracy: 62.15%
Epoch: [45] [ 0/625] eta: 3:18:20 lr: 0.003922 min_lr: 0.003922 loss: 2.9192 (2.9192) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 19.0411 data: 18.4726 max mem: 6925
Epoch: [45] [200/625] eta: 0:14:01 lr: 0.003920 min_lr: 0.003920 loss: 2.9436 (2.9174) class_acc: 0.5391 (0.5403) weight_decay: 0.0500 (0.0500) grad_norm: 0.8159 (1.2109) time: 1.7715 data: 0.0786 max mem: 6925
Epoch: [45] [400/625] eta: 0:07:16 lr: 0.003918 min_lr: 0.003918 loss: 2.9106 (2.9199) class_acc: 0.5312 (0.5404) weight_decay: 0.0500 (0.0500) grad_norm: 0.9903 (1.2693) time: 1.8442 data: 0.1830 max mem: 6925
Epoch: [45] [600/625] eta: 0:00:48 lr: 0.003916 min_lr: 0.003916 loss: 2.9214 (2.9259) class_acc: 0.5391 (0.5384) weight_decay: 0.0500 (0.0500) grad_norm: 0.8993 (1.2925) time: 1.9073 data: 0.0012 max mem: 6925
Epoch: [45] [624/625] eta: 0:00:01 lr: 0.003916 min_lr: 0.003916 loss: 2.9684 (2.9272) class_acc: 0.5352 (0.5383) weight_decay: 0.0500 (0.0500) grad_norm: 0.9356 (1.2781) time: 0.4772 data: 0.0021 max mem: 6925
Epoch: [45] Total time: 0:20:01 (1.9227 s / it)
Averaged stats: lr: 0.003916 min_lr: 0.003916 loss: 2.9684 (2.9216) class_acc: 0.5352 (0.5409) weight_decay: 0.0500 (0.0500) grad_norm: 0.9356 (1.2781)
Test: [ 0/50] eta: 0:09:24 loss: 1.7441 (1.7441) acc1: 58.4000 (58.4000) acc5: 86.4000 (86.4000) time: 11.2998 data: 11.2464 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.5480 (1.5691) acc1: 63.2000 (63.7818) acc5: 85.6000 (85.5273) time: 2.0670 data: 2.0357 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.7006 (1.7000) acc1: 59.2000 (60.8762) acc5: 84.0000 (84.0381) time: 1.2135 data: 1.1848 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.7248 (1.7178) acc1: 58.4000 (60.8000) acc5: 82.4000 (83.4323) time: 1.0643 data: 1.0359 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6980 (1.7380) acc1: 60.0000 (60.4683) acc5: 84.0000 (83.2390) time: 0.6151 data: 0.5863 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6848 (1.7511) acc1: 56.8000 (59.8880) acc5: 84.0000 (83.0880) time: 0.5468 data: 0.5175 max mem: 6925
Test: Total time: 0:00:48 (0.9660 s / it)
* Acc@1 61.130 Acc@5 83.742 loss 1.710
Accuracy of the model on the 50000 test images: 61.1%
Max accuracy: 62.15%
Epoch: [46] [ 0/625] eta: 3:29:43 lr: 0.003916 min_lr: 0.003916 loss: 2.7991 (2.7991) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.1331 data: 19.2931 max mem: 6925
Epoch: [46] [200/625] eta: 0:14:21 lr: 0.003913 min_lr: 0.003913 loss: 2.8562 (2.8912) class_acc: 0.5469 (0.5459) weight_decay: 0.0500 (0.0500) grad_norm: 0.9579 (1.3348) time: 1.9625 data: 0.0012 max mem: 6925
Epoch: [46] [400/625] eta: 0:07:14 lr: 0.003911 min_lr: 0.003911 loss: 2.9281 (2.9055) class_acc: 0.5391 (0.5446) weight_decay: 0.0500 (0.0500) grad_norm: 1.0952 (1.2355) time: 1.8532 data: 0.0827 max mem: 6925
Epoch: [46] [600/625] eta: 0:00:47 lr: 0.003909 min_lr: 0.003909 loss: 2.9695 (2.9143) class_acc: 0.5352 (0.5427) weight_decay: 0.0500 (0.0500) grad_norm: 1.1656 (1.2662) time: 1.9842 data: 0.0803 max mem: 6925
Epoch: [46] [624/625] eta: 0:00:01 lr: 0.003909 min_lr: 0.003909 loss: 2.9388 (2.9152) class_acc: 0.5352 (0.5426) weight_decay: 0.0500 (0.0500) grad_norm: 1.3658 (1.2701) time: 0.7899 data: 0.0580 max mem: 6925
Epoch: [46] Total time: 0:19:28 (1.8695 s / it)
Averaged stats: lr: 0.003909 min_lr: 0.003909 loss: 2.9388 (2.9115) class_acc: 0.5352 (0.5433) weight_decay: 0.0500 (0.0500) grad_norm: 1.3658 (1.2701)
Test: [ 0/50] eta: 0:10:53 loss: 1.6086 (1.6086) acc1: 61.6000 (61.6000) acc5: 87.2000 (87.2000) time: 13.0703 data: 13.0363 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.6086 (1.6448) acc1: 61.6000 (62.3273) acc5: 85.6000 (85.0182) time: 2.1848 data: 2.1532 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.7773 (1.8275) acc1: 60.0000 (59.0857) acc5: 83.2000 (82.4762) time: 1.1150 data: 1.0841 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.9523 (1.8250) acc1: 56.8000 (59.2774) acc5: 80.0000 (82.3226) time: 0.8398 data: 0.8093 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.8264 (1.8460) acc1: 57.6000 (59.2195) acc5: 82.4000 (82.0293) time: 0.4138 data: 0.3843 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8704 (1.8572) acc1: 56.8000 (58.9440) acc5: 82.4000 (81.8400) time: 0.3560 data: 0.3277 max mem: 6925
Test: Total time: 0:00:44 (0.8920 s / it)
* Acc@1 59.646 Acc@5 82.812 loss 1.809
Accuracy of the model on the 50000 test images: 59.6%
Max accuracy: 62.15%
Epoch: [47] [ 0/625] eta: 3:32:12 lr: 0.003909 min_lr: 0.003909 loss: 2.8642 (2.8642) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 20.3725 data: 18.9833 max mem: 6925
Epoch: [47] [200/625] eta: 0:14:22 lr: 0.003907 min_lr: 0.003907 loss: 2.8732 (2.9016) class_acc: 0.5430 (0.5446) weight_decay: 0.0500 (0.0500) grad_norm: 1.0148 (1.3564) time: 1.9106 data: 0.0009 max mem: 6925
Epoch: [47] [400/625] eta: 0:07:24 lr: 0.003905 min_lr: 0.003905 loss: 2.9338 (2.9137) class_acc: 0.5430 (0.5435) weight_decay: 0.0500 (0.0500) grad_norm: 1.1221 (1.3654) time: 1.9306 data: 0.0263 max mem: 6925
Epoch: [47] [600/625] eta: 0:00:49 lr: 0.003902 min_lr: 0.003902 loss: 2.9428 (2.9162) class_acc: 0.5352 (0.5432) weight_decay: 0.0500 (0.0500) grad_norm: 1.0656 (1.3009) time: 2.0464 data: 0.0163 max mem: 6925
Epoch: [47] [624/625] eta: 0:00:01 lr: 0.003902 min_lr: 0.003902 loss: 2.9078 (2.9159) class_acc: 0.5352 (0.5432) weight_decay: 0.0500 (0.0500) grad_norm: 1.4084 (1.3057) time: 0.7693 data: 0.0020 max mem: 6925
Epoch: [47] Total time: 0:20:11 (1.9381 s / it)
Averaged stats: lr: 0.003902 min_lr: 0.003902 loss: 2.9078 (2.9066) class_acc: 0.5352 (0.5443) weight_decay: 0.0500 (0.0500) grad_norm: 1.4084 (1.3057)
Test: [ 0/50] eta: 0:10:48 loss: 1.7339 (1.7339) acc1: 61.6000 (61.6000) acc5: 87.2000 (87.2000) time: 12.9681 data: 12.9217 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.6171 (1.6379) acc1: 64.0000 (64.0727) acc5: 84.0000 (85.0909) time: 2.1209 data: 2.0902 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7562 (1.8136) acc1: 58.4000 (60.2286) acc5: 83.2000 (83.1238) time: 1.0912 data: 1.0624 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8728 (1.8236) acc1: 57.6000 (59.5097) acc5: 80.8000 (82.7613) time: 1.1155 data: 1.0845 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8269 (1.8557) acc1: 57.6000 (58.9268) acc5: 81.6000 (82.0488) time: 0.9010 data: 0.8691 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9261 (1.8813) acc1: 58.4000 (58.7520) acc5: 80.8000 (81.5840) time: 0.8333 data: 0.8028 max mem: 6925
Test: Total time: 0:00:55 (1.1038 s / it)
* Acc@1 59.460 Acc@5 82.372 loss 1.825
Accuracy of the model on the 50000 test images: 59.5%
Max accuracy: 62.15%
Epoch: [48] [ 0/625] eta: 3:31:06 lr: 0.003902 min_lr: 0.003902 loss: 2.9074 (2.9074) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 20.2671 data: 18.4036 max mem: 6925
Epoch: [48] [200/625] eta: 0:13:57 lr: 0.003900 min_lr: 0.003900 loss: 2.8618 (2.8828) class_acc: 0.5547 (0.5493) weight_decay: 0.0500 (0.0500) grad_norm: 1.1788 (1.2746) time: 1.8203 data: 0.2215 max mem: 6925
Epoch: [48] [400/625] eta: 0:07:18 lr: 0.003898 min_lr: 0.003898 loss: 2.9279 (2.8889) class_acc: 0.5430 (0.5481) weight_decay: 0.0500 (0.0500) grad_norm: 0.9037 (1.2080) time: 2.0216 data: 1.5093 max mem: 6925
Epoch: [48] [600/625] eta: 0:00:48 lr: 0.003895 min_lr: 0.003895 loss: 2.8956 (2.8957) class_acc: 0.5430 (0.5470) weight_decay: 0.0500 (0.0500) grad_norm: 1.0284 (1.2749) time: 2.0790 data: 0.0015 max mem: 6925
Epoch: [48] [624/625] eta: 0:00:01 lr: 0.003895 min_lr: 0.003895 loss: 2.9651 (2.8979) class_acc: 0.5234 (0.5468) weight_decay: 0.0500 (0.0500) grad_norm: 0.9469 (1.2640) time: 0.4926 data: 0.0016 max mem: 6925
Epoch: [48] Total time: 0:19:47 (1.9008 s / it)
Averaged stats: lr: 0.003895 min_lr: 0.003895 loss: 2.9651 (2.9003) class_acc: 0.5234 (0.5460) weight_decay: 0.0500 (0.0500) grad_norm: 0.9469 (1.2640)
Test: [ 0/50] eta: 0:10:08 loss: 1.6080 (1.6080) acc1: 63.2000 (63.2000) acc5: 89.6000 (89.6000) time: 12.1708 data: 12.1324 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.5101 (1.5075) acc1: 64.0000 (65.3818) acc5: 85.6000 (86.3273) time: 2.2019 data: 2.1710 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.6590 (1.6886) acc1: 60.8000 (62.0191) acc5: 84.8000 (84.1905) time: 1.2136 data: 1.1843 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8033 (1.7101) acc1: 59.2000 (61.5484) acc5: 84.0000 (83.7419) time: 1.0221 data: 0.9927 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6739 (1.7223) acc1: 60.0000 (61.0927) acc5: 83.2000 (83.7659) time: 0.5942 data: 0.5636 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7808 (1.7382) acc1: 57.6000 (60.4160) acc5: 83.2000 (83.3920) time: 0.5562 data: 0.5262 max mem: 6925
Test: Total time: 0:00:49 (0.9823 s / it)
* Acc@1 61.116 Acc@5 83.852 loss 1.698
Accuracy of the model on the 50000 test images: 61.1%
Max accuracy: 62.15%
Epoch: [49] [ 0/625] eta: 3:54:13 lr: 0.003895 min_lr: 0.003895 loss: 2.7917 (2.7917) class_acc: 0.5234 (0.5234) weight_decay: 0.0500 (0.0500) time: 22.4857 data: 18.4860 max mem: 6925
Epoch: [49] [200/625] eta: 0:14:03 lr: 0.003893 min_lr: 0.003893 loss: 2.8539 (2.8788) class_acc: 0.5586 (0.5501) weight_decay: 0.0500 (0.0500) grad_norm: 1.0740 (1.2281) time: 1.7118 data: 0.2770 max mem: 6925
Epoch: [49] [400/625] eta: 0:07:20 lr: 0.003890 min_lr: 0.003890 loss: 2.8988 (2.8864) class_acc: 0.5508 (0.5488) weight_decay: 0.0500 (0.0500) grad_norm: 1.1498 (1.2100) time: 1.9807 data: 0.0009 max mem: 6925
Epoch: [49] [600/625] eta: 0:00:49 lr: 0.003888 min_lr: 0.003888 loss: 2.8540 (2.8901) class_acc: 0.5430 (0.5469) weight_decay: 0.0500 (0.0500) grad_norm: 1.2795 (inf) time: 1.9331 data: 0.0008 max mem: 6925
Epoch: [49] [624/625] eta: 0:00:01 lr: 0.003888 min_lr: 0.003888 loss: 2.9256 (2.8919) class_acc: 0.5352 (0.5466) weight_decay: 0.0500 (0.0500) grad_norm: 0.8847 (inf) time: 0.6846 data: 0.0013 max mem: 6925
Epoch: [49] Total time: 0:20:20 (1.9535 s / it)
Averaged stats: lr: 0.003888 min_lr: 0.003888 loss: 2.9256 (2.8939) class_acc: 0.5352 (0.5467) weight_decay: 0.0500 (0.0500) grad_norm: 0.8847 (inf)
Test: [ 0/50] eta: 0:10:52 loss: 1.6756 (1.6756) acc1: 64.0000 (64.0000) acc5: 87.2000 (87.2000) time: 13.0586 data: 13.0279 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.7367 (1.6988) acc1: 62.4000 (62.9818) acc5: 84.0000 (84.5818) time: 2.0642 data: 2.0347 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.8366 (1.8360) acc1: 58.4000 (59.0476) acc5: 81.6000 (83.0095) time: 0.9969 data: 0.9680 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9281 (1.8713) acc1: 55.2000 (58.1419) acc5: 80.8000 (82.1936) time: 0.9886 data: 0.9601 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9341 (1.8778) acc1: 56.8000 (58.1659) acc5: 80.0000 (81.7756) time: 0.7747 data: 0.7460 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8797 (1.8818) acc1: 59.2000 (58.3840) acc5: 80.0000 (81.6160) time: 0.7029 data: 0.6743 max mem: 6925
Test: Total time: 0:00:52 (1.0478 s / it)
* Acc@1 59.546 Acc@5 82.186 loss 1.841
Accuracy of the model on the 50000 test images: 59.5%
Max accuracy: 62.15%
Epoch: [50] [ 0/625] eta: 4:06:19 lr: 0.003888 min_lr: 0.003888 loss: 2.9456 (2.9456) class_acc: 0.5078 (0.5078) weight_decay: 0.0500 (0.0500) time: 23.6467 data: 19.1529 max mem: 6925
Epoch: [50] [200/625] eta: 0:14:06 lr: 0.003885 min_lr: 0.003885 loss: 2.8645 (2.8836) class_acc: 0.5547 (0.5509) weight_decay: 0.0500 (0.0500) grad_norm: 0.8884 (1.1582) time: 1.8171 data: 0.0018 max mem: 6925
Epoch: [50] [400/625] eta: 0:07:28 lr: 0.003883 min_lr: 0.003883 loss: 2.8858 (2.8912) class_acc: 0.5547 (0.5486) weight_decay: 0.0500 (0.0500) grad_norm: 1.1791 (1.2101) time: 2.1442 data: 0.0009 max mem: 6925
Epoch: [50] [600/625] eta: 0:00:49 lr: 0.003881 min_lr: 0.003881 loss: 2.8859 (2.8981) class_acc: 0.5352 (0.5464) weight_decay: 0.0500 (0.0500) grad_norm: 1.3555 (1.2363) time: 1.8726 data: 0.0010 max mem: 6925
Epoch: [50] [624/625] eta: 0:00:01 lr: 0.003880 min_lr: 0.003880 loss: 2.8450 (2.8976) class_acc: 0.5625 (0.5467) weight_decay: 0.0500 (0.0500) grad_norm: 1.3279 (1.2509) time: 0.5119 data: 0.0027 max mem: 6925
Epoch: [50] Total time: 0:20:23 (1.9581 s / it)
Averaged stats: lr: 0.003880 min_lr: 0.003880 loss: 2.8450 (2.8880) class_acc: 0.5625 (0.5485) weight_decay: 0.0500 (0.0500) grad_norm: 1.3279 (1.2509)
Test: [ 0/50] eta: 0:09:59 loss: 1.4435 (1.4435) acc1: 65.6000 (65.6000) acc5: 88.0000 (88.0000) time: 11.9876 data: 11.9558 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.5731 (1.5552) acc1: 66.4000 (65.4545) acc5: 86.4000 (85.8182) time: 2.1254 data: 2.0962 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.6727 (1.7282) acc1: 60.8000 (61.1429) acc5: 84.8000 (83.7714) time: 1.1970 data: 1.1681 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8169 (1.7639) acc1: 58.4000 (60.4903) acc5: 81.6000 (82.9419) time: 1.1260 data: 1.0974 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7769 (1.7806) acc1: 58.4000 (60.4488) acc5: 82.4000 (82.7122) time: 0.6847 data: 0.6562 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8204 (1.8022) acc1: 59.2000 (60.2240) acc5: 81.6000 (82.3520) time: 0.6791 data: 0.6505 max mem: 6925
Test: Total time: 0:00:51 (1.0314 s / it)
* Acc@1 60.938 Acc@5 83.204 loss 1.753
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 62.15%
Epoch: [51] [ 0/625] eta: 3:51:17 lr: 0.003880 min_lr: 0.003880 loss: 2.7048 (2.7048) class_acc: 0.6211 (0.6211) weight_decay: 0.0500 (0.0500) time: 22.2040 data: 20.3569 max mem: 6925
Epoch: [51] [200/625] eta: 0:14:30 lr: 0.003878 min_lr: 0.003878 loss: 2.8945 (2.8704) class_acc: 0.5391 (0.5527) weight_decay: 0.0500 (0.0500) grad_norm: 0.7946 (1.2090) time: 1.9756 data: 0.0272 max mem: 6925
Epoch: [51] [400/625] eta: 0:07:34 lr: 0.003875 min_lr: 0.003875 loss: 2.8309 (2.8777) class_acc: 0.5469 (0.5506) weight_decay: 0.0500 (0.0500) grad_norm: 0.9594 (1.2342) time: 1.8764 data: 0.0008 max mem: 6925
Epoch: [51] [600/625] eta: 0:00:50 lr: 0.003873 min_lr: 0.003873 loss: 2.8375 (2.8791) class_acc: 0.5586 (0.5504) weight_decay: 0.0500 (0.0500) grad_norm: 0.9946 (1.2329) time: 2.0193 data: 0.0008 max mem: 6925
Epoch: [51] [624/625] eta: 0:00:01 lr: 0.003873 min_lr: 0.003873 loss: 2.9395 (2.8810) class_acc: 0.5430 (0.5503) weight_decay: 0.0500 (0.0500) grad_norm: 0.8678 (1.2198) time: 1.2169 data: 0.0014 max mem: 6925
Epoch: [51] Total time: 0:20:40 (1.9856 s / it)
Averaged stats: lr: 0.003873 min_lr: 0.003873 loss: 2.9395 (2.8812) class_acc: 0.5430 (0.5494) weight_decay: 0.0500 (0.0500) grad_norm: 0.8678 (1.2198)
Test: [ 0/50] eta: 0:11:00 loss: 1.6701 (1.6701) acc1: 60.0000 (60.0000) acc5: 84.8000 (84.8000) time: 13.2194 data: 13.1748 max mem: 6925
Test: [10/50] eta: 0:01:33 loss: 1.5639 (1.5386) acc1: 64.0000 (64.2182) acc5: 87.2000 (86.0364) time: 2.3310 data: 2.3001 max mem: 6925
Test: [20/50] eta: 0:00:55 loss: 1.6363 (1.7020) acc1: 60.0000 (60.6476) acc5: 84.0000 (83.7333) time: 1.2643 data: 1.2344 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.8078 (1.7178) acc1: 58.4000 (60.9548) acc5: 82.4000 (83.5613) time: 1.0818 data: 1.0513 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7061 (1.7280) acc1: 60.8000 (60.5854) acc5: 83.2000 (83.5317) time: 0.6133 data: 0.5834 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6918 (1.7403) acc1: 59.2000 (60.3040) acc5: 83.2000 (83.3280) time: 0.6165 data: 0.5865 max mem: 6925
Test: Total time: 0:00:52 (1.0571 s / it)
* Acc@1 61.918 Acc@5 84.126 loss 1.693
Accuracy of the model on the 50000 test images: 61.9%
Max accuracy: 62.15%
Epoch: [52] [ 0/625] eta: 3:36:55 lr: 0.003873 min_lr: 0.003873 loss: 2.7016 (2.7016) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 20.8243 data: 20.5724 max mem: 6925
Epoch: [52] [200/625] eta: 0:14:23 lr: 0.003870 min_lr: 0.003870 loss: 2.8677 (2.8619) class_acc: 0.5469 (0.5544) weight_decay: 0.0500 (0.0500) grad_norm: 1.3431 (1.3962) time: 1.9861 data: 0.0013 max mem: 6925
Epoch: [52] [400/625] eta: 0:07:18 lr: 0.003867 min_lr: 0.003867 loss: 2.8898 (2.8722) class_acc: 0.5430 (0.5520) weight_decay: 0.0500 (0.0500) grad_norm: 0.9424 (1.3339) time: 1.9013 data: 0.0013 max mem: 6925
Epoch: [52] [600/625] eta: 0:00:48 lr: 0.003865 min_lr: 0.003865 loss: 2.8805 (2.8734) class_acc: 0.5430 (0.5516) weight_decay: 0.0500 (0.0500) grad_norm: 1.1526 (1.2971) time: 1.8392 data: 0.0009 max mem: 6925
Epoch: [52] [624/625] eta: 0:00:01 lr: 0.003865 min_lr: 0.003865 loss: 2.8861 (2.8733) class_acc: 0.5391 (0.5515) weight_decay: 0.0500 (0.0500) grad_norm: 1.0384 (1.2943) time: 0.9595 data: 0.0018 max mem: 6925
Epoch: [52] Total time: 0:19:39 (1.8871 s / it)
Averaged stats: lr: 0.003865 min_lr: 0.003865 loss: 2.8861 (2.8779) class_acc: 0.5391 (0.5507) weight_decay: 0.0500 (0.0500) grad_norm: 1.0384 (1.2943)
Test: [ 0/50] eta: 0:10:25 loss: 1.7829 (1.7829) acc1: 62.4000 (62.4000) acc5: 81.6000 (81.6000) time: 12.5080 data: 12.4740 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.5657 (1.6094) acc1: 64.8000 (64.4364) acc5: 85.6000 (84.8000) time: 2.0084 data: 1.9780 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7396 (1.7440) acc1: 60.0000 (60.9143) acc5: 83.2000 (83.1238) time: 1.0139 data: 0.9842 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8042 (1.7724) acc1: 56.0000 (59.5871) acc5: 83.2000 (82.8645) time: 1.0330 data: 1.0039 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7771 (1.7959) acc1: 56.0000 (59.0439) acc5: 82.4000 (82.4585) time: 0.7239 data: 0.6936 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7536 (1.7923) acc1: 60.0000 (59.3120) acc5: 82.4000 (82.3360) time: 0.5961 data: 0.5654 max mem: 6925
Test: Total time: 0:00:48 (0.9675 s / it)
* Acc@1 60.278 Acc@5 83.156 loss 1.763
Accuracy of the model on the 50000 test images: 60.3%
Max accuracy: 62.15%
Epoch: [53] [ 0/625] eta: 4:02:41 lr: 0.003865 min_lr: 0.003865 loss: 3.0579 (3.0579) class_acc: 0.5117 (0.5117) weight_decay: 0.0500 (0.0500) time: 23.2978 data: 21.0287 max mem: 6925
Epoch: [53] [200/625] eta: 0:14:10 lr: 0.003862 min_lr: 0.003862 loss: 2.8573 (2.8707) class_acc: 0.5508 (0.5529) weight_decay: 0.0500 (0.0500) grad_norm: 1.0043 (1.2588) time: 1.7693 data: 0.0313 max mem: 6925
Epoch: [53] [400/625] eta: 0:07:21 lr: 0.003859 min_lr: 0.003859 loss: 2.9416 (2.8717) class_acc: 0.5312 (0.5520) weight_decay: 0.0500 (0.0500) grad_norm: 1.1663 (1.2516) time: 2.0039 data: 0.2269 max mem: 6925
Epoch: [53] [600/625] eta: 0:00:49 lr: 0.003857 min_lr: 0.003857 loss: 2.8659 (2.8768) class_acc: 0.5430 (0.5511) weight_decay: 0.0500 (0.0500) grad_norm: 0.9741 (1.2723) time: 1.9625 data: 0.0009 max mem: 6925
Epoch: [53] [624/625] eta: 0:00:01 lr: 0.003856 min_lr: 0.003856 loss: 2.8485 (2.8769) class_acc: 0.5625 (0.5514) weight_decay: 0.0500 (0.0500) grad_norm: 1.0450 (1.2705) time: 0.7442 data: 0.0015 max mem: 6925
Epoch: [53] Total time: 0:20:00 (1.9209 s / it)
Averaged stats: lr: 0.003856 min_lr: 0.003856 loss: 2.8485 (2.8732) class_acc: 0.5625 (0.5516) weight_decay: 0.0500 (0.0500) grad_norm: 1.0450 (1.2705)
Test: [ 0/50] eta: 0:09:56 loss: 1.5390 (1.5390) acc1: 63.2000 (63.2000) acc5: 84.0000 (84.0000) time: 11.9395 data: 11.9026 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.5390 (1.5498) acc1: 65.6000 (65.9636) acc5: 87.2000 (85.5273) time: 2.0998 data: 2.0693 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.6297 (1.6793) acc1: 63.2000 (62.9714) acc5: 85.6000 (84.0762) time: 1.1366 data: 1.1075 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.7688 (1.7043) acc1: 59.2000 (62.4258) acc5: 84.8000 (83.7677) time: 1.0750 data: 1.0465 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7099 (1.7062) acc1: 58.4000 (62.0488) acc5: 82.4000 (83.6878) time: 0.6974 data: 0.6688 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7444 (1.7210) acc1: 58.4000 (61.7600) acc5: 82.4000 (83.4880) time: 0.6493 data: 0.6208 max mem: 6925
Test: Total time: 0:00:49 (0.9986 s / it)
* Acc@1 62.420 Acc@5 84.448 loss 1.693
Accuracy of the model on the 50000 test images: 62.4%
Max accuracy: 62.42%
Epoch: [54] [ 0/625] eta: 3:31:21 lr: 0.003856 min_lr: 0.003856 loss: 3.0218 (3.0218) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 20.2910 data: 19.9497 max mem: 6925
Epoch: [54] [200/625] eta: 0:14:29 lr: 0.003854 min_lr: 0.003854 loss: 2.8638 (2.8435) class_acc: 0.5469 (0.5565) weight_decay: 0.0500 (0.0500) grad_norm: 0.8323 (1.2351) time: 2.0299 data: 1.7442 max mem: 6925
Epoch: [54] [400/625] eta: 0:07:26 lr: 0.003851 min_lr: 0.003851 loss: 2.8706 (2.8628) class_acc: 0.5430 (0.5536) weight_decay: 0.0500 (0.0500) grad_norm: 1.1266 (1.2845) time: 1.8441 data: 1.5492 max mem: 6925
Epoch: [54] [600/625] eta: 0:00:48 lr: 0.003848 min_lr: 0.003848 loss: 2.9133 (2.8661) class_acc: 0.5391 (0.5527) weight_decay: 0.0500 (0.0500) grad_norm: 0.8930 (1.2782) time: 1.8270 data: 0.0009 max mem: 6925
Epoch: [54] [624/625] eta: 0:00:01 lr: 0.003848 min_lr: 0.003848 loss: 2.8905 (2.8674) class_acc: 0.5430 (0.5523) weight_decay: 0.0500 (0.0500) grad_norm: 0.9698 (1.2734) time: 1.0795 data: 0.0057 max mem: 6925
Epoch: [54] Total time: 0:19:58 (1.9181 s / it)
Averaged stats: lr: 0.003848 min_lr: 0.003848 loss: 2.8905 (2.8656) class_acc: 0.5430 (0.5535) weight_decay: 0.0500 (0.0500) grad_norm: 0.9698 (1.2734)
Test: [ 0/50] eta: 0:10:44 loss: 1.5757 (1.5757) acc1: 64.8000 (64.8000) acc5: 88.0000 (88.0000) time: 12.8824 data: 12.8516 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.5757 (1.5617) acc1: 64.8000 (65.4545) acc5: 85.6000 (85.6727) time: 2.2094 data: 2.1769 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.7651 (1.7312) acc1: 59.2000 (61.4095) acc5: 84.0000 (83.9619) time: 1.2203 data: 1.1890 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8296 (1.7341) acc1: 57.6000 (61.1613) acc5: 81.6000 (83.6129) time: 1.0768 data: 1.0476 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7860 (1.7464) acc1: 60.8000 (61.1317) acc5: 81.6000 (83.3171) time: 0.6490 data: 0.6189 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7856 (1.7565) acc1: 58.4000 (60.7520) acc5: 82.4000 (83.1360) time: 0.6297 data: 0.5994 max mem: 6925
Test: Total time: 0:00:50 (1.0185 s / it)
* Acc@1 61.270 Acc@5 83.820 loss 1.716
Accuracy of the model on the 50000 test images: 61.3%
Max accuracy: 62.42%
Epoch: [55] [ 0/625] eta: 3:54:27 lr: 0.003848 min_lr: 0.003848 loss: 2.8508 (2.8508) class_acc: 0.5430 (0.5430) weight_decay: 0.0500 (0.0500) time: 22.5083 data: 22.2776 max mem: 6925
Epoch: [55] [200/625] eta: 0:14:14 lr: 0.003845 min_lr: 0.003845 loss: 2.8017 (2.8342) class_acc: 0.5664 (0.5618) weight_decay: 0.0500 (0.0500) grad_norm: 0.9278 (1.3440) time: 1.8030 data: 0.0052 max mem: 6925
Epoch: [55] [400/625] eta: 0:07:22 lr: 0.003842 min_lr: 0.003842 loss: 2.8136 (2.8495) class_acc: 0.5664 (0.5560) weight_decay: 0.0500 (0.0500) grad_norm: 0.8965 (1.2596) time: 1.8610 data: 0.0009 max mem: 6925
Epoch: [55] [600/625] eta: 0:00:48 lr: 0.003839 min_lr: 0.003839 loss: 2.8393 (2.8541) class_acc: 0.5664 (0.5553) weight_decay: 0.0500 (0.0500) grad_norm: 1.0338 (1.3017) time: 1.8824 data: 0.0008 max mem: 6925
Epoch: [55] [624/625] eta: 0:00:01 lr: 0.003839 min_lr: 0.003839 loss: 2.8157 (2.8540) class_acc: 0.5586 (0.5553) weight_decay: 0.0500 (0.0500) grad_norm: 0.9884 (1.2924) time: 0.7831 data: 0.0016 max mem: 6925
Epoch: [55] Total time: 0:19:44 (1.8958 s / it)
Averaged stats: lr: 0.003839 min_lr: 0.003839 loss: 2.8157 (2.8590) class_acc: 0.5586 (0.5541) weight_decay: 0.0500 (0.0500) grad_norm: 0.9884 (1.2924)
Test: [ 0/50] eta: 0:10:41 loss: 1.7188 (1.7188) acc1: 59.2000 (59.2000) acc5: 83.2000 (83.2000) time: 12.8372 data: 12.7902 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.4788 (1.5178) acc1: 67.2000 (66.6909) acc5: 87.2000 (86.2545) time: 2.2276 data: 2.1968 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.6577 (1.7205) acc1: 62.4000 (60.8000) acc5: 85.6000 (83.8095) time: 1.1415 data: 1.1121 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8448 (1.7236) acc1: 56.8000 (60.5936) acc5: 81.6000 (83.5871) time: 0.9219 data: 0.8928 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8045 (1.7633) acc1: 58.4000 (59.8439) acc5: 81.6000 (82.5561) time: 0.5157 data: 0.4868 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7664 (1.7643) acc1: 58.4000 (59.9040) acc5: 81.6000 (82.8480) time: 0.4653 data: 0.4359 max mem: 6925
Test: Total time: 0:00:46 (0.9366 s / it)
* Acc@1 61.046 Acc@5 83.762 loss 1.720
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 62.42%
Epoch: [56] [ 0/625] eta: 3:26:16 lr: 0.003839 min_lr: 0.003839 loss: 2.6460 (2.6460) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 19.8019 data: 16.6346 max mem: 6925
Epoch: [56] [200/625] eta: 0:13:56 lr: 0.003836 min_lr: 0.003836 loss: 2.8107 (2.8464) class_acc: 0.5742 (0.5597) weight_decay: 0.0500 (0.0500) grad_norm: 1.1190 (1.3075) time: 1.8903 data: 0.0007 max mem: 6925
Epoch: [56] [400/625] eta: 0:07:20 lr: 0.003833 min_lr: 0.003833 loss: 2.9058 (2.8473) class_acc: 0.5391 (0.5584) weight_decay: 0.0500 (0.0500) grad_norm: 1.2450 (inf) time: 1.8143 data: 0.0008 max mem: 6925
Epoch: [56] [600/625] eta: 0:00:49 lr: 0.003831 min_lr: 0.003831 loss: 2.8694 (2.8561) class_acc: 0.5508 (0.5565) weight_decay: 0.0500 (0.0500) grad_norm: 0.8405 (inf) time: 2.0300 data: 0.0007 max mem: 6925
Epoch: [56] [624/625] eta: 0:00:01 lr: 0.003830 min_lr: 0.003830 loss: 2.8395 (2.8569) class_acc: 0.5508 (0.5564) weight_decay: 0.0500 (0.0500) grad_norm: 0.9188 (inf) time: 0.8041 data: 0.0317 max mem: 6925
Epoch: [56] Total time: 0:19:58 (1.9178 s / it)
Averaged stats: lr: 0.003830 min_lr: 0.003830 loss: 2.8395 (2.8559) class_acc: 0.5508 (0.5556) weight_decay: 0.0500 (0.0500) grad_norm: 0.9188 (inf)
Test: [ 0/50] eta: 0:10:19 loss: 1.9076 (1.9076) acc1: 56.0000 (56.0000) acc5: 83.2000 (83.2000) time: 12.3820 data: 12.3506 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.6272 (1.5932) acc1: 65.6000 (64.1455) acc5: 84.0000 (84.6546) time: 2.1178 data: 2.0881 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.6509 (1.7104) acc1: 60.8000 (61.3333) acc5: 84.0000 (83.8857) time: 1.1496 data: 1.1206 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.7214 (1.7187) acc1: 58.4000 (60.9548) acc5: 83.2000 (83.3548) time: 1.1642 data: 1.1354 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.7214 (1.7443) acc1: 59.2000 (60.7415) acc5: 82.4000 (83.0829) time: 0.9112 data: 0.8815 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6834 (1.7377) acc1: 59.2000 (60.5760) acc5: 84.0000 (83.3120) time: 0.8432 data: 0.8135 max mem: 6925
Test: Total time: 0:00:54 (1.0873 s / it)
* Acc@1 61.324 Acc@5 83.626 loss 1.726
Accuracy of the model on the 50000 test images: 61.3%
Max accuracy: 62.42%
Epoch: [57] [ 0/625] eta: 3:50:47 lr: 0.003830 min_lr: 0.003830 loss: 2.6479 (2.6479) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 22.1563 data: 21.9166 max mem: 6925
Epoch: [57] [200/625] eta: 0:14:13 lr: 0.003827 min_lr: 0.003827 loss: 2.8614 (2.8477) class_acc: 0.5508 (0.5560) weight_decay: 0.0500 (0.0500) grad_norm: 1.0032 (1.1917) time: 1.8933 data: 0.0011 max mem: 6925
Epoch: [57] [400/625] eta: 0:07:29 lr: 0.003824 min_lr: 0.003824 loss: 2.8372 (2.8548) class_acc: 0.5547 (0.5557) weight_decay: 0.0500 (0.0500) grad_norm: 1.2819 (1.2182) time: 2.0396 data: 0.0025 max mem: 6925
Epoch: [57] [600/625] eta: 0:00:49 lr: 0.003821 min_lr: 0.003821 loss: 2.8902 (2.8575) class_acc: 0.5469 (0.5556) weight_decay: 0.0500 (0.0500) grad_norm: 0.9535 (1.2309) time: 2.0961 data: 0.0012 max mem: 6925
Epoch: [57] [624/625] eta: 0:00:01 lr: 0.003821 min_lr: 0.003821 loss: 2.8300 (2.8568) class_acc: 0.5547 (0.5556) weight_decay: 0.0500 (0.0500) grad_norm: 0.8877 (1.2379) time: 0.4824 data: 0.0022 max mem: 6925
Epoch: [57] Total time: 0:20:26 (1.9624 s / it)
Averaged stats: lr: 0.003821 min_lr: 0.003821 loss: 2.8300 (2.8520) class_acc: 0.5547 (0.5559) weight_decay: 0.0500 (0.0500) grad_norm: 0.8877 (1.2379)
Test: [ 0/50] eta: 0:11:52 loss: 1.7238 (1.7238) acc1: 63.2000 (63.2000) acc5: 83.2000 (83.2000) time: 14.2507 data: 14.2121 max mem: 6925
Test: [10/50] eta: 0:01:33 loss: 1.5903 (1.5861) acc1: 64.0000 (64.2182) acc5: 87.2000 (85.3091) time: 2.3354 data: 2.3045 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.6532 (1.6746) acc1: 62.4000 (62.1333) acc5: 84.8000 (84.3429) time: 1.2055 data: 1.1759 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.7503 (1.6907) acc1: 60.8000 (61.7032) acc5: 84.0000 (84.1290) time: 1.0729 data: 1.0433 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7365 (1.7332) acc1: 61.6000 (60.9951) acc5: 84.0000 (83.3171) time: 0.6160 data: 0.5859 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7957 (1.7473) acc1: 59.2000 (60.7200) acc5: 82.4000 (83.1360) time: 0.6154 data: 0.5858 max mem: 6925
Test: Total time: 0:00:51 (1.0224 s / it)
* Acc@1 61.524 Acc@5 83.898 loss 1.707
Accuracy of the model on the 50000 test images: 61.5%
Max accuracy: 62.42%
Epoch: [58] [ 0/625] eta: 4:00:09 lr: 0.003821 min_lr: 0.003821 loss: 2.7057 (2.7057) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 23.0554 data: 16.9972 max mem: 6925
Epoch: [58] [200/625] eta: 0:14:33 lr: 0.003818 min_lr: 0.003818 loss: 2.8020 (2.8251) class_acc: 0.5625 (0.5635) weight_decay: 0.0500 (0.0500) grad_norm: 1.2855 (1.4059) time: 1.9970 data: 0.0012 max mem: 6925
Epoch: [58] [400/625] eta: 0:07:24 lr: 0.003815 min_lr: 0.003815 loss: 2.8202 (2.8379) class_acc: 0.5547 (0.5604) weight_decay: 0.0500 (0.0500) grad_norm: 1.0823 (1.2320) time: 1.9112 data: 0.0016 max mem: 6925
Epoch: [58] [600/625] eta: 0:00:48 lr: 0.003812 min_lr: 0.003812 loss: 2.9047 (2.8422) class_acc: 0.5547 (0.5590) weight_decay: 0.0500 (0.0500) grad_norm: 0.8952 (1.2026) time: 1.8034 data: 0.0011 max mem: 6925
Epoch: [58] [624/625] eta: 0:00:01 lr: 0.003812 min_lr: 0.003812 loss: 2.8677 (2.8429) class_acc: 0.5469 (0.5589) weight_decay: 0.0500 (0.0500) grad_norm: 1.2237 (1.2243) time: 0.8668 data: 0.0015 max mem: 6925
Epoch: [58] Total time: 0:19:57 (1.9155 s / it)
Averaged stats: lr: 0.003812 min_lr: 0.003812 loss: 2.8677 (2.8474) class_acc: 0.5469 (0.5575) weight_decay: 0.0500 (0.0500) grad_norm: 1.2237 (1.2243)
Test: [ 0/50] eta: 0:10:22 loss: 1.5548 (1.5548) acc1: 65.6000 (65.6000) acc5: 88.0000 (88.0000) time: 12.4588 data: 12.4231 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.5548 (1.5639) acc1: 67.2000 (66.5455) acc5: 85.6000 (86.4000) time: 2.0219 data: 1.9913 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6813 (1.7716) acc1: 60.8000 (61.0286) acc5: 84.0000 (83.8476) time: 1.0191 data: 0.9893 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9449 (1.7768) acc1: 56.0000 (61.1355) acc5: 81.6000 (83.3032) time: 1.0037 data: 0.9738 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8493 (1.7942) acc1: 60.8000 (60.6049) acc5: 80.8000 (82.9073) time: 0.6842 data: 0.6544 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7921 (1.8023) acc1: 58.4000 (60.2240) acc5: 82.4000 (82.8000) time: 0.6074 data: 0.5770 max mem: 6925
Test: Total time: 0:00:47 (0.9502 s / it)
* Acc@1 60.598 Acc@5 83.612 loss 1.764
Accuracy of the model on the 50000 test images: 60.6%
Max accuracy: 62.42%
Epoch: [59] [ 0/625] eta: 3:51:40 lr: 0.003812 min_lr: 0.003812 loss: 2.8768 (2.8768) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 22.2411 data: 20.3049 max mem: 6925
Epoch: [59] [200/625] eta: 0:14:29 lr: 0.003809 min_lr: 0.003809 loss: 2.8674 (2.8288) class_acc: 0.5586 (0.5629) weight_decay: 0.0500 (0.0500) grad_norm: 1.1774 (1.4347) time: 1.8928 data: 0.0209 max mem: 6925
Epoch: [59] [400/625] eta: 0:07:25 lr: 0.003805 min_lr: 0.003805 loss: 2.8757 (2.8359) class_acc: 0.5586 (0.5609) weight_decay: 0.0500 (0.0500) grad_norm: 0.9331 (1.3446) time: 1.9092 data: 0.0009 max mem: 6925
Epoch: [59] [600/625] eta: 0:00:49 lr: 0.003802 min_lr: 0.003802 loss: 2.8590 (2.8436) class_acc: 0.5508 (0.5587) weight_decay: 0.0500 (0.0500) grad_norm: 1.0409 (1.3338) time: 2.0150 data: 0.0448 max mem: 6925
Epoch: [59] [624/625] eta: 0:00:01 lr: 0.003802 min_lr: 0.003802 loss: 2.8276 (2.8445) class_acc: 0.5391 (0.5585) weight_decay: 0.0500 (0.0500) grad_norm: 0.9370 (1.3295) time: 0.7405 data: 0.0299 max mem: 6925
Epoch: [59] Total time: 0:20:09 (1.9358 s / it)
Averaged stats: lr: 0.003802 min_lr: 0.003802 loss: 2.8276 (2.8430) class_acc: 0.5391 (0.5581) weight_decay: 0.0500 (0.0500) grad_norm: 0.9370 (1.3295)
Test: [ 0/50] eta: 0:10:07 loss: 1.4453 (1.4453) acc1: 64.8000 (64.8000) acc5: 90.4000 (90.4000) time: 12.1543 data: 12.0903 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.5417 (1.5588) acc1: 64.8000 (65.1636) acc5: 88.0000 (86.3273) time: 2.1025 data: 2.0696 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.7195 (1.6946) acc1: 60.8000 (61.5238) acc5: 84.8000 (84.8000) time: 1.1425 data: 1.1125 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8291 (1.7342) acc1: 57.6000 (61.0581) acc5: 82.4000 (83.8710) time: 1.1428 data: 1.1127 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8291 (1.7549) acc1: 58.4000 (60.7805) acc5: 81.6000 (83.3951) time: 0.8550 data: 0.8249 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7893 (1.7683) acc1: 58.4000 (60.4640) acc5: 84.0000 (83.2960) time: 0.8619 data: 0.8324 max mem: 6925
Test: Total time: 0:00:52 (1.0538 s / it)
* Acc@1 61.614 Acc@5 84.132 loss 1.728
Accuracy of the model on the 50000 test images: 61.6%
Max accuracy: 62.42%
Epoch: [60] [ 0/625] eta: 4:10:12 lr: 0.003802 min_lr: 0.003802 loss: 2.6160 (2.6160) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 24.0199 data: 21.1165 max mem: 6925
Epoch: [60] [200/625] eta: 0:14:27 lr: 0.003799 min_lr: 0.003799 loss: 2.7800 (2.8228) class_acc: 0.5625 (0.5592) weight_decay: 0.0500 (0.0500) grad_norm: 1.3291 (1.3294) time: 1.8995 data: 0.0011 max mem: 6925
Epoch: [60] [400/625] eta: 0:07:24 lr: 0.003796 min_lr: 0.003796 loss: 2.8391 (2.8360) class_acc: 0.5430 (0.5585) weight_decay: 0.0500 (0.0500) grad_norm: 0.8651 (1.2722) time: 1.9852 data: 0.0009 max mem: 6925
Epoch: [60] [600/625] eta: 0:00:49 lr: 0.003793 min_lr: 0.003793 loss: 2.9116 (2.8482) class_acc: 0.5391 (0.5564) weight_decay: 0.0500 (0.0500) grad_norm: 0.9143 (1.2941) time: 1.9433 data: 0.1061 max mem: 6925
Epoch: [60] [624/625] eta: 0:00:01 lr: 0.003792 min_lr: 0.003792 loss: 2.8358 (2.8476) class_acc: 0.5547 (0.5565) weight_decay: 0.0500 (0.0500) grad_norm: 0.9286 (1.2988) time: 0.7832 data: 0.0124 max mem: 6925
Epoch: [60] Total time: 0:20:01 (1.9225 s / it)
Averaged stats: lr: 0.003792 min_lr: 0.003792 loss: 2.8358 (2.8396) class_acc: 0.5547 (0.5586) weight_decay: 0.0500 (0.0500) grad_norm: 0.9286 (1.2988)
Test: [ 0/50] eta: 0:10:24 loss: 1.5386 (1.5386) acc1: 65.6000 (65.6000) acc5: 87.2000 (87.2000) time: 12.4945 data: 12.4529 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.5856 (1.6271) acc1: 61.6000 (62.5455) acc5: 86.4000 (85.3818) time: 2.1008 data: 2.0694 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.8864 (1.8108) acc1: 56.8000 (59.0857) acc5: 82.4000 (82.7429) time: 1.0813 data: 1.0516 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9148 (1.8223) acc1: 56.8000 (59.3290) acc5: 80.0000 (82.1677) time: 1.0460 data: 1.0165 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8767 (1.8422) acc1: 58.4000 (58.4390) acc5: 80.0000 (82.1463) time: 0.6754 data: 0.6452 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8749 (1.8448) acc1: 57.6000 (58.3680) acc5: 81.6000 (82.0800) time: 0.6678 data: 0.6365 max mem: 6925
Test: Total time: 0:00:48 (0.9685 s / it)
* Acc@1 59.264 Acc@5 82.560 loss 1.813
Accuracy of the model on the 50000 test images: 59.3%
Max accuracy: 62.42%
Epoch: [61] [ 0/625] eta: 3:43:38 lr: 0.003792 min_lr: 0.003792 loss: 2.8190 (2.8190) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 21.4701 data: 14.9259 max mem: 6925
Epoch: [61] [200/625] eta: 0:14:20 lr: 0.003789 min_lr: 0.003789 loss: 2.8673 (2.8310) class_acc: 0.5508 (0.5609) weight_decay: 0.0500 (0.0500) grad_norm: 1.1115 (1.2683) time: 1.8983 data: 0.0009 max mem: 6925
Epoch: [61] [400/625] eta: 0:07:25 lr: 0.003786 min_lr: 0.003786 loss: 2.8692 (2.8390) class_acc: 0.5508 (0.5592) weight_decay: 0.0500 (0.0500) grad_norm: 1.3612 (1.3347) time: 1.9164 data: 0.0111 max mem: 6925
Epoch: [61] [600/625] eta: 0:00:49 lr: 0.003782 min_lr: 0.003782 loss: 2.7909 (2.8422) class_acc: 0.5508 (0.5592) weight_decay: 0.0500 (0.0500) grad_norm: 1.0114 (1.2732) time: 1.9786 data: 0.0008 max mem: 6925
Epoch: [61] [624/625] eta: 0:00:01 lr: 0.003782 min_lr: 0.003782 loss: 2.8414 (2.8425) class_acc: 0.5547 (0.5590) weight_decay: 0.0500 (0.0500) grad_norm: 0.9288 (1.2660) time: 0.8202 data: 0.0023 max mem: 6925
Epoch: [61] Total time: 0:20:02 (1.9242 s / it)
Averaged stats: lr: 0.003782 min_lr: 0.003782 loss: 2.8414 (2.8349) class_acc: 0.5547 (0.5602) weight_decay: 0.0500 (0.0500) grad_norm: 0.9288 (1.2660)
Test: [ 0/50] eta: 0:11:21 loss: 1.6327 (1.6327) acc1: 62.4000 (62.4000) acc5: 84.0000 (84.0000) time: 13.6219 data: 13.5908 max mem: 6925
Test: [10/50] eta: 0:01:35 loss: 1.6327 (1.6171) acc1: 64.0000 (62.9091) acc5: 84.8000 (84.8727) time: 2.3827 data: 2.3521 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.7444 (1.7367) acc1: 60.8000 (61.0667) acc5: 84.0000 (83.4286) time: 1.1779 data: 1.1482 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8159 (1.7531) acc1: 59.2000 (60.9548) acc5: 81.6000 (83.2516) time: 0.9891 data: 0.9603 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8268 (1.7610) acc1: 60.0000 (60.9171) acc5: 81.6000 (82.9268) time: 0.6424 data: 0.6134 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7222 (1.7579) acc1: 59.2000 (61.0240) acc5: 82.4000 (82.9440) time: 0.6139 data: 0.5836 max mem: 6925
Test: Total time: 0:00:50 (1.0112 s / it)
* Acc@1 61.396 Acc@5 83.752 loss 1.709
Accuracy of the model on the 50000 test images: 61.4%
Max accuracy: 62.42%
Epoch: [62] [ 0/625] eta: 3:38:27 lr: 0.003782 min_lr: 0.003782 loss: 2.8342 (2.8342) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 20.9722 data: 16.4174 max mem: 6925
Epoch: [62] [200/625] eta: 0:13:56 lr: 0.003779 min_lr: 0.003779 loss: 2.8288 (2.8225) class_acc: 0.5625 (0.5628) weight_decay: 0.0500 (0.0500) grad_norm: 1.1064 (1.3188) time: 1.8071 data: 0.0790 max mem: 6925
Epoch: [62] [400/625] eta: 0:07:17 lr: 0.003775 min_lr: 0.003775 loss: 2.8219 (2.8333) class_acc: 0.5703 (0.5608) weight_decay: 0.0500 (0.0500) grad_norm: 0.9344 (1.2619) time: 1.9238 data: 0.0009 max mem: 6925
Epoch: [62] [600/625] eta: 0:00:48 lr: 0.003772 min_lr: 0.003772 loss: 2.7773 (2.8332) class_acc: 0.5547 (0.5604) weight_decay: 0.0500 (0.0500) grad_norm: 1.1760 (inf) time: 1.8759 data: 0.0009 max mem: 6925
Epoch: [62] [624/625] eta: 0:00:01 lr: 0.003772 min_lr: 0.003772 loss: 2.8577 (2.8349) class_acc: 0.5508 (0.5602) weight_decay: 0.0500 (0.0500) grad_norm: 1.5628 (inf) time: 0.8288 data: 0.0024 max mem: 6925
Epoch: [62] Total time: 0:19:54 (1.9117 s / it)
Averaged stats: lr: 0.003772 min_lr: 0.003772 loss: 2.8577 (2.8293) class_acc: 0.5508 (0.5609) weight_decay: 0.0500 (0.0500) grad_norm: 1.5628 (inf)
Test: [ 0/50] eta: 0:09:51 loss: 1.5780 (1.5780) acc1: 66.4000 (66.4000) acc5: 86.4000 (86.4000) time: 11.8396 data: 11.8087 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.5780 (1.6928) acc1: 62.4000 (62.2545) acc5: 84.0000 (84.7273) time: 1.9592 data: 1.9296 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.8389 (1.8033) acc1: 59.2000 (59.7714) acc5: 83.2000 (83.4667) time: 1.0391 data: 1.0103 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8732 (1.8015) acc1: 59.2000 (60.0516) acc5: 82.4000 (83.2774) time: 1.0481 data: 1.0198 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7238 (1.8222) acc1: 60.0000 (59.8439) acc5: 82.4000 (82.5951) time: 0.7844 data: 0.7558 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7782 (1.8125) acc1: 59.2000 (60.1760) acc5: 83.2000 (82.6560) time: 0.6986 data: 0.6698 max mem: 6925
Test: Total time: 0:00:51 (1.0289 s / it)
* Acc@1 60.876 Acc@5 83.292 loss 1.767
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 62.42%
Epoch: [63] [ 0/625] eta: 3:39:31 lr: 0.003772 min_lr: 0.003772 loss: 2.7720 (2.7720) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 21.0744 data: 20.7895 max mem: 6925
Epoch: [63] [200/625] eta: 0:14:50 lr: 0.003768 min_lr: 0.003768 loss: 2.7867 (2.8070) class_acc: 0.5508 (0.5671) weight_decay: 0.0500 (0.0500) grad_norm: 1.1077 (1.1345) time: 2.0347 data: 0.0012 max mem: 6925
Epoch: [63] [400/625] eta: 0:07:32 lr: 0.003765 min_lr: 0.003765 loss: 2.8311 (2.8138) class_acc: 0.5625 (0.5637) weight_decay: 0.0500 (0.0500) grad_norm: 0.8851 (1.2268) time: 1.9254 data: 0.0009 max mem: 6925
Epoch: [63] [600/625] eta: 0:00:50 lr: 0.003762 min_lr: 0.003762 loss: 2.8592 (2.8209) class_acc: 0.5469 (0.5621) weight_decay: 0.0500 (0.0500) grad_norm: 1.5083 (1.2493) time: 2.1212 data: 0.0008 max mem: 6925
Epoch: [63] [624/625] eta: 0:00:01 lr: 0.003761 min_lr: 0.003761 loss: 2.8278 (2.8218) class_acc: 0.5508 (0.5618) weight_decay: 0.0500 (0.0500) grad_norm: 0.9930 (1.2435) time: 0.7643 data: 0.0014 max mem: 6925
Epoch: [63] Total time: 0:20:22 (1.9561 s / it)
Averaged stats: lr: 0.003761 min_lr: 0.003761 loss: 2.8278 (2.8263) class_acc: 0.5508 (0.5621) weight_decay: 0.0500 (0.0500) grad_norm: 0.9930 (1.2435)
Test: [ 0/50] eta: 0:09:56 loss: 1.6210 (1.6210) acc1: 62.4000 (62.4000) acc5: 85.6000 (85.6000) time: 11.9307 data: 11.8677 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 1.5662 (1.5315) acc1: 65.6000 (65.2364) acc5: 85.6000 (85.8182) time: 1.8240 data: 1.7902 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.6587 (1.6628) acc1: 62.4000 (62.8191) acc5: 84.0000 (84.4191) time: 0.8355 data: 0.8047 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.7734 (1.6889) acc1: 61.6000 (61.8065) acc5: 82.4000 (84.1548) time: 0.9636 data: 0.9321 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6955 (1.7019) acc1: 60.0000 (61.6976) acc5: 83.2000 (83.9024) time: 0.9014 data: 0.8707 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6734 (1.6933) acc1: 60.8000 (61.6160) acc5: 84.0000 (84.0000) time: 0.5337 data: 0.5037 max mem: 6925
Test: Total time: 0:00:50 (1.0031 s / it)
* Acc@1 61.936 Acc@5 84.118 loss 1.682
Accuracy of the model on the 50000 test images: 61.9%
Max accuracy: 62.42%
Epoch: [64] [ 0/625] eta: 3:40:41 lr: 0.003761 min_lr: 0.003761 loss: 2.6217 (2.6217) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 21.1870 data: 17.6727 max mem: 6925
Epoch: [64] [200/625] eta: 0:14:37 lr: 0.003758 min_lr: 0.003758 loss: 2.8332 (2.8201) class_acc: 0.5586 (0.5627) weight_decay: 0.0500 (0.0500) grad_norm: 1.1612 (1.3003) time: 2.0106 data: 0.0010 max mem: 6925
Epoch: [64] [400/625] eta: 0:07:30 lr: 0.003754 min_lr: 0.003754 loss: 2.8209 (2.8220) class_acc: 0.5547 (0.5619) weight_decay: 0.0500 (0.0500) grad_norm: 0.9827 (1.2450) time: 1.9085 data: 0.0008 max mem: 6925
Epoch: [64] [600/625] eta: 0:00:50 lr: 0.003751 min_lr: 0.003751 loss: 2.8171 (2.8199) class_acc: 0.5586 (0.5629) weight_decay: 0.0500 (0.0500) grad_norm: 0.8009 (1.2681) time: 2.0981 data: 0.0010 max mem: 6925
Epoch: [64] [624/625] eta: 0:00:01 lr: 0.003751 min_lr: 0.003751 loss: 2.8312 (2.8212) class_acc: 0.5508 (0.5626) weight_decay: 0.0500 (0.0500) grad_norm: 1.1437 (1.2814) time: 0.8006 data: 0.0017 max mem: 6925
Epoch: [64] Total time: 0:20:21 (1.9551 s / it)
Averaged stats: lr: 0.003751 min_lr: 0.003751 loss: 2.8312 (2.8232) class_acc: 0.5508 (0.5622) weight_decay: 0.0500 (0.0500) grad_norm: 1.1437 (1.2814)
Test: [ 0/50] eta: 0:09:31 loss: 1.6664 (1.6664) acc1: 61.6000 (61.6000) acc5: 83.2000 (83.2000) time: 11.4297 data: 11.3871 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 1.5881 (1.5522) acc1: 64.0000 (64.7273) acc5: 85.6000 (85.7455) time: 1.9411 data: 1.9096 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.6836 (1.7482) acc1: 60.8000 (60.0000) acc5: 84.0000 (83.4667) time: 0.9651 data: 0.9357 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.9691 (1.7971) acc1: 56.0000 (59.2000) acc5: 80.8000 (82.1677) time: 0.9085 data: 0.8787 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.8379 (1.7934) acc1: 56.8000 (59.5122) acc5: 81.6000 (82.5171) time: 0.6988 data: 0.6684 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8091 (1.8007) acc1: 57.6000 (59.6160) acc5: 83.2000 (82.2720) time: 0.6233 data: 0.5938 max mem: 6925
Test: Total time: 0:00:46 (0.9307 s / it)
* Acc@1 61.004 Acc@5 83.316 loss 1.749
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 62.42%
Epoch: [65] [ 0/625] eta: 3:50:56 lr: 0.003751 min_lr: 0.003751 loss: 2.5871 (2.5871) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 22.1697 data: 15.6432 max mem: 6925
Epoch: [65] [200/625] eta: 0:14:21 lr: 0.003747 min_lr: 0.003747 loss: 2.8189 (2.8131) class_acc: 0.5586 (0.5634) weight_decay: 0.0500 (0.0500) grad_norm: 0.9981 (1.3700) time: 1.9165 data: 0.0006 max mem: 6925
Epoch: [65] [400/625] eta: 0:07:27 lr: 0.003744 min_lr: 0.003744 loss: 2.8081 (2.8105) class_acc: 0.5625 (0.5648) weight_decay: 0.0500 (0.0500) grad_norm: 0.9273 (1.3073) time: 1.8601 data: 0.0008 max mem: 6925
Epoch: [65] [600/625] eta: 0:00:49 lr: 0.003740 min_lr: 0.003740 loss: 2.7720 (2.8221) class_acc: 0.5586 (0.5627) weight_decay: 0.0500 (0.0500) grad_norm: 0.9874 (1.2668) time: 1.7886 data: 0.0007 max mem: 6925
Epoch: [65] [624/625] eta: 0:00:01 lr: 0.003740 min_lr: 0.003740 loss: 2.7891 (2.8211) class_acc: 0.5664 (0.5628) weight_decay: 0.0500 (0.0500) grad_norm: 0.8847 (1.2826) time: 0.8361 data: 0.0014 max mem: 6925
Epoch: [65] Total time: 0:20:08 (1.9337 s / it)
Averaged stats: lr: 0.003740 min_lr: 0.003740 loss: 2.7891 (2.8225) class_acc: 0.5664 (0.5632) weight_decay: 0.0500 (0.0500) grad_norm: 0.8847 (1.2826)
Test: [ 0/50] eta: 0:11:11 loss: 1.9199 (1.9199) acc1: 63.2000 (63.2000) acc5: 80.8000 (80.8000) time: 13.4283 data: 13.3965 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.5071 (1.5635) acc1: 64.8000 (64.9455) acc5: 87.2000 (86.2546) time: 2.2352 data: 2.2057 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7476 (1.7453) acc1: 58.4000 (60.6857) acc5: 84.0000 (83.8857) time: 1.0746 data: 1.0457 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9148 (1.7845) acc1: 57.6000 (60.1032) acc5: 81.6000 (83.1484) time: 0.9847 data: 0.9557 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9148 (1.8139) acc1: 57.6000 (59.2390) acc5: 80.8000 (82.7902) time: 0.6628 data: 0.6336 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8248 (1.8182) acc1: 57.6000 (59.1680) acc5: 82.4000 (82.8000) time: 0.5851 data: 0.5556 max mem: 6925
Test: Total time: 0:00:50 (1.0083 s / it)
* Acc@1 60.110 Acc@5 83.130 loss 1.772
Accuracy of the model on the 50000 test images: 60.1%
Max accuracy: 62.42%
Epoch: [66] [ 0/625] eta: 4:05:49 lr: 0.003740 min_lr: 0.003740 loss: 2.7723 (2.7723) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 23.5984 data: 17.6572 max mem: 6925
Epoch: [66] [200/625] eta: 0:13:45 lr: 0.003736 min_lr: 0.003736 loss: 2.7990 (2.8045) class_acc: 0.5508 (0.5666) weight_decay: 0.0500 (0.0500) grad_norm: 1.0300 (1.2765) time: 1.8316 data: 0.0011 max mem: 6925
Epoch: [66] [400/625] eta: 0:07:10 lr: 0.003732 min_lr: 0.003732 loss: 2.8058 (2.8147) class_acc: 0.5625 (0.5643) weight_decay: 0.0500 (0.0500) grad_norm: 1.5639 (1.3270) time: 1.9836 data: 0.0009 max mem: 6925
Epoch: [66] [600/625] eta: 0:00:47 lr: 0.003729 min_lr: 0.003729 loss: 2.8371 (2.8207) class_acc: 0.5664 (0.5634) weight_decay: 0.0500 (0.0500) grad_norm: 1.1885 (1.3052) time: 1.9560 data: 0.0008 max mem: 6925
Epoch: [66] [624/625] eta: 0:00:01 lr: 0.003728 min_lr: 0.003728 loss: 2.8411 (2.8224) class_acc: 0.5469 (0.5629) weight_decay: 0.0500 (0.0500) grad_norm: 1.3782 (1.3188) time: 0.8055 data: 0.0013 max mem: 6925
Epoch: [66] Total time: 0:19:43 (1.8934 s / it)
Averaged stats: lr: 0.003728 min_lr: 0.003728 loss: 2.8411 (2.8158) class_acc: 0.5469 (0.5643) weight_decay: 0.0500 (0.0500) grad_norm: 1.3782 (1.3188)
Test: [ 0/50] eta: 0:10:27 loss: 1.5839 (1.5839) acc1: 66.4000 (66.4000) acc5: 88.0000 (88.0000) time: 12.5498 data: 12.5150 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.5797 (1.5663) acc1: 66.4000 (64.6545) acc5: 85.6000 (85.8182) time: 2.0350 data: 2.0055 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.8441 (1.7533) acc1: 58.4000 (60.6857) acc5: 83.2000 (83.9238) time: 1.0256 data: 0.9966 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8686 (1.7639) acc1: 56.8000 (60.3613) acc5: 82.4000 (83.5355) time: 0.9435 data: 0.9132 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7810 (1.7818) acc1: 57.6000 (60.0195) acc5: 82.4000 (82.9073) time: 0.6687 data: 0.6372 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8428 (1.7897) acc1: 58.4000 (59.9680) acc5: 80.8000 (82.5600) time: 0.6335 data: 0.6024 max mem: 6925
Test: Total time: 0:00:49 (0.9956 s / it)
* Acc@1 60.982 Acc@5 83.310 loss 1.730
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 62.42%
Epoch: [67] [ 0/625] eta: 3:44:57 lr: 0.003728 min_lr: 0.003728 loss: 2.9264 (2.9264) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 21.5957 data: 20.3900 max mem: 6925
Epoch: [67] [200/625] eta: 0:14:08 lr: 0.003725 min_lr: 0.003725 loss: 2.7556 (2.7964) class_acc: 0.5664 (0.5678) weight_decay: 0.0500 (0.0500) grad_norm: 1.1439 (1.1720) time: 2.0351 data: 0.0009 max mem: 6925
Epoch: [67] [400/625] eta: 0:07:23 lr: 0.003721 min_lr: 0.003721 loss: 2.7740 (2.8023) class_acc: 0.5781 (0.5660) weight_decay: 0.0500 (0.0500) grad_norm: 0.8820 (inf) time: 1.8914 data: 0.0008 max mem: 6925
Epoch: [67] [600/625] eta: 0:00:49 lr: 0.003717 min_lr: 0.003717 loss: 2.8870 (2.8131) class_acc: 0.5430 (0.5633) weight_decay: 0.0500 (0.0500) grad_norm: 0.8868 (inf) time: 2.0569 data: 0.0009 max mem: 6925
Epoch: [67] [624/625] eta: 0:00:01 lr: 0.003717 min_lr: 0.003717 loss: 2.8670 (2.8137) class_acc: 0.5547 (0.5634) weight_decay: 0.0500 (0.0500) grad_norm: 0.8363 (inf) time: 0.4606 data: 0.0016 max mem: 6925
Epoch: [67] Total time: 0:20:10 (1.9367 s / it)
Averaged stats: lr: 0.003717 min_lr: 0.003717 loss: 2.8670 (2.8123) class_acc: 0.5547 (0.5647) weight_decay: 0.0500 (0.0500) grad_norm: 0.8363 (inf)
Test: [ 0/50] eta: 0:10:52 loss: 1.7229 (1.7229) acc1: 64.8000 (64.8000) acc5: 83.2000 (83.2000) time: 13.0585 data: 13.0213 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.7096 (1.6526) acc1: 64.8000 (62.8364) acc5: 84.0000 (84.2909) time: 2.0510 data: 2.0206 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7576 (1.8185) acc1: 58.4000 (58.5143) acc5: 82.4000 (82.0952) time: 0.9854 data: 0.9554 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.9064 (1.8196) acc1: 56.8000 (58.8903) acc5: 81.6000 (82.2452) time: 0.9184 data: 0.8884 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7887 (1.8348) acc1: 56.8000 (58.9268) acc5: 80.8000 (81.9707) time: 0.7721 data: 0.7423 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8013 (1.8372) acc1: 56.8000 (58.8160) acc5: 80.8000 (82.0960) time: 0.6588 data: 0.6296 max mem: 6925
Test: Total time: 0:00:51 (1.0315 s / it)
* Acc@1 59.802 Acc@5 82.864 loss 1.790
Accuracy of the model on the 50000 test images: 59.8%
Max accuracy: 62.42%
Epoch: [68] [ 0/625] eta: 3:40:38 lr: 0.003717 min_lr: 0.003717 loss: 2.6980 (2.6980) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 21.1815 data: 20.9502 max mem: 6925
Epoch: [68] [200/625] eta: 0:14:18 lr: 0.003713 min_lr: 0.003713 loss: 2.8020 (2.7837) class_acc: 0.5742 (0.5739) weight_decay: 0.0500 (0.0500) grad_norm: 0.9767 (1.1662) time: 2.0109 data: 1.7224 max mem: 6925
Epoch: [68] [400/625] eta: 0:07:17 lr: 0.003710 min_lr: 0.003710 loss: 2.8423 (2.7975) class_acc: 0.5586 (0.5691) weight_decay: 0.0500 (0.0500) grad_norm: 0.8151 (1.2567) time: 1.8780 data: 1.3760 max mem: 6925
Epoch: [68] [600/625] eta: 0:00:47 lr: 0.003706 min_lr: 0.003706 loss: 2.7922 (2.8050) class_acc: 0.5664 (0.5677) weight_decay: 0.0500 (0.0500) grad_norm: 1.3116 (1.2435) time: 1.9230 data: 1.6018 max mem: 6925
Epoch: [68] [624/625] eta: 0:00:01 lr: 0.003705 min_lr: 0.003705 loss: 2.8005 (2.8052) class_acc: 0.5781 (0.5680) weight_decay: 0.0500 (0.0500) grad_norm: 1.0922 (1.2422) time: 0.7525 data: 0.4685 max mem: 6925
Epoch: [68] Total time: 0:19:36 (1.8823 s / it)
Averaged stats: lr: 0.003705 min_lr: 0.003705 loss: 2.8005 (2.8102) class_acc: 0.5781 (0.5657) weight_decay: 0.0500 (0.0500) grad_norm: 1.0922 (1.2422)
Test: [ 0/50] eta: 0:10:01 loss: 1.5313 (1.5313) acc1: 61.6000 (61.6000) acc5: 88.0000 (88.0000) time: 12.0211 data: 11.9505 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.5961 (1.6150) acc1: 64.0000 (63.9273) acc5: 84.8000 (84.5818) time: 2.2219 data: 2.1886 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.8001 (1.7908) acc1: 60.8000 (60.5333) acc5: 81.6000 (82.5143) time: 1.1381 data: 1.1091 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8759 (1.8170) acc1: 58.4000 (60.1548) acc5: 80.8000 (81.9613) time: 0.9114 data: 0.8829 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8210 (1.8322) acc1: 58.4000 (59.6683) acc5: 79.2000 (81.6976) time: 0.5840 data: 0.5552 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7791 (1.8284) acc1: 58.4000 (59.6800) acc5: 81.6000 (81.7920) time: 0.5050 data: 0.4759 max mem: 6925
Test: Total time: 0:00:47 (0.9551 s / it)
* Acc@1 60.416 Acc@5 83.308 loss 1.773
Accuracy of the model on the 50000 test images: 60.4%
Max accuracy: 62.42%
Epoch: [69] [ 0/625] eta: 3:32:42 lr: 0.003705 min_lr: 0.003705 loss: 2.8503 (2.8503) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 20.4206 data: 16.8313 max mem: 6925
Epoch: [69] [200/625] eta: 0:14:05 lr: 0.003702 min_lr: 0.003702 loss: 2.7707 (2.7854) class_acc: 0.5703 (0.5738) weight_decay: 0.0500 (0.0500) grad_norm: 1.2138 (1.1919) time: 1.9014 data: 0.0009 max mem: 6925
Epoch: [69] [400/625] eta: 0:07:19 lr: 0.003698 min_lr: 0.003698 loss: 2.8246 (2.7968) class_acc: 0.5547 (0.5703) weight_decay: 0.0500 (0.0500) grad_norm: 1.2833 (1.1993) time: 1.7934 data: 0.0009 max mem: 6925
Epoch: [69] [600/625] eta: 0:00:48 lr: 0.003694 min_lr: 0.003694 loss: 2.8403 (2.8116) class_acc: 0.5547 (0.5663) weight_decay: 0.0500 (0.0500) grad_norm: 1.0389 (1.2090) time: 1.9393 data: 0.0503 max mem: 6925
Epoch: [69] [624/625] eta: 0:00:01 lr: 0.003694 min_lr: 0.003694 loss: 2.8322 (2.8129) class_acc: 0.5391 (0.5657) weight_decay: 0.0500 (0.0500) grad_norm: 1.0389 (1.2051) time: 0.7437 data: 0.0019 max mem: 6925
Epoch: [69] Total time: 0:19:55 (1.9127 s / it)
Averaged stats: lr: 0.003694 min_lr: 0.003694 loss: 2.8322 (2.8066) class_acc: 0.5391 (0.5661) weight_decay: 0.0500 (0.0500) grad_norm: 1.0389 (1.2051)
Test: [ 0/50] eta: 0:09:45 loss: 1.6014 (1.6014) acc1: 68.8000 (68.8000) acc5: 84.8000 (84.8000) time: 11.7169 data: 11.6794 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.5711 (1.5525) acc1: 65.6000 (66.8364) acc5: 85.6000 (85.3091) time: 1.9799 data: 1.9502 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6610 (1.7441) acc1: 60.0000 (61.3333) acc5: 84.0000 (83.3905) time: 1.0571 data: 1.0274 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8137 (1.7609) acc1: 56.8000 (60.8516) acc5: 82.4000 (83.0194) time: 1.0785 data: 1.0490 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7643 (1.7810) acc1: 59.2000 (60.3512) acc5: 80.8000 (82.5171) time: 0.8209 data: 0.7914 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7643 (1.7817) acc1: 58.4000 (60.4160) acc5: 82.4000 (82.6880) time: 0.7015 data: 0.6699 max mem: 6925
Test: Total time: 0:00:50 (1.0049 s / it)
* Acc@1 60.930 Acc@5 83.356 loss 1.745
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 62.42%
Epoch: [70] [ 0/625] eta: 3:29:37 lr: 0.003694 min_lr: 0.003694 loss: 2.6417 (2.6417) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 20.1241 data: 19.8897 max mem: 6925
Epoch: [70] [200/625] eta: 0:14:05 lr: 0.003690 min_lr: 0.003690 loss: 2.7603 (2.7798) class_acc: 0.5742 (0.5711) weight_decay: 0.0500 (0.0500) grad_norm: 1.0727 (1.2700) time: 1.8756 data: 1.4838 max mem: 6925
Epoch: [70] [400/625] eta: 0:07:15 lr: 0.003686 min_lr: 0.003686 loss: 2.7706 (2.7879) class_acc: 0.5703 (0.5699) weight_decay: 0.0500 (0.0500) grad_norm: 0.8360 (1.2930) time: 1.8782 data: 1.5868 max mem: 6925
Epoch: [70] [600/625] eta: 0:00:47 lr: 0.003682 min_lr: 0.003682 loss: 2.7869 (2.7950) class_acc: 0.5664 (0.5690) weight_decay: 0.0500 (0.0500) grad_norm: 0.7773 (1.2584) time: 1.8222 data: 1.5327 max mem: 6925
Epoch: [70] [624/625] eta: 0:00:01 lr: 0.003682 min_lr: 0.003682 loss: 2.7996 (2.7956) class_acc: 0.5547 (0.5686) weight_decay: 0.0500 (0.0500) grad_norm: 0.9767 (1.2520) time: 0.8320 data: 0.5812 max mem: 6925
Epoch: [70] Total time: 0:19:28 (1.8696 s / it)
Averaged stats: lr: 0.003682 min_lr: 0.003682 loss: 2.7996 (2.7999) class_acc: 0.5547 (0.5677) weight_decay: 0.0500 (0.0500) grad_norm: 0.9767 (1.2520)
Test: [ 0/50] eta: 0:10:02 loss: 1.9118 (1.9118) acc1: 50.4000 (50.4000) acc5: 84.8000 (84.8000) time: 12.0572 data: 12.0240 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.7429 (1.7354) acc1: 63.2000 (62.4727) acc5: 84.0000 (84.0000) time: 1.9197 data: 1.8909 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.8477 (1.8807) acc1: 58.4000 (58.9714) acc5: 83.2000 (82.2476) time: 0.9253 data: 0.8961 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.8862 (1.8658) acc1: 56.0000 (59.0968) acc5: 82.4000 (82.5548) time: 0.8826 data: 0.8532 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8465 (1.8723) acc1: 56.8000 (59.1805) acc5: 82.4000 (82.0878) time: 0.7487 data: 0.7192 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7867 (1.8698) acc1: 58.4000 (59.2640) acc5: 82.4000 (82.0480) time: 0.4814 data: 0.4519 max mem: 6925
Test: Total time: 0:00:48 (0.9707 s / it)
* Acc@1 60.122 Acc@5 82.570 loss 1.822
Accuracy of the model on the 50000 test images: 60.1%
Max accuracy: 62.42%
Epoch: [71] [ 0/625] eta: 3:51:39 lr: 0.003681 min_lr: 0.003681 loss: 2.7330 (2.7330) class_acc: 0.5938 (0.5938) weight_decay: 0.0500 (0.0500) time: 22.2389 data: 16.9629 max mem: 6925
Epoch: [71] [200/625] eta: 0:14:06 lr: 0.003678 min_lr: 0.003678 loss: 2.7770 (2.7754) class_acc: 0.5625 (0.5733) weight_decay: 0.0500 (0.0500) grad_norm: 0.9474 (1.2920) time: 1.8848 data: 0.0009 max mem: 6925
Epoch: [71] [400/625] eta: 0:07:20 lr: 0.003674 min_lr: 0.003674 loss: 2.7798 (2.7873) class_acc: 0.5703 (0.5701) weight_decay: 0.0500 (0.0500) grad_norm: 0.8361 (1.1835) time: 1.9628 data: 0.0007 max mem: 6925
Epoch: [71] [600/625] eta: 0:00:49 lr: 0.003670 min_lr: 0.003670 loss: 2.8178 (2.7939) class_acc: 0.5703 (0.5682) weight_decay: 0.0500 (0.0500) grad_norm: 1.3395 (1.2295) time: 2.0004 data: 0.0140 max mem: 6925
Epoch: [71] [624/625] eta: 0:00:01 lr: 0.003669 min_lr: 0.003669 loss: 2.8351 (2.7948) class_acc: 0.5625 (0.5681) weight_decay: 0.0500 (0.0500) grad_norm: 1.2749 (1.2252) time: 0.8890 data: 0.0013 max mem: 6925
Epoch: [71] Total time: 0:20:02 (1.9240 s / it)
Averaged stats: lr: 0.003669 min_lr: 0.003669 loss: 2.8351 (2.7975) class_acc: 0.5625 (0.5679) weight_decay: 0.0500 (0.0500) grad_norm: 1.2749 (1.2252)
Test: [ 0/50] eta: 0:09:54 loss: 1.6180 (1.6180) acc1: 64.8000 (64.8000) acc5: 88.8000 (88.8000) time: 11.8917 data: 11.8540 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.6257 (1.6710) acc1: 62.4000 (63.9273) acc5: 84.8000 (84.5818) time: 1.9823 data: 1.9527 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.8165 (1.7753) acc1: 59.2000 (60.6476) acc5: 83.2000 (83.3524) time: 1.0396 data: 1.0110 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8306 (1.7999) acc1: 56.0000 (59.6387) acc5: 82.4000 (83.1484) time: 1.0388 data: 1.0096 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7496 (1.8076) acc1: 57.6000 (59.8049) acc5: 82.4000 (82.9659) time: 0.8480 data: 0.8178 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7459 (1.8134) acc1: 59.2000 (60.0320) acc5: 83.2000 (82.7520) time: 0.7029 data: 0.6724 max mem: 6925
Test: Total time: 0:00:53 (1.0677 s / it)
* Acc@1 60.554 Acc@5 83.390 loss 1.774
Accuracy of the model on the 50000 test images: 60.6%
Max accuracy: 62.42%
Epoch: [72] [ 0/625] eta: 5:16:56 lr: 0.003669 min_lr: 0.003669 loss: 2.7380 (2.7380) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) time: 30.4268 data: 30.1916 max mem: 6925
Epoch: [72] [200/625] eta: 0:14:23 lr: 0.003665 min_lr: 0.003665 loss: 2.7943 (2.7713) class_acc: 0.5820 (0.5737) weight_decay: 0.0500 (0.0500) grad_norm: 1.3483 (1.3199) time: 1.9156 data: 0.8005 max mem: 6925
Epoch: [72] [400/625] eta: 0:07:24 lr: 0.003661 min_lr: 0.003661 loss: 2.7864 (2.7828) class_acc: 0.5625 (0.5711) weight_decay: 0.0500 (0.0500) grad_norm: 0.8358 (1.2547) time: 1.9350 data: 0.1278 max mem: 6925
Epoch: [72] [600/625] eta: 0:00:49 lr: 0.003657 min_lr: 0.003657 loss: 2.7647 (2.7920) class_acc: 0.5664 (0.5693) weight_decay: 0.0500 (0.0500) grad_norm: 0.9311 (1.2196) time: 1.9454 data: 0.0008 max mem: 6925
Epoch: [72] [624/625] eta: 0:00:01 lr: 0.003657 min_lr: 0.003657 loss: 2.7960 (2.7928) class_acc: 0.5625 (0.5691) weight_decay: 0.0500 (0.0500) grad_norm: 0.9359 (1.2174) time: 0.7642 data: 0.0024 max mem: 6925
Epoch: [72] Total time: 0:20:01 (1.9231 s / it)
Averaged stats: lr: 0.003657 min_lr: 0.003657 loss: 2.7960 (2.7952) class_acc: 0.5625 (0.5686) weight_decay: 0.0500 (0.0500) grad_norm: 0.9359 (1.2174)
Test: [ 0/50] eta: 0:10:29 loss: 1.8417 (1.8417) acc1: 52.8000 (52.8000) acc5: 84.0000 (84.0000) time: 12.5834 data: 12.5306 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.5001 (1.5837) acc1: 64.8000 (63.8545) acc5: 87.2000 (85.2364) time: 2.0996 data: 2.0680 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6889 (1.7598) acc1: 59.2000 (59.6952) acc5: 83.2000 (83.1238) time: 1.0499 data: 1.0202 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9424 (1.7824) acc1: 57.6000 (59.9226) acc5: 80.8000 (82.5548) time: 0.9403 data: 0.9107 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7721 (1.7971) acc1: 59.2000 (59.7659) acc5: 81.6000 (82.4585) time: 0.5739 data: 0.5451 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8424 (1.8206) acc1: 56.8000 (59.1680) acc5: 82.4000 (82.1920) time: 0.5096 data: 0.4787 max mem: 6925
Test: Total time: 0:00:45 (0.9114 s / it)
* Acc@1 60.142 Acc@5 83.080 loss 1.770
Accuracy of the model on the 50000 test images: 60.1%
Max accuracy: 62.42%
Epoch: [73] [ 0/625] eta: 3:25:23 lr: 0.003657 min_lr: 0.003657 loss: 3.0459 (3.0459) class_acc: 0.5117 (0.5117) weight_decay: 0.0500 (0.0500) time: 19.7168 data: 18.5260 max mem: 6925
Epoch: [73] [200/625] eta: 0:13:33 lr: 0.003653 min_lr: 0.003653 loss: 2.7937 (2.7688) class_acc: 0.5781 (0.5733) weight_decay: 0.0500 (0.0500) grad_norm: 1.1952 (1.2182) time: 2.0130 data: 0.2219 max mem: 6925
Epoch: [73] [400/625] eta: 0:07:02 lr: 0.003649 min_lr: 0.003649 loss: 2.8164 (2.7842) class_acc: 0.5586 (0.5701) weight_decay: 0.0500 (0.0500) grad_norm: 0.8178 (1.2377) time: 1.9014 data: 0.0226 max mem: 6925
Epoch: [73] [600/625] eta: 0:00:46 lr: 0.003645 min_lr: 0.003645 loss: 2.7873 (2.7901) class_acc: 0.5625 (0.5686) weight_decay: 0.0500 (0.0500) grad_norm: 1.0922 (1.2294) time: 1.9160 data: 0.0020 max mem: 6925
Epoch: [73] [624/625] eta: 0:00:01 lr: 0.003644 min_lr: 0.003644 loss: 2.7850 (2.7905) class_acc: 0.5703 (0.5685) weight_decay: 0.0500 (0.0500) grad_norm: 1.1352 (1.2318) time: 0.8751 data: 0.0018 max mem: 6925
Epoch: [73] Total time: 0:19:17 (1.8518 s / it)
Averaged stats: lr: 0.003644 min_lr: 0.003644 loss: 2.7850 (2.7907) class_acc: 0.5703 (0.5693) weight_decay: 0.0500 (0.0500) grad_norm: 1.1352 (1.2318)
Test: [ 0/50] eta: 0:10:40 loss: 1.5220 (1.5220) acc1: 64.8000 (64.8000) acc5: 88.0000 (88.0000) time: 12.8186 data: 12.7842 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.6156 (1.6326) acc1: 64.0000 (63.6364) acc5: 87.2000 (85.6000) time: 2.1155 data: 2.0860 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7846 (1.7985) acc1: 60.8000 (60.0000) acc5: 81.6000 (82.8571) time: 1.0756 data: 1.0468 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8982 (1.8307) acc1: 56.8000 (59.2258) acc5: 80.0000 (82.1161) time: 1.0778 data: 1.0493 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8895 (1.8453) acc1: 56.8000 (58.8293) acc5: 80.8000 (82.1268) time: 0.8327 data: 0.8042 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7994 (1.8601) acc1: 56.0000 (58.7040) acc5: 82.4000 (81.9200) time: 0.7619 data: 0.7320 max mem: 6925
Test: Total time: 0:00:52 (1.0463 s / it)
* Acc@1 60.150 Acc@5 82.752 loss 1.816
Accuracy of the model on the 50000 test images: 60.2%
Max accuracy: 62.42%
Epoch: [74] [ 0/625] eta: 3:23:20 lr: 0.003644 min_lr: 0.003644 loss: 2.8600 (2.8600) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 19.5210 data: 18.3482 max mem: 6925
Epoch: [74] [200/625] eta: 0:14:00 lr: 0.003640 min_lr: 0.003640 loss: 2.8120 (2.7804) class_acc: 0.5625 (0.5712) weight_decay: 0.0500 (0.0500) grad_norm: 1.3218 (1.3150) time: 1.9294 data: 0.2973 max mem: 6925
Epoch: [74] [400/625] eta: 0:07:14 lr: 0.003636 min_lr: 0.003636 loss: 2.7654 (2.7820) class_acc: 0.5703 (0.5718) weight_decay: 0.0500 (0.0500) grad_norm: 1.2094 (1.2773) time: 1.9005 data: 0.0550 max mem: 6925
Epoch: [74] [600/625] eta: 0:00:48 lr: 0.003632 min_lr: 0.003632 loss: 2.8159 (2.7857) class_acc: 0.5625 (0.5711) weight_decay: 0.0500 (0.0500) grad_norm: 1.1145 (1.2628) time: 1.9656 data: 0.0431 max mem: 6925
Epoch: [74] [624/625] eta: 0:00:01 lr: 0.003631 min_lr: 0.003631 loss: 2.7880 (2.7858) class_acc: 0.5625 (0.5710) weight_decay: 0.0500 (0.0500) grad_norm: 0.7885 (1.2489) time: 0.8729 data: 0.0235 max mem: 6925
Epoch: [74] Total time: 0:19:45 (1.8972 s / it)
Averaged stats: lr: 0.003631 min_lr: 0.003631 loss: 2.7880 (2.7871) class_acc: 0.5625 (0.5708) weight_decay: 0.0500 (0.0500) grad_norm: 0.7885 (1.2489)
Test: [ 0/50] eta: 0:11:00 loss: 1.6187 (1.6187) acc1: 63.2000 (63.2000) acc5: 88.0000 (88.0000) time: 13.2054 data: 13.1608 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.5260 (1.5840) acc1: 64.8000 (65.3818) acc5: 88.0000 (85.3091) time: 2.0802 data: 2.0492 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7868 (1.7832) acc1: 60.0000 (59.7333) acc5: 81.6000 (82.8191) time: 0.9913 data: 0.9610 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9393 (1.8148) acc1: 58.4000 (59.2516) acc5: 80.8000 (82.3226) time: 0.9770 data: 0.9469 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7744 (1.8228) acc1: 58.4000 (59.1220) acc5: 83.2000 (82.4976) time: 0.7459 data: 0.7160 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7942 (1.8282) acc1: 56.0000 (58.8160) acc5: 83.2000 (82.2720) time: 0.6275 data: 0.5980 max mem: 6925
Test: Total time: 0:00:51 (1.0235 s / it)
* Acc@1 59.830 Acc@5 82.530 loss 1.797
Accuracy of the model on the 50000 test images: 59.8%
Max accuracy: 62.42%
Epoch: [75] [ 0/625] eta: 3:28:41 lr: 0.003631 min_lr: 0.003631 loss: 2.8858 (2.8858) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 20.0351 data: 19.4155 max mem: 6925
Epoch: [75] [200/625] eta: 0:13:54 lr: 0.003627 min_lr: 0.003627 loss: 2.7955 (2.7638) class_acc: 0.5625 (0.5769) weight_decay: 0.0500 (0.0500) grad_norm: 1.3771 (1.2728) time: 2.0894 data: 1.2512 max mem: 6925
Epoch: [75] [400/625] eta: 0:07:09 lr: 0.003623 min_lr: 0.003623 loss: 2.8256 (2.7699) class_acc: 0.5508 (0.5751) weight_decay: 0.0500 (0.0500) grad_norm: 1.3718 (1.2570) time: 1.8305 data: 0.0008 max mem: 6925
Epoch: [75] [600/625] eta: 0:00:47 lr: 0.003619 min_lr: 0.003619 loss: 2.7415 (2.7807) class_acc: 0.5664 (0.5727) weight_decay: 0.0500 (0.0500) grad_norm: 1.0645 (1.2299) time: 1.8642 data: 0.0010 max mem: 6925
Epoch: [75] [624/625] eta: 0:00:01 lr: 0.003618 min_lr: 0.003618 loss: 2.7832 (2.7795) class_acc: 0.5508 (0.5727) weight_decay: 0.0500 (0.0500) grad_norm: 1.0645 (1.2306) time: 0.8127 data: 0.0016 max mem: 6925
Epoch: [75] Total time: 0:19:42 (1.8922 s / it)
Averaged stats: lr: 0.003618 min_lr: 0.003618 loss: 2.7832 (2.7818) class_acc: 0.5508 (0.5713) weight_decay: 0.0500 (0.0500) grad_norm: 1.0645 (1.2306)
Test: [ 0/50] eta: 0:10:23 loss: 1.6137 (1.6137) acc1: 56.0000 (56.0000) acc5: 92.0000 (92.0000) time: 12.4699 data: 12.4131 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.6137 (1.6021) acc1: 63.2000 (63.2000) acc5: 87.2000 (86.3273) time: 2.1374 data: 2.1058 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.6946 (1.7054) acc1: 61.6000 (61.5238) acc5: 84.8000 (84.8762) time: 1.1785 data: 1.1496 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.7700 (1.7304) acc1: 57.6000 (60.6968) acc5: 83.2000 (84.3097) time: 1.1939 data: 1.1650 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7043 (1.7437) acc1: 58.4000 (60.2927) acc5: 83.2000 (83.9415) time: 0.8101 data: 0.7811 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7572 (1.7568) acc1: 58.4000 (60.1440) acc5: 82.4000 (83.6480) time: 0.7155 data: 0.6868 max mem: 6925
Test: Total time: 0:00:53 (1.0604 s / it)
* Acc@1 61.002 Acc@5 84.006 loss 1.723
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 62.42%
Epoch: [76] [ 0/625] eta: 3:21:41 lr: 0.003618 min_lr: 0.003618 loss: 2.7758 (2.7758) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) time: 19.3621 data: 17.2429 max mem: 6925
Epoch: [76] [200/625] eta: 0:14:18 lr: 0.003614 min_lr: 0.003614 loss: 2.7689 (2.7768) class_acc: 0.5820 (0.5733) weight_decay: 0.0500 (0.0500) grad_norm: 1.3241 (1.2986) time: 1.9510 data: 0.4099 max mem: 6925
Epoch: [76] [400/625] eta: 0:07:18 lr: 0.003610 min_lr: 0.003610 loss: 2.7868 (2.7810) class_acc: 0.5625 (0.5708) weight_decay: 0.0500 (0.0500) grad_norm: 0.8451 (1.2695) time: 1.9334 data: 1.6057 max mem: 6925
Epoch: [76] [600/625] eta: 0:00:48 lr: 0.003605 min_lr: 0.003605 loss: 2.7482 (2.7865) class_acc: 0.5625 (0.5703) weight_decay: 0.0500 (0.0500) grad_norm: 1.0312 (1.2730) time: 1.8792 data: 1.5791 max mem: 6925
Epoch: [76] [624/625] eta: 0:00:01 lr: 0.003605 min_lr: 0.003605 loss: 2.7558 (2.7867) class_acc: 0.5625 (0.5704) weight_decay: 0.0500 (0.0500) grad_norm: 0.9890 (1.2647) time: 0.6410 data: 0.3817 max mem: 6925
Epoch: [76] Total time: 0:19:57 (1.9158 s / it)
Averaged stats: lr: 0.003605 min_lr: 0.003605 loss: 2.7558 (2.7802) class_acc: 0.5625 (0.5721) weight_decay: 0.0500 (0.0500) grad_norm: 0.9890 (1.2647)
Test: [ 0/50] eta: 0:10:09 loss: 1.4201 (1.4201) acc1: 72.0000 (72.0000) acc5: 89.6000 (89.6000) time: 12.1895 data: 12.1409 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.5789 (1.5898) acc1: 64.8000 (64.1455) acc5: 87.2000 (86.1818) time: 1.9133 data: 1.8775 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.8255 (1.8197) acc1: 58.4000 (59.7714) acc5: 84.0000 (83.5429) time: 0.9247 data: 0.8932 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.9900 (1.8326) acc1: 55.2000 (59.3548) acc5: 81.6000 (83.2258) time: 0.8528 data: 0.8240 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.8422 (1.8556) acc1: 57.6000 (58.8878) acc5: 83.2000 (82.7707) time: 0.5544 data: 0.5253 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8787 (1.8699) acc1: 57.6000 (58.3840) acc5: 82.4000 (82.4320) time: 0.3866 data: 0.3576 max mem: 6925
Test: Total time: 0:00:44 (0.8936 s / it)
* Acc@1 59.604 Acc@5 82.572 loss 1.830
Accuracy of the model on the 50000 test images: 59.6%
Max accuracy: 62.42%
Epoch: [77] [ 0/625] eta: 3:23:18 lr: 0.003605 min_lr: 0.003605 loss: 2.5269 (2.5269) class_acc: 0.6133 (0.6133) weight_decay: 0.0500 (0.0500) time: 19.5171 data: 18.3459 max mem: 6925
Epoch: [77] [200/625] eta: 0:14:24 lr: 0.003601 min_lr: 0.003601 loss: 2.7818 (2.7515) class_acc: 0.5664 (0.5791) weight_decay: 0.0500 (0.0500) grad_norm: 1.1478 (1.3299) time: 2.2267 data: 0.1886 max mem: 6925
Epoch: [77] [400/625] eta: 0:07:32 lr: 0.003596 min_lr: 0.003596 loss: 2.7299 (2.7603) class_acc: 0.5820 (0.5766) weight_decay: 0.0500 (0.0500) grad_norm: 0.9365 (1.2395) time: 2.0837 data: 0.0013 max mem: 6925
Epoch: [77] [600/625] eta: 0:00:50 lr: 0.003592 min_lr: 0.003592 loss: 2.8063 (2.7731) class_acc: 0.5742 (0.5736) weight_decay: 0.0500 (0.0500) grad_norm: 1.0036 (1.2568) time: 2.0581 data: 0.0007 max mem: 6925
Epoch: [77] [624/625] eta: 0:00:01 lr: 0.003591 min_lr: 0.003591 loss: 2.8232 (2.7754) class_acc: 0.5625 (0.5731) weight_decay: 0.0500 (0.0500) grad_norm: 0.9434 (1.2573) time: 0.7646 data: 0.0019 max mem: 6925
Epoch: [77] Total time: 0:20:29 (1.9667 s / it)
Averaged stats: lr: 0.003591 min_lr: 0.003591 loss: 2.8232 (2.7792) class_acc: 0.5625 (0.5724) weight_decay: 0.0500 (0.0500) grad_norm: 0.9434 (1.2573)
Test: [ 0/50] eta: 0:10:46 loss: 2.0230 (2.0230) acc1: 53.6000 (53.6000) acc5: 81.6000 (81.6000) time: 12.9245 data: 12.8938 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.7551 (1.7702) acc1: 59.2000 (60.0000) acc5: 84.0000 (83.2727) time: 2.1943 data: 2.1654 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.9202 (1.9463) acc1: 56.8000 (56.3048) acc5: 82.4000 (81.4095) time: 1.1743 data: 1.1458 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 2.0194 (1.9296) acc1: 55.2000 (57.5226) acc5: 79.2000 (80.9290) time: 1.2376 data: 1.2092 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.9662 (1.9546) acc1: 55.2000 (56.7220) acc5: 79.2000 (80.7610) time: 1.0560 data: 1.0272 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9469 (1.9522) acc1: 57.6000 (57.1040) acc5: 81.6000 (80.8320) time: 1.0096 data: 0.9783 max mem: 6925
Test: Total time: 0:00:58 (1.1686 s / it)
* Acc@1 57.732 Acc@5 81.156 loss 1.919
Accuracy of the model on the 50000 test images: 57.7%
Max accuracy: 62.42%
Epoch: [78] [ 0/625] eta: 3:51:47 lr: 0.003591 min_lr: 0.003591 loss: 2.6263 (2.6263) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 22.2514 data: 18.0379 max mem: 6925
Epoch: [78] [200/625] eta: 0:13:49 lr: 0.003587 min_lr: 0.003587 loss: 2.8066 (2.7558) class_acc: 0.5508 (0.5776) weight_decay: 0.0500 (0.0500) grad_norm: 1.2314 (1.2897) time: 1.8817 data: 0.2714 max mem: 6925
Epoch: [78] [400/625] eta: 0:07:13 lr: 0.003583 min_lr: 0.003583 loss: 2.7629 (2.7742) class_acc: 0.5625 (0.5728) weight_decay: 0.0500 (0.0500) grad_norm: 1.0912 (1.1676) time: 1.9780 data: 0.0009 max mem: 6925
Epoch: [78] [600/625] eta: 0:00:47 lr: 0.003578 min_lr: 0.003578 loss: 2.7646 (2.7823) class_acc: 0.5664 (0.5711) weight_decay: 0.0500 (0.0500) grad_norm: 0.8545 (1.1864) time: 1.9179 data: 0.0010 max mem: 6925
Epoch: [78] [624/625] eta: 0:00:01 lr: 0.003578 min_lr: 0.003578 loss: 2.8328 (2.7837) class_acc: 0.5547 (0.5708) weight_decay: 0.0500 (0.0500) grad_norm: 1.5233 (1.2116) time: 0.9061 data: 0.0015 max mem: 6925
Epoch: [78] Total time: 0:19:38 (1.8851 s / it)
Averaged stats: lr: 0.003578 min_lr: 0.003578 loss: 2.8328 (2.7756) class_acc: 0.5547 (0.5730) weight_decay: 0.0500 (0.0500) grad_norm: 1.5233 (1.2116)
Test: [ 0/50] eta: 0:09:53 loss: 1.6671 (1.6671) acc1: 60.8000 (60.8000) acc5: 86.4000 (86.4000) time: 11.8724 data: 11.8274 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.5441 (1.5977) acc1: 65.6000 (64.4364) acc5: 87.2000 (85.9636) time: 1.8681 data: 1.8361 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.7298 (1.7473) acc1: 60.0000 (60.9905) acc5: 84.0000 (83.5429) time: 0.9136 data: 0.8836 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.8764 (1.7920) acc1: 56.8000 (59.5871) acc5: 81.6000 (82.9161) time: 0.8867 data: 0.8575 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7941 (1.8011) acc1: 56.8000 (59.3756) acc5: 81.6000 (82.6927) time: 0.6335 data: 0.6048 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7863 (1.8100) acc1: 58.4000 (59.3760) acc5: 81.6000 (82.4320) time: 0.5585 data: 0.5291 max mem: 6925
Test: Total time: 0:00:44 (0.8842 s / it)
* Acc@1 60.396 Acc@5 82.920 loss 1.769
Accuracy of the model on the 50000 test images: 60.4%
Max accuracy: 62.42%
Epoch: [79] [ 0/625] eta: 3:38:06 lr: 0.003578 min_lr: 0.003578 loss: 2.9066 (2.9066) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 20.9387 data: 16.1734 max mem: 6925
Epoch: [79] [200/625] eta: 0:14:19 lr: 0.003573 min_lr: 0.003573 loss: 2.7538 (2.7508) class_acc: 0.5781 (0.5788) weight_decay: 0.0500 (0.0500) grad_norm: 0.8720 (1.1263) time: 1.9392 data: 0.0009 max mem: 6925
Epoch: [79] [400/625] eta: 0:07:22 lr: 0.003569 min_lr: 0.003569 loss: 2.7627 (2.7594) class_acc: 0.5742 (0.5776) weight_decay: 0.0500 (0.0500) grad_norm: 1.5276 (1.2033) time: 1.9202 data: 0.0009 max mem: 6925
Epoch: [79] [600/625] eta: 0:00:49 lr: 0.003564 min_lr: 0.003564 loss: 2.7539 (2.7673) class_acc: 0.5742 (0.5752) weight_decay: 0.0500 (0.0500) grad_norm: 0.8065 (1.1776) time: 1.9125 data: 0.0008 max mem: 6925
Epoch: [79] [624/625] eta: 0:00:01 lr: 0.003564 min_lr: 0.003564 loss: 2.7335 (2.7680) class_acc: 0.5664 (0.5750) weight_decay: 0.0500 (0.0500) grad_norm: 0.8435 (1.1727) time: 0.7949 data: 0.0017 max mem: 6925
Epoch: [79] Total time: 0:20:03 (1.9250 s / it)
Averaged stats: lr: 0.003564 min_lr: 0.003564 loss: 2.7335 (2.7736) class_acc: 0.5664 (0.5738) weight_decay: 0.0500 (0.0500) grad_norm: 0.8435 (1.1727)
Test: [ 0/50] eta: 0:09:58 loss: 1.7549 (1.7549) acc1: 61.6000 (61.6000) acc5: 85.6000 (85.6000) time: 11.9681 data: 11.9365 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.5551 (1.5561) acc1: 66.4000 (65.8909) acc5: 86.4000 (86.6909) time: 2.0169 data: 1.9871 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6543 (1.6744) acc1: 61.6000 (63.0095) acc5: 84.8000 (84.6095) time: 1.0401 data: 1.0110 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.7541 (1.6946) acc1: 59.2000 (62.4000) acc5: 81.6000 (84.4387) time: 1.0556 data: 1.0270 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7123 (1.7024) acc1: 59.2000 (62.3415) acc5: 83.2000 (84.1756) time: 0.8740 data: 0.8441 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7113 (1.7117) acc1: 59.2000 (62.0160) acc5: 84.8000 (83.8240) time: 0.7563 data: 0.7259 max mem: 6925
Test: Total time: 0:00:53 (1.0645 s / it)
* Acc@1 62.118 Acc@5 84.332 loss 1.693
Accuracy of the model on the 50000 test images: 62.1%
Max accuracy: 62.42%
Epoch: [80] [ 0/625] eta: 3:31:42 lr: 0.003564 min_lr: 0.003564 loss: 2.6311 (2.6311) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.3239 data: 15.6085 max mem: 6925
Epoch: [80] [200/625] eta: 0:13:54 lr: 0.003559 min_lr: 0.003559 loss: 2.7457 (2.7517) class_acc: 0.5781 (0.5784) weight_decay: 0.0500 (0.0500) grad_norm: 0.9237 (1.1834) time: 1.8965 data: 0.0008 max mem: 6925
Epoch: [80] [400/625] eta: 0:07:17 lr: 0.003555 min_lr: 0.003555 loss: 2.7352 (2.7579) class_acc: 0.5703 (0.5763) weight_decay: 0.0500 (0.0500) grad_norm: 0.8972 (inf) time: 1.9711 data: 0.0008 max mem: 6925
Epoch: [80] [600/625] eta: 0:00:48 lr: 0.003550 min_lr: 0.003550 loss: 2.7822 (2.7644) class_acc: 0.5586 (0.5750) weight_decay: 0.0500 (0.0500) grad_norm: 0.9111 (inf) time: 2.0054 data: 0.0007 max mem: 6925
Epoch: [80] [624/625] eta: 0:00:01 lr: 0.003550 min_lr: 0.003550 loss: 2.7375 (2.7637) class_acc: 0.5898 (0.5752) weight_decay: 0.0500 (0.0500) grad_norm: 0.8987 (inf) time: 0.5976 data: 0.0013 max mem: 6925
Epoch: [80] Total time: 0:20:07 (1.9322 s / it)
Averaged stats: lr: 0.003550 min_lr: 0.003550 loss: 2.7375 (2.7683) class_acc: 0.5898 (0.5747) weight_decay: 0.0500 (0.0500) grad_norm: 0.8987 (inf)
Test: [ 0/50] eta: 0:09:48 loss: 1.4125 (1.4125) acc1: 66.4000 (66.4000) acc5: 86.4000 (86.4000) time: 11.7637 data: 11.7328 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 1.4586 (1.5480) acc1: 65.6000 (65.6727) acc5: 86.4000 (85.8909) time: 1.9315 data: 1.9000 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.6875 (1.7137) acc1: 62.4000 (61.9429) acc5: 83.2000 (83.5810) time: 0.9650 data: 0.9347 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8164 (1.7392) acc1: 59.2000 (61.4968) acc5: 82.4000 (83.0968) time: 0.9766 data: 0.9467 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8088 (1.7655) acc1: 59.2000 (60.8585) acc5: 82.4000 (82.6537) time: 0.8116 data: 0.7813 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8099 (1.7865) acc1: 57.6000 (60.4000) acc5: 81.6000 (82.5440) time: 0.5947 data: 0.5640 max mem: 6925
Test: Total time: 0:00:50 (1.0043 s / it)
* Acc@1 61.000 Acc@5 83.236 loss 1.751
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 62.42%
Epoch: [81] [ 0/625] eta: 3:29:24 lr: 0.003550 min_lr: 0.003550 loss: 2.8942 (2.8942) class_acc: 0.5195 (0.5195) weight_decay: 0.0500 (0.0500) time: 20.1034 data: 18.6187 max mem: 6925
Epoch: [81] [200/625] eta: 0:14:01 lr: 0.003545 min_lr: 0.003545 loss: 2.6952 (2.7516) class_acc: 0.5781 (0.5771) weight_decay: 0.0500 (0.0500) grad_norm: 0.9437 (1.2133) time: 2.1063 data: 0.0009 max mem: 6925
Epoch: [81] [400/625] eta: 0:07:24 lr: 0.003541 min_lr: 0.003541 loss: 2.7239 (2.7587) class_acc: 0.5625 (0.5760) weight_decay: 0.0500 (0.0500) grad_norm: 0.8236 (1.1987) time: 2.0033 data: 0.2165 max mem: 6925
Epoch: [81] [600/625] eta: 0:00:50 lr: 0.003536 min_lr: 0.003536 loss: 2.8718 (2.7698) class_acc: 0.5547 (0.5733) weight_decay: 0.0500 (0.0500) grad_norm: 0.9044 (1.1556) time: 2.2220 data: 0.0610 max mem: 6925
Epoch: [81] [624/625] eta: 0:00:01 lr: 0.003535 min_lr: 0.003535 loss: 2.7616 (2.7703) class_acc: 0.5703 (0.5732) weight_decay: 0.0500 (0.0500) grad_norm: 0.9433 (1.1595) time: 0.3993 data: 0.0018 max mem: 6925
Epoch: [81] Total time: 0:20:33 (1.9729 s / it)
Averaged stats: lr: 0.003535 min_lr: 0.003535 loss: 2.7616 (2.7659) class_acc: 0.5703 (0.5745) weight_decay: 0.0500 (0.0500) grad_norm: 0.9433 (1.1595)
Test: [ 0/50] eta: 0:10:14 loss: 1.4406 (1.4406) acc1: 63.2000 (63.2000) acc5: 91.2000 (91.2000) time: 12.2847 data: 12.2531 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.4450 (1.5080) acc1: 69.6000 (66.1091) acc5: 88.0000 (87.0546) time: 2.0592 data: 2.0291 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7394 (1.7247) acc1: 57.6000 (61.4857) acc5: 83.2000 (84.4571) time: 1.0615 data: 1.0310 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8851 (1.7605) acc1: 57.6000 (60.6710) acc5: 82.4000 (83.9484) time: 1.0540 data: 1.0235 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8463 (1.7603) acc1: 56.8000 (60.4488) acc5: 83.2000 (83.9805) time: 0.9001 data: 0.8701 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8263 (1.7724) acc1: 56.8000 (60.4480) acc5: 83.2000 (83.6640) time: 0.8683 data: 0.8386 max mem: 6925
Test: Total time: 0:00:52 (1.0505 s / it)
* Acc@1 61.186 Acc@5 83.834 loss 1.739
Accuracy of the model on the 50000 test images: 61.2%
Max accuracy: 62.42%
Epoch: [82] [ 0/625] eta: 3:46:49 lr: 0.003535 min_lr: 0.003535 loss: 2.6362 (2.6362) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 21.7744 data: 16.6444 max mem: 6925
Epoch: [82] [200/625] eta: 0:14:35 lr: 0.003531 min_lr: 0.003531 loss: 2.7676 (2.7265) class_acc: 0.5742 (0.5852) weight_decay: 0.0500 (0.0500) grad_norm: 1.0580 (1.0933) time: 2.2207 data: 0.0007 max mem: 6925
Epoch: [82] [400/625] eta: 0:07:48 lr: 0.003526 min_lr: 0.003526 loss: 2.7562 (2.7386) class_acc: 0.5742 (0.5819) weight_decay: 0.0500 (0.0500) grad_norm: 1.0169 (1.1504) time: 2.0701 data: 0.0007 max mem: 6925
Epoch: [82] [600/625] eta: 0:00:51 lr: 0.003521 min_lr: 0.003521 loss: 2.8037 (2.7534) class_acc: 0.5625 (0.5777) weight_decay: 0.0500 (0.0500) grad_norm: 1.0414 (1.1647) time: 2.1642 data: 0.0009 max mem: 6925
Epoch: [82] [624/625] eta: 0:00:02 lr: 0.003521 min_lr: 0.003521 loss: 2.7299 (2.7534) class_acc: 0.5625 (0.5775) weight_decay: 0.0500 (0.0500) grad_norm: 1.1282 (1.1872) time: 0.7428 data: 0.0017 max mem: 6925
Epoch: [82] Total time: 0:21:07 (2.0282 s / it)
Averaged stats: lr: 0.003521 min_lr: 0.003521 loss: 2.7299 (2.7595) class_acc: 0.5625 (0.5764) weight_decay: 0.0500 (0.0500) grad_norm: 1.1282 (1.1872)
Test: [ 0/50] eta: 0:10:07 loss: 1.5686 (1.5686) acc1: 69.6000 (69.6000) acc5: 84.0000 (84.0000) time: 12.1553 data: 12.1133 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.6297 (1.5953) acc1: 64.0000 (64.2182) acc5: 85.6000 (85.8909) time: 2.0536 data: 2.0234 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7026 (1.7841) acc1: 57.6000 (59.9619) acc5: 83.2000 (83.6191) time: 1.0639 data: 1.0345 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9194 (1.7933) acc1: 57.6000 (60.0000) acc5: 82.4000 (83.2258) time: 1.0251 data: 0.9958 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8934 (1.8047) acc1: 58.4000 (59.8634) acc5: 82.4000 (83.2000) time: 0.8582 data: 0.8292 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7422 (1.8037) acc1: 59.2000 (60.0800) acc5: 83.2000 (83.2480) time: 0.8397 data: 0.8103 max mem: 6925
Test: Total time: 0:00:53 (1.0711 s / it)
* Acc@1 60.858 Acc@5 83.406 loss 1.779
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 62.42%
Epoch: [83] [ 0/625] eta: 4:24:28 lr: 0.003521 min_lr: 0.003521 loss: 2.7559 (2.7559) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 25.3890 data: 17.0336 max mem: 6925
Epoch: [83] [200/625] eta: 0:14:53 lr: 0.003516 min_lr: 0.003516 loss: 2.7965 (2.7433) class_acc: 0.5664 (0.5794) weight_decay: 0.0500 (0.0500) grad_norm: 0.9347 (1.1867) time: 1.8854 data: 0.0010 max mem: 6925
Epoch: [83] [400/625] eta: 0:07:33 lr: 0.003512 min_lr: 0.003512 loss: 2.7937 (2.7537) class_acc: 0.5586 (0.5772) weight_decay: 0.0500 (0.0500) grad_norm: 1.4524 (1.2421) time: 2.0031 data: 0.0007 max mem: 6925
Epoch: [83] [600/625] eta: 0:00:51 lr: 0.003507 min_lr: 0.003507 loss: 2.7616 (2.7589) class_acc: 0.5703 (0.5760) weight_decay: 0.0500 (0.0500) grad_norm: 1.1964 (1.2151) time: 2.0096 data: 0.0010 max mem: 6925
Epoch: [83] [624/625] eta: 0:00:02 lr: 0.003506 min_lr: 0.003506 loss: 2.7538 (2.7594) class_acc: 0.5781 (0.5760) weight_decay: 0.0500 (0.0500) grad_norm: 1.2316 (1.2215) time: 0.8396 data: 0.0015 max mem: 6925
Epoch: [83] Total time: 0:20:57 (2.0113 s / it)
Averaged stats: lr: 0.003506 min_lr: 0.003506 loss: 2.7538 (2.7582) class_acc: 0.5781 (0.5769) weight_decay: 0.0500 (0.0500) grad_norm: 1.2316 (1.2215)
Test: [ 0/50] eta: 0:10:41 loss: 1.5590 (1.5590) acc1: 64.0000 (64.0000) acc5: 88.0000 (88.0000) time: 12.8346 data: 12.8019 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.4897 (1.5651) acc1: 65.6000 (64.8000) acc5: 87.2000 (86.2545) time: 2.0945 data: 2.0651 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7484 (1.7024) acc1: 61.6000 (61.9429) acc5: 84.0000 (84.6476) time: 1.1010 data: 1.0723 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8520 (1.7220) acc1: 58.4000 (61.6774) acc5: 83.2000 (83.9742) time: 1.0957 data: 1.0673 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8154 (1.7405) acc1: 58.4000 (61.4634) acc5: 82.4000 (83.7073) time: 0.7555 data: 0.7270 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7937 (1.7449) acc1: 60.0000 (61.0560) acc5: 84.0000 (83.7120) time: 0.6879 data: 0.6592 max mem: 6925
Test: Total time: 0:00:50 (1.0186 s / it)
* Acc@1 61.728 Acc@5 84.246 loss 1.714
Accuracy of the model on the 50000 test images: 61.7%
Max accuracy: 62.42%
Epoch: [84] [ 0/625] eta: 3:19:39 lr: 0.003506 min_lr: 0.003506 loss: 2.6236 (2.6236) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 19.1669 data: 17.3255 max mem: 6925
Epoch: [84] [200/625] eta: 0:14:05 lr: 0.003502 min_lr: 0.003502 loss: 2.7657 (2.7448) class_acc: 0.5625 (0.5798) weight_decay: 0.0500 (0.0500) grad_norm: 1.3198 (1.2938) time: 1.8854 data: 0.0011 max mem: 6925
Epoch: [84] [400/625] eta: 0:07:14 lr: 0.003497 min_lr: 0.003497 loss: 2.7625 (2.7491) class_acc: 0.5703 (0.5790) weight_decay: 0.0500 (0.0500) grad_norm: 0.8785 (1.1649) time: 1.9251 data: 0.0010 max mem: 6925
Epoch: [84] [600/625] eta: 0:00:48 lr: 0.003492 min_lr: 0.003492 loss: 2.8058 (2.7602) class_acc: 0.5664 (0.5776) weight_decay: 0.0500 (0.0500) grad_norm: 1.0757 (1.1979) time: 1.7390 data: 0.0011 max mem: 6925
Epoch: [84] [624/625] eta: 0:00:01 lr: 0.003491 min_lr: 0.003491 loss: 2.7209 (2.7591) class_acc: 0.5703 (0.5775) weight_decay: 0.0500 (0.0500) grad_norm: 0.9227 (1.1947) time: 1.1947 data: 0.0015 max mem: 6925
Epoch: [84] Total time: 0:19:50 (1.9052 s / it)
Averaged stats: lr: 0.003491 min_lr: 0.003491 loss: 2.7209 (2.7581) class_acc: 0.5703 (0.5772) weight_decay: 0.0500 (0.0500) grad_norm: 0.9227 (1.1947)
Test: [ 0/50] eta: 0:10:50 loss: 1.5480 (1.5480) acc1: 66.4000 (66.4000) acc5: 88.0000 (88.0000) time: 13.0196 data: 12.9850 max mem: 6925
Test: [10/50] eta: 0:01:32 loss: 1.6332 (1.7037) acc1: 60.8000 (61.4545) acc5: 85.6000 (84.0000) time: 2.3121 data: 2.2821 max mem: 6925
Test: [20/50] eta: 0:00:55 loss: 1.8317 (1.8849) acc1: 57.6000 (57.5238) acc5: 81.6000 (81.9429) time: 1.2943 data: 1.2652 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 2.0429 (1.8984) acc1: 56.8000 (57.5742) acc5: 79.2000 (81.6774) time: 1.1698 data: 1.1410 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8268 (1.9016) acc1: 57.6000 (57.6976) acc5: 79.2000 (81.3268) time: 0.7017 data: 0.6726 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8268 (1.8905) acc1: 60.0000 (58.2720) acc5: 79.2000 (81.4240) time: 0.6815 data: 0.6523 max mem: 6925
Test: Total time: 0:00:54 (1.0801 s / it)
* Acc@1 59.334 Acc@5 82.026 loss 1.839
Accuracy of the model on the 50000 test images: 59.3%
Max accuracy: 62.42%
Epoch: [85] [ 0/625] eta: 3:42:32 lr: 0.003491 min_lr: 0.003491 loss: 2.6566 (2.6566) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 21.3644 data: 15.1057 max mem: 6925
Epoch: [85] [200/625] eta: 0:14:16 lr: 0.003487 min_lr: 0.003487 loss: 2.7291 (2.7226) class_acc: 0.5820 (0.5838) weight_decay: 0.0500 (0.0500) grad_norm: 1.1125 (1.1376) time: 1.9716 data: 1.6649 max mem: 6925
Epoch: [85] [400/625] eta: 0:07:40 lr: 0.003482 min_lr: 0.003482 loss: 2.7966 (2.7419) class_acc: 0.5664 (0.5806) weight_decay: 0.0500 (0.0500) grad_norm: 0.9452 (1.1424) time: 2.0901 data: 0.3576 max mem: 6925
Epoch: [85] [600/625] eta: 0:00:51 lr: 0.003477 min_lr: 0.003477 loss: 2.7790 (2.7545) class_acc: 0.5703 (0.5776) weight_decay: 0.0500 (0.0500) grad_norm: 1.1588 (1.1710) time: 2.0863 data: 0.0255 max mem: 6925
Epoch: [85] [624/625] eta: 0:00:02 lr: 0.003476 min_lr: 0.003476 loss: 2.7486 (2.7553) class_acc: 0.5703 (0.5775) weight_decay: 0.0500 (0.0500) grad_norm: 1.0376 (1.1733) time: 0.7275 data: 0.0021 max mem: 6925
Epoch: [85] Total time: 0:20:57 (2.0116 s / it)
Averaged stats: lr: 0.003476 min_lr: 0.003476 loss: 2.7486 (2.7521) class_acc: 0.5703 (0.5786) weight_decay: 0.0500 (0.0500) grad_norm: 1.0376 (1.1733)
Test: [ 0/50] eta: 0:09:22 loss: 1.4912 (1.4912) acc1: 68.8000 (68.8000) acc5: 86.4000 (86.4000) time: 11.2570 data: 11.2253 max mem: 6925
Test: [10/50] eta: 0:01:06 loss: 1.5827 (1.5958) acc1: 64.8000 (64.5091) acc5: 85.6000 (85.0182) time: 1.6674 data: 1.6377 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.7696 (1.7167) acc1: 60.8000 (61.3333) acc5: 84.0000 (84.0381) time: 0.9658 data: 0.9368 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8168 (1.7264) acc1: 58.4000 (61.2129) acc5: 83.2000 (83.9742) time: 1.2151 data: 1.1866 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7390 (1.7472) acc1: 59.2000 (60.6439) acc5: 83.2000 (83.5707) time: 1.0907 data: 1.0616 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8092 (1.7606) acc1: 58.4000 (60.4640) acc5: 83.2000 (83.3920) time: 0.8087 data: 0.7797 max mem: 6925
Test: Total time: 0:00:58 (1.1795 s / it)
* Acc@1 61.506 Acc@5 83.722 loss 1.723
Accuracy of the model on the 50000 test images: 61.5%
Max accuracy: 62.42%
Epoch: [86] [ 0/625] eta: 3:42:10 lr: 0.003476 min_lr: 0.003476 loss: 2.6292 (2.6292) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 21.3293 data: 17.0890 max mem: 6925
Epoch: [86] [200/625] eta: 0:14:25 lr: 0.003472 min_lr: 0.003472 loss: 2.7450 (2.7455) class_acc: 0.5781 (0.5798) weight_decay: 0.0500 (0.0500) grad_norm: 1.0538 (1.2400) time: 2.0407 data: 0.0536 max mem: 6925
Epoch: [86] [400/625] eta: 0:07:37 lr: 0.003467 min_lr: 0.003467 loss: 2.7626 (2.7511) class_acc: 0.5625 (0.5769) weight_decay: 0.0500 (0.0500) grad_norm: 0.9761 (1.2244) time: 2.0003 data: 0.8798 max mem: 6925
Epoch: [86] [600/625] eta: 0:00:51 lr: 0.003462 min_lr: 0.003462 loss: 2.7126 (2.7557) class_acc: 0.5781 (0.5759) weight_decay: 0.0500 (0.0500) grad_norm: 0.9039 (1.1928) time: 2.1670 data: 0.9437 max mem: 6925
Epoch: [86] [624/625] eta: 0:00:02 lr: 0.003461 min_lr: 0.003461 loss: 2.7438 (2.7561) class_acc: 0.5703 (0.5757) weight_decay: 0.0500 (0.0500) grad_norm: 0.8645 (1.1823) time: 0.9047 data: 0.0762 max mem: 6925
Epoch: [86] Total time: 0:20:58 (2.0141 s / it)
Averaged stats: lr: 0.003461 min_lr: 0.003461 loss: 2.7438 (2.7478) class_acc: 0.5703 (0.5790) weight_decay: 0.0500 (0.0500) grad_norm: 0.8645 (1.1823)
Test: [ 0/50] eta: 0:11:30 loss: 1.6993 (1.6993) acc1: 64.0000 (64.0000) acc5: 86.4000 (86.4000) time: 13.8121 data: 13.7753 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.5904 (1.5369) acc1: 66.4000 (66.2545) acc5: 86.4000 (85.7455) time: 2.1783 data: 2.1482 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.6970 (1.6663) acc1: 60.8000 (63.2000) acc5: 84.0000 (84.3429) time: 1.0882 data: 1.0592 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.7708 (1.6961) acc1: 58.4000 (62.6065) acc5: 83.2000 (84.4645) time: 1.1439 data: 1.1145 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.7570 (1.7291) acc1: 58.4000 (62.2634) acc5: 83.2000 (83.7268) time: 1.0246 data: 0.9947 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7296 (1.7439) acc1: 59.2000 (61.5840) acc5: 81.6000 (83.5520) time: 1.0334 data: 1.0042 max mem: 6925
Test: Total time: 0:00:58 (1.1714 s / it)
* Acc@1 61.712 Acc@5 84.370 loss 1.700
Accuracy of the model on the 50000 test images: 61.7%
Max accuracy: 62.42%
Epoch: [87] [ 0/625] eta: 3:26:53 lr: 0.003461 min_lr: 0.003461 loss: 2.5471 (2.5471) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 19.8613 data: 18.1577 max mem: 6925
Epoch: [87] [200/625] eta: 0:14:03 lr: 0.003456 min_lr: 0.003456 loss: 2.7433 (2.7290) class_acc: 0.5781 (0.5834) weight_decay: 0.0500 (0.0500) grad_norm: 1.1477 (inf) time: 1.9362 data: 0.0009 max mem: 6925
Epoch: [87] [400/625] eta: 0:07:24 lr: 0.003451 min_lr: 0.003451 loss: 2.7406 (2.7314) class_acc: 0.5703 (0.5824) weight_decay: 0.0500 (0.0500) grad_norm: 1.0497 (inf) time: 1.9576 data: 0.0007 max mem: 6925
Epoch: [87] [600/625] eta: 0:00:49 lr: 0.003446 min_lr: 0.003446 loss: 2.6917 (2.7358) class_acc: 0.5820 (0.5814) weight_decay: 0.0500 (0.0500) grad_norm: 0.9474 (inf) time: 2.0710 data: 0.0013 max mem: 6925
Epoch: [87] [624/625] eta: 0:00:01 lr: 0.003446 min_lr: 0.003446 loss: 2.7071 (2.7360) class_acc: 0.5898 (0.5814) weight_decay: 0.0500 (0.0500) grad_norm: 0.9536 (inf) time: 0.8144 data: 0.0017 max mem: 6925
Epoch: [87] Total time: 0:20:21 (1.9547 s / it)
Averaged stats: lr: 0.003446 min_lr: 0.003446 loss: 2.7071 (2.7468) class_acc: 0.5898 (0.5795) weight_decay: 0.0500 (0.0500) grad_norm: 0.9536 (inf)
Test: [ 0/50] eta: 0:09:56 loss: 1.6234 (1.6234) acc1: 64.0000 (64.0000) acc5: 88.0000 (88.0000) time: 11.9215 data: 11.8770 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.5743 (1.5587) acc1: 65.6000 (65.1636) acc5: 86.4000 (86.5455) time: 1.8696 data: 1.8366 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.6853 (1.7295) acc1: 63.2000 (61.6381) acc5: 82.4000 (84.2286) time: 0.8227 data: 0.7925 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.8812 (1.7623) acc1: 57.6000 (60.6194) acc5: 81.6000 (83.5613) time: 0.7853 data: 0.7555 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.8340 (1.7812) acc1: 57.6000 (60.0195) acc5: 82.4000 (83.1415) time: 0.7741 data: 0.7444 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6946 (1.7620) acc1: 60.0000 (60.5280) acc5: 84.8000 (83.4080) time: 0.5164 data: 0.4880 max mem: 6925
Test: Total time: 0:00:46 (0.9349 s / it)
* Acc@1 61.770 Acc@5 84.130 loss 1.713
Accuracy of the model on the 50000 test images: 61.8%
Max accuracy: 62.42%
Epoch: [88] [ 0/625] eta: 3:36:16 lr: 0.003446 min_lr: 0.003446 loss: 2.7763 (2.7763) class_acc: 0.5625 (0.5625) weight_decay: 0.0500 (0.0500) time: 20.7627 data: 20.5256 max mem: 6925
Epoch: [88] [200/625] eta: 0:14:27 lr: 0.003441 min_lr: 0.003441 loss: 2.7253 (2.7368) class_acc: 0.5742 (0.5824) weight_decay: 0.0500 (0.0500) grad_norm: 1.2220 (1.2566) time: 1.9002 data: 0.0009 max mem: 6925
Epoch: [88] [400/625] eta: 0:07:34 lr: 0.003436 min_lr: 0.003436 loss: 2.6993 (2.7363) class_acc: 0.5938 (0.5829) weight_decay: 0.0500 (0.0500) grad_norm: 0.9648 (1.2305) time: 2.1028 data: 0.0009 max mem: 6925
Epoch: [88] [600/625] eta: 0:00:50 lr: 0.003431 min_lr: 0.003431 loss: 2.7234 (2.7437) class_acc: 0.5859 (0.5805) weight_decay: 0.0500 (0.0500) grad_norm: 0.8918 (1.1957) time: 2.1248 data: 0.0010 max mem: 6925
Epoch: [88] [624/625] eta: 0:00:01 lr: 0.003430 min_lr: 0.003430 loss: 2.7398 (2.7435) class_acc: 0.5859 (0.5806) weight_decay: 0.0500 (0.0500) grad_norm: 1.0672 (1.1926) time: 0.7146 data: 0.0015 max mem: 6925
Epoch: [88] Total time: 0:20:26 (1.9631 s / it)
Averaged stats: lr: 0.003430 min_lr: 0.003430 loss: 2.7398 (2.7425) class_acc: 0.5859 (0.5807) weight_decay: 0.0500 (0.0500) grad_norm: 1.0672 (1.1926)
Test: [ 0/50] eta: 0:10:21 loss: 1.5599 (1.5599) acc1: 66.4000 (66.4000) acc5: 89.6000 (89.6000) time: 12.4352 data: 12.3697 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.5599 (1.6355) acc1: 62.4000 (63.2000) acc5: 85.6000 (84.8000) time: 2.1696 data: 2.1371 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.8485 (1.8076) acc1: 59.2000 (59.7333) acc5: 82.4000 (82.8191) time: 1.0920 data: 1.0633 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9454 (1.8467) acc1: 58.4000 (59.3032) acc5: 81.6000 (82.4258) time: 0.9924 data: 0.9641 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9036 (1.8505) acc1: 56.0000 (59.4146) acc5: 82.4000 (82.4585) time: 0.7315 data: 0.7023 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8899 (1.8550) acc1: 56.0000 (59.2640) acc5: 83.2000 (82.3200) time: 0.5596 data: 0.5303 max mem: 6925
Test: Total time: 0:00:52 (1.0481 s / it)
* Acc@1 60.398 Acc@5 82.830 loss 1.811
Accuracy of the model on the 50000 test images: 60.4%
Max accuracy: 62.42%
Epoch: [89] [ 0/625] eta: 3:26:04 lr: 0.003430 min_lr: 0.003430 loss: 2.7859 (2.7859) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) time: 19.7826 data: 17.6456 max mem: 6925
Epoch: [89] [200/625] eta: 0:14:36 lr: 0.003425 min_lr: 0.003425 loss: 2.7188 (2.7237) class_acc: 0.5820 (0.5845) weight_decay: 0.0500 (0.0500) grad_norm: 0.9405 (1.0795) time: 1.9888 data: 0.0009 max mem: 6925
Epoch: [89] [400/625] eta: 0:07:37 lr: 0.003420 min_lr: 0.003420 loss: 2.7607 (2.7334) class_acc: 0.5781 (0.5811) weight_decay: 0.0500 (0.0500) grad_norm: 0.8195 (1.1436) time: 2.0248 data: 0.0719 max mem: 6925
Epoch: [89] [600/625] eta: 0:00:50 lr: 0.003415 min_lr: 0.003415 loss: 2.7826 (2.7461) class_acc: 0.5781 (0.5794) weight_decay: 0.0500 (0.0500) grad_norm: 0.9892 (1.1440) time: 2.1749 data: 0.0229 max mem: 6925
Epoch: [89] [624/625] eta: 0:00:01 lr: 0.003414 min_lr: 0.003414 loss: 2.6995 (2.7463) class_acc: 0.5938 (0.5793) weight_decay: 0.0500 (0.0500) grad_norm: 1.2510 (1.1468) time: 0.7886 data: 0.0033 max mem: 6925
Epoch: [89] Total time: 0:20:42 (1.9876 s / it)
Averaged stats: lr: 0.003414 min_lr: 0.003414 loss: 2.6995 (2.7405) class_acc: 0.5938 (0.5807) weight_decay: 0.0500 (0.0500) grad_norm: 1.2510 (1.1468)
Test: [ 0/50] eta: 0:10:22 loss: 1.6996 (1.6996) acc1: 62.4000 (62.4000) acc5: 84.8000 (84.8000) time: 12.4488 data: 12.4090 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.6998 (1.6731) acc1: 63.2000 (63.2727) acc5: 83.2000 (83.4909) time: 2.0865 data: 2.0563 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7553 (1.8041) acc1: 59.2000 (60.0000) acc5: 83.2000 (82.8191) time: 1.0387 data: 1.0098 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9052 (1.8333) acc1: 56.8000 (59.5871) acc5: 81.6000 (82.0129) time: 1.0054 data: 0.9768 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9025 (1.8555) acc1: 58.4000 (59.5122) acc5: 80.0000 (81.6390) time: 0.7200 data: 0.6879 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8906 (1.8609) acc1: 59.2000 (59.4080) acc5: 82.4000 (81.7120) time: 0.5505 data: 0.5185 max mem: 6925
Test: Total time: 0:00:49 (0.9949 s / it)
* Acc@1 60.170 Acc@5 82.756 loss 1.814
Accuracy of the model on the 50000 test images: 60.2%
Max accuracy: 62.42%
Epoch: [90] [ 0/625] eta: 3:46:39 lr: 0.003414 min_lr: 0.003414 loss: 2.8533 (2.8533) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 21.7599 data: 21.1698 max mem: 6925
Epoch: [90] [200/625] eta: 0:14:48 lr: 0.003409 min_lr: 0.003409 loss: 2.7198 (2.7200) class_acc: 0.5742 (0.5838) weight_decay: 0.0500 (0.0500) grad_norm: 0.9322 (1.1924) time: 1.9605 data: 0.0010 max mem: 6925
Epoch: [90] [400/625] eta: 0:07:37 lr: 0.003404 min_lr: 0.003404 loss: 2.7758 (2.7281) class_acc: 0.5781 (0.5841) weight_decay: 0.0500 (0.0500) grad_norm: 1.1161 (1.1511) time: 1.9409 data: 0.0008 max mem: 6925
Epoch: [90] [600/625] eta: 0:00:50 lr: 0.003399 min_lr: 0.003399 loss: 2.7525 (2.7412) class_acc: 0.5703 (0.5806) weight_decay: 0.0500 (0.0500) grad_norm: 0.8774 (1.1082) time: 1.9068 data: 0.0009 max mem: 6925
Epoch: [90] [624/625] eta: 0:00:01 lr: 0.003398 min_lr: 0.003398 loss: 2.8052 (2.7430) class_acc: 0.5586 (0.5803) weight_decay: 0.0500 (0.0500) grad_norm: 0.8881 (1.1093) time: 0.9142 data: 0.0013 max mem: 6925
Epoch: [90] Total time: 0:20:37 (1.9793 s / it)
Averaged stats: lr: 0.003398 min_lr: 0.003398 loss: 2.8052 (2.7405) class_acc: 0.5586 (0.5806) weight_decay: 0.0500 (0.0500) grad_norm: 0.8881 (1.1093)
Test: [ 0/50] eta: 0:10:09 loss: 1.6376 (1.6376) acc1: 69.6000 (69.6000) acc5: 85.6000 (85.6000) time: 12.1942 data: 12.1610 max mem: 6925
Test: [10/50] eta: 0:01:13 loss: 1.6376 (1.6474) acc1: 63.2000 (65.4545) acc5: 85.6000 (85.2364) time: 1.8351 data: 1.8049 max mem: 6925
Test: [20/50] eta: 0:00:39 loss: 1.7902 (1.8034) acc1: 60.0000 (60.8381) acc5: 81.6000 (83.0095) time: 0.7737 data: 0.7437 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.9702 (1.8329) acc1: 56.8000 (60.1032) acc5: 80.8000 (82.6839) time: 0.8736 data: 0.8442 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.8523 (1.8273) acc1: 57.6000 (59.8244) acc5: 81.6000 (82.6146) time: 0.8682 data: 0.8396 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7744 (1.8303) acc1: 57.6000 (59.7120) acc5: 84.0000 (82.7200) time: 0.5143 data: 0.4838 max mem: 6925
Test: Total time: 0:00:48 (0.9615 s / it)
* Acc@1 60.766 Acc@5 83.412 loss 1.779
Accuracy of the model on the 50000 test images: 60.8%
Max accuracy: 62.42%
Epoch: [91] [ 0/625] eta: 3:37:20 lr: 0.003398 min_lr: 0.003398 loss: 2.7311 (2.7311) class_acc: 0.5859 (0.5859) weight_decay: 0.0500 (0.0500) time: 20.8643 data: 19.5491 max mem: 6925
Epoch: [91] [200/625] eta: 0:14:34 lr: 0.003393 min_lr: 0.003393 loss: 2.7131 (2.7179) class_acc: 0.5938 (0.5862) weight_decay: 0.0500 (0.0500) grad_norm: 0.9042 (1.1494) time: 2.1256 data: 0.0009 max mem: 6925
Epoch: [91] [400/625] eta: 0:07:25 lr: 0.003388 min_lr: 0.003388 loss: 2.7046 (2.7299) class_acc: 0.5898 (0.5821) weight_decay: 0.0500 (0.0500) grad_norm: 0.8432 (1.1076) time: 1.7963 data: 0.0204 max mem: 6925
Epoch: [91] [600/625] eta: 0:00:48 lr: 0.003383 min_lr: 0.003383 loss: 2.7690 (2.7342) class_acc: 0.5703 (0.5813) weight_decay: 0.0500 (0.0500) grad_norm: 1.0657 (1.1236) time: 2.0189 data: 0.0010 max mem: 6925
Epoch: [91] [624/625] eta: 0:00:01 lr: 0.003382 min_lr: 0.003382 loss: 2.7351 (2.7353) class_acc: 0.5742 (0.5812) weight_decay: 0.0500 (0.0500) grad_norm: 0.8804 (1.1165) time: 0.8283 data: 0.0015 max mem: 6925
Epoch: [91] Total time: 0:19:55 (1.9130 s / it)
Averaged stats: lr: 0.003382 min_lr: 0.003382 loss: 2.7351 (2.7356) class_acc: 0.5742 (0.5819) weight_decay: 0.0500 (0.0500) grad_norm: 0.8804 (1.1165)
Test: [ 0/50] eta: 0:11:10 loss: 1.4624 (1.4624) acc1: 69.6000 (69.6000) acc5: 88.0000 (88.0000) time: 13.4114 data: 13.3807 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.6268 (1.6149) acc1: 65.6000 (64.6545) acc5: 84.8000 (84.8727) time: 2.1310 data: 2.0984 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.8135 (1.8516) acc1: 57.6000 (59.1619) acc5: 80.8000 (81.8667) time: 1.0330 data: 1.0015 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 2.0064 (1.8560) acc1: 56.0000 (58.9677) acc5: 78.4000 (81.6258) time: 1.0095 data: 0.9801 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9161 (1.8720) acc1: 56.8000 (59.1220) acc5: 80.8000 (81.5220) time: 0.6928 data: 0.6617 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8078 (1.8579) acc1: 56.8000 (58.7680) acc5: 83.2000 (81.9520) time: 0.6327 data: 0.6008 max mem: 6925
Test: Total time: 0:00:48 (0.9763 s / it)
* Acc@1 59.936 Acc@5 82.688 loss 1.821
Accuracy of the model on the 50000 test images: 59.9%
Max accuracy: 62.42%
Epoch: [92] [ 0/625] eta: 3:40:17 lr: 0.003382 min_lr: 0.003382 loss: 2.6853 (2.6853) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 21.1484 data: 16.6877 max mem: 6925
Epoch: [92] [200/625] eta: 0:14:04 lr: 0.003377 min_lr: 0.003377 loss: 2.6686 (2.7144) class_acc: 0.5859 (0.5864) weight_decay: 0.0500 (0.0500) grad_norm: 0.9537 (1.1709) time: 1.9264 data: 0.0010 max mem: 6925
Epoch: [92] [400/625] eta: 0:07:18 lr: 0.003372 min_lr: 0.003372 loss: 2.7576 (2.7320) class_acc: 0.5820 (0.5828) weight_decay: 0.0500 (0.0500) grad_norm: 0.8196 (1.1334) time: 2.0242 data: 0.0008 max mem: 6925
Epoch: [92] [600/625] eta: 0:00:48 lr: 0.003367 min_lr: 0.003367 loss: 2.7582 (2.7346) class_acc: 0.5664 (0.5818) weight_decay: 0.0500 (0.0500) grad_norm: 1.1293 (1.1511) time: 1.8700 data: 0.0008 max mem: 6925
Epoch: [92] [624/625] eta: 0:00:01 lr: 0.003366 min_lr: 0.003366 loss: 2.7521 (2.7360) class_acc: 0.5664 (0.5813) weight_decay: 0.0500 (0.0500) grad_norm: 0.9651 (1.1439) time: 0.7812 data: 0.0017 max mem: 6925
Epoch: [92] Total time: 0:19:50 (1.9043 s / it)
Averaged stats: lr: 0.003366 min_lr: 0.003366 loss: 2.7521 (2.7329) class_acc: 0.5664 (0.5824) weight_decay: 0.0500 (0.0500) grad_norm: 0.9651 (1.1439)
Test: [ 0/50] eta: 0:09:03 loss: 1.2445 (1.2445) acc1: 71.2000 (71.2000) acc5: 92.0000 (92.0000) time: 10.8623 data: 10.8312 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 1.5512 (1.5731) acc1: 64.8000 (64.5091) acc5: 88.0000 (86.1818) time: 1.8157 data: 1.7847 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.7690 (1.7843) acc1: 59.2000 (60.5714) acc5: 82.4000 (82.9333) time: 0.9616 data: 0.9317 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.9597 (1.8101) acc1: 57.6000 (60.3097) acc5: 80.0000 (82.6839) time: 0.9709 data: 0.9421 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8521 (1.8258) acc1: 59.2000 (60.0585) acc5: 81.6000 (82.5171) time: 0.7795 data: 0.7508 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7969 (1.8287) acc1: 57.6000 (59.8240) acc5: 81.6000 (82.4160) time: 0.7200 data: 0.6913 max mem: 6925
Test: Total time: 0:00:47 (0.9480 s / it)
* Acc@1 60.876 Acc@5 83.168 loss 1.779
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 62.42%
Epoch: [93] [ 0/625] eta: 3:58:49 lr: 0.003366 min_lr: 0.003366 loss: 2.7050 (2.7050) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) time: 22.9273 data: 19.2210 max mem: 6925
Epoch: [93] [200/625] eta: 0:14:25 lr: 0.003361 min_lr: 0.003361 loss: 2.7379 (2.7147) class_acc: 0.5742 (0.5875) weight_decay: 0.0500 (0.0500) grad_norm: 0.8258 (1.0924) time: 1.9978 data: 0.0014 max mem: 6925
Epoch: [93] [400/625] eta: 0:07:22 lr: 0.003355 min_lr: 0.003355 loss: 2.7554 (2.7132) class_acc: 0.5703 (0.5871) weight_decay: 0.0500 (0.0500) grad_norm: 1.7566 (inf) time: 1.9935 data: 0.0143 max mem: 6925
Epoch: [93] [600/625] eta: 0:00:48 lr: 0.003350 min_lr: 0.003350 loss: 2.7091 (2.7252) class_acc: 0.5742 (0.5841) weight_decay: 0.0500 (0.0500) grad_norm: 1.0881 (inf) time: 1.9406 data: 0.0298 max mem: 6925
Epoch: [93] [624/625] eta: 0:00:01 lr: 0.003350 min_lr: 0.003350 loss: 2.7150 (2.7250) class_acc: 0.5938 (0.5841) weight_decay: 0.0500 (0.0500) grad_norm: 1.0881 (inf) time: 0.8111 data: 0.0018 max mem: 6925
Epoch: [93] Total time: 0:19:46 (1.8981 s / it)
Averaged stats: lr: 0.003350 min_lr: 0.003350 loss: 2.7150 (2.7271) class_acc: 0.5938 (0.5839) weight_decay: 0.0500 (0.0500) grad_norm: 1.0881 (inf)
Test: [ 0/50] eta: 0:09:25 loss: 1.5221 (1.5221) acc1: 68.0000 (68.0000) acc5: 90.4000 (90.4000) time: 11.3057 data: 11.2623 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.5386 (1.5769) acc1: 67.2000 (66.6909) acc5: 87.2000 (86.6182) time: 1.9550 data: 1.9241 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6693 (1.7391) acc1: 63.2000 (62.2857) acc5: 85.6000 (84.5714) time: 1.0551 data: 1.0257 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8522 (1.7645) acc1: 59.2000 (61.4968) acc5: 83.2000 (84.1548) time: 1.0330 data: 1.0031 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7491 (1.7717) acc1: 59.2000 (61.3854) acc5: 84.0000 (83.7268) time: 0.7745 data: 0.7449 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7029 (1.7731) acc1: 60.0000 (61.0880) acc5: 84.0000 (83.7280) time: 0.6378 data: 0.6080 max mem: 6925
Test: Total time: 0:00:49 (0.9970 s / it)
* Acc@1 61.704 Acc@5 83.932 loss 1.738
Accuracy of the model on the 50000 test images: 61.7%
Max accuracy: 62.42%
Epoch: [94] [ 0/625] eta: 3:37:46 lr: 0.003350 min_lr: 0.003350 loss: 2.6274 (2.6274) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 20.9067 data: 16.1798 max mem: 6925
Epoch: [94] [200/625] eta: 0:13:22 lr: 0.003344 min_lr: 0.003344 loss: 2.6667 (2.7051) class_acc: 0.5820 (0.5878) weight_decay: 0.0500 (0.0500) grad_norm: 0.8040 (1.0286) time: 1.6970 data: 0.1019 max mem: 6925
Epoch: [94] [400/625] eta: 0:07:07 lr: 0.003339 min_lr: 0.003339 loss: 2.6994 (2.7171) class_acc: 0.5859 (0.5867) weight_decay: 0.0500 (0.0500) grad_norm: 0.9971 (1.0756) time: 1.7933 data: 0.0134 max mem: 6925
Epoch: [94] [600/625] eta: 0:00:47 lr: 0.003334 min_lr: 0.003334 loss: 2.7372 (2.7283) class_acc: 0.5859 (0.5838) weight_decay: 0.0500 (0.0500) grad_norm: 1.0863 (1.1067) time: 2.0356 data: 0.0741 max mem: 6925
Epoch: [94] [624/625] eta: 0:00:01 lr: 0.003333 min_lr: 0.003333 loss: 2.7440 (2.7287) class_acc: 0.5703 (0.5836) weight_decay: 0.0500 (0.0500) grad_norm: 0.8883 (1.0986) time: 1.0177 data: 0.0301 max mem: 6925
Epoch: [94] Total time: 0:19:26 (1.8657 s / it)
Averaged stats: lr: 0.003333 min_lr: 0.003333 loss: 2.7440 (2.7269) class_acc: 0.5703 (0.5836) weight_decay: 0.0500 (0.0500) grad_norm: 0.8883 (1.0986)
Test: [ 0/50] eta: 0:10:01 loss: 1.7409 (1.7409) acc1: 63.2000 (63.2000) acc5: 87.2000 (87.2000) time: 12.0282 data: 11.9790 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.7409 (1.6924) acc1: 63.2000 (62.6182) acc5: 84.0000 (84.3636) time: 2.0901 data: 2.0582 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.8156 (1.8119) acc1: 58.4000 (59.6571) acc5: 82.4000 (82.5905) time: 1.1502 data: 1.1197 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8374 (1.8219) acc1: 56.8000 (59.5097) acc5: 80.8000 (82.5032) time: 1.1146 data: 1.0844 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8129 (1.8222) acc1: 58.4000 (59.9024) acc5: 82.4000 (82.7317) time: 0.7260 data: 0.6953 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7165 (1.8242) acc1: 60.0000 (59.7280) acc5: 83.2000 (82.6560) time: 0.6510 data: 0.6209 max mem: 6925
Test: Total time: 0:00:50 (1.0098 s / it)
* Acc@1 60.876 Acc@5 83.272 loss 1.768
Accuracy of the model on the 50000 test images: 60.9%
Max accuracy: 62.42%
Epoch: [95] [ 0/625] eta: 3:31:45 lr: 0.003333 min_lr: 0.003333 loss: 2.6771 (2.6771) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 20.3291 data: 20.0996 max mem: 6925
Epoch: [95] [200/625] eta: 0:14:00 lr: 0.003327 min_lr: 0.003327 loss: 2.7176 (2.6948) class_acc: 0.5859 (0.5903) weight_decay: 0.0500 (0.0500) grad_norm: 0.9278 (1.0610) time: 1.8744 data: 0.0092 max mem: 6925
Epoch: [95] [400/625] eta: 0:07:13 lr: 0.003322 min_lr: 0.003322 loss: 2.7474 (2.7140) class_acc: 0.5703 (0.5861) weight_decay: 0.0500 (0.0500) grad_norm: 0.9237 (1.1112) time: 1.9906 data: 0.0284 max mem: 6925
Epoch: [95] [600/625] eta: 0:00:48 lr: 0.003317 min_lr: 0.003317 loss: 2.6904 (2.7203) class_acc: 0.5898 (0.5846) weight_decay: 0.0500 (0.0500) grad_norm: 1.1097 (1.1442) time: 1.9984 data: 0.0192 max mem: 6925
Epoch: [95] [624/625] eta: 0:00:01 lr: 0.003316 min_lr: 0.003316 loss: 2.7122 (2.7198) class_acc: 0.5859 (0.5850) weight_decay: 0.0500 (0.0500) grad_norm: 1.0402 (1.1429) time: 0.9456 data: 0.0016 max mem: 6925
Epoch: [95] Total time: 0:19:53 (1.9102 s / it)
Averaged stats: lr: 0.003316 min_lr: 0.003316 loss: 2.7122 (2.7221) class_acc: 0.5859 (0.5845) weight_decay: 0.0500 (0.0500) grad_norm: 1.0402 (1.1429)
Test: [ 0/50] eta: 0:10:05 loss: 1.8298 (1.8298) acc1: 58.4000 (58.4000) acc5: 85.6000 (85.6000) time: 12.1157 data: 12.0793 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.6190 (1.6027) acc1: 64.8000 (64.4364) acc5: 86.4000 (86.4727) time: 2.0082 data: 1.9767 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.7681 (1.7789) acc1: 60.8000 (60.8381) acc5: 84.0000 (83.6952) time: 0.9924 data: 0.9627 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.9254 (1.7842) acc1: 57.6000 (61.0839) acc5: 81.6000 (83.6129) time: 0.9793 data: 0.9507 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7815 (1.8048) acc1: 59.2000 (60.5659) acc5: 81.6000 (83.2585) time: 0.9342 data: 0.9036 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7764 (1.8123) acc1: 58.4000 (60.3840) acc5: 82.4000 (83.0080) time: 0.6599 data: 0.6290 max mem: 6925
Test: Total time: 0:00:53 (1.0752 s / it)
* Acc@1 61.096 Acc@5 83.402 loss 1.769
Accuracy of the model on the 50000 test images: 61.1%
Max accuracy: 62.42%
Epoch: [96] [ 0/625] eta: 4:18:07 lr: 0.003316 min_lr: 0.003316 loss: 2.6469 (2.6469) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 24.7805 data: 17.6722 max mem: 6925
Epoch: [96] [200/625] eta: 0:14:15 lr: 0.003311 min_lr: 0.003311 loss: 2.6894 (2.6993) class_acc: 0.5742 (0.5888) weight_decay: 0.0500 (0.0500) grad_norm: 1.1175 (1.1236) time: 1.8118 data: 1.4830 max mem: 6925
Epoch: [96] [400/625] eta: 0:07:19 lr: 0.003305 min_lr: 0.003305 loss: 2.7132 (2.7180) class_acc: 0.5703 (0.5852) weight_decay: 0.0500 (0.0500) grad_norm: 0.9679 (1.1092) time: 1.8442 data: 0.9748 max mem: 6925
Epoch: [96] [600/625] eta: 0:00:48 lr: 0.003300 min_lr: 0.003300 loss: 2.7495 (2.7266) class_acc: 0.5820 (0.5836) weight_decay: 0.0500 (0.0500) grad_norm: 0.9070 (1.1104) time: 1.8595 data: 0.3953 max mem: 6925
Epoch: [96] [624/625] eta: 0:00:01 lr: 0.003299 min_lr: 0.003299 loss: 2.7044 (2.7256) class_acc: 0.5742 (0.5835) weight_decay: 0.0500 (0.0500) grad_norm: 0.8743 (1.1098) time: 0.9320 data: 0.0514 max mem: 6925
Epoch: [96] Total time: 0:19:46 (1.8983 s / it)
Averaged stats: lr: 0.003299 min_lr: 0.003299 loss: 2.7044 (2.7219) class_acc: 0.5742 (0.5847) weight_decay: 0.0500 (0.0500) grad_norm: 0.8743 (1.1098)
Test: [ 0/50] eta: 0:10:05 loss: 1.7703 (1.7703) acc1: 63.2000 (63.2000) acc5: 80.8000 (80.8000) time: 12.1171 data: 12.0822 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.4918 (1.5320) acc1: 65.6000 (65.5273) acc5: 87.2000 (86.5455) time: 2.0137 data: 1.9840 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7200 (1.6943) acc1: 60.8000 (61.7524) acc5: 84.8000 (84.6476) time: 1.0158 data: 0.9868 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8090 (1.7098) acc1: 58.4000 (61.7290) acc5: 82.4000 (84.4129) time: 0.9353 data: 0.9065 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.6316 (1.7234) acc1: 60.0000 (61.4049) acc5: 82.4000 (84.2537) time: 0.6187 data: 0.5894 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6731 (1.7351) acc1: 58.4000 (60.9600) acc5: 84.0000 (84.0800) time: 0.5301 data: 0.4999 max mem: 6925
Test: Total time: 0:00:45 (0.9125 s / it)
* Acc@1 62.038 Acc@5 84.422 loss 1.688
Accuracy of the model on the 50000 test images: 62.0%
Max accuracy: 62.42%
Epoch: [97] [ 0/625] eta: 3:17:05 lr: 0.003299 min_lr: 0.003299 loss: 2.8126 (2.8126) class_acc: 0.5508 (0.5508) weight_decay: 0.0500 (0.0500) time: 18.9206 data: 16.8176 max mem: 6925
Epoch: [97] [200/625] eta: 0:13:27 lr: 0.003294 min_lr: 0.003294 loss: 2.7399 (2.7026) class_acc: 0.5820 (0.5869) weight_decay: 0.0500 (0.0500) grad_norm: 0.8033 (1.1156) time: 1.6997 data: 0.0009 max mem: 6925
Epoch: [97] [400/625] eta: 0:06:59 lr: 0.003288 min_lr: 0.003288 loss: 2.7106 (2.7142) class_acc: 0.5859 (0.5856) weight_decay: 0.0500 (0.0500) grad_norm: 1.0187 (1.0780) time: 1.8314 data: 0.0007 max mem: 6925
Epoch: [97] [600/625] eta: 0:00:46 lr: 0.003283 min_lr: 0.003283 loss: 2.7497 (2.7248) class_acc: 0.5703 (0.5836) weight_decay: 0.0500 (0.0500) grad_norm: 0.9856 (1.0957) time: 2.0341 data: 0.0008 max mem: 6925
Epoch: [97] [624/625] eta: 0:00:01 lr: 0.003282 min_lr: 0.003282 loss: 2.7270 (2.7251) class_acc: 0.5781 (0.5837) weight_decay: 0.0500 (0.0500) grad_norm: 1.1461 (1.1013) time: 0.7797 data: 0.0015 max mem: 6925
Epoch: [97] Total time: 0:19:10 (1.8403 s / it)
Averaged stats: lr: 0.003282 min_lr: 0.003282 loss: 2.7270 (2.7202) class_acc: 0.5781 (0.5851) weight_decay: 0.0500 (0.0500) grad_norm: 1.1461 (1.1013)
Test: [ 0/50] eta: 0:09:03 loss: 1.6288 (1.6288) acc1: 61.6000 (61.6000) acc5: 88.8000 (88.8000) time: 10.8797 data: 10.8431 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.5834 (1.6318) acc1: 62.4000 (63.0545) acc5: 86.4000 (86.2545) time: 1.8635 data: 1.8333 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.7174 (1.8034) acc1: 60.0000 (59.8857) acc5: 83.2000 (83.6571) time: 1.0034 data: 0.9743 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8989 (1.8219) acc1: 56.0000 (59.2258) acc5: 81.6000 (83.3290) time: 1.0133 data: 0.9842 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8770 (1.8253) acc1: 56.8000 (58.9463) acc5: 81.6000 (82.7902) time: 0.8438 data: 0.8137 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8056 (1.8284) acc1: 57.6000 (58.9920) acc5: 80.8000 (82.5120) time: 0.6289 data: 0.5988 max mem: 6925
Test: Total time: 0:00:51 (1.0222 s / it)
* Acc@1 60.566 Acc@5 83.022 loss 1.772
Accuracy of the model on the 50000 test images: 60.6%
Max accuracy: 62.42%
Epoch: [98] [ 0/625] eta: 3:32:40 lr: 0.003282 min_lr: 0.003282 loss: 2.6191 (2.6191) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 20.4163 data: 15.9268 max mem: 6925
Epoch: [98] [200/625] eta: 0:13:58 lr: 0.003276 min_lr: 0.003276 loss: 2.6935 (2.7035) class_acc: 0.5898 (0.5889) weight_decay: 0.0500 (0.0500) grad_norm: 0.8507 (1.0274) time: 1.8841 data: 0.0008 max mem: 6925
Epoch: [98] [400/625] eta: 0:07:18 lr: 0.003271 min_lr: 0.003271 loss: 2.7624 (2.7079) class_acc: 0.5859 (0.5877) weight_decay: 0.0500 (0.0500) grad_norm: 1.0334 (1.0537) time: 2.0066 data: 0.0007 max mem: 6925
Epoch: [98] [600/625] eta: 0:00:49 lr: 0.003265 min_lr: 0.003265 loss: 2.7358 (2.7155) class_acc: 0.5820 (0.5863) weight_decay: 0.0500 (0.0500) grad_norm: 0.8605 (1.0485) time: 2.0936 data: 0.0007 max mem: 6925
Epoch: [98] [624/625] eta: 0:00:01 lr: 0.003265 min_lr: 0.003265 loss: 2.7191 (2.7157) class_acc: 0.5820 (0.5863) weight_decay: 0.0500 (0.0500) grad_norm: 0.8554 (1.0404) time: 0.7757 data: 0.0017 max mem: 6925
Epoch: [98] Total time: 0:19:58 (1.9169 s / it)
Averaged stats: lr: 0.003265 min_lr: 0.003265 loss: 2.7191 (2.7153) class_acc: 0.5820 (0.5865) weight_decay: 0.0500 (0.0500) grad_norm: 0.8554 (1.0404)
Test: [ 0/50] eta: 0:08:40 loss: 1.7041 (1.7041) acc1: 61.6000 (61.6000) acc5: 84.8000 (84.8000) time: 10.4177 data: 10.3862 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.4490 (1.5239) acc1: 67.2000 (66.9818) acc5: 87.2000 (86.4000) time: 1.9185 data: 1.8887 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6511 (1.6701) acc1: 64.0000 (63.2381) acc5: 85.6000 (84.4952) time: 1.1234 data: 1.0937 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.7839 (1.6669) acc1: 60.0000 (62.8129) acc5: 84.8000 (84.5677) time: 1.1775 data: 1.1464 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6037 (1.6893) acc1: 60.0000 (62.4195) acc5: 84.8000 (84.0976) time: 0.9160 data: 0.8851 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5847 (1.6897) acc1: 61.6000 (62.5280) acc5: 84.0000 (83.9840) time: 0.7201 data: 0.6909 max mem: 6925
Test: Total time: 0:00:52 (1.0562 s / it)
* Acc@1 63.104 Acc@5 84.882 loss 1.643
Accuracy of the model on the 50000 test images: 63.1%
Max accuracy: 63.10%
Epoch: [99] [ 0/625] eta: 3:42:06 lr: 0.003265 min_lr: 0.003265 loss: 2.6127 (2.6127) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 21.3216 data: 19.9127 max mem: 6925
Epoch: [99] [200/625] eta: 0:14:21 lr: 0.003259 min_lr: 0.003259 loss: 2.7000 (2.6961) class_acc: 0.5781 (0.5916) weight_decay: 0.0500 (0.0500) grad_norm: 0.8379 (1.2237) time: 1.9937 data: 0.0013 max mem: 6925
Epoch: [99] [400/625] eta: 0:07:19 lr: 0.003253 min_lr: 0.003253 loss: 2.7532 (2.7055) class_acc: 0.5898 (0.5895) weight_decay: 0.0500 (0.0500) grad_norm: 0.9224 (1.1164) time: 1.7982 data: 0.0010 max mem: 6925
Epoch: [99] [600/625] eta: 0:00:48 lr: 0.003248 min_lr: 0.003248 loss: 2.7284 (2.7138) class_acc: 0.5781 (0.5873) weight_decay: 0.0500 (0.0500) grad_norm: 0.9850 (1.0921) time: 1.7971 data: 0.0008 max mem: 6925
Epoch: [99] [624/625] eta: 0:00:01 lr: 0.003247 min_lr: 0.003247 loss: 2.7388 (2.7154) class_acc: 0.5781 (0.5869) weight_decay: 0.0500 (0.0500) grad_norm: 0.8227 (1.0887) time: 0.9143 data: 0.0015 max mem: 6925
Epoch: [99] Total time: 0:19:44 (1.8950 s / it)
Averaged stats: lr: 0.003247 min_lr: 0.003247 loss: 2.7388 (2.7111) class_acc: 0.5781 (0.5877) weight_decay: 0.0500 (0.0500) grad_norm: 0.8227 (1.0887)
Test: [ 0/50] eta: 0:08:52 loss: 1.3884 (1.3884) acc1: 68.8000 (68.8000) acc5: 88.0000 (88.0000) time: 10.6447 data: 10.6021 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.4906 (1.5049) acc1: 66.4000 (66.6909) acc5: 88.0000 (87.5636) time: 1.8671 data: 1.8350 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.5863 (1.6592) acc1: 63.2000 (62.6286) acc5: 86.4000 (85.7905) time: 1.0067 data: 0.9769 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.7603 (1.6683) acc1: 60.8000 (62.5806) acc5: 84.8000 (85.4194) time: 0.9331 data: 0.9046 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.6402 (1.6963) acc1: 60.8000 (62.0878) acc5: 82.4000 (84.9171) time: 0.6458 data: 0.6169 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7744 (1.7040) acc1: 61.6000 (62.1600) acc5: 82.4000 (84.7200) time: 0.4963 data: 0.4674 max mem: 6925
Test: Total time: 0:00:46 (0.9229 s / it)
* Acc@1 62.640 Acc@5 84.602 loss 1.679
Accuracy of the model on the 50000 test images: 62.6%
Max accuracy: 63.10%
Epoch: [100] [ 0/625] eta: 3:28:42 lr: 0.003247 min_lr: 0.003247 loss: 2.5179 (2.5179) class_acc: 0.6133 (0.6133) weight_decay: 0.0500 (0.0500) time: 20.0361 data: 16.7607 max mem: 6925
Epoch: [100] [200/625] eta: 0:14:15 lr: 0.003242 min_lr: 0.003242 loss: 2.6986 (2.6810) class_acc: 0.5859 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 0.8639 (inf) time: 1.9545 data: 0.0009 max mem: 6925
Epoch: [100] [400/625] eta: 0:07:21 lr: 0.003236 min_lr: 0.003236 loss: 2.7244 (2.7003) class_acc: 0.5898 (0.5896) weight_decay: 0.0500 (0.0500) grad_norm: 0.9345 (inf) time: 1.7824 data: 0.0009 max mem: 6925
Epoch: [100] [600/625] eta: 0:00:48 lr: 0.003230 min_lr: 0.003230 loss: 2.6804 (2.7074) class_acc: 0.5820 (0.5880) weight_decay: 0.0500 (0.0500) grad_norm: 1.2321 (inf) time: 1.9486 data: 0.0007 max mem: 6925
Epoch: [100] [624/625] eta: 0:00:01 lr: 0.003230 min_lr: 0.003230 loss: 2.7134 (2.7072) class_acc: 0.5859 (0.5880) weight_decay: 0.0500 (0.0500) grad_norm: 0.9765 (inf) time: 0.7673 data: 0.0016 max mem: 6925
Epoch: [100] Total time: 0:19:48 (1.9014 s / it)
Averaged stats: lr: 0.003230 min_lr: 0.003230 loss: 2.7134 (2.7100) class_acc: 0.5859 (0.5876) weight_decay: 0.0500 (0.0500) grad_norm: 0.9765 (inf)
Test: [ 0/50] eta: 0:09:31 loss: 1.4098 (1.4098) acc1: 68.0000 (68.0000) acc5: 88.0000 (88.0000) time: 11.4294 data: 11.3960 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.5086 (1.5136) acc1: 66.4000 (66.0364) acc5: 88.0000 (87.4182) time: 1.9008 data: 1.8710 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.6426 (1.7152) acc1: 61.6000 (62.0190) acc5: 84.8000 (84.9143) time: 1.0058 data: 0.9768 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8495 (1.7347) acc1: 58.4000 (61.4194) acc5: 81.6000 (84.2065) time: 0.9956 data: 0.9671 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7818 (1.7489) acc1: 60.0000 (61.5220) acc5: 81.6000 (83.5902) time: 0.8583 data: 0.8291 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7989 (1.7471) acc1: 60.0000 (61.8080) acc5: 83.2000 (83.6160) time: 0.6654 data: 0.6358 max mem: 6925
Test: Total time: 0:00:52 (1.0435 s / it)
* Acc@1 62.274 Acc@5 84.442 loss 1.713
Accuracy of the model on the 50000 test images: 62.3%
Max accuracy: 63.10%
Epoch: [101] [ 0/625] eta: 3:29:40 lr: 0.003230 min_lr: 0.003230 loss: 2.4995 (2.4995) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 20.1291 data: 19.9006 max mem: 6925
Epoch: [101] [200/625] eta: 0:13:57 lr: 0.003224 min_lr: 0.003224 loss: 2.6449 (2.6780) class_acc: 0.5898 (0.5970) weight_decay: 0.0500 (0.0500) grad_norm: 0.9063 (1.1163) time: 1.8072 data: 0.8971 max mem: 6925
Epoch: [101] [400/625] eta: 0:07:11 lr: 0.003218 min_lr: 0.003218 loss: 2.7341 (2.6942) class_acc: 0.5820 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 0.9033 (1.1268) time: 1.9426 data: 0.0632 max mem: 6925
Epoch: [101] [600/625] eta: 0:00:48 lr: 0.003212 min_lr: 0.003212 loss: 2.7228 (2.7042) class_acc: 0.5742 (0.5892) weight_decay: 0.0500 (0.0500) grad_norm: 1.2037 (1.1225) time: 1.9344 data: 0.1115 max mem: 6925
Epoch: [101] [624/625] eta: 0:00:01 lr: 0.003212 min_lr: 0.003212 loss: 2.6960 (2.7049) class_acc: 0.5820 (0.5889) weight_decay: 0.0500 (0.0500) grad_norm: 1.0986 (1.1224) time: 0.6328 data: 0.0254 max mem: 6925
Epoch: [101] Total time: 0:19:58 (1.9172 s / it)
Averaged stats: lr: 0.003212 min_lr: 0.003212 loss: 2.6960 (2.7056) class_acc: 0.5820 (0.5884) weight_decay: 0.0500 (0.0500) grad_norm: 1.0986 (1.1224)
Test: [ 0/50] eta: 0:10:20 loss: 1.3415 (1.3415) acc1: 75.2000 (75.2000) acc5: 92.8000 (92.8000) time: 12.4034 data: 12.3346 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.5158 (1.5896) acc1: 64.8000 (65.4545) acc5: 86.4000 (86.2545) time: 2.1164 data: 2.0832 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7989 (1.7663) acc1: 60.0000 (61.7905) acc5: 83.2000 (83.6191) time: 1.1186 data: 1.0887 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8783 (1.7898) acc1: 57.6000 (60.8774) acc5: 81.6000 (83.4839) time: 1.0513 data: 1.0209 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.8177 (1.8110) acc1: 60.8000 (60.7610) acc5: 82.4000 (83.0829) time: 0.7241 data: 0.6940 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8007 (1.8138) acc1: 60.0000 (60.7520) acc5: 83.2000 (83.0560) time: 0.6880 data: 0.6580 max mem: 6925
Test: Total time: 0:00:51 (1.0204 s / it)
* Acc@1 61.974 Acc@5 83.964 loss 1.759
Accuracy of the model on the 50000 test images: 62.0%
Max accuracy: 63.10%
Epoch: [102] [ 0/625] eta: 3:40:40 lr: 0.003212 min_lr: 0.003212 loss: 2.5333 (2.5333) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 21.1856 data: 19.5641 max mem: 6925
Epoch: [102] [200/625] eta: 0:14:20 lr: 0.003206 min_lr: 0.003206 loss: 2.6808 (2.6705) class_acc: 0.5898 (0.5958) weight_decay: 0.0500 (0.0500) grad_norm: 0.7968 (1.0390) time: 1.8415 data: 0.0008 max mem: 6925
Epoch: [102] [400/625] eta: 0:07:34 lr: 0.003200 min_lr: 0.003200 loss: 2.7248 (2.6883) class_acc: 0.5859 (0.5923) weight_decay: 0.0500 (0.0500) grad_norm: 1.0953 (1.0800) time: 1.9545 data: 0.0009 max mem: 6925
Epoch: [102] [600/625] eta: 0:00:50 lr: 0.003195 min_lr: 0.003195 loss: 2.7322 (2.6966) class_acc: 0.5859 (0.5909) weight_decay: 0.0500 (0.0500) grad_norm: 0.9136 (1.0517) time: 2.1988 data: 0.0008 max mem: 6925
Epoch: [102] [624/625] eta: 0:00:01 lr: 0.003194 min_lr: 0.003194 loss: 2.7289 (2.6967) class_acc: 0.5859 (0.5909) weight_decay: 0.0500 (0.0500) grad_norm: 0.8710 (1.0529) time: 0.7143 data: 0.0012 max mem: 6925
Epoch: [102] Total time: 0:20:36 (1.9781 s / it)
Averaged stats: lr: 0.003194 min_lr: 0.003194 loss: 2.7289 (2.7044) class_acc: 0.5859 (0.5886) weight_decay: 0.0500 (0.0500) grad_norm: 0.8710 (1.0529)
Test: [ 0/50] eta: 0:10:15 loss: 1.7228 (1.7228) acc1: 61.6000 (61.6000) acc5: 85.6000 (85.6000) time: 12.3022 data: 12.2649 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.5561 (1.6089) acc1: 67.2000 (64.9455) acc5: 85.6000 (85.6727) time: 2.0397 data: 2.0100 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6333 (1.7377) acc1: 60.8000 (61.4857) acc5: 84.0000 (84.1524) time: 1.0499 data: 1.0199 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8166 (1.7436) acc1: 59.2000 (61.6516) acc5: 84.0000 (84.3355) time: 1.0368 data: 1.0071 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6693 (1.7519) acc1: 62.4000 (61.4634) acc5: 83.2000 (83.8049) time: 0.7627 data: 0.7332 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7474 (1.7614) acc1: 60.8000 (61.2960) acc5: 83.2000 (83.6160) time: 0.8309 data: 0.8014 max mem: 6925
Test: Total time: 0:00:51 (1.0318 s / it)
* Acc@1 61.860 Acc@5 84.172 loss 1.727
Accuracy of the model on the 50000 test images: 61.9%
Max accuracy: 63.10%
Epoch: [103] [ 0/625] eta: 3:44:54 lr: 0.003194 min_lr: 0.003194 loss: 2.5240 (2.5240) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 21.5910 data: 18.3808 max mem: 6925
Epoch: [103] [200/625] eta: 0:14:49 lr: 0.003188 min_lr: 0.003188 loss: 2.7006 (2.6814) class_acc: 0.5938 (0.5937) weight_decay: 0.0500 (0.0500) grad_norm: 1.1354 (1.1656) time: 1.9078 data: 0.0008 max mem: 6925
Epoch: [103] [400/625] eta: 0:07:37 lr: 0.003182 min_lr: 0.003182 loss: 2.6971 (2.6884) class_acc: 0.5977 (0.5933) weight_decay: 0.0500 (0.0500) grad_norm: 0.8657 (1.0798) time: 2.0293 data: 0.0009 max mem: 6925
Epoch: [103] [600/625] eta: 0:00:50 lr: 0.003176 min_lr: 0.003176 loss: 2.6000 (2.6939) class_acc: 0.5938 (0.5911) weight_decay: 0.0500 (0.0500) grad_norm: 0.8083 (1.0701) time: 2.0119 data: 0.0008 max mem: 6925
Epoch: [103] [624/625] eta: 0:00:01 lr: 0.003176 min_lr: 0.003176 loss: 2.7528 (2.6962) class_acc: 0.5781 (0.5907) weight_decay: 0.0500 (0.0500) grad_norm: 0.8314 (1.0631) time: 0.7809 data: 0.0014 max mem: 6925
Epoch: [103] Total time: 0:20:36 (1.9782 s / it)
Averaged stats: lr: 0.003176 min_lr: 0.003176 loss: 2.7528 (2.7026) class_acc: 0.5781 (0.5894) weight_decay: 0.0500 (0.0500) grad_norm: 0.8314 (1.0631)
Test: [ 0/50] eta: 0:10:52 loss: 1.6912 (1.6912) acc1: 56.0000 (56.0000) acc5: 87.2000 (87.2000) time: 13.0507 data: 13.0067 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.5050 (1.5404) acc1: 66.4000 (65.8909) acc5: 87.2000 (86.5455) time: 2.0096 data: 1.9784 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6435 (1.7022) acc1: 62.4000 (62.5905) acc5: 84.8000 (84.9143) time: 0.9580 data: 0.9280 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8172 (1.7078) acc1: 58.4000 (61.7548) acc5: 83.2000 (84.6710) time: 0.9848 data: 0.9554 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7014 (1.7326) acc1: 60.0000 (61.1317) acc5: 83.2000 (84.2927) time: 0.8888 data: 0.8581 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6883 (1.7294) acc1: 61.6000 (61.2320) acc5: 83.2000 (84.1440) time: 0.7166 data: 0.6850 max mem: 6925
Test: Total time: 0:00:53 (1.0683 s / it)
* Acc@1 62.250 Acc@5 84.444 loss 1.703
Accuracy of the model on the 50000 test images: 62.3%
Max accuracy: 63.10%
Epoch: [104] [ 0/625] eta: 3:31:43 lr: 0.003176 min_lr: 0.003176 loss: 2.8972 (2.8972) class_acc: 0.5547 (0.5547) weight_decay: 0.0500 (0.0500) time: 20.3257 data: 17.2968 max mem: 6925
Epoch: [104] [200/625] eta: 0:14:32 lr: 0.003170 min_lr: 0.003170 loss: 2.6477 (2.6797) class_acc: 0.5898 (0.5935) weight_decay: 0.0500 (0.0500) grad_norm: 1.0530 (1.1070) time: 1.9989 data: 0.0009 max mem: 6925
Epoch: [104] [400/625] eta: 0:07:35 lr: 0.003164 min_lr: 0.003164 loss: 2.6489 (2.6873) class_acc: 0.5898 (0.5925) weight_decay: 0.0500 (0.0500) grad_norm: 1.0471 (1.0707) time: 1.9145 data: 0.0009 max mem: 6925
Epoch: [104] [600/625] eta: 0:00:50 lr: 0.003158 min_lr: 0.003158 loss: 2.7025 (2.6984) class_acc: 0.5781 (0.5895) weight_decay: 0.0500 (0.0500) grad_norm: 0.9746 (1.0849) time: 2.0599 data: 0.0009 max mem: 6925
Epoch: [104] [624/625] eta: 0:00:01 lr: 0.003158 min_lr: 0.003158 loss: 2.6830 (2.6990) class_acc: 0.5938 (0.5895) weight_decay: 0.0500 (0.0500) grad_norm: 0.8720 (1.0755) time: 0.8390 data: 0.0015 max mem: 6925
Epoch: [104] Total time: 0:20:33 (1.9737 s / it)
Averaged stats: lr: 0.003158 min_lr: 0.003158 loss: 2.6830 (2.6993) class_acc: 0.5938 (0.5899) weight_decay: 0.0500 (0.0500) grad_norm: 0.8720 (1.0755)
Test: [ 0/50] eta: 0:08:37 loss: 1.7647 (1.7647) acc1: 60.8000 (60.8000) acc5: 87.2000 (87.2000) time: 10.3480 data: 10.3119 max mem: 6925
Test: [10/50] eta: 0:01:15 loss: 1.5893 (1.5757) acc1: 65.6000 (65.1636) acc5: 85.6000 (85.7455) time: 1.8822 data: 1.8524 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7316 (1.7636) acc1: 61.6000 (62.0191) acc5: 84.0000 (83.3524) time: 1.1122 data: 1.0833 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8922 (1.7788) acc1: 59.2000 (61.0581) acc5: 80.0000 (83.5871) time: 1.1327 data: 1.1036 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7579 (1.7884) acc1: 59.2000 (60.8390) acc5: 81.6000 (83.2585) time: 0.9023 data: 0.8731 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7164 (1.7894) acc1: 60.0000 (60.7680) acc5: 80.0000 (83.2960) time: 0.7966 data: 0.7661 max mem: 6925
Test: Total time: 0:00:54 (1.0821 s / it)
* Acc@1 61.726 Acc@5 83.990 loss 1.746
Accuracy of the model on the 50000 test images: 61.7%
Max accuracy: 63.10%
Epoch: [105] [ 0/625] eta: 3:28:33 lr: 0.003158 min_lr: 0.003158 loss: 2.6048 (2.6048) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 20.0217 data: 19.1030 max mem: 6925
Epoch: [105] [200/625] eta: 0:13:57 lr: 0.003152 min_lr: 0.003152 loss: 2.6663 (2.6728) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) grad_norm: 0.8798 (1.0369) time: 1.8602 data: 0.1392 max mem: 6925
Epoch: [105] [400/625] eta: 0:07:20 lr: 0.003146 min_lr: 0.003146 loss: 2.7699 (2.6909) class_acc: 0.5742 (0.5929) weight_decay: 0.0500 (0.0500) grad_norm: 1.0702 (1.0633) time: 2.0661 data: 0.0235 max mem: 6925
Epoch: [105] [600/625] eta: 0:00:48 lr: 0.003140 min_lr: 0.003140 loss: 2.7192 (2.6962) class_acc: 0.5859 (0.5915) weight_decay: 0.0500 (0.0500) grad_norm: 1.0556 (1.0505) time: 2.1112 data: 0.3171 max mem: 6925
Epoch: [105] [624/625] eta: 0:00:01 lr: 0.003139 min_lr: 0.003139 loss: 2.6951 (2.6969) class_acc: 0.5859 (0.5912) weight_decay: 0.0500 (0.0500) grad_norm: 0.9183 (1.0483) time: 1.1123 data: 0.0356 max mem: 6925
Epoch: [105] Total time: 0:19:48 (1.9012 s / it)
Averaged stats: lr: 0.003139 min_lr: 0.003139 loss: 2.6951 (2.6959) class_acc: 0.5859 (0.5908) weight_decay: 0.0500 (0.0500) grad_norm: 0.9183 (1.0483)
Test: [ 0/50] eta: 0:09:46 loss: 1.5393 (1.5393) acc1: 64.0000 (64.0000) acc5: 87.2000 (87.2000) time: 11.7394 data: 11.6951 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 1.6478 (1.6276) acc1: 64.0000 (64.0727) acc5: 87.2000 (86.2545) time: 1.8176 data: 1.7854 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.8150 (1.7780) acc1: 60.8000 (61.1429) acc5: 84.0000 (84.0762) time: 0.8843 data: 0.8536 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.8319 (1.7992) acc1: 57.6000 (60.4129) acc5: 81.6000 (83.3806) time: 0.9037 data: 0.8740 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7901 (1.8056) acc1: 58.4000 (59.8439) acc5: 81.6000 (83.3366) time: 0.7467 data: 0.7177 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6952 (1.8204) acc1: 59.2000 (59.4240) acc5: 83.2000 (83.0560) time: 0.4621 data: 0.4329 max mem: 6925
Test: Total time: 0:00:47 (0.9488 s / it)
* Acc@1 60.996 Acc@5 83.392 loss 1.775
Accuracy of the model on the 50000 test images: 61.0%
Max accuracy: 63.10%
Epoch: [106] [ 0/625] eta: 3:38:47 lr: 0.003139 min_lr: 0.003139 loss: 2.5921 (2.5921) class_acc: 0.5938 (0.5938) weight_decay: 0.0500 (0.0500) time: 21.0046 data: 16.3889 max mem: 6925
Epoch: [106] [200/625] eta: 0:14:00 lr: 0.003133 min_lr: 0.003133 loss: 2.7485 (2.6876) class_acc: 0.5859 (0.5954) weight_decay: 0.0500 (0.0500) grad_norm: 0.9350 (1.0745) time: 1.8108 data: 0.0013 max mem: 6925
Epoch: [106] [400/625] eta: 0:07:12 lr: 0.003127 min_lr: 0.003127 loss: 2.6977 (2.6989) class_acc: 0.5664 (0.5907) weight_decay: 0.0500 (0.0500) grad_norm: 0.9784 (1.0853) time: 1.9287 data: 0.1640 max mem: 6925
Epoch: [106] [600/625] eta: 0:00:48 lr: 0.003121 min_lr: 0.003121 loss: 2.6633 (2.7022) class_acc: 0.5938 (0.5904) weight_decay: 0.0500 (0.0500) grad_norm: 0.9400 (inf) time: 2.0447 data: 0.0258 max mem: 6925
Epoch: [106] [624/625] eta: 0:00:01 lr: 0.003121 min_lr: 0.003121 loss: 2.6709 (2.7029) class_acc: 0.5859 (0.5901) weight_decay: 0.0500 (0.0500) grad_norm: 0.7951 (inf) time: 0.7981 data: 0.0019 max mem: 6925
Epoch: [106] Total time: 0:19:49 (1.9034 s / it)
Averaged stats: lr: 0.003121 min_lr: 0.003121 loss: 2.6709 (2.6950) class_acc: 0.5859 (0.5910) weight_decay: 0.0500 (0.0500) grad_norm: 0.7951 (inf)
Test: [ 0/50] eta: 0:11:16 loss: 1.5835 (1.5835) acc1: 67.2000 (67.2000) acc5: 87.2000 (87.2000) time: 13.5215 data: 13.4815 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.5606 (1.6229) acc1: 66.4000 (65.6000) acc5: 86.4000 (85.5273) time: 2.1009 data: 2.0688 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6920 (1.7466) acc1: 60.0000 (61.9429) acc5: 84.0000 (83.9238) time: 0.9897 data: 0.9598 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.7601 (1.7417) acc1: 59.2000 (61.3936) acc5: 82.4000 (83.8710) time: 1.0067 data: 0.9781 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7205 (1.7515) acc1: 59.2000 (60.8976) acc5: 84.0000 (83.9415) time: 0.9670 data: 0.9375 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.9025 (1.7680) acc1: 56.0000 (60.5920) acc5: 83.2000 (83.5680) time: 0.7925 data: 0.7631 max mem: 6925
Test: Total time: 0:00:57 (1.1561 s / it)
* Acc@1 61.566 Acc@5 83.952 loss 1.733
Accuracy of the model on the 50000 test images: 61.6%
Max accuracy: 63.10%
Epoch: [107] [ 0/625] eta: 3:38:02 lr: 0.003121 min_lr: 0.003121 loss: 2.6048 (2.6048) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 20.9313 data: 18.9735 max mem: 6925
Epoch: [107] [200/625] eta: 0:14:07 lr: 0.003115 min_lr: 0.003115 loss: 2.6950 (2.6797) class_acc: 0.5781 (0.5940) weight_decay: 0.0500 (0.0500) grad_norm: 1.0611 (1.0708) time: 1.6002 data: 0.8454 max mem: 6925
Epoch: [107] [400/625] eta: 0:07:22 lr: 0.003109 min_lr: 0.003109 loss: 2.7293 (2.6906) class_acc: 0.5977 (0.5922) weight_decay: 0.0500 (0.0500) grad_norm: 0.8051 (1.0486) time: 2.0282 data: 1.7433 max mem: 6925
Epoch: [107] [600/625] eta: 0:00:48 lr: 0.003103 min_lr: 0.003103 loss: 2.6337 (2.6933) class_acc: 0.5898 (0.5919) weight_decay: 0.0500 (0.0500) grad_norm: 0.8855 (1.0780) time: 1.8963 data: 1.0723 max mem: 6925
Epoch: [107] [624/625] eta: 0:00:01 lr: 0.003102 min_lr: 0.003102 loss: 2.6960 (2.6942) class_acc: 0.5820 (0.5916) weight_decay: 0.0500 (0.0500) grad_norm: 0.9264 (1.0801) time: 0.8489 data: 0.3663 max mem: 6925
Epoch: [107] Total time: 0:20:05 (1.9295 s / it)
Averaged stats: lr: 0.003102 min_lr: 0.003102 loss: 2.6960 (2.6915) class_acc: 0.5820 (0.5919) weight_decay: 0.0500 (0.0500) grad_norm: 0.9264 (1.0801)
Test: [ 0/50] eta: 0:10:27 loss: 1.7142 (1.7142) acc1: 58.4000 (58.4000) acc5: 88.0000 (88.0000) time: 12.5557 data: 12.5235 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.5734 (1.5776) acc1: 65.6000 (65.7455) acc5: 86.4000 (85.5273) time: 2.0262 data: 1.9930 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7789 (1.7634) acc1: 60.8000 (61.9429) acc5: 83.2000 (83.8095) time: 1.0009 data: 0.9690 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8421 (1.7809) acc1: 60.8000 (61.9613) acc5: 82.4000 (83.5871) time: 0.9448 data: 0.9132 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7194 (1.7975) acc1: 60.0000 (61.3268) acc5: 83.2000 (83.2976) time: 0.6449 data: 0.6136 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7849 (1.8080) acc1: 57.6000 (60.8640) acc5: 83.2000 (83.3280) time: 0.5824 data: 0.5487 max mem: 6925
Test: Total time: 0:00:46 (0.9284 s / it)
* Acc@1 61.606 Acc@5 83.528 loss 1.774
Accuracy of the model on the 50000 test images: 61.6%
Max accuracy: 63.10%
Epoch: [108] [ 0/625] eta: 3:21:17 lr: 0.003102 min_lr: 0.003102 loss: 2.7115 (2.7115) class_acc: 0.5664 (0.5664) weight_decay: 0.0500 (0.0500) time: 19.3241 data: 18.4716 max mem: 6925
Epoch: [108] [200/625] eta: 0:13:44 lr: 0.003096 min_lr: 0.003096 loss: 2.6611 (2.6745) class_acc: 0.5977 (0.5937) weight_decay: 0.0500 (0.0500) grad_norm: 1.0226 (1.0804) time: 2.0843 data: 0.0496 max mem: 6925
Epoch: [108] [400/625] eta: 0:07:03 lr: 0.003090 min_lr: 0.003090 loss: 2.7157 (2.6816) class_acc: 0.5859 (0.5925) weight_decay: 0.0500 (0.0500) grad_norm: 1.0208 (1.0849) time: 1.8197 data: 0.0009 max mem: 6925
Epoch: [108] [600/625] eta: 0:00:47 lr: 0.003084 min_lr: 0.003084 loss: 2.7119 (2.6878) class_acc: 0.5859 (0.5914) weight_decay: 0.0500 (0.0500) grad_norm: 1.0333 (1.0718) time: 1.8481 data: 0.0121 max mem: 6925
Epoch: [108] [624/625] eta: 0:00:01 lr: 0.003083 min_lr: 0.003083 loss: 2.7169 (2.6893) class_acc: 0.5898 (0.5913) weight_decay: 0.0500 (0.0500) grad_norm: 1.2491 (1.0716) time: 0.7593 data: 0.0195 max mem: 6925
Epoch: [108] Total time: 0:19:45 (1.8964 s / it)
Averaged stats: lr: 0.003083 min_lr: 0.003083 loss: 2.7169 (2.6893) class_acc: 0.5898 (0.5921) weight_decay: 0.0500 (0.0500) grad_norm: 1.2491 (1.0716)
Test: [ 0/50] eta: 0:09:21 loss: 1.5216 (1.5216) acc1: 62.4000 (62.4000) acc5: 89.6000 (89.6000) time: 11.2325 data: 11.1726 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.4538 (1.4618) acc1: 67.2000 (67.4182) acc5: 87.2000 (87.5636) time: 2.1419 data: 2.1100 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.5888 (1.6576) acc1: 62.4000 (63.0095) acc5: 84.8000 (84.8762) time: 1.2947 data: 1.2631 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.8273 (1.6738) acc1: 59.2000 (62.5290) acc5: 82.4000 (84.6194) time: 1.3163 data: 1.2843 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.7494 (1.6788) acc1: 60.8000 (62.5366) acc5: 84.0000 (84.4488) time: 0.9844 data: 0.9543 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7418 (1.7127) acc1: 60.0000 (61.9520) acc5: 83.2000 (83.9040) time: 0.8192 data: 0.7868 max mem: 6925
Test: Total time: 0:00:58 (1.1609 s / it)
* Acc@1 62.590 Acc@5 84.616 loss 1.671
Accuracy of the model on the 50000 test images: 62.6%
Max accuracy: 63.10%
Epoch: [109] [ 0/625] eta: 4:03:39 lr: 0.003083 min_lr: 0.003083 loss: 2.6028 (2.6028) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 23.3918 data: 20.3645 max mem: 6925
Epoch: [109] [200/625] eta: 0:14:11 lr: 0.003077 min_lr: 0.003077 loss: 2.6456 (2.6694) class_acc: 0.5820 (0.5960) weight_decay: 0.0500 (0.0500) grad_norm: 1.0658 (1.1000) time: 1.9118 data: 0.0015 max mem: 6925
Epoch: [109] [400/625] eta: 0:07:32 lr: 0.003071 min_lr: 0.003071 loss: 2.6594 (2.6828) class_acc: 0.6016 (0.5937) weight_decay: 0.0500 (0.0500) grad_norm: 0.9586 (1.0490) time: 1.9641 data: 0.0017 max mem: 6925
Epoch: [109] [600/625] eta: 0:00:50 lr: 0.003065 min_lr: 0.003065 loss: 2.6897 (2.6889) class_acc: 0.5938 (0.5926) weight_decay: 0.0500 (0.0500) grad_norm: 0.7699 (1.0470) time: 2.1043 data: 0.0102 max mem: 6925
Epoch: [109] [624/625] eta: 0:00:01 lr: 0.003064 min_lr: 0.003064 loss: 2.6651 (2.6889) class_acc: 0.5898 (0.5925) weight_decay: 0.0500 (0.0500) grad_norm: 1.0052 (1.0585) time: 1.0382 data: 0.0015 max mem: 6925
Epoch: [109] Total time: 0:20:29 (1.9679 s / it)
Averaged stats: lr: 0.003064 min_lr: 0.003064 loss: 2.6651 (2.6853) class_acc: 0.5898 (0.5931) weight_decay: 0.0500 (0.0500) grad_norm: 1.0052 (1.0585)
Test: [ 0/50] eta: 0:10:39 loss: 1.4338 (1.4338) acc1: 67.2000 (67.2000) acc5: 88.8000 (88.8000) time: 12.7941 data: 12.7616 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.4669 (1.5092) acc1: 66.4000 (66.2545) acc5: 87.2000 (87.1273) time: 2.0360 data: 2.0061 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6684 (1.7229) acc1: 60.8000 (61.7143) acc5: 84.0000 (84.3048) time: 1.0065 data: 0.9766 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9023 (1.7523) acc1: 57.6000 (60.8516) acc5: 81.6000 (83.4581) time: 1.0087 data: 0.9795 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8187 (1.7629) acc1: 57.6000 (60.7415) acc5: 83.2000 (83.3756) time: 0.7469 data: 0.7163 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6467 (1.7657) acc1: 60.8000 (60.7360) acc5: 84.0000 (83.2960) time: 0.7295 data: 0.6983 max mem: 6925
Test: Total time: 0:00:49 (0.9956 s / it)
* Acc@1 61.356 Acc@5 83.924 loss 1.730
Accuracy of the model on the 50000 test images: 61.4%
Max accuracy: 63.10%
Epoch: [110] [ 0/625] eta: 3:44:53 lr: 0.003064 min_lr: 0.003064 loss: 2.7178 (2.7178) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) time: 21.5897 data: 21.3567 max mem: 6925
Epoch: [110] [200/625] eta: 0:14:11 lr: 0.003058 min_lr: 0.003058 loss: 2.7087 (2.6776) class_acc: 0.5977 (0.5954) weight_decay: 0.0500 (0.0500) grad_norm: 0.8377 (1.1824) time: 1.8391 data: 0.2014 max mem: 6925
Epoch: [110] [400/625] eta: 0:07:24 lr: 0.003052 min_lr: 0.003052 loss: 2.6598 (2.6820) class_acc: 0.6016 (0.5948) weight_decay: 0.0500 (0.0500) grad_norm: 1.0445 (1.1209) time: 2.0015 data: 0.0009 max mem: 6925
Epoch: [110] [600/625] eta: 0:00:49 lr: 0.003046 min_lr: 0.003046 loss: 2.6824 (2.6837) class_acc: 0.5898 (0.5940) weight_decay: 0.0500 (0.0500) grad_norm: 1.1361 (1.1016) time: 1.8757 data: 0.0289 max mem: 6925
Epoch: [110] [624/625] eta: 0:00:01 lr: 0.003045 min_lr: 0.003045 loss: 2.6938 (2.6840) class_acc: 0.5859 (0.5939) weight_decay: 0.0500 (0.0500) grad_norm: 0.9851 (1.0985) time: 1.0616 data: 0.0097 max mem: 6925
Epoch: [110] Total time: 0:20:06 (1.9307 s / it)
Averaged stats: lr: 0.003045 min_lr: 0.003045 loss: 2.6938 (2.6831) class_acc: 0.5859 (0.5939) weight_decay: 0.0500 (0.0500) grad_norm: 0.9851 (1.0985)
Test: [ 0/50] eta: 0:10:26 loss: 1.3669 (1.3669) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.5260 data: 12.4735 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.4836 (1.5131) acc1: 65.6000 (65.6727) acc5: 86.4000 (85.9636) time: 2.0114 data: 1.9794 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.6512 (1.6929) acc1: 61.6000 (61.6381) acc5: 84.0000 (83.8476) time: 0.9827 data: 0.9535 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.7894 (1.7046) acc1: 58.4000 (61.6000) acc5: 82.4000 (83.8194) time: 0.9253 data: 0.8967 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.6636 (1.7106) acc1: 59.2000 (61.2683) acc5: 84.0000 (83.9220) time: 0.6275 data: 0.5976 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6963 (1.7269) acc1: 59.2000 (60.5600) acc5: 84.8000 (83.8240) time: 0.4972 data: 0.4673 max mem: 6925
Test: Total time: 0:00:47 (0.9403 s / it)
* Acc@1 61.970 Acc@5 84.366 loss 1.682
Accuracy of the model on the 50000 test images: 62.0%
Max accuracy: 63.10%
Epoch: [111] [ 0/625] eta: 3:19:22 lr: 0.003045 min_lr: 0.003045 loss: 2.7346 (2.7346) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 19.1400 data: 17.7385 max mem: 6925
Epoch: [111] [200/625] eta: 0:14:05 lr: 0.003039 min_lr: 0.003039 loss: 2.6058 (2.6671) class_acc: 0.5977 (0.5980) weight_decay: 0.0500 (0.0500) grad_norm: 0.8861 (1.1011) time: 1.7529 data: 0.0017 max mem: 6925
Epoch: [111] [400/625] eta: 0:07:15 lr: 0.003033 min_lr: 0.003033 loss: 2.6874 (2.6756) class_acc: 0.5859 (0.5960) weight_decay: 0.0500 (0.0500) grad_norm: 0.7799 (1.0588) time: 1.9749 data: 0.0360 max mem: 6925
Epoch: [111] [600/625] eta: 0:00:48 lr: 0.003027 min_lr: 0.003027 loss: 2.6883 (2.6832) class_acc: 0.5781 (0.5944) weight_decay: 0.0500 (0.0500) grad_norm: 0.8557 (1.0708) time: 1.9450 data: 0.0086 max mem: 6925
Epoch: [111] [624/625] eta: 0:00:01 lr: 0.003026 min_lr: 0.003026 loss: 2.6897 (2.6841) class_acc: 0.5898 (0.5941) weight_decay: 0.0500 (0.0500) grad_norm: 0.9211 (1.0728) time: 0.8745 data: 0.0015 max mem: 6925
Epoch: [111] Total time: 0:19:41 (1.8906 s / it)
Averaged stats: lr: 0.003026 min_lr: 0.003026 loss: 2.6897 (2.6812) class_acc: 0.5898 (0.5939) weight_decay: 0.0500 (0.0500) grad_norm: 0.9211 (1.0728)
Test: [ 0/50] eta: 0:10:31 loss: 1.7358 (1.7358) acc1: 58.4000 (58.4000) acc5: 85.6000 (85.6000) time: 12.6336 data: 12.6026 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.4448 (1.4967) acc1: 67.2000 (66.4000) acc5: 87.2000 (87.2000) time: 2.0569 data: 2.0270 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.7332 (1.6818) acc1: 61.6000 (62.2095) acc5: 84.8000 (84.9905) time: 1.0652 data: 1.0360 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.8136 (1.6840) acc1: 59.2000 (61.7806) acc5: 83.2000 (84.8258) time: 1.1223 data: 1.0935 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6392 (1.6790) acc1: 61.6000 (61.6195) acc5: 84.8000 (84.7024) time: 1.0031 data: 0.9736 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6392 (1.6991) acc1: 62.4000 (61.5040) acc5: 84.8000 (84.3840) time: 0.9290 data: 0.8991 max mem: 6925
Test: Total time: 0:00:54 (1.0928 s / it)
* Acc@1 62.440 Acc@5 84.784 loss 1.666
Accuracy of the model on the 50000 test images: 62.4%
Max accuracy: 63.10%
Epoch: [112] [ 0/625] eta: 3:43:30 lr: 0.003026 min_lr: 0.003026 loss: 2.6988 (2.6988) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 21.4569 data: 16.7715 max mem: 6925
Epoch: [112] [200/625] eta: 0:14:29 lr: 0.003020 min_lr: 0.003020 loss: 2.6134 (2.6629) class_acc: 0.6016 (0.5982) weight_decay: 0.0500 (0.0500) grad_norm: 0.9381 (1.0204) time: 2.1540 data: 0.0331 max mem: 6925
Epoch: [112] [400/625] eta: 0:07:19 lr: 0.003014 min_lr: 0.003014 loss: 2.6627 (2.6696) class_acc: 0.5820 (0.5965) weight_decay: 0.0500 (0.0500) grad_norm: 1.0221 (1.0328) time: 1.9602 data: 0.0007 max mem: 6925
Epoch: [112] [600/625] eta: 0:00:48 lr: 0.003007 min_lr: 0.003007 loss: 2.6411 (2.6750) class_acc: 0.5781 (0.5952) weight_decay: 0.0500 (0.0500) grad_norm: 0.9985 (1.0521) time: 1.9399 data: 0.0007 max mem: 6925
Epoch: [112] [624/625] eta: 0:00:01 lr: 0.003007 min_lr: 0.003007 loss: 2.7056 (2.6757) class_acc: 0.5820 (0.5946) weight_decay: 0.0500 (0.0500) grad_norm: 0.8955 (1.0484) time: 0.7459 data: 0.0014 max mem: 6925
Epoch: [112] Total time: 0:20:06 (1.9305 s / it)
Averaged stats: lr: 0.003007 min_lr: 0.003007 loss: 2.7056 (2.6761) class_acc: 0.5820 (0.5948) weight_decay: 0.0500 (0.0500) grad_norm: 0.8955 (1.0484)
Test: [ 0/50] eta: 0:10:35 loss: 1.6038 (1.6038) acc1: 62.4000 (62.4000) acc5: 86.4000 (86.4000) time: 12.7151 data: 12.6763 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.5625 (1.5600) acc1: 68.0000 (66.9091) acc5: 87.2000 (86.1091) time: 2.1353 data: 2.1051 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.7181 (1.7338) acc1: 62.4000 (63.0476) acc5: 84.8000 (83.9619) time: 1.1304 data: 1.1010 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.9047 (1.7470) acc1: 58.4000 (62.7613) acc5: 82.4000 (83.7161) time: 0.9874 data: 0.9585 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6199 (1.7595) acc1: 60.0000 (61.9122) acc5: 82.4000 (83.4537) time: 0.5271 data: 0.4971 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6959 (1.7597) acc1: 62.4000 (61.7440) acc5: 84.0000 (83.5360) time: 0.5203 data: 0.4905 max mem: 6925
Test: Total time: 0:00:46 (0.9348 s / it)
* Acc@1 61.954 Acc@5 84.056 loss 1.721
Accuracy of the model on the 50000 test images: 62.0%
Max accuracy: 63.10%
Epoch: [113] [ 0/625] eta: 3:27:19 lr: 0.003007 min_lr: 0.003007 loss: 2.6214 (2.6214) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 19.9027 data: 15.4132 max mem: 6925
Epoch: [113] [200/625] eta: 0:13:40 lr: 0.003000 min_lr: 0.003000 loss: 2.6743 (2.6650) class_acc: 0.5898 (0.5999) weight_decay: 0.0500 (0.0500) grad_norm: 0.7952 (1.0862) time: 1.8760 data: 0.0012 max mem: 6925
Epoch: [113] [400/625] eta: 0:07:09 lr: 0.002994 min_lr: 0.002994 loss: 2.6020 (2.6685) class_acc: 0.5898 (0.5971) weight_decay: 0.0500 (0.0500) grad_norm: 0.8951 (1.0488) time: 1.9172 data: 0.0012 max mem: 6925
Epoch: [113] [600/625] eta: 0:00:48 lr: 0.002988 min_lr: 0.002988 loss: 2.6641 (2.6732) class_acc: 0.5820 (0.5961) weight_decay: 0.0500 (0.0500) grad_norm: 0.9715 (inf) time: 1.9839 data: 0.0008 max mem: 6925
Epoch: [113] [624/625] eta: 0:00:01 lr: 0.002987 min_lr: 0.002987 loss: 2.6128 (2.6729) class_acc: 0.6016 (0.5960) weight_decay: 0.0500 (0.0500) grad_norm: 0.8902 (inf) time: 0.9183 data: 0.0014 max mem: 6925
Epoch: [113] Total time: 0:19:49 (1.9038 s / it)
Averaged stats: lr: 0.002987 min_lr: 0.002987 loss: 2.6128 (2.6732) class_acc: 0.6016 (0.5960) weight_decay: 0.0500 (0.0500) grad_norm: 0.8902 (inf)
Test: [ 0/50] eta: 0:11:05 loss: 1.5242 (1.5242) acc1: 62.4000 (62.4000) acc5: 88.8000 (88.8000) time: 13.3174 data: 13.2856 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 1.5162 (1.5525) acc1: 67.2000 (66.0364) acc5: 88.8000 (87.2727) time: 2.2762 data: 2.2458 max mem: 6925
Test: [20/50] eta: 0:00:55 loss: 1.6208 (1.6998) acc1: 61.6000 (62.4381) acc5: 84.8000 (85.7143) time: 1.2698 data: 1.2404 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.8822 (1.7060) acc1: 59.2000 (62.8129) acc5: 84.0000 (85.1097) time: 1.2803 data: 1.2515 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6374 (1.7081) acc1: 63.2000 (62.2634) acc5: 84.8000 (85.0342) time: 0.8371 data: 0.8078 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7199 (1.7211) acc1: 60.8000 (62.0160) acc5: 84.8000 (84.8960) time: 0.7356 data: 0.7067 max mem: 6925
Test: Total time: 0:00:56 (1.1297 s / it)
* Acc@1 63.114 Acc@5 85.110 loss 1.683
Accuracy of the model on the 50000 test images: 63.1%
Max accuracy: 63.11%
Epoch: [114] [ 0/625] eta: 3:36:27 lr: 0.002987 min_lr: 0.002987 loss: 2.9016 (2.9016) class_acc: 0.5352 (0.5352) weight_decay: 0.0500 (0.0500) time: 20.7799 data: 20.5515 max mem: 6925
Epoch: [114] [200/625] eta: 0:14:25 lr: 0.002981 min_lr: 0.002981 loss: 2.6400 (2.6584) class_acc: 0.6016 (0.5997) weight_decay: 0.0500 (0.0500) grad_norm: 1.0103 (1.0156) time: 1.8821 data: 0.0010 max mem: 6925
Epoch: [114] [400/625] eta: 0:07:22 lr: 0.002975 min_lr: 0.002975 loss: 2.6209 (2.6693) class_acc: 0.5898 (0.5969) weight_decay: 0.0500 (0.0500) grad_norm: 0.8398 (1.0432) time: 1.8809 data: 0.0216 max mem: 6925
Epoch: [114] [600/625] eta: 0:00:48 lr: 0.002968 min_lr: 0.002968 loss: 2.6777 (2.6704) class_acc: 0.5898 (0.5962) weight_decay: 0.0500 (0.0500) grad_norm: 0.9638 (inf) time: 1.9491 data: 0.0672 max mem: 6925
Epoch: [114] [624/625] eta: 0:00:01 lr: 0.002968 min_lr: 0.002968 loss: 2.6624 (2.6711) class_acc: 0.5859 (0.5960) weight_decay: 0.0500 (0.0500) grad_norm: 0.9145 (inf) time: 1.1160 data: 0.0183 max mem: 6925
Epoch: [114] Total time: 0:19:49 (1.9028 s / it)
Averaged stats: lr: 0.002968 min_lr: 0.002968 loss: 2.6624 (2.6719) class_acc: 0.5859 (0.5958) weight_decay: 0.0500 (0.0500) grad_norm: 0.9145 (inf)
Test: [ 0/50] eta: 0:09:37 loss: 1.6618 (1.6618) acc1: 64.8000 (64.8000) acc5: 88.0000 (88.0000) time: 11.5454 data: 11.5143 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.6004 (1.5083) acc1: 64.8000 (67.9273) acc5: 88.0000 (85.8909) time: 2.0275 data: 1.9960 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.6266 (1.6593) acc1: 62.4000 (63.2000) acc5: 83.2000 (84.7238) time: 1.1120 data: 1.0807 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.7726 (1.6765) acc1: 60.0000 (62.5032) acc5: 83.2000 (84.2581) time: 1.0476 data: 1.0177 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6472 (1.6903) acc1: 60.8000 (62.4585) acc5: 82.4000 (84.0000) time: 0.6375 data: 0.6085 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7045 (1.6807) acc1: 60.8000 (62.2560) acc5: 85.6000 (84.3200) time: 0.5892 data: 0.5593 max mem: 6925
Test: Total time: 0:00:47 (0.9425 s / it)
* Acc@1 63.170 Acc@5 85.368 loss 1.626
Accuracy of the model on the 50000 test images: 63.2%
Max accuracy: 63.17%
Epoch: [115] [ 0/625] eta: 3:05:58 lr: 0.002968 min_lr: 0.002968 loss: 2.9210 (2.9210) class_acc: 0.5273 (0.5273) weight_decay: 0.0500 (0.0500) time: 17.8537 data: 17.6018 max mem: 6925
Epoch: [115] [200/625] eta: 0:13:54 lr: 0.002961 min_lr: 0.002961 loss: 2.5764 (2.6493) class_acc: 0.6016 (0.6024) weight_decay: 0.0500 (0.0500) grad_norm: 0.9338 (1.0177) time: 1.8026 data: 0.0844 max mem: 6925
Epoch: [115] [400/625] eta: 0:07:14 lr: 0.002955 min_lr: 0.002955 loss: 2.6860 (2.6596) class_acc: 0.5938 (0.5993) weight_decay: 0.0500 (0.0500) grad_norm: 0.8269 (1.0518) time: 1.8170 data: 0.0140 max mem: 6925
Epoch: [115] [600/625] eta: 0:00:48 lr: 0.002949 min_lr: 0.002949 loss: 2.6854 (2.6674) class_acc: 0.5977 (0.5985) weight_decay: 0.0500 (0.0500) grad_norm: 1.0556 (1.0480) time: 1.8901 data: 0.0009 max mem: 6925
Epoch: [115] [624/625] eta: 0:00:01 lr: 0.002948 min_lr: 0.002948 loss: 2.6838 (2.6686) class_acc: 0.5820 (0.5979) weight_decay: 0.0500 (0.0500) grad_norm: 1.3495 (1.0709) time: 0.5404 data: 0.0014 max mem: 6925
Epoch: [115] Total time: 0:19:42 (1.8919 s / it)
Averaged stats: lr: 0.002948 min_lr: 0.002948 loss: 2.6838 (2.6707) class_acc: 0.5820 (0.5962) weight_decay: 0.0500 (0.0500) grad_norm: 1.3495 (1.0709)
Test: [ 0/50] eta: 0:09:52 loss: 1.8592 (1.8592) acc1: 60.0000 (60.0000) acc5: 85.6000 (85.6000) time: 11.8518 data: 11.8068 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 1.6178 (1.6208) acc1: 64.8000 (64.3636) acc5: 85.6000 (85.0909) time: 2.2603 data: 2.2299 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.7361 (1.7488) acc1: 60.8000 (61.0286) acc5: 84.8000 (84.1524) time: 1.2269 data: 1.1974 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8654 (1.7530) acc1: 60.8000 (61.4452) acc5: 82.4000 (83.9484) time: 0.9811 data: 0.9519 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6981 (1.7445) acc1: 61.6000 (61.3463) acc5: 83.2000 (83.8244) time: 0.5923 data: 0.5633 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7291 (1.7484) acc1: 60.0000 (61.3600) acc5: 83.2000 (83.7760) time: 0.4602 data: 0.4313 max mem: 6925
Test: Total time: 0:00:49 (0.9938 s / it)
* Acc@1 61.814 Acc@5 84.224 loss 1.727
Accuracy of the model on the 50000 test images: 61.8%
Max accuracy: 63.17%
Epoch: [116] [ 0/625] eta: 3:32:40 lr: 0.002948 min_lr: 0.002948 loss: 2.6203 (2.6203) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 20.4168 data: 18.3380 max mem: 6925
Epoch: [116] [200/625] eta: 0:14:01 lr: 0.002942 min_lr: 0.002942 loss: 2.6418 (2.6492) class_acc: 0.5938 (0.6010) weight_decay: 0.0500 (0.0500) grad_norm: 0.9103 (0.9957) time: 1.8793 data: 0.0011 max mem: 6925
Epoch: [116] [400/625] eta: 0:07:15 lr: 0.002935 min_lr: 0.002935 loss: 2.5681 (2.6574) class_acc: 0.6133 (0.5996) weight_decay: 0.0500 (0.0500) grad_norm: 1.0011 (1.0062) time: 1.8551 data: 0.0178 max mem: 6925
Epoch: [116] [600/625] eta: 0:00:48 lr: 0.002929 min_lr: 0.002929 loss: 2.6996 (2.6609) class_acc: 0.5898 (0.5982) weight_decay: 0.0500 (0.0500) grad_norm: 0.8950 (1.0222) time: 1.8271 data: 0.0146 max mem: 6925
Epoch: [116] [624/625] eta: 0:00:01 lr: 0.002928 min_lr: 0.002928 loss: 2.6658 (2.6613) class_acc: 0.5898 (0.5981) weight_decay: 0.0500 (0.0500) grad_norm: 0.9831 (1.0226) time: 0.8364 data: 0.0165 max mem: 6925
Epoch: [116] Total time: 0:19:43 (1.8930 s / it)
Averaged stats: lr: 0.002928 min_lr: 0.002928 loss: 2.6658 (2.6647) class_acc: 0.5898 (0.5974) weight_decay: 0.0500 (0.0500) grad_norm: 0.9831 (1.0226)
Test: [ 0/50] eta: 0:10:09 loss: 1.3381 (1.3381) acc1: 68.8000 (68.8000) acc5: 90.4000 (90.4000) time: 12.1833 data: 12.1490 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.5256 (1.5041) acc1: 66.4000 (65.7455) acc5: 86.4000 (87.4182) time: 2.0842 data: 2.0547 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7240 (1.6756) acc1: 62.4000 (62.0952) acc5: 85.6000 (85.2952) time: 1.1204 data: 1.0918 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7883 (1.6865) acc1: 59.2000 (61.9355) acc5: 83.2000 (84.7742) time: 1.1338 data: 1.1054 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6271 (1.6971) acc1: 60.0000 (61.9707) acc5: 84.0000 (84.4488) time: 0.8372 data: 0.8078 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6077 (1.7094) acc1: 61.6000 (61.8720) acc5: 83.2000 (84.0000) time: 0.7373 data: 0.7079 max mem: 6925
Test: Total time: 0:00:52 (1.0537 s / it)
* Acc@1 63.116 Acc@5 84.960 loss 1.667
Accuracy of the model on the 50000 test images: 63.1%
Max accuracy: 63.17%
Epoch: [117] [ 0/625] eta: 3:21:15 lr: 0.002928 min_lr: 0.002928 loss: 2.5877 (2.5877) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 19.3202 data: 17.6503 max mem: 6925
Epoch: [117] [200/625] eta: 0:14:27 lr: 0.002922 min_lr: 0.002922 loss: 2.7227 (2.6498) class_acc: 0.5781 (0.6018) weight_decay: 0.0500 (0.0500) grad_norm: 0.8336 (1.0780) time: 2.0132 data: 0.0010 max mem: 6925
Epoch: [117] [400/625] eta: 0:07:22 lr: 0.002915 min_lr: 0.002915 loss: 2.6589 (2.6625) class_acc: 0.6055 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 0.8939 (1.0600) time: 1.9941 data: 0.0008 max mem: 6925
Epoch: [117] [600/625] eta: 0:00:49 lr: 0.002909 min_lr: 0.002909 loss: 2.7117 (2.6690) class_acc: 0.6016 (0.5981) weight_decay: 0.0500 (0.0500) grad_norm: 0.9291 (1.0223) time: 1.8995 data: 0.0010 max mem: 6925
Epoch: [117] [624/625] eta: 0:00:01 lr: 0.002908 min_lr: 0.002908 loss: 2.6914 (2.6704) class_acc: 0.5859 (0.5978) weight_decay: 0.0500 (0.0500) grad_norm: 1.0216 (1.0270) time: 1.0343 data: 0.0015 max mem: 6925
Epoch: [117] Total time: 0:20:06 (1.9312 s / it)
Averaged stats: lr: 0.002908 min_lr: 0.002908 loss: 2.6914 (2.6661) class_acc: 0.5859 (0.5974) weight_decay: 0.0500 (0.0500) grad_norm: 1.0216 (1.0270)
Test: [ 0/50] eta: 0:11:02 loss: 1.8839 (1.8839) acc1: 62.4000 (62.4000) acc5: 83.2000 (83.2000) time: 13.2523 data: 13.2216 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.6078 (1.6351) acc1: 64.0000 (64.2182) acc5: 86.4000 (84.8727) time: 2.1356 data: 2.1061 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7826 (1.7848) acc1: 61.6000 (61.6381) acc5: 82.4000 (83.1619) time: 1.0784 data: 1.0488 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9055 (1.8054) acc1: 59.2000 (60.6710) acc5: 81.6000 (82.8903) time: 1.0664 data: 1.0363 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7498 (1.7951) acc1: 57.6000 (60.7415) acc5: 82.4000 (82.9268) time: 0.7565 data: 0.7254 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7139 (1.7908) acc1: 58.4000 (60.9280) acc5: 83.2000 (83.0400) time: 0.7609 data: 0.7298 max mem: 6925
Test: Total time: 0:00:50 (1.0112 s / it)
* Acc@1 61.580 Acc@5 83.878 loss 1.730
Accuracy of the model on the 50000 test images: 61.6%
Max accuracy: 63.17%
Epoch: [118] [ 0/625] eta: 3:43:01 lr: 0.002908 min_lr: 0.002908 loss: 2.6895 (2.6895) class_acc: 0.5703 (0.5703) weight_decay: 0.0500 (0.0500) time: 21.4111 data: 20.3221 max mem: 6925
Epoch: [118] [200/625] eta: 0:14:27 lr: 0.002902 min_lr: 0.002902 loss: 2.6284 (2.6283) class_acc: 0.6133 (0.6065) weight_decay: 0.0500 (0.0500) grad_norm: 1.1368 (1.0766) time: 2.0242 data: 0.0011 max mem: 6925
Epoch: [118] [400/625] eta: 0:07:22 lr: 0.002895 min_lr: 0.002895 loss: 2.6529 (2.6425) class_acc: 0.6016 (0.6031) weight_decay: 0.0500 (0.0500) grad_norm: 0.9023 (1.0752) time: 1.9651 data: 0.0008 max mem: 6925
Epoch: [118] [600/625] eta: 0:00:48 lr: 0.002889 min_lr: 0.002889 loss: 2.6863 (2.6576) class_acc: 0.5703 (0.5992) weight_decay: 0.0500 (0.0500) grad_norm: 0.9179 (1.0592) time: 1.9977 data: 0.0010 max mem: 6925
Epoch: [118] [624/625] eta: 0:00:01 lr: 0.002888 min_lr: 0.002888 loss: 2.6777 (2.6575) class_acc: 0.5977 (0.5994) weight_decay: 0.0500 (0.0500) grad_norm: 1.0796 (1.0632) time: 0.7488 data: 0.0015 max mem: 6925
Epoch: [118] Total time: 0:20:02 (1.9241 s / it)
Averaged stats: lr: 0.002888 min_lr: 0.002888 loss: 2.6777 (2.6608) class_acc: 0.5977 (0.5987) weight_decay: 0.0500 (0.0500) grad_norm: 1.0796 (1.0632)
Test: [ 0/50] eta: 0:10:16 loss: 1.4170 (1.4170) acc1: 66.4000 (66.4000) acc5: 88.0000 (88.0000) time: 12.3328 data: 12.2872 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.4642 (1.5019) acc1: 66.4000 (66.0364) acc5: 86.4000 (86.3273) time: 2.0581 data: 2.0253 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.7132 (1.7134) acc1: 60.0000 (61.4857) acc5: 84.0000 (83.8857) time: 1.0776 data: 1.0471 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.7603 (1.7140) acc1: 58.4000 (61.0581) acc5: 83.2000 (84.3871) time: 0.9509 data: 0.9217 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7397 (1.7122) acc1: 60.0000 (61.3463) acc5: 84.8000 (84.1951) time: 0.5540 data: 0.5252 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6458 (1.7011) acc1: 62.4000 (61.7120) acc5: 83.2000 (84.1120) time: 0.4814 data: 0.4531 max mem: 6925
Test: Total time: 0:00:45 (0.9189 s / it)
* Acc@1 62.718 Acc@5 84.430 loss 1.667
Accuracy of the model on the 50000 test images: 62.7%
Max accuracy: 63.17%
Epoch: [119] [ 0/625] eta: 3:15:28 lr: 0.002888 min_lr: 0.002888 loss: 2.7507 (2.7507) class_acc: 0.5898 (0.5898) weight_decay: 0.0500 (0.0500) time: 18.7652 data: 17.1582 max mem: 6925
Epoch: [119] [200/625] eta: 0:13:50 lr: 0.002882 min_lr: 0.002882 loss: 2.6550 (2.6361) class_acc: 0.5859 (0.6035) weight_decay: 0.0500 (0.0500) grad_norm: 0.8432 (1.0455) time: 1.8199 data: 0.0577 max mem: 6925
Epoch: [119] [400/625] eta: 0:07:14 lr: 0.002875 min_lr: 0.002875 loss: 2.6672 (2.6481) class_acc: 0.6016 (0.6009) weight_decay: 0.0500 (0.0500) grad_norm: 0.9927 (1.0644) time: 1.9140 data: 0.0624 max mem: 6925
Epoch: [119] [600/625] eta: 0:00:48 lr: 0.002869 min_lr: 0.002869 loss: 2.6535 (2.6548) class_acc: 0.5898 (0.6002) weight_decay: 0.0500 (0.0500) grad_norm: 0.8641 (1.0649) time: 1.8819 data: 1.4996 max mem: 6925
Epoch: [119] [624/625] eta: 0:00:01 lr: 0.002868 min_lr: 0.002868 loss: 2.6702 (2.6556) class_acc: 0.5977 (0.5999) weight_decay: 0.0500 (0.0500) grad_norm: 1.0797 (1.0771) time: 0.7769 data: 0.5164 max mem: 6925
Epoch: [119] Total time: 0:19:39 (1.8864 s / it)
Averaged stats: lr: 0.002868 min_lr: 0.002868 loss: 2.6702 (2.6575) class_acc: 0.5977 (0.5996) weight_decay: 0.0500 (0.0500) grad_norm: 1.0797 (1.0771)
Test: [ 0/50] eta: 0:10:12 loss: 1.6343 (1.6343) acc1: 59.2000 (59.2000) acc5: 89.6000 (89.6000) time: 12.2499 data: 12.2205 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.6166 (1.6275) acc1: 64.8000 (64.3636) acc5: 86.4000 (85.3091) time: 2.1846 data: 2.1550 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.6583 (1.7648) acc1: 60.0000 (61.3714) acc5: 84.0000 (83.5429) time: 1.2149 data: 1.1849 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.8272 (1.8058) acc1: 57.6000 (60.0258) acc5: 81.6000 (82.8129) time: 1.1958 data: 1.1661 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.7476 (1.8069) acc1: 56.8000 (60.1756) acc5: 83.2000 (82.8488) time: 0.8532 data: 0.8241 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7290 (1.8059) acc1: 60.0000 (60.0000) acc5: 84.8000 (83.0400) time: 0.7472 data: 0.7179 max mem: 6925
Test: Total time: 0:00:54 (1.0935 s / it)
* Acc@1 61.294 Acc@5 83.666 loss 1.755
Accuracy of the model on the 50000 test images: 61.3%
Max accuracy: 63.17%
Epoch: [120] [ 0/625] eta: 3:32:22 lr: 0.002868 min_lr: 0.002868 loss: 2.9058 (2.9058) class_acc: 0.5312 (0.5312) weight_decay: 0.0500 (0.0500) time: 20.3882 data: 17.5458 max mem: 6925
Epoch: [120] [200/625] eta: 0:13:34 lr: 0.002862 min_lr: 0.002862 loss: 2.6902 (2.6350) class_acc: 0.5898 (0.6027) weight_decay: 0.0500 (0.0500) grad_norm: 0.9121 (0.9669) time: 1.9038 data: 0.0009 max mem: 6925
Epoch: [120] [400/625] eta: 0:07:14 lr: 0.002855 min_lr: 0.002855 loss: 2.6666 (2.6417) class_acc: 0.5938 (0.6018) weight_decay: 0.0500 (0.0500) grad_norm: 0.8979 (1.0305) time: 2.0678 data: 0.0010 max mem: 6925
Epoch: [120] [600/625] eta: 0:00:48 lr: 0.002849 min_lr: 0.002849 loss: 2.6890 (2.6497) class_acc: 0.5898 (0.5999) weight_decay: 0.0500 (0.0500) grad_norm: 1.1884 (1.0326) time: 1.9186 data: 0.0010 max mem: 6925
Epoch: [120] [624/625] eta: 0:00:01 lr: 0.002848 min_lr: 0.002848 loss: 2.6366 (2.6501) class_acc: 0.6016 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 1.0229 (1.0465) time: 0.6100 data: 0.0019 max mem: 6925
Epoch: [120] Total time: 0:20:06 (1.9299 s / it)
Averaged stats: lr: 0.002848 min_lr: 0.002848 loss: 2.6366 (2.6525) class_acc: 0.6016 (0.6002) weight_decay: 0.0500 (0.0500) grad_norm: 1.0229 (1.0465)
Test: [ 0/50] eta: 0:10:37 loss: 1.2915 (1.2915) acc1: 68.8000 (68.8000) acc5: 87.2000 (87.2000) time: 12.7477 data: 12.7158 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.3619 (1.4595) acc1: 68.0000 (66.4727) acc5: 88.0000 (87.0546) time: 2.1517 data: 2.1217 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.6473 (1.6695) acc1: 60.8000 (62.0191) acc5: 84.8000 (84.4952) time: 1.1261 data: 1.0949 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7832 (1.7015) acc1: 56.8000 (61.6000) acc5: 83.2000 (83.9226) time: 1.0972 data: 1.0666 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7304 (1.7221) acc1: 60.0000 (61.2878) acc5: 83.2000 (83.5512) time: 0.6925 data: 0.6626 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7463 (1.7414) acc1: 60.0000 (60.9120) acc5: 82.4000 (83.2640) time: 0.6348 data: 0.6049 max mem: 6925
Test: Total time: 0:00:49 (0.9973 s / it)
* Acc@1 61.628 Acc@5 83.656 loss 1.723
Accuracy of the model on the 50000 test images: 61.6%
Max accuracy: 63.17%
Epoch: [121] [ 0/625] eta: 4:18:15 lr: 0.002848 min_lr: 0.002848 loss: 2.6437 (2.6437) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 24.7924 data: 24.5657 max mem: 6925
Epoch: [121] [200/625] eta: 0:14:30 lr: 0.002841 min_lr: 0.002841 loss: 2.6022 (2.6402) class_acc: 0.5938 (0.6048) weight_decay: 0.0500 (0.0500) grad_norm: 0.9390 (0.9907) time: 1.9188 data: 1.4230 max mem: 6925
Epoch: [121] [400/625] eta: 0:07:25 lr: 0.002835 min_lr: 0.002835 loss: 2.6784 (2.6501) class_acc: 0.5938 (0.6017) weight_decay: 0.0500 (0.0500) grad_norm: 0.9265 (1.0577) time: 2.0846 data: 1.1937 max mem: 6925
Epoch: [121] [600/625] eta: 0:00:49 lr: 0.002828 min_lr: 0.002828 loss: 2.6432 (2.6555) class_acc: 0.5977 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 0.8830 (1.0483) time: 1.9792 data: 0.0017 max mem: 6925
Epoch: [121] [624/625] eta: 0:00:01 lr: 0.002827 min_lr: 0.002827 loss: 2.6544 (2.6563) class_acc: 0.6133 (0.6007) weight_decay: 0.0500 (0.0500) grad_norm: 0.8477 (1.0446) time: 0.7907 data: 0.0016 max mem: 6925
Epoch: [121] Total time: 0:20:06 (1.9303 s / it)
Averaged stats: lr: 0.002827 min_lr: 0.002827 loss: 2.6544 (2.6557) class_acc: 0.6133 (0.5998) weight_decay: 0.0500 (0.0500) grad_norm: 0.8477 (1.0446)
Test: [ 0/50] eta: 0:09:42 loss: 1.3393 (1.3393) acc1: 68.8000 (68.8000) acc5: 92.0000 (92.0000) time: 11.6433 data: 11.5802 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.4490 (1.5238) acc1: 68.0000 (67.5636) acc5: 87.2000 (86.9818) time: 2.2288 data: 2.1965 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.6093 (1.7198) acc1: 63.2000 (63.0857) acc5: 85.6000 (84.3429) time: 1.3099 data: 1.2801 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7784 (1.7083) acc1: 60.8000 (62.9936) acc5: 83.2000 (84.4645) time: 1.0811 data: 1.0515 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7016 (1.6918) acc1: 62.4000 (63.0244) acc5: 85.6000 (84.5659) time: 0.6138 data: 0.5842 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5965 (1.6940) acc1: 62.4000 (62.9760) acc5: 84.0000 (84.4160) time: 0.5575 data: 0.5281 max mem: 6925
Test: Total time: 0:00:51 (1.0357 s / it)
* Acc@1 63.380 Acc@5 84.886 loss 1.668
Accuracy of the model on the 50000 test images: 63.4%
Max accuracy: 63.38%
Epoch: [122] [ 0/625] eta: 3:22:37 lr: 0.002827 min_lr: 0.002827 loss: 2.6026 (2.6026) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 19.4516 data: 19.0287 max mem: 6925
Epoch: [122] [200/625] eta: 0:14:13 lr: 0.002821 min_lr: 0.002821 loss: 2.7303 (2.6357) class_acc: 0.5781 (0.6036) weight_decay: 0.0500 (0.0500) grad_norm: 0.9581 (0.9948) time: 1.8843 data: 0.0007 max mem: 6925
Epoch: [122] [400/625] eta: 0:07:26 lr: 0.002814 min_lr: 0.002814 loss: 2.6633 (2.6344) class_acc: 0.5938 (0.6047) weight_decay: 0.0500 (0.0500) grad_norm: 0.7787 (1.0309) time: 1.9061 data: 0.0006 max mem: 6925
Epoch: [122] [600/625] eta: 0:00:49 lr: 0.002808 min_lr: 0.002808 loss: 2.6493 (2.6466) class_acc: 0.5977 (0.6019) weight_decay: 0.0500 (0.0500) grad_norm: 1.3077 (1.0536) time: 2.0492 data: 0.0006 max mem: 6925
Epoch: [122] [624/625] eta: 0:00:01 lr: 0.002807 min_lr: 0.002807 loss: 2.6458 (2.6476) class_acc: 0.5938 (0.6015) weight_decay: 0.0500 (0.0500) grad_norm: 0.9576 (1.0521) time: 0.9610 data: 0.0014 max mem: 6925
Epoch: [122] Total time: 0:20:11 (1.9380 s / it)
Averaged stats: lr: 0.002807 min_lr: 0.002807 loss: 2.6458 (2.6480) class_acc: 0.5938 (0.6015) weight_decay: 0.0500 (0.0500) grad_norm: 0.9576 (1.0521)
Test: [ 0/50] eta: 0:09:21 loss: 1.5882 (1.5882) acc1: 64.0000 (64.0000) acc5: 85.6000 (85.6000) time: 11.2316 data: 11.1910 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.5387 (1.5377) acc1: 64.8000 (64.9455) acc5: 87.2000 (86.2545) time: 1.9167 data: 1.8869 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.6006 (1.6952) acc1: 64.0000 (62.0190) acc5: 84.0000 (85.0286) time: 1.0208 data: 0.9919 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.8715 (1.7298) acc1: 58.4000 (61.6000) acc5: 81.6000 (84.1032) time: 0.9539 data: 0.9249 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.6674 (1.7081) acc1: 61.6000 (61.8732) acc5: 81.6000 (84.2732) time: 0.6305 data: 0.6006 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6205 (1.7212) acc1: 60.8000 (61.1520) acc5: 84.8000 (84.1280) time: 0.5551 data: 0.5255 max mem: 6925
Test: Total time: 0:00:45 (0.9100 s / it)
* Acc@1 63.000 Acc@5 84.968 loss 1.654
Accuracy of the model on the 50000 test images: 63.0%
Max accuracy: 63.38%
Epoch: [123] [ 0/625] eta: 3:47:58 lr: 0.002807 min_lr: 0.002807 loss: 2.4271 (2.4271) class_acc: 0.6602 (0.6602) weight_decay: 0.0500 (0.0500) time: 21.8863 data: 17.8727 max mem: 6925
Epoch: [123] [200/625] eta: 0:13:44 lr: 0.002800 min_lr: 0.002800 loss: 2.7525 (2.6372) class_acc: 0.5781 (0.6057) weight_decay: 0.0500 (0.0500) grad_norm: 0.7953 (0.9953) time: 1.8168 data: 0.0017 max mem: 6925
Epoch: [123] [400/625] eta: 0:07:12 lr: 0.002794 min_lr: 0.002794 loss: 2.6687 (2.6412) class_acc: 0.5938 (0.6037) weight_decay: 0.0500 (0.0500) grad_norm: 0.8069 (1.0042) time: 1.8682 data: 0.0013 max mem: 6925
Epoch: [123] [600/625] eta: 0:00:48 lr: 0.002787 min_lr: 0.002787 loss: 2.6016 (2.6515) class_acc: 0.5938 (0.6014) weight_decay: 0.0500 (0.0500) grad_norm: 0.9455 (1.0011) time: 2.1119 data: 0.0011 max mem: 6925
Epoch: [123] [624/625] eta: 0:00:01 lr: 0.002786 min_lr: 0.002786 loss: 2.6627 (2.6522) class_acc: 0.5820 (0.6010) weight_decay: 0.0500 (0.0500) grad_norm: 0.8822 (1.0011) time: 0.5780 data: 0.0014 max mem: 6925
Epoch: [123] Total time: 0:19:47 (1.8997 s / it)
Averaged stats: lr: 0.002786 min_lr: 0.002786 loss: 2.6627 (2.6487) class_acc: 0.5820 (0.6016) weight_decay: 0.0500 (0.0500) grad_norm: 0.8822 (1.0011)
Test: [ 0/50] eta: 0:10:31 loss: 1.7061 (1.7061) acc1: 61.6000 (61.6000) acc5: 85.6000 (85.6000) time: 12.6254 data: 12.5868 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.3473 (1.3780) acc1: 69.6000 (69.0909) acc5: 88.0000 (88.1455) time: 2.1649 data: 2.1350 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.4659 (1.5164) acc1: 66.4000 (65.8667) acc5: 87.2000 (86.4000) time: 1.1597 data: 1.1304 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.6246 (1.5887) acc1: 60.8000 (64.1290) acc5: 84.0000 (85.3936) time: 1.0930 data: 1.0640 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7172 (1.6154) acc1: 60.0000 (63.5902) acc5: 84.0000 (85.0927) time: 0.6668 data: 0.6376 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6739 (1.6256) acc1: 63.2000 (63.3440) acc5: 84.8000 (84.9440) time: 0.5995 data: 0.5702 max mem: 6925
Test: Total time: 0:00:49 (0.9946 s / it)
* Acc@1 63.962 Acc@5 85.698 loss 1.593
Accuracy of the model on the 50000 test images: 64.0%
Max accuracy: 63.96%
Epoch: [124] [ 0/625] eta: 3:31:20 lr: 0.002786 min_lr: 0.002786 loss: 2.6317 (2.6317) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 20.2893 data: 15.5554 max mem: 6925
Epoch: [124] [200/625] eta: 0:13:57 lr: 0.002780 min_lr: 0.002780 loss: 2.6011 (2.6175) class_acc: 0.6016 (0.6082) weight_decay: 0.0500 (0.0500) grad_norm: 0.8369 (0.9787) time: 1.7760 data: 0.0508 max mem: 6925
Epoch: [124] [400/625] eta: 0:07:07 lr: 0.002773 min_lr: 0.002773 loss: 2.6514 (2.6339) class_acc: 0.5977 (0.6049) weight_decay: 0.0500 (0.0500) grad_norm: 0.9867 (1.0437) time: 1.8098 data: 0.0011 max mem: 6925
Epoch: [124] [600/625] eta: 0:00:47 lr: 0.002766 min_lr: 0.002766 loss: 2.6838 (2.6456) class_acc: 0.5938 (0.6020) weight_decay: 0.0500 (0.0500) grad_norm: 0.7176 (1.0363) time: 1.9273 data: 0.0011 max mem: 6925
Epoch: [124] [624/625] eta: 0:00:01 lr: 0.002766 min_lr: 0.002766 loss: 2.6272 (2.6461) class_acc: 0.5977 (0.6019) weight_decay: 0.0500 (0.0500) grad_norm: 0.8478 (1.0338) time: 0.8111 data: 0.0013 max mem: 6925
Epoch: [124] Total time: 0:19:31 (1.8736 s / it)
Averaged stats: lr: 0.002766 min_lr: 0.002766 loss: 2.6272 (2.6458) class_acc: 0.5977 (0.6021) weight_decay: 0.0500 (0.0500) grad_norm: 0.8478 (1.0338)
Test: [ 0/50] eta: 0:10:12 loss: 1.6506 (1.6506) acc1: 60.0000 (60.0000) acc5: 86.4000 (86.4000) time: 12.2493 data: 12.2148 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.5884 (1.6473) acc1: 63.2000 (63.7091) acc5: 86.4000 (85.1636) time: 2.0373 data: 2.0058 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7838 (1.8096) acc1: 60.8000 (60.6857) acc5: 84.0000 (82.8571) time: 1.0480 data: 1.0179 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8780 (1.8246) acc1: 58.4000 (60.6194) acc5: 79.2000 (82.1161) time: 0.9741 data: 0.9450 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8183 (1.8233) acc1: 60.0000 (60.5659) acc5: 81.6000 (82.2244) time: 0.6528 data: 0.6236 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7632 (1.8295) acc1: 60.0000 (60.2560) acc5: 81.6000 (81.9040) time: 0.5345 data: 0.5052 max mem: 6925
Test: Total time: 0:00:47 (0.9576 s / it)
* Acc@1 60.822 Acc@5 83.080 loss 1.804
Accuracy of the model on the 50000 test images: 60.8%
Max accuracy: 63.96%
Epoch: [125] [ 0/625] eta: 3:26:13 lr: 0.002766 min_lr: 0.002766 loss: 2.7501 (2.7501) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) time: 19.7976 data: 18.1844 max mem: 6925
Epoch: [125] [200/625] eta: 0:13:26 lr: 0.002759 min_lr: 0.002759 loss: 2.6763 (2.6268) class_acc: 0.5977 (0.6088) weight_decay: 0.0500 (0.0500) grad_norm: 0.9224 (1.0258) time: 1.7088 data: 0.0007 max mem: 6925
Epoch: [125] [400/625] eta: 0:07:04 lr: 0.002752 min_lr: 0.002752 loss: 2.6416 (2.6349) class_acc: 0.6016 (0.6053) weight_decay: 0.0500 (0.0500) grad_norm: 0.9654 (1.0022) time: 1.7425 data: 0.0008 max mem: 6925
Epoch: [125] [600/625] eta: 0:00:48 lr: 0.002746 min_lr: 0.002746 loss: 2.6785 (2.6420) class_acc: 0.5742 (0.6032) weight_decay: 0.0500 (0.0500) grad_norm: 0.9714 (1.0155) time: 2.0278 data: 0.0010 max mem: 6925
Epoch: [125] [624/625] eta: 0:00:01 lr: 0.002745 min_lr: 0.002745 loss: 2.6392 (2.6417) class_acc: 0.6055 (0.6035) weight_decay: 0.0500 (0.0500) grad_norm: 0.8997 (1.0112) time: 0.8352 data: 0.0014 max mem: 6925
Epoch: [125] Total time: 0:19:37 (1.8834 s / it)
Averaged stats: lr: 0.002745 min_lr: 0.002745 loss: 2.6392 (2.6407) class_acc: 0.6055 (0.6037) weight_decay: 0.0500 (0.0500) grad_norm: 0.8997 (1.0112)
Test: [ 0/50] eta: 0:09:50 loss: 1.5851 (1.5851) acc1: 66.4000 (66.4000) acc5: 87.2000 (87.2000) time: 11.8151 data: 11.7779 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.6471 (1.6337) acc1: 63.2000 (64.5818) acc5: 84.0000 (84.8727) time: 1.9082 data: 1.8778 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.7003 (1.7705) acc1: 61.6000 (61.3714) acc5: 83.2000 (83.3524) time: 0.9483 data: 0.9192 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.9206 (1.8153) acc1: 58.4000 (60.4645) acc5: 80.8000 (82.6839) time: 0.9386 data: 0.9102 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.9030 (1.8366) acc1: 58.4000 (59.9415) acc5: 81.6000 (82.2829) time: 0.8880 data: 0.8594 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8377 (1.8487) acc1: 59.2000 (59.8080) acc5: 81.6000 (82.1760) time: 0.6670 data: 0.6383 max mem: 6925
Test: Total time: 0:00:52 (1.0551 s / it)
* Acc@1 60.770 Acc@5 83.022 loss 1.812
Accuracy of the model on the 50000 test images: 60.8%
Max accuracy: 63.96%
Epoch: [126] [ 0/625] eta: 3:46:02 lr: 0.002745 min_lr: 0.002745 loss: 2.5816 (2.5816) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 21.6993 data: 18.1204 max mem: 6925
Epoch: [126] [200/625] eta: 0:14:36 lr: 0.002738 min_lr: 0.002738 loss: 2.5838 (2.6163) class_acc: 0.6133 (0.6079) weight_decay: 0.0500 (0.0500) grad_norm: 0.8623 (1.0006) time: 1.8924 data: 0.0008 max mem: 6925
Epoch: [126] [400/625] eta: 0:07:31 lr: 0.002732 min_lr: 0.002732 loss: 2.6215 (2.6315) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) grad_norm: 0.8979 (1.0532) time: 1.8789 data: 0.0009 max mem: 6925
Epoch: [126] [600/625] eta: 0:00:49 lr: 0.002725 min_lr: 0.002725 loss: 2.6485 (2.6386) class_acc: 0.6016 (0.6040) weight_decay: 0.0500 (0.0500) grad_norm: 0.9668 (1.0329) time: 2.0430 data: 0.0011 max mem: 6925
Epoch: [126] [624/625] eta: 0:00:01 lr: 0.002724 min_lr: 0.002724 loss: 2.7018 (2.6398) class_acc: 0.5898 (0.6037) weight_decay: 0.0500 (0.0500) grad_norm: 0.9798 (1.0390) time: 0.7686 data: 0.0013 max mem: 6925
Epoch: [126] Total time: 0:20:18 (1.9492 s / it)
Averaged stats: lr: 0.002724 min_lr: 0.002724 loss: 2.7018 (2.6415) class_acc: 0.5898 (0.6033) weight_decay: 0.0500 (0.0500) grad_norm: 0.9798 (1.0390)
Test: [ 0/50] eta: 0:10:15 loss: 1.2490 (1.2490) acc1: 74.4000 (74.4000) acc5: 88.8000 (88.8000) time: 12.3156 data: 12.2846 max mem: 6925
Test: [10/50] eta: 0:01:11 loss: 1.4611 (1.4755) acc1: 66.4000 (68.2909) acc5: 88.0000 (86.5455) time: 1.7752 data: 1.7437 max mem: 6925
Test: [20/50] eta: 0:00:39 loss: 1.5260 (1.6070) acc1: 64.0000 (64.2667) acc5: 85.6000 (85.2952) time: 0.7619 data: 0.7313 max mem: 6925
Test: [30/50] eta: 0:00:22 loss: 1.6816 (1.6451) acc1: 59.2000 (63.3548) acc5: 84.0000 (84.5419) time: 0.7776 data: 0.7479 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7163 (1.6644) acc1: 59.2000 (62.5951) acc5: 83.2000 (84.3317) time: 0.8291 data: 0.7999 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7163 (1.6791) acc1: 56.8000 (62.1440) acc5: 83.2000 (84.2400) time: 0.6845 data: 0.6560 max mem: 6925
Test: Total time: 0:00:48 (0.9633 s / it)
* Acc@1 63.196 Acc@5 85.130 loss 1.642
Accuracy of the model on the 50000 test images: 63.2%
Max accuracy: 63.96%
Epoch: [127] [ 0/625] eta: 3:39:00 lr: 0.002724 min_lr: 0.002724 loss: 2.5351 (2.5351) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 21.0242 data: 16.4617 max mem: 6925
Epoch: [127] [200/625] eta: 0:14:37 lr: 0.002717 min_lr: 0.002717 loss: 2.5946 (2.6100) class_acc: 0.6016 (0.6111) weight_decay: 0.0500 (0.0500) grad_norm: 1.0750 (0.9993) time: 2.0115 data: 0.0008 max mem: 6925
Epoch: [127] [400/625] eta: 0:07:28 lr: 0.002711 min_lr: 0.002711 loss: 2.6157 (2.6260) class_acc: 0.6016 (0.6074) weight_decay: 0.0500 (0.0500) grad_norm: 0.9728 (1.0314) time: 1.8648 data: 0.0008 max mem: 6925
Epoch: [127] [600/625] eta: 0:00:49 lr: 0.002704 min_lr: 0.002704 loss: 2.6721 (2.6334) class_acc: 0.5938 (0.6057) weight_decay: 0.0500 (0.0500) grad_norm: 0.8007 (1.0164) time: 1.9889 data: 0.0008 max mem: 6925
Epoch: [127] [624/625] eta: 0:00:01 lr: 0.002703 min_lr: 0.002703 loss: 2.6359 (2.6341) class_acc: 0.6055 (0.6056) weight_decay: 0.0500 (0.0500) grad_norm: 0.8161 (inf) time: 0.9339 data: 0.0017 max mem: 6925
Epoch: [127] Total time: 0:20:20 (1.9525 s / it)
Averaged stats: lr: 0.002703 min_lr: 0.002703 loss: 2.6359 (2.6347) class_acc: 0.6055 (0.6047) weight_decay: 0.0500 (0.0500) grad_norm: 0.8161 (inf)
Test: [ 0/50] eta: 0:11:33 loss: 1.4432 (1.4432) acc1: 69.6000 (69.6000) acc5: 85.6000 (85.6000) time: 13.8783 data: 13.8400 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 1.3681 (1.3859) acc1: 70.4000 (69.6727) acc5: 90.4000 (88.8727) time: 2.2673 data: 2.2353 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.5028 (1.5201) acc1: 65.6000 (65.9429) acc5: 87.2000 (87.3905) time: 1.2071 data: 1.1772 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.6330 (1.5425) acc1: 61.6000 (65.1613) acc5: 86.4000 (86.8903) time: 1.2012 data: 1.1710 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6330 (1.5674) acc1: 62.4000 (64.6634) acc5: 86.4000 (86.5366) time: 0.7194 data: 0.6886 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6347 (1.5844) acc1: 64.8000 (64.3520) acc5: 85.6000 (86.1600) time: 0.6643 data: 0.6335 max mem: 6925
Test: Total time: 0:00:53 (1.0638 s / it)
* Acc@1 65.048 Acc@5 86.456 loss 1.549
Accuracy of the model on the 50000 test images: 65.0%
Max accuracy: 65.05%
Epoch: [128] [ 0/625] eta: 3:39:27 lr: 0.002703 min_lr: 0.002703 loss: 2.7501 (2.7501) class_acc: 0.5781 (0.5781) weight_decay: 0.0500 (0.0500) time: 21.0686 data: 18.7208 max mem: 6925
Epoch: [128] [200/625] eta: 0:14:30 lr: 0.002696 min_lr: 0.002696 loss: 2.6800 (2.6158) class_acc: 0.5977 (0.6106) weight_decay: 0.0500 (0.0500) grad_norm: 1.0264 (1.0751) time: 1.8186 data: 0.0504 max mem: 6925
Epoch: [128] [400/625] eta: 0:07:29 lr: 0.002690 min_lr: 0.002690 loss: 2.6584 (2.6267) class_acc: 0.6016 (0.6071) weight_decay: 0.0500 (0.0500) grad_norm: 0.7975 (1.0630) time: 2.0358 data: 0.0199 max mem: 6925
Epoch: [128] [600/625] eta: 0:00:50 lr: 0.002683 min_lr: 0.002683 loss: 2.6343 (2.6343) class_acc: 0.6094 (0.6055) weight_decay: 0.0500 (0.0500) grad_norm: 0.7766 (1.0307) time: 2.0988 data: 0.0011 max mem: 6925
Epoch: [128] [624/625] eta: 0:00:01 lr: 0.002682 min_lr: 0.002682 loss: 2.6837 (2.6363) class_acc: 0.5820 (0.6049) weight_decay: 0.0500 (0.0500) grad_norm: 0.9114 (1.0290) time: 0.7565 data: 0.0013 max mem: 6925
Epoch: [128] Total time: 0:20:23 (1.9572 s / it)
Averaged stats: lr: 0.002682 min_lr: 0.002682 loss: 2.6837 (2.6334) class_acc: 0.5820 (0.6053) weight_decay: 0.0500 (0.0500) grad_norm: 0.9114 (1.0290)
Test: [ 0/50] eta: 0:11:01 loss: 1.6503 (1.6503) acc1: 61.6000 (61.6000) acc5: 88.0000 (88.0000) time: 13.2227 data: 13.1902 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.6503 (1.5700) acc1: 65.6000 (66.4000) acc5: 87.2000 (86.4000) time: 2.1087 data: 2.0786 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.6914 (1.7856) acc1: 61.6000 (62.1333) acc5: 84.0000 (83.3905) time: 1.0535 data: 1.0238 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.9852 (1.8002) acc1: 59.2000 (62.1419) acc5: 82.4000 (83.5871) time: 1.0467 data: 1.0174 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.9426 (1.8309) acc1: 59.2000 (61.4244) acc5: 83.2000 (83.0829) time: 0.8463 data: 0.8164 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.8305 (1.8285) acc1: 58.4000 (61.2000) acc5: 83.2000 (83.0880) time: 0.8341 data: 0.8040 max mem: 6925
Test: Total time: 0:00:53 (1.0740 s / it)
* Acc@1 62.178 Acc@5 84.110 loss 1.786
Accuracy of the model on the 50000 test images: 62.2%
Max accuracy: 65.05%
Epoch: [129] [ 0/625] eta: 3:34:58 lr: 0.002682 min_lr: 0.002682 loss: 2.6715 (2.6715) class_acc: 0.5977 (0.5977) weight_decay: 0.0500 (0.0500) time: 20.6370 data: 17.9791 max mem: 6925
Epoch: [129] [200/625] eta: 0:14:04 lr: 0.002675 min_lr: 0.002675 loss: 2.5616 (2.6284) class_acc: 0.6094 (0.6063) weight_decay: 0.0500 (0.0500) grad_norm: 0.9853 (1.0443) time: 1.9590 data: 0.0014 max mem: 6925
Epoch: [129] [400/625] eta: 0:07:21 lr: 0.002668 min_lr: 0.002668 loss: 2.6656 (2.6349) class_acc: 0.6016 (0.6059) weight_decay: 0.0500 (0.0500) grad_norm: 0.8601 (1.0307) time: 1.7726 data: 0.0010 max mem: 6925
Epoch: [129] [600/625] eta: 0:00:48 lr: 0.002662 min_lr: 0.002662 loss: 2.6480 (2.6395) class_acc: 0.6016 (0.6043) weight_decay: 0.0500 (0.0500) grad_norm: 0.9096 (1.0042) time: 1.9899 data: 0.0013 max mem: 6925
Epoch: [129] [624/625] eta: 0:00:01 lr: 0.002661 min_lr: 0.002661 loss: 2.6899 (2.6408) class_acc: 0.5898 (0.6041) weight_decay: 0.0500 (0.0500) grad_norm: 0.9532 (1.0048) time: 0.7995 data: 0.0027 max mem: 6925
Epoch: [129] Total time: 0:20:04 (1.9275 s / it)
Averaged stats: lr: 0.002661 min_lr: 0.002661 loss: 2.6899 (2.6297) class_acc: 0.5898 (0.6059) weight_decay: 0.0500 (0.0500) grad_norm: 0.9532 (1.0048)
Test: [ 0/50] eta: 0:09:59 loss: 1.4604 (1.4604) acc1: 66.4000 (66.4000) acc5: 90.4000 (90.4000) time: 11.9844 data: 11.9432 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.4804 (1.5475) acc1: 66.4000 (65.7455) acc5: 86.4000 (86.4000) time: 2.0231 data: 1.9919 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6385 (1.7060) acc1: 61.6000 (62.4000) acc5: 84.8000 (84.6476) time: 1.0805 data: 1.0491 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.7872 (1.7081) acc1: 61.6000 (62.5548) acc5: 82.4000 (84.3871) time: 1.0277 data: 0.9960 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7341 (1.6967) acc1: 63.2000 (62.7317) acc5: 83.2000 (84.5073) time: 0.6205 data: 0.5899 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7368 (1.7221) acc1: 60.0000 (62.1280) acc5: 83.2000 (84.2240) time: 0.5637 data: 0.5338 max mem: 6925
Test: Total time: 0:00:46 (0.9340 s / it)
* Acc@1 62.618 Acc@5 84.756 loss 1.685
Accuracy of the model on the 50000 test images: 62.6%
Max accuracy: 65.05%
Epoch: [130] [ 0/625] eta: 3:39:20 lr: 0.002661 min_lr: 0.002661 loss: 2.5773 (2.5773) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 21.0567 data: 19.2773 max mem: 6925
Epoch: [130] [200/625] eta: 0:14:24 lr: 0.002654 min_lr: 0.002654 loss: 2.6450 (2.6065) class_acc: 0.6016 (0.6113) weight_decay: 0.0500 (0.0500) grad_norm: 1.0479 (1.0969) time: 1.9575 data: 0.0028 max mem: 6925
Epoch: [130] [400/625] eta: 0:07:27 lr: 0.002647 min_lr: 0.002647 loss: 2.5991 (2.6220) class_acc: 0.5977 (0.6073) weight_decay: 0.0500 (0.0500) grad_norm: 0.8577 (1.0500) time: 2.1095 data: 0.0012 max mem: 6925
Epoch: [130] [600/625] eta: 0:00:49 lr: 0.002640 min_lr: 0.002640 loss: 2.6493 (2.6292) class_acc: 0.6016 (0.6060) weight_decay: 0.0500 (0.0500) grad_norm: 0.9542 (1.0467) time: 2.0326 data: 0.0016 max mem: 6925
Epoch: [130] [624/625] eta: 0:00:01 lr: 0.002640 min_lr: 0.002640 loss: 2.6515 (2.6300) class_acc: 0.6016 (0.6060) weight_decay: 0.0500 (0.0500) grad_norm: 0.8679 (1.0404) time: 0.7625 data: 0.0024 max mem: 6925
Epoch: [130] Total time: 0:20:19 (1.9517 s / it)
Averaged stats: lr: 0.002640 min_lr: 0.002640 loss: 2.6515 (2.6296) class_acc: 0.6016 (0.6059) weight_decay: 0.0500 (0.0500) grad_norm: 0.8679 (1.0404)
Test: [ 0/50] eta: 0:10:40 loss: 1.3444 (1.3444) acc1: 70.4000 (70.4000) acc5: 92.0000 (92.0000) time: 12.8092 data: 12.7781 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.4514 (1.4885) acc1: 67.2000 (67.3455) acc5: 88.0000 (86.7636) time: 2.1838 data: 2.1544 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.6007 (1.6370) acc1: 64.8000 (64.3810) acc5: 85.6000 (85.3714) time: 1.1910 data: 1.1622 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.7964 (1.6541) acc1: 61.6000 (64.0774) acc5: 84.0000 (84.8258) time: 1.2320 data: 1.2035 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6893 (1.6689) acc1: 61.6000 (63.7659) acc5: 84.8000 (84.7024) time: 0.9746 data: 0.9456 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6633 (1.6903) acc1: 61.6000 (63.1520) acc5: 84.0000 (84.3840) time: 0.8457 data: 0.8165 max mem: 6925
Test: Total time: 0:00:57 (1.1449 s / it)
* Acc@1 63.520 Acc@5 85.098 loss 1.646
Accuracy of the model on the 50000 test images: 63.5%
Max accuracy: 65.05%
Epoch: [131] [ 0/625] eta: 3:32:35 lr: 0.002640 min_lr: 0.002640 loss: 2.4440 (2.4440) class_acc: 0.6641 (0.6641) weight_decay: 0.0500 (0.0500) time: 20.4087 data: 18.0986 max mem: 6925
Epoch: [131] [200/625] eta: 0:14:35 lr: 0.002633 min_lr: 0.002633 loss: 2.6355 (2.6008) class_acc: 0.6016 (0.6125) weight_decay: 0.0500 (0.0500) grad_norm: 0.8439 (0.9781) time: 1.7643 data: 0.0024 max mem: 6925
Epoch: [131] [400/625] eta: 0:07:26 lr: 0.002626 min_lr: 0.002626 loss: 2.6091 (2.6179) class_acc: 0.6016 (0.6081) weight_decay: 0.0500 (0.0500) grad_norm: 0.9180 (0.9866) time: 1.8885 data: 0.0013 max mem: 6925
Epoch: [131] [600/625] eta: 0:00:49 lr: 0.002619 min_lr: 0.002619 loss: 2.6418 (2.6255) class_acc: 0.5938 (0.6068) weight_decay: 0.0500 (0.0500) grad_norm: 0.9168 (0.9802) time: 1.9364 data: 0.0025 max mem: 6925
Epoch: [131] [624/625] eta: 0:00:01 lr: 0.002618 min_lr: 0.002618 loss: 2.6560 (2.6265) class_acc: 0.6016 (0.6066) weight_decay: 0.0500 (0.0500) grad_norm: 0.9195 (0.9848) time: 0.7526 data: 0.0018 max mem: 6925
Epoch: [131] Total time: 0:19:56 (1.9148 s / it)
Averaged stats: lr: 0.002618 min_lr: 0.002618 loss: 2.6560 (2.6235) class_acc: 0.6016 (0.6071) weight_decay: 0.0500 (0.0500) grad_norm: 0.9195 (0.9848)
Test: [ 0/50] eta: 0:10:08 loss: 1.6185 (1.6185) acc1: 62.4000 (62.4000) acc5: 84.8000 (84.8000) time: 12.1692 data: 12.1329 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.6041 (1.6272) acc1: 64.0000 (64.7273) acc5: 86.4000 (85.6000) time: 1.9744 data: 1.9449 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6582 (1.7411) acc1: 61.6000 (62.6286) acc5: 84.8000 (83.8095) time: 1.0034 data: 0.9747 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8015 (1.7644) acc1: 62.4000 (62.4258) acc5: 81.6000 (83.2774) time: 0.9662 data: 0.9377 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.8205 (1.7630) acc1: 61.6000 (61.9707) acc5: 81.6000 (83.2781) time: 0.6451 data: 0.6140 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7904 (1.7557) acc1: 62.4000 (62.1760) acc5: 82.4000 (83.3760) time: 0.5556 data: 0.5238 max mem: 6925
Test: Total time: 0:00:47 (0.9502 s / it)
* Acc@1 62.600 Acc@5 84.374 loss 1.704
Accuracy of the model on the 50000 test images: 62.6%
Max accuracy: 65.05%
Epoch: [132] [ 0/625] eta: 4:05:37 lr: 0.002618 min_lr: 0.002618 loss: 2.6792 (2.6792) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 23.5798 data: 19.8899 max mem: 6925
Epoch: [132] [200/625] eta: 0:14:08 lr: 0.002612 min_lr: 0.002612 loss: 2.6609 (2.6070) class_acc: 0.6016 (0.6110) weight_decay: 0.0500 (0.0500) grad_norm: 0.9632 (1.0922) time: 1.8133 data: 0.2110 max mem: 6925
Epoch: [132] [400/625] eta: 0:07:21 lr: 0.002605 min_lr: 0.002605 loss: 2.6221 (2.6093) class_acc: 0.6055 (0.6095) weight_decay: 0.0500 (0.0500) grad_norm: 0.7858 (1.0741) time: 2.0198 data: 0.0006 max mem: 6925
Epoch: [132] [600/625] eta: 0:00:48 lr: 0.002598 min_lr: 0.002598 loss: 2.6445 (2.6185) class_acc: 0.5977 (0.6078) weight_decay: 0.0500 (0.0500) grad_norm: 0.8508 (1.0615) time: 2.0962 data: 0.0008 max mem: 6925
Epoch: [132] [624/625] eta: 0:00:01 lr: 0.002597 min_lr: 0.002597 loss: 2.6309 (2.6199) class_acc: 0.5977 (0.6073) weight_decay: 0.0500 (0.0500) grad_norm: 1.0460 (1.0596) time: 0.7757 data: 0.0016 max mem: 6925
Epoch: [132] Total time: 0:19:54 (1.9110 s / it)
Averaged stats: lr: 0.002597 min_lr: 0.002597 loss: 2.6309 (2.6230) class_acc: 0.5977 (0.6074) weight_decay: 0.0500 (0.0500) grad_norm: 1.0460 (1.0596)
Test: [ 0/50] eta: 0:10:03 loss: 1.2997 (1.2997) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.0674 data: 12.0312 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.3998 (1.4403) acc1: 68.8000 (68.6545) acc5: 87.2000 (86.8364) time: 2.2192 data: 2.1848 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.6193 (1.6207) acc1: 63.2000 (63.6952) acc5: 85.6000 (84.9143) time: 1.2725 data: 1.2409 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7515 (1.6567) acc1: 58.4000 (62.4000) acc5: 84.0000 (84.4645) time: 1.0785 data: 1.0496 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7448 (1.6774) acc1: 61.6000 (61.8732) acc5: 84.0000 (84.3122) time: 0.6609 data: 0.6319 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6773 (1.6948) acc1: 60.8000 (61.2800) acc5: 84.0000 (84.0800) time: 0.5705 data: 0.5406 max mem: 6925
Test: Total time: 0:00:51 (1.0369 s / it)
* Acc@1 62.438 Acc@5 84.890 loss 1.668
Accuracy of the model on the 50000 test images: 62.4%
Max accuracy: 65.05%
Epoch: [133] [ 0/625] eta: 3:28:32 lr: 0.002597 min_lr: 0.002597 loss: 2.4093 (2.4093) class_acc: 0.6562 (0.6562) weight_decay: 0.0500 (0.0500) time: 20.0194 data: 17.8022 max mem: 6925
Epoch: [133] [200/625] eta: 0:14:02 lr: 0.002590 min_lr: 0.002590 loss: 2.5864 (2.6010) class_acc: 0.6172 (0.6124) weight_decay: 0.0500 (0.0500) grad_norm: 0.8138 (1.0412) time: 1.9263 data: 0.0012 max mem: 6925
Epoch: [133] [400/625] eta: 0:07:16 lr: 0.002583 min_lr: 0.002583 loss: 2.6289 (2.6129) class_acc: 0.6016 (0.6095) weight_decay: 0.0500 (0.0500) grad_norm: 1.1456 (1.0542) time: 1.8486 data: 0.0010 max mem: 6925
Epoch: [133] [600/625] eta: 0:00:49 lr: 0.002576 min_lr: 0.002576 loss: 2.6107 (2.6191) class_acc: 0.6094 (0.6081) weight_decay: 0.0500 (0.0500) grad_norm: 0.9260 (1.0482) time: 2.0239 data: 0.1043 max mem: 6925
Epoch: [133] [624/625] eta: 0:00:01 lr: 0.002576 min_lr: 0.002576 loss: 2.6567 (2.6199) class_acc: 0.6016 (0.6079) weight_decay: 0.0500 (0.0500) grad_norm: 0.9356 (1.0503) time: 0.8468 data: 0.0172 max mem: 6925
Epoch: [133] Total time: 0:19:59 (1.9196 s / it)
Averaged stats: lr: 0.002576 min_lr: 0.002576 loss: 2.6567 (2.6192) class_acc: 0.6016 (0.6085) weight_decay: 0.0500 (0.0500) grad_norm: 0.9356 (1.0503)
Test: [ 0/50] eta: 0:09:39 loss: 1.4169 (1.4169) acc1: 68.8000 (68.8000) acc5: 88.8000 (88.8000) time: 11.5992 data: 11.5683 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.4293 (1.5167) acc1: 68.8000 (67.4909) acc5: 88.8000 (87.0545) time: 1.9586 data: 1.9273 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6057 (1.6481) acc1: 62.4000 (63.6952) acc5: 84.8000 (85.1429) time: 1.0320 data: 1.0021 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8576 (1.6769) acc1: 60.8000 (63.0452) acc5: 82.4000 (84.6710) time: 1.0466 data: 1.0180 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6494 (1.6803) acc1: 61.6000 (62.7317) acc5: 83.2000 (84.6049) time: 0.8982 data: 0.8691 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6613 (1.6897) acc1: 61.6000 (62.5440) acc5: 83.2000 (84.5760) time: 0.6885 data: 0.6591 max mem: 6925
Test: Total time: 0:00:53 (1.0734 s / it)
* Acc@1 63.186 Acc@5 84.806 loss 1.657
Accuracy of the model on the 50000 test images: 63.2%
Max accuracy: 65.05%
Epoch: [134] [ 0/625] eta: 3:35:02 lr: 0.002576 min_lr: 0.002576 loss: 2.5499 (2.5499) class_acc: 0.5742 (0.5742) weight_decay: 0.0500 (0.0500) time: 20.6445 data: 18.6605 max mem: 6925
Epoch: [134] [200/625] eta: 0:14:06 lr: 0.002569 min_lr: 0.002569 loss: 2.6135 (2.5844) class_acc: 0.5977 (0.6140) weight_decay: 0.0500 (0.0500) grad_norm: 0.9306 (1.0532) time: 1.8711 data: 0.7813 max mem: 6925
Epoch: [134] [400/625] eta: 0:07:19 lr: 0.002562 min_lr: 0.002562 loss: 2.6242 (2.6042) class_acc: 0.6055 (0.6116) weight_decay: 0.0500 (0.0500) grad_norm: 0.9631 (1.0548) time: 2.1007 data: 0.4389 max mem: 6925
Epoch: [134] [600/625] eta: 0:00:48 lr: 0.002555 min_lr: 0.002555 loss: 2.6113 (2.6100) class_acc: 0.6016 (0.6108) weight_decay: 0.0500 (0.0500) grad_norm: 0.9173 (inf) time: 1.8741 data: 0.0014 max mem: 6925
Epoch: [134] [624/625] eta: 0:00:01 lr: 0.002554 min_lr: 0.002554 loss: 2.6058 (2.6109) class_acc: 0.6172 (0.6109) weight_decay: 0.0500 (0.0500) grad_norm: 1.1351 (inf) time: 0.7489 data: 0.0016 max mem: 6925
Epoch: [134] Total time: 0:20:01 (1.9226 s / it)
Averaged stats: lr: 0.002554 min_lr: 0.002554 loss: 2.6058 (2.6163) class_acc: 0.6172 (0.6088) weight_decay: 0.0500 (0.0500) grad_norm: 1.1351 (inf)
Test: [ 0/50] eta: 0:08:34 loss: 1.4444 (1.4444) acc1: 64.0000 (64.0000) acc5: 89.6000 (89.6000) time: 10.2944 data: 10.2550 max mem: 6925
Test: [10/50] eta: 0:01:13 loss: 1.3977 (1.4175) acc1: 66.4000 (67.1273) acc5: 89.6000 (88.0727) time: 1.8278 data: 1.7961 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.5267 (1.6216) acc1: 64.0000 (62.9714) acc5: 86.4000 (85.2571) time: 1.0483 data: 1.0184 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.7607 (1.6681) acc1: 59.2000 (62.1161) acc5: 82.4000 (84.6710) time: 0.9033 data: 0.8743 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7554 (1.6615) acc1: 60.0000 (62.5951) acc5: 82.4000 (84.6244) time: 0.8095 data: 0.7803 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6418 (1.6630) acc1: 61.6000 (62.7360) acc5: 83.2000 (84.5440) time: 0.6278 data: 0.5987 max mem: 6925
Test: Total time: 0:00:50 (1.0134 s / it)
* Acc@1 63.446 Acc@5 85.118 loss 1.634
Accuracy of the model on the 50000 test images: 63.4%
Max accuracy: 65.05%
Epoch: [135] [ 0/625] eta: 3:28:38 lr: 0.002554 min_lr: 0.002554 loss: 2.5264 (2.5264) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 20.0299 data: 17.7833 max mem: 6925
Epoch: [135] [200/625] eta: 0:14:17 lr: 0.002547 min_lr: 0.002547 loss: 2.5757 (2.5908) class_acc: 0.6133 (0.6149) weight_decay: 0.0500 (0.0500) grad_norm: 0.8837 (0.9484) time: 1.8562 data: 0.0011 max mem: 6925
Epoch: [135] [400/625] eta: 0:07:17 lr: 0.002540 min_lr: 0.002540 loss: 2.6097 (2.6068) class_acc: 0.6016 (0.6113) weight_decay: 0.0500 (0.0500) grad_norm: 1.1139 (0.9768) time: 2.0957 data: 1.4874 max mem: 6925
Epoch: [135] [600/625] eta: 0:00:49 lr: 0.002533 min_lr: 0.002533 loss: 2.6852 (2.6165) class_acc: 0.5859 (0.6084) weight_decay: 0.0500 (0.0500) grad_norm: 1.0134 (0.9985) time: 2.1753 data: 0.0437 max mem: 6925
Epoch: [135] [624/625] eta: 0:00:01 lr: 0.002533 min_lr: 0.002533 loss: 2.6882 (2.6185) class_acc: 0.6094 (0.6081) weight_decay: 0.0500 (0.0500) grad_norm: 0.8913 (0.9968) time: 0.7939 data: 0.0020 max mem: 6925
Epoch: [135] Total time: 0:20:01 (1.9224 s / it)
Averaged stats: lr: 0.002533 min_lr: 0.002533 loss: 2.6882 (2.6144) class_acc: 0.6094 (0.6093) weight_decay: 0.0500 (0.0500) grad_norm: 0.8913 (0.9968)
Test: [ 0/50] eta: 0:10:22 loss: 1.3021 (1.3021) acc1: 71.2000 (71.2000) acc5: 89.6000 (89.6000) time: 12.4518 data: 12.3990 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.3625 (1.5090) acc1: 68.0000 (66.7636) acc5: 87.2000 (86.4727) time: 2.1728 data: 2.1403 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.7029 (1.6718) acc1: 60.8000 (62.9333) acc5: 84.8000 (84.8381) time: 1.1620 data: 1.1323 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7714 (1.6828) acc1: 59.2000 (62.6323) acc5: 83.2000 (84.3355) time: 1.0749 data: 1.0444 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6959 (1.6844) acc1: 60.8000 (62.5561) acc5: 84.0000 (84.3512) time: 0.6539 data: 0.6229 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6614 (1.6833) acc1: 60.0000 (62.3040) acc5: 84.0000 (84.6080) time: 0.6306 data: 0.6012 max mem: 6925
Test: Total time: 0:00:49 (0.9920 s / it)
* Acc@1 63.264 Acc@5 85.228 loss 1.640
Accuracy of the model on the 50000 test images: 63.3%
Max accuracy: 65.05%
Epoch: [136] [ 0/625] eta: 3:43:43 lr: 0.002532 min_lr: 0.002532 loss: 2.6249 (2.6249) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 21.4779 data: 19.1244 max mem: 6925
Epoch: [136] [200/625] eta: 0:14:28 lr: 0.002526 min_lr: 0.002526 loss: 2.5884 (2.5832) class_acc: 0.6172 (0.6168) weight_decay: 0.0500 (0.0500) grad_norm: 0.9371 (0.9847) time: 1.9430 data: 0.1290 max mem: 6925
Epoch: [136] [400/625] eta: 0:07:29 lr: 0.002519 min_lr: 0.002519 loss: 2.5965 (2.6023) class_acc: 0.6055 (0.6124) weight_decay: 0.0500 (0.0500) grad_norm: 1.0124 (1.0135) time: 1.9342 data: 0.0009 max mem: 6925
Epoch: [136] [600/625] eta: 0:00:49 lr: 0.002512 min_lr: 0.002512 loss: 2.5953 (2.6074) class_acc: 0.6094 (0.6104) weight_decay: 0.0500 (0.0500) grad_norm: 0.9775 (1.0241) time: 1.8564 data: 0.0011 max mem: 6925
Epoch: [136] [624/625] eta: 0:00:01 lr: 0.002511 min_lr: 0.002511 loss: 2.6624 (2.6098) class_acc: 0.5977 (0.6098) weight_decay: 0.0500 (0.0500) grad_norm: 0.8451 (1.0203) time: 0.7910 data: 0.0215 max mem: 6925
Epoch: [136] Total time: 0:20:09 (1.9357 s / it)
Averaged stats: lr: 0.002511 min_lr: 0.002511 loss: 2.6624 (2.6129) class_acc: 0.5977 (0.6099) weight_decay: 0.0500 (0.0500) grad_norm: 0.8451 (1.0203)
Test: [ 0/50] eta: 0:10:08 loss: 1.5526 (1.5526) acc1: 68.0000 (68.0000) acc5: 90.4000 (90.4000) time: 12.1651 data: 12.1217 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.5882 (1.6100) acc1: 65.6000 (65.3818) acc5: 86.4000 (86.2545) time: 1.9826 data: 1.9501 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7109 (1.7346) acc1: 62.4000 (62.0571) acc5: 84.0000 (84.0762) time: 1.0220 data: 0.9919 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.8479 (1.7480) acc1: 59.2000 (61.1613) acc5: 81.6000 (83.5871) time: 0.9432 data: 0.9136 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.8417 (1.7606) acc1: 57.6000 (60.8976) acc5: 81.6000 (83.4341) time: 0.5812 data: 0.5505 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.8313 (1.7689) acc1: 58.4000 (60.8800) acc5: 82.4000 (83.3280) time: 0.5188 data: 0.4894 max mem: 6925
Test: Total time: 0:00:46 (0.9392 s / it)
* Acc@1 61.646 Acc@5 83.856 loss 1.710
Accuracy of the model on the 50000 test images: 61.6%
Max accuracy: 65.05%
Epoch: [137] [ 0/625] eta: 3:42:58 lr: 0.002511 min_lr: 0.002511 loss: 2.5691 (2.5691) class_acc: 0.6016 (0.6016) weight_decay: 0.0500 (0.0500) time: 21.4050 data: 18.1187 max mem: 6925
Epoch: [137] [200/625] eta: 0:14:08 lr: 0.002504 min_lr: 0.002504 loss: 2.6356 (2.5866) class_acc: 0.6055 (0.6146) weight_decay: 0.0500 (0.0500) grad_norm: 0.9010 (0.9846) time: 1.8651 data: 0.2273 max mem: 6925
Epoch: [137] [400/625] eta: 0:07:24 lr: 0.002497 min_lr: 0.002497 loss: 2.5842 (2.6013) class_acc: 0.6016 (0.6118) weight_decay: 0.0500 (0.0500) grad_norm: 1.0067 (0.9782) time: 1.9870 data: 0.0008 max mem: 6925
Epoch: [137] [600/625] eta: 0:00:49 lr: 0.002490 min_lr: 0.002490 loss: 2.6232 (2.6108) class_acc: 0.6172 (0.6109) weight_decay: 0.0500 (0.0500) grad_norm: 0.9432 (1.0132) time: 1.9363 data: 0.1116 max mem: 6925
Epoch: [137] [624/625] eta: 0:00:01 lr: 0.002489 min_lr: 0.002489 loss: 2.5982 (2.6105) class_acc: 0.6055 (0.6111) weight_decay: 0.0500 (0.0500) grad_norm: 0.9196 (1.0219) time: 0.7811 data: 0.0844 max mem: 6925
Epoch: [137] Total time: 0:19:59 (1.9186 s / it)
Averaged stats: lr: 0.002489 min_lr: 0.002489 loss: 2.5982 (2.6090) class_acc: 0.6055 (0.6107) weight_decay: 0.0500 (0.0500) grad_norm: 0.9196 (1.0219)
Test: [ 0/50] eta: 0:09:51 loss: 1.4736 (1.4736) acc1: 64.0000 (64.0000) acc5: 88.8000 (88.8000) time: 11.8226 data: 11.7645 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.4736 (1.4961) acc1: 65.6000 (66.0364) acc5: 87.2000 (86.4727) time: 2.1445 data: 2.1115 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.6788 (1.6422) acc1: 62.4000 (63.0857) acc5: 86.4000 (85.2191) time: 1.2207 data: 1.1913 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7573 (1.6549) acc1: 60.8000 (62.9936) acc5: 83.2000 (84.7226) time: 1.0936 data: 1.0646 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6342 (1.6754) acc1: 62.4000 (62.5561) acc5: 84.0000 (84.5854) time: 0.6473 data: 0.6174 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6526 (1.6843) acc1: 62.4000 (62.5440) acc5: 84.0000 (84.5120) time: 0.5292 data: 0.4993 max mem: 6925
Test: Total time: 0:00:50 (1.0110 s / it)
* Acc@1 63.324 Acc@5 85.274 loss 1.624
Accuracy of the model on the 50000 test images: 63.3%
Max accuracy: 65.05%
Epoch: [138] [ 0/625] eta: 4:07:55 lr: 0.002489 min_lr: 0.002489 loss: 2.4918 (2.4918) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 23.8004 data: 18.1222 max mem: 6925
Epoch: [138] [200/625] eta: 0:14:17 lr: 0.002482 min_lr: 0.002482 loss: 2.5933 (2.5795) class_acc: 0.6055 (0.6190) weight_decay: 0.0500 (0.0500) grad_norm: 1.0501 (1.0288) time: 1.9329 data: 0.0009 max mem: 6925
Epoch: [138] [400/625] eta: 0:07:26 lr: 0.002475 min_lr: 0.002475 loss: 2.6335 (2.5918) class_acc: 0.6016 (0.6150) weight_decay: 0.0500 (0.0500) grad_norm: 0.8765 (1.0192) time: 1.9702 data: 0.0009 max mem: 6925
Epoch: [138] [600/625] eta: 0:00:49 lr: 0.002468 min_lr: 0.002468 loss: 2.5675 (2.5994) class_acc: 0.6250 (0.6130) weight_decay: 0.0500 (0.0500) grad_norm: 0.9268 (1.0309) time: 1.9734 data: 0.0007 max mem: 6925
Epoch: [138] [624/625] eta: 0:00:01 lr: 0.002467 min_lr: 0.002467 loss: 2.6747 (2.6015) class_acc: 0.5898 (0.6123) weight_decay: 0.0500 (0.0500) grad_norm: 0.8935 (1.0280) time: 0.7677 data: 0.0015 max mem: 6925
Epoch: [138] Total time: 0:20:17 (1.9486 s / it)
Averaged stats: lr: 0.002467 min_lr: 0.002467 loss: 2.6747 (2.6058) class_acc: 0.5898 (0.6111) weight_decay: 0.0500 (0.0500) grad_norm: 0.8935 (1.0280)
Test: [ 0/50] eta: 0:10:12 loss: 1.2978 (1.2978) acc1: 73.6000 (73.6000) acc5: 90.4000 (90.4000) time: 12.2566 data: 12.2194 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.3406 (1.4734) acc1: 68.0000 (67.2000) acc5: 88.0000 (86.7636) time: 2.0348 data: 2.0034 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.6554 (1.6486) acc1: 61.6000 (63.3905) acc5: 84.0000 (84.7619) time: 1.0384 data: 1.0087 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.7440 (1.6633) acc1: 60.0000 (63.1484) acc5: 84.0000 (84.3871) time: 1.0335 data: 1.0050 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6407 (1.6693) acc1: 61.6000 (62.8488) acc5: 84.8000 (84.4878) time: 0.8399 data: 0.8106 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7979 (1.6772) acc1: 59.2000 (62.6240) acc5: 83.2000 (84.4160) time: 0.7388 data: 0.7075 max mem: 6925
Test: Total time: 0:00:50 (1.0181 s / it)
* Acc@1 63.474 Acc@5 85.512 loss 1.621
Accuracy of the model on the 50000 test images: 63.5%
Max accuracy: 65.05%
Epoch: [139] [ 0/625] eta: 3:54:03 lr: 0.002467 min_lr: 0.002467 loss: 2.5543 (2.5543) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 22.4692 data: 18.0602 max mem: 6925
Epoch: [139] [200/625] eta: 0:14:28 lr: 0.002460 min_lr: 0.002460 loss: 2.5759 (2.5891) class_acc: 0.6055 (0.6158) weight_decay: 0.0500 (0.0500) grad_norm: 1.0757 (1.0395) time: 2.0835 data: 0.0012 max mem: 6925
Epoch: [139] [400/625] eta: 0:07:26 lr: 0.002453 min_lr: 0.002453 loss: 2.5969 (2.5941) class_acc: 0.6055 (0.6145) weight_decay: 0.0500 (0.0500) grad_norm: 0.9203 (1.0398) time: 1.9555 data: 0.0011 max mem: 6925
Epoch: [139] [600/625] eta: 0:00:49 lr: 0.002446 min_lr: 0.002446 loss: 2.5996 (2.6048) class_acc: 0.6211 (0.6129) weight_decay: 0.0500 (0.0500) grad_norm: 1.1969 (1.0554) time: 1.8797 data: 0.0010 max mem: 6925
Epoch: [139] [624/625] eta: 0:00:01 lr: 0.002446 min_lr: 0.002446 loss: 2.6237 (2.6054) class_acc: 0.6094 (0.6128) weight_decay: 0.0500 (0.0500) grad_norm: 0.8745 (1.0569) time: 0.8897 data: 0.0156 max mem: 6925
Epoch: [139] Total time: 0:20:03 (1.9262 s / it)
Averaged stats: lr: 0.002446 min_lr: 0.002446 loss: 2.6237 (2.6000) class_acc: 0.6094 (0.6131) weight_decay: 0.0500 (0.0500) grad_norm: 0.8745 (1.0569)
Test: [ 0/50] eta: 0:09:19 loss: 1.2470 (1.2470) acc1: 74.4000 (74.4000) acc5: 89.6000 (89.6000) time: 11.1838 data: 11.1309 max mem: 6925
Test: [10/50] eta: 0:01:07 loss: 1.4846 (1.4589) acc1: 67.2000 (68.2182) acc5: 87.2000 (87.0545) time: 1.6891 data: 1.6578 max mem: 6925
Test: [20/50] eta: 0:00:38 loss: 1.6141 (1.6402) acc1: 60.0000 (63.6571) acc5: 84.8000 (85.1429) time: 0.7838 data: 0.7547 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.7775 (1.6613) acc1: 60.0000 (63.4065) acc5: 83.2000 (85.0065) time: 0.9729 data: 0.9435 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6593 (1.6766) acc1: 63.2000 (63.3366) acc5: 85.6000 (84.9756) time: 0.9338 data: 0.9039 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6172 (1.6900) acc1: 64.8000 (63.4400) acc5: 85.6000 (84.8160) time: 0.5498 data: 0.5196 max mem: 6925
Test: Total time: 0:00:49 (0.9833 s / it)
* Acc@1 64.256 Acc@5 85.598 loss 1.642
Accuracy of the model on the 50000 test images: 64.3%
Max accuracy: 65.05%
Epoch: [140] [ 0/625] eta: 4:06:43 lr: 0.002445 min_lr: 0.002445 loss: 2.4622 (2.4622) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 23.6848 data: 18.9717 max mem: 6925
Epoch: [140] [200/625] eta: 0:14:29 lr: 0.002438 min_lr: 0.002438 loss: 2.6335 (2.5843) class_acc: 0.5938 (0.6185) weight_decay: 0.0500 (0.0500) grad_norm: 1.0709 (1.0539) time: 1.9270 data: 0.0009 max mem: 6925
Epoch: [140] [400/625] eta: 0:07:32 lr: 0.002431 min_lr: 0.002431 loss: 2.5949 (2.5944) class_acc: 0.6133 (0.6165) weight_decay: 0.0500 (0.0500) grad_norm: 0.9426 (1.0471) time: 1.8779 data: 0.0008 max mem: 6925
Epoch: [140] [600/625] eta: 0:00:49 lr: 0.002424 min_lr: 0.002424 loss: 2.6394 (2.6044) class_acc: 0.6133 (0.6139) weight_decay: 0.0500 (0.0500) grad_norm: 0.9201 (1.0287) time: 1.9411 data: 0.0011 max mem: 6925
Epoch: [140] [624/625] eta: 0:00:01 lr: 0.002424 min_lr: 0.002424 loss: 2.6455 (2.6047) class_acc: 0.6094 (0.6138) weight_decay: 0.0500 (0.0500) grad_norm: 1.0041 (1.0287) time: 0.9506 data: 0.0014 max mem: 6925
Epoch: [140] Total time: 0:20:21 (1.9540 s / it)
Averaged stats: lr: 0.002424 min_lr: 0.002424 loss: 2.6455 (2.5996) class_acc: 0.6094 (0.6128) weight_decay: 0.0500 (0.0500) grad_norm: 1.0041 (1.0287)
Test: [ 0/50] eta: 0:10:13 loss: 1.4164 (1.4164) acc1: 72.8000 (72.8000) acc5: 86.4000 (86.4000) time: 12.2715 data: 12.2032 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.4164 (1.4364) acc1: 69.6000 (68.0000) acc5: 88.0000 (87.2727) time: 2.1365 data: 2.1032 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.5622 (1.5673) acc1: 63.2000 (64.8762) acc5: 85.6000 (85.6381) time: 1.0008 data: 0.9715 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.7114 (1.5753) acc1: 61.6000 (64.0000) acc5: 84.0000 (85.6258) time: 0.8037 data: 0.7748 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6061 (1.5995) acc1: 59.2000 (63.2000) acc5: 85.6000 (85.6000) time: 0.7866 data: 0.7579 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6061 (1.5963) acc1: 62.4000 (63.5360) acc5: 85.6000 (85.5040) time: 0.5779 data: 0.5494 max mem: 6925
Test: Total time: 0:00:51 (1.0255 s / it)
* Acc@1 64.662 Acc@5 86.248 loss 1.551
Accuracy of the model on the 50000 test images: 64.7%
Max accuracy: 65.05%
Epoch: [141] [ 0/625] eta: 3:45:18 lr: 0.002424 min_lr: 0.002424 loss: 2.2416 (2.2416) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 21.6302 data: 17.5140 max mem: 6925
Epoch: [141] [200/625] eta: 0:14:27 lr: 0.002417 min_lr: 0.002417 loss: 2.5993 (2.5729) class_acc: 0.6094 (0.6163) weight_decay: 0.0500 (0.0500) grad_norm: 1.0411 (1.0163) time: 1.8534 data: 0.0008 max mem: 6925
Epoch: [141] [400/625] eta: 0:07:24 lr: 0.002409 min_lr: 0.002409 loss: 2.6205 (2.5888) class_acc: 0.6094 (0.6140) weight_decay: 0.0500 (0.0500) grad_norm: 0.9413 (inf) time: 1.7310 data: 0.0009 max mem: 6925
Epoch: [141] [600/625] eta: 0:00:49 lr: 0.002402 min_lr: 0.002402 loss: 2.6374 (2.6027) class_acc: 0.6133 (0.6113) weight_decay: 0.0500 (0.0500) grad_norm: 1.0543 (inf) time: 2.1229 data: 0.0009 max mem: 6925
Epoch: [141] [624/625] eta: 0:00:01 lr: 0.002402 min_lr: 0.002402 loss: 2.6391 (2.6031) class_acc: 0.6016 (0.6112) weight_decay: 0.0500 (0.0500) grad_norm: 0.8728 (inf) time: 0.7624 data: 0.0019 max mem: 6925
Epoch: [141] Total time: 0:20:08 (1.9329 s / it)
Averaged stats: lr: 0.002402 min_lr: 0.002402 loss: 2.6391 (2.5985) class_acc: 0.6016 (0.6133) weight_decay: 0.0500 (0.0500) grad_norm: 0.8728 (inf)
Test: [ 0/50] eta: 0:09:59 loss: 1.5260 (1.5260) acc1: 65.6000 (65.6000) acc5: 88.0000 (88.0000) time: 11.9930 data: 11.9396 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.3636 (1.4378) acc1: 69.6000 (69.3818) acc5: 88.0000 (87.4182) time: 2.0075 data: 1.9754 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.6001 (1.6079) acc1: 65.6000 (64.4952) acc5: 85.6000 (85.4857) time: 0.8765 data: 0.8472 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.7773 (1.6497) acc1: 59.2000 (63.2774) acc5: 83.2000 (84.8774) time: 0.7266 data: 0.6980 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7397 (1.6624) acc1: 58.4000 (62.7902) acc5: 83.2000 (84.9366) time: 0.7477 data: 0.7187 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7045 (1.6622) acc1: 62.4000 (62.8160) acc5: 84.8000 (84.8000) time: 0.7067 data: 0.6765 max mem: 6925
Test: Total time: 0:00:49 (0.9873 s / it)
* Acc@1 63.700 Acc@5 85.292 loss 1.622
Accuracy of the model on the 50000 test images: 63.7%
Max accuracy: 65.05%
Epoch: [142] [ 0/625] eta: 3:46:09 lr: 0.002402 min_lr: 0.002402 loss: 2.4682 (2.4682) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 21.7111 data: 16.5934 max mem: 6925
Epoch: [142] [200/625] eta: 0:14:00 lr: 0.002395 min_lr: 0.002395 loss: 2.5363 (2.5678) class_acc: 0.6367 (0.6239) weight_decay: 0.0500 (0.0500) grad_norm: 0.9739 (0.9530) time: 1.7504 data: 0.0009 max mem: 6925
Epoch: [142] [400/625] eta: 0:07:22 lr: 0.002387 min_lr: 0.002387 loss: 2.6338 (2.5877) class_acc: 0.6133 (0.6187) weight_decay: 0.0500 (0.0500) grad_norm: 0.9814 (0.9839) time: 1.9631 data: 0.0013 max mem: 6925
Epoch: [142] [600/625] eta: 0:00:49 lr: 0.002380 min_lr: 0.002380 loss: 2.5674 (2.5913) class_acc: 0.6328 (0.6172) weight_decay: 0.0500 (0.0500) grad_norm: 0.9518 (1.0030) time: 2.0218 data: 0.0009 max mem: 6925
Epoch: [142] [624/625] eta: 0:00:01 lr: 0.002380 min_lr: 0.002380 loss: 2.6224 (2.5920) class_acc: 0.6094 (0.6171) weight_decay: 0.0500 (0.0500) grad_norm: 1.0154 (1.0111) time: 0.4633 data: 0.0017 max mem: 6925
Epoch: [142] Total time: 0:20:20 (1.9524 s / it)
Averaged stats: lr: 0.002380 min_lr: 0.002380 loss: 2.6224 (2.5938) class_acc: 0.6094 (0.6144) weight_decay: 0.0500 (0.0500) grad_norm: 1.0154 (1.0111)
Test: [ 0/50] eta: 0:10:43 loss: 1.6446 (1.6446) acc1: 58.4000 (58.4000) acc5: 84.0000 (84.0000) time: 12.8775 data: 12.8395 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.4566 (1.5028) acc1: 68.0000 (65.6000) acc5: 86.4000 (86.7636) time: 2.1780 data: 2.1467 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.6276 (1.6398) acc1: 63.2000 (63.1238) acc5: 85.6000 (85.1048) time: 1.1852 data: 1.1556 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.7622 (1.6657) acc1: 59.2000 (62.5806) acc5: 83.2000 (84.5936) time: 1.2254 data: 1.1968 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6962 (1.6712) acc1: 63.2000 (62.3220) acc5: 83.2000 (84.5463) time: 0.8276 data: 0.7985 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6334 (1.6811) acc1: 61.6000 (62.2720) acc5: 83.2000 (84.1280) time: 0.7231 data: 0.6932 max mem: 6925
Test: Total time: 0:00:53 (1.0799 s / it)
* Acc@1 62.966 Acc@5 85.016 loss 1.645
Accuracy of the model on the 50000 test images: 63.0%
Max accuracy: 65.05%
Epoch: [143] [ 0/625] eta: 3:57:22 lr: 0.002380 min_lr: 0.002380 loss: 2.5025 (2.5025) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 22.7876 data: 19.2864 max mem: 6925
Epoch: [143] [200/625] eta: 0:14:23 lr: 0.002373 min_lr: 0.002373 loss: 2.5987 (2.5822) class_acc: 0.6055 (0.6157) weight_decay: 0.0500 (0.0500) grad_norm: 0.8339 (0.9758) time: 2.0298 data: 0.0311 max mem: 6925
Epoch: [143] [400/625] eta: 0:07:29 lr: 0.002365 min_lr: 0.002365 loss: 2.5630 (2.5865) class_acc: 0.6133 (0.6151) weight_decay: 0.0500 (0.0500) grad_norm: 0.9136 (0.9844) time: 2.0373 data: 0.0012 max mem: 6925
Epoch: [143] [600/625] eta: 0:00:49 lr: 0.002358 min_lr: 0.002358 loss: 2.5430 (2.5898) class_acc: 0.6211 (0.6140) weight_decay: 0.0500 (0.0500) grad_norm: 1.0275 (0.9882) time: 2.0712 data: 0.0226 max mem: 6925
Epoch: [143] [624/625] eta: 0:00:01 lr: 0.002358 min_lr: 0.002358 loss: 2.6076 (2.5913) class_acc: 0.6055 (0.6136) weight_decay: 0.0500 (0.0500) grad_norm: 0.9886 (0.9913) time: 0.7610 data: 0.0019 max mem: 6925
Epoch: [143] Total time: 0:20:11 (1.9388 s / it)
Averaged stats: lr: 0.002358 min_lr: 0.002358 loss: 2.6076 (2.5895) class_acc: 0.6055 (0.6151) weight_decay: 0.0500 (0.0500) grad_norm: 0.9886 (0.9913)
Test: [ 0/50] eta: 0:10:09 loss: 1.3955 (1.3955) acc1: 67.2000 (67.2000) acc5: 91.2000 (91.2000) time: 12.1952 data: 12.1463 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.5388 (1.5588) acc1: 67.2000 (67.4182) acc5: 86.4000 (86.4000) time: 2.0817 data: 2.0496 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.7168 (1.7157) acc1: 63.2000 (62.9714) acc5: 83.2000 (84.3810) time: 1.1073 data: 1.0777 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.7856 (1.7136) acc1: 61.6000 (62.6839) acc5: 83.2000 (84.4387) time: 1.0765 data: 1.0478 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6296 (1.7193) acc1: 62.4000 (62.6537) acc5: 84.0000 (84.1561) time: 0.7495 data: 0.7196 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6600 (1.7027) acc1: 62.4000 (62.8480) acc5: 84.0000 (84.2240) time: 0.6382 data: 0.6069 max mem: 6925
Test: Total time: 0:00:50 (1.0010 s / it)
* Acc@1 63.290 Acc@5 85.028 loss 1.651
Accuracy of the model on the 50000 test images: 63.3%
Max accuracy: 65.05%
Epoch: [144] [ 0/625] eta: 3:38:12 lr: 0.002358 min_lr: 0.002358 loss: 2.6473 (2.6473) class_acc: 0.6523 (0.6523) weight_decay: 0.0500 (0.0500) time: 20.9483 data: 17.8406 max mem: 6925
Epoch: [144] [200/625] eta: 0:14:06 lr: 0.002350 min_lr: 0.002350 loss: 2.5722 (2.5666) class_acc: 0.6133 (0.6233) weight_decay: 0.0500 (0.0500) grad_norm: 0.9949 (1.0706) time: 1.8613 data: 0.2306 max mem: 6925
Epoch: [144] [400/625] eta: 0:07:21 lr: 0.002343 min_lr: 0.002343 loss: 2.6090 (2.5789) class_acc: 0.5977 (0.6180) weight_decay: 0.0500 (0.0500) grad_norm: 1.0167 (1.0512) time: 1.8926 data: 0.0012 max mem: 6925
Epoch: [144] [600/625] eta: 0:00:48 lr: 0.002336 min_lr: 0.002336 loss: 2.5827 (2.5845) class_acc: 0.6133 (0.6158) weight_decay: 0.0500 (0.0500) grad_norm: 0.8661 (1.0328) time: 2.0239 data: 0.0007 max mem: 6925
Epoch: [144] [624/625] eta: 0:00:01 lr: 0.002335 min_lr: 0.002335 loss: 2.6089 (2.5860) class_acc: 0.6094 (0.6154) weight_decay: 0.0500 (0.0500) grad_norm: 0.9115 (1.0280) time: 0.7262 data: 0.0024 max mem: 6925
Epoch: [144] Total time: 0:19:52 (1.9072 s / it)
Averaged stats: lr: 0.002335 min_lr: 0.002335 loss: 2.6089 (2.5874) class_acc: 0.6094 (0.6154) weight_decay: 0.0500 (0.0500) grad_norm: 0.9115 (1.0280)
Test: [ 0/50] eta: 0:09:36 loss: 1.2760 (1.2760) acc1: 73.6000 (73.6000) acc5: 88.0000 (88.0000) time: 11.5314 data: 11.4859 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.3980 (1.4390) acc1: 70.4000 (69.2364) acc5: 87.2000 (87.3455) time: 1.9745 data: 1.9429 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.5376 (1.5633) acc1: 64.0000 (65.5238) acc5: 86.4000 (86.4762) time: 1.0356 data: 1.0061 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.6749 (1.5962) acc1: 61.6000 (64.6452) acc5: 84.8000 (85.9097) time: 0.9634 data: 0.9342 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.5665 (1.6198) acc1: 61.6000 (64.0976) acc5: 84.8000 (85.6000) time: 0.6107 data: 0.5819 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5747 (1.6222) acc1: 62.4000 (63.9200) acc5: 84.8000 (85.4400) time: 0.5386 data: 0.5103 max mem: 6925
Test: Total time: 0:00:45 (0.9052 s / it)
* Acc@1 64.632 Acc@5 86.108 loss 1.585
Accuracy of the model on the 50000 test images: 64.6%
Max accuracy: 65.05%
Epoch: [145] [ 0/625] eta: 3:32:32 lr: 0.002335 min_lr: 0.002335 loss: 2.4464 (2.4464) class_acc: 0.6562 (0.6562) weight_decay: 0.0500 (0.0500) time: 20.4037 data: 15.9302 max mem: 6925
Epoch: [145] [200/625] eta: 0:13:56 lr: 0.002328 min_lr: 0.002328 loss: 2.5701 (2.5643) class_acc: 0.6094 (0.6226) weight_decay: 0.0500 (0.0500) grad_norm: 0.9215 (1.0179) time: 1.9767 data: 0.0008 max mem: 6925
Epoch: [145] [400/625] eta: 0:07:13 lr: 0.002321 min_lr: 0.002321 loss: 2.5856 (2.5785) class_acc: 0.6094 (0.6195) weight_decay: 0.0500 (0.0500) grad_norm: 0.9558 (1.0124) time: 1.8824 data: 0.0007 max mem: 6925
Epoch: [145] [600/625] eta: 0:00:48 lr: 0.002314 min_lr: 0.002314 loss: 2.5804 (2.5845) class_acc: 0.5977 (0.6170) weight_decay: 0.0500 (0.0500) grad_norm: 0.8745 (0.9893) time: 1.9328 data: 0.0009 max mem: 6925
Epoch: [145] [624/625] eta: 0:00:01 lr: 0.002313 min_lr: 0.002313 loss: 2.5707 (2.5845) class_acc: 0.6250 (0.6172) weight_decay: 0.0500 (0.0500) grad_norm: 0.8745 (0.9899) time: 0.8073 data: 0.0013 max mem: 6925
Epoch: [145] Total time: 0:19:33 (1.8773 s / it)
Averaged stats: lr: 0.002313 min_lr: 0.002313 loss: 2.5707 (2.5859) class_acc: 0.6250 (0.6161) weight_decay: 0.0500 (0.0500) grad_norm: 0.8745 (0.9899)
Test: [ 0/50] eta: 0:11:20 loss: 1.4998 (1.4998) acc1: 66.4000 (66.4000) acc5: 88.0000 (88.0000) time: 13.6094 data: 13.5783 max mem: 6925
Test: [10/50] eta: 0:01:33 loss: 1.4798 (1.4779) acc1: 68.0000 (67.7091) acc5: 87.2000 (87.2727) time: 2.3472 data: 2.3167 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.6062 (1.6091) acc1: 64.0000 (65.0667) acc5: 86.4000 (86.0191) time: 1.2418 data: 1.2121 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.6677 (1.6619) acc1: 61.6000 (64.5161) acc5: 85.6000 (84.9290) time: 1.1121 data: 1.0817 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5979 (1.6651) acc1: 61.6000 (64.2732) acc5: 84.0000 (84.8585) time: 0.6790 data: 0.6484 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6777 (1.6673) acc1: 63.2000 (64.1600) acc5: 84.0000 (84.6560) time: 0.5800 data: 0.5510 max mem: 6925
Test: Total time: 0:00:52 (1.0551 s / it)
* Acc@1 64.236 Acc@5 85.742 loss 1.614
Accuracy of the model on the 50000 test images: 64.2%
Max accuracy: 65.05%
Epoch: [146] [ 0/625] eta: 3:14:25 lr: 0.002313 min_lr: 0.002313 loss: 2.3761 (2.3761) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 18.6651 data: 18.4253 max mem: 6925
Epoch: [146] [200/625] eta: 0:14:02 lr: 0.002306 min_lr: 0.002306 loss: 2.5826 (2.5556) class_acc: 0.6172 (0.6228) weight_decay: 0.0500 (0.0500) grad_norm: 0.8890 (1.0557) time: 1.8239 data: 0.2406 max mem: 6925
Epoch: [146] [400/625] eta: 0:07:17 lr: 0.002299 min_lr: 0.002299 loss: 2.5751 (2.5644) class_acc: 0.6250 (0.6208) weight_decay: 0.0500 (0.0500) grad_norm: 0.8892 (1.0419) time: 1.8825 data: 0.0010 max mem: 6925
Epoch: [146] [600/625] eta: 0:00:48 lr: 0.002292 min_lr: 0.002292 loss: 2.5929 (2.5770) class_acc: 0.6133 (0.6183) weight_decay: 0.0500 (0.0500) grad_norm: 0.9578 (1.0331) time: 1.9020 data: 0.0266 max mem: 6925
Epoch: [146] [624/625] eta: 0:00:01 lr: 0.002291 min_lr: 0.002291 loss: 2.6187 (2.5781) class_acc: 0.6172 (0.6181) weight_decay: 0.0500 (0.0500) grad_norm: 0.8966 (1.0323) time: 0.8502 data: 0.0016 max mem: 6925
Epoch: [146] Total time: 0:19:48 (1.9024 s / it)
Averaged stats: lr: 0.002291 min_lr: 0.002291 loss: 2.6187 (2.5831) class_acc: 0.6172 (0.6164) weight_decay: 0.0500 (0.0500) grad_norm: 0.8966 (1.0323)
Test: [ 0/50] eta: 0:10:44 loss: 1.5464 (1.5464) acc1: 67.2000 (67.2000) acc5: 88.8000 (88.8000) time: 12.8806 data: 12.8497 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 1.4632 (1.4962) acc1: 65.6000 (67.0545) acc5: 88.8000 (87.6364) time: 2.2764 data: 2.2463 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.7601 (1.6631) acc1: 60.8000 (63.1619) acc5: 84.0000 (85.1429) time: 1.2597 data: 1.2304 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7744 (1.6798) acc1: 59.2000 (62.6839) acc5: 84.0000 (85.1097) time: 1.0118 data: 0.9811 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6745 (1.6746) acc1: 60.8000 (62.6537) acc5: 84.8000 (85.0146) time: 0.5100 data: 0.4771 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6242 (1.6800) acc1: 61.6000 (62.6080) acc5: 84.8000 (84.9280) time: 0.4273 data: 0.3955 max mem: 6925
Test: Total time: 0:00:49 (0.9849 s / it)
* Acc@1 63.306 Acc@5 85.452 loss 1.641
Accuracy of the model on the 50000 test images: 63.3%
Max accuracy: 65.05%
Epoch: [147] [ 0/625] eta: 3:43:09 lr: 0.002291 min_lr: 0.002291 loss: 2.4452 (2.4452) class_acc: 0.6562 (0.6562) weight_decay: 0.0500 (0.0500) time: 21.4234 data: 17.9113 max mem: 6925
Epoch: [147] [200/625] eta: 0:13:51 lr: 0.002284 min_lr: 0.002284 loss: 2.5976 (2.5680) class_acc: 0.6211 (0.6212) weight_decay: 0.0500 (0.0500) grad_norm: 0.9298 (1.0445) time: 1.9033 data: 0.0010 max mem: 6925
Epoch: [147] [400/625] eta: 0:07:11 lr: 0.002277 min_lr: 0.002277 loss: 2.5721 (2.5728) class_acc: 0.6133 (0.6196) weight_decay: 0.0500 (0.0500) grad_norm: 1.0568 (1.0408) time: 1.8995 data: 0.0012 max mem: 6925
Epoch: [147] [600/625] eta: 0:00:48 lr: 0.002270 min_lr: 0.002270 loss: 2.6033 (2.5817) class_acc: 0.5977 (0.6171) weight_decay: 0.0500 (0.0500) grad_norm: 0.8858 (1.0370) time: 1.9100 data: 0.0320 max mem: 6925
Epoch: [147] [624/625] eta: 0:00:01 lr: 0.002269 min_lr: 0.002269 loss: 2.5571 (2.5804) class_acc: 0.6367 (0.6176) weight_decay: 0.0500 (0.0500) grad_norm: 0.8543 (1.0313) time: 1.0253 data: 0.0025 max mem: 6925
Epoch: [147] Total time: 0:19:45 (1.8974 s / it)
Averaged stats: lr: 0.002269 min_lr: 0.002269 loss: 2.5571 (2.5797) class_acc: 0.6367 (0.6177) weight_decay: 0.0500 (0.0500) grad_norm: 0.8543 (1.0313)
Test: [ 0/50] eta: 0:10:40 loss: 1.4522 (1.4522) acc1: 64.8000 (64.8000) acc5: 91.2000 (91.2000) time: 12.8097 data: 12.7788 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.4746 (1.4850) acc1: 67.2000 (67.7091) acc5: 88.8000 (88.1455) time: 2.2107 data: 2.1813 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.5458 (1.6474) acc1: 64.8000 (64.0000) acc5: 85.6000 (85.7524) time: 1.2068 data: 1.1779 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.8149 (1.6652) acc1: 60.0000 (63.5355) acc5: 82.4000 (85.2129) time: 1.1248 data: 1.0961 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7899 (1.6958) acc1: 60.8000 (62.8293) acc5: 82.4000 (84.5073) time: 0.6548 data: 0.6252 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6964 (1.6988) acc1: 60.0000 (62.5920) acc5: 83.2000 (84.3840) time: 0.6547 data: 0.6251 max mem: 6925
Test: Total time: 0:00:50 (1.0090 s / it)
* Acc@1 63.680 Acc@5 85.294 loss 1.648
Accuracy of the model on the 50000 test images: 63.7%
Max accuracy: 65.05%
Epoch: [148] [ 0/625] eta: 4:25:40 lr: 0.002269 min_lr: 0.002269 loss: 2.7441 (2.7441) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 25.5048 data: 19.0926 max mem: 6925
Epoch: [148] [200/625] eta: 0:14:23 lr: 0.002262 min_lr: 0.002262 loss: 2.6187 (2.5668) class_acc: 0.6094 (0.6201) weight_decay: 0.0500 (0.0500) grad_norm: 1.0644 (inf) time: 1.9314 data: 0.1459 max mem: 6925
Epoch: [148] [400/625] eta: 0:07:28 lr: 0.002255 min_lr: 0.002255 loss: 2.5231 (2.5686) class_acc: 0.6172 (0.6194) weight_decay: 0.0500 (0.0500) grad_norm: 0.8943 (inf) time: 1.9619 data: 0.0008 max mem: 6925
Epoch: [148] [600/625] eta: 0:00:49 lr: 0.002248 min_lr: 0.002248 loss: 2.6028 (2.5785) class_acc: 0.6055 (0.6173) weight_decay: 0.0500 (0.0500) grad_norm: 0.9844 (inf) time: 1.8837 data: 0.0008 max mem: 6925
Epoch: [148] [624/625] eta: 0:00:01 lr: 0.002247 min_lr: 0.002247 loss: 2.6154 (2.5802) class_acc: 0.6055 (0.6170) weight_decay: 0.0500 (0.0500) grad_norm: 1.0399 (inf) time: 0.8406 data: 0.0015 max mem: 6925
Epoch: [148] Total time: 0:20:13 (1.9409 s / it)
Averaged stats: lr: 0.002247 min_lr: 0.002247 loss: 2.6154 (2.5768) class_acc: 0.6055 (0.6183) weight_decay: 0.0500 (0.0500) grad_norm: 1.0399 (inf)
Test: [ 0/50] eta: 0:10:17 loss: 1.5041 (1.5041) acc1: 68.0000 (68.0000) acc5: 89.6000 (89.6000) time: 12.3517 data: 12.3131 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.4963 (1.5304) acc1: 67.2000 (67.4182) acc5: 86.4000 (86.1091) time: 2.0210 data: 1.9886 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.6301 (1.6661) acc1: 63.2000 (63.5429) acc5: 84.8000 (84.5714) time: 0.9932 data: 0.9630 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.7581 (1.6961) acc1: 60.0000 (62.6065) acc5: 82.4000 (84.1032) time: 0.8896 data: 0.8610 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.6782 (1.6837) acc1: 60.0000 (62.9268) acc5: 84.0000 (84.2927) time: 0.6209 data: 0.5926 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6558 (1.6938) acc1: 62.4000 (62.5760) acc5: 84.0000 (84.1600) time: 0.5199 data: 0.4914 max mem: 6925
Test: Total time: 0:00:47 (0.9533 s / it)
* Acc@1 63.430 Acc@5 85.110 loss 1.644
Accuracy of the model on the 50000 test images: 63.4%
Max accuracy: 65.05%
Epoch: [149] [ 0/625] eta: 3:43:57 lr: 0.002247 min_lr: 0.002247 loss: 2.4832 (2.4832) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 21.5002 data: 19.1497 max mem: 6925
Epoch: [149] [200/625] eta: 0:14:16 lr: 0.002240 min_lr: 0.002240 loss: 2.5810 (2.5524) class_acc: 0.6172 (0.6249) weight_decay: 0.0500 (0.0500) grad_norm: 0.9859 (1.0624) time: 2.0707 data: 0.5208 max mem: 6925
Epoch: [149] [400/625] eta: 0:07:13 lr: 0.002232 min_lr: 0.002232 loss: 2.6050 (2.5670) class_acc: 0.6094 (0.6212) weight_decay: 0.0500 (0.0500) grad_norm: 0.9204 (1.0401) time: 1.7727 data: 0.2277 max mem: 6925
Epoch: [149] [600/625] eta: 0:00:47 lr: 0.002225 min_lr: 0.002225 loss: 2.5746 (2.5744) class_acc: 0.6172 (0.6195) weight_decay: 0.0500 (0.0500) grad_norm: 1.1256 (1.0400) time: 1.8076 data: 1.5079 max mem: 6925
Epoch: [149] [624/625] eta: 0:00:01 lr: 0.002224 min_lr: 0.002224 loss: 2.5362 (2.5732) class_acc: 0.6172 (0.6196) weight_decay: 0.0500 (0.0500) grad_norm: 0.9885 (1.0375) time: 1.1145 data: 0.8368 max mem: 6925
Epoch: [149] Total time: 0:19:35 (1.8805 s / it)
Averaged stats: lr: 0.002224 min_lr: 0.002224 loss: 2.5362 (2.5741) class_acc: 0.6172 (0.6189) weight_decay: 0.0500 (0.0500) grad_norm: 0.9885 (1.0375)
Test: [ 0/50] eta: 0:10:19 loss: 1.4730 (1.4730) acc1: 68.8000 (68.8000) acc5: 88.0000 (88.0000) time: 12.3830 data: 12.3518 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.4313 (1.4794) acc1: 68.8000 (68.8000) acc5: 88.0000 (87.2000) time: 2.1800 data: 2.1484 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.6854 (1.6714) acc1: 63.2000 (64.0381) acc5: 83.2000 (84.9905) time: 1.2033 data: 1.1732 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.8157 (1.7018) acc1: 60.8000 (63.2000) acc5: 82.4000 (84.4645) time: 1.1426 data: 1.1111 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.7200 (1.7068) acc1: 61.6000 (63.2781) acc5: 83.2000 (84.4683) time: 0.6792 data: 0.6462 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6865 (1.7079) acc1: 62.4000 (63.1040) acc5: 84.0000 (84.3200) time: 0.5661 data: 0.5352 max mem: 6925
Test: Total time: 0:00:50 (1.0167 s / it)
* Acc@1 64.248 Acc@5 85.460 loss 1.646
Accuracy of the model on the 50000 test images: 64.2%
Max accuracy: 65.05%
Epoch: [150] [ 0/625] eta: 3:24:43 lr: 0.002224 min_lr: 0.002224 loss: 2.6349 (2.6349) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 19.6533 data: 15.3630 max mem: 6925
Epoch: [150] [200/625] eta: 0:14:36 lr: 0.002217 min_lr: 0.002217 loss: 2.5519 (2.5634) class_acc: 0.6055 (0.6200) weight_decay: 0.0500 (0.0500) grad_norm: 0.9347 (1.0000) time: 1.9340 data: 0.0011 max mem: 6925
Epoch: [150] [400/625] eta: 0:07:39 lr: 0.002210 min_lr: 0.002210 loss: 2.6296 (2.5670) class_acc: 0.6055 (0.6181) weight_decay: 0.0500 (0.0500) grad_norm: 1.0579 (1.0247) time: 2.0618 data: 0.0008 max mem: 6925
Epoch: [150] [600/625] eta: 0:00:51 lr: 0.002203 min_lr: 0.002203 loss: 2.5589 (2.5734) class_acc: 0.6211 (0.6174) weight_decay: 0.0500 (0.0500) grad_norm: 0.8312 (1.0180) time: 2.1184 data: 0.0008 max mem: 6925
Epoch: [150] [624/625] eta: 0:00:02 lr: 0.002202 min_lr: 0.002202 loss: 2.5856 (2.5743) class_acc: 0.6211 (0.6175) weight_decay: 0.0500 (0.0500) grad_norm: 1.0379 (1.0198) time: 0.4830 data: 0.0014 max mem: 6925
Epoch: [150] Total time: 0:20:58 (2.0132 s / it)
Averaged stats: lr: 0.002202 min_lr: 0.002202 loss: 2.5856 (2.5710) class_acc: 0.6211 (0.6194) weight_decay: 0.0500 (0.0500) grad_norm: 1.0379 (1.0198)
Test: [ 0/50] eta: 0:11:35 loss: 1.3736 (1.3736) acc1: 68.8000 (68.8000) acc5: 89.6000 (89.6000) time: 13.9077 data: 13.8661 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 1.4243 (1.4350) acc1: 68.8000 (68.5091) acc5: 88.0000 (87.4909) time: 2.2834 data: 2.2533 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.5583 (1.5582) acc1: 64.0000 (65.1429) acc5: 87.2000 (86.3619) time: 1.1920 data: 1.1632 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.6867 (1.5917) acc1: 61.6000 (64.0774) acc5: 84.0000 (85.7032) time: 1.2205 data: 1.1920 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6897 (1.5899) acc1: 60.8000 (64.1366) acc5: 83.2000 (85.6585) time: 0.8409 data: 0.8110 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6372 (1.5978) acc1: 61.6000 (63.9360) acc5: 84.8000 (85.6000) time: 0.8408 data: 0.8109 max mem: 6925
Test: Total time: 0:00:54 (1.0996 s / it)
* Acc@1 65.090 Acc@5 86.156 loss 1.550
Accuracy of the model on the 50000 test images: 65.1%
Max accuracy: 65.09%
Epoch: [151] [ 0/625] eta: 3:30:20 lr: 0.002202 min_lr: 0.002202 loss: 2.6611 (2.6611) class_acc: 0.5820 (0.5820) weight_decay: 0.0500 (0.0500) time: 20.1929 data: 18.4187 max mem: 6925
Epoch: [151] [200/625] eta: 0:14:27 lr: 0.002195 min_lr: 0.002195 loss: 2.5117 (2.5535) class_acc: 0.6172 (0.6254) weight_decay: 0.0500 (0.0500) grad_norm: 0.9658 (0.9900) time: 2.0448 data: 0.2142 max mem: 6925
Epoch: [151] [400/625] eta: 0:07:38 lr: 0.002188 min_lr: 0.002188 loss: 2.5934 (2.5593) class_acc: 0.6211 (0.6233) weight_decay: 0.0500 (0.0500) grad_norm: 1.1097 (1.0439) time: 1.6869 data: 0.0011 max mem: 6925
Epoch: [151] [600/625] eta: 0:00:50 lr: 0.002181 min_lr: 0.002181 loss: 2.6030 (2.5673) class_acc: 0.6133 (0.6218) weight_decay: 0.0500 (0.0500) grad_norm: 1.0012 (1.0238) time: 1.9379 data: 0.0015 max mem: 6925
Epoch: [151] [624/625] eta: 0:00:02 lr: 0.002180 min_lr: 0.002180 loss: 2.5555 (2.5670) class_acc: 0.6172 (0.6215) weight_decay: 0.0500 (0.0500) grad_norm: 1.0559 (1.0272) time: 0.7587 data: 0.0019 max mem: 6925
Epoch: [151] Total time: 0:20:57 (2.0113 s / it)
Averaged stats: lr: 0.002180 min_lr: 0.002180 loss: 2.5555 (2.5691) class_acc: 0.6172 (0.6199) weight_decay: 0.0500 (0.0500) grad_norm: 1.0559 (1.0272)
Test: [ 0/50] eta: 0:10:32 loss: 1.6378 (1.6378) acc1: 62.4000 (62.4000) acc5: 87.2000 (87.2000) time: 12.6434 data: 12.5993 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.6062 (1.5765) acc1: 63.2000 (64.9455) acc5: 88.0000 (86.3273) time: 1.9993 data: 1.9682 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.7175 (1.7024) acc1: 60.8000 (62.0952) acc5: 84.8000 (84.5714) time: 1.0020 data: 0.9730 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.8008 (1.7194) acc1: 58.4000 (61.3936) acc5: 82.4000 (84.3355) time: 1.0389 data: 1.0104 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6823 (1.7207) acc1: 60.0000 (61.3854) acc5: 83.2000 (84.0585) time: 1.0068 data: 0.9762 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6823 (1.7287) acc1: 60.0000 (61.4400) acc5: 82.4000 (83.7920) time: 0.9315 data: 0.9010 max mem: 6925
Test: Total time: 0:00:58 (1.1723 s / it)
* Acc@1 62.642 Acc@5 84.396 loss 1.695
Accuracy of the model on the 50000 test images: 62.6%
Max accuracy: 65.09%
Epoch: [152] [ 0/625] eta: 3:19:15 lr: 0.002180 min_lr: 0.002180 loss: 2.3596 (2.3596) class_acc: 0.6836 (0.6836) weight_decay: 0.0500 (0.0500) time: 19.1289 data: 18.0027 max mem: 6925
Epoch: [152] [200/625] eta: 0:14:13 lr: 0.002173 min_lr: 0.002173 loss: 2.5386 (2.5429) class_acc: 0.6328 (0.6261) weight_decay: 0.0500 (0.0500) grad_norm: 0.9423 (1.0059) time: 1.7801 data: 0.0088 max mem: 6925
Epoch: [152] [400/625] eta: 0:07:27 lr: 0.002165 min_lr: 0.002165 loss: 2.5854 (2.5548) class_acc: 0.6133 (0.6233) weight_decay: 0.0500 (0.0500) grad_norm: 1.0698 (1.0235) time: 1.9176 data: 0.0369 max mem: 6925
Epoch: [152] [600/625] eta: 0:00:49 lr: 0.002158 min_lr: 0.002158 loss: 2.5777 (2.5627) class_acc: 0.6172 (0.6220) weight_decay: 0.0500 (0.0500) grad_norm: 0.9187 (1.0146) time: 2.1216 data: 0.0012 max mem: 6925
Epoch: [152] [624/625] eta: 0:00:01 lr: 0.002157 min_lr: 0.002157 loss: 2.5961 (2.5655) class_acc: 0.6094 (0.6213) weight_decay: 0.0500 (0.0500) grad_norm: 1.0525 (1.0271) time: 0.7946 data: 0.0013 max mem: 6925
Epoch: [152] Total time: 0:20:15 (1.9443 s / it)
Averaged stats: lr: 0.002157 min_lr: 0.002157 loss: 2.5961 (2.5653) class_acc: 0.6094 (0.6209) weight_decay: 0.0500 (0.0500) grad_norm: 1.0525 (1.0271)
Test: [ 0/50] eta: 0:09:56 loss: 1.6348 (1.6348) acc1: 63.2000 (63.2000) acc5: 82.4000 (82.4000) time: 11.9356 data: 11.8943 max mem: 6925
Test: [10/50] eta: 0:01:15 loss: 1.4419 (1.5072) acc1: 67.2000 (68.0000) acc5: 88.0000 (86.7636) time: 1.8924 data: 1.8613 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.5428 (1.6346) acc1: 64.0000 (64.6095) acc5: 86.4000 (85.1429) time: 0.9025 data: 0.8733 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.7345 (1.6741) acc1: 62.4000 (63.7936) acc5: 83.2000 (84.4387) time: 0.7489 data: 0.7205 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7314 (1.6805) acc1: 60.8000 (63.3366) acc5: 83.2000 (84.7220) time: 0.6730 data: 0.6416 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6705 (1.6854) acc1: 61.6000 (62.9440) acc5: 84.0000 (84.5280) time: 0.5215 data: 0.4901 max mem: 6925
Test: Total time: 0:00:46 (0.9274 s / it)
* Acc@1 63.826 Acc@5 85.456 loss 1.636
Accuracy of the model on the 50000 test images: 63.8%
Max accuracy: 65.09%
Epoch: [153] [ 0/625] eta: 3:29:21 lr: 0.002157 min_lr: 0.002157 loss: 2.3518 (2.3518) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 20.0981 data: 18.6270 max mem: 6925
Epoch: [153] [200/625] eta: 0:14:20 lr: 0.002150 min_lr: 0.002150 loss: 2.5532 (2.5593) class_acc: 0.6250 (0.6235) weight_decay: 0.0500 (0.0500) grad_norm: 1.0777 (0.9819) time: 1.8839 data: 0.0007 max mem: 6925
Epoch: [153] [400/625] eta: 0:07:31 lr: 0.002143 min_lr: 0.002143 loss: 2.5271 (2.5583) class_acc: 0.6250 (0.6227) weight_decay: 0.0500 (0.0500) grad_norm: 0.8950 (0.9919) time: 2.0072 data: 0.0007 max mem: 6925
Epoch: [153] [600/625] eta: 0:00:50 lr: 0.002136 min_lr: 0.002136 loss: 2.5978 (2.5654) class_acc: 0.6055 (0.6207) weight_decay: 0.0500 (0.0500) grad_norm: 0.9089 (inf) time: 2.0495 data: 0.0007 max mem: 6925
Epoch: [153] [624/625] eta: 0:00:01 lr: 0.002135 min_lr: 0.002135 loss: 2.5916 (2.5664) class_acc: 0.6094 (0.6204) weight_decay: 0.0500 (0.0500) grad_norm: 0.9089 (inf) time: 1.0926 data: 0.0013 max mem: 6925
Epoch: [153] Total time: 0:20:34 (1.9748 s / it)
Averaged stats: lr: 0.002135 min_lr: 0.002135 loss: 2.5916 (2.5629) class_acc: 0.6094 (0.6210) weight_decay: 0.0500 (0.0500) grad_norm: 0.9089 (inf)
Test: [ 0/50] eta: 0:10:11 loss: 1.5684 (1.5684) acc1: 65.6000 (65.6000) acc5: 88.8000 (88.8000) time: 12.2235 data: 12.1729 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.3813 (1.4447) acc1: 67.2000 (67.4909) acc5: 88.8000 (87.7091) time: 2.0607 data: 2.0300 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.4974 (1.5612) acc1: 64.0000 (64.8381) acc5: 87.2000 (86.4000) time: 1.0977 data: 1.0690 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.6522 (1.5951) acc1: 61.6000 (64.0516) acc5: 85.6000 (86.0129) time: 1.1385 data: 1.1093 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6422 (1.6017) acc1: 61.6000 (64.0585) acc5: 83.2000 (85.9512) time: 1.0902 data: 1.0610 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6308 (1.5953) acc1: 62.4000 (64.1280) acc5: 84.8000 (85.8080) time: 0.9499 data: 0.9203 max mem: 6925
Test: Total time: 0:00:58 (1.1737 s / it)
* Acc@1 65.474 Acc@5 86.542 loss 1.550
Accuracy of the model on the 50000 test images: 65.5%
Max accuracy: 65.47%
Epoch: [154] [ 0/625] eta: 3:33:15 lr: 0.002135 min_lr: 0.002135 loss: 2.4343 (2.4343) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 20.4724 data: 17.7729 max mem: 6925
Epoch: [154] [200/625] eta: 0:14:31 lr: 0.002128 min_lr: 0.002128 loss: 2.5441 (2.5409) class_acc: 0.6172 (0.6269) weight_decay: 0.0500 (0.0500) grad_norm: 0.9910 (1.0445) time: 2.1419 data: 0.0012 max mem: 6925
Epoch: [154] [400/625] eta: 0:07:47 lr: 0.002121 min_lr: 0.002121 loss: 2.5166 (2.5481) class_acc: 0.6250 (0.6246) weight_decay: 0.0500 (0.0500) grad_norm: 0.9995 (1.0461) time: 1.9960 data: 0.0122 max mem: 6925
Epoch: [154] [600/625] eta: 0:00:51 lr: 0.002113 min_lr: 0.002113 loss: 2.5224 (2.5549) class_acc: 0.6289 (0.6226) weight_decay: 0.0500 (0.0500) grad_norm: 0.9496 (1.0230) time: 2.0712 data: 0.1443 max mem: 6925
Epoch: [154] [624/625] eta: 0:00:02 lr: 0.002113 min_lr: 0.002113 loss: 2.5446 (2.5562) class_acc: 0.6055 (0.6222) weight_decay: 0.0500 (0.0500) grad_norm: 0.9146 (1.0261) time: 1.1903 data: 0.0015 max mem: 6925
Epoch: [154] Total time: 0:21:06 (2.0263 s / it)
Averaged stats: lr: 0.002113 min_lr: 0.002113 loss: 2.5446 (2.5564) class_acc: 0.6055 (0.6227) weight_decay: 0.0500 (0.0500) grad_norm: 0.9146 (1.0261)
Test: [ 0/50] eta: 0:07:30 loss: 1.7000 (1.7000) acc1: 65.6000 (65.6000) acc5: 81.6000 (81.6000) time: 9.0012 data: 8.9694 max mem: 6925
Test: [10/50] eta: 0:00:59 loss: 1.5416 (1.5805) acc1: 66.4000 (66.1818) acc5: 86.4000 (86.2545) time: 1.4810 data: 1.4484 max mem: 6925
Test: [20/50] eta: 0:00:35 loss: 1.6752 (1.6915) acc1: 64.0000 (63.8857) acc5: 86.4000 (85.1048) time: 0.8047 data: 0.7731 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.7698 (1.6965) acc1: 61.6000 (63.4581) acc5: 82.4000 (84.9032) time: 1.0102 data: 0.9805 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.7337 (1.6950) acc1: 60.8000 (63.3561) acc5: 83.2000 (84.8195) time: 0.9524 data: 0.9237 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6732 (1.6832) acc1: 60.8000 (63.3920) acc5: 84.8000 (84.9440) time: 0.5142 data: 0.4856 max mem: 6925
Test: Total time: 0:00:46 (0.9380 s / it)
* Acc@1 64.014 Acc@5 85.518 loss 1.655
Accuracy of the model on the 50000 test images: 64.0%
Max accuracy: 65.47%
Epoch: [155] [ 0/625] eta: 3:34:09 lr: 0.002113 min_lr: 0.002113 loss: 2.5456 (2.5456) class_acc: 0.6055 (0.6055) weight_decay: 0.0500 (0.0500) time: 20.5594 data: 18.0482 max mem: 6925
Epoch: [155] [200/625] eta: 0:14:25 lr: 0.002105 min_lr: 0.002105 loss: 2.5309 (2.5355) class_acc: 0.6172 (0.6284) weight_decay: 0.0500 (0.0500) grad_norm: 1.0270 (1.0912) time: 1.9489 data: 0.0011 max mem: 6925
Epoch: [155] [400/625] eta: 0:07:41 lr: 0.002098 min_lr: 0.002098 loss: 2.5888 (2.5510) class_acc: 0.6094 (0.6246) weight_decay: 0.0500 (0.0500) grad_norm: 1.1842 (1.0685) time: 1.9740 data: 0.0010 max mem: 6925
Epoch: [155] [600/625] eta: 0:00:51 lr: 0.002091 min_lr: 0.002091 loss: 2.5673 (2.5610) class_acc: 0.6211 (0.6216) weight_decay: 0.0500 (0.0500) grad_norm: 0.9516 (1.0737) time: 2.1088 data: 0.0009 max mem: 6925
Epoch: [155] [624/625] eta: 0:00:02 lr: 0.002090 min_lr: 0.002090 loss: 2.5894 (2.5617) class_acc: 0.6172 (0.6214) weight_decay: 0.0500 (0.0500) grad_norm: 0.9305 (1.0683) time: 0.6359 data: 0.0026 max mem: 6925
Epoch: [155] Total time: 0:21:11 (2.0344 s / it)
Averaged stats: lr: 0.002090 min_lr: 0.002090 loss: 2.5894 (2.5577) class_acc: 0.6172 (0.6227) weight_decay: 0.0500 (0.0500) grad_norm: 0.9305 (1.0683)
Test: [ 0/50] eta: 0:10:17 loss: 1.6603 (1.6603) acc1: 58.4000 (58.4000) acc5: 85.6000 (85.6000) time: 12.3505 data: 12.3148 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 1.4916 (1.4346) acc1: 70.4000 (69.0909) acc5: 86.4000 (87.2727) time: 1.9500 data: 1.9200 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.5866 (1.5701) acc1: 64.8000 (65.6000) acc5: 86.4000 (86.2095) time: 0.9686 data: 0.9395 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.7308 (1.6125) acc1: 61.6000 (64.3613) acc5: 85.6000 (85.8065) time: 1.0051 data: 0.9759 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6501 (1.6231) acc1: 62.4000 (63.9805) acc5: 85.6000 (85.5805) time: 0.9429 data: 0.9136 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6850 (1.6310) acc1: 63.2000 (63.8880) acc5: 85.6000 (85.4720) time: 0.9190 data: 0.8900 max mem: 6925
Test: Total time: 0:00:52 (1.0527 s / it)
* Acc@1 64.554 Acc@5 85.948 loss 1.596
Accuracy of the model on the 50000 test images: 64.6%
Max accuracy: 65.47%
Epoch: [156] [ 0/625] eta: 4:09:01 lr: 0.002090 min_lr: 0.002090 loss: 2.4642 (2.4642) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 23.9061 data: 17.1180 max mem: 6925
Epoch: [156] [200/625] eta: 0:14:54 lr: 0.002083 min_lr: 0.002083 loss: 2.4674 (2.5360) class_acc: 0.6406 (0.6276) weight_decay: 0.0500 (0.0500) grad_norm: 0.9892 (1.0886) time: 1.9838 data: 0.0008 max mem: 6925
Epoch: [156] [400/625] eta: 0:07:45 lr: 0.002076 min_lr: 0.002076 loss: 2.6384 (2.5472) class_acc: 0.6055 (0.6253) weight_decay: 0.0500 (0.0500) grad_norm: 0.9942 (1.0931) time: 2.1221 data: 0.0008 max mem: 6925
Epoch: [156] [600/625] eta: 0:00:51 lr: 0.002069 min_lr: 0.002069 loss: 2.5728 (2.5541) class_acc: 0.6289 (0.6239) weight_decay: 0.0500 (0.0500) grad_norm: 0.8935 (1.0644) time: 2.2949 data: 0.0007 max mem: 6925
Epoch: [156] [624/625] eta: 0:00:02 lr: 0.002068 min_lr: 0.002068 loss: 2.5391 (2.5533) class_acc: 0.6211 (0.6240) weight_decay: 0.0500 (0.0500) grad_norm: 0.8911 (1.0586) time: 0.7333 data: 0.0014 max mem: 6925
Epoch: [156] Total time: 0:20:56 (2.0106 s / it)
Averaged stats: lr: 0.002068 min_lr: 0.002068 loss: 2.5391 (2.5517) class_acc: 0.6211 (0.6241) weight_decay: 0.0500 (0.0500) grad_norm: 0.8911 (1.0586)
Test: [ 0/50] eta: 0:10:37 loss: 1.6058 (1.6058) acc1: 64.8000 (64.8000) acc5: 86.4000 (86.4000) time: 12.7427 data: 12.7077 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 1.3902 (1.4301) acc1: 69.6000 (68.9455) acc5: 88.8000 (88.1455) time: 1.8037 data: 1.7742 max mem: 6925
Test: [20/50] eta: 0:00:39 loss: 1.5494 (1.5862) acc1: 67.2000 (65.9048) acc5: 87.2000 (86.8191) time: 0.7411 data: 0.7117 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.7648 (1.6271) acc1: 61.6000 (64.5936) acc5: 84.0000 (86.0645) time: 0.8378 data: 0.8066 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.5815 (1.6244) acc1: 62.4000 (64.6439) acc5: 84.8000 (85.7366) time: 0.8338 data: 0.8022 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5815 (1.6342) acc1: 64.0000 (64.3680) acc5: 84.8000 (85.4880) time: 0.5630 data: 0.5334 max mem: 6925
Test: Total time: 0:00:47 (0.9597 s / it)
* Acc@1 64.918 Acc@5 86.286 loss 1.595
Accuracy of the model on the 50000 test images: 64.9%
Max accuracy: 65.47%
Epoch: [157] [ 0/625] eta: 3:43:03 lr: 0.002068 min_lr: 0.002068 loss: 2.4752 (2.4752) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 21.4142 data: 18.4439 max mem: 6925
Epoch: [157] [200/625] eta: 0:14:37 lr: 0.002061 min_lr: 0.002061 loss: 2.5596 (2.5354) class_acc: 0.6133 (0.6287) weight_decay: 0.0500 (0.0500) grad_norm: 0.9693 (1.0217) time: 1.9715 data: 0.5731 max mem: 6925
Epoch: [157] [400/625] eta: 0:07:40 lr: 0.002053 min_lr: 0.002053 loss: 2.5221 (2.5413) class_acc: 0.6172 (0.6273) weight_decay: 0.0500 (0.0500) grad_norm: 0.9001 (1.0397) time: 2.0494 data: 0.0011 max mem: 6925
Epoch: [157] [600/625] eta: 0:00:51 lr: 0.002046 min_lr: 0.002046 loss: 2.6068 (2.5464) class_acc: 0.6172 (0.6264) weight_decay: 0.0500 (0.0500) grad_norm: 1.0004 (1.0471) time: 2.0963 data: 0.0010 max mem: 6925
Epoch: [157] [624/625] eta: 0:00:01 lr: 0.002045 min_lr: 0.002045 loss: 2.5240 (2.5472) class_acc: 0.6211 (0.6261) weight_decay: 0.0500 (0.0500) grad_norm: 1.0290 (1.0539) time: 0.8550 data: 0.0023 max mem: 6925
Epoch: [157] Total time: 0:20:50 (2.0003 s / it)
Averaged stats: lr: 0.002045 min_lr: 0.002045 loss: 2.5240 (2.5517) class_acc: 0.6211 (0.6245) weight_decay: 0.0500 (0.0500) grad_norm: 1.0290 (1.0539)
Test: [ 0/50] eta: 0:09:16 loss: 1.4659 (1.4659) acc1: 68.8000 (68.8000) acc5: 88.0000 (88.0000) time: 11.1221 data: 11.0604 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.4659 (1.5133) acc1: 68.8000 (68.0000) acc5: 87.2000 (86.5455) time: 1.9700 data: 1.9374 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.7117 (1.6545) acc1: 64.8000 (63.8857) acc5: 85.6000 (85.5238) time: 1.1036 data: 1.0738 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.8011 (1.6876) acc1: 59.2000 (62.9419) acc5: 84.8000 (84.9290) time: 1.1375 data: 1.1079 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6870 (1.7011) acc1: 59.2000 (62.7317) acc5: 83.2000 (84.8000) time: 0.9601 data: 0.9305 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.7467 (1.7071) acc1: 60.8000 (62.6400) acc5: 84.0000 (84.5920) time: 0.7852 data: 0.7555 max mem: 6925
Test: Total time: 0:00:54 (1.0834 s / it)
* Acc@1 63.388 Acc@5 85.068 loss 1.663
Accuracy of the model on the 50000 test images: 63.4%
Max accuracy: 65.47%
Epoch: [158] [ 0/625] eta: 4:07:47 lr: 0.002045 min_lr: 0.002045 loss: 2.5032 (2.5032) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 23.7880 data: 23.5604 max mem: 6925
Epoch: [158] [200/625] eta: 0:14:45 lr: 0.002038 min_lr: 0.002038 loss: 2.5524 (2.5310) class_acc: 0.6328 (0.6276) weight_decay: 0.0500 (0.0500) grad_norm: 0.9257 (1.0043) time: 2.0104 data: 0.0520 max mem: 6925
Epoch: [158] [400/625] eta: 0:07:43 lr: 0.002031 min_lr: 0.002031 loss: 2.5150 (2.5427) class_acc: 0.6211 (0.6262) weight_decay: 0.0500 (0.0500) grad_norm: 0.8560 (0.9817) time: 2.1061 data: 0.0009 max mem: 6925
Epoch: [158] [600/625] eta: 0:00:51 lr: 0.002024 min_lr: 0.002024 loss: 2.5395 (2.5462) class_acc: 0.6211 (0.6252) weight_decay: 0.0500 (0.0500) grad_norm: 0.9539 (0.9970) time: 2.0712 data: 0.0009 max mem: 6925
Epoch: [158] [624/625] eta: 0:00:01 lr: 0.002023 min_lr: 0.002023 loss: 2.5152 (2.5453) class_acc: 0.6289 (0.6255) weight_decay: 0.0500 (0.0500) grad_norm: 1.0168 (0.9985) time: 0.8585 data: 0.0017 max mem: 6925
Epoch: [158] Total time: 0:20:50 (2.0008 s / it)
Averaged stats: lr: 0.002023 min_lr: 0.002023 loss: 2.5152 (2.5464) class_acc: 0.6289 (0.6250) weight_decay: 0.0500 (0.0500) grad_norm: 1.0168 (0.9985)
Test: [ 0/50] eta: 0:10:28 loss: 1.5513 (1.5513) acc1: 61.6000 (61.6000) acc5: 89.6000 (89.6000) time: 12.5743 data: 12.5397 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.5513 (1.4818) acc1: 65.6000 (67.1273) acc5: 87.2000 (87.0546) time: 2.1253 data: 2.0955 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.6438 (1.6228) acc1: 62.4000 (63.8857) acc5: 85.6000 (85.3333) time: 1.1508 data: 1.1216 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.6822 (1.6308) acc1: 60.0000 (63.5097) acc5: 83.2000 (85.1097) time: 1.1822 data: 1.1533 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6667 (1.6560) acc1: 58.4000 (63.0829) acc5: 84.0000 (84.8390) time: 0.8395 data: 0.8109 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6671 (1.6542) acc1: 62.4000 (63.2640) acc5: 84.0000 (84.7360) time: 0.7907 data: 0.7598 max mem: 6925
Test: Total time: 0:00:53 (1.0603 s / it)
* Acc@1 64.072 Acc@5 85.496 loss 1.608
Accuracy of the model on the 50000 test images: 64.1%
Max accuracy: 65.47%
Epoch: [159] [ 0/625] eta: 3:38:37 lr: 0.002023 min_lr: 0.002023 loss: 2.6063 (2.6063) class_acc: 0.6211 (0.6211) weight_decay: 0.0500 (0.0500) time: 20.9885 data: 15.6022 max mem: 6925
Epoch: [159] [200/625] eta: 0:14:05 lr: 0.002016 min_lr: 0.002016 loss: 2.4711 (2.5295) class_acc: 0.6289 (0.6285) weight_decay: 0.0500 (0.0500) grad_norm: 0.8929 (1.0459) time: 1.9259 data: 0.0007 max mem: 6925
Epoch: [159] [400/625] eta: 0:07:18 lr: 0.002009 min_lr: 0.002009 loss: 2.5708 (2.5349) class_acc: 0.6211 (0.6284) weight_decay: 0.0500 (0.0500) grad_norm: 0.9645 (1.0484) time: 2.0860 data: 0.0010 max mem: 6925
Epoch: [159] [600/625] eta: 0:00:48 lr: 0.002001 min_lr: 0.002001 loss: 2.6015 (2.5456) class_acc: 0.6211 (0.6257) weight_decay: 0.0500 (0.0500) grad_norm: 1.1267 (1.0607) time: 1.9104 data: 0.0007 max mem: 6925
Epoch: [159] [624/625] eta: 0:00:01 lr: 0.002001 min_lr: 0.002001 loss: 2.5432 (2.5462) class_acc: 0.6172 (0.6255) weight_decay: 0.0500 (0.0500) grad_norm: 1.2046 (1.0645) time: 0.9444 data: 0.0014 max mem: 6925
Epoch: [159] Total time: 0:19:50 (1.9044 s / it)
Averaged stats: lr: 0.002001 min_lr: 0.002001 loss: 2.5432 (2.5446) class_acc: 0.6172 (0.6256) weight_decay: 0.0500 (0.0500) grad_norm: 1.2046 (1.0645)
Test: [ 0/50] eta: 0:10:35 loss: 1.5527 (1.5527) acc1: 67.2000 (67.2000) acc5: 86.4000 (86.4000) time: 12.7044 data: 12.6732 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.3674 (1.4655) acc1: 67.2000 (67.1273) acc5: 88.0000 (87.2727) time: 2.1563 data: 2.1267 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.5693 (1.6006) acc1: 64.0000 (64.4571) acc5: 85.6000 (85.9048) time: 1.1534 data: 1.1234 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.7508 (1.6331) acc1: 61.6000 (63.9484) acc5: 84.8000 (85.2903) time: 1.1630 data: 1.1334 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6921 (1.6345) acc1: 61.6000 (63.7268) acc5: 84.8000 (85.2878) time: 0.7518 data: 0.7226 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6125 (1.6223) acc1: 62.4000 (63.6800) acc5: 84.8000 (85.3760) time: 0.7523 data: 0.7225 max mem: 6925
Test: Total time: 0:00:51 (1.0248 s / it)
* Acc@1 64.810 Acc@5 86.062 loss 1.575
Accuracy of the model on the 50000 test images: 64.8%
Max accuracy: 65.47%
Epoch: [160] [ 0/625] eta: 3:21:45 lr: 0.002001 min_lr: 0.002001 loss: 2.5206 (2.5206) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 19.3693 data: 19.0294 max mem: 6925
Epoch: [160] [200/625] eta: 0:13:34 lr: 0.001993 min_lr: 0.001993 loss: 2.5035 (2.5172) class_acc: 0.6289 (0.6324) weight_decay: 0.0500 (0.0500) grad_norm: 0.8340 (0.9378) time: 1.8301 data: 0.0008 max mem: 6925
Epoch: [160] [400/625] eta: 0:07:08 lr: 0.001986 min_lr: 0.001986 loss: 2.5414 (2.5335) class_acc: 0.6172 (0.6277) weight_decay: 0.0500 (0.0500) grad_norm: 1.0086 (1.0261) time: 1.8193 data: 0.0007 max mem: 6925
Epoch: [160] [600/625] eta: 0:00:47 lr: 0.001979 min_lr: 0.001979 loss: 2.5899 (2.5371) class_acc: 0.6133 (0.6272) weight_decay: 0.0500 (0.0500) grad_norm: 0.9039 (1.0208) time: 1.8348 data: 0.0007 max mem: 6925
Epoch: [160] [624/625] eta: 0:00:01 lr: 0.001978 min_lr: 0.001978 loss: 2.5571 (2.5383) class_acc: 0.6133 (0.6267) weight_decay: 0.0500 (0.0500) grad_norm: 0.9703 (1.0199) time: 0.7479 data: 0.0016 max mem: 6925
Epoch: [160] Total time: 0:19:36 (1.8820 s / it)
Averaged stats: lr: 0.001978 min_lr: 0.001978 loss: 2.5571 (2.5399) class_acc: 0.6133 (0.6268) weight_decay: 0.0500 (0.0500) grad_norm: 0.9703 (1.0199)
Test: [ 0/50] eta: 0:10:17 loss: 1.1750 (1.1750) acc1: 73.6000 (73.6000) acc5: 92.8000 (92.8000) time: 12.3419 data: 12.3036 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.3401 (1.3459) acc1: 70.4000 (70.4727) acc5: 88.8000 (88.4364) time: 2.0256 data: 1.9951 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.4268 (1.4700) acc1: 67.2000 (67.1619) acc5: 88.0000 (87.6191) time: 1.0095 data: 0.9805 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.5992 (1.5077) acc1: 64.8000 (66.1419) acc5: 88.0000 (87.4065) time: 0.8724 data: 0.8425 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.5867 (1.5188) acc1: 62.4000 (65.6976) acc5: 87.2000 (87.1805) time: 0.5465 data: 0.5158 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5867 (1.5230) acc1: 61.6000 (65.5200) acc5: 86.4000 (86.9920) time: 0.5215 data: 0.4908 max mem: 6925
Test: Total time: 0:00:45 (0.9064 s / it)
* Acc@1 66.550 Acc@5 87.162 loss 1.484
Accuracy of the model on the 50000 test images: 66.6%
Max accuracy: 66.55%
Epoch: [161] [ 0/625] eta: 3:47:43 lr: 0.001978 min_lr: 0.001978 loss: 2.5681 (2.5681) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 21.8612 data: 21.6234 max mem: 6925
Epoch: [161] [200/625] eta: 0:14:10 lr: 0.001971 min_lr: 0.001971 loss: 2.5321 (2.5202) class_acc: 0.6250 (0.6318) weight_decay: 0.0500 (0.0500) grad_norm: 1.0058 (1.0382) time: 1.9745 data: 0.1050 max mem: 6925
Epoch: [161] [400/625] eta: 0:07:26 lr: 0.001964 min_lr: 0.001964 loss: 2.5080 (2.5260) class_acc: 0.6211 (0.6303) weight_decay: 0.0500 (0.0500) grad_norm: 1.0053 (1.0411) time: 1.7680 data: 0.0014 max mem: 6925
Epoch: [161] [600/625] eta: 0:00:49 lr: 0.001956 min_lr: 0.001956 loss: 2.5305 (2.5335) class_acc: 0.6289 (0.6296) weight_decay: 0.0500 (0.0500) grad_norm: 0.8970 (1.0253) time: 2.0568 data: 0.0007 max mem: 6925
Epoch: [161] [624/625] eta: 0:00:01 lr: 0.001956 min_lr: 0.001956 loss: 2.5508 (2.5340) class_acc: 0.6328 (0.6295) weight_decay: 0.0500 (0.0500) grad_norm: 0.9629 (1.0303) time: 0.9522 data: 0.0013 max mem: 6925
Epoch: [161] Total time: 0:20:18 (1.9501 s / it)
Averaged stats: lr: 0.001956 min_lr: 0.001956 loss: 2.5508 (2.5395) class_acc: 0.6328 (0.6273) weight_decay: 0.0500 (0.0500) grad_norm: 0.9629 (1.0303)
Test: [ 0/50] eta: 0:10:45 loss: 1.4367 (1.4367) acc1: 67.2000 (67.2000) acc5: 88.0000 (88.0000) time: 12.9031 data: 12.8700 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.4213 (1.4311) acc1: 69.6000 (68.3636) acc5: 88.0000 (87.4909) time: 2.1245 data: 2.0949 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.4710 (1.5614) acc1: 64.8000 (64.9905) acc5: 87.2000 (86.4381) time: 1.1090 data: 1.0790 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7322 (1.6087) acc1: 60.8000 (64.4129) acc5: 83.2000 (85.8065) time: 1.1533 data: 1.1236 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6400 (1.6034) acc1: 60.8000 (64.4488) acc5: 84.8000 (85.8927) time: 1.0469 data: 1.0181 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6400 (1.6230) acc1: 61.6000 (64.2240) acc5: 84.8000 (85.6640) time: 0.9582 data: 0.9290 max mem: 6925
Test: Total time: 0:00:57 (1.1592 s / it)
* Acc@1 65.040 Acc@5 86.276 loss 1.575
Accuracy of the model on the 50000 test images: 65.0%
Max accuracy: 66.55%
Epoch: [162] [ 0/625] eta: 3:51:15 lr: 0.001956 min_lr: 0.001956 loss: 2.4522 (2.4522) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 22.2004 data: 18.2576 max mem: 6925
Epoch: [162] [200/625] eta: 0:14:33 lr: 0.001948 min_lr: 0.001948 loss: 2.5799 (2.5253) class_acc: 0.6289 (0.6315) weight_decay: 0.0500 (0.0500) grad_norm: 0.8456 (0.9900) time: 2.0275 data: 0.9552 max mem: 6925
Epoch: [162] [400/625] eta: 0:07:35 lr: 0.001941 min_lr: 0.001941 loss: 2.5380 (2.5281) class_acc: 0.6250 (0.6298) weight_decay: 0.0500 (0.0500) grad_norm: 0.8820 (1.0294) time: 2.1635 data: 0.0008 max mem: 6925
Epoch: [162] [600/625] eta: 0:00:50 lr: 0.001934 min_lr: 0.001934 loss: 2.5712 (2.5311) class_acc: 0.6172 (0.6292) weight_decay: 0.0500 (0.0500) grad_norm: 0.9862 (1.0255) time: 2.0562 data: 0.0026 max mem: 6925
Epoch: [162] [624/625] eta: 0:00:01 lr: 0.001933 min_lr: 0.001933 loss: 2.5328 (2.5329) class_acc: 0.6250 (0.6289) weight_decay: 0.0500 (0.0500) grad_norm: 0.9398 (1.0250) time: 0.7701 data: 0.0014 max mem: 6925
Epoch: [162] Total time: 0:20:34 (1.9748 s / it)
Averaged stats: lr: 0.001933 min_lr: 0.001933 loss: 2.5328 (2.5342) class_acc: 0.6250 (0.6280) weight_decay: 0.0500 (0.0500) grad_norm: 0.9398 (1.0250)
Test: [ 0/50] eta: 0:09:57 loss: 1.3815 (1.3815) acc1: 68.0000 (68.0000) acc5: 89.6000 (89.6000) time: 11.9518 data: 11.9060 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.3604 (1.3744) acc1: 72.0000 (71.7818) acc5: 88.8000 (88.4364) time: 1.9115 data: 1.8804 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.5445 (1.5647) acc1: 65.6000 (66.5143) acc5: 87.2000 (86.4381) time: 0.9249 data: 0.8957 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.7683 (1.6151) acc1: 61.6000 (64.9290) acc5: 83.2000 (85.4452) time: 0.8880 data: 0.8566 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6305 (1.6077) acc1: 61.6000 (64.8976) acc5: 84.8000 (85.6390) time: 0.7756 data: 0.7443 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6305 (1.6331) acc1: 61.6000 (64.1120) acc5: 85.6000 (85.3920) time: 0.5379 data: 0.5091 max mem: 6925
Test: Total time: 0:00:49 (0.9808 s / it)
* Acc@1 64.730 Acc@5 85.946 loss 1.600
Accuracy of the model on the 50000 test images: 64.7%
Max accuracy: 66.55%
Epoch: [163] [ 0/625] eta: 3:34:30 lr: 0.001933 min_lr: 0.001933 loss: 2.5607 (2.5607) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 20.5929 data: 20.3562 max mem: 6925
Epoch: [163] [200/625] eta: 0:14:12 lr: 0.001926 min_lr: 0.001926 loss: 2.5172 (2.5132) class_acc: 0.6211 (0.6342) weight_decay: 0.0500 (0.0500) grad_norm: 0.9413 (1.0164) time: 1.9403 data: 1.4431 max mem: 6925
Epoch: [163] [400/625] eta: 0:07:20 lr: 0.001919 min_lr: 0.001919 loss: 2.5451 (2.5233) class_acc: 0.6289 (0.6323) weight_decay: 0.0500 (0.0500) grad_norm: 0.9294 (1.0036) time: 1.8274 data: 1.4438 max mem: 6925
Epoch: [163] [600/625] eta: 0:00:49 lr: 0.001912 min_lr: 0.001912 loss: 2.5671 (2.5296) class_acc: 0.6211 (0.6302) weight_decay: 0.0500 (0.0500) grad_norm: 0.8911 (0.9930) time: 1.8985 data: 1.1838 max mem: 6925
Epoch: [163] [624/625] eta: 0:00:01 lr: 0.001911 min_lr: 0.001911 loss: 2.5004 (2.5294) class_acc: 0.6406 (0.6303) weight_decay: 0.0500 (0.0500) grad_norm: 0.9101 (0.9920) time: 0.8899 data: 0.4185 max mem: 6925
Epoch: [163] Total time: 0:19:59 (1.9194 s / it)
Averaged stats: lr: 0.001911 min_lr: 0.001911 loss: 2.5004 (2.5323) class_acc: 0.6406 (0.6286) weight_decay: 0.0500 (0.0500) grad_norm: 0.9101 (0.9920)
Test: [ 0/50] eta: 0:10:11 loss: 1.3265 (1.3265) acc1: 70.4000 (70.4000) acc5: 87.2000 (87.2000) time: 12.2261 data: 12.1934 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.3157 (1.3252) acc1: 71.2000 (71.3455) acc5: 89.6000 (88.2909) time: 2.2026 data: 2.1731 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.4154 (1.4818) acc1: 68.8000 (67.9238) acc5: 88.0000 (87.0095) time: 1.2427 data: 1.2136 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.6129 (1.5407) acc1: 64.0000 (65.8581) acc5: 86.4000 (86.1936) time: 1.0562 data: 1.0275 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6129 (1.5773) acc1: 61.6000 (65.1317) acc5: 84.0000 (85.6585) time: 0.5643 data: 0.5336 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5612 (1.5637) acc1: 63.2000 (64.8480) acc5: 85.6000 (85.8400) time: 0.4690 data: 0.4377 max mem: 6925
Test: Total time: 0:00:49 (0.9849 s / it)
* Acc@1 65.386 Acc@5 86.598 loss 1.519
Accuracy of the model on the 50000 test images: 65.4%
Max accuracy: 66.55%
Epoch: [164] [ 0/625] eta: 3:38:46 lr: 0.001911 min_lr: 0.001911 loss: 2.4915 (2.4915) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 21.0016 data: 17.9094 max mem: 6925
Epoch: [164] [200/625] eta: 0:13:39 lr: 0.001904 min_lr: 0.001904 loss: 2.5121 (2.5126) class_acc: 0.6328 (0.6319) weight_decay: 0.0500 (0.0500) grad_norm: 1.0359 (1.1014) time: 1.7099 data: 0.0008 max mem: 6925
Epoch: [164] [400/625] eta: 0:07:10 lr: 0.001896 min_lr: 0.001896 loss: 2.4859 (2.5220) class_acc: 0.6250 (0.6293) weight_decay: 0.0500 (0.0500) grad_norm: 0.9972 (1.0721) time: 1.8915 data: 0.0008 max mem: 6925
Epoch: [164] [600/625] eta: 0:00:48 lr: 0.001889 min_lr: 0.001889 loss: 2.5427 (2.5240) class_acc: 0.6211 (0.6293) weight_decay: 0.0500 (0.0500) grad_norm: 0.8485 (1.0385) time: 2.0742 data: 0.0009 max mem: 6925
Epoch: [164] [624/625] eta: 0:00:01 lr: 0.001888 min_lr: 0.001888 loss: 2.5059 (2.5245) class_acc: 0.6289 (0.6291) weight_decay: 0.0500 (0.0500) grad_norm: 0.9205 (1.0463) time: 0.5478 data: 0.0015 max mem: 6925
Epoch: [164] Total time: 0:19:51 (1.9057 s / it)
Averaged stats: lr: 0.001888 min_lr: 0.001888 loss: 2.5059 (2.5299) class_acc: 0.6289 (0.6292) weight_decay: 0.0500 (0.0500) grad_norm: 0.9205 (1.0463)
Test: [ 0/50] eta: 0:10:38 loss: 1.6336 (1.6336) acc1: 64.0000 (64.0000) acc5: 84.0000 (84.0000) time: 12.7619 data: 12.7207 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.3138 (1.3934) acc1: 70.4000 (69.2364) acc5: 89.6000 (88.0727) time: 2.1088 data: 2.0787 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.6389 (1.5975) acc1: 63.2000 (64.1524) acc5: 85.6000 (85.8667) time: 1.1048 data: 1.0759 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.7256 (1.6131) acc1: 61.6000 (63.9742) acc5: 83.2000 (85.5742) time: 1.1319 data: 1.1022 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.6043 (1.6147) acc1: 63.2000 (64.1561) acc5: 84.8000 (85.4244) time: 0.9727 data: 0.9423 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5981 (1.6162) acc1: 63.2000 (64.2400) acc5: 85.6000 (85.4400) time: 0.8953 data: 0.8643 max mem: 6925
Test: Total time: 0:00:55 (1.1111 s / it)
* Acc@1 64.722 Acc@5 86.218 loss 1.574
Accuracy of the model on the 50000 test images: 64.7%
Max accuracy: 66.55%
Epoch: [165] [ 0/625] eta: 3:35:03 lr: 0.001888 min_lr: 0.001888 loss: 2.4737 (2.4737) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 20.6458 data: 20.4116 max mem: 6925
Epoch: [165] [200/625] eta: 0:14:08 lr: 0.001881 min_lr: 0.001881 loss: 2.5057 (2.5033) class_acc: 0.6406 (0.6363) weight_decay: 0.0500 (0.0500) grad_norm: 1.0560 (1.0911) time: 1.8754 data: 0.0265 max mem: 6925
Epoch: [165] [400/625] eta: 0:07:20 lr: 0.001874 min_lr: 0.001874 loss: 2.5157 (2.5139) class_acc: 0.6289 (0.6329) weight_decay: 0.0500 (0.0500) grad_norm: 0.9167 (1.0607) time: 2.0913 data: 0.0659 max mem: 6925
Epoch: [165] [600/625] eta: 0:00:48 lr: 0.001867 min_lr: 0.001867 loss: 2.5034 (2.5207) class_acc: 0.6367 (0.6326) weight_decay: 0.0500 (0.0500) grad_norm: 0.9940 (1.0384) time: 1.8785 data: 0.0007 max mem: 6925
Epoch: [165] [624/625] eta: 0:00:01 lr: 0.001866 min_lr: 0.001866 loss: 2.5504 (2.5231) class_acc: 0.6289 (0.6323) weight_decay: 0.0500 (0.0500) grad_norm: 1.0455 (1.0500) time: 0.9613 data: 0.0015 max mem: 6925
Epoch: [165] Total time: 0:19:47 (1.9002 s / it)
Averaged stats: lr: 0.001866 min_lr: 0.001866 loss: 2.5504 (2.5252) class_acc: 0.6289 (0.6306) weight_decay: 0.0500 (0.0500) grad_norm: 1.0455 (1.0500)
Test: [ 0/50] eta: 0:10:31 loss: 1.2113 (1.2113) acc1: 72.0000 (72.0000) acc5: 90.4000 (90.4000) time: 12.6393 data: 12.6046 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.3179 (1.3516) acc1: 71.2000 (71.2727) acc5: 88.8000 (87.7818) time: 2.1925 data: 2.1622 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.4900 (1.5211) acc1: 65.6000 (66.2476) acc5: 87.2000 (86.5905) time: 1.1699 data: 1.1408 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.5704 (1.5601) acc1: 62.4000 (65.1613) acc5: 85.6000 (86.0129) time: 0.9852 data: 0.9565 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.5704 (1.5770) acc1: 64.0000 (64.9171) acc5: 85.6000 (85.9512) time: 0.5532 data: 0.5239 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4970 (1.5717) acc1: 65.6000 (64.9600) acc5: 87.2000 (85.9520) time: 0.4910 data: 0.4606 max mem: 6925
Test: Total time: 0:00:47 (0.9564 s / it)
* Acc@1 65.374 Acc@5 86.632 loss 1.536
Accuracy of the model on the 50000 test images: 65.4%
Max accuracy: 66.55%
Epoch: [166] [ 0/625] eta: 3:30:48 lr: 0.001866 min_lr: 0.001866 loss: 2.4240 (2.4240) class_acc: 0.6797 (0.6797) weight_decay: 0.0500 (0.0500) time: 20.2376 data: 20.0107 max mem: 6925
Epoch: [166] [200/625] eta: 0:13:52 lr: 0.001859 min_lr: 0.001859 loss: 2.4943 (2.5076) class_acc: 0.6250 (0.6348) weight_decay: 0.0500 (0.0500) grad_norm: 0.9534 (1.0272) time: 1.8576 data: 0.0009 max mem: 6925
Epoch: [166] [400/625] eta: 0:07:05 lr: 0.001852 min_lr: 0.001852 loss: 2.5283 (2.5150) class_acc: 0.6133 (0.6329) weight_decay: 0.0500 (0.0500) grad_norm: 0.8359 (1.0210) time: 1.7891 data: 0.0009 max mem: 6925
Epoch: [166] [600/625] eta: 0:00:46 lr: 0.001844 min_lr: 0.001844 loss: 2.5045 (2.5175) class_acc: 0.6289 (0.6318) weight_decay: 0.0500 (0.0500) grad_norm: 0.8911 (inf) time: 1.9079 data: 0.0008 max mem: 6925
Epoch: [166] [624/625] eta: 0:00:01 lr: 0.001844 min_lr: 0.001844 loss: 2.5020 (2.5169) class_acc: 0.6328 (0.6319) weight_decay: 0.0500 (0.0500) grad_norm: 0.8592 (inf) time: 0.8480 data: 0.0014 max mem: 6925
Epoch: [166] Total time: 0:19:17 (1.8513 s / it)
Averaged stats: lr: 0.001844 min_lr: 0.001844 loss: 2.5020 (2.5218) class_acc: 0.6328 (0.6314) weight_decay: 0.0500 (0.0500) grad_norm: 0.8592 (inf)
Test: [ 0/50] eta: 0:09:55 loss: 1.2154 (1.2154) acc1: 71.2000 (71.2000) acc5: 92.0000 (92.0000) time: 11.9181 data: 11.8873 max mem: 6925
Test: [10/50] eta: 0:01:15 loss: 1.3490 (1.3701) acc1: 71.2000 (70.6182) acc5: 89.6000 (88.6546) time: 1.8822 data: 1.8516 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.4837 (1.5459) acc1: 68.0000 (66.9333) acc5: 86.4000 (86.5524) time: 0.9209 data: 0.8895 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.6521 (1.5597) acc1: 62.4000 (66.2194) acc5: 84.8000 (86.2968) time: 0.8866 data: 0.8549 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.5105 (1.5481) acc1: 64.0000 (66.1463) acc5: 85.6000 (86.1854) time: 0.7006 data: 0.6708 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5105 (1.5557) acc1: 64.8000 (65.8880) acc5: 85.6000 (86.0480) time: 0.5343 data: 0.5057 max mem: 6925
Test: Total time: 0:00:47 (0.9418 s / it)
* Acc@1 65.976 Acc@5 86.888 loss 1.517
Accuracy of the model on the 50000 test images: 66.0%
Max accuracy: 66.55%
Epoch: [167] [ 0/625] eta: 3:57:46 lr: 0.001844 min_lr: 0.001844 loss: 2.4768 (2.4768) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 22.8257 data: 19.4404 max mem: 6925
Epoch: [167] [200/625] eta: 0:14:04 lr: 0.001836 min_lr: 0.001836 loss: 2.5276 (2.5020) class_acc: 0.6289 (0.6340) weight_decay: 0.0500 (0.0500) grad_norm: 0.9607 (1.0280) time: 1.8503 data: 0.0011 max mem: 6925
Epoch: [167] [400/625] eta: 0:07:18 lr: 0.001829 min_lr: 0.001829 loss: 2.4891 (2.5102) class_acc: 0.6367 (0.6336) weight_decay: 0.0500 (0.0500) grad_norm: 1.0703 (1.0428) time: 2.0025 data: 0.0011 max mem: 6925
Epoch: [167] [600/625] eta: 0:00:48 lr: 0.001822 min_lr: 0.001822 loss: 2.6044 (2.5188) class_acc: 0.6211 (0.6322) weight_decay: 0.0500 (0.0500) grad_norm: 1.0070 (1.0456) time: 2.1011 data: 0.0008 max mem: 6925
Epoch: [167] [624/625] eta: 0:00:01 lr: 0.001821 min_lr: 0.001821 loss: 2.4863 (2.5191) class_acc: 0.6289 (0.6322) weight_decay: 0.0500 (0.0500) grad_norm: 1.1512 (1.0495) time: 0.7933 data: 0.0015 max mem: 6925
Epoch: [167] Total time: 0:19:53 (1.9091 s / it)
Averaged stats: lr: 0.001821 min_lr: 0.001821 loss: 2.4863 (2.5187) class_acc: 0.6289 (0.6319) weight_decay: 0.0500 (0.0500) grad_norm: 1.1512 (1.0495)
Test: [ 0/50] eta: 0:08:36 loss: 1.3003 (1.3003) acc1: 68.8000 (68.8000) acc5: 90.4000 (90.4000) time: 10.3225 data: 10.2913 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 1.3536 (1.4132) acc1: 67.2000 (68.4364) acc5: 87.2000 (87.6364) time: 1.9338 data: 1.9036 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.5745 (1.5787) acc1: 64.0000 (64.7238) acc5: 85.6000 (86.1714) time: 1.1347 data: 1.1055 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.6735 (1.5888) acc1: 61.6000 (64.4129) acc5: 84.0000 (86.0387) time: 1.1712 data: 1.1426 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5318 (1.5775) acc1: 63.2000 (64.2537) acc5: 86.4000 (86.0488) time: 0.9892 data: 0.9601 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5667 (1.5990) acc1: 61.6000 (63.8880) acc5: 87.2000 (85.8400) time: 0.7944 data: 0.7653 max mem: 6925
Test: Total time: 0:00:54 (1.0840 s / it)
* Acc@1 65.228 Acc@5 86.412 loss 1.553
Accuracy of the model on the 50000 test images: 65.2%
Max accuracy: 66.55%
Epoch: [168] [ 0/625] eta: 3:56:58 lr: 0.001821 min_lr: 0.001821 loss: 2.4287 (2.4287) class_acc: 0.6602 (0.6602) weight_decay: 0.0500 (0.0500) time: 22.7497 data: 21.2787 max mem: 6925
Epoch: [168] [200/625] eta: 0:14:37 lr: 0.001814 min_lr: 0.001814 loss: 2.4861 (2.4936) class_acc: 0.6406 (0.6393) weight_decay: 0.0500 (0.0500) grad_norm: 0.9756 (1.0472) time: 2.0135 data: 0.0013 max mem: 6925
Epoch: [168] [400/625] eta: 0:07:29 lr: 0.001807 min_lr: 0.001807 loss: 2.4903 (2.5016) class_acc: 0.6445 (0.6370) weight_decay: 0.0500 (0.0500) grad_norm: 0.8675 (1.0188) time: 1.8539 data: 0.0014 max mem: 6925
Epoch: [168] [600/625] eta: 0:00:49 lr: 0.001800 min_lr: 0.001800 loss: 2.5181 (2.5094) class_acc: 0.6250 (0.6341) weight_decay: 0.0500 (0.0500) grad_norm: 1.0959 (1.0309) time: 2.0258 data: 0.0012 max mem: 6925
Epoch: [168] [624/625] eta: 0:00:01 lr: 0.001799 min_lr: 0.001799 loss: 2.5301 (2.5102) class_acc: 0.6133 (0.6337) weight_decay: 0.0500 (0.0500) grad_norm: 1.0246 (1.0300) time: 0.7103 data: 0.0015 max mem: 6925
Epoch: [168] Total time: 0:20:15 (1.9447 s / it)
Averaged stats: lr: 0.001799 min_lr: 0.001799 loss: 2.5301 (2.5161) class_acc: 0.6133 (0.6329) weight_decay: 0.0500 (0.0500) grad_norm: 1.0246 (1.0300)
Test: [ 0/50] eta: 0:09:23 loss: 1.4185 (1.4185) acc1: 69.6000 (69.6000) acc5: 88.8000 (88.8000) time: 11.2682 data: 11.2212 max mem: 6925
Test: [10/50] eta: 0:01:10 loss: 1.4433 (1.4284) acc1: 69.6000 (69.7455) acc5: 88.8000 (87.7818) time: 1.7595 data: 1.7282 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.5575 (1.6001) acc1: 67.2000 (65.2571) acc5: 84.8000 (85.6762) time: 0.8639 data: 0.8340 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.6999 (1.5902) acc1: 62.4000 (64.7484) acc5: 84.0000 (85.4194) time: 1.0284 data: 0.9991 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6055 (1.5819) acc1: 62.4000 (64.8390) acc5: 85.6000 (85.6585) time: 0.8497 data: 0.8211 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5992 (1.5839) acc1: 62.4000 (64.5920) acc5: 85.6000 (85.6160) time: 0.4271 data: 0.3976 max mem: 6925
Test: Total time: 0:00:48 (0.9719 s / it)
* Acc@1 65.378 Acc@5 86.210 loss 1.547
Accuracy of the model on the 50000 test images: 65.4%
Max accuracy: 66.55%
Epoch: [169] [ 0/625] eta: 3:24:31 lr: 0.001799 min_lr: 0.001799 loss: 2.5663 (2.5663) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 19.6350 data: 19.4003 max mem: 6925
Epoch: [169] [200/625] eta: 0:13:53 lr: 0.001792 min_lr: 0.001792 loss: 2.5435 (2.5028) class_acc: 0.6289 (0.6381) weight_decay: 0.0500 (0.0500) grad_norm: 1.0234 (1.0714) time: 1.8810 data: 0.0008 max mem: 6925
Epoch: [169] [400/625] eta: 0:07:15 lr: 0.001785 min_lr: 0.001785 loss: 2.5161 (2.5123) class_acc: 0.6289 (0.6357) weight_decay: 0.0500 (0.0500) grad_norm: 1.0872 (1.0846) time: 1.9841 data: 0.0009 max mem: 6925
Epoch: [169] [600/625] eta: 0:00:48 lr: 0.001777 min_lr: 0.001777 loss: 2.5221 (2.5178) class_acc: 0.6289 (0.6337) weight_decay: 0.0500 (0.0500) grad_norm: 1.0020 (1.0545) time: 2.0078 data: 0.0016 max mem: 6925
Epoch: [169] [624/625] eta: 0:00:01 lr: 0.001777 min_lr: 0.001777 loss: 2.5094 (2.5182) class_acc: 0.6172 (0.6334) weight_decay: 0.0500 (0.0500) grad_norm: 1.0398 (1.0567) time: 0.7753 data: 0.0018 max mem: 6925
Epoch: [169] Total time: 0:19:42 (1.8920 s / it)
Averaged stats: lr: 0.001777 min_lr: 0.001777 loss: 2.5094 (2.5148) class_acc: 0.6172 (0.6330) weight_decay: 0.0500 (0.0500) grad_norm: 1.0398 (1.0567)
Test: [ 0/50] eta: 0:10:28 loss: 1.4150 (1.4150) acc1: 66.4000 (66.4000) acc5: 92.0000 (92.0000) time: 12.5688 data: 12.5307 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.3551 (1.3564) acc1: 70.4000 (69.9636) acc5: 89.6000 (88.9455) time: 2.1360 data: 2.1044 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.4342 (1.5203) acc1: 65.6000 (66.6286) acc5: 87.2000 (86.8952) time: 1.1360 data: 1.1060 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.6804 (1.5739) acc1: 60.8000 (65.1355) acc5: 84.8000 (85.9355) time: 0.9637 data: 0.9347 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.6582 (1.5713) acc1: 62.4000 (64.6829) acc5: 84.8000 (85.9317) time: 0.5270 data: 0.4977 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4933 (1.5645) acc1: 62.4000 (64.9760) acc5: 86.4000 (85.8880) time: 0.4872 data: 0.4581 max mem: 6925
Test: Total time: 0:00:46 (0.9333 s / it)
* Acc@1 65.756 Acc@5 86.700 loss 1.523
Accuracy of the model on the 50000 test images: 65.8%
Max accuracy: 66.55%
Epoch: [170] [ 0/625] eta: 3:33:12 lr: 0.001777 min_lr: 0.001777 loss: 2.4243 (2.4243) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 20.4687 data: 20.2062 max mem: 6925
Epoch: [170] [200/625] eta: 0:14:01 lr: 0.001769 min_lr: 0.001769 loss: 2.5315 (2.4762) class_acc: 0.6328 (0.6424) weight_decay: 0.0500 (0.0500) grad_norm: 0.9769 (1.1140) time: 1.8505 data: 1.1509 max mem: 6925
Epoch: [170] [400/625] eta: 0:07:06 lr: 0.001762 min_lr: 0.001762 loss: 2.4967 (2.5016) class_acc: 0.6367 (0.6362) weight_decay: 0.0500 (0.0500) grad_norm: 0.9134 (1.0817) time: 1.8687 data: 0.0162 max mem: 6925
Epoch: [170] [600/625] eta: 0:00:47 lr: 0.001755 min_lr: 0.001755 loss: 2.5682 (2.5122) class_acc: 0.6094 (0.6329) weight_decay: 0.0500 (0.0500) grad_norm: 1.0235 (1.0611) time: 1.9799 data: 0.6764 max mem: 6925
Epoch: [170] [624/625] eta: 0:00:01 lr: 0.001754 min_lr: 0.001754 loss: 2.5155 (2.5126) class_acc: 0.6211 (0.6328) weight_decay: 0.0500 (0.0500) grad_norm: 1.0472 (1.0645) time: 0.7257 data: 0.1814 max mem: 6925
Epoch: [170] Total time: 0:19:25 (1.8651 s / it)
Averaged stats: lr: 0.001754 min_lr: 0.001754 loss: 2.5155 (2.5095) class_acc: 0.6211 (0.6347) weight_decay: 0.0500 (0.0500) grad_norm: 1.0472 (1.0645)
Test: [ 0/50] eta: 0:10:12 loss: 1.2911 (1.2911) acc1: 72.0000 (72.0000) acc5: 89.6000 (89.6000) time: 12.2427 data: 12.2034 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.2911 (1.3527) acc1: 72.0000 (71.9273) acc5: 89.6000 (89.3091) time: 2.2045 data: 2.1720 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.4271 (1.4904) acc1: 68.0000 (67.9619) acc5: 88.8000 (87.8095) time: 1.2485 data: 1.2176 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.5809 (1.5073) acc1: 64.0000 (67.6129) acc5: 85.6000 (87.4065) time: 1.0449 data: 1.0141 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4410 (1.4943) acc1: 66.4000 (67.7463) acc5: 86.4000 (87.3366) time: 0.5587 data: 0.5287 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4506 (1.5106) acc1: 66.4000 (67.2320) acc5: 87.2000 (87.1360) time: 0.5572 data: 0.5286 max mem: 6925
Test: Total time: 0:00:48 (0.9759 s / it)
* Acc@1 67.558 Acc@5 87.864 loss 1.465
Accuracy of the model on the 50000 test images: 67.6%
Max accuracy: 67.56%
Epoch: [171] [ 0/625] eta: 3:05:03 lr: 0.001754 min_lr: 0.001754 loss: 2.4507 (2.4507) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 17.7649 data: 16.6375 max mem: 6925
Epoch: [171] [200/625] eta: 0:13:44 lr: 0.001747 min_lr: 0.001747 loss: 2.5272 (2.4872) class_acc: 0.6328 (0.6404) weight_decay: 0.0500 (0.0500) grad_norm: 0.9486 (1.0021) time: 1.9001 data: 0.1810 max mem: 6925
Epoch: [171] [400/625] eta: 0:06:59 lr: 0.001740 min_lr: 0.001740 loss: 2.5154 (2.4971) class_acc: 0.6367 (0.6377) weight_decay: 0.0500 (0.0500) grad_norm: 0.9580 (1.0210) time: 1.7360 data: 0.0011 max mem: 6925
Epoch: [171] [600/625] eta: 0:00:46 lr: 0.001733 min_lr: 0.001733 loss: 2.5642 (2.5106) class_acc: 0.6211 (0.6345) weight_decay: 0.0500 (0.0500) grad_norm: 0.9198 (1.0302) time: 1.8571 data: 0.0011 max mem: 6925
Epoch: [171] [624/625] eta: 0:00:01 lr: 0.001732 min_lr: 0.001732 loss: 2.4834 (2.5103) class_acc: 0.6406 (0.6346) weight_decay: 0.0500 (0.0500) grad_norm: 0.9634 (1.0284) time: 0.6650 data: 0.0026 max mem: 6925
Epoch: [171] Total time: 0:19:15 (1.8481 s / it)
Averaged stats: lr: 0.001732 min_lr: 0.001732 loss: 2.4834 (2.5078) class_acc: 0.6406 (0.6345) weight_decay: 0.0500 (0.0500) grad_norm: 0.9634 (1.0284)
Test: [ 0/50] eta: 0:10:58 loss: 1.4215 (1.4215) acc1: 68.0000 (68.0000) acc5: 88.8000 (88.8000) time: 13.1636 data: 13.1213 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.3293 (1.3443) acc1: 70.4000 (70.6909) acc5: 90.4000 (89.3818) time: 2.2114 data: 2.1797 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.4665 (1.4829) acc1: 66.4000 (66.9333) acc5: 89.6000 (88.4191) time: 1.1757 data: 1.1454 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.6137 (1.5214) acc1: 62.4000 (65.9355) acc5: 87.2000 (87.4839) time: 1.1641 data: 1.1348 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4987 (1.5287) acc1: 64.0000 (65.5805) acc5: 86.4000 (87.3561) time: 0.7147 data: 0.6852 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4987 (1.5285) acc1: 64.8000 (65.6000) acc5: 86.4000 (87.1680) time: 0.7153 data: 0.6851 max mem: 6925
Test: Total time: 0:00:51 (1.0299 s / it)
* Acc@1 66.612 Acc@5 87.424 loss 1.482
Accuracy of the model on the 50000 test images: 66.6%
Max accuracy: 67.56%
Epoch: [172] [ 0/625] eta: 3:34:24 lr: 0.001732 min_lr: 0.001732 loss: 2.4703 (2.4703) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 20.5831 data: 16.8213 max mem: 6925
Epoch: [172] [200/625] eta: 0:14:05 lr: 0.001725 min_lr: 0.001725 loss: 2.5050 (2.4850) class_acc: 0.6250 (0.6394) weight_decay: 0.0500 (0.0500) grad_norm: 1.0232 (0.9896) time: 1.9748 data: 0.0009 max mem: 6925
Epoch: [172] [400/625] eta: 0:07:18 lr: 0.001718 min_lr: 0.001718 loss: 2.4599 (2.4902) class_acc: 0.6367 (0.6383) weight_decay: 0.0500 (0.0500) grad_norm: 1.0653 (1.0247) time: 2.0173 data: 0.0009 max mem: 6925
Epoch: [172] [600/625] eta: 0:00:49 lr: 0.001711 min_lr: 0.001711 loss: 2.5237 (2.4984) class_acc: 0.6250 (0.6374) weight_decay: 0.0500 (0.0500) grad_norm: 0.9943 (1.0435) time: 1.8224 data: 0.0011 max mem: 6925
Epoch: [172] [624/625] eta: 0:00:01 lr: 0.001710 min_lr: 0.001710 loss: 2.4924 (2.4993) class_acc: 0.6250 (0.6371) weight_decay: 0.0500 (0.0500) grad_norm: 0.9833 (1.0420) time: 0.8672 data: 0.0017 max mem: 6925
Epoch: [172] Total time: 0:19:57 (1.9167 s / it)
Averaged stats: lr: 0.001710 min_lr: 0.001710 loss: 2.4924 (2.5032) class_acc: 0.6250 (0.6360) weight_decay: 0.0500 (0.0500) grad_norm: 0.9833 (1.0420)
Test: [ 0/50] eta: 0:10:31 loss: 1.4530 (1.4530) acc1: 68.0000 (68.0000) acc5: 88.8000 (88.8000) time: 12.6310 data: 12.5966 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.3654 (1.4069) acc1: 69.6000 (71.2000) acc5: 89.6000 (88.8727) time: 1.8738 data: 1.8420 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.5637 (1.5623) acc1: 68.8000 (67.0095) acc5: 86.4000 (86.5143) time: 0.8146 data: 0.7847 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.6506 (1.5678) acc1: 62.4000 (66.1936) acc5: 84.0000 (86.4000) time: 1.0273 data: 0.9972 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.5587 (1.5787) acc1: 63.2000 (65.8341) acc5: 85.6000 (86.1463) time: 0.9827 data: 0.9515 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5885 (1.5959) acc1: 63.2000 (65.5040) acc5: 85.6000 (85.8880) time: 0.5335 data: 0.5034 max mem: 6925
Test: Total time: 0:00:51 (1.0392 s / it)
* Acc@1 66.030 Acc@5 86.464 loss 1.556
Accuracy of the model on the 50000 test images: 66.0%
Max accuracy: 67.56%
Epoch: [173] [ 0/625] eta: 3:35:26 lr: 0.001710 min_lr: 0.001710 loss: 2.3998 (2.3998) class_acc: 0.6562 (0.6562) weight_decay: 0.0500 (0.0500) time: 20.6831 data: 19.4403 max mem: 6925
Epoch: [173] [200/625] eta: 0:13:56 lr: 0.001703 min_lr: 0.001703 loss: 2.4697 (2.4867) class_acc: 0.6367 (0.6394) weight_decay: 0.0500 (0.0500) grad_norm: 1.0152 (1.0606) time: 1.8453 data: 0.0009 max mem: 6925
Epoch: [173] [400/625] eta: 0:07:12 lr: 0.001696 min_lr: 0.001696 loss: 2.4595 (2.4840) class_acc: 0.6367 (0.6408) weight_decay: 0.0500 (0.0500) grad_norm: 0.8801 (1.0514) time: 1.8720 data: 0.0006 max mem: 6925
Epoch: [173] [600/625] eta: 0:00:47 lr: 0.001689 min_lr: 0.001689 loss: 2.5215 (2.4939) class_acc: 0.6328 (0.6386) weight_decay: 0.0500 (0.0500) grad_norm: 0.9727 (1.0594) time: 1.9397 data: 0.0006 max mem: 6925
Epoch: [173] [624/625] eta: 0:00:01 lr: 0.001688 min_lr: 0.001688 loss: 2.5059 (2.4945) class_acc: 0.6406 (0.6384) weight_decay: 0.0500 (0.0500) grad_norm: 1.1877 (inf) time: 0.6507 data: 0.0026 max mem: 6925
Epoch: [173] Total time: 0:19:40 (1.8885 s / it)
Averaged stats: lr: 0.001688 min_lr: 0.001688 loss: 2.5059 (2.4993) class_acc: 0.6406 (0.6367) weight_decay: 0.0500 (0.0500) grad_norm: 1.1877 (inf)
Test: [ 0/50] eta: 0:10:25 loss: 1.3506 (1.3506) acc1: 68.8000 (68.8000) acc5: 89.6000 (89.6000) time: 12.5071 data: 12.4615 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.2729 (1.2894) acc1: 70.4000 (71.4909) acc5: 89.6000 (89.0909) time: 2.1072 data: 2.0762 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.4197 (1.4320) acc1: 68.0000 (68.2667) acc5: 88.0000 (87.5810) time: 0.9515 data: 0.9218 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.4965 (1.4560) acc1: 64.8000 (67.6903) acc5: 85.6000 (87.2774) time: 0.7629 data: 0.7324 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4755 (1.4633) acc1: 64.8000 (67.0439) acc5: 85.6000 (87.1805) time: 0.5748 data: 0.5447 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4026 (1.4618) acc1: 67.2000 (67.0240) acc5: 87.2000 (87.0560) time: 0.3850 data: 0.3563 max mem: 6925
Test: Total time: 0:00:45 (0.9192 s / it)
* Acc@1 68.100 Acc@5 88.158 loss 1.416
Accuracy of the model on the 50000 test images: 68.1%
Max accuracy: 68.10%
Epoch: [174] [ 0/625] eta: 4:06:10 lr: 0.001688 min_lr: 0.001688 loss: 2.3073 (2.3073) class_acc: 0.7070 (0.7070) weight_decay: 0.0500 (0.0500) time: 23.6334 data: 23.3892 max mem: 6925
Epoch: [174] [200/625] eta: 0:13:49 lr: 0.001681 min_lr: 0.001681 loss: 2.4616 (2.4749) class_acc: 0.6445 (0.6423) weight_decay: 0.0500 (0.0500) grad_norm: 1.0445 (1.0453) time: 1.7440 data: 0.0010 max mem: 6925
Epoch: [174] [400/625] eta: 0:07:12 lr: 0.001674 min_lr: 0.001674 loss: 2.4857 (2.4844) class_acc: 0.6406 (0.6404) weight_decay: 0.0500 (0.0500) grad_norm: 1.0428 (1.0651) time: 1.9697 data: 0.0009 max mem: 6925
Epoch: [174] [600/625] eta: 0:00:48 lr: 0.001666 min_lr: 0.001666 loss: 2.4928 (2.4976) class_acc: 0.6367 (0.6368) weight_decay: 0.0500 (0.0500) grad_norm: 1.0102 (1.0873) time: 2.0140 data: 0.0010 max mem: 6925
Epoch: [174] [624/625] eta: 0:00:01 lr: 0.001666 min_lr: 0.001666 loss: 2.4992 (2.4989) class_acc: 0.6250 (0.6362) weight_decay: 0.0500 (0.0500) grad_norm: 0.9174 (1.0841) time: 0.8190 data: 0.0014 max mem: 6925
Epoch: [174] Total time: 0:19:38 (1.8862 s / it)
Averaged stats: lr: 0.001666 min_lr: 0.001666 loss: 2.4992 (2.4973) class_acc: 0.6250 (0.6367) weight_decay: 0.0500 (0.0500) grad_norm: 0.9174 (1.0841)
Test: [ 0/50] eta: 0:10:46 loss: 1.3113 (1.3113) acc1: 70.4000 (70.4000) acc5: 92.0000 (92.0000) time: 12.9304 data: 12.8878 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.2559 (1.2803) acc1: 70.4000 (70.5455) acc5: 90.4000 (89.8182) time: 2.1732 data: 2.1428 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.3577 (1.4590) acc1: 68.0000 (66.3238) acc5: 88.0000 (87.5429) time: 1.1833 data: 1.1537 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.6208 (1.5009) acc1: 64.0000 (66.3226) acc5: 84.8000 (86.9677) time: 1.2545 data: 1.2253 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.4789 (1.5153) acc1: 66.4000 (66.0293) acc5: 87.2000 (86.6927) time: 0.9237 data: 0.8949 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4682 (1.5237) acc1: 63.2000 (65.8720) acc5: 86.4000 (86.4320) time: 0.8298 data: 0.8005 max mem: 6925
Test: Total time: 0:00:55 (1.1194 s / it)
* Acc@1 66.850 Acc@5 87.302 loss 1.459
Accuracy of the model on the 50000 test images: 66.9%
Max accuracy: 68.10%
Epoch: [175] [ 0/625] eta: 3:41:06 lr: 0.001666 min_lr: 0.001666 loss: 2.5147 (2.5147) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 21.2268 data: 18.1466 max mem: 6925
Epoch: [175] [200/625] eta: 0:14:01 lr: 0.001658 min_lr: 0.001658 loss: 2.5038 (2.4637) class_acc: 0.6367 (0.6462) weight_decay: 0.0500 (0.0500) grad_norm: 1.0026 (1.0995) time: 1.7541 data: 0.0010 max mem: 6925
Epoch: [175] [400/625] eta: 0:07:18 lr: 0.001651 min_lr: 0.001651 loss: 2.5058 (2.4872) class_acc: 0.6289 (0.6403) weight_decay: 0.0500 (0.0500) grad_norm: 1.0474 (1.0810) time: 1.7802 data: 0.0009 max mem: 6925
Epoch: [175] [600/625] eta: 0:00:48 lr: 0.001644 min_lr: 0.001644 loss: 2.5445 (2.4970) class_acc: 0.6250 (0.6379) weight_decay: 0.0500 (0.0500) grad_norm: 1.0581 (1.0739) time: 1.7452 data: 0.0008 max mem: 6925
Epoch: [175] [624/625] eta: 0:00:01 lr: 0.001644 min_lr: 0.001644 loss: 2.4680 (2.4965) class_acc: 0.6367 (0.6378) weight_decay: 0.0500 (0.0500) grad_norm: 1.0653 (1.0736) time: 0.8698 data: 0.0022 max mem: 6925
Epoch: [175] Total time: 0:19:57 (1.9155 s / it)
Averaged stats: lr: 0.001644 min_lr: 0.001644 loss: 2.4680 (2.4923) class_acc: 0.6367 (0.6380) weight_decay: 0.0500 (0.0500) grad_norm: 1.0653 (1.0736)
Test: [ 0/50] eta: 0:10:13 loss: 1.6153 (1.6153) acc1: 65.6000 (65.6000) acc5: 88.0000 (88.0000) time: 12.2676 data: 12.2239 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.3943 (1.4057) acc1: 71.2000 (70.8364) acc5: 88.8000 (88.5818) time: 2.0248 data: 1.9943 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.5726 (1.6204) acc1: 65.6000 (65.5238) acc5: 87.2000 (86.2095) time: 0.8836 data: 0.8547 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.7564 (1.6576) acc1: 59.2000 (64.4645) acc5: 83.2000 (85.5742) time: 0.7168 data: 0.6876 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.6964 (1.6528) acc1: 63.2000 (64.5659) acc5: 83.2000 (85.4634) time: 0.6075 data: 0.5784 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.6611 (1.6540) acc1: 63.2000 (64.2400) acc5: 85.6000 (85.3760) time: 0.4503 data: 0.4216 max mem: 6925
Test: Total time: 0:00:45 (0.9046 s / it)
* Acc@1 65.148 Acc@5 86.008 loss 1.601
Accuracy of the model on the 50000 test images: 65.1%
Max accuracy: 68.10%
Epoch: [176] [ 0/625] eta: 3:54:32 lr: 0.001643 min_lr: 0.001643 loss: 2.4545 (2.4545) class_acc: 0.6602 (0.6602) weight_decay: 0.0500 (0.0500) time: 22.5157 data: 22.2799 max mem: 6925
Epoch: [176] [200/625] eta: 0:14:23 lr: 0.001636 min_lr: 0.001636 loss: 2.4646 (2.4709) class_acc: 0.6367 (0.6422) weight_decay: 0.0500 (0.0500) grad_norm: 0.9484 (1.0752) time: 1.8240 data: 0.0017 max mem: 6925
Epoch: [176] [400/625] eta: 0:07:25 lr: 0.001629 min_lr: 0.001629 loss: 2.4925 (2.4816) class_acc: 0.6289 (0.6397) weight_decay: 0.0500 (0.0500) grad_norm: 1.0483 (1.0697) time: 2.0416 data: 0.0009 max mem: 6925
Epoch: [176] [600/625] eta: 0:00:49 lr: 0.001622 min_lr: 0.001622 loss: 2.4844 (2.4840) class_acc: 0.6328 (0.6397) weight_decay: 0.0500 (0.0500) grad_norm: 1.2783 (1.0762) time: 1.9823 data: 0.0011 max mem: 6925
Epoch: [176] [624/625] eta: 0:00:01 lr: 0.001621 min_lr: 0.001621 loss: 2.5031 (2.4855) class_acc: 0.6367 (0.6395) weight_decay: 0.0500 (0.0500) grad_norm: 1.2215 (1.0819) time: 0.9484 data: 0.0016 max mem: 6925
Epoch: [176] Total time: 0:20:14 (1.9433 s / it)
Averaged stats: lr: 0.001621 min_lr: 0.001621 loss: 2.5031 (2.4911) class_acc: 0.6367 (0.6387) weight_decay: 0.0500 (0.0500) grad_norm: 1.2215 (1.0819)
Test: [ 0/50] eta: 0:10:16 loss: 1.3695 (1.3695) acc1: 72.0000 (72.0000) acc5: 87.2000 (87.2000) time: 12.3376 data: 12.2725 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.3821 (1.4066) acc1: 70.4000 (68.9455) acc5: 88.0000 (88.8000) time: 2.0298 data: 1.9968 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.4753 (1.5421) acc1: 66.4000 (65.9810) acc5: 88.0000 (87.3524) time: 1.0691 data: 1.0399 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.5883 (1.5693) acc1: 61.6000 (65.0323) acc5: 86.4000 (87.2516) time: 1.1002 data: 1.0717 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.6412 (1.5851) acc1: 61.6000 (64.7610) acc5: 84.8000 (86.6732) time: 0.9535 data: 0.9247 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.6453 (1.5983) acc1: 62.4000 (64.5600) acc5: 84.8000 (86.4000) time: 0.9369 data: 0.9081 max mem: 6925
Test: Total time: 0:00:56 (1.1293 s / it)
* Acc@1 66.436 Acc@5 86.998 loss 1.556
Accuracy of the model on the 50000 test images: 66.4%
Max accuracy: 68.10%
Epoch: [177] [ 0/625] eta: 3:36:41 lr: 0.001621 min_lr: 0.001621 loss: 2.4003 (2.4003) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 20.8019 data: 17.8532 max mem: 6925
Epoch: [177] [200/625] eta: 0:14:15 lr: 0.001614 min_lr: 0.001614 loss: 2.4705 (2.4742) class_acc: 0.6445 (0.6419) weight_decay: 0.0500 (0.0500) grad_norm: 0.9797 (1.0857) time: 1.9029 data: 0.0008 max mem: 6925
Epoch: [177] [400/625] eta: 0:07:34 lr: 0.001607 min_lr: 0.001607 loss: 2.4768 (2.4840) class_acc: 0.6367 (0.6397) weight_decay: 0.0500 (0.0500) grad_norm: 1.1451 (1.0923) time: 2.0550 data: 0.0008 max mem: 6925
Epoch: [177] [600/625] eta: 0:00:49 lr: 0.001600 min_lr: 0.001600 loss: 2.4824 (2.4895) class_acc: 0.6484 (0.6387) weight_decay: 0.0500 (0.0500) grad_norm: 0.9721 (1.1109) time: 2.0457 data: 0.0010 max mem: 6925
Epoch: [177] [624/625] eta: 0:00:01 lr: 0.001599 min_lr: 0.001599 loss: 2.5417 (2.4912) class_acc: 0.6211 (0.6383) weight_decay: 0.0500 (0.0500) grad_norm: 1.0184 (1.1131) time: 1.0753 data: 0.0016 max mem: 6925
Epoch: [177] Total time: 0:20:30 (1.9688 s / it)
Averaged stats: lr: 0.001599 min_lr: 0.001599 loss: 2.5417 (2.4872) class_acc: 0.6211 (0.6393) weight_decay: 0.0500 (0.0500) grad_norm: 1.0184 (1.1131)
Test: [ 0/50] eta: 0:10:02 loss: 1.2321 (1.2321) acc1: 72.8000 (72.8000) acc5: 92.8000 (92.8000) time: 12.0450 data: 12.0025 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.3324 (1.3364) acc1: 72.0000 (70.9818) acc5: 88.8000 (88.8000) time: 2.0141 data: 1.9834 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.3502 (1.4618) acc1: 68.8000 (67.0476) acc5: 88.0000 (88.0000) time: 0.9839 data: 0.9543 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.5042 (1.4881) acc1: 63.2000 (66.5290) acc5: 86.4000 (87.5871) time: 0.8758 data: 0.8466 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4946 (1.4802) acc1: 64.0000 (66.7707) acc5: 88.0000 (87.5707) time: 0.6473 data: 0.6181 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.3910 (1.4835) acc1: 66.4000 (66.8160) acc5: 88.0000 (87.4240) time: 0.5279 data: 0.4989 max mem: 6925
Test: Total time: 0:00:48 (0.9607 s / it)
* Acc@1 67.584 Acc@5 88.020 loss 1.441
Accuracy of the model on the 50000 test images: 67.6%
Max accuracy: 68.10%
Epoch: [178] [ 0/625] eta: 3:54:37 lr: 0.001599 min_lr: 0.001599 loss: 2.5665 (2.5665) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 22.5238 data: 17.6322 max mem: 6925
Epoch: [178] [200/625] eta: 0:14:47 lr: 0.001592 min_lr: 0.001592 loss: 2.4793 (2.4721) class_acc: 0.6406 (0.6421) weight_decay: 0.0500 (0.0500) grad_norm: 1.0792 (1.1535) time: 2.5615 data: 0.0011 max mem: 6925
Epoch: [178] [400/625] eta: 0:07:51 lr: 0.001585 min_lr: 0.001585 loss: 2.4222 (2.4765) class_acc: 0.6484 (0.6410) weight_decay: 0.0500 (0.0500) grad_norm: 1.0333 (1.1163) time: 2.2769 data: 0.0013 max mem: 6925
Epoch: [178] [600/625] eta: 0:00:52 lr: 0.001578 min_lr: 0.001578 loss: 2.4850 (2.4819) class_acc: 0.6406 (0.6401) weight_decay: 0.0500 (0.0500) grad_norm: 0.9557 (1.1131) time: 1.7738 data: 0.0011 max mem: 6925
Epoch: [178] [624/625] eta: 0:00:02 lr: 0.001578 min_lr: 0.001578 loss: 2.4605 (2.4812) class_acc: 0.6406 (0.6402) weight_decay: 0.0500 (0.0500) grad_norm: 0.9600 (1.1076) time: 0.7357 data: 0.0024 max mem: 6925
Epoch: [178] Total time: 0:21:37 (2.0765 s / it)
Averaged stats: lr: 0.001578 min_lr: 0.001578 loss: 2.4605 (2.4844) class_acc: 0.6406 (0.6402) weight_decay: 0.0500 (0.0500) grad_norm: 0.9600 (1.1076)
Test: [ 0/50] eta: 0:10:38 loss: 1.1404 (1.1404) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 12.7607 data: 12.7276 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.2695 (1.3016) acc1: 72.0000 (71.7091) acc5: 88.8000 (88.8727) time: 1.9056 data: 1.8759 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.3516 (1.4523) acc1: 65.6000 (67.4667) acc5: 87.2000 (87.4667) time: 0.8482 data: 0.8192 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.5780 (1.5114) acc1: 63.2000 (66.1936) acc5: 85.6000 (86.7613) time: 1.1153 data: 1.0849 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5853 (1.5140) acc1: 63.2000 (65.9707) acc5: 84.8000 (86.6146) time: 1.1555 data: 1.1247 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5648 (1.5271) acc1: 64.0000 (65.7120) acc5: 85.6000 (86.4000) time: 0.7214 data: 0.6911 max mem: 6925
Test: Total time: 0:00:57 (1.1565 s / it)
* Acc@1 66.182 Acc@5 86.868 loss 1.497
Accuracy of the model on the 50000 test images: 66.2%
Max accuracy: 68.10%
Epoch: [179] [ 0/625] eta: 3:39:59 lr: 0.001577 min_lr: 0.001577 loss: 2.4564 (2.4564) class_acc: 0.6211 (0.6211) weight_decay: 0.0500 (0.0500) time: 21.1198 data: 19.9616 max mem: 6925
Epoch: [179] [200/625] eta: 0:16:25 lr: 0.001570 min_lr: 0.001570 loss: 2.5113 (2.4808) class_acc: 0.6289 (0.6401) weight_decay: 0.0500 (0.0500) grad_norm: 1.1223 (1.1428) time: 3.1027 data: 2.6545 max mem: 6925
Epoch: [179] [400/625] eta: 0:08:23 lr: 0.001563 min_lr: 0.001563 loss: 2.4997 (2.4832) class_acc: 0.6367 (0.6395) weight_decay: 0.0500 (0.0500) grad_norm: 1.1826 (1.1154) time: 2.0603 data: 1.7408 max mem: 6925
Epoch: [179] [600/625] eta: 0:00:53 lr: 0.001556 min_lr: 0.001556 loss: 2.5236 (2.4894) class_acc: 0.6289 (0.6385) weight_decay: 0.0500 (0.0500) grad_norm: 1.0159 (1.0972) time: 1.9488 data: 1.4865 max mem: 6925
Epoch: [179] [624/625] eta: 0:00:02 lr: 0.001556 min_lr: 0.001556 loss: 2.5144 (2.4899) class_acc: 0.6289 (0.6384) weight_decay: 0.0500 (0.0500) grad_norm: 1.0061 (1.0950) time: 0.7146 data: 0.3581 max mem: 6925
Epoch: [179] Total time: 0:21:48 (2.0929 s / it)
Averaged stats: lr: 0.001556 min_lr: 0.001556 loss: 2.5144 (2.4816) class_acc: 0.6289 (0.6405) weight_decay: 0.0500 (0.0500) grad_norm: 1.0061 (1.0950)
Test: [ 0/50] eta: 0:09:57 loss: 1.2509 (1.2509) acc1: 73.6000 (73.6000) acc5: 91.2000 (91.2000) time: 11.9443 data: 11.9096 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.3455 (1.3593) acc1: 71.2000 (70.8364) acc5: 88.8000 (88.9455) time: 1.9108 data: 1.8808 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.4968 (1.5100) acc1: 65.6000 (67.2381) acc5: 87.2000 (87.0857) time: 0.9133 data: 0.8832 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.6415 (1.5573) acc1: 63.2000 (66.0387) acc5: 85.6000 (86.2968) time: 0.8771 data: 0.8460 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.5860 (1.5623) acc1: 63.2000 (65.6976) acc5: 84.0000 (86.1268) time: 0.6861 data: 0.6560 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5860 (1.5693) acc1: 64.8000 (65.6160) acc5: 84.0000 (85.9360) time: 0.5954 data: 0.5670 max mem: 6925
Test: Total time: 0:00:46 (0.9227 s / it)
* Acc@1 66.650 Acc@5 87.124 loss 1.518
Accuracy of the model on the 50000 test images: 66.7%
Max accuracy: 68.10%
Epoch: [180] [ 0/625] eta: 3:24:59 lr: 0.001556 min_lr: 0.001556 loss: 2.5361 (2.5361) class_acc: 0.6172 (0.6172) weight_decay: 0.0500 (0.0500) time: 19.6795 data: 17.1574 max mem: 6925
Epoch: [180] [200/625] eta: 0:13:58 lr: 0.001549 min_lr: 0.001549 loss: 2.4752 (2.4643) class_acc: 0.6484 (0.6447) weight_decay: 0.0500 (0.0500) grad_norm: 0.9620 (1.0954) time: 1.8769 data: 0.0016 max mem: 6925
Epoch: [180] [400/625] eta: 0:07:06 lr: 0.001542 min_lr: 0.001542 loss: 2.4969 (2.4763) class_acc: 0.6406 (0.6423) weight_decay: 0.0500 (0.0500) grad_norm: 1.0041 (1.0851) time: 1.7814 data: 0.0010 max mem: 6925
Epoch: [180] [600/625] eta: 0:00:47 lr: 0.001535 min_lr: 0.001535 loss: 2.4950 (2.4822) class_acc: 0.6406 (0.6414) weight_decay: 0.0500 (0.0500) grad_norm: 0.9956 (1.0756) time: 2.0810 data: 0.0014 max mem: 6925
Epoch: [180] [624/625] eta: 0:00:01 lr: 0.001534 min_lr: 0.001534 loss: 2.4857 (2.4826) class_acc: 0.6289 (0.6413) weight_decay: 0.0500 (0.0500) grad_norm: 1.0658 (1.0779) time: 0.8089 data: 0.0019 max mem: 6925
Epoch: [180] Total time: 0:19:38 (1.8850 s / it)
Averaged stats: lr: 0.001534 min_lr: 0.001534 loss: 2.4857 (2.4773) class_acc: 0.6289 (0.6422) weight_decay: 0.0500 (0.0500) grad_norm: 1.0658 (1.0779)
Test: [ 0/50] eta: 0:11:15 loss: 1.4928 (1.4928) acc1: 71.2000 (71.2000) acc5: 85.6000 (85.6000) time: 13.5131 data: 13.4787 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 1.2280 (1.3093) acc1: 72.8000 (72.3636) acc5: 89.6000 (88.0727) time: 2.2869 data: 2.2575 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.4734 (1.4724) acc1: 67.2000 (67.6191) acc5: 87.2000 (87.1619) time: 1.2397 data: 1.2109 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.5896 (1.5008) acc1: 63.2000 (66.9161) acc5: 86.4000 (86.4000) time: 1.2714 data: 1.2425 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.5260 (1.5059) acc1: 64.8000 (66.7122) acc5: 86.4000 (86.6146) time: 0.8688 data: 0.8386 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4623 (1.5034) acc1: 65.6000 (66.6880) acc5: 87.2000 (86.6560) time: 0.7689 data: 0.7380 max mem: 6925
Test: Total time: 0:00:56 (1.1371 s / it)
* Acc@1 67.024 Acc@5 87.536 loss 1.462
Accuracy of the model on the 50000 test images: 67.0%
Max accuracy: 68.10%
Epoch: [181] [ 0/625] eta: 3:46:28 lr: 0.001534 min_lr: 0.001534 loss: 2.4810 (2.4810) class_acc: 0.6328 (0.6328) weight_decay: 0.0500 (0.0500) time: 21.7423 data: 19.7349 max mem: 6925
Epoch: [181] [200/625] eta: 0:14:16 lr: 0.001527 min_lr: 0.001527 loss: 2.4286 (2.4709) class_acc: 0.6328 (0.6424) weight_decay: 0.0500 (0.0500) grad_norm: 0.9834 (inf) time: 1.9061 data: 0.0016 max mem: 6925
Epoch: [181] [400/625] eta: 0:07:21 lr: 0.001520 min_lr: 0.001520 loss: 2.5522 (2.4736) class_acc: 0.6172 (0.6415) weight_decay: 0.0500 (0.0500) grad_norm: 1.1194 (inf) time: 1.9437 data: 0.0516 max mem: 6925
Epoch: [181] [600/625] eta: 0:00:48 lr: 0.001513 min_lr: 0.001513 loss: 2.5033 (2.4770) class_acc: 0.6484 (0.6409) weight_decay: 0.0500 (0.0500) grad_norm: 1.0725 (inf) time: 1.8708 data: 0.0017 max mem: 6925
Epoch: [181] [624/625] eta: 0:00:01 lr: 0.001512 min_lr: 0.001512 loss: 2.4589 (2.4782) class_acc: 0.6289 (0.6404) weight_decay: 0.0500 (0.0500) grad_norm: 0.9591 (inf) time: 0.6532 data: 0.0024 max mem: 6925
Epoch: [181] Total time: 0:20:04 (1.9270 s / it)
Averaged stats: lr: 0.001512 min_lr: 0.001512 loss: 2.4589 (2.4756) class_acc: 0.6289 (0.6423) weight_decay: 0.0500 (0.0500) grad_norm: 0.9591 (inf)
Test: [ 0/50] eta: 0:10:40 loss: 1.2957 (1.2957) acc1: 66.4000 (66.4000) acc5: 90.4000 (90.4000) time: 12.8013 data: 12.7651 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.3388 (1.3727) acc1: 68.8000 (70.3273) acc5: 90.4000 (88.9455) time: 2.0769 data: 2.0465 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.4690 (1.4986) acc1: 65.6000 (67.5048) acc5: 88.0000 (87.5810) time: 1.0361 data: 1.0064 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.5797 (1.5381) acc1: 62.4000 (66.3742) acc5: 84.8000 (86.6839) time: 0.9515 data: 0.9214 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.5458 (1.5362) acc1: 63.2000 (66.3415) acc5: 84.8000 (86.7317) time: 0.5863 data: 0.5551 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.5458 (1.5512) acc1: 64.8000 (66.0000) acc5: 86.4000 (86.5280) time: 0.5920 data: 0.5617 max mem: 6925
Test: Total time: 0:00:45 (0.9164 s / it)
* Acc@1 67.402 Acc@5 87.496 loss 1.501
Accuracy of the model on the 50000 test images: 67.4%
Max accuracy: 68.10%
Epoch: [182] [ 0/625] eta: 3:53:18 lr: 0.001512 min_lr: 0.001512 loss: 2.4833 (2.4833) class_acc: 0.6719 (0.6719) weight_decay: 0.0500 (0.0500) time: 22.3975 data: 15.8167 max mem: 6925
Epoch: [182] [200/625] eta: 0:14:22 lr: 0.001505 min_lr: 0.001505 loss: 2.4775 (2.4620) class_acc: 0.6484 (0.6457) weight_decay: 0.0500 (0.0500) grad_norm: 1.1684 (1.1243) time: 1.9192 data: 0.0603 max mem: 6925
Epoch: [182] [400/625] eta: 0:07:25 lr: 0.001498 min_lr: 0.001498 loss: 2.4858 (2.4671) class_acc: 0.6367 (0.6448) weight_decay: 0.0500 (0.0500) grad_norm: 1.0298 (1.1224) time: 1.8665 data: 0.0117 max mem: 6925
Epoch: [182] [600/625] eta: 0:00:49 lr: 0.001491 min_lr: 0.001491 loss: 2.4685 (2.4684) class_acc: 0.6328 (0.6440) weight_decay: 0.0500 (0.0500) grad_norm: 1.0209 (1.1011) time: 2.0285 data: 0.0013 max mem: 6925
Epoch: [182] [624/625] eta: 0:00:01 lr: 0.001490 min_lr: 0.001490 loss: 2.4709 (2.4690) class_acc: 0.6367 (0.6440) weight_decay: 0.0500 (0.0500) grad_norm: 1.0532 (1.1012) time: 0.8936 data: 0.0022 max mem: 6925
Epoch: [182] Total time: 0:20:12 (1.9397 s / it)
Averaged stats: lr: 0.001490 min_lr: 0.001490 loss: 2.4709 (2.4707) class_acc: 0.6367 (0.6434) weight_decay: 0.0500 (0.0500) grad_norm: 1.0532 (1.1012)
Test: [ 0/50] eta: 0:10:48 loss: 1.3027 (1.3027) acc1: 71.2000 (71.2000) acc5: 89.6000 (89.6000) time: 12.9708 data: 12.9327 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.3134 (1.3244) acc1: 69.6000 (70.4727) acc5: 88.8000 (88.8000) time: 2.2165 data: 2.1866 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.3995 (1.4669) acc1: 67.2000 (67.1619) acc5: 88.0000 (87.6952) time: 1.1745 data: 1.1451 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.5453 (1.5006) acc1: 64.8000 (66.7613) acc5: 86.4000 (87.0968) time: 1.1440 data: 1.1150 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5655 (1.5121) acc1: 64.8000 (66.5366) acc5: 86.4000 (87.0829) time: 0.7895 data: 0.7609 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4932 (1.5214) acc1: 65.6000 (66.2560) acc5: 87.2000 (86.8320) time: 0.7289 data: 0.6996 max mem: 6925
Test: Total time: 0:00:53 (1.0614 s / it)
* Acc@1 66.880 Acc@5 87.224 loss 1.485
Accuracy of the model on the 50000 test images: 66.9%
Max accuracy: 68.10%
Epoch: [183] [ 0/625] eta: 3:25:56 lr: 0.001490 min_lr: 0.001490 loss: 2.3124 (2.3124) class_acc: 0.6719 (0.6719) weight_decay: 0.0500 (0.0500) time: 19.7709 data: 18.6089 max mem: 6925
Epoch: [183] [200/625] eta: 0:14:16 lr: 0.001483 min_lr: 0.001483 loss: 2.4115 (2.4469) class_acc: 0.6562 (0.6499) weight_decay: 0.0500 (0.0500) grad_norm: 1.3229 (1.1160) time: 1.8987 data: 0.0011 max mem: 6925
Epoch: [183] [400/625] eta: 0:07:29 lr: 0.001476 min_lr: 0.001476 loss: 2.4574 (2.4538) class_acc: 0.6367 (0.6478) weight_decay: 0.0500 (0.0500) grad_norm: 1.1589 (1.1288) time: 2.1904 data: 0.0010 max mem: 6925
Epoch: [183] [600/625] eta: 0:00:50 lr: 0.001469 min_lr: 0.001469 loss: 2.4631 (2.4606) class_acc: 0.6484 (0.6463) weight_decay: 0.0500 (0.0500) grad_norm: 0.9896 (1.1114) time: 1.9260 data: 0.0024 max mem: 6925
Epoch: [183] [624/625] eta: 0:00:01 lr: 0.001469 min_lr: 0.001469 loss: 2.4991 (2.4611) class_acc: 0.6445 (0.6460) weight_decay: 0.0500 (0.0500) grad_norm: 0.9863 (1.1123) time: 0.9748 data: 0.0020 max mem: 6925
Epoch: [183] Total time: 0:20:24 (1.9586 s / it)
Averaged stats: lr: 0.001469 min_lr: 0.001469 loss: 2.4991 (2.4669) class_acc: 0.6445 (0.6446) weight_decay: 0.0500 (0.0500) grad_norm: 0.9863 (1.1123)
Test: [ 0/50] eta: 0:11:03 loss: 1.5464 (1.5464) acc1: 65.6000 (65.6000) acc5: 88.8000 (88.8000) time: 13.2744 data: 13.2410 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.2572 (1.2761) acc1: 73.6000 (72.4364) acc5: 89.6000 (89.8182) time: 2.1658 data: 2.1350 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.3992 (1.4825) acc1: 69.6000 (67.3905) acc5: 88.0000 (87.6191) time: 1.0918 data: 1.0623 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.6834 (1.5102) acc1: 61.6000 (66.4774) acc5: 86.4000 (86.9936) time: 1.0815 data: 1.0524 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5735 (1.5309) acc1: 61.6000 (65.9707) acc5: 86.4000 (86.6927) time: 0.7689 data: 0.7391 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5365 (1.5462) acc1: 64.8000 (65.8880) acc5: 86.4000 (86.3680) time: 0.6862 data: 0.6560 max mem: 6925
Test: Total time: 0:00:51 (1.0206 s / it)
* Acc@1 66.444 Acc@5 87.132 loss 1.510
Accuracy of the model on the 50000 test images: 66.4%
Max accuracy: 68.10%
Epoch: [184] [ 0/625] eta: 3:38:41 lr: 0.001469 min_lr: 0.001469 loss: 2.6909 (2.6909) class_acc: 0.6094 (0.6094) weight_decay: 0.0500 (0.0500) time: 20.9943 data: 20.7566 max mem: 6925
Epoch: [184] [200/625] eta: 0:14:21 lr: 0.001462 min_lr: 0.001462 loss: 2.4471 (2.4500) class_acc: 0.6484 (0.6471) weight_decay: 0.0500 (0.0500) grad_norm: 1.0977 (1.1391) time: 2.0533 data: 0.0013 max mem: 6925
Epoch: [184] [400/625] eta: 0:07:25 lr: 0.001455 min_lr: 0.001455 loss: 2.4561 (2.4583) class_acc: 0.6328 (0.6455) weight_decay: 0.0500 (0.0500) grad_norm: 1.0636 (1.1051) time: 1.8542 data: 0.0008 max mem: 6925
Epoch: [184] [600/625] eta: 0:00:49 lr: 0.001448 min_lr: 0.001448 loss: 2.4778 (2.4651) class_acc: 0.6406 (0.6441) weight_decay: 0.0500 (0.0500) grad_norm: 1.1215 (1.1082) time: 2.2469 data: 0.0010 max mem: 6925
Epoch: [184] [624/625] eta: 0:00:01 lr: 0.001447 min_lr: 0.001447 loss: 2.4346 (2.4650) class_acc: 0.6523 (0.6442) weight_decay: 0.0500 (0.0500) grad_norm: 1.0947 (1.1115) time: 0.6786 data: 0.0016 max mem: 6925
Epoch: [184] Total time: 0:20:04 (1.9274 s / it)
Averaged stats: lr: 0.001447 min_lr: 0.001447 loss: 2.4346 (2.4649) class_acc: 0.6523 (0.6448) weight_decay: 0.0500 (0.0500) grad_norm: 1.0947 (1.1115)
Test: [ 0/50] eta: 0:10:15 loss: 1.2635 (1.2635) acc1: 70.4000 (70.4000) acc5: 90.4000 (90.4000) time: 12.3145 data: 12.2817 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.3750 (1.4173) acc1: 69.6000 (69.0909) acc5: 88.8000 (88.0000) time: 1.9803 data: 1.9506 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.5520 (1.5862) acc1: 65.6000 (65.6762) acc5: 86.4000 (86.0571) time: 0.9999 data: 0.9708 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.6944 (1.6023) acc1: 62.4000 (65.1613) acc5: 84.0000 (85.7548) time: 0.9860 data: 0.9559 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.7053 (1.6378) acc1: 62.4000 (64.4098) acc5: 85.6000 (85.4634) time: 0.7146 data: 0.6840 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.7053 (1.6365) acc1: 61.6000 (64.1920) acc5: 85.6000 (85.3120) time: 0.6785 data: 0.6489 max mem: 6925
Test: Total time: 0:00:47 (0.9478 s / it)
* Acc@1 65.224 Acc@5 86.306 loss 1.587
Accuracy of the model on the 50000 test images: 65.2%
Max accuracy: 68.10%
Epoch: [185] [ 0/625] eta: 4:05:58 lr: 0.001447 min_lr: 0.001447 loss: 2.4358 (2.4358) class_acc: 0.6641 (0.6641) weight_decay: 0.0500 (0.0500) time: 23.6139 data: 19.9434 max mem: 6925
Epoch: [185] [200/625] eta: 0:14:35 lr: 0.001440 min_lr: 0.001440 loss: 2.4515 (2.4460) class_acc: 0.6484 (0.6506) weight_decay: 0.0500 (0.0500) grad_norm: 1.0220 (1.0820) time: 1.8642 data: 0.0008 max mem: 6925
Epoch: [185] [400/625] eta: 0:07:30 lr: 0.001433 min_lr: 0.001433 loss: 2.3925 (2.4487) class_acc: 0.6602 (0.6492) weight_decay: 0.0500 (0.0500) grad_norm: 1.2547 (1.0852) time: 1.8649 data: 0.0009 max mem: 6925
Epoch: [185] [600/625] eta: 0:00:49 lr: 0.001426 min_lr: 0.001426 loss: 2.5002 (2.4599) class_acc: 0.6328 (0.6468) weight_decay: 0.0500 (0.0500) grad_norm: 1.1493 (1.0933) time: 2.1334 data: 0.0008 max mem: 6925
Epoch: [185] [624/625] eta: 0:00:01 lr: 0.001426 min_lr: 0.001426 loss: 2.4508 (2.4600) class_acc: 0.6484 (0.6469) weight_decay: 0.0500 (0.0500) grad_norm: 0.9880 (1.0893) time: 0.7675 data: 0.0018 max mem: 6925
Epoch: [185] Total time: 0:20:15 (1.9447 s / it)
Averaged stats: lr: 0.001426 min_lr: 0.001426 loss: 2.4508 (2.4631) class_acc: 0.6484 (0.6456) weight_decay: 0.0500 (0.0500) grad_norm: 0.9880 (1.0893)
Test: [ 0/50] eta: 0:09:41 loss: 1.2163 (1.2163) acc1: 70.4000 (70.4000) acc5: 92.8000 (92.8000) time: 11.6347 data: 11.5993 max mem: 6925
Test: [10/50] eta: 0:01:13 loss: 1.2163 (1.3119) acc1: 72.8000 (71.3455) acc5: 88.8000 (88.8000) time: 1.8368 data: 1.8070 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.4401 (1.4698) acc1: 67.2000 (67.6952) acc5: 88.0000 (87.4286) time: 0.9032 data: 0.8740 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.5810 (1.5025) acc1: 65.6000 (66.6581) acc5: 85.6000 (86.8645) time: 0.9840 data: 0.9540 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.5554 (1.5007) acc1: 65.6000 (66.3805) acc5: 88.0000 (86.8683) time: 0.9360 data: 0.9061 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4490 (1.5056) acc1: 64.0000 (66.1280) acc5: 86.4000 (86.6720) time: 0.6389 data: 0.6091 max mem: 6925
Test: Total time: 0:00:52 (1.0561 s / it)
* Acc@1 67.032 Acc@5 87.348 loss 1.466
Accuracy of the model on the 50000 test images: 67.0%
Max accuracy: 68.10%
Epoch: [186] [ 0/625] eta: 3:43:36 lr: 0.001425 min_lr: 0.001425 loss: 2.3965 (2.3965) class_acc: 0.6641 (0.6641) weight_decay: 0.0500 (0.0500) time: 21.4657 data: 19.9064 max mem: 6925
Epoch: [186] [200/625] eta: 0:14:10 lr: 0.001419 min_lr: 0.001419 loss: 2.3356 (2.4491) class_acc: 0.6523 (0.6489) weight_decay: 0.0500 (0.0500) grad_norm: 1.1957 (1.0960) time: 1.7001 data: 0.0576 max mem: 6925
Epoch: [186] [400/625] eta: 0:07:16 lr: 0.001412 min_lr: 0.001412 loss: 2.5348 (2.4564) class_acc: 0.6211 (0.6471) weight_decay: 0.0500 (0.0500) grad_norm: 1.1041 (1.0852) time: 1.8633 data: 0.0010 max mem: 6925
Epoch: [186] [600/625] eta: 0:00:48 lr: 0.001405 min_lr: 0.001405 loss: 2.4500 (2.4635) class_acc: 0.6367 (0.6451) weight_decay: 0.0500 (0.0500) grad_norm: 1.2137 (1.1191) time: 2.0258 data: 0.0120 max mem: 6925
Epoch: [186] [624/625] eta: 0:00:01 lr: 0.001404 min_lr: 0.001404 loss: 2.4755 (2.4636) class_acc: 0.6406 (0.6452) weight_decay: 0.0500 (0.0500) grad_norm: 1.0119 (1.1166) time: 0.8754 data: 0.0025 max mem: 6925
Epoch: [186] Total time: 0:19:45 (1.8966 s / it)
Averaged stats: lr: 0.001404 min_lr: 0.001404 loss: 2.4755 (2.4602) class_acc: 0.6406 (0.6459) weight_decay: 0.0500 (0.0500) grad_norm: 1.0119 (1.1166)
Test: [ 0/50] eta: 0:10:00 loss: 1.2998 (1.2998) acc1: 68.8000 (68.8000) acc5: 89.6000 (89.6000) time: 12.0062 data: 11.9662 max mem: 6925
Test: [10/50] eta: 0:01:07 loss: 1.2487 (1.2984) acc1: 72.8000 (71.5636) acc5: 89.6000 (88.8727) time: 1.6970 data: 1.6654 max mem: 6925
Test: [20/50] eta: 0:00:36 loss: 1.4269 (1.4343) acc1: 68.0000 (68.2286) acc5: 88.0000 (87.6571) time: 0.6889 data: 0.6593 max mem: 6925
Test: [30/50] eta: 0:00:22 loss: 1.5636 (1.4616) acc1: 65.6000 (67.4581) acc5: 86.4000 (87.2258) time: 0.8418 data: 0.8126 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4759 (1.4684) acc1: 65.6000 (67.3171) acc5: 87.2000 (87.3951) time: 0.9244 data: 0.8951 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4146 (1.4682) acc1: 66.4000 (67.1840) acc5: 88.0000 (87.3600) time: 0.6087 data: 0.5798 max mem: 6925
Test: Total time: 0:00:47 (0.9560 s / it)
* Acc@1 67.776 Acc@5 88.160 loss 1.417
Accuracy of the model on the 50000 test images: 67.8%
Max accuracy: 68.10%
Epoch: [187] [ 0/625] eta: 3:39:59 lr: 0.001404 min_lr: 0.001404 loss: 2.5414 (2.5414) class_acc: 0.6680 (0.6680) weight_decay: 0.0500 (0.0500) time: 21.1184 data: 17.9302 max mem: 6925
Epoch: [187] [200/625] eta: 0:14:24 lr: 0.001397 min_lr: 0.001397 loss: 2.4486 (2.4511) class_acc: 0.6484 (0.6474) weight_decay: 0.0500 (0.0500) grad_norm: 0.9310 (1.0696) time: 1.9537 data: 0.0010 max mem: 6925
Epoch: [187] [400/625] eta: 0:07:22 lr: 0.001390 min_lr: 0.001390 loss: 2.5135 (2.4566) class_acc: 0.6367 (0.6467) weight_decay: 0.0500 (0.0500) grad_norm: 1.0849 (1.0764) time: 1.8700 data: 0.0011 max mem: 6925
Epoch: [187] [600/625] eta: 0:00:49 lr: 0.001383 min_lr: 0.001383 loss: 2.4637 (2.4592) class_acc: 0.6445 (0.6459) weight_decay: 0.0500 (0.0500) grad_norm: 0.9577 (1.0843) time: 2.0926 data: 0.0008 max mem: 6925
Epoch: [187] [624/625] eta: 0:00:01 lr: 0.001383 min_lr: 0.001383 loss: 2.4657 (2.4595) class_acc: 0.6445 (0.6460) weight_decay: 0.0500 (0.0500) grad_norm: 1.0136 (1.0841) time: 0.7201 data: 0.0018 max mem: 6925
Epoch: [187] Total time: 0:19:58 (1.9172 s / it)
Averaged stats: lr: 0.001383 min_lr: 0.001383 loss: 2.4657 (2.4560) class_acc: 0.6445 (0.6469) weight_decay: 0.0500 (0.0500) grad_norm: 1.0136 (1.0841)
Test: [ 0/50] eta: 0:09:52 loss: 1.3768 (1.3768) acc1: 73.6000 (73.6000) acc5: 88.0000 (88.0000) time: 11.8415 data: 11.8025 max mem: 6925
Test: [10/50] eta: 0:01:11 loss: 1.3922 (1.4140) acc1: 69.6000 (70.9091) acc5: 88.8000 (88.4364) time: 1.7944 data: 1.7648 max mem: 6925
Test: [20/50] eta: 0:00:39 loss: 1.5048 (1.5683) acc1: 67.2000 (66.4000) acc5: 86.4000 (86.7048) time: 0.8037 data: 0.7751 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.6080 (1.5750) acc1: 62.4000 (65.7806) acc5: 85.6000 (86.6323) time: 0.8894 data: 0.8610 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.5912 (1.5644) acc1: 64.8000 (65.8732) acc5: 87.2000 (86.5366) time: 0.9562 data: 0.9270 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5507 (1.5702) acc1: 66.4000 (65.7920) acc5: 87.2000 (86.4480) time: 0.6441 data: 0.6139 max mem: 6925
Test: Total time: 0:00:50 (1.0109 s / it)
* Acc@1 66.774 Acc@5 87.376 loss 1.520
Accuracy of the model on the 50000 test images: 66.8%
Max accuracy: 68.10%
Epoch: [188] [ 0/625] eta: 3:36:16 lr: 0.001383 min_lr: 0.001383 loss: 2.3830 (2.3830) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 20.7624 data: 18.3269 max mem: 6925
Epoch: [188] [200/625] eta: 0:13:49 lr: 0.001376 min_lr: 0.001376 loss: 2.4400 (2.4294) class_acc: 0.6523 (0.6514) weight_decay: 0.0500 (0.0500) grad_norm: 1.0419 (inf) time: 1.8972 data: 0.0007 max mem: 6925
Epoch: [188] [400/625] eta: 0:07:11 lr: 0.001369 min_lr: 0.001369 loss: 2.4392 (2.4452) class_acc: 0.6523 (0.6499) weight_decay: 0.0500 (0.0500) grad_norm: 1.0434 (inf) time: 1.9077 data: 0.0014 max mem: 6925
Epoch: [188] [600/625] eta: 0:00:48 lr: 0.001362 min_lr: 0.001362 loss: 2.4781 (2.4478) class_acc: 0.6406 (0.6489) weight_decay: 0.0500 (0.0500) grad_norm: 1.1146 (inf) time: 2.0523 data: 0.0010 max mem: 6925
Epoch: [188] [624/625] eta: 0:00:01 lr: 0.001361 min_lr: 0.001361 loss: 2.5024 (2.4491) class_acc: 0.6406 (0.6489) weight_decay: 0.0500 (0.0500) grad_norm: 1.0443 (inf) time: 0.5821 data: 0.0018 max mem: 6925
Epoch: [188] Total time: 0:19:52 (1.9084 s / it)
Averaged stats: lr: 0.001361 min_lr: 0.001361 loss: 2.5024 (2.4492) class_acc: 0.6406 (0.6488) weight_decay: 0.0500 (0.0500) grad_norm: 1.0443 (inf)
Test: [ 0/50] eta: 0:10:43 loss: 1.6119 (1.6119) acc1: 62.4000 (62.4000) acc5: 87.2000 (87.2000) time: 12.8729 data: 12.8401 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 1.3718 (1.3807) acc1: 70.4000 (69.5273) acc5: 87.2000 (87.6364) time: 2.1879 data: 2.1582 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.5001 (1.4926) acc1: 65.6000 (66.7429) acc5: 85.6000 (86.4762) time: 1.1646 data: 1.1357 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.5889 (1.5051) acc1: 64.8000 (66.5290) acc5: 85.6000 (86.1419) time: 1.1077 data: 1.0793 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5452 (1.5123) acc1: 64.8000 (66.2634) acc5: 86.4000 (86.3415) time: 0.6730 data: 0.6432 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5032 (1.5347) acc1: 64.8000 (65.7120) acc5: 86.4000 (86.0960) time: 0.6498 data: 0.6198 max mem: 6925
Test: Total time: 0:00:50 (1.0075 s / it)
* Acc@1 66.594 Acc@5 87.102 loss 1.481
Accuracy of the model on the 50000 test images: 66.6%
Max accuracy: 68.10%
Epoch: [189] [ 0/625] eta: 3:26:39 lr: 0.001361 min_lr: 0.001361 loss: 2.4670 (2.4670) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 19.8392 data: 19.4921 max mem: 6925
Epoch: [189] [200/625] eta: 0:14:32 lr: 0.001355 min_lr: 0.001355 loss: 2.4353 (2.4253) class_acc: 0.6523 (0.6566) weight_decay: 0.0500 (0.0500) grad_norm: 0.9482 (1.0897) time: 1.9667 data: 0.0007 max mem: 6925
Epoch: [189] [400/625] eta: 0:07:27 lr: 0.001348 min_lr: 0.001348 loss: 2.4781 (2.4311) class_acc: 0.6445 (0.6540) weight_decay: 0.0500 (0.0500) grad_norm: 1.0651 (1.1438) time: 1.9825 data: 0.0009 max mem: 6925
Epoch: [189] [600/625] eta: 0:00:49 lr: 0.001341 min_lr: 0.001341 loss: 2.4539 (2.4397) class_acc: 0.6445 (0.6510) weight_decay: 0.0500 (0.0500) grad_norm: 1.1136 (1.1092) time: 1.9054 data: 0.0408 max mem: 6925
Epoch: [189] [624/625] eta: 0:00:01 lr: 0.001340 min_lr: 0.001340 loss: 2.4622 (2.4404) class_acc: 0.6406 (0.6507) weight_decay: 0.0500 (0.0500) grad_norm: 1.0584 (1.1071) time: 0.7956 data: 0.0014 max mem: 6925
Epoch: [189] Total time: 0:20:01 (1.9216 s / it)
Averaged stats: lr: 0.001340 min_lr: 0.001340 loss: 2.4622 (2.4476) class_acc: 0.6406 (0.6491) weight_decay: 0.0500 (0.0500) grad_norm: 1.0584 (1.1071)
Test: [ 0/50] eta: 0:09:41 loss: 1.2654 (1.2654) acc1: 76.0000 (76.0000) acc5: 93.6000 (93.6000) time: 11.6315 data: 11.5901 max mem: 6925
Test: [10/50] eta: 0:01:09 loss: 1.3438 (1.4145) acc1: 71.2000 (70.1091) acc5: 88.0000 (88.2909) time: 1.7463 data: 1.7156 max mem: 6925
Test: [20/50] eta: 0:00:38 loss: 1.5449 (1.5520) acc1: 64.0000 (66.1714) acc5: 85.6000 (86.7810) time: 0.7605 data: 0.7304 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.5854 (1.5554) acc1: 62.4000 (66.0387) acc5: 85.6000 (86.6581) time: 0.9549 data: 0.9250 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.5261 (1.5433) acc1: 64.8000 (66.0683) acc5: 86.4000 (86.9463) time: 0.8727 data: 0.8436 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4711 (1.5380) acc1: 65.6000 (66.0960) acc5: 88.0000 (87.0240) time: 0.4648 data: 0.4357 max mem: 6925
Test: Total time: 0:00:47 (0.9543 s / it)
* Acc@1 67.248 Acc@5 87.362 loss 1.492
Accuracy of the model on the 50000 test images: 67.2%
Max accuracy: 68.10%
Epoch: [190] [ 0/625] eta: 3:43:26 lr: 0.001340 min_lr: 0.001340 loss: 2.3908 (2.3908) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 21.4507 data: 20.3741 max mem: 6925
Epoch: [190] [200/625] eta: 0:14:04 lr: 0.001333 min_lr: 0.001333 loss: 2.4124 (2.4301) class_acc: 0.6523 (0.6537) weight_decay: 0.0500 (0.0500) grad_norm: 1.0462 (1.1233) time: 1.9493 data: 0.0009 max mem: 6925
Epoch: [190] [400/625] eta: 0:07:17 lr: 0.001327 min_lr: 0.001327 loss: 2.4570 (2.4395) class_acc: 0.6328 (0.6508) weight_decay: 0.0500 (0.0500) grad_norm: 0.9457 (1.1163) time: 1.9187 data: 0.0906 max mem: 6925
Epoch: [190] [600/625] eta: 0:00:49 lr: 0.001320 min_lr: 0.001320 loss: 2.4975 (2.4456) class_acc: 0.6406 (0.6485) weight_decay: 0.0500 (0.0500) grad_norm: 1.1258 (1.1205) time: 1.9296 data: 0.0011 max mem: 6925
Epoch: [190] [624/625] eta: 0:00:01 lr: 0.001319 min_lr: 0.001319 loss: 2.4789 (2.4473) class_acc: 0.6367 (0.6481) weight_decay: 0.0500 (0.0500) grad_norm: 1.1677 (1.1343) time: 0.8023 data: 0.0021 max mem: 6925
Epoch: [190] Total time: 0:19:59 (1.9197 s / it)
Averaged stats: lr: 0.001319 min_lr: 0.001319 loss: 2.4789 (2.4448) class_acc: 0.6367 (0.6496) weight_decay: 0.0500 (0.0500) grad_norm: 1.1677 (1.1343)
Test: [ 0/50] eta: 0:09:53 loss: 1.3191 (1.3191) acc1: 67.2000 (67.2000) acc5: 92.0000 (92.0000) time: 11.8782 data: 11.8380 max mem: 6925
Test: [10/50] eta: 0:01:10 loss: 1.2716 (1.2689) acc1: 72.8000 (72.8727) acc5: 90.4000 (90.4000) time: 1.7573 data: 1.7259 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.4706 (1.4687) acc1: 67.2000 (67.2381) acc5: 87.2000 (87.9619) time: 0.9871 data: 0.9569 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.6667 (1.5215) acc1: 62.4000 (66.2968) acc5: 86.4000 (87.2516) time: 1.1288 data: 1.0994 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4532 (1.5113) acc1: 64.0000 (66.3415) acc5: 86.4000 (87.2195) time: 0.8448 data: 0.8155 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4431 (1.5310) acc1: 64.8000 (65.8080) acc5: 86.4000 (86.9600) time: 0.4989 data: 0.4695 max mem: 6925
Test: Total time: 0:00:52 (1.0401 s / it)
* Acc@1 66.990 Acc@5 87.542 loss 1.479
Accuracy of the model on the 50000 test images: 67.0%
Max accuracy: 68.10%
Epoch: [191] [ 0/625] eta: 3:35:10 lr: 0.001319 min_lr: 0.001319 loss: 2.5022 (2.5022) class_acc: 0.6562 (0.6562) weight_decay: 0.0500 (0.0500) time: 20.6573 data: 17.9667 max mem: 6925
Epoch: [191] [200/625] eta: 0:13:54 lr: 0.001312 min_lr: 0.001312 loss: 2.4544 (2.4332) class_acc: 0.6367 (0.6527) weight_decay: 0.0500 (0.0500) grad_norm: 1.1734 (1.1230) time: 1.7658 data: 0.1386 max mem: 6925
Epoch: [191] [400/625] eta: 0:07:16 lr: 0.001305 min_lr: 0.001305 loss: 2.4296 (2.4396) class_acc: 0.6406 (0.6510) weight_decay: 0.0500 (0.0500) grad_norm: 1.1253 (1.1259) time: 1.8306 data: 0.0009 max mem: 6925
Epoch: [191] [600/625] eta: 0:00:48 lr: 0.001299 min_lr: 0.001299 loss: 2.4042 (2.4428) class_acc: 0.6562 (0.6503) weight_decay: 0.0500 (0.0500) grad_norm: 0.9054 (1.1067) time: 1.8476 data: 0.0008 max mem: 6925
Epoch: [191] [624/625] eta: 0:00:01 lr: 0.001298 min_lr: 0.001298 loss: 2.4477 (2.4429) class_acc: 0.6367 (0.6501) weight_decay: 0.0500 (0.0500) grad_norm: 0.9918 (1.1029) time: 0.7453 data: 0.0019 max mem: 6925
Epoch: [191] Total time: 0:19:52 (1.9083 s / it)
Averaged stats: lr: 0.001298 min_lr: 0.001298 loss: 2.4477 (2.4420) class_acc: 0.6367 (0.6504) weight_decay: 0.0500 (0.0500) grad_norm: 0.9918 (1.1029)
Test: [ 0/50] eta: 0:10:04 loss: 1.4066 (1.4066) acc1: 69.6000 (69.6000) acc5: 91.2000 (91.2000) time: 12.0929 data: 12.0592 max mem: 6925
Test: [10/50] eta: 0:01:09 loss: 1.1456 (1.2051) acc1: 73.6000 (73.3091) acc5: 91.2000 (90.8364) time: 1.7329 data: 1.7027 max mem: 6925
Test: [20/50] eta: 0:00:38 loss: 1.3030 (1.3998) acc1: 69.6000 (68.6095) acc5: 90.4000 (88.9143) time: 0.7495 data: 0.7203 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.5383 (1.4416) acc1: 64.8000 (67.5871) acc5: 86.4000 (88.1290) time: 0.9921 data: 0.9619 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4977 (1.4636) acc1: 66.4000 (67.2195) acc5: 85.6000 (87.8244) time: 0.8883 data: 0.8576 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4763 (1.4580) acc1: 66.4000 (67.2480) acc5: 85.6000 (87.7280) time: 0.4333 data: 0.4044 max mem: 6925
Test: Total time: 0:00:47 (0.9539 s / it)
* Acc@1 68.152 Acc@5 88.162 loss 1.410
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 68.15%
Epoch: [192] [ 0/625] eta: 3:13:06 lr: 0.001298 min_lr: 0.001298 loss: 2.4290 (2.4290) class_acc: 0.6797 (0.6797) weight_decay: 0.0500 (0.0500) time: 18.5383 data: 16.6361 max mem: 6925
Epoch: [192] [200/625] eta: 0:13:31 lr: 0.001291 min_lr: 0.001291 loss: 2.3875 (2.4248) class_acc: 0.6641 (0.6557) weight_decay: 0.0500 (0.0500) grad_norm: 1.0194 (1.0905) time: 2.0486 data: 0.0007 max mem: 6925
Epoch: [192] [400/625] eta: 0:07:04 lr: 0.001284 min_lr: 0.001284 loss: 2.4410 (2.4314) class_acc: 0.6367 (0.6537) weight_decay: 0.0500 (0.0500) grad_norm: 0.9849 (1.1124) time: 1.9236 data: 0.0010 max mem: 6925
Epoch: [192] [600/625] eta: 0:00:47 lr: 0.001278 min_lr: 0.001278 loss: 2.4579 (2.4341) class_acc: 0.6523 (0.6521) weight_decay: 0.0500 (0.0500) grad_norm: 0.9989 (1.1298) time: 1.9737 data: 0.0009 max mem: 6925
Epoch: [192] [624/625] eta: 0:00:01 lr: 0.001277 min_lr: 0.001277 loss: 2.4464 (2.4343) class_acc: 0.6484 (0.6520) weight_decay: 0.0500 (0.0500) grad_norm: 1.0428 (1.1279) time: 0.6856 data: 0.0020 max mem: 6925
Epoch: [192] Total time: 0:19:41 (1.8904 s / it)
Averaged stats: lr: 0.001277 min_lr: 0.001277 loss: 2.4464 (2.4372) class_acc: 0.6484 (0.6515) weight_decay: 0.0500 (0.0500) grad_norm: 1.0428 (1.1279)
Test: [ 0/50] eta: 0:10:40 loss: 1.1736 (1.1736) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 12.8128 data: 12.7822 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.2673 (1.2688) acc1: 72.8000 (72.7273) acc5: 89.6000 (89.6000) time: 2.0618 data: 2.0323 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.3675 (1.4219) acc1: 68.0000 (68.4190) acc5: 88.8000 (88.3810) time: 1.0513 data: 1.0224 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.5614 (1.4754) acc1: 62.4000 (67.3806) acc5: 86.4000 (87.7677) time: 1.0808 data: 1.0525 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5006 (1.4674) acc1: 64.8000 (67.3951) acc5: 88.0000 (87.9415) time: 0.9136 data: 0.8842 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4655 (1.4763) acc1: 64.8000 (67.0080) acc5: 88.8000 (87.8240) time: 0.8471 data: 0.8177 max mem: 6925
Test: Total time: 0:00:55 (1.1127 s / it)
* Acc@1 68.038 Acc@5 88.206 loss 1.432
Accuracy of the model on the 50000 test images: 68.0%
Max accuracy: 68.15%
Epoch: [193] [ 0/625] eta: 3:26:26 lr: 0.001277 min_lr: 0.001277 loss: 2.2735 (2.2735) class_acc: 0.6992 (0.6992) weight_decay: 0.0500 (0.0500) time: 19.8191 data: 19.1023 max mem: 6925
Epoch: [193] [200/625] eta: 0:14:30 lr: 0.001270 min_lr: 0.001270 loss: 2.4185 (2.4247) class_acc: 0.6562 (0.6555) weight_decay: 0.0500 (0.0500) grad_norm: 1.0690 (1.0471) time: 1.9996 data: 0.0010 max mem: 6925
Epoch: [193] [400/625] eta: 0:07:25 lr: 0.001264 min_lr: 0.001264 loss: 2.4222 (2.4343) class_acc: 0.6445 (0.6528) weight_decay: 0.0500 (0.0500) grad_norm: 1.0447 (1.0953) time: 1.8629 data: 0.0019 max mem: 6925
Epoch: [193] [600/625] eta: 0:00:48 lr: 0.001257 min_lr: 0.001257 loss: 2.3843 (2.4354) class_acc: 0.6602 (0.6524) weight_decay: 0.0500 (0.0500) grad_norm: 1.1021 (1.1307) time: 1.8254 data: 0.0009 max mem: 6925
Epoch: [193] [624/625] eta: 0:00:01 lr: 0.001256 min_lr: 0.001256 loss: 2.4589 (2.4360) class_acc: 0.6406 (0.6523) weight_decay: 0.0500 (0.0500) grad_norm: 0.9904 (1.1281) time: 1.0013 data: 0.0021 max mem: 6925
Epoch: [193] Total time: 0:19:52 (1.9087 s / it)
Averaged stats: lr: 0.001256 min_lr: 0.001256 loss: 2.4589 (2.4345) class_acc: 0.6406 (0.6523) weight_decay: 0.0500 (0.0500) grad_norm: 0.9904 (1.1281)
Test: [ 0/50] eta: 0:10:32 loss: 1.5562 (1.5562) acc1: 59.2000 (59.2000) acc5: 88.8000 (88.8000) time: 12.6459 data: 12.6038 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.3399 (1.3449) acc1: 69.6000 (69.3091) acc5: 89.6000 (88.8000) time: 1.9614 data: 1.9309 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.4033 (1.4616) acc1: 68.0000 (67.1619) acc5: 87.2000 (87.9619) time: 0.8383 data: 0.8088 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.5536 (1.4755) acc1: 64.8000 (66.7097) acc5: 87.2000 (87.7419) time: 0.8260 data: 0.7966 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4243 (1.4689) acc1: 67.2000 (67.0439) acc5: 88.0000 (87.9220) time: 0.7007 data: 0.6717 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4243 (1.4720) acc1: 67.2000 (66.7520) acc5: 87.2000 (87.6320) time: 0.4053 data: 0.3769 max mem: 6925
Test: Total time: 0:00:46 (0.9266 s / it)
* Acc@1 68.024 Acc@5 88.312 loss 1.419
Accuracy of the model on the 50000 test images: 68.0%
Max accuracy: 68.15%
Epoch: [194] [ 0/625] eta: 3:27:24 lr: 0.001256 min_lr: 0.001256 loss: 2.3426 (2.3426) class_acc: 0.6250 (0.6250) weight_decay: 0.0500 (0.0500) time: 19.9114 data: 18.8173 max mem: 6925
Epoch: [194] [200/625] eta: 0:13:57 lr: 0.001249 min_lr: 0.001249 loss: 2.4557 (2.4223) class_acc: 0.6484 (0.6539) weight_decay: 0.0500 (0.0500) grad_norm: 1.1322 (1.1361) time: 1.9511 data: 0.1385 max mem: 6925
Epoch: [194] [400/625] eta: 0:07:17 lr: 0.001243 min_lr: 0.001243 loss: 2.4233 (2.4312) class_acc: 0.6523 (0.6525) weight_decay: 0.0500 (0.0500) grad_norm: 1.0191 (1.0917) time: 1.8822 data: 0.0554 max mem: 6925
Epoch: [194] [600/625] eta: 0:00:49 lr: 0.001236 min_lr: 0.001236 loss: 2.3736 (2.4334) class_acc: 0.6602 (0.6520) weight_decay: 0.0500 (0.0500) grad_norm: 0.9870 (1.1264) time: 2.3176 data: 0.0419 max mem: 6925
Epoch: [194] [624/625] eta: 0:00:01 lr: 0.001235 min_lr: 0.001235 loss: 2.3963 (2.4328) class_acc: 0.6445 (0.6521) weight_decay: 0.0500 (0.0500) grad_norm: 0.9740 (1.1217) time: 0.9927 data: 0.0016 max mem: 6925
Epoch: [194] Total time: 0:20:25 (1.9603 s / it)
Averaged stats: lr: 0.001235 min_lr: 0.001235 loss: 2.3963 (2.4299) class_acc: 0.6445 (0.6532) weight_decay: 0.0500 (0.0500) grad_norm: 0.9740 (1.1217)
Test: [ 0/50] eta: 0:10:59 loss: 1.1125 (1.1125) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 13.1846 data: 13.1497 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.1360 (1.1934) acc1: 74.4000 (73.8182) acc5: 90.4000 (89.5273) time: 1.9596 data: 1.9281 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.2900 (1.3508) acc1: 68.8000 (69.2952) acc5: 88.8000 (88.5714) time: 0.8251 data: 0.7953 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.4571 (1.3790) acc1: 65.6000 (68.1548) acc5: 88.0000 (88.2839) time: 0.7632 data: 0.7344 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.3951 (1.3914) acc1: 65.6000 (67.6683) acc5: 88.8000 (88.1756) time: 0.8970 data: 0.8681 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3656 (1.3960) acc1: 68.0000 (67.7120) acc5: 87.2000 (87.9040) time: 0.8273 data: 0.7984 max mem: 6925
Test: Total time: 0:00:53 (1.0686 s / it)
* Acc@1 69.120 Acc@5 88.660 loss 1.353
Accuracy of the model on the 50000 test images: 69.1%
Max accuracy: 69.12%
Epoch: [195] [ 0/625] eta: 4:39:40 lr: 0.001235 min_lr: 0.001235 loss: 2.4192 (2.4192) class_acc: 0.6523 (0.6523) weight_decay: 0.0500 (0.0500) time: 26.8486 data: 20.5779 max mem: 6925
Epoch: [195] [200/625] eta: 0:17:11 lr: 0.001229 min_lr: 0.001229 loss: 2.4296 (2.3988) class_acc: 0.6289 (0.6620) weight_decay: 0.0500 (0.0500) grad_norm: 1.1019 (1.1569) time: 2.2904 data: 0.0020 max mem: 6925
Epoch: [195] [400/625] eta: 0:08:14 lr: 0.001222 min_lr: 0.001222 loss: 2.3759 (2.4197) class_acc: 0.6641 (0.6566) weight_decay: 0.0500 (0.0500) grad_norm: 1.0362 (1.1548) time: 2.0131 data: 0.0011 max mem: 6925
Epoch: [195] [600/625] eta: 0:00:53 lr: 0.001215 min_lr: 0.001215 loss: 2.4518 (2.4253) class_acc: 0.6562 (0.6543) weight_decay: 0.0500 (0.0500) grad_norm: 1.0382 (1.1226) time: 1.9639 data: 0.0010 max mem: 6925
Epoch: [195] [624/625] eta: 0:00:02 lr: 0.001215 min_lr: 0.001215 loss: 2.4447 (2.4248) class_acc: 0.6523 (0.6544) weight_decay: 0.0500 (0.0500) grad_norm: 1.0457 (1.1228) time: 0.8241 data: 0.0015 max mem: 6925
Epoch: [195] Total time: 0:21:37 (2.0767 s / it)
Averaged stats: lr: 0.001215 min_lr: 0.001215 loss: 2.4447 (2.4284) class_acc: 0.6523 (0.6538) weight_decay: 0.0500 (0.0500) grad_norm: 1.0457 (1.1228)
Test: [ 0/50] eta: 0:10:09 loss: 1.2714 (1.2714) acc1: 74.4000 (74.4000) acc5: 90.4000 (90.4000) time: 12.1963 data: 12.1490 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.2714 (1.2902) acc1: 74.4000 (72.6545) acc5: 88.8000 (89.3091) time: 2.1186 data: 2.0870 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.3405 (1.3945) acc1: 68.8000 (69.4857) acc5: 88.8000 (88.1524) time: 1.1644 data: 1.1352 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.5303 (1.4315) acc1: 67.2000 (68.6968) acc5: 85.6000 (87.7419) time: 1.1765 data: 1.1479 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4069 (1.4151) acc1: 68.0000 (68.7220) acc5: 88.0000 (88.0390) time: 0.8072 data: 0.7778 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4405 (1.4298) acc1: 63.2000 (68.1280) acc5: 88.0000 (87.9040) time: 0.6590 data: 0.6287 max mem: 6925
Test: Total time: 0:00:53 (1.0623 s / it)
* Acc@1 68.448 Acc@5 88.354 loss 1.392
Accuracy of the model on the 50000 test images: 68.4%
Max accuracy: 69.12%
Epoch: [196] [ 0/625] eta: 3:35:58 lr: 0.001215 min_lr: 0.001215 loss: 2.2084 (2.2084) class_acc: 0.7227 (0.7227) weight_decay: 0.0500 (0.0500) time: 20.7341 data: 16.9177 max mem: 6925
Epoch: [196] [200/625] eta: 0:14:25 lr: 0.001208 min_lr: 0.001208 loss: 2.4236 (2.4101) class_acc: 0.6562 (0.6579) weight_decay: 0.0500 (0.0500) grad_norm: 0.9635 (1.0820) time: 1.8483 data: 0.0009 max mem: 6925
Epoch: [196] [400/625] eta: 0:07:25 lr: 0.001201 min_lr: 0.001201 loss: 2.3584 (2.4117) class_acc: 0.6523 (0.6573) weight_decay: 0.0500 (0.0500) grad_norm: 1.0135 (1.1114) time: 1.8516 data: 0.0009 max mem: 6925
Epoch: [196] [600/625] eta: 0:00:48 lr: 0.001195 min_lr: 0.001195 loss: 2.4449 (2.4190) class_acc: 0.6484 (0.6559) weight_decay: 0.0500 (0.0500) grad_norm: 1.2024 (1.1451) time: 1.9642 data: 0.0008 max mem: 6925
Epoch: [196] [624/625] eta: 0:00:01 lr: 0.001194 min_lr: 0.001194 loss: 2.4763 (2.4203) class_acc: 0.6523 (0.6555) weight_decay: 0.0500 (0.0500) grad_norm: 1.1574 (1.1509) time: 0.8014 data: 0.0006 max mem: 6925
Epoch: [196] Total time: 0:19:56 (1.9149 s / it)
Averaged stats: lr: 0.001194 min_lr: 0.001194 loss: 2.4763 (2.4243) class_acc: 0.6523 (0.6546) weight_decay: 0.0500 (0.0500) grad_norm: 1.1574 (1.1509)
Test: [ 0/50] eta: 0:10:03 loss: 1.2961 (1.2961) acc1: 72.0000 (72.0000) acc5: 92.0000 (92.0000) time: 12.0652 data: 12.0343 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 1.2232 (1.2193) acc1: 72.8000 (73.3818) acc5: 92.0000 (90.6909) time: 1.9236 data: 1.8936 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.3940 (1.4116) acc1: 69.6000 (68.4952) acc5: 88.8000 (88.8762) time: 0.9141 data: 0.8848 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.4795 (1.4343) acc1: 65.6000 (68.0258) acc5: 88.0000 (88.6452) time: 0.8828 data: 0.8531 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4220 (1.4538) acc1: 67.2000 (67.5512) acc5: 87.2000 (88.3122) time: 0.7596 data: 0.7299 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4382 (1.4530) acc1: 67.2000 (67.7280) acc5: 87.2000 (88.1120) time: 0.5138 data: 0.4849 max mem: 6925
Test: Total time: 0:00:49 (0.9845 s / it)
* Acc@1 68.708 Acc@5 88.170 loss 1.415
Accuracy of the model on the 50000 test images: 68.7%
Max accuracy: 69.12%
Epoch: [197] [ 0/625] eta: 3:34:47 lr: 0.001194 min_lr: 0.001194 loss: 2.3615 (2.3615) class_acc: 0.6680 (0.6680) weight_decay: 0.0500 (0.0500) time: 20.6196 data: 17.8429 max mem: 6925
Epoch: [197] [200/625] eta: 0:14:36 lr: 0.001187 min_lr: 0.001187 loss: 2.4321 (2.4081) class_acc: 0.6367 (0.6613) weight_decay: 0.0500 (0.0500) grad_norm: 1.0432 (1.1241) time: 2.0240 data: 0.1550 max mem: 6925
Epoch: [197] [400/625] eta: 0:07:31 lr: 0.001181 min_lr: 0.001181 loss: 2.4233 (2.4189) class_acc: 0.6484 (0.6572) weight_decay: 0.0500 (0.0500) grad_norm: 1.0524 (1.1482) time: 1.9573 data: 0.1191 max mem: 6925
Epoch: [197] [600/625] eta: 0:00:50 lr: 0.001174 min_lr: 0.001174 loss: 2.4423 (2.4190) class_acc: 0.6445 (0.6564) weight_decay: 0.0500 (0.0500) grad_norm: 1.0811 (1.1507) time: 2.0111 data: 0.0013 max mem: 6925
Epoch: [197] [624/625] eta: 0:00:01 lr: 0.001174 min_lr: 0.001174 loss: 2.4362 (2.4202) class_acc: 0.6406 (0.6559) weight_decay: 0.0500 (0.0500) grad_norm: 1.1264 (1.1550) time: 0.9351 data: 0.0016 max mem: 6925
Epoch: [197] Total time: 0:20:36 (1.9788 s / it)
Averaged stats: lr: 0.001174 min_lr: 0.001174 loss: 2.4362 (2.4189) class_acc: 0.6406 (0.6562) weight_decay: 0.0500 (0.0500) grad_norm: 1.1264 (1.1550)
Test: [ 0/50] eta: 0:10:37 loss: 1.2540 (1.2540) acc1: 68.8000 (68.8000) acc5: 89.6000 (89.6000) time: 12.7528 data: 12.7122 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.2540 (1.2602) acc1: 71.2000 (72.2909) acc5: 88.8000 (89.0909) time: 2.0861 data: 2.0562 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.3326 (1.4089) acc1: 68.8000 (68.7238) acc5: 88.0000 (88.1524) time: 1.0741 data: 1.0450 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.5559 (1.4499) acc1: 64.8000 (68.2839) acc5: 87.2000 (87.7936) time: 1.0919 data: 1.0619 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5064 (1.4475) acc1: 67.2000 (68.2341) acc5: 87.2000 (87.7854) time: 0.8587 data: 0.8275 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.5064 (1.4608) acc1: 66.4000 (67.8720) acc5: 87.2000 (87.5360) time: 0.8915 data: 0.8612 max mem: 6925
Test: Total time: 0:00:53 (1.0782 s / it)
* Acc@1 68.636 Acc@5 88.578 loss 1.410
Accuracy of the model on the 50000 test images: 68.6%
Max accuracy: 69.12%
Epoch: [198] [ 0/625] eta: 3:35:04 lr: 0.001174 min_lr: 0.001174 loss: 2.5513 (2.5513) class_acc: 0.6641 (0.6641) weight_decay: 0.0500 (0.0500) time: 20.6476 data: 18.3777 max mem: 6925
Epoch: [198] [200/625] eta: 0:14:27 lr: 0.001167 min_lr: 0.001167 loss: 2.4244 (2.4128) class_acc: 0.6406 (0.6594) weight_decay: 0.0500 (0.0500) grad_norm: 1.0465 (1.1578) time: 2.0048 data: 0.0491 max mem: 6925
Epoch: [198] [400/625] eta: 0:07:27 lr: 0.001161 min_lr: 0.001161 loss: 2.4127 (2.4181) class_acc: 0.6602 (0.6581) weight_decay: 0.0500 (0.0500) grad_norm: 0.9548 (1.1382) time: 1.9357 data: 0.0008 max mem: 6925
Epoch: [198] [600/625] eta: 0:00:50 lr: 0.001154 min_lr: 0.001154 loss: 2.4413 (2.4172) class_acc: 0.6641 (0.6581) weight_decay: 0.0500 (0.0500) grad_norm: 1.0298 (1.1235) time: 2.0016 data: 0.0013 max mem: 6925
Epoch: [198] [624/625] eta: 0:00:01 lr: 0.001153 min_lr: 0.001153 loss: 2.3724 (2.4170) class_acc: 0.6641 (0.6581) weight_decay: 0.0500 (0.0500) grad_norm: 0.9618 (1.1185) time: 0.7441 data: 0.0017 max mem: 6925
Epoch: [198] Total time: 0:20:24 (1.9590 s / it)
Averaged stats: lr: 0.001153 min_lr: 0.001153 loss: 2.3724 (2.4174) class_acc: 0.6641 (0.6568) weight_decay: 0.0500 (0.0500) grad_norm: 0.9618 (1.1185)
Test: [ 0/50] eta: 0:10:26 loss: 1.2627 (1.2627) acc1: 68.8000 (68.8000) acc5: 90.4000 (90.4000) time: 12.5305 data: 12.4915 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.1788 (1.2312) acc1: 72.8000 (72.1455) acc5: 89.6000 (89.1636) time: 2.1201 data: 2.0901 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.3625 (1.3753) acc1: 68.0000 (68.4191) acc5: 88.0000 (87.6191) time: 1.1340 data: 1.1046 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.4541 (1.4017) acc1: 64.8000 (67.8194) acc5: 88.0000 (87.5613) time: 1.1717 data: 1.1427 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.3966 (1.3917) acc1: 66.4000 (67.8439) acc5: 88.8000 (87.7659) time: 0.9173 data: 0.8888 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3736 (1.4069) acc1: 65.6000 (67.4240) acc5: 88.0000 (87.4560) time: 0.8133 data: 0.7846 max mem: 6925
Test: Total time: 0:00:54 (1.0886 s / it)
* Acc@1 68.568 Acc@5 88.366 loss 1.363
Accuracy of the model on the 50000 test images: 68.6%
Max accuracy: 69.12%
Epoch: [199] [ 0/625] eta: 3:40:17 lr: 0.001153 min_lr: 0.001153 loss: 2.3940 (2.3940) class_acc: 0.6641 (0.6641) weight_decay: 0.0500 (0.0500) time: 21.1480 data: 20.4845 max mem: 6925
Epoch: [199] [200/625] eta: 0:14:23 lr: 0.001147 min_lr: 0.001147 loss: 2.4509 (2.3942) class_acc: 0.6484 (0.6598) weight_decay: 0.0500 (0.0500) grad_norm: 1.2203 (1.1248) time: 1.8749 data: 0.8090 max mem: 6925
Epoch: [199] [400/625] eta: 0:07:27 lr: 0.001140 min_lr: 0.001140 loss: 2.3734 (2.3991) class_acc: 0.6602 (0.6591) weight_decay: 0.0500 (0.0500) grad_norm: 1.0288 (1.1316) time: 1.9961 data: 0.0009 max mem: 6925
Epoch: [199] [600/625] eta: 0:00:48 lr: 0.001134 min_lr: 0.001134 loss: 2.3721 (2.4096) class_acc: 0.6641 (0.6574) weight_decay: 0.0500 (0.0500) grad_norm: 1.1357 (1.1422) time: 1.9676 data: 0.0006 max mem: 6925
Epoch: [199] [624/625] eta: 0:00:01 lr: 0.001133 min_lr: 0.001133 loss: 2.4341 (2.4109) class_acc: 0.6406 (0.6572) weight_decay: 0.0500 (0.0500) grad_norm: 1.0880 (1.1420) time: 1.2623 data: 0.0014 max mem: 6925
Epoch: [199] Total time: 0:20:01 (1.9228 s / it)
Averaged stats: lr: 0.001133 min_lr: 0.001133 loss: 2.4341 (2.4138) class_acc: 0.6406 (0.6570) weight_decay: 0.0500 (0.0500) grad_norm: 1.0880 (1.1420)
Test: [ 0/50] eta: 0:10:02 loss: 1.2320 (1.2320) acc1: 73.6000 (73.6000) acc5: 91.2000 (91.2000) time: 12.0486 data: 12.0177 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.2320 (1.2638) acc1: 71.2000 (71.6364) acc5: 89.6000 (89.5273) time: 2.1614 data: 2.1305 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.3773 (1.4054) acc1: 68.0000 (68.0381) acc5: 88.8000 (88.4191) time: 1.1676 data: 1.1374 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.4935 (1.4264) acc1: 65.6000 (67.7419) acc5: 88.0000 (88.1032) time: 0.8778 data: 0.8487 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4715 (1.4448) acc1: 66.4000 (67.3951) acc5: 88.0000 (87.8244) time: 0.4534 data: 0.4249 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4715 (1.4477) acc1: 67.2000 (67.4560) acc5: 87.2000 (87.6160) time: 0.4058 data: 0.3775 max mem: 6925
Test: Total time: 0:00:45 (0.9055 s / it)
* Acc@1 68.474 Acc@5 88.412 loss 1.402
Accuracy of the model on the 50000 test images: 68.5%
Max accuracy: 69.12%
Epoch: [200] [ 0/625] eta: 4:14:55 lr: 0.001133 min_lr: 0.001133 loss: 2.4298 (2.4298) class_acc: 0.6406 (0.6406) weight_decay: 0.0500 (0.0500) time: 24.4735 data: 16.6305 max mem: 6925
Epoch: [200] [200/625] eta: 0:13:37 lr: 0.001126 min_lr: 0.001126 loss: 2.3612 (2.3934) class_acc: 0.6680 (0.6631) weight_decay: 0.0500 (0.0500) grad_norm: 1.0346 (1.1297) time: 1.8051 data: 0.8109 max mem: 6925
Epoch: [200] [400/625] eta: 0:07:12 lr: 0.001120 min_lr: 0.001120 loss: 2.3813 (2.3977) class_acc: 0.6523 (0.6612) weight_decay: 0.0500 (0.0500) grad_norm: 1.0893 (1.1467) time: 1.9085 data: 0.0485 max mem: 6925
Epoch: [200] [600/625] eta: 0:00:47 lr: 0.001114 min_lr: 0.001114 loss: 2.4168 (2.4057) class_acc: 0.6406 (0.6594) weight_decay: 0.0500 (0.0500) grad_norm: 1.0192 (1.1578) time: 1.9540 data: 0.2131 max mem: 6925
Epoch: [200] [624/625] eta: 0:00:01 lr: 0.001113 min_lr: 0.001113 loss: 2.4232 (2.4064) class_acc: 0.6523 (0.6594) weight_decay: 0.0500 (0.0500) grad_norm: 1.1179 (1.1566) time: 0.7097 data: 0.0634 max mem: 6925
Epoch: [200] Total time: 0:19:38 (1.8855 s / it)
Averaged stats: lr: 0.001113 min_lr: 0.001113 loss: 2.4232 (2.4111) class_acc: 0.6523 (0.6583) weight_decay: 0.0500 (0.0500) grad_norm: 1.1179 (1.1566)
Test: [ 0/50] eta: 0:10:40 loss: 1.2492 (1.2492) acc1: 74.4000 (74.4000) acc5: 91.2000 (91.2000) time: 12.8020 data: 12.7599 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.2790 (1.2232) acc1: 72.8000 (73.0182) acc5: 89.6000 (89.6000) time: 2.0573 data: 2.0270 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.4087 (1.3801) acc1: 69.6000 (69.6762) acc5: 88.0000 (88.3048) time: 1.0208 data: 0.9908 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.5267 (1.4101) acc1: 65.6000 (68.4645) acc5: 87.2000 (87.9484) time: 1.0355 data: 1.0060 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5267 (1.4117) acc1: 65.6000 (68.4683) acc5: 87.2000 (87.8634) time: 0.8323 data: 0.8036 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4156 (1.4101) acc1: 65.6000 (68.0800) acc5: 88.0000 (87.8560) time: 0.6806 data: 0.6504 max mem: 6925
Test: Total time: 0:00:53 (1.0607 s / it)
* Acc@1 69.092 Acc@5 88.648 loss 1.356
Accuracy of the model on the 50000 test images: 69.1%
Max accuracy: 69.12%
Epoch: [201] [ 0/625] eta: 3:45:19 lr: 0.001113 min_lr: 0.001113 loss: 2.4406 (2.4406) class_acc: 0.6484 (0.6484) weight_decay: 0.0500 (0.0500) time: 21.6319 data: 20.5031 max mem: 6925
Epoch: [201] [200/625] eta: 0:13:34 lr: 0.001106 min_lr: 0.001106 loss: 2.3946 (2.3937) class_acc: 0.6602 (0.6623) weight_decay: 0.0500 (0.0500) grad_norm: 1.0980 (inf) time: 1.8335 data: 0.3284 max mem: 6925
Epoch: [201] [400/625] eta: 0:07:05 lr: 0.001100 min_lr: 0.001100 loss: 2.4114 (2.4046) class_acc: 0.6602 (0.6586) weight_decay: 0.0500 (0.0500) grad_norm: 1.0077 (inf) time: 1.8608 data: 0.0151 max mem: 6925
Epoch: [201] [600/625] eta: 0:00:47 lr: 0.001094 min_lr: 0.001094 loss: 2.3737 (2.4091) class_acc: 0.6562 (0.6572) weight_decay: 0.0500 (0.0500) grad_norm: 1.2319 (inf) time: 1.9272 data: 0.0009 max mem: 6925
Epoch: [201] [624/625] eta: 0:00:01 lr: 0.001093 min_lr: 0.001093 loss: 2.4245 (2.4091) class_acc: 0.6406 (0.6570) weight_decay: 0.0500 (0.0500) grad_norm: 1.1325 (inf) time: 0.8046 data: 0.0017 max mem: 6925
Epoch: [201] Total time: 0:19:19 (1.8559 s / it)
Averaged stats: lr: 0.001093 min_lr: 0.001093 loss: 2.4245 (2.4079) class_acc: 0.6406 (0.6592) weight_decay: 0.0500 (0.0500) grad_norm: 1.1325 (inf)
Test: [ 0/50] eta: 0:09:27 loss: 1.2451 (1.2451) acc1: 72.0000 (72.0000) acc5: 92.0000 (92.0000) time: 11.3555 data: 11.3219 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.2542 (1.2931) acc1: 72.8000 (72.1455) acc5: 90.4000 (89.2364) time: 1.9702 data: 1.9388 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.5288 (1.4888) acc1: 67.2000 (68.1524) acc5: 87.2000 (87.3905) time: 1.0777 data: 1.0475 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.6024 (1.4845) acc1: 65.6000 (68.2065) acc5: 86.4000 (87.4323) time: 1.0522 data: 1.0227 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.3429 (1.4730) acc1: 68.0000 (68.3317) acc5: 87.2000 (87.5122) time: 0.7566 data: 0.7265 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.3683 (1.4636) acc1: 68.8000 (68.1280) acc5: 88.0000 (87.6320) time: 0.6278 data: 0.5977 max mem: 6925
Test: Total time: 0:00:49 (0.9881 s / it)
* Acc@1 68.680 Acc@5 88.442 loss 1.431
Accuracy of the model on the 50000 test images: 68.7%
Max accuracy: 69.12%
Epoch: [202] [ 0/625] eta: 3:34:20 lr: 0.001093 min_lr: 0.001093 loss: 2.2128 (2.2128) class_acc: 0.7344 (0.7344) weight_decay: 0.0500 (0.0500) time: 20.5774 data: 20.3270 max mem: 6925
Epoch: [202] [200/625] eta: 0:14:23 lr: 0.001086 min_lr: 0.001086 loss: 2.4833 (2.3973) class_acc: 0.6602 (0.6631) weight_decay: 0.0500 (0.0500) grad_norm: 1.0189 (1.1426) time: 2.1478 data: 0.0008 max mem: 6925
Epoch: [202] [400/625] eta: 0:07:23 lr: 0.001080 min_lr: 0.001080 loss: 2.4091 (2.4016) class_acc: 0.6445 (0.6612) weight_decay: 0.0500 (0.0500) grad_norm: 1.2799 (1.1747) time: 1.9684 data: 0.0009 max mem: 6925
Epoch: [202] [600/625] eta: 0:00:48 lr: 0.001074 min_lr: 0.001074 loss: 2.3314 (2.4058) class_acc: 0.6797 (0.6599) weight_decay: 0.0500 (0.0500) grad_norm: 0.9774 (1.1653) time: 1.9117 data: 0.0008 max mem: 6925
Epoch: [202] [624/625] eta: 0:00:01 lr: 0.001073 min_lr: 0.001073 loss: 2.4543 (2.4066) class_acc: 0.6406 (0.6596) weight_decay: 0.0500 (0.0500) grad_norm: 1.0108 (1.1645) time: 0.9435 data: 0.0017 max mem: 6925
Epoch: [202] Total time: 0:19:56 (1.9137 s / it)
Averaged stats: lr: 0.001073 min_lr: 0.001073 loss: 2.4543 (2.4036) class_acc: 0.6406 (0.6601) weight_decay: 0.0500 (0.0500) grad_norm: 1.0108 (1.1645)
Test: [ 0/50] eta: 0:10:23 loss: 1.1271 (1.1271) acc1: 77.6000 (77.6000) acc5: 93.6000 (93.6000) time: 12.4641 data: 12.4329 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.1810 (1.2353) acc1: 73.6000 (73.5273) acc5: 90.4000 (89.8909) time: 2.1052 data: 2.0744 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.3881 (1.4176) acc1: 68.8000 (68.8762) acc5: 88.8000 (88.5333) time: 1.1121 data: 1.0823 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.5919 (1.4532) acc1: 64.0000 (68.0516) acc5: 87.2000 (87.9226) time: 1.0935 data: 1.0637 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4149 (1.4419) acc1: 64.0000 (67.9024) acc5: 88.0000 (87.9805) time: 0.7765 data: 0.7460 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4149 (1.4446) acc1: 66.4000 (67.7440) acc5: 88.0000 (87.8720) time: 0.7610 data: 0.7294 max mem: 6925
Test: Total time: 0:00:50 (1.0135 s / it)
* Acc@1 68.604 Acc@5 88.268 loss 1.411
Accuracy of the model on the 50000 test images: 68.6%
Max accuracy: 69.12%
Epoch: [203] [ 0/625] eta: 3:36:03 lr: 0.001073 min_lr: 0.001073 loss: 2.2433 (2.2433) class_acc: 0.6992 (0.6992) weight_decay: 0.0500 (0.0500) time: 20.7418 data: 17.7346 max mem: 6925
Epoch: [203] [200/625] eta: 0:14:16 lr: 0.001066 min_lr: 0.001066 loss: 2.4319 (2.3936) class_acc: 0.6406 (0.6625) weight_decay: 0.0500 (0.0500) grad_norm: 1.0364 (1.1869) time: 2.0515 data: 0.0056 max mem: 6925
Epoch: [203] [400/625] eta: 0:07:21 lr: 0.001060 min_lr: 0.001060 loss: 2.4010 (2.3922) class_acc: 0.6523 (0.6626) weight_decay: 0.0500 (0.0500) grad_norm: 0.9591 (1.1339) time: 1.7623 data: 0.0442 max mem: 6925
Epoch: [203] [600/625] eta: 0:00:49 lr: 0.001054 min_lr: 0.001054 loss: 2.3503 (2.3931) class_acc: 0.6836 (0.6629) weight_decay: 0.0500 (0.0500) grad_norm: 1.1163 (1.1394) time: 2.0112 data: 0.0022 max mem: 6925
Epoch: [203] [624/625] eta: 0:00:01 lr: 0.001053 min_lr: 0.001053 loss: 2.4175 (2.3942) class_acc: 0.6562 (0.6625) weight_decay: 0.0500 (0.0500) grad_norm: 1.0589 (1.1383) time: 0.8926 data: 0.0015 max mem: 6925
Epoch: [203] Total time: 0:20:09 (1.9346 s / it)
Averaged stats: lr: 0.001053 min_lr: 0.001053 loss: 2.4175 (2.3999) class_acc: 0.6562 (0.6606) weight_decay: 0.0500 (0.0500) grad_norm: 1.0589 (1.1383)
Test: [ 0/50] eta: 0:10:15 loss: 1.1814 (1.1814) acc1: 75.2000 (75.2000) acc5: 94.4000 (94.4000) time: 12.3126 data: 12.2448 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.3197 (1.3566) acc1: 71.2000 (70.4727) acc5: 89.6000 (89.2364) time: 1.9892 data: 1.9561 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.5445 (1.5108) acc1: 68.0000 (67.0095) acc5: 87.2000 (87.6952) time: 1.0154 data: 0.9865 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.5933 (1.5187) acc1: 63.2000 (66.5290) acc5: 87.2000 (87.3290) time: 1.0601 data: 1.0317 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.5150 (1.5025) acc1: 65.6000 (66.6927) acc5: 87.2000 (87.5902) time: 0.8725 data: 0.8436 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4028 (1.4960) acc1: 67.2000 (66.8960) acc5: 88.0000 (87.6320) time: 0.7471 data: 0.7173 max mem: 6925
Test: Total time: 0:00:53 (1.0654 s / it)
* Acc@1 68.206 Acc@5 88.242 loss 1.442
Accuracy of the model on the 50000 test images: 68.2%
Max accuracy: 69.12%
Epoch: [204] [ 0/625] eta: 3:54:53 lr: 0.001053 min_lr: 0.001053 loss: 2.2787 (2.2787) class_acc: 0.6992 (0.6992) weight_decay: 0.0500 (0.0500) time: 22.5493 data: 21.9135 max mem: 6925
Epoch: [204] [200/625] eta: 0:14:33 lr: 0.001047 min_lr: 0.001047 loss: 2.3805 (2.3879) class_acc: 0.6562 (0.6635) weight_decay: 0.0500 (0.0500) grad_norm: 1.1393 (1.1653) time: 1.9878 data: 0.1151 max mem: 6925
Epoch: [204] [400/625] eta: 0:07:31 lr: 0.001040 min_lr: 0.001040 loss: 2.3533 (2.3902) class_acc: 0.6680 (0.6628) weight_decay: 0.0500 (0.0500) grad_norm: 1.1200 (1.1871) time: 1.9713 data: 0.0214 max mem: 6925
Epoch: [204] [600/625] eta: 0:00:50 lr: 0.001034 min_lr: 0.001034 loss: 2.3892 (2.3983) class_acc: 0.6562 (0.6605) weight_decay: 0.0500 (0.0500) grad_norm: 1.0486 (1.1713) time: 2.0511 data: 0.0012 max mem: 6925
Epoch: [204] [624/625] eta: 0:00:01 lr: 0.001033 min_lr: 0.001033 loss: 2.4410 (2.4005) class_acc: 0.6445 (0.6600) weight_decay: 0.0500 (0.0500) grad_norm: 0.9662 (1.1625) time: 0.8076 data: 0.0020 max mem: 6925
Epoch: [204] Total time: 0:20:33 (1.9728 s / it)
Averaged stats: lr: 0.001033 min_lr: 0.001033 loss: 2.4410 (2.3966) class_acc: 0.6445 (0.6616) weight_decay: 0.0500 (0.0500) grad_norm: 0.9662 (1.1625)
Test: [ 0/50] eta: 0:10:23 loss: 1.2292 (1.2292) acc1: 68.8000 (68.8000) acc5: 93.6000 (93.6000) time: 12.4772 data: 12.4442 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.2292 (1.2160) acc1: 73.6000 (73.0182) acc5: 90.4000 (90.4727) time: 2.0073 data: 1.9776 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.3590 (1.3752) acc1: 69.6000 (69.7143) acc5: 88.8000 (88.6476) time: 1.0152 data: 0.9862 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.4294 (1.4008) acc1: 65.6000 (68.5419) acc5: 87.2000 (88.2323) time: 1.0316 data: 1.0030 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4306 (1.4038) acc1: 65.6000 (68.4488) acc5: 88.0000 (88.2342) time: 0.8357 data: 0.8047 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4306 (1.4113) acc1: 66.4000 (68.2400) acc5: 88.0000 (88.1440) time: 0.7355 data: 0.7041 max mem: 6925
Test: Total time: 0:00:53 (1.0679 s / it)
* Acc@1 69.468 Acc@5 89.022 loss 1.362
Accuracy of the model on the 50000 test images: 69.5%
Max accuracy: 69.47%
Epoch: [205] [ 0/625] eta: 3:19:55 lr: 0.001033 min_lr: 0.001033 loss: 2.3100 (2.3100) class_acc: 0.6836 (0.6836) weight_decay: 0.0500 (0.0500) time: 19.1933 data: 18.3549 max mem: 6925
Epoch: [205] [200/625] eta: 0:14:06 lr: 0.001027 min_lr: 0.001027 loss: 2.3609 (2.3773) class_acc: 0.6680 (0.6649) weight_decay: 0.0500 (0.0500) grad_norm: 1.1594 (1.1983) time: 1.8144 data: 0.0007 max mem: 6925
Epoch: [205] [400/625] eta: 0:07:27 lr: 0.001021 min_lr: 0.001021 loss: 2.3921 (2.3839) class_acc: 0.6680 (0.6639) weight_decay: 0.0500 (0.0500) grad_norm: 1.1148 (1.2008) time: 2.1396 data: 0.0006 max mem: 6925
Epoch: [205] [600/625] eta: 0:00:49 lr: 0.001014 min_lr: 0.001014 loss: 2.3741 (2.3877) class_acc: 0.6719 (0.6628) weight_decay: 0.0500 (0.0500) grad_norm: 1.1799 (1.1938) time: 2.1534 data: 0.0009 max mem: 6925
Epoch: [205] [624/625] eta: 0:00:01 lr: 0.001014 min_lr: 0.001014 loss: 2.4175 (2.3887) class_acc: 0.6445 (0.6626) weight_decay: 0.0500 (0.0500) grad_norm: 1.0829 (1.1948) time: 0.6367 data: 0.0018 max mem: 6925
Epoch: [205] Total time: 0:20:27 (1.9635 s / it)
Averaged stats: lr: 0.001014 min_lr: 0.001014 loss: 2.4175 (2.3933) class_acc: 0.6445 (0.6625) weight_decay: 0.0500 (0.0500) grad_norm: 1.0829 (1.1948)
Test: [ 0/50] eta: 0:10:55 loss: 1.1919 (1.1919) acc1: 75.2000 (75.2000) acc5: 92.8000 (92.8000) time: 13.1116 data: 13.0758 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.1919 (1.2038) acc1: 75.2000 (73.7455) acc5: 91.2000 (90.4000) time: 2.2227 data: 2.1930 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.3741 (1.3703) acc1: 68.0000 (69.5238) acc5: 88.8000 (88.7238) time: 1.1846 data: 1.1556 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.4870 (1.3988) acc1: 64.0000 (68.8258) acc5: 87.2000 (88.2581) time: 1.1295 data: 1.1006 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4211 (1.4061) acc1: 68.0000 (68.5854) acc5: 87.2000 (88.1561) time: 0.6912 data: 0.6620 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4211 (1.4158) acc1: 68.0000 (68.5440) acc5: 87.2000 (87.8880) time: 0.6167 data: 0.5872 max mem: 6925
Test: Total time: 0:00:51 (1.0268 s / it)
* Acc@1 69.078 Acc@5 88.672 loss 1.384
Accuracy of the model on the 50000 test images: 69.1%
Max accuracy: 69.47%
Epoch: [206] [ 0/625] eta: 3:23:23 lr: 0.001014 min_lr: 0.001014 loss: 2.3108 (2.3108) class_acc: 0.6797 (0.6797) weight_decay: 0.0500 (0.0500) time: 19.5261 data: 19.2999 max mem: 6925
Epoch: [206] [200/625] eta: 0:13:45 lr: 0.001007 min_lr: 0.001007 loss: 2.3371 (2.3780) class_acc: 0.6758 (0.6657) weight_decay: 0.0500 (0.0500) grad_norm: 1.1325 (1.1547) time: 1.7653 data: 1.1203 max mem: 6925
Epoch: [206] [400/625] eta: 0:07:09 lr: 0.001001 min_lr: 0.001001 loss: 2.4262 (2.3870) class_acc: 0.6602 (0.6643) weight_decay: 0.0500 (0.0500) grad_norm: 1.1617 (1.1475) time: 1.9532 data: 0.5088 max mem: 6925
Epoch: [206] [600/625] eta: 0:00:47 lr: 0.000995 min_lr: 0.000995 loss: 2.4053 (2.3935) class_acc: 0.6602 (0.6627) weight_decay: 0.0500 (0.0500) grad_norm: 1.1037 (1.1458) time: 2.2336 data: 0.6730 max mem: 6925
Epoch: [206] [624/625] eta: 0:00:01 lr: 0.000994 min_lr: 0.000994 loss: 2.3714 (2.3929) class_acc: 0.6602 (0.6629) weight_decay: 0.0500 (0.0500) grad_norm: 1.1932 (1.1523) time: 1.0265 data: 0.0193 max mem: 6925
Epoch: [206] Total time: 0:19:34 (1.8792 s / it)
Averaged stats: lr: 0.000994 min_lr: 0.000994 loss: 2.3714 (2.3867) class_acc: 0.6602 (0.6640) weight_decay: 0.0500 (0.0500) grad_norm: 1.1932 (1.1523)
Test: [ 0/50] eta: 0:09:20 loss: 1.1589 (1.1589) acc1: 72.0000 (72.0000) acc5: 93.6000 (93.6000) time: 11.2020 data: 11.1563 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.1985 (1.2078) acc1: 72.8000 (72.6545) acc5: 89.6000 (90.2546) time: 2.0599 data: 2.0296 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.3020 (1.3411) acc1: 70.4000 (69.9429) acc5: 88.8000 (88.9524) time: 1.2045 data: 1.1743 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.4195 (1.3707) acc1: 67.2000 (69.3677) acc5: 87.2000 (88.4903) time: 1.2096 data: 1.1785 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.3754 (1.3809) acc1: 68.8000 (69.2488) acc5: 88.0000 (88.5268) time: 0.8375 data: 0.8078 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3754 (1.3772) acc1: 68.8000 (69.1200) acc5: 88.8000 (88.4960) time: 0.7626 data: 0.7338 max mem: 6925
Test: Total time: 0:00:53 (1.0631 s / it)
* Acc@1 69.802 Acc@5 89.192 loss 1.327
Accuracy of the model on the 50000 test images: 69.8%
Max accuracy: 69.80%
Epoch: [207] [ 0/625] eta: 3:59:08 lr: 0.000994 min_lr: 0.000994 loss: 2.2506 (2.2506) class_acc: 0.7461 (0.7461) weight_decay: 0.0500 (0.0500) time: 22.9571 data: 19.2066 max mem: 6925
Epoch: [207] [200/625] eta: 0:14:29 lr: 0.000988 min_lr: 0.000988 loss: 2.3640 (2.3814) class_acc: 0.6602 (0.6671) weight_decay: 0.0500 (0.0500) grad_norm: 1.0987 (1.2247) time: 2.0696 data: 0.9254 max mem: 6925
Epoch: [207] [400/625] eta: 0:07:35 lr: 0.000982 min_lr: 0.000982 loss: 2.3923 (2.3873) class_acc: 0.6562 (0.6641) weight_decay: 0.0500 (0.0500) grad_norm: 1.1561 (1.1858) time: 2.1283 data: 0.0846 max mem: 6925
Epoch: [207] [600/625] eta: 0:00:51 lr: 0.000976 min_lr: 0.000976 loss: 2.3984 (2.3891) class_acc: 0.6602 (0.6643) weight_decay: 0.0500 (0.0500) grad_norm: 1.0152 (inf) time: 2.2213 data: 0.0199 max mem: 6925
Epoch: [207] [624/625] eta: 0:00:01 lr: 0.000975 min_lr: 0.000975 loss: 2.3747 (2.3889) class_acc: 0.6719 (0.6642) weight_decay: 0.0500 (0.0500) grad_norm: 0.9984 (inf) time: 0.7165 data: 0.0014 max mem: 6925
Epoch: [207] Total time: 0:20:48 (1.9975 s / it)
Averaged stats: lr: 0.000975 min_lr: 0.000975 loss: 2.3747 (2.3865) class_acc: 0.6719 (0.6636) weight_decay: 0.0500 (0.0500) grad_norm: 0.9984 (inf)
Test: [ 0/50] eta: 0:09:18 loss: 1.1608 (1.1608) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0000) time: 11.1634 data: 11.1279 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.1608 (1.2430) acc1: 73.6000 (73.3091) acc5: 90.4000 (89.6000) time: 2.0269 data: 1.9972 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.3309 (1.4001) acc1: 69.6000 (69.1810) acc5: 88.0000 (88.1524) time: 1.1659 data: 1.1368 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.4832 (1.4100) acc1: 67.2000 (68.8516) acc5: 86.4000 (88.2581) time: 1.1951 data: 1.1664 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.4340 (1.4241) acc1: 66.4000 (68.1756) acc5: 88.8000 (88.1171) time: 0.9649 data: 0.9357 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4404 (1.4270) acc1: 65.6000 (67.9680) acc5: 88.8000 (88.0480) time: 0.8625 data: 0.8328 max mem: 6925
Test: Total time: 0:00:54 (1.0936 s / it)
* Acc@1 69.008 Acc@5 88.868 loss 1.381
Accuracy of the model on the 50000 test images: 69.0%
Max accuracy: 69.80%
Epoch: [208] [ 0/625] eta: 3:42:08 lr: 0.000975 min_lr: 0.000975 loss: 2.4566 (2.4566) class_acc: 0.6445 (0.6445) weight_decay: 0.0500 (0.0500) time: 21.3248 data: 17.5473 max mem: 6925
Epoch: [208] [200/625] eta: 0:15:00 lr: 0.000969 min_lr: 0.000969 loss: 2.3771 (2.3739) class_acc: 0.6719 (0.6690) weight_decay: 0.0500 (0.0500) grad_norm: 1.1748 (1.1665) time: 1.9600 data: 0.0008 max mem: 6925
Epoch: [208] [400/625] eta: 0:07:44 lr: 0.000963 min_lr: 0.000963 loss: 2.3197 (2.3758) class_acc: 0.6797 (0.6679) weight_decay: 0.0500 (0.0500) grad_norm: 1.0823 (1.1711) time: 2.0652 data: 0.0007 max mem: 6925
Epoch: [208] [600/625] eta: 0:00:51 lr: 0.000956 min_lr: 0.000956 loss: 2.3739 (2.3856) class_acc: 0.6719 (0.6656) weight_decay: 0.0500 (0.0500) grad_norm: 1.1407 (1.1711) time: 2.0102 data: 0.0008 max mem: 6925
Epoch: [208] [624/625] eta: 0:00:01 lr: 0.000956 min_lr: 0.000956 loss: 2.3884 (2.3866) class_acc: 0.6602 (0.6652) weight_decay: 0.0500 (0.0500) grad_norm: 1.0452 (1.1659) time: 0.7199 data: 0.0014 max mem: 6925
Epoch: [208] Total time: 0:20:47 (1.9957 s / it)
Averaged stats: lr: 0.000956 min_lr: 0.000956 loss: 2.3884 (2.3811) class_acc: 0.6602 (0.6657) weight_decay: 0.0500 (0.0500) grad_norm: 1.0452 (1.1659)
Test: [ 0/50] eta: 0:09:40 loss: 1.2270 (1.2270) acc1: 74.4000 (74.4000) acc5: 89.6000 (89.6000) time: 11.6077 data: 11.5757 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 1.2150 (1.2333) acc1: 72.8000 (72.8727) acc5: 90.4000 (90.0364) time: 1.8516 data: 1.8211 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.2940 (1.3769) acc1: 70.4000 (69.4095) acc5: 89.6000 (88.8000) time: 0.9914 data: 0.9619 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.4260 (1.3860) acc1: 65.6000 (68.8774) acc5: 87.2000 (88.5936) time: 1.0723 data: 1.0430 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.3920 (1.4118) acc1: 66.4000 (68.3122) acc5: 87.2000 (88.4488) time: 0.7844 data: 0.7551 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4011 (1.4229) acc1: 66.4000 (67.9360) acc5: 88.0000 (88.3360) time: 0.6347 data: 0.6052 max mem: 6925
Test: Total time: 0:00:48 (0.9672 s / it)
* Acc@1 69.192 Acc@5 88.990 loss 1.383
Accuracy of the model on the 50000 test images: 69.2%
Max accuracy: 69.80%
Epoch: [209] [ 0/625] eta: 3:26:24 lr: 0.000956 min_lr: 0.000956 loss: 2.2859 (2.2859) class_acc: 0.7188 (0.7188) weight_decay: 0.0500 (0.0500) time: 19.8144 data: 19.5555 max mem: 6925
Epoch: [209] [200/625] eta: 0:14:11 lr: 0.000950 min_lr: 0.000950 loss: 2.3306 (2.3472) class_acc: 0.6797 (0.6748) weight_decay: 0.0500 (0.0500) grad_norm: 1.2808 (1.2045) time: 1.8730 data: 0.0798 max mem: 6925
Epoch: [209] [400/625] eta: 0:07:15 lr: 0.000944 min_lr: 0.000944 loss: 2.3725 (2.3651) class_acc: 0.6523 (0.6701) weight_decay: 0.0500 (0.0500) grad_norm: 1.3317 (1.2271) time: 1.8269 data: 0.0010 max mem: 6925
Epoch: [209] [600/625] eta: 0:00:48 lr: 0.000937 min_lr: 0.000937 loss: 2.3891 (2.3727) class_acc: 0.6680 (0.6684) weight_decay: 0.0500 (0.0500) grad_norm: 1.0403 (1.2039) time: 2.0539 data: 0.0091 max mem: 6925
Epoch: [209] [624/625] eta: 0:00:01 lr: 0.000937 min_lr: 0.000937 loss: 2.3748 (2.3738) class_acc: 0.6641 (0.6681) weight_decay: 0.0500 (0.0500) grad_norm: 1.0448 (1.2052) time: 0.7301 data: 0.0123 max mem: 6925
Epoch: [209] Total time: 0:19:34 (1.8794 s / it)
Averaged stats: lr: 0.000937 min_lr: 0.000937 loss: 2.3748 (2.3791) class_acc: 0.6641 (0.6662) weight_decay: 0.0500 (0.0500) grad_norm: 1.0448 (1.2052)
Test: [ 0/50] eta: 0:09:16 loss: 1.0348 (1.0348) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 11.1261 data: 11.0743 max mem: 6925
Test: [10/50] eta: 0:01:13 loss: 1.1610 (1.1885) acc1: 74.4000 (73.6727) acc5: 89.6000 (90.2546) time: 1.8381 data: 1.8070 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.3862 (1.3565) acc1: 68.8000 (69.4476) acc5: 89.6000 (88.9905) time: 0.9572 data: 0.9282 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.5234 (1.3759) acc1: 65.6000 (68.8000) acc5: 88.0000 (88.5677) time: 0.9698 data: 0.9412 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4063 (1.3892) acc1: 66.4000 (68.7024) acc5: 88.0000 (88.4488) time: 0.8576 data: 0.8277 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4063 (1.4054) acc1: 67.2000 (68.2560) acc5: 88.0000 (88.2080) time: 0.6062 data: 0.5756 max mem: 6925
Test: Total time: 0:00:50 (1.0115 s / it)
* Acc@1 69.312 Acc@5 88.686 loss 1.369
Accuracy of the model on the 50000 test images: 69.3%
Max accuracy: 69.80%
Epoch: [210] [ 0/625] eta: 3:27:39 lr: 0.000937 min_lr: 0.000937 loss: 2.2861 (2.2861) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 19.9355 data: 19.6838 max mem: 6925
Epoch: [210] [200/625] eta: 0:13:52 lr: 0.000931 min_lr: 0.000931 loss: 2.3439 (2.3552) class_acc: 0.6836 (0.6707) weight_decay: 0.0500 (0.0500) grad_norm: 1.1471 (1.2262) time: 1.8508 data: 0.5761 max mem: 6925
Epoch: [210] [400/625] eta: 0:07:16 lr: 0.000925 min_lr: 0.000925 loss: 2.3784 (2.3626) class_acc: 0.6719 (0.6698) weight_decay: 0.0500 (0.0500) grad_norm: 1.1028 (1.2072) time: 2.1344 data: 0.0069 max mem: 6925
Epoch: [210] [600/625] eta: 0:00:48 lr: 0.000918 min_lr: 0.000918 loss: 2.4009 (2.3734) class_acc: 0.6719 (0.6672) weight_decay: 0.0500 (0.0500) grad_norm: 1.1726 (1.2060) time: 1.9258 data: 0.0010 max mem: 6925
Epoch: [210] [624/625] eta: 0:00:01 lr: 0.000918 min_lr: 0.000918 loss: 2.4149 (2.3738) class_acc: 0.6602 (0.6672) weight_decay: 0.0500 (0.0500) grad_norm: 1.0990 (1.2057) time: 0.7981 data: 0.0016 max mem: 6925
Epoch: [210] Total time: 0:19:40 (1.8896 s / it)
Averaged stats: lr: 0.000918 min_lr: 0.000918 loss: 2.4149 (2.3789) class_acc: 0.6602 (0.6664) weight_decay: 0.0500 (0.0500) grad_norm: 1.0990 (1.2057)
Test: [ 0/50] eta: 0:10:42 loss: 1.0472 (1.0472) acc1: 74.4000 (74.4000) acc5: 93.6000 (93.6000) time: 12.8522 data: 12.7977 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 1.2121 (1.2344) acc1: 73.6000 (73.3818) acc5: 90.4000 (90.1091) time: 1.9554 data: 1.9221 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.3920 (1.3985) acc1: 69.6000 (68.9524) acc5: 88.8000 (88.7238) time: 0.8222 data: 0.7923 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.5347 (1.4270) acc1: 64.8000 (68.4645) acc5: 87.2000 (88.2065) time: 0.9403 data: 0.9115 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4716 (1.4228) acc1: 68.0000 (68.8390) acc5: 88.8000 (88.4683) time: 0.9602 data: 0.9310 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4095 (1.4314) acc1: 68.0000 (68.5280) acc5: 89.6000 (88.3520) time: 0.5739 data: 0.5451 max mem: 6925
Test: Total time: 0:00:51 (1.0383 s / it)
* Acc@1 69.610 Acc@5 89.044 loss 1.391
Accuracy of the model on the 50000 test images: 69.6%
Max accuracy: 69.80%
Epoch: [211] [ 0/625] eta: 3:36:02 lr: 0.000918 min_lr: 0.000918 loss: 2.2988 (2.2988) class_acc: 0.6680 (0.6680) weight_decay: 0.0500 (0.0500) time: 20.7399 data: 19.2138 max mem: 6925
Epoch: [211] [200/625] eta: 0:14:00 lr: 0.000912 min_lr: 0.000912 loss: 2.3445 (2.3685) class_acc: 0.6758 (0.6673) weight_decay: 0.0500 (0.0500) grad_norm: 1.0690 (1.1802) time: 1.7227 data: 0.0009 max mem: 6925
Epoch: [211] [400/625] eta: 0:07:21 lr: 0.000906 min_lr: 0.000906 loss: 2.3512 (2.3722) class_acc: 0.6719 (0.6675) weight_decay: 0.0500 (0.0500) grad_norm: 1.1517 (1.1836) time: 1.8857 data: 0.0011 max mem: 6925
Epoch: [211] [600/625] eta: 0:00:49 lr: 0.000900 min_lr: 0.000900 loss: 2.4013 (2.3761) class_acc: 0.6523 (0.6668) weight_decay: 0.0500 (0.0500) grad_norm: 1.1960 (1.2052) time: 1.9388 data: 0.0010 max mem: 6925
Epoch: [211] [624/625] eta: 0:00:01 lr: 0.000899 min_lr: 0.000899 loss: 2.3982 (2.3771) class_acc: 0.6602 (0.6666) weight_decay: 0.0500 (0.0500) grad_norm: 1.2691 (1.2089) time: 0.8320 data: 0.0015 max mem: 6925
Epoch: [211] Total time: 0:20:01 (1.9221 s / it)
Averaged stats: lr: 0.000899 min_lr: 0.000899 loss: 2.3982 (2.3728) class_acc: 0.6602 (0.6677) weight_decay: 0.0500 (0.0500) grad_norm: 1.2691 (1.2089)
Test: [ 0/50] eta: 0:10:39 loss: 1.0740 (1.0740) acc1: 79.2000 (79.2000) acc5: 91.2000 (91.2000) time: 12.7821 data: 12.7508 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 1.1373 (1.1623) acc1: 74.4000 (74.9091) acc5: 91.2000 (90.1818) time: 2.2610 data: 2.2321 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.3770 (1.3378) acc1: 68.8000 (70.5143) acc5: 88.0000 (88.7238) time: 1.2670 data: 1.2376 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.5036 (1.3764) acc1: 66.4000 (69.2903) acc5: 86.4000 (88.1806) time: 1.0786 data: 1.0484 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4357 (1.3936) acc1: 66.4000 (68.8390) acc5: 86.4000 (87.8829) time: 0.5918 data: 0.5619 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4687 (1.4079) acc1: 66.4000 (68.5920) acc5: 86.4000 (87.7280) time: 0.5126 data: 0.4829 max mem: 6925
Test: Total time: 0:00:51 (1.0240 s / it)
* Acc@1 69.734 Acc@5 88.956 loss 1.361
Accuracy of the model on the 50000 test images: 69.7%
Max accuracy: 69.80%
Epoch: [212] [ 0/625] eta: 4:15:39 lr: 0.000899 min_lr: 0.000899 loss: 2.2060 (2.2060) class_acc: 0.6758 (0.6758) weight_decay: 0.0500 (0.0500) time: 24.5429 data: 17.6882 max mem: 6925
Epoch: [212] [200/625] eta: 0:13:55 lr: 0.000893 min_lr: 0.000893 loss: 2.4128 (2.3596) class_acc: 0.6602 (0.6692) weight_decay: 0.0500 (0.0500) grad_norm: 1.1474 (1.2114) time: 1.8229 data: 0.0010 max mem: 6925
Epoch: [212] [400/625] eta: 0:07:07 lr: 0.000887 min_lr: 0.000887 loss: 2.3691 (2.3617) class_acc: 0.6641 (0.6695) weight_decay: 0.0500 (0.0500) grad_norm: 1.1567 (1.2097) time: 1.9028 data: 0.0013 max mem: 6925
Epoch: [212] [600/625] eta: 0:00:48 lr: 0.000881 min_lr: 0.000881 loss: 2.3887 (2.3642) class_acc: 0.6719 (0.6690) weight_decay: 0.0500 (0.0500) grad_norm: 1.1363 (1.2064) time: 1.7512 data: 0.0007 max mem: 6925
Epoch: [212] [624/625] eta: 0:00:01 lr: 0.000880 min_lr: 0.000880 loss: 2.3327 (2.3647) class_acc: 0.6641 (0.6688) weight_decay: 0.0500 (0.0500) grad_norm: 1.0656 (1.2040) time: 0.7179 data: 0.0015 max mem: 6925
Epoch: [212] Total time: 0:19:50 (1.9053 s / it)
Averaged stats: lr: 0.000880 min_lr: 0.000880 loss: 2.3327 (2.3684) class_acc: 0.6641 (0.6686) weight_decay: 0.0500 (0.0500) grad_norm: 1.0656 (1.2040)
Test: [ 0/50] eta: 0:09:32 loss: 1.1922 (1.1922) acc1: 69.6000 (69.6000) acc5: 92.0000 (92.0000) time: 11.4556 data: 11.4162 max mem: 6925
Test: [10/50] eta: 0:01:09 loss: 1.1695 (1.1511) acc1: 74.4000 (73.9636) acc5: 92.0000 (90.8364) time: 1.7498 data: 1.7193 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.2434 (1.3161) acc1: 70.4000 (70.2476) acc5: 89.6000 (89.4476) time: 0.8924 data: 0.8633 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.4236 (1.3585) acc1: 67.2000 (69.3936) acc5: 88.8000 (88.9548) time: 1.0679 data: 1.0393 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4236 (1.3655) acc1: 66.4000 (69.2488) acc5: 88.8000 (88.8000) time: 0.8751 data: 0.8443 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4411 (1.3804) acc1: 67.2000 (68.7520) acc5: 88.0000 (88.5440) time: 0.4632 data: 0.4309 max mem: 6925
Test: Total time: 0:00:50 (1.0003 s / it)
* Acc@1 69.822 Acc@5 89.152 loss 1.336
Accuracy of the model on the 50000 test images: 69.8%
Max accuracy: 69.82%
Epoch: [213] [ 0/625] eta: 3:27:28 lr: 0.000880 min_lr: 0.000880 loss: 2.2354 (2.2354) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 19.9176 data: 19.6781 max mem: 6925
Epoch: [213] [200/625] eta: 0:14:12 lr: 0.000874 min_lr: 0.000874 loss: 2.3750 (2.3490) class_acc: 0.6680 (0.6733) weight_decay: 0.0500 (0.0500) grad_norm: 1.0797 (1.2494) time: 1.8212 data: 0.0097 max mem: 6925
Epoch: [213] [400/625] eta: 0:07:12 lr: 0.000868 min_lr: 0.000868 loss: 2.3718 (2.3595) class_acc: 0.6641 (0.6709) weight_decay: 0.0500 (0.0500) grad_norm: 1.4064 (1.2372) time: 1.8391 data: 0.0660 max mem: 6925
Epoch: [213] [600/625] eta: 0:00:48 lr: 0.000863 min_lr: 0.000863 loss: 2.3532 (2.3678) class_acc: 0.6484 (0.6684) weight_decay: 0.0500 (0.0500) grad_norm: 1.0783 (1.1975) time: 1.9697 data: 0.0011 max mem: 6925
Epoch: [213] [624/625] eta: 0:00:01 lr: 0.000862 min_lr: 0.000862 loss: 2.3774 (2.3687) class_acc: 0.6680 (0.6683) weight_decay: 0.0500 (0.0500) grad_norm: 1.0796 (1.1955) time: 0.4929 data: 0.0026 max mem: 6925
Epoch: [213] Total time: 0:19:47 (1.9001 s / it)
Averaged stats: lr: 0.000862 min_lr: 0.000862 loss: 2.3774 (2.3680) class_acc: 0.6680 (0.6687) weight_decay: 0.0500 (0.0500) grad_norm: 1.0796 (1.1955)
Test: [ 0/50] eta: 0:09:26 loss: 1.2915 (1.2915) acc1: 72.0000 (72.0000) acc5: 90.4000 (90.4000) time: 11.3353 data: 11.2657 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.2544 (1.2541) acc1: 72.0000 (72.5818) acc5: 89.6000 (90.1091) time: 2.0585 data: 2.0254 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.2889 (1.3814) acc1: 69.6000 (69.8286) acc5: 88.0000 (88.8762) time: 1.1879 data: 1.1589 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.5065 (1.4053) acc1: 68.0000 (69.4194) acc5: 88.0000 (88.6710) time: 1.1847 data: 1.1562 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4103 (1.4161) acc1: 68.0000 (69.3268) acc5: 88.0000 (88.6634) time: 0.7610 data: 0.7321 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4351 (1.4310) acc1: 68.0000 (68.8960) acc5: 88.0000 (88.5760) time: 0.6260 data: 0.5966 max mem: 6925
Test: Total time: 0:00:51 (1.0305 s / it)
* Acc@1 69.882 Acc@5 89.210 loss 1.381
Accuracy of the model on the 50000 test images: 69.9%
Max accuracy: 69.88%
Epoch: [214] [ 0/625] eta: 3:33:51 lr: 0.000862 min_lr: 0.000862 loss: 2.4388 (2.4388) class_acc: 0.6523 (0.6523) weight_decay: 0.0500 (0.0500) time: 20.5309 data: 17.3593 max mem: 6925
Epoch: [214] [200/625] eta: 0:14:30 lr: 0.000856 min_lr: 0.000856 loss: 2.3253 (2.3430) class_acc: 0.6680 (0.6769) weight_decay: 0.0500 (0.0500) grad_norm: 1.1823 (inf) time: 2.0124 data: 0.0007 max mem: 6925
Epoch: [214] [400/625] eta: 0:07:28 lr: 0.000850 min_lr: 0.000850 loss: 2.4087 (2.3511) class_acc: 0.6719 (0.6746) weight_decay: 0.0500 (0.0500) grad_norm: 1.2852 (inf) time: 1.8678 data: 0.0008 max mem: 6925
Epoch: [214] [600/625] eta: 0:00:49 lr: 0.000844 min_lr: 0.000844 loss: 2.3780 (2.3581) class_acc: 0.6680 (0.6723) weight_decay: 0.0500 (0.0500) grad_norm: 1.0967 (inf) time: 2.0341 data: 0.0008 max mem: 6925
Epoch: [214] [624/625] eta: 0:00:01 lr: 0.000844 min_lr: 0.000844 loss: 2.3547 (2.3589) class_acc: 0.6602 (0.6722) weight_decay: 0.0500 (0.0500) grad_norm: 1.0996 (inf) time: 0.7882 data: 0.0013 max mem: 6925
Epoch: [214] Total time: 0:20:16 (1.9462 s / it)
Averaged stats: lr: 0.000844 min_lr: 0.000844 loss: 2.3547 (2.3610) class_acc: 0.6602 (0.6706) weight_decay: 0.0500 (0.0500) grad_norm: 1.0996 (inf)
Test: [ 0/50] eta: 0:07:55 loss: 1.1659 (1.1659) acc1: 74.4000 (74.4000) acc5: 92.8000 (92.8000) time: 9.5014 data: 9.4650 max mem: 6925
Test: [10/50] eta: 0:01:07 loss: 1.1659 (1.1445) acc1: 74.4000 (74.2545) acc5: 90.4000 (90.6182) time: 1.6834 data: 1.6528 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.3166 (1.3271) acc1: 70.4000 (70.0571) acc5: 88.8000 (89.2571) time: 0.9652 data: 0.9357 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.5041 (1.3627) acc1: 66.4000 (69.1871) acc5: 87.2000 (88.7484) time: 1.0025 data: 0.9735 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.4273 (1.3694) acc1: 65.6000 (69.1902) acc5: 88.0000 (88.6829) time: 0.8905 data: 0.8609 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3651 (1.3693) acc1: 68.0000 (68.8800) acc5: 88.8000 (88.7840) time: 0.5961 data: 0.5654 max mem: 6925
Test: Total time: 0:00:50 (1.0123 s / it)
* Acc@1 70.400 Acc@5 89.414 loss 1.316
Accuracy of the model on the 50000 test images: 70.4%
Max accuracy: 70.40%
Epoch: [215] [ 0/625] eta: 3:52:47 lr: 0.000843 min_lr: 0.000843 loss: 2.2510 (2.2510) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 22.3483 data: 13.8054 max mem: 6925
Epoch: [215] [200/625] eta: 0:14:30 lr: 0.000838 min_lr: 0.000838 loss: 2.2881 (2.3436) class_acc: 0.6719 (0.6743) weight_decay: 0.0500 (0.0500) grad_norm: 1.1847 (1.2190) time: 1.9400 data: 0.0563 max mem: 6925
Epoch: [215] [400/625] eta: 0:07:30 lr: 0.000832 min_lr: 0.000832 loss: 2.3592 (2.3482) class_acc: 0.6719 (0.6732) weight_decay: 0.0500 (0.0500) grad_norm: 1.0640 (1.2107) time: 1.9605 data: 0.0010 max mem: 6925
Epoch: [215] [600/625] eta: 0:00:49 lr: 0.000826 min_lr: 0.000826 loss: 2.3685 (2.3562) class_acc: 0.6758 (0.6716) weight_decay: 0.0500 (0.0500) grad_norm: 1.2392 (1.2317) time: 1.9941 data: 0.0010 max mem: 6925
Epoch: [215] [624/625] eta: 0:00:01 lr: 0.000825 min_lr: 0.000825 loss: 2.3689 (2.3572) class_acc: 0.6680 (0.6713) weight_decay: 0.0500 (0.0500) grad_norm: 1.1651 (1.2303) time: 0.7599 data: 0.0013 max mem: 6925
Epoch: [215] Total time: 0:20:15 (1.9454 s / it)
Averaged stats: lr: 0.000825 min_lr: 0.000825 loss: 2.3689 (2.3563) class_acc: 0.6680 (0.6720) weight_decay: 0.0500 (0.0500) grad_norm: 1.1651 (1.2303)
Test: [ 0/50] eta: 0:10:38 loss: 1.1487 (1.1487) acc1: 72.0000 (72.0000) acc5: 91.2000 (91.2000) time: 12.7746 data: 12.7428 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.0802 (1.1433) acc1: 76.0000 (75.2727) acc5: 91.2000 (90.4000) time: 2.0161 data: 1.9846 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.2213 (1.3042) acc1: 72.0000 (71.1238) acc5: 89.6000 (89.2191) time: 0.9817 data: 0.9494 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.4962 (1.3609) acc1: 66.4000 (69.6258) acc5: 88.8000 (88.5936) time: 0.9190 data: 0.8879 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4045 (1.3636) acc1: 66.4000 (69.3463) acc5: 87.2000 (88.5659) time: 0.6319 data: 0.6032 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4045 (1.3776) acc1: 66.4000 (68.8000) acc5: 89.6000 (88.4960) time: 0.6256 data: 0.5967 max mem: 6925
Test: Total time: 0:00:48 (0.9603 s / it)
* Acc@1 70.026 Acc@5 89.280 loss 1.337
Accuracy of the model on the 50000 test images: 70.0%
Max accuracy: 70.40%
Epoch: [216] [ 0/625] eta: 3:54:00 lr: 0.000825 min_lr: 0.000825 loss: 2.3963 (2.3963) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 22.4650 data: 21.7065 max mem: 6925
Epoch: [216] [200/625] eta: 0:13:58 lr: 0.000819 min_lr: 0.000819 loss: 2.3165 (2.3630) class_acc: 0.6719 (0.6693) weight_decay: 0.0500 (0.0500) grad_norm: 1.3189 (1.3743) time: 1.7538 data: 0.6304 max mem: 6925
Epoch: [216] [400/625] eta: 0:07:20 lr: 0.000814 min_lr: 0.000814 loss: 2.3393 (2.3537) class_acc: 0.6680 (0.6721) weight_decay: 0.0500 (0.0500) grad_norm: 1.1602 (1.2985) time: 1.9699 data: 0.0024 max mem: 6925
Epoch: [216] [600/625] eta: 0:00:49 lr: 0.000808 min_lr: 0.000808 loss: 2.4002 (2.3564) class_acc: 0.6680 (0.6723) weight_decay: 0.0500 (0.0500) grad_norm: 1.2304 (1.2768) time: 2.0090 data: 0.0080 max mem: 6925
Epoch: [216] [624/625] eta: 0:00:01 lr: 0.000807 min_lr: 0.000807 loss: 2.4197 (2.3580) class_acc: 0.6562 (0.6719) weight_decay: 0.0500 (0.0500) grad_norm: 1.1892 (1.2750) time: 0.6687 data: 0.0027 max mem: 6925
Epoch: [216] Total time: 0:20:15 (1.9450 s / it)
Averaged stats: lr: 0.000807 min_lr: 0.000807 loss: 2.4197 (2.3547) class_acc: 0.6562 (0.6722) weight_decay: 0.0500 (0.0500) grad_norm: 1.1892 (1.2750)
Test: [ 0/50] eta: 0:10:57 loss: 1.1932 (1.1932) acc1: 75.2000 (75.2000) acc5: 92.8000 (92.8000) time: 13.1572 data: 13.1164 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.0761 (1.1427) acc1: 75.2000 (74.9818) acc5: 92.8000 (91.9273) time: 2.2286 data: 2.1972 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.3220 (1.2888) acc1: 70.4000 (70.8952) acc5: 90.4000 (89.9429) time: 1.1588 data: 1.1294 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.4016 (1.3314) acc1: 67.2000 (70.3484) acc5: 88.0000 (89.0065) time: 1.1816 data: 1.1532 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.3525 (1.3358) acc1: 70.4000 (70.2829) acc5: 87.2000 (88.8781) time: 0.9439 data: 0.9149 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3525 (1.3431) acc1: 68.0000 (69.9680) acc5: 88.8000 (88.8000) time: 0.8293 data: 0.7994 max mem: 6925
Test: Total time: 0:00:56 (1.1239 s / it)
* Acc@1 70.828 Acc@5 89.668 loss 1.302
Accuracy of the model on the 50000 test images: 70.8%
Max accuracy: 70.83%
Epoch: [217] [ 0/625] eta: 3:24:08 lr: 0.000807 min_lr: 0.000807 loss: 2.3051 (2.3051) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 19.5971 data: 16.2827 max mem: 6925
Epoch: [217] [200/625] eta: 0:13:52 lr: 0.000801 min_lr: 0.000801 loss: 2.3844 (2.3482) class_acc: 0.6758 (0.6724) weight_decay: 0.0500 (0.0500) grad_norm: 1.2299 (1.2928) time: 1.8782 data: 0.1419 max mem: 6925
Epoch: [217] [400/625] eta: 0:07:14 lr: 0.000796 min_lr: 0.000796 loss: 2.3605 (2.3468) class_acc: 0.6719 (0.6735) weight_decay: 0.0500 (0.0500) grad_norm: 1.1262 (1.2566) time: 1.8733 data: 0.0693 max mem: 6925
Epoch: [217] [600/625] eta: 0:00:48 lr: 0.000790 min_lr: 0.000790 loss: 2.3828 (2.3572) class_acc: 0.6680 (0.6716) weight_decay: 0.0500 (0.0500) grad_norm: 1.1052 (1.2363) time: 1.9287 data: 1.2657 max mem: 6925
Epoch: [217] [624/625] eta: 0:00:01 lr: 0.000789 min_lr: 0.000789 loss: 2.3325 (2.3571) class_acc: 0.6797 (0.6718) weight_decay: 0.0500 (0.0500) grad_norm: 1.1138 (1.2354) time: 0.9964 data: 0.7356 max mem: 6925
Epoch: [217] Total time: 0:19:37 (1.8840 s / it)
Averaged stats: lr: 0.000789 min_lr: 0.000789 loss: 2.3325 (2.3487) class_acc: 0.6797 (0.6738) weight_decay: 0.0500 (0.0500) grad_norm: 1.1138 (1.2354)
Test: [ 0/50] eta: 0:09:51 loss: 1.1272 (1.1272) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 11.8257 data: 11.7898 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.2212 (1.2592) acc1: 74.4000 (73.1636) acc5: 90.4000 (90.2546) time: 2.0034 data: 1.9708 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.3911 (1.3918) acc1: 68.8000 (70.2857) acc5: 88.0000 (88.8000) time: 1.0802 data: 1.0496 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.4763 (1.4149) acc1: 67.2000 (69.2387) acc5: 87.2000 (88.6452) time: 0.9759 data: 0.9472 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.4321 (1.4219) acc1: 66.4000 (68.8195) acc5: 86.4000 (88.5073) time: 0.5643 data: 0.5359 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.4368 (1.4226) acc1: 67.2000 (68.6880) acc5: 87.2000 (88.3200) time: 0.5276 data: 0.4994 max mem: 6925
Test: Total time: 0:00:45 (0.9085 s / it)
* Acc@1 69.334 Acc@5 89.052 loss 1.386
Accuracy of the model on the 50000 test images: 69.3%
Max accuracy: 70.83%
Epoch: [218] [ 0/625] eta: 3:40:32 lr: 0.000789 min_lr: 0.000789 loss: 2.2251 (2.2251) class_acc: 0.7070 (0.7070) weight_decay: 0.0500 (0.0500) time: 21.1727 data: 15.9652 max mem: 6925
Epoch: [218] [200/625] eta: 0:14:16 lr: 0.000784 min_lr: 0.000784 loss: 2.3544 (2.3515) class_acc: 0.6602 (0.6733) weight_decay: 0.0500 (0.0500) grad_norm: 1.1774 (1.2783) time: 2.0452 data: 0.0243 max mem: 6925
Epoch: [218] [400/625] eta: 0:07:30 lr: 0.000778 min_lr: 0.000778 loss: 2.3833 (2.3513) class_acc: 0.6641 (0.6734) weight_decay: 0.0500 (0.0500) grad_norm: 1.3370 (1.2609) time: 1.8754 data: 0.1169 max mem: 6925
Epoch: [218] [600/625] eta: 0:00:50 lr: 0.000772 min_lr: 0.000772 loss: 2.3147 (2.3520) class_acc: 0.6680 (0.6729) weight_decay: 0.0500 (0.0500) grad_norm: 1.1625 (1.2305) time: 2.1377 data: 0.0484 max mem: 6925
Epoch: [218] [624/625] eta: 0:00:01 lr: 0.000772 min_lr: 0.000772 loss: 2.3643 (2.3526) class_acc: 0.6719 (0.6728) weight_decay: 0.0500 (0.0500) grad_norm: 1.1124 (1.2270) time: 0.8069 data: 0.0014 max mem: 6925
Epoch: [218] Total time: 0:20:34 (1.9759 s / it)
Averaged stats: lr: 0.000772 min_lr: 0.000772 loss: 2.3643 (2.3462) class_acc: 0.6719 (0.6744) weight_decay: 0.0500 (0.0500) grad_norm: 1.1124 (1.2270)
Test: [ 0/50] eta: 0:10:27 loss: 1.1529 (1.1529) acc1: 73.6000 (73.6000) acc5: 92.8000 (92.8000) time: 12.5504 data: 12.5012 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.1281 (1.1417) acc1: 76.0000 (74.6909) acc5: 91.2000 (90.6909) time: 2.0465 data: 2.0154 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.2948 (1.2992) acc1: 70.4000 (70.5905) acc5: 89.6000 (89.0667) time: 1.0611 data: 1.0324 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.4709 (1.3542) acc1: 66.4000 (69.5742) acc5: 86.4000 (88.5936) time: 1.1039 data: 1.0755 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.4011 (1.3655) acc1: 66.4000 (69.7366) acc5: 88.8000 (88.7024) time: 1.0140 data: 0.9853 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3167 (1.3614) acc1: 69.6000 (69.8400) acc5: 88.8000 (88.7040) time: 0.9083 data: 0.8796 max mem: 6925
Test: Total time: 0:00:58 (1.1689 s / it)
* Acc@1 70.780 Acc@5 89.670 loss 1.310
Accuracy of the model on the 50000 test images: 70.8%
Max accuracy: 70.83%
Epoch: [219] [ 0/625] eta: 3:59:08 lr: 0.000771 min_lr: 0.000771 loss: 2.1911 (2.1911) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 22.9575 data: 18.9505 max mem: 6925
Epoch: [219] [200/625] eta: 0:14:36 lr: 0.000766 min_lr: 0.000766 loss: 2.3216 (2.3339) class_acc: 0.6797 (0.6775) weight_decay: 0.0500 (0.0500) grad_norm: 1.2139 (1.2779) time: 1.7690 data: 0.0014 max mem: 6925
Epoch: [219] [400/625] eta: 0:07:45 lr: 0.000760 min_lr: 0.000760 loss: 2.3356 (2.3354) class_acc: 0.6797 (0.6770) weight_decay: 0.0500 (0.0500) grad_norm: 1.2510 (1.2995) time: 2.0793 data: 0.0016 max mem: 6925
Epoch: [219] [600/625] eta: 0:00:51 lr: 0.000755 min_lr: 0.000755 loss: 2.3542 (2.3389) class_acc: 0.6719 (0.6768) weight_decay: 0.0500 (0.0500) grad_norm: 1.1882 (1.2597) time: 1.9903 data: 0.0012 max mem: 6925
Epoch: [219] [624/625] eta: 0:00:02 lr: 0.000754 min_lr: 0.000754 loss: 2.3373 (2.3406) class_acc: 0.6641 (0.6764) weight_decay: 0.0500 (0.0500) grad_norm: 1.1010 (1.2563) time: 0.4607 data: 0.0025 max mem: 6925
Epoch: [219] Total time: 0:21:09 (2.0305 s / it)
Averaged stats: lr: 0.000754 min_lr: 0.000754 loss: 2.3373 (2.3437) class_acc: 0.6641 (0.6751) weight_decay: 0.0500 (0.0500) grad_norm: 1.1010 (1.2563)
Test: [ 0/50] eta: 0:10:53 loss: 1.2726 (1.2726) acc1: 68.0000 (68.0000) acc5: 91.2000 (91.2000) time: 13.0798 data: 13.0473 max mem: 6925
Test: [10/50] eta: 0:01:33 loss: 1.1273 (1.1863) acc1: 75.2000 (74.8364) acc5: 92.8000 (91.7091) time: 2.3301 data: 2.3006 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.3589 (1.3372) acc1: 71.2000 (70.7429) acc5: 90.4000 (89.9429) time: 1.2553 data: 1.2258 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.4253 (1.3584) acc1: 67.2000 (70.1677) acc5: 88.0000 (89.5484) time: 1.0865 data: 1.0572 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.4030 (1.3773) acc1: 68.8000 (69.7561) acc5: 88.0000 (89.2488) time: 0.7000 data: 0.6712 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4030 (1.3769) acc1: 68.8000 (69.3920) acc5: 88.8000 (89.2160) time: 0.6323 data: 0.6025 max mem: 6925
Test: Total time: 0:00:52 (1.0563 s / it)
* Acc@1 70.836 Acc@5 89.614 loss 1.332
Accuracy of the model on the 50000 test images: 70.8%
Max accuracy: 70.84%
Epoch: [220] [ 0/625] eta: 3:56:56 lr: 0.000754 min_lr: 0.000754 loss: 2.1256 (2.1256) class_acc: 0.7500 (0.7500) weight_decay: 0.0500 (0.0500) time: 22.7470 data: 18.2872 max mem: 6925
Epoch: [220] [200/625] eta: 0:14:33 lr: 0.000748 min_lr: 0.000748 loss: 2.2592 (2.3390) class_acc: 0.6914 (0.6759) weight_decay: 0.0500 (0.0500) grad_norm: 1.0976 (1.2788) time: 1.7831 data: 0.1115 max mem: 6925
Epoch: [220] [400/625] eta: 0:07:36 lr: 0.000743 min_lr: 0.000743 loss: 2.3848 (2.3478) class_acc: 0.6680 (0.6745) weight_decay: 0.0500 (0.0500) grad_norm: 1.2870 (1.2970) time: 2.0238 data: 0.0008 max mem: 6925
Epoch: [220] [600/625] eta: 0:00:50 lr: 0.000737 min_lr: 0.000737 loss: 2.3590 (2.3453) class_acc: 0.6758 (0.6750) weight_decay: 0.0500 (0.0500) grad_norm: 1.0941 (inf) time: 1.8238 data: 0.0009 max mem: 6925
Epoch: [220] [624/625] eta: 0:00:01 lr: 0.000736 min_lr: 0.000736 loss: 2.3310 (2.3462) class_acc: 0.6719 (0.6747) weight_decay: 0.0500 (0.0500) grad_norm: 1.1861 (inf) time: 0.8697 data: 0.0015 max mem: 6925
Epoch: [220] Total time: 0:20:38 (1.9824 s / it)
Averaged stats: lr: 0.000736 min_lr: 0.000736 loss: 2.3310 (2.3422) class_acc: 0.6719 (0.6754) weight_decay: 0.0500 (0.0500) grad_norm: 1.1861 (inf)
Test: [ 0/50] eta: 0:10:21 loss: 0.9530 (0.9530) acc1: 79.2000 (79.2000) acc5: 94.4000 (94.4000) time: 12.4283 data: 12.3973 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 0.9991 (1.0954) acc1: 75.2000 (75.6364) acc5: 92.0000 (91.6364) time: 2.2350 data: 2.2046 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.2788 (1.2355) acc1: 72.8000 (71.6191) acc5: 90.4000 (90.4381) time: 1.2870 data: 1.2576 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.3458 (1.2720) acc1: 66.4000 (70.5806) acc5: 88.0000 (89.6258) time: 1.2625 data: 1.2332 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.3417 (1.2927) acc1: 67.2000 (70.2634) acc5: 89.6000 (89.5024) time: 0.8199 data: 0.7893 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2351 (1.2913) acc1: 71.2000 (70.3200) acc5: 90.4000 (89.4880) time: 0.6692 data: 0.6391 max mem: 6925
Test: Total time: 0:00:55 (1.1120 s / it)
* Acc@1 71.446 Acc@5 89.904 loss 1.245
Accuracy of the model on the 50000 test images: 71.4%
Max accuracy: 71.45%
Epoch: [221] [ 0/625] eta: 3:38:00 lr: 0.000736 min_lr: 0.000736 loss: 2.3859 (2.3859) class_acc: 0.6641 (0.6641) weight_decay: 0.0500 (0.0500) time: 20.9286 data: 20.6886 max mem: 6925
Epoch: [221] [200/625] eta: 0:14:04 lr: 0.000731 min_lr: 0.000731 loss: 2.3129 (2.3220) class_acc: 0.6758 (0.6788) weight_decay: 0.0500 (0.0500) grad_norm: 1.0210 (1.1825) time: 1.9607 data: 0.6592 max mem: 6925
Epoch: [221] [400/625] eta: 0:07:25 lr: 0.000725 min_lr: 0.000725 loss: 2.2569 (2.3254) class_acc: 0.6875 (0.6787) weight_decay: 0.0500 (0.0500) grad_norm: 1.2387 (1.2391) time: 2.0401 data: 1.6359 max mem: 6925
Epoch: [221] [600/625] eta: 0:00:50 lr: 0.000720 min_lr: 0.000720 loss: 2.3308 (2.3345) class_acc: 0.6758 (0.6770) weight_decay: 0.0500 (0.0500) grad_norm: 1.1576 (1.2571) time: 2.1154 data: 0.1990 max mem: 6925
Epoch: [221] [624/625] eta: 0:00:01 lr: 0.000719 min_lr: 0.000719 loss: 2.3447 (2.3345) class_acc: 0.6680 (0.6770) weight_decay: 0.0500 (0.0500) grad_norm: 1.1874 (1.2574) time: 0.9226 data: 0.0364 max mem: 6925
Epoch: [221] Total time: 0:20:28 (1.9663 s / it)
Averaged stats: lr: 0.000719 min_lr: 0.000719 loss: 2.3447 (2.3341) class_acc: 0.6680 (0.6775) weight_decay: 0.0500 (0.0500) grad_norm: 1.1874 (1.2574)
Test: [ 0/50] eta: 0:10:37 loss: 1.0710 (1.0710) acc1: 76.8000 (76.8000) acc5: 94.4000 (94.4000) time: 12.7500 data: 12.7122 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.1120 (1.1544) acc1: 75.2000 (74.6182) acc5: 91.2000 (90.7636) time: 2.2167 data: 2.1868 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.2841 (1.2750) acc1: 69.6000 (70.8952) acc5: 90.4000 (89.9810) time: 1.1974 data: 1.1685 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.4048 (1.2972) acc1: 67.2000 (70.5548) acc5: 88.8000 (89.3936) time: 1.1894 data: 1.1600 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.3384 (1.3127) acc1: 69.6000 (70.5366) acc5: 88.8000 (89.2488) time: 0.8235 data: 0.7925 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3610 (1.3174) acc1: 69.6000 (70.3360) acc5: 88.8000 (89.2480) time: 0.7178 data: 0.6862 max mem: 6925
Test: Total time: 0:00:53 (1.0735 s / it)
* Acc@1 70.770 Acc@5 89.824 loss 1.286
Accuracy of the model on the 50000 test images: 70.8%
Max accuracy: 71.45%
Epoch: [222] [ 0/625] eta: 3:52:56 lr: 0.000719 min_lr: 0.000719 loss: 2.3261 (2.3261) class_acc: 0.6797 (0.6797) weight_decay: 0.0500 (0.0500) time: 22.3628 data: 19.0515 max mem: 6925
Epoch: [222] [200/625] eta: 0:14:46 lr: 0.000714 min_lr: 0.000714 loss: 2.3291 (2.3134) class_acc: 0.6719 (0.6826) weight_decay: 0.0500 (0.0500) grad_norm: 1.1676 (1.2510) time: 2.0544 data: 0.1191 max mem: 6925
Epoch: [222] [400/625] eta: 0:07:43 lr: 0.000708 min_lr: 0.000708 loss: 2.2779 (2.3237) class_acc: 0.6914 (0.6801) weight_decay: 0.0500 (0.0500) grad_norm: 1.1589 (1.2540) time: 2.0319 data: 0.0021 max mem: 6925
Epoch: [222] [600/625] eta: 0:00:51 lr: 0.000703 min_lr: 0.000703 loss: 2.3323 (2.3269) class_acc: 0.6680 (0.6788) weight_decay: 0.0500 (0.0500) grad_norm: 1.2778 (1.2513) time: 1.9527 data: 0.0009 max mem: 6925
Epoch: [222] [624/625] eta: 0:00:02 lr: 0.000702 min_lr: 0.000702 loss: 2.3439 (2.3280) class_acc: 0.6719 (0.6785) weight_decay: 0.0500 (0.0500) grad_norm: 1.3158 (1.2607) time: 0.8327 data: 0.0019 max mem: 6925
Epoch: [222] Total time: 0:21:03 (2.0212 s / it)
Averaged stats: lr: 0.000702 min_lr: 0.000702 loss: 2.3439 (2.3316) class_acc: 0.6719 (0.6778) weight_decay: 0.0500 (0.0500) grad_norm: 1.3158 (1.2607)
Test: [ 0/50] eta: 0:10:51 loss: 1.2682 (1.2682) acc1: 68.0000 (68.0000) acc5: 92.0000 (92.0000) time: 13.0368 data: 12.9788 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.1905 (1.1862) acc1: 74.4000 (74.1818) acc5: 92.0000 (91.4182) time: 2.0988 data: 2.0657 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.3575 (1.3449) acc1: 71.2000 (70.0952) acc5: 89.6000 (89.8286) time: 1.0627 data: 1.0330 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.4581 (1.3690) acc1: 66.4000 (69.6000) acc5: 88.8000 (89.3677) time: 1.0958 data: 1.0668 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.3993 (1.3675) acc1: 68.0000 (69.4244) acc5: 88.0000 (89.3463) time: 0.9844 data: 0.9549 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.4141 (1.3799) acc1: 68.0000 (69.2480) acc5: 88.0000 (89.0400) time: 0.9799 data: 0.9494 max mem: 6925
Test: Total time: 0:00:58 (1.1654 s / it)
* Acc@1 69.922 Acc@5 89.456 loss 1.348
Accuracy of the model on the 50000 test images: 69.9%
Max accuracy: 71.45%
Epoch: [223] [ 0/625] eta: 3:35:17 lr: 0.000702 min_lr: 0.000702 loss: 2.3051 (2.3051) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 20.6680 data: 20.3882 max mem: 6925
Epoch: [223] [200/625] eta: 0:14:43 lr: 0.000696 min_lr: 0.000696 loss: 2.3252 (2.3225) class_acc: 0.6641 (0.6801) weight_decay: 0.0500 (0.0500) grad_norm: 1.3431 (1.2433) time: 2.1238 data: 0.6298 max mem: 6925
Epoch: [223] [400/625] eta: 0:07:50 lr: 0.000691 min_lr: 0.000691 loss: 2.3977 (2.3210) class_acc: 0.6719 (0.6803) weight_decay: 0.0500 (0.0500) grad_norm: 1.2893 (1.2487) time: 2.1511 data: 0.7578 max mem: 6925
Epoch: [223] [600/625] eta: 0:00:52 lr: 0.000686 min_lr: 0.000686 loss: 2.3388 (2.3247) class_acc: 0.6680 (0.6791) weight_decay: 0.0500 (0.0500) grad_norm: 1.1298 (inf) time: 2.1214 data: 1.0663 max mem: 6925
Epoch: [223] [624/625] eta: 0:00:02 lr: 0.000685 min_lr: 0.000685 loss: 2.3266 (2.3252) class_acc: 0.6680 (0.6786) weight_decay: 0.0500 (0.0500) grad_norm: 1.1298 (inf) time: 0.8858 data: 0.2141 max mem: 6925
Epoch: [223] Total time: 0:21:13 (2.0382 s / it)
Averaged stats: lr: 0.000685 min_lr: 0.000685 loss: 2.3266 (2.3276) class_acc: 0.6680 (0.6790) weight_decay: 0.0500 (0.0500) grad_norm: 1.1298 (inf)
Test: [ 0/50] eta: 0:10:15 loss: 1.1457 (1.1457) acc1: 72.8000 (72.8000) acc5: 92.8000 (92.8000) time: 12.3127 data: 12.2759 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 1.0827 (1.0804) acc1: 76.8000 (76.2909) acc5: 91.2000 (91.2727) time: 2.0239 data: 1.9892 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.2439 (1.2367) acc1: 72.0000 (72.0762) acc5: 90.4000 (90.0952) time: 1.0746 data: 1.0423 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.3520 (1.2658) acc1: 68.0000 (71.3806) acc5: 88.8000 (89.6000) time: 1.1374 data: 1.1075 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.2595 (1.2721) acc1: 69.6000 (71.2976) acc5: 89.6000 (89.6390) time: 1.0649 data: 1.0348 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2595 (1.2753) acc1: 70.4000 (71.0400) acc5: 89.6000 (89.6000) time: 0.9829 data: 0.9532 max mem: 6925
Test: Total time: 0:00:58 (1.1686 s / it)
* Acc@1 71.964 Acc@5 90.470 loss 1.230
Accuracy of the model on the 50000 test images: 72.0%
Max accuracy: 71.96%
Epoch: [224] [ 0/625] eta: 3:37:35 lr: 0.000685 min_lr: 0.000685 loss: 2.3190 (2.3190) class_acc: 0.6523 (0.6523) weight_decay: 0.0500 (0.0500) time: 20.8889 data: 20.6515 max mem: 6925
Epoch: [224] [200/625] eta: 0:14:54 lr: 0.000680 min_lr: 0.000680 loss: 2.3350 (2.3152) class_acc: 0.6836 (0.6833) weight_decay: 0.0500 (0.0500) grad_norm: 1.2485 (1.3004) time: 2.1377 data: 0.1374 max mem: 6925
Epoch: [224] [400/625] eta: 0:07:49 lr: 0.000674 min_lr: 0.000674 loss: 2.3106 (2.3253) class_acc: 0.6797 (0.6810) weight_decay: 0.0500 (0.0500) grad_norm: 1.2764 (1.3391) time: 2.1211 data: 0.0411 max mem: 6925
Epoch: [224] [600/625] eta: 0:00:51 lr: 0.000669 min_lr: 0.000669 loss: 2.3232 (2.3293) class_acc: 0.6758 (0.6805) weight_decay: 0.0500 (0.0500) grad_norm: 1.2191 (1.3011) time: 2.0500 data: 0.0009 max mem: 6925
Epoch: [224] [624/625] eta: 0:00:02 lr: 0.000668 min_lr: 0.000668 loss: 2.3439 (2.3298) class_acc: 0.6719 (0.6805) weight_decay: 0.0500 (0.0500) grad_norm: 1.2412 (1.3008) time: 0.9052 data: 0.0081 max mem: 6925
Epoch: [224] Total time: 0:21:06 (2.0271 s / it)
Averaged stats: lr: 0.000668 min_lr: 0.000668 loss: 2.3439 (2.3267) class_acc: 0.6719 (0.6794) weight_decay: 0.0500 (0.0500) grad_norm: 1.2412 (1.3008)
Test: [ 0/50] eta: 0:09:39 loss: 1.1388 (1.1388) acc1: 74.4000 (74.4000) acc5: 92.0000 (92.0000) time: 11.5947 data: 11.5618 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.1388 (1.2011) acc1: 74.4000 (73.8182) acc5: 91.2000 (90.1818) time: 1.9896 data: 1.9594 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.3302 (1.3408) acc1: 69.6000 (70.5524) acc5: 89.6000 (88.7619) time: 1.0853 data: 1.0560 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.4329 (1.3445) acc1: 68.0000 (70.5032) acc5: 88.0000 (88.8258) time: 1.1489 data: 1.1190 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.3371 (1.3447) acc1: 71.2000 (70.5561) acc5: 89.6000 (88.8976) time: 1.0227 data: 0.9925 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3582 (1.3502) acc1: 71.2000 (70.3360) acc5: 89.6000 (88.8160) time: 0.8526 data: 0.8231 max mem: 6925
Test: Total time: 0:00:55 (1.1102 s / it)
* Acc@1 71.240 Acc@5 89.780 loss 1.304
Accuracy of the model on the 50000 test images: 71.2%
Max accuracy: 71.96%
Epoch: [225] [ 0/625] eta: 3:32:06 lr: 0.000668 min_lr: 0.000668 loss: 2.3868 (2.3868) class_acc: 0.6758 (0.6758) weight_decay: 0.0500 (0.0500) time: 20.3624 data: 18.2421 max mem: 6925
Epoch: [225] [200/625] eta: 0:14:46 lr: 0.000663 min_lr: 0.000663 loss: 2.3336 (2.3059) class_acc: 0.6797 (0.6858) weight_decay: 0.0500 (0.0500) grad_norm: 1.1841 (1.3196) time: 1.8749 data: 0.0013 max mem: 6925
Epoch: [225] [400/625] eta: 0:07:41 lr: 0.000657 min_lr: 0.000657 loss: 2.2991 (2.3177) class_acc: 0.6836 (0.6819) weight_decay: 0.0500 (0.0500) grad_norm: 1.2623 (1.2996) time: 2.0025 data: 0.0008 max mem: 6925
Epoch: [225] [600/625] eta: 0:00:51 lr: 0.000652 min_lr: 0.000652 loss: 2.3184 (2.3201) class_acc: 0.6914 (0.6811) weight_decay: 0.0500 (0.0500) grad_norm: 1.2893 (1.3134) time: 2.1298 data: 0.0012 max mem: 6925
Epoch: [225] [624/625] eta: 0:00:02 lr: 0.000652 min_lr: 0.000652 loss: 2.2954 (2.3212) class_acc: 0.6758 (0.6808) weight_decay: 0.0500 (0.0500) grad_norm: 1.2063 (1.3117) time: 0.8358 data: 0.0014 max mem: 6925
Epoch: [225] Total time: 0:20:53 (2.0050 s / it)
Averaged stats: lr: 0.000652 min_lr: 0.000652 loss: 2.2954 (2.3233) class_acc: 0.6758 (0.6797) weight_decay: 0.0500 (0.0500) grad_norm: 1.2063 (1.3117)
Test: [ 0/50] eta: 0:10:49 loss: 1.1553 (1.1553) acc1: 75.2000 (75.2000) acc5: 92.8000 (92.8000) time: 12.9924 data: 12.9614 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.0403 (1.1204) acc1: 76.0000 (76.2182) acc5: 92.0000 (90.7636) time: 2.0978 data: 2.0676 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.3001 (1.2755) acc1: 72.8000 (71.9619) acc5: 89.6000 (89.5619) time: 1.0423 data: 1.0127 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.3459 (1.3017) acc1: 68.0000 (71.0710) acc5: 88.8000 (89.4452) time: 1.0535 data: 1.0239 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.3195 (1.3060) acc1: 69.6000 (70.9463) acc5: 90.4000 (89.5805) time: 0.8645 data: 0.8351 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2855 (1.3022) acc1: 70.4000 (70.7840) acc5: 90.4000 (89.5680) time: 0.8180 data: 0.7888 max mem: 6925
Test: Total time: 0:00:52 (1.0593 s / it)
* Acc@1 71.684 Acc@5 90.198 loss 1.262
Accuracy of the model on the 50000 test images: 71.7%
Max accuracy: 71.96%
Epoch: [226] [ 0/625] eta: 3:45:59 lr: 0.000651 min_lr: 0.000651 loss: 2.1163 (2.1163) class_acc: 0.7266 (0.7266) weight_decay: 0.0500 (0.0500) time: 21.6959 data: 21.3588 max mem: 6925
Epoch: [226] [200/625] eta: 0:14:45 lr: 0.000646 min_lr: 0.000646 loss: 2.3073 (2.3039) class_acc: 0.6836 (0.6848) weight_decay: 0.0500 (0.0500) grad_norm: 1.3329 (1.2639) time: 1.7979 data: 0.0185 max mem: 6925
Epoch: [226] [400/625] eta: 0:07:44 lr: 0.000641 min_lr: 0.000641 loss: 2.3143 (2.3119) class_acc: 0.6836 (0.6836) weight_decay: 0.0500 (0.0500) grad_norm: 1.2285 (1.2807) time: 1.9714 data: 0.0008 max mem: 6925
Epoch: [226] [600/625] eta: 0:00:51 lr: 0.000636 min_lr: 0.000636 loss: 2.3065 (2.3166) class_acc: 0.6797 (0.6825) weight_decay: 0.0500 (0.0500) grad_norm: 1.1836 (1.2846) time: 2.1138 data: 0.0008 max mem: 6925
Epoch: [226] [624/625] eta: 0:00:02 lr: 0.000635 min_lr: 0.000635 loss: 2.3445 (2.3180) class_acc: 0.6602 (0.6820) weight_decay: 0.0500 (0.0500) grad_norm: 1.2629 (1.2864) time: 0.7843 data: 0.0013 max mem: 6925
Epoch: [226] Total time: 0:20:59 (2.0157 s / it)
Averaged stats: lr: 0.000635 min_lr: 0.000635 loss: 2.3445 (2.3184) class_acc: 0.6602 (0.6815) weight_decay: 0.0500 (0.0500) grad_norm: 1.2629 (1.2864)
Test: [ 0/50] eta: 0:11:33 loss: 1.0372 (1.0372) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 13.8780 data: 13.8392 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.0384 (1.0900) acc1: 76.0000 (75.9273) acc5: 91.2000 (91.4182) time: 2.2470 data: 2.2172 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.2384 (1.2242) acc1: 72.0000 (72.0762) acc5: 89.6000 (90.2095) time: 1.1759 data: 1.1472 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.3440 (1.2550) acc1: 68.8000 (71.2774) acc5: 88.8000 (89.8581) time: 1.2207 data: 1.1917 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.2838 (1.2738) acc1: 68.8000 (71.0439) acc5: 89.6000 (89.7951) time: 1.0269 data: 0.9973 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.3649 (1.2779) acc1: 69.6000 (70.9120) acc5: 89.6000 (89.5680) time: 0.9386 data: 0.9088 max mem: 6925
Test: Total time: 0:00:58 (1.1720 s / it)
* Acc@1 71.428 Acc@5 90.084 loss 1.240
Accuracy of the model on the 50000 test images: 71.4%
Max accuracy: 71.96%
Epoch: [227] [ 0/625] eta: 4:11:57 lr: 0.000635 min_lr: 0.000635 loss: 2.2304 (2.2304) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 24.1879 data: 21.1837 max mem: 6925
Epoch: [227] [200/625] eta: 0:14:34 lr: 0.000630 min_lr: 0.000630 loss: 2.2902 (2.3095) class_acc: 0.6953 (0.6853) weight_decay: 0.0500 (0.0500) grad_norm: 1.1262 (1.2976) time: 2.0537 data: 0.5333 max mem: 6925
Epoch: [227] [400/625] eta: 0:07:38 lr: 0.000625 min_lr: 0.000625 loss: 2.2775 (2.3099) class_acc: 0.6719 (0.6842) weight_decay: 0.0500 (0.0500) grad_norm: 1.0957 (1.3025) time: 1.9488 data: 0.0009 max mem: 6925
Epoch: [227] [600/625] eta: 0:00:51 lr: 0.000619 min_lr: 0.000619 loss: 2.3447 (2.3152) class_acc: 0.6641 (0.6829) weight_decay: 0.0500 (0.0500) grad_norm: 1.2134 (1.2890) time: 1.7099 data: 0.0010 max mem: 6925
Epoch: [227] [624/625] eta: 0:00:02 lr: 0.000619 min_lr: 0.000619 loss: 2.2054 (2.3130) class_acc: 0.6953 (0.6833) weight_decay: 0.0500 (0.0500) grad_norm: 1.2215 (1.2911) time: 0.7956 data: 0.0018 max mem: 6925
Epoch: [227] Total time: 0:21:05 (2.0247 s / it)
Averaged stats: lr: 0.000619 min_lr: 0.000619 loss: 2.2054 (2.3136) class_acc: 0.6953 (0.6828) weight_decay: 0.0500 (0.0500) grad_norm: 1.2215 (1.2911)
Test: [ 0/50] eta: 0:10:54 loss: 1.1100 (1.1100) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 13.0891 data: 13.0578 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 1.0733 (1.0924) acc1: 76.0000 (76.8000) acc5: 92.0000 (90.2546) time: 2.1208 data: 2.0902 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.2309 (1.2376) acc1: 72.8000 (72.9524) acc5: 90.4000 (89.5619) time: 1.0725 data: 1.0430 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.3407 (1.2730) acc1: 69.6000 (71.7161) acc5: 89.6000 (89.2129) time: 1.1202 data: 1.0917 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.2655 (1.2771) acc1: 69.6000 (71.3951) acc5: 89.6000 (89.3463) time: 1.0222 data: 0.9930 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2655 (1.2772) acc1: 69.6000 (71.1200) acc5: 89.6000 (89.2480) time: 0.9173 data: 0.8880 max mem: 6925
Test: Total time: 0:00:58 (1.1757 s / it)
* Acc@1 71.904 Acc@5 90.296 loss 1.240
Accuracy of the model on the 50000 test images: 71.9%
Max accuracy: 71.96%
Epoch: [228] [ 0/625] eta: 3:36:00 lr: 0.000619 min_lr: 0.000619 loss: 2.2716 (2.2716) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 20.7373 data: 18.9130 max mem: 6925
Epoch: [228] [200/625] eta: 0:14:10 lr: 0.000614 min_lr: 0.000614 loss: 2.3405 (2.3024) class_acc: 0.6719 (0.6844) weight_decay: 0.0500 (0.0500) grad_norm: 1.1985 (1.3741) time: 1.9296 data: 0.0013 max mem: 6925
Epoch: [228] [400/625] eta: 0:07:25 lr: 0.000608 min_lr: 0.000608 loss: 2.2407 (2.3068) class_acc: 0.6953 (0.6843) weight_decay: 0.0500 (0.0500) grad_norm: 1.4271 (1.3345) time: 1.9026 data: 0.0009 max mem: 6925
Epoch: [228] [600/625] eta: 0:00:49 lr: 0.000603 min_lr: 0.000603 loss: 2.2736 (2.3134) class_acc: 0.6836 (0.6832) weight_decay: 0.0500 (0.0500) grad_norm: 1.1983 (1.3131) time: 1.8595 data: 0.0009 max mem: 6925
Epoch: [228] [624/625] eta: 0:00:01 lr: 0.000603 min_lr: 0.000603 loss: 2.3362 (2.3133) class_acc: 0.6641 (0.6828) weight_decay: 0.0500 (0.0500) grad_norm: 1.2430 (1.3106) time: 0.8989 data: 0.0014 max mem: 6925
Epoch: [228] Total time: 0:20:09 (1.9349 s / it)
Averaged stats: lr: 0.000603 min_lr: 0.000603 loss: 2.3362 (2.3119) class_acc: 0.6641 (0.6829) weight_decay: 0.0500 (0.0500) grad_norm: 1.2430 (1.3106)
Test: [ 0/50] eta: 0:10:25 loss: 1.1992 (1.1992) acc1: 74.4000 (74.4000) acc5: 89.6000 (89.6000) time: 12.5078 data: 12.4667 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.1043 (1.1022) acc1: 76.0000 (76.8727) acc5: 92.8000 (91.6364) time: 2.0930 data: 2.0624 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.1948 (1.2684) acc1: 74.4000 (72.4952) acc5: 89.6000 (90.0571) time: 1.0645 data: 1.0349 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.4411 (1.3062) acc1: 68.8000 (71.5355) acc5: 88.8000 (89.4452) time: 0.9655 data: 0.9352 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.3441 (1.3035) acc1: 69.6000 (71.3951) acc5: 89.6000 (89.7171) time: 0.5937 data: 0.5637 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2866 (1.3106) acc1: 69.6000 (71.0080) acc5: 90.4000 (89.4560) time: 0.5927 data: 0.5636 max mem: 6925
Test: Total time: 0:00:46 (0.9220 s / it)
* Acc@1 71.758 Acc@5 90.274 loss 1.258
Accuracy of the model on the 50000 test images: 71.8%
Max accuracy: 71.96%
Epoch: [229] [ 0/625] eta: 4:14:25 lr: 0.000603 min_lr: 0.000603 loss: 2.2384 (2.2384) class_acc: 0.7188 (0.7188) weight_decay: 0.0500 (0.0500) time: 24.4249 data: 18.9304 max mem: 6925
Epoch: [229] [200/625] eta: 0:14:09 lr: 0.000597 min_lr: 0.000597 loss: 2.2377 (2.2956) class_acc: 0.6914 (0.6879) weight_decay: 0.0500 (0.0500) grad_norm: 1.2546 (1.2663) time: 1.7432 data: 0.0084 max mem: 6925
Epoch: [229] [400/625] eta: 0:07:16 lr: 0.000592 min_lr: 0.000592 loss: 2.2613 (2.3015) class_acc: 0.6758 (0.6854) weight_decay: 0.0500 (0.0500) grad_norm: 1.2300 (1.3045) time: 1.9982 data: 0.0018 max mem: 6925
Epoch: [229] [600/625] eta: 0:00:48 lr: 0.000587 min_lr: 0.000587 loss: 2.2772 (2.3035) class_acc: 0.6797 (0.6852) weight_decay: 0.0500 (0.0500) grad_norm: 1.2569 (1.3119) time: 2.0611 data: 0.0586 max mem: 6925
Epoch: [229] [624/625] eta: 0:00:01 lr: 0.000587 min_lr: 0.000587 loss: 2.3260 (2.3042) class_acc: 0.6875 (0.6853) weight_decay: 0.0500 (0.0500) grad_norm: 1.3359 (1.3162) time: 0.7865 data: 0.0097 max mem: 6925
Epoch: [229] Total time: 0:20:06 (1.9305 s / it)
Averaged stats: lr: 0.000587 min_lr: 0.000587 loss: 2.3260 (2.3112) class_acc: 0.6875 (0.6831) weight_decay: 0.0500 (0.0500) grad_norm: 1.3359 (1.3162)
Test: [ 0/50] eta: 0:10:42 loss: 1.2315 (1.2315) acc1: 70.4000 (70.4000) acc5: 92.0000 (92.0000) time: 12.8559 data: 12.8226 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.0557 (1.1408) acc1: 76.0000 (74.6909) acc5: 91.2000 (90.6909) time: 2.2349 data: 2.2052 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.2805 (1.2826) acc1: 71.2000 (71.5810) acc5: 88.8000 (89.7143) time: 1.2181 data: 1.1893 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.3637 (1.3158) acc1: 68.0000 (71.0194) acc5: 88.0000 (89.2645) time: 1.2600 data: 1.2317 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.3030 (1.3120) acc1: 68.8000 (70.9463) acc5: 88.8000 (89.4244) time: 0.9474 data: 0.9177 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2556 (1.3069) acc1: 70.4000 (71.0720) acc5: 88.8000 (89.3440) time: 0.8604 data: 0.8288 max mem: 6925
Test: Total time: 0:00:56 (1.1399 s / it)
* Acc@1 71.840 Acc@5 90.226 loss 1.257
Accuracy of the model on the 50000 test images: 71.8%
Max accuracy: 71.96%
Epoch: [230] [ 0/625] eta: 3:48:26 lr: 0.000587 min_lr: 0.000587 loss: 2.3757 (2.3757) class_acc: 0.6367 (0.6367) weight_decay: 0.0500 (0.0500) time: 21.9308 data: 21.6968 max mem: 6925
Epoch: [230] [200/625] eta: 0:14:31 lr: 0.000582 min_lr: 0.000582 loss: 2.2744 (2.2814) class_acc: 0.6797 (0.6907) weight_decay: 0.0500 (0.0500) grad_norm: 1.4240 (1.3235) time: 1.9201 data: 0.0024 max mem: 6925
Epoch: [230] [400/625] eta: 0:07:30 lr: 0.000577 min_lr: 0.000577 loss: 2.3034 (2.2918) class_acc: 0.6680 (0.6875) weight_decay: 0.0500 (0.0500) grad_norm: 1.2042 (1.3675) time: 2.0758 data: 0.0012 max mem: 6925
Epoch: [230] [600/625] eta: 0:00:49 lr: 0.000571 min_lr: 0.000571 loss: 2.3196 (2.2995) class_acc: 0.6797 (0.6864) weight_decay: 0.0500 (0.0500) grad_norm: 1.1557 (1.3390) time: 2.0431 data: 0.0017 max mem: 6925
Epoch: [230] [624/625] eta: 0:00:01 lr: 0.000571 min_lr: 0.000571 loss: 2.3394 (2.3011) class_acc: 0.6641 (0.6859) weight_decay: 0.0500 (0.0500) grad_norm: 1.1795 (1.3360) time: 0.8719 data: 0.0019 max mem: 6925
Epoch: [230] Total time: 0:20:18 (1.9490 s / it)
Averaged stats: lr: 0.000571 min_lr: 0.000571 loss: 2.3394 (2.3071) class_acc: 0.6641 (0.6846) weight_decay: 0.0500 (0.0500) grad_norm: 1.1795 (1.3360)
Test: [ 0/50] eta: 0:09:58 loss: 1.0988 (1.0988) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 11.9689 data: 11.9339 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 1.0737 (1.0734) acc1: 76.8000 (75.6364) acc5: 92.0000 (91.8545) time: 1.9925 data: 1.9630 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.1472 (1.2294) acc1: 72.0000 (72.0762) acc5: 91.2000 (90.2476) time: 1.0410 data: 1.0123 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.4039 (1.2657) acc1: 68.0000 (71.2774) acc5: 88.0000 (90.0129) time: 1.0147 data: 0.9850 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.2890 (1.2643) acc1: 69.6000 (71.4732) acc5: 89.6000 (89.9902) time: 0.6811 data: 0.6513 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.3097 (1.2718) acc1: 70.4000 (71.2480) acc5: 89.6000 (89.7440) time: 0.6329 data: 0.6042 max mem: 6925
Test: Total time: 0:00:47 (0.9448 s / it)
* Acc@1 72.274 Acc@5 90.560 loss 1.222
Accuracy of the model on the 50000 test images: 72.3%
Max accuracy: 72.27%
Epoch: [231] [ 0/625] eta: 3:23:07 lr: 0.000571 min_lr: 0.000571 loss: 2.3445 (2.3445) class_acc: 0.6680 (0.6680) weight_decay: 0.0500 (0.0500) time: 19.5007 data: 18.9652 max mem: 6925
Epoch: [231] [200/625] eta: 0:14:20 lr: 0.000566 min_lr: 0.000566 loss: 2.2991 (2.2833) class_acc: 0.6758 (0.6921) weight_decay: 0.0500 (0.0500) grad_norm: 1.1949 (1.2613) time: 1.8545 data: 0.6663 max mem: 6925
Epoch: [231] [400/625] eta: 0:07:22 lr: 0.000561 min_lr: 0.000561 loss: 2.2891 (2.2930) class_acc: 0.6914 (0.6884) weight_decay: 0.0500 (0.0500) grad_norm: 1.2260 (1.2564) time: 1.9039 data: 1.6095 max mem: 6925
Epoch: [231] [600/625] eta: 0:00:48 lr: 0.000556 min_lr: 0.000556 loss: 2.2521 (2.2969) class_acc: 0.6758 (0.6875) weight_decay: 0.0500 (0.0500) grad_norm: 1.2925 (inf) time: 1.7157 data: 1.4282 max mem: 6925
Epoch: [231] [624/625] eta: 0:00:01 lr: 0.000555 min_lr: 0.000555 loss: 2.3407 (2.2983) class_acc: 0.6758 (0.6872) weight_decay: 0.0500 (0.0500) grad_norm: 1.2925 (inf) time: 0.6415 data: 0.3842 max mem: 6925
Epoch: [231] Total time: 0:20:07 (1.9316 s / it)
Averaged stats: lr: 0.000555 min_lr: 0.000555 loss: 2.3407 (2.3012) class_acc: 0.6758 (0.6859) weight_decay: 0.0500 (0.0500) grad_norm: 1.2925 (inf)
Test: [ 0/50] eta: 0:10:27 loss: 1.1462 (1.1462) acc1: 75.2000 (75.2000) acc5: 92.8000 (92.8000) time: 12.5403 data: 12.5023 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.0991 (1.1082) acc1: 76.0000 (76.5818) acc5: 92.0000 (91.7818) time: 2.2170 data: 2.1871 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.2311 (1.2419) acc1: 72.8000 (72.7619) acc5: 90.4000 (90.4000) time: 1.2440 data: 1.2145 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.3477 (1.2853) acc1: 67.2000 (71.2000) acc5: 89.6000 (90.1419) time: 1.0828 data: 1.0527 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.2983 (1.2897) acc1: 68.0000 (71.0634) acc5: 90.4000 (90.0098) time: 0.6153 data: 0.5840 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2850 (1.2966) acc1: 69.6000 (70.9280) acc5: 90.4000 (89.8080) time: 0.5967 data: 0.5658 max mem: 6925
Test: Total time: 0:00:50 (1.0029 s / it)
* Acc@1 71.802 Acc@5 90.378 loss 1.263
Accuracy of the model on the 50000 test images: 71.8%
Max accuracy: 72.27%
Epoch: [232] [ 0/625] eta: 3:52:12 lr: 0.000555 min_lr: 0.000555 loss: 2.2186 (2.2186) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 22.2912 data: 16.0632 max mem: 6925
Epoch: [232] [200/625] eta: 0:13:44 lr: 0.000550 min_lr: 0.000550 loss: 2.2659 (2.2933) class_acc: 0.6758 (0.6878) weight_decay: 0.0500 (0.0500) grad_norm: 1.3362 (1.3162) time: 1.9119 data: 0.0012 max mem: 6925
Epoch: [232] [400/625] eta: 0:07:11 lr: 0.000545 min_lr: 0.000545 loss: 2.2620 (2.2913) class_acc: 0.6953 (0.6877) weight_decay: 0.0500 (0.0500) grad_norm: 1.4020 (1.3200) time: 1.9055 data: 0.0014 max mem: 6925
Epoch: [232] [600/625] eta: 0:00:48 lr: 0.000540 min_lr: 0.000540 loss: 2.3297 (2.2951) class_acc: 0.6797 (0.6866) weight_decay: 0.0500 (0.0500) grad_norm: 1.3702 (1.3398) time: 1.7126 data: 0.0013 max mem: 6925
Epoch: [232] [624/625] eta: 0:00:01 lr: 0.000540 min_lr: 0.000540 loss: 2.2988 (2.2953) class_acc: 0.6836 (0.6865) weight_decay: 0.0500 (0.0500) grad_norm: 1.3086 (1.3418) time: 0.9662 data: 0.0014 max mem: 6925
Epoch: [232] Total time: 0:19:38 (1.8862 s / it)
Averaged stats: lr: 0.000540 min_lr: 0.000540 loss: 2.2988 (2.2999) class_acc: 0.6836 (0.6862) weight_decay: 0.0500 (0.0500) grad_norm: 1.3086 (1.3418)
Test: [ 0/50] eta: 0:10:56 loss: 1.1265 (1.1265) acc1: 73.6000 (73.6000) acc5: 92.0000 (92.0000) time: 13.1247 data: 13.0927 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.0794 (1.1014) acc1: 76.0000 (77.1636) acc5: 92.0000 (91.4182) time: 2.0306 data: 1.9974 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.2264 (1.2426) acc1: 72.0000 (72.6476) acc5: 89.6000 (90.2095) time: 0.9602 data: 0.9281 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.4012 (1.2848) acc1: 68.0000 (71.4065) acc5: 88.0000 (89.6774) time: 0.9272 data: 0.8970 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.2416 (1.2821) acc1: 69.6000 (71.5707) acc5: 90.4000 (89.6976) time: 0.8317 data: 0.8023 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2323 (1.2787) acc1: 72.8000 (71.6640) acc5: 90.4000 (89.7120) time: 0.6909 data: 0.6607 max mem: 6925
Test: Total time: 0:00:53 (1.0673 s / it)
* Acc@1 72.478 Acc@5 90.512 loss 1.233
Accuracy of the model on the 50000 test images: 72.5%
Max accuracy: 72.48%
Epoch: [233] [ 0/625] eta: 3:39:35 lr: 0.000540 min_lr: 0.000540 loss: 2.3210 (2.3210) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 21.0814 data: 18.4495 max mem: 6925
Epoch: [233] [200/625] eta: 0:14:10 lr: 0.000535 min_lr: 0.000535 loss: 2.2917 (2.2783) class_acc: 0.6797 (0.6932) weight_decay: 0.0500 (0.0500) grad_norm: 1.3353 (1.2935) time: 2.0332 data: 0.0009 max mem: 6925
Epoch: [233] [400/625] eta: 0:07:26 lr: 0.000530 min_lr: 0.000530 loss: 2.3005 (2.2895) class_acc: 0.6797 (0.6897) weight_decay: 0.0500 (0.0500) grad_norm: 1.4165 (1.3169) time: 2.0153 data: 0.0010 max mem: 6925
Epoch: [233] [600/625] eta: 0:00:49 lr: 0.000525 min_lr: 0.000525 loss: 2.3302 (2.2944) class_acc: 0.6758 (0.6881) weight_decay: 0.0500 (0.0500) grad_norm: 1.4046 (1.3357) time: 2.0313 data: 0.0014 max mem: 6925
Epoch: [233] [624/625] eta: 0:00:01 lr: 0.000525 min_lr: 0.000525 loss: 2.3204 (2.2947) class_acc: 0.6875 (0.6881) weight_decay: 0.0500 (0.0500) grad_norm: 1.4004 (1.3358) time: 0.7893 data: 0.0014 max mem: 6925
Epoch: [233] Total time: 0:20:08 (1.9329 s / it)
Averaged stats: lr: 0.000525 min_lr: 0.000525 loss: 2.3204 (2.2926) class_acc: 0.6875 (0.6879) weight_decay: 0.0500 (0.0500) grad_norm: 1.4004 (1.3358)
Test: [ 0/50] eta: 0:10:06 loss: 1.2048 (1.2048) acc1: 70.4000 (70.4000) acc5: 92.0000 (92.0000) time: 12.1234 data: 12.0669 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 1.0614 (1.0908) acc1: 76.0000 (75.2727) acc5: 92.8000 (92.2909) time: 2.1608 data: 2.1265 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.1969 (1.2440) acc1: 73.6000 (71.3143) acc5: 90.4000 (90.5143) time: 1.1604 data: 1.1302 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.2623 (1.2584) acc1: 69.6000 (70.8129) acc5: 88.8000 (90.4000) time: 0.9494 data: 0.9206 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.2563 (1.2618) acc1: 70.4000 (70.6732) acc5: 89.6000 (90.1854) time: 0.5063 data: 0.4757 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2820 (1.2713) acc1: 71.2000 (70.5920) acc5: 89.6000 (89.9520) time: 0.4807 data: 0.4504 max mem: 6925
Test: Total time: 0:00:46 (0.9397 s / it)
* Acc@1 72.128 Acc@5 90.576 loss 1.220
Accuracy of the model on the 50000 test images: 72.1%
Max accuracy: 72.48%
Epoch: [234] [ 0/625] eta: 4:29:49 lr: 0.000525 min_lr: 0.000525 loss: 2.2588 (2.2588) class_acc: 0.7227 (0.7227) weight_decay: 0.0500 (0.0500) time: 25.9034 data: 16.1874 max mem: 6925
Epoch: [234] [200/625] eta: 0:14:28 lr: 0.000520 min_lr: 0.000520 loss: 2.3109 (2.3019) class_acc: 0.6875 (0.6865) weight_decay: 0.0500 (0.0500) grad_norm: 1.1974 (1.3147) time: 1.9555 data: 0.0009 max mem: 6925
Epoch: [234] [400/625] eta: 0:07:29 lr: 0.000515 min_lr: 0.000515 loss: 2.2977 (2.3004) class_acc: 0.6797 (0.6861) weight_decay: 0.0500 (0.0500) grad_norm: 1.1926 (1.3237) time: 1.8167 data: 0.0007 max mem: 6925
Epoch: [234] [600/625] eta: 0:00:49 lr: 0.000510 min_lr: 0.000510 loss: 2.2930 (2.2995) class_acc: 0.6875 (0.6862) weight_decay: 0.0500 (0.0500) grad_norm: 1.2297 (1.3227) time: 1.9520 data: 0.0007 max mem: 6925
Epoch: [234] [624/625] eta: 0:00:01 lr: 0.000510 min_lr: 0.000510 loss: 2.2886 (2.2993) class_acc: 0.6836 (0.6863) weight_decay: 0.0500 (0.0500) grad_norm: 1.2435 (1.3204) time: 0.7988 data: 0.0014 max mem: 6925
Epoch: [234] Total time: 0:20:20 (1.9526 s / it)
Averaged stats: lr: 0.000510 min_lr: 0.000510 loss: 2.2886 (2.2919) class_acc: 0.6836 (0.6881) weight_decay: 0.0500 (0.0500) grad_norm: 1.2435 (1.3204)
Test: [ 0/50] eta: 0:11:34 loss: 1.1170 (1.1170) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 13.8868 data: 13.8556 max mem: 6925
Test: [10/50] eta: 0:01:29 loss: 1.0237 (1.0579) acc1: 77.6000 (77.3818) acc5: 92.8000 (92.5091) time: 2.2451 data: 2.2155 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.2024 (1.1889) acc1: 73.6000 (73.3333) acc5: 92.0000 (91.4286) time: 1.1509 data: 1.1212 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.2733 (1.2331) acc1: 69.6000 (72.5677) acc5: 90.4000 (90.8645) time: 1.1957 data: 1.1659 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.2733 (1.2493) acc1: 70.4000 (72.1171) acc5: 90.4000 (90.5561) time: 0.9835 data: 0.9543 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2985 (1.2568) acc1: 70.4000 (71.6800) acc5: 90.4000 (90.3680) time: 0.9640 data: 0.9353 max mem: 6925
Test: Total time: 0:00:57 (1.1453 s / it)
* Acc@1 72.656 Acc@5 90.768 loss 1.212
Accuracy of the model on the 50000 test images: 72.7%
Max accuracy: 72.66%
Epoch: [235] [ 0/625] eta: 3:31:39 lr: 0.000510 min_lr: 0.000510 loss: 2.2012 (2.2012) class_acc: 0.6836 (0.6836) weight_decay: 0.0500 (0.0500) time: 20.3196 data: 17.4237 max mem: 6925
Epoch: [235] [200/625] eta: 0:14:12 lr: 0.000505 min_lr: 0.000505 loss: 2.2652 (2.2803) class_acc: 0.6914 (0.6921) weight_decay: 0.0500 (0.0500) grad_norm: 1.2508 (1.3139) time: 1.8425 data: 0.0007 max mem: 6925
Epoch: [235] [400/625] eta: 0:07:28 lr: 0.000500 min_lr: 0.000500 loss: 2.3212 (2.2916) class_acc: 0.6758 (0.6892) weight_decay: 0.0500 (0.0500) grad_norm: 1.3282 (1.3416) time: 1.9522 data: 0.0008 max mem: 6925
Epoch: [235] [600/625] eta: 0:00:49 lr: 0.000495 min_lr: 0.000495 loss: 2.2836 (2.2933) class_acc: 0.6914 (0.6887) weight_decay: 0.0500 (0.0500) grad_norm: 1.3004 (1.3366) time: 2.0249 data: 0.0121 max mem: 6925
Epoch: [235] [624/625] eta: 0:00:01 lr: 0.000495 min_lr: 0.000495 loss: 2.3037 (2.2938) class_acc: 0.6875 (0.6886) weight_decay: 0.0500 (0.0500) grad_norm: 1.2900 (1.3359) time: 0.6359 data: 0.0013 max mem: 6925
Epoch: [235] Total time: 0:20:27 (1.9646 s / it)
Averaged stats: lr: 0.000495 min_lr: 0.000495 loss: 2.3037 (2.2871) class_acc: 0.6875 (0.6895) weight_decay: 0.0500 (0.0500) grad_norm: 1.2900 (1.3359)
Test: [ 0/50] eta: 0:10:11 loss: 1.2019 (1.2019) acc1: 76.0000 (76.0000) acc5: 91.2000 (91.2000) time: 12.2264 data: 12.1683 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 1.0581 (1.0971) acc1: 76.8000 (77.3091) acc5: 92.0000 (91.5636) time: 2.0510 data: 2.0196 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.1928 (1.2307) acc1: 73.6000 (73.4857) acc5: 91.2000 (90.5143) time: 1.0641 data: 1.0352 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.3767 (1.2824) acc1: 71.2000 (72.0258) acc5: 88.8000 (90.0129) time: 0.9962 data: 0.9670 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.3450 (1.2942) acc1: 70.4000 (71.5512) acc5: 89.6000 (90.0098) time: 0.6295 data: 0.5990 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.3001 (1.2903) acc1: 70.4000 (71.2320) acc5: 89.6000 (89.7760) time: 0.6164 data: 0.5859 max mem: 6925
Test: Total time: 0:00:46 (0.9301 s / it)
* Acc@1 72.250 Acc@5 90.656 loss 1.240
Accuracy of the model on the 50000 test images: 72.3%
Max accuracy: 72.66%
Epoch: [236] [ 0/625] eta: 3:34:54 lr: 0.000495 min_lr: 0.000495 loss: 2.4172 (2.4172) class_acc: 0.6836 (0.6836) weight_decay: 0.0500 (0.0500) time: 20.6314 data: 16.3926 max mem: 6925
Epoch: [236] [200/625] eta: 0:14:28 lr: 0.000490 min_lr: 0.000490 loss: 2.2963 (2.2777) class_acc: 0.6953 (0.6908) weight_decay: 0.0500 (0.0500) grad_norm: 1.2879 (1.3007) time: 1.8309 data: 0.0009 max mem: 6925
Epoch: [236] [400/625] eta: 0:07:28 lr: 0.000485 min_lr: 0.000485 loss: 2.2533 (2.2775) class_acc: 0.7031 (0.6909) weight_decay: 0.0500 (0.0500) grad_norm: 1.2430 (1.3062) time: 1.9650 data: 0.0009 max mem: 6925
Epoch: [236] [600/625] eta: 0:00:49 lr: 0.000481 min_lr: 0.000481 loss: 2.2657 (2.2769) class_acc: 0.6992 (0.6914) weight_decay: 0.0500 (0.0500) grad_norm: 1.1816 (1.3140) time: 1.9569 data: 0.0008 max mem: 6925
Epoch: [236] [624/625] eta: 0:00:01 lr: 0.000480 min_lr: 0.000480 loss: 2.2830 (2.2772) class_acc: 0.6992 (0.6915) weight_decay: 0.0500 (0.0500) grad_norm: 1.1660 (1.3083) time: 0.8198 data: 0.0014 max mem: 6925
Epoch: [236] Total time: 0:20:11 (1.9388 s / it)
Averaged stats: lr: 0.000480 min_lr: 0.000480 loss: 2.2830 (2.2832) class_acc: 0.6992 (0.6899) weight_decay: 0.0500 (0.0500) grad_norm: 1.1660 (1.3083)
Test: [ 0/50] eta: 0:09:13 loss: 0.9479 (0.9479) acc1: 78.4000 (78.4000) acc5: 93.6000 (93.6000) time: 11.0685 data: 11.0212 max mem: 6925
Test: [10/50] eta: 0:01:11 loss: 0.9982 (1.0478) acc1: 78.4000 (77.2364) acc5: 92.8000 (92.2182) time: 1.7842 data: 1.7535 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.2523 (1.2053) acc1: 70.4000 (72.4952) acc5: 91.2000 (91.1619) time: 0.8770 data: 0.8468 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.3333 (1.2416) acc1: 68.8000 (71.6645) acc5: 90.4000 (90.8129) time: 0.9595 data: 0.9295 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.2357 (1.2401) acc1: 70.4000 (71.7659) acc5: 90.4000 (90.6342) time: 0.9493 data: 0.9185 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2357 (1.2372) acc1: 72.8000 (71.9360) acc5: 91.2000 (90.5440) time: 0.6087 data: 0.5775 max mem: 6925
Test: Total time: 0:00:51 (1.0226 s / it)
* Acc@1 72.848 Acc@5 90.884 loss 1.194
Accuracy of the model on the 50000 test images: 72.8%
Max accuracy: 72.85%
Epoch: [237] [ 0/625] eta: 4:24:36 lr: 0.000480 min_lr: 0.000480 loss: 2.2694 (2.2694) class_acc: 0.6953 (0.6953) weight_decay: 0.0500 (0.0500) time: 25.4028 data: 19.6384 max mem: 6925
Epoch: [237] [200/625] eta: 0:14:15 lr: 0.000475 min_lr: 0.000475 loss: 2.2668 (2.2699) class_acc: 0.6875 (0.6954) weight_decay: 0.0500 (0.0500) grad_norm: 1.4186 (1.3796) time: 1.9904 data: 0.1042 max mem: 6925
Epoch: [237] [400/625] eta: 0:07:27 lr: 0.000471 min_lr: 0.000471 loss: 2.2774 (2.2753) class_acc: 0.6875 (0.6940) weight_decay: 0.0500 (0.0500) grad_norm: 1.1265 (1.3441) time: 1.9936 data: 0.0804 max mem: 6925
Epoch: [237] [600/625] eta: 0:00:49 lr: 0.000466 min_lr: 0.000466 loss: 2.2888 (2.2800) class_acc: 0.6797 (0.6927) weight_decay: 0.0500 (0.0500) grad_norm: 1.3792 (1.3532) time: 2.1219 data: 0.0010 max mem: 6925
Epoch: [237] [624/625] eta: 0:00:01 lr: 0.000466 min_lr: 0.000466 loss: 2.3080 (2.2815) class_acc: 0.6758 (0.6922) weight_decay: 0.0500 (0.0500) grad_norm: 1.5570 (1.3631) time: 0.7284 data: 0.0019 max mem: 6925
Epoch: [237] Total time: 0:20:13 (1.9414 s / it)
Averaged stats: lr: 0.000466 min_lr: 0.000466 loss: 2.3080 (2.2798) class_acc: 0.6758 (0.6910) weight_decay: 0.0500 (0.0500) grad_norm: 1.5570 (1.3631)
Test: [ 0/50] eta: 0:10:01 loss: 1.1150 (1.1150) acc1: 76.8000 (76.8000) acc5: 91.2000 (91.2000) time: 12.0274 data: 11.9948 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 1.0699 (1.0690) acc1: 77.6000 (77.3818) acc5: 91.2000 (91.7091) time: 2.2095 data: 2.1786 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.2533 (1.2135) acc1: 72.8000 (73.3333) acc5: 90.4000 (90.6667) time: 1.2803 data: 1.2507 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.3360 (1.2452) acc1: 68.0000 (72.3613) acc5: 89.6000 (90.5290) time: 1.2277 data: 1.1991 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.2541 (1.2362) acc1: 72.0000 (72.7610) acc5: 91.2000 (90.5561) time: 0.8380 data: 0.8093 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2007 (1.2390) acc1: 72.0000 (72.5120) acc5: 91.2000 (90.3680) time: 0.8379 data: 0.8092 max mem: 6925
Test: Total time: 0:00:54 (1.0964 s / it)
* Acc@1 73.242 Acc@5 91.088 loss 1.200
Accuracy of the model on the 50000 test images: 73.2%
Max accuracy: 73.24%
Epoch: [238] [ 0/625] eta: 3:37:25 lr: 0.000466 min_lr: 0.000466 loss: 2.2361 (2.2361) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 20.8724 data: 17.8540 max mem: 6925
Epoch: [238] [200/625] eta: 0:14:38 lr: 0.000461 min_lr: 0.000461 loss: 2.2695 (2.2676) class_acc: 0.6875 (0.6968) weight_decay: 0.0500 (0.0500) grad_norm: 1.3844 (1.3135) time: 1.9578 data: 0.4575 max mem: 6925
Epoch: [238] [400/625] eta: 0:07:33 lr: 0.000456 min_lr: 0.000456 loss: 2.3187 (2.2773) class_acc: 0.6758 (0.6931) weight_decay: 0.0500 (0.0500) grad_norm: 1.2094 (1.3050) time: 1.9554 data: 0.0009 max mem: 6925
Epoch: [238] [600/625] eta: 0:00:50 lr: 0.000452 min_lr: 0.000452 loss: 2.3082 (2.2817) class_acc: 0.6836 (0.6920) weight_decay: 0.0500 (0.0500) grad_norm: 1.2358 (1.3090) time: 1.9701 data: 0.0010 max mem: 6925
Epoch: [238] [624/625] eta: 0:00:01 lr: 0.000451 min_lr: 0.000451 loss: 2.2339 (2.2810) class_acc: 0.6992 (0.6923) weight_decay: 0.0500 (0.0500) grad_norm: 1.2580 (1.3093) time: 0.9776 data: 0.0016 max mem: 6925
Epoch: [238] Total time: 0:20:32 (1.9727 s / it)
Averaged stats: lr: 0.000451 min_lr: 0.000451 loss: 2.2339 (2.2776) class_acc: 0.6992 (0.6917) weight_decay: 0.0500 (0.0500) grad_norm: 1.2580 (1.3093)
Test: [ 0/50] eta: 0:10:26 loss: 1.0580 (1.0580) acc1: 73.6000 (73.6000) acc5: 94.4000 (94.4000) time: 12.5353 data: 12.4980 max mem: 6925
Test: [10/50] eta: 0:01:20 loss: 0.9699 (1.0461) acc1: 78.4000 (77.8182) acc5: 92.8000 (92.3636) time: 2.0165 data: 1.9857 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.1753 (1.2075) acc1: 74.4000 (73.6000) acc5: 90.4000 (90.5905) time: 1.0142 data: 0.9849 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.3383 (1.2366) acc1: 68.8000 (72.6968) acc5: 89.6000 (90.3226) time: 1.0399 data: 1.0114 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.2347 (1.2489) acc1: 69.6000 (72.2341) acc5: 90.4000 (90.1268) time: 0.8064 data: 0.7779 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2142 (1.2556) acc1: 71.2000 (71.9520) acc5: 89.6000 (89.8880) time: 0.7932 data: 0.7642 max mem: 6925
Test: Total time: 0:00:49 (0.9952 s / it)
* Acc@1 72.660 Acc@5 90.764 loss 1.203
Accuracy of the model on the 50000 test images: 72.7%
Max accuracy: 73.24%
Epoch: [239] [ 0/625] eta: 3:23:07 lr: 0.000451 min_lr: 0.000451 loss: 2.3324 (2.3324) class_acc: 0.6953 (0.6953) weight_decay: 0.0500 (0.0500) time: 19.4996 data: 19.0698 max mem: 6925
Epoch: [239] [200/625] eta: 0:14:31 lr: 0.000447 min_lr: 0.000447 loss: 2.2174 (2.2636) class_acc: 0.6953 (0.6952) weight_decay: 0.0500 (0.0500) grad_norm: 1.2139 (1.3293) time: 1.8667 data: 0.0922 max mem: 6925
Epoch: [239] [400/625] eta: 0:07:19 lr: 0.000442 min_lr: 0.000442 loss: 2.2556 (2.2709) class_acc: 0.6953 (0.6937) weight_decay: 0.0500 (0.0500) grad_norm: 1.3150 (1.3697) time: 1.9220 data: 1.6152 max mem: 6925
Epoch: [239] [600/625] eta: 0:00:48 lr: 0.000438 min_lr: 0.000438 loss: 2.2665 (2.2751) class_acc: 0.6914 (0.6932) weight_decay: 0.0500 (0.0500) grad_norm: 1.3161 (1.3634) time: 1.9271 data: 1.6274 max mem: 6925
Epoch: [239] [624/625] eta: 0:00:01 lr: 0.000437 min_lr: 0.000437 loss: 2.3241 (2.2765) class_acc: 0.6758 (0.6928) weight_decay: 0.0500 (0.0500) grad_norm: 1.3443 (1.3697) time: 0.9045 data: 0.4176 max mem: 6925
Epoch: [239] Total time: 0:19:57 (1.9154 s / it)
Averaged stats: lr: 0.000437 min_lr: 0.000437 loss: 2.3241 (2.2728) class_acc: 0.6758 (0.6931) weight_decay: 0.0500 (0.0500) grad_norm: 1.3443 (1.3697)
Test: [ 0/50] eta: 0:10:19 loss: 1.1218 (1.1218) acc1: 78.4000 (78.4000) acc5: 91.2000 (91.2000) time: 12.3858 data: 12.3436 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.0238 (1.0641) acc1: 77.6000 (77.9636) acc5: 91.2000 (90.9818) time: 2.0491 data: 2.0182 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.2090 (1.2206) acc1: 73.6000 (73.5619) acc5: 90.4000 (90.0191) time: 1.0530 data: 1.0219 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.3440 (1.2539) acc1: 68.8000 (72.5419) acc5: 89.6000 (89.8839) time: 0.9736 data: 0.9426 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.2928 (1.2621) acc1: 68.8000 (72.0585) acc5: 89.6000 (89.8732) time: 0.5865 data: 0.5571 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2056 (1.2603) acc1: 72.0000 (71.8400) acc5: 90.4000 (89.9520) time: 0.5862 data: 0.5570 max mem: 6925
Test: Total time: 0:00:45 (0.9125 s / it)
* Acc@1 72.820 Acc@5 90.760 loss 1.214
Accuracy of the model on the 50000 test images: 72.8%
Max accuracy: 73.24%
Epoch: [240] [ 0/625] eta: 3:32:58 lr: 0.000437 min_lr: 0.000437 loss: 2.2847 (2.2847) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 20.4449 data: 16.1014 max mem: 6925
Epoch: [240] [200/625] eta: 0:14:15 lr: 0.000433 min_lr: 0.000433 loss: 2.2900 (2.2674) class_acc: 0.6836 (0.6965) weight_decay: 0.0500 (0.0500) grad_norm: 1.2315 (1.3215) time: 1.8823 data: 0.0008 max mem: 6925
Epoch: [240] [400/625] eta: 0:07:26 lr: 0.000428 min_lr: 0.000428 loss: 2.2877 (2.2742) class_acc: 0.6797 (0.6936) weight_decay: 0.0500 (0.0500) grad_norm: 1.3390 (1.3666) time: 1.8629 data: 0.0007 max mem: 6925
Epoch: [240] [600/625] eta: 0:00:49 lr: 0.000424 min_lr: 0.000424 loss: 2.2935 (2.2742) class_acc: 0.6836 (0.6936) weight_decay: 0.0500 (0.0500) grad_norm: 1.1968 (1.3428) time: 1.9368 data: 0.0008 max mem: 6925
Epoch: [240] [624/625] eta: 0:00:01 lr: 0.000423 min_lr: 0.000423 loss: 2.2854 (2.2743) class_acc: 0.6797 (0.6934) weight_decay: 0.0500 (0.0500) grad_norm: 1.1662 (1.3453) time: 0.7694 data: 0.0015 max mem: 6925
Epoch: [240] Total time: 0:20:10 (1.9373 s / it)
Averaged stats: lr: 0.000423 min_lr: 0.000423 loss: 2.2854 (2.2718) class_acc: 0.6797 (0.6934) weight_decay: 0.0500 (0.0500) grad_norm: 1.1662 (1.3453)
Test: [ 0/50] eta: 0:09:20 loss: 1.0342 (1.0342) acc1: 76.0000 (76.0000) acc5: 92.8000 (92.8000) time: 11.2152 data: 11.1832 max mem: 6925
Test: [10/50] eta: 0:01:11 loss: 1.0032 (1.0465) acc1: 77.6000 (77.2364) acc5: 92.8000 (92.8727) time: 1.7870 data: 1.7572 max mem: 6925
Test: [20/50] eta: 0:00:41 loss: 1.2548 (1.2060) acc1: 74.4000 (72.6476) acc5: 90.4000 (91.0476) time: 0.8894 data: 0.8602 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.3195 (1.2318) acc1: 68.0000 (72.0258) acc5: 88.8000 (90.7871) time: 1.0851 data: 1.0566 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.2198 (1.2257) acc1: 72.8000 (72.5463) acc5: 90.4000 (90.5756) time: 1.0350 data: 1.0056 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2198 (1.2295) acc1: 72.8000 (72.4000) acc5: 90.4000 (90.4800) time: 0.6181 data: 0.5886 max mem: 6925
Test: Total time: 0:00:53 (1.0781 s / it)
* Acc@1 73.374 Acc@5 91.096 loss 1.192
Accuracy of the model on the 50000 test images: 73.4%
Max accuracy: 73.37%
Epoch: [241] [ 0/625] eta: 3:23:19 lr: 0.000423 min_lr: 0.000423 loss: 2.0499 (2.0499) class_acc: 0.7422 (0.7422) weight_decay: 0.0500 (0.0500) time: 19.5187 data: 18.6040 max mem: 6925
Epoch: [241] [200/625] eta: 0:13:45 lr: 0.000419 min_lr: 0.000419 loss: 2.2887 (2.2724) class_acc: 0.6836 (0.6933) weight_decay: 0.0500 (0.0500) grad_norm: 1.2623 (1.2942) time: 1.8893 data: 0.8963 max mem: 6925
Epoch: [241] [400/625] eta: 0:07:20 lr: 0.000415 min_lr: 0.000415 loss: 2.2768 (2.2681) class_acc: 0.6875 (0.6953) weight_decay: 0.0500 (0.0500) grad_norm: 1.1737 (1.3190) time: 1.7973 data: 0.0008 max mem: 6925
Epoch: [241] [600/625] eta: 0:00:49 lr: 0.000410 min_lr: 0.000410 loss: 2.2408 (2.2689) class_acc: 0.7031 (0.6954) weight_decay: 0.0500 (0.0500) grad_norm: 1.4079 (1.3404) time: 2.0672 data: 0.0011 max mem: 6925
Epoch: [241] [624/625] eta: 0:00:01 lr: 0.000410 min_lr: 0.000410 loss: 2.2818 (2.2696) class_acc: 0.6914 (0.6954) weight_decay: 0.0500 (0.0500) grad_norm: 1.4310 (1.3459) time: 0.5845 data: 0.0015 max mem: 6925
Epoch: [241] Total time: 0:20:15 (1.9454 s / it)
Averaged stats: lr: 0.000410 min_lr: 0.000410 loss: 2.2818 (2.2665) class_acc: 0.6914 (0.6948) weight_decay: 0.0500 (0.0500) grad_norm: 1.4310 (1.3459)
Test: [ 0/50] eta: 0:09:05 loss: 1.0530 (1.0530) acc1: 79.2000 (79.2000) acc5: 91.2000 (91.2000) time: 10.9146 data: 10.8803 max mem: 6925
Test: [10/50] eta: 0:01:13 loss: 1.0438 (1.0815) acc1: 76.0000 (76.7273) acc5: 92.8000 (91.9273) time: 1.8251 data: 1.7958 max mem: 6925
Test: [20/50] eta: 0:00:42 loss: 1.1925 (1.2237) acc1: 72.8000 (73.1048) acc5: 91.2000 (90.7048) time: 0.9348 data: 0.9061 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.3165 (1.2513) acc1: 68.8000 (72.1548) acc5: 88.8000 (90.3484) time: 0.9498 data: 0.9205 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.3165 (1.2497) acc1: 69.6000 (72.2537) acc5: 89.6000 (90.1659) time: 0.8669 data: 0.8376 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2470 (1.2492) acc1: 72.0000 (72.2080) acc5: 89.6000 (90.2080) time: 0.6148 data: 0.5850 max mem: 6925
Test: Total time: 0:00:50 (1.0094 s / it)
* Acc@1 73.142 Acc@5 90.898 loss 1.204
Accuracy of the model on the 50000 test images: 73.1%
Max accuracy: 73.37%
Epoch: [242] [ 0/625] eta: 3:43:07 lr: 0.000410 min_lr: 0.000410 loss: 2.2726 (2.2726) class_acc: 0.6953 (0.6953) weight_decay: 0.0500 (0.0500) time: 21.4193 data: 20.3956 max mem: 6925
Epoch: [242] [200/625] eta: 0:14:22 lr: 0.000405 min_lr: 0.000405 loss: 2.2385 (2.2459) class_acc: 0.7031 (0.6993) weight_decay: 0.0500 (0.0500) grad_norm: 1.2120 (1.3187) time: 2.0184 data: 0.8368 max mem: 6925
Epoch: [242] [400/625] eta: 0:07:26 lr: 0.000401 min_lr: 0.000401 loss: 2.2714 (2.2537) class_acc: 0.6953 (0.6979) weight_decay: 0.0500 (0.0500) grad_norm: 1.2700 (1.3544) time: 1.8004 data: 0.0012 max mem: 6925
Epoch: [242] [600/625] eta: 0:00:49 lr: 0.000397 min_lr: 0.000397 loss: 2.2633 (2.2608) class_acc: 0.6836 (0.6964) weight_decay: 0.0500 (0.0500) grad_norm: 1.4691 (1.3852) time: 2.1561 data: 0.0017 max mem: 6925
Epoch: [242] [624/625] eta: 0:00:01 lr: 0.000396 min_lr: 0.000396 loss: 2.2507 (2.2606) class_acc: 0.6953 (0.6965) weight_decay: 0.0500 (0.0500) grad_norm: 1.2921 (1.3815) time: 0.7682 data: 0.0028 max mem: 6925
Epoch: [242] Total time: 0:20:13 (1.9409 s / it)
Averaged stats: lr: 0.000396 min_lr: 0.000396 loss: 2.2507 (2.2627) class_acc: 0.6953 (0.6957) weight_decay: 0.0500 (0.0500) grad_norm: 1.2921 (1.3815)
Test: [ 0/50] eta: 0:11:11 loss: 1.0903 (1.0903) acc1: 73.6000 (73.6000) acc5: 92.8000 (92.8000) time: 13.4331 data: 13.4005 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.0287 (1.0536) acc1: 77.6000 (76.8000) acc5: 92.0000 (91.5636) time: 2.0310 data: 2.0010 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.1360 (1.1827) acc1: 73.6000 (73.1429) acc5: 90.4000 (90.4762) time: 0.9073 data: 0.8766 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.2935 (1.2215) acc1: 70.4000 (72.3355) acc5: 89.6000 (90.2710) time: 0.9155 data: 0.8848 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.2203 (1.2181) acc1: 71.2000 (72.6049) acc5: 90.4000 (90.3610) time: 0.9077 data: 0.8767 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2122 (1.2177) acc1: 72.8000 (72.4640) acc5: 90.4000 (90.3200) time: 0.6316 data: 0.5999 max mem: 6925
Test: Total time: 0:00:53 (1.0688 s / it)
* Acc@1 73.390 Acc@5 91.108 loss 1.173
Accuracy of the model on the 50000 test images: 73.4%
Max accuracy: 73.39%
Epoch: [243] [ 0/625] eta: 3:43:01 lr: 0.000396 min_lr: 0.000396 loss: 2.0966 (2.0966) class_acc: 0.7461 (0.7461) weight_decay: 0.0500 (0.0500) time: 21.4109 data: 17.8488 max mem: 6925
Epoch: [243] [200/625] eta: 0:14:27 lr: 0.000392 min_lr: 0.000392 loss: 2.2890 (2.2527) class_acc: 0.6875 (0.6985) weight_decay: 0.0500 (0.0500) grad_norm: 1.1781 (1.2944) time: 1.8410 data: 0.0007 max mem: 6925
Epoch: [243] [400/625] eta: 0:07:32 lr: 0.000388 min_lr: 0.000388 loss: 2.2877 (2.2567) class_acc: 0.6875 (0.6976) weight_decay: 0.0500 (0.0500) grad_norm: 1.3498 (1.3259) time: 2.0016 data: 0.0008 max mem: 6925
Epoch: [243] [600/625] eta: 0:00:50 lr: 0.000383 min_lr: 0.000383 loss: 2.2739 (2.2588) class_acc: 0.6992 (0.6970) weight_decay: 0.0500 (0.0500) grad_norm: 1.2896 (1.3612) time: 2.0805 data: 0.0008 max mem: 6925
Epoch: [243] [624/625] eta: 0:00:01 lr: 0.000383 min_lr: 0.000383 loss: 2.2889 (2.2600) class_acc: 0.6953 (0.6969) weight_decay: 0.0500 (0.0500) grad_norm: 1.3526 (1.3635) time: 0.7966 data: 0.0015 max mem: 6925
Epoch: [243] Total time: 0:20:27 (1.9643 s / it)
Averaged stats: lr: 0.000383 min_lr: 0.000383 loss: 2.2889 (2.2608) class_acc: 0.6953 (0.6960) weight_decay: 0.0500 (0.0500) grad_norm: 1.3526 (1.3635)
Test: [ 0/50] eta: 0:08:55 loss: 1.0092 (1.0092) acc1: 77.6000 (77.6000) acc5: 93.6000 (93.6000) time: 10.7121 data: 10.6806 max mem: 6925
Test: [10/50] eta: 0:01:08 loss: 0.9597 (1.0064) acc1: 78.4000 (78.7636) acc5: 92.8000 (92.6546) time: 1.7115 data: 1.6816 max mem: 6925
Test: [20/50] eta: 0:00:39 loss: 1.1448 (1.1714) acc1: 75.2000 (74.1333) acc5: 91.2000 (91.3905) time: 0.8615 data: 0.8316 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.3297 (1.2131) acc1: 69.6000 (72.9548) acc5: 89.6000 (90.9677) time: 0.8962 data: 0.8668 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.2860 (1.2145) acc1: 70.4000 (72.7805) acc5: 91.2000 (91.0049) time: 0.7638 data: 0.7350 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2600 (1.2230) acc1: 70.4000 (72.4480) acc5: 91.2000 (90.6880) time: 0.5844 data: 0.5554 max mem: 6925
Test: Total time: 0:00:46 (0.9382 s / it)
* Acc@1 73.514 Acc@5 91.240 loss 1.181
Accuracy of the model on the 50000 test images: 73.5%
Max accuracy: 73.51%
Epoch: [244] [ 0/625] eta: 3:36:50 lr: 0.000383 min_lr: 0.000383 loss: 2.2062 (2.2062) class_acc: 0.7148 (0.7148) weight_decay: 0.0500 (0.0500) time: 20.8161 data: 20.5827 max mem: 6925
Epoch: [244] [200/625] eta: 0:14:24 lr: 0.000379 min_lr: 0.000379 loss: 2.2502 (2.2568) class_acc: 0.7031 (0.6980) weight_decay: 0.0500 (0.0500) grad_norm: 1.3217 (1.3771) time: 2.0995 data: 0.4320 max mem: 6925
Epoch: [244] [400/625] eta: 0:07:21 lr: 0.000374 min_lr: 0.000374 loss: 2.2077 (2.2642) class_acc: 0.7109 (0.6968) weight_decay: 0.0500 (0.0500) grad_norm: 1.4217 (inf) time: 2.0459 data: 0.1243 max mem: 6925
Epoch: [244] [600/625] eta: 0:00:48 lr: 0.000370 min_lr: 0.000370 loss: 2.2186 (2.2604) class_acc: 0.7070 (0.6970) weight_decay: 0.0500 (0.0500) grad_norm: 1.3405 (inf) time: 2.0659 data: 0.0010 max mem: 6925
Epoch: [244] [624/625] eta: 0:00:01 lr: 0.000370 min_lr: 0.000370 loss: 2.2997 (2.2615) class_acc: 0.6758 (0.6967) weight_decay: 0.0500 (0.0500) grad_norm: 1.4702 (inf) time: 0.6547 data: 0.0019 max mem: 6925
Epoch: [244] Total time: 0:20:00 (1.9200 s / it)
Averaged stats: lr: 0.000370 min_lr: 0.000370 loss: 2.2997 (2.2580) class_acc: 0.6758 (0.6968) weight_decay: 0.0500 (0.0500) grad_norm: 1.4702 (inf)
Test: [ 0/50] eta: 0:09:55 loss: 1.2149 (1.2149) acc1: 74.4000 (74.4000) acc5: 91.2000 (91.2000) time: 11.9180 data: 11.8857 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 0.9947 (1.0388) acc1: 77.6000 (77.3818) acc5: 93.6000 (92.1455) time: 1.8172 data: 1.7865 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.1552 (1.1702) acc1: 73.6000 (73.9429) acc5: 92.0000 (90.8952) time: 0.8363 data: 0.8068 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.2876 (1.1908) acc1: 70.4000 (73.3677) acc5: 90.4000 (90.7355) time: 0.8657 data: 0.8371 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1439 (1.2000) acc1: 72.0000 (73.1707) acc5: 91.2000 (90.7707) time: 0.8546 data: 0.8253 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.2032 (1.2042) acc1: 72.0000 (73.0560) acc5: 91.2000 (90.6400) time: 0.7189 data: 0.6883 max mem: 6925
Test: Total time: 0:00:50 (1.0033 s / it)
* Acc@1 73.514 Acc@5 91.234 loss 1.163
Accuracy of the model on the 50000 test images: 73.5%
Max accuracy: 73.51%
Epoch: [245] [ 0/625] eta: 3:27:06 lr: 0.000370 min_lr: 0.000370 loss: 2.2414 (2.2414) class_acc: 0.7148 (0.7148) weight_decay: 0.0500 (0.0500) time: 19.8818 data: 19.6521 max mem: 6925
Epoch: [245] [200/625] eta: 0:14:07 lr: 0.000366 min_lr: 0.000366 loss: 2.2512 (2.2425) class_acc: 0.6875 (0.7016) weight_decay: 0.0500 (0.0500) grad_norm: 1.4381 (inf) time: 1.9013 data: 0.0009 max mem: 6925
Epoch: [245] [400/625] eta: 0:07:16 lr: 0.000362 min_lr: 0.000362 loss: 2.2801 (2.2478) class_acc: 0.6953 (0.7008) weight_decay: 0.0500 (0.0500) grad_norm: 1.4382 (inf) time: 1.9098 data: 0.0011 max mem: 6925
Epoch: [245] [600/625] eta: 0:00:48 lr: 0.000357 min_lr: 0.000357 loss: 2.2554 (2.2514) class_acc: 0.7031 (0.6995) weight_decay: 0.0500 (0.0500) grad_norm: 1.2069 (inf) time: 1.9784 data: 0.0009 max mem: 6925
Epoch: [245] [624/625] eta: 0:00:01 lr: 0.000357 min_lr: 0.000357 loss: 2.2734 (2.2534) class_acc: 0.6836 (0.6988) weight_decay: 0.0500 (0.0500) grad_norm: 1.2170 (inf) time: 0.7114 data: 0.0017 max mem: 6925
Epoch: [245] Total time: 0:19:47 (1.8994 s / it)
Averaged stats: lr: 0.000357 min_lr: 0.000357 loss: 2.2734 (2.2547) class_acc: 0.6836 (0.6981) weight_decay: 0.0500 (0.0500) grad_norm: 1.2170 (inf)
Test: [ 0/50] eta: 0:10:24 loss: 1.0714 (1.0714) acc1: 74.4000 (74.4000) acc5: 93.6000 (93.6000) time: 12.4974 data: 12.4529 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 0.9995 (1.0454) acc1: 77.6000 (77.3091) acc5: 92.8000 (92.5091) time: 2.1812 data: 2.1487 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.1623 (1.1604) acc1: 73.6000 (73.7905) acc5: 91.2000 (91.3143) time: 1.1429 data: 1.1122 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.3299 (1.2039) acc1: 70.4000 (72.8516) acc5: 90.4000 (90.9936) time: 1.0371 data: 1.0068 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.2289 (1.2081) acc1: 69.6000 (72.6829) acc5: 90.4000 (90.9463) time: 0.7477 data: 0.7182 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2100 (1.2123) acc1: 70.4000 (72.6240) acc5: 90.4000 (90.8960) time: 0.7477 data: 0.7185 max mem: 6925
Test: Total time: 0:00:52 (1.0458 s / it)
* Acc@1 73.600 Acc@5 91.358 loss 1.173
Accuracy of the model on the 50000 test images: 73.6%
Max accuracy: 73.60%
Epoch: [246] [ 0/625] eta: 3:29:30 lr: 0.000357 min_lr: 0.000357 loss: 2.4930 (2.4930) class_acc: 0.6289 (0.6289) weight_decay: 0.0500 (0.0500) time: 20.1122 data: 19.8153 max mem: 6925
Epoch: [246] [200/625] eta: 0:13:59 lr: 0.000353 min_lr: 0.000353 loss: 2.2608 (2.2442) class_acc: 0.6914 (0.7002) weight_decay: 0.0500 (0.0500) grad_norm: 1.1556 (1.4006) time: 1.8745 data: 0.4296 max mem: 6925
Epoch: [246] [400/625] eta: 0:07:21 lr: 0.000349 min_lr: 0.000349 loss: 2.2060 (2.2495) class_acc: 0.6992 (0.6983) weight_decay: 0.0500 (0.0500) grad_norm: 1.3398 (1.3992) time: 1.9078 data: 0.0916 max mem: 6925
Epoch: [246] [600/625] eta: 0:00:49 lr: 0.000345 min_lr: 0.000345 loss: 2.2810 (2.2511) class_acc: 0.6953 (0.6981) weight_decay: 0.0500 (0.0500) grad_norm: 1.3877 (1.4006) time: 2.2630 data: 0.0008 max mem: 6925
Epoch: [246] [624/625] eta: 0:00:01 lr: 0.000344 min_lr: 0.000344 loss: 2.2546 (2.2517) class_acc: 0.6875 (0.6978) weight_decay: 0.0500 (0.0500) grad_norm: 1.3993 (1.4056) time: 0.4447 data: 0.0017 max mem: 6925
Epoch: [246] Total time: 0:20:25 (1.9602 s / it)
Averaged stats: lr: 0.000344 min_lr: 0.000344 loss: 2.2546 (2.2498) class_acc: 0.6875 (0.6990) weight_decay: 0.0500 (0.0500) grad_norm: 1.3993 (1.4056)
Test: [ 0/50] eta: 0:09:52 loss: 0.9888 (0.9888) acc1: 75.2000 (75.2000) acc5: 94.4000 (94.4000) time: 11.8477 data: 11.7991 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.0032 (1.0077) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0727) time: 2.1376 data: 2.1049 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.0887 (1.1399) acc1: 72.8000 (73.9048) acc5: 90.4000 (90.7048) time: 1.2091 data: 1.1787 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.2394 (1.1826) acc1: 70.4000 (73.1097) acc5: 89.6000 (90.4258) time: 1.2328 data: 1.2036 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.2306 (1.1949) acc1: 71.2000 (72.9561) acc5: 89.6000 (90.2439) time: 0.8931 data: 0.8642 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2212 (1.1972) acc1: 72.0000 (72.8640) acc5: 89.6000 (90.0960) time: 0.6879 data: 0.6588 max mem: 6925
Test: Total time: 0:00:55 (1.1094 s / it)
* Acc@1 73.596 Acc@5 91.146 loss 1.159
Accuracy of the model on the 50000 test images: 73.6%
Max accuracy: 73.60%
Epoch: [247] [ 0/625] eta: 3:42:35 lr: 0.000344 min_lr: 0.000344 loss: 2.3618 (2.3618) class_acc: 0.6797 (0.6797) weight_decay: 0.0500 (0.0500) time: 21.3689 data: 21.1369 max mem: 6925
Epoch: [247] [200/625] eta: 0:14:25 lr: 0.000340 min_lr: 0.000340 loss: 2.1797 (2.2326) class_acc: 0.7227 (0.7063) weight_decay: 0.0500 (0.0500) grad_norm: 1.2321 (1.4331) time: 1.8870 data: 0.0009 max mem: 6925
Epoch: [247] [400/625] eta: 0:07:30 lr: 0.000336 min_lr: 0.000336 loss: 2.2629 (2.2387) class_acc: 0.6992 (0.7032) weight_decay: 0.0500 (0.0500) grad_norm: 1.2098 (1.3899) time: 1.8981 data: 0.0011 max mem: 6925
Epoch: [247] [600/625] eta: 0:00:49 lr: 0.000332 min_lr: 0.000332 loss: 2.3020 (2.2482) class_acc: 0.6797 (0.7005) weight_decay: 0.0500 (0.0500) grad_norm: 1.2951 (1.3920) time: 2.0168 data: 0.0010 max mem: 6925
Epoch: [247] [624/625] eta: 0:00:01 lr: 0.000332 min_lr: 0.000332 loss: 2.2813 (2.2496) class_acc: 0.6875 (0.7003) weight_decay: 0.0500 (0.0500) grad_norm: 1.3723 (1.3964) time: 0.6361 data: 0.0019 max mem: 6925
Epoch: [247] Total time: 0:20:35 (1.9763 s / it)
Averaged stats: lr: 0.000332 min_lr: 0.000332 loss: 2.2813 (2.2480) class_acc: 0.6875 (0.6997) weight_decay: 0.0500 (0.0500) grad_norm: 1.3723 (1.3964)
Test: [ 0/50] eta: 0:11:03 loss: 0.9610 (0.9610) acc1: 76.0000 (76.0000) acc5: 94.4000 (94.4000) time: 13.2611 data: 13.2200 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 0.9610 (1.0373) acc1: 77.6000 (76.8000) acc5: 92.8000 (92.0727) time: 2.2600 data: 2.2288 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.1854 (1.1716) acc1: 72.8000 (73.3714) acc5: 90.4000 (90.8571) time: 1.1853 data: 1.1552 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.3022 (1.1949) acc1: 69.6000 (72.9290) acc5: 90.4000 (90.6839) time: 1.1646 data: 1.1352 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1834 (1.1977) acc1: 72.8000 (72.8195) acc5: 90.4000 (90.7512) time: 0.7780 data: 0.7489 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1967 (1.1990) acc1: 73.6000 (72.8160) acc5: 91.2000 (90.6720) time: 0.7239 data: 0.6950 max mem: 6925
Test: Total time: 0:00:52 (1.0588 s / it)
* Acc@1 73.594 Acc@5 91.494 loss 1.155
Accuracy of the model on the 50000 test images: 73.6%
Max accuracy: 73.60%
Epoch: [248] [ 0/625] eta: 4:04:44 lr: 0.000332 min_lr: 0.000332 loss: 2.1838 (2.1838) class_acc: 0.7539 (0.7539) weight_decay: 0.0500 (0.0500) time: 23.4950 data: 18.0187 max mem: 6925
Epoch: [248] [200/625] eta: 0:14:56 lr: 0.000328 min_lr: 0.000328 loss: 2.2083 (2.2391) class_acc: 0.7031 (0.7029) weight_decay: 0.0500 (0.0500) grad_norm: 1.4177 (1.4159) time: 2.0200 data: 0.0008 max mem: 6925
Epoch: [248] [400/625] eta: 0:07:39 lr: 0.000324 min_lr: 0.000324 loss: 2.2045 (2.2368) class_acc: 0.7031 (0.7036) weight_decay: 0.0500 (0.0500) grad_norm: 1.2312 (1.4138) time: 1.9045 data: 0.0009 max mem: 6925
Epoch: [248] [600/625] eta: 0:00:50 lr: 0.000320 min_lr: 0.000320 loss: 2.2633 (2.2420) class_acc: 0.6992 (0.7023) weight_decay: 0.0500 (0.0500) grad_norm: 1.3613 (1.3995) time: 2.0958 data: 0.0009 max mem: 6925
Epoch: [248] [624/625] eta: 0:00:01 lr: 0.000320 min_lr: 0.000320 loss: 2.2023 (2.2416) class_acc: 0.7031 (0.7026) weight_decay: 0.0500 (0.0500) grad_norm: 1.4140 (1.4059) time: 0.8223 data: 0.0012 max mem: 6925
Epoch: [248] Total time: 0:20:45 (1.9931 s / it)
Averaged stats: lr: 0.000320 min_lr: 0.000320 loss: 2.2023 (2.2452) class_acc: 0.7031 (0.7006) weight_decay: 0.0500 (0.0500) grad_norm: 1.4140 (1.4059)
Test: [ 0/50] eta: 0:11:32 loss: 1.0167 (1.0167) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 13.8452 data: 13.8093 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 0.9737 (1.0165) acc1: 79.2000 (77.8909) acc5: 92.0000 (92.0727) time: 2.2934 data: 2.2609 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.1277 (1.1488) acc1: 72.8000 (74.0191) acc5: 92.0000 (91.1238) time: 1.2221 data: 1.1917 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.2305 (1.1783) acc1: 70.4000 (73.0839) acc5: 89.6000 (90.8387) time: 1.2741 data: 1.2453 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1573 (1.1803) acc1: 70.4000 (72.9756) acc5: 90.4000 (90.7317) time: 0.9018 data: 0.8718 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1677 (1.1818) acc1: 72.0000 (72.7360) acc5: 90.4000 (90.6240) time: 0.8718 data: 0.8407 max mem: 6925
Test: Total time: 0:00:56 (1.1380 s / it)
* Acc@1 73.878 Acc@5 91.520 loss 1.138
Accuracy of the model on the 50000 test images: 73.9%
Max accuracy: 73.88%
Epoch: [249] [ 0/625] eta: 3:27:56 lr: 0.000320 min_lr: 0.000320 loss: 2.1747 (2.1747) class_acc: 0.6836 (0.6836) weight_decay: 0.0500 (0.0500) time: 19.9631 data: 19.6286 max mem: 6925
Epoch: [249] [200/625] eta: 0:14:26 lr: 0.000316 min_lr: 0.000316 loss: 2.1871 (2.2307) class_acc: 0.7188 (0.7031) weight_decay: 0.0500 (0.0500) grad_norm: 1.3659 (1.4105) time: 1.8310 data: 0.0008 max mem: 6925
Epoch: [249] [400/625] eta: 0:07:27 lr: 0.000312 min_lr: 0.000312 loss: 2.2185 (2.2334) class_acc: 0.6953 (0.7028) weight_decay: 0.0500 (0.0500) grad_norm: 1.3978 (1.4058) time: 2.1293 data: 0.0008 max mem: 6925
Epoch: [249] [600/625] eta: 0:00:49 lr: 0.000308 min_lr: 0.000308 loss: 2.2224 (2.2370) class_acc: 0.7031 (0.7028) weight_decay: 0.0500 (0.0500) grad_norm: 1.4152 (1.4164) time: 1.9060 data: 0.0007 max mem: 6925
Epoch: [249] [624/625] eta: 0:00:01 lr: 0.000308 min_lr: 0.000308 loss: 2.1887 (2.2360) class_acc: 0.7148 (0.7031) weight_decay: 0.0500 (0.0500) grad_norm: 1.3956 (1.4201) time: 0.9520 data: 0.0014 max mem: 6925
Epoch: [249] Total time: 0:20:21 (1.9544 s / it)
Averaged stats: lr: 0.000308 min_lr: 0.000308 loss: 2.1887 (2.2407) class_acc: 0.7148 (0.7010) weight_decay: 0.0500 (0.0500) grad_norm: 1.3956 (1.4201)
Test: [ 0/50] eta: 0:10:23 loss: 1.0345 (1.0345) acc1: 77.6000 (77.6000) acc5: 92.0000 (92.0000) time: 12.4766 data: 12.4309 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.0014 (0.9999) acc1: 77.6000 (78.1818) acc5: 92.8000 (92.3636) time: 2.1272 data: 2.0957 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.0730 (1.1554) acc1: 74.4000 (74.5524) acc5: 91.2000 (91.0095) time: 1.1521 data: 1.1215 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.2924 (1.1856) acc1: 71.2000 (73.6516) acc5: 89.6000 (90.8645) time: 0.9882 data: 0.9584 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1777 (1.1898) acc1: 71.2000 (73.2683) acc5: 90.4000 (90.8293) time: 0.5593 data: 0.5308 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1479 (1.1890) acc1: 71.2000 (73.1360) acc5: 90.4000 (90.7520) time: 0.5185 data: 0.4892 max mem: 6925
Test: Total time: 0:00:47 (0.9455 s / it)
* Acc@1 74.052 Acc@5 91.550 loss 1.144
Accuracy of the model on the 50000 test images: 74.1%
Max accuracy: 74.05%
Epoch: [250] [ 0/625] eta: 3:24:40 lr: 0.000307 min_lr: 0.000307 loss: 2.2041 (2.2041) class_acc: 0.7148 (0.7148) weight_decay: 0.0500 (0.0500) time: 19.6493 data: 16.5582 max mem: 6925
Epoch: [250] [200/625] eta: 0:15:30 lr: 0.000304 min_lr: 0.000304 loss: 2.2413 (2.2351) class_acc: 0.6953 (0.7033) weight_decay: 0.0500 (0.0500) grad_norm: 1.4961 (1.4279) time: 2.0497 data: 0.0009 max mem: 6925
Epoch: [250] [400/625] eta: 0:07:59 lr: 0.000300 min_lr: 0.000300 loss: 2.2354 (2.2361) class_acc: 0.6992 (0.7031) weight_decay: 0.0500 (0.0500) grad_norm: 1.3409 (1.4444) time: 2.0117 data: 0.0010 max mem: 6925
Epoch: [250] [600/625] eta: 0:00:52 lr: 0.000296 min_lr: 0.000296 loss: 2.2610 (2.2377) class_acc: 0.6836 (0.7023) weight_decay: 0.0500 (0.0500) grad_norm: 1.4499 (1.4382) time: 2.0365 data: 0.0013 max mem: 6925
Epoch: [250] [624/625] eta: 0:00:02 lr: 0.000296 min_lr: 0.000296 loss: 2.2256 (2.2374) class_acc: 0.7031 (0.7024) weight_decay: 0.0500 (0.0500) grad_norm: 1.3618 (1.4363) time: 0.7410 data: 0.0013 max mem: 6925
Epoch: [250] Total time: 0:21:23 (2.0542 s / it)
Averaged stats: lr: 0.000296 min_lr: 0.000296 loss: 2.2256 (2.2389) class_acc: 0.7031 (0.7019) weight_decay: 0.0500 (0.0500) grad_norm: 1.3618 (1.4363)
Test: [ 0/50] eta: 0:10:17 loss: 0.9797 (0.9797) acc1: 74.4000 (74.4000) acc5: 93.6000 (93.6000) time: 12.3506 data: 12.3037 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 0.9668 (0.9974) acc1: 76.8000 (78.1818) acc5: 93.6000 (92.3636) time: 2.1889 data: 2.1580 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.1040 (1.1449) acc1: 73.6000 (74.0191) acc5: 92.0000 (91.3524) time: 1.2041 data: 1.1749 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.2400 (1.1787) acc1: 71.2000 (73.3161) acc5: 90.4000 (90.9677) time: 1.1707 data: 1.1420 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1998 (1.1778) acc1: 71.2000 (73.5220) acc5: 91.2000 (91.0244) time: 0.8012 data: 0.7722 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1998 (1.1756) acc1: 72.8000 (73.5040) acc5: 91.2000 (90.8960) time: 0.8028 data: 0.7734 max mem: 6925
Test: Total time: 0:00:52 (1.0579 s / it)
* Acc@1 74.210 Acc@5 91.670 loss 1.136
Accuracy of the model on the 50000 test images: 74.2%
Max accuracy: 74.21%
Epoch: [251] [ 0/625] eta: 4:15:49 lr: 0.000296 min_lr: 0.000296 loss: 2.1724 (2.1724) class_acc: 0.7188 (0.7188) weight_decay: 0.0500 (0.0500) time: 24.5591 data: 22.0529 max mem: 6925
Epoch: [251] [200/625] eta: 0:14:52 lr: 0.000292 min_lr: 0.000292 loss: 2.2572 (2.2299) class_acc: 0.6953 (0.7032) weight_decay: 0.0500 (0.0500) grad_norm: 1.3599 (1.4144) time: 1.9221 data: 0.0013 max mem: 6925
Epoch: [251] [400/625] eta: 0:07:34 lr: 0.000288 min_lr: 0.000288 loss: 2.2312 (2.2401) class_acc: 0.7070 (0.7024) weight_decay: 0.0500 (0.0500) grad_norm: 1.2197 (1.3914) time: 1.9432 data: 0.0011 max mem: 6925
Epoch: [251] [600/625] eta: 0:00:49 lr: 0.000284 min_lr: 0.000284 loss: 2.2315 (2.2415) class_acc: 0.6992 (0.7012) weight_decay: 0.0500 (0.0500) grad_norm: 1.3352 (1.3876) time: 1.9091 data: 0.0017 max mem: 6925
Epoch: [251] [624/625] eta: 0:00:01 lr: 0.000284 min_lr: 0.000284 loss: 2.2300 (2.2414) class_acc: 0.6992 (0.7012) weight_decay: 0.0500 (0.0500) grad_norm: 1.4961 (1.3915) time: 0.9610 data: 0.0014 max mem: 6925
Epoch: [251] Total time: 0:20:17 (1.9483 s / it)
Averaged stats: lr: 0.000284 min_lr: 0.000284 loss: 2.2300 (2.2380) class_acc: 0.6992 (0.7016) weight_decay: 0.0500 (0.0500) grad_norm: 1.4961 (1.3915)
Test: [ 0/50] eta: 0:10:44 loss: 0.9714 (0.9714) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 12.8928 data: 12.8621 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 0.9429 (1.0161) acc1: 78.4000 (78.6909) acc5: 92.8000 (92.0727) time: 2.2066 data: 2.1765 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.1671 (1.1653) acc1: 73.6000 (74.4381) acc5: 92.0000 (90.9333) time: 1.1941 data: 1.1647 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.2548 (1.2041) acc1: 71.2000 (73.3936) acc5: 90.4000 (90.8387) time: 1.1971 data: 1.1683 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.2069 (1.2015) acc1: 73.6000 (73.4049) acc5: 91.2000 (90.8683) time: 0.7557 data: 0.7258 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.2272 (1.2015) acc1: 73.6000 (73.3920) acc5: 90.4000 (90.8160) time: 0.6623 data: 0.6315 max mem: 6925
Test: Total time: 0:00:52 (1.0551 s / it)
* Acc@1 73.872 Acc@5 91.468 loss 1.162
Accuracy of the model on the 50000 test images: 73.9%
Max accuracy: 74.21%
Epoch: [252] [ 0/625] eta: 4:04:30 lr: 0.000284 min_lr: 0.000284 loss: 2.4296 (2.4296) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 23.4732 data: 18.4642 max mem: 6925
Epoch: [252] [200/625] eta: 0:14:23 lr: 0.000280 min_lr: 0.000280 loss: 2.1657 (2.2204) class_acc: 0.7109 (0.7075) weight_decay: 0.0500 (0.0500) grad_norm: 1.3301 (1.3310) time: 2.0118 data: 0.0012 max mem: 6925
Epoch: [252] [400/625] eta: 0:07:31 lr: 0.000277 min_lr: 0.000277 loss: 2.2556 (2.2287) class_acc: 0.7109 (0.7054) weight_decay: 0.0500 (0.0500) grad_norm: 1.2806 (1.3664) time: 1.9967 data: 0.0008 max mem: 6925
Epoch: [252] [600/625] eta: 0:00:49 lr: 0.000273 min_lr: 0.000273 loss: 2.2260 (2.2316) class_acc: 0.7109 (0.7050) weight_decay: 0.0500 (0.0500) grad_norm: 1.4972 (inf) time: 2.0313 data: 0.0007 max mem: 6925
Epoch: [252] [624/625] eta: 0:00:01 lr: 0.000273 min_lr: 0.000273 loss: 2.2316 (2.2305) class_acc: 0.7031 (0.7052) weight_decay: 0.0500 (0.0500) grad_norm: 1.5518 (inf) time: 0.7618 data: 0.0015 max mem: 6925
Epoch: [252] Total time: 0:20:17 (1.9480 s / it)
Averaged stats: lr: 0.000273 min_lr: 0.000273 loss: 2.2316 (2.2346) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) grad_norm: 1.5518 (inf)
Test: [ 0/50] eta: 0:10:38 loss: 1.0513 (1.0513) acc1: 75.2000 (75.2000) acc5: 90.4000 (90.4000) time: 12.7710 data: 12.7366 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 0.9722 (0.9966) acc1: 78.4000 (78.3273) acc5: 92.0000 (92.2182) time: 2.2095 data: 2.1772 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.1240 (1.1422) acc1: 76.0000 (74.4381) acc5: 91.2000 (91.2381) time: 1.2250 data: 1.1935 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.2643 (1.1803) acc1: 70.4000 (73.3419) acc5: 90.4000 (90.7613) time: 1.0926 data: 1.0628 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.2288 (1.1842) acc1: 70.4000 (73.3659) acc5: 90.4000 (90.7902) time: 0.6019 data: 0.5731 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1447 (1.1832) acc1: 72.0000 (73.2960) acc5: 91.2000 (90.7040) time: 0.4939 data: 0.4652 max mem: 6925
Test: Total time: 0:00:50 (1.0073 s / it)
* Acc@1 74.080 Acc@5 91.538 loss 1.147
Accuracy of the model on the 50000 test images: 74.1%
Max accuracy: 74.21%
Epoch: [253] [ 0/625] eta: 3:54:12 lr: 0.000273 min_lr: 0.000273 loss: 2.2087 (2.2087) class_acc: 0.7070 (0.7070) weight_decay: 0.0500 (0.0500) time: 22.4834 data: 19.2094 max mem: 6925
Epoch: [253] [200/625] eta: 0:14:12 lr: 0.000269 min_lr: 0.000269 loss: 2.2302 (2.2305) class_acc: 0.7070 (0.7054) weight_decay: 0.0500 (0.0500) grad_norm: 1.2627 (1.3851) time: 2.0974 data: 1.3532 max mem: 6925
Epoch: [253] [400/625] eta: 0:07:18 lr: 0.000265 min_lr: 0.000265 loss: 2.2064 (2.2314) class_acc: 0.7070 (0.7059) weight_decay: 0.0500 (0.0500) grad_norm: 1.4116 (1.4407) time: 1.7952 data: 0.0008 max mem: 6925
Epoch: [253] [600/625] eta: 0:00:48 lr: 0.000262 min_lr: 0.000262 loss: 2.2221 (2.2334) class_acc: 0.6992 (0.7048) weight_decay: 0.0500 (0.0500) grad_norm: 1.3167 (1.4270) time: 1.9938 data: 0.0134 max mem: 6925
Epoch: [253] [624/625] eta: 0:00:01 lr: 0.000261 min_lr: 0.000261 loss: 2.2193 (2.2333) class_acc: 0.7031 (0.7047) weight_decay: 0.0500 (0.0500) grad_norm: 1.3887 (1.4344) time: 1.3356 data: 0.0176 max mem: 6925
Epoch: [253] Total time: 0:19:48 (1.9020 s / it)
Averaged stats: lr: 0.000261 min_lr: 0.000261 loss: 2.2193 (2.2306) class_acc: 0.7031 (0.7040) weight_decay: 0.0500 (0.0500) grad_norm: 1.3887 (1.4344)
Test: [ 0/50] eta: 0:10:32 loss: 1.0772 (1.0772) acc1: 76.8000 (76.8000) acc5: 90.4000 (90.4000) time: 12.6448 data: 12.5977 max mem: 6925
Test: [10/50] eta: 0:01:25 loss: 1.0183 (1.0280) acc1: 77.6000 (78.0364) acc5: 92.0000 (92.2182) time: 2.1465 data: 2.1133 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.1051 (1.1590) acc1: 75.2000 (74.5905) acc5: 92.0000 (91.1619) time: 1.1569 data: 1.1266 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.2596 (1.1917) acc1: 68.8000 (73.3161) acc5: 90.4000 (90.8129) time: 1.1951 data: 1.1664 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1696 (1.1892) acc1: 71.2000 (73.5805) acc5: 90.4000 (90.8098) time: 0.8520 data: 0.8229 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1578 (1.1873) acc1: 74.4000 (73.5040) acc5: 90.4000 (90.6880) time: 0.8409 data: 0.8111 max mem: 6925
Test: Total time: 0:00:53 (1.0674 s / it)
* Acc@1 74.196 Acc@5 91.656 loss 1.139
Accuracy of the model on the 50000 test images: 74.2%
Max accuracy: 74.21%
Epoch: [254] [ 0/625] eta: 3:48:01 lr: 0.000261 min_lr: 0.000261 loss: 2.3459 (2.3459) class_acc: 0.6758 (0.6758) weight_decay: 0.0500 (0.0500) time: 21.8897 data: 21.6588 max mem: 6925
Epoch: [254] [200/625] eta: 0:14:28 lr: 0.000258 min_lr: 0.000258 loss: 2.1835 (2.2138) class_acc: 0.7188 (0.7087) weight_decay: 0.0500 (0.0500) grad_norm: 1.4053 (1.4166) time: 2.1402 data: 0.2836 max mem: 6925
Epoch: [254] [400/625] eta: 0:07:27 lr: 0.000254 min_lr: 0.000254 loss: 2.2169 (2.2222) class_acc: 0.7031 (0.7070) weight_decay: 0.0500 (0.0500) grad_norm: 1.3745 (1.4445) time: 1.9754 data: 0.0009 max mem: 6925
Epoch: [254] [600/625] eta: 0:00:49 lr: 0.000251 min_lr: 0.000251 loss: 2.2250 (2.2264) class_acc: 0.7031 (0.7060) weight_decay: 0.0500 (0.0500) grad_norm: 1.2997 (1.4297) time: 1.9665 data: 0.0019 max mem: 6925
Epoch: [254] [624/625] eta: 0:00:01 lr: 0.000251 min_lr: 0.000251 loss: 2.2085 (2.2263) class_acc: 0.6992 (0.7061) weight_decay: 0.0500 (0.0500) grad_norm: 1.3717 (1.4291) time: 0.9006 data: 0.0016 max mem: 6925
Epoch: [254] Total time: 0:20:14 (1.9437 s / it)
Averaged stats: lr: 0.000251 min_lr: 0.000251 loss: 2.2085 (2.2280) class_acc: 0.6992 (0.7048) weight_decay: 0.0500 (0.0500) grad_norm: 1.3717 (1.4291)
Test: [ 0/50] eta: 0:10:15 loss: 1.0770 (1.0770) acc1: 78.4000 (78.4000) acc5: 92.0000 (92.0000) time: 12.3148 data: 12.2836 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 1.0196 (1.0145) acc1: 77.6000 (78.5455) acc5: 92.0000 (91.9273) time: 2.0294 data: 1.9975 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.0815 (1.1446) acc1: 76.0000 (75.1238) acc5: 91.2000 (91.2000) time: 1.0525 data: 1.0217 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.2310 (1.1763) acc1: 71.2000 (74.1161) acc5: 90.4000 (91.1226) time: 1.0078 data: 0.9785 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1903 (1.1740) acc1: 71.2000 (73.9512) acc5: 91.2000 (91.1220) time: 0.6548 data: 0.6243 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1681 (1.1753) acc1: 72.8000 (73.7600) acc5: 92.0000 (91.0560) time: 0.5911 data: 0.5601 max mem: 6925
Test: Total time: 0:00:47 (0.9419 s / it)
* Acc@1 74.484 Acc@5 91.830 loss 1.133
Accuracy of the model on the 50000 test images: 74.5%
Max accuracy: 74.48%
Epoch: [255] [ 0/625] eta: 3:19:24 lr: 0.000250 min_lr: 0.000250 loss: 2.1484 (2.1484) class_acc: 0.7500 (0.7500) weight_decay: 0.0500 (0.0500) time: 19.1429 data: 18.6138 max mem: 6925
Epoch: [255] [200/625] eta: 0:13:59 lr: 0.000247 min_lr: 0.000247 loss: 2.1658 (2.2219) class_acc: 0.7148 (0.7062) weight_decay: 0.0500 (0.0500) grad_norm: 1.2764 (1.3973) time: 1.9138 data: 1.4355 max mem: 6925
Epoch: [255] [400/625] eta: 0:07:09 lr: 0.000244 min_lr: 0.000244 loss: 2.2340 (2.2217) class_acc: 0.6992 (0.7052) weight_decay: 0.0500 (0.0500) grad_norm: 1.3705 (1.4126) time: 1.7797 data: 1.4504 max mem: 6925
Epoch: [255] [600/625] eta: 0:00:47 lr: 0.000240 min_lr: 0.000240 loss: 2.2477 (2.2260) class_acc: 0.6992 (0.7053) weight_decay: 0.0500 (0.0500) grad_norm: 1.3788 (1.4204) time: 1.8517 data: 1.5210 max mem: 6925
Epoch: [255] [624/625] eta: 0:00:01 lr: 0.000240 min_lr: 0.000240 loss: 2.1553 (2.2245) class_acc: 0.7109 (0.7056) weight_decay: 0.0500 (0.0500) grad_norm: 1.4310 (1.4208) time: 0.7690 data: 0.4726 max mem: 6925
Epoch: [255] Total time: 0:19:40 (1.8890 s / it)
Averaged stats: lr: 0.000240 min_lr: 0.000240 loss: 2.1553 (2.2245) class_acc: 0.7109 (0.7056) weight_decay: 0.0500 (0.0500) grad_norm: 1.4310 (1.4208)
Test: [ 0/50] eta: 0:09:38 loss: 1.0630 (1.0630) acc1: 74.4000 (74.4000) acc5: 92.0000 (92.0000) time: 11.5781 data: 11.5357 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 0.9665 (1.0186) acc1: 77.6000 (78.3273) acc5: 92.8000 (92.5091) time: 1.9539 data: 1.9237 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.1123 (1.1516) acc1: 75.2000 (74.5524) acc5: 92.8000 (91.6191) time: 0.9996 data: 0.9706 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.2155 (1.1858) acc1: 69.6000 (73.2387) acc5: 90.4000 (91.1484) time: 0.8699 data: 0.8405 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.2034 (1.1874) acc1: 68.8000 (73.0927) acc5: 92.0000 (91.1415) time: 0.5535 data: 0.5241 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1740 (1.1886) acc1: 73.6000 (73.1040) acc5: 91.2000 (91.0720) time: 0.5044 data: 0.4753 max mem: 6925
Test: Total time: 0:00:44 (0.8966 s / it)
* Acc@1 74.382 Acc@5 91.782 loss 1.146
Accuracy of the model on the 50000 test images: 74.4%
Max accuracy: 74.48%
Epoch: [256] [ 0/625] eta: 3:34:14 lr: 0.000240 min_lr: 0.000240 loss: 2.2284 (2.2284) class_acc: 0.6992 (0.6992) weight_decay: 0.0500 (0.0500) time: 20.5666 data: 17.8654 max mem: 6925
Epoch: [256] [200/625] eta: 0:13:24 lr: 0.000236 min_lr: 0.000236 loss: 2.1966 (2.2098) class_acc: 0.7148 (0.7108) weight_decay: 0.0500 (0.0500) grad_norm: 1.3619 (1.4180) time: 1.7705 data: 0.0009 max mem: 6925
Epoch: [256] [400/625] eta: 0:07:02 lr: 0.000233 min_lr: 0.000233 loss: 2.2419 (2.2154) class_acc: 0.7070 (0.7102) weight_decay: 0.0500 (0.0500) grad_norm: 1.4727 (1.4797) time: 1.9159 data: 0.0007 max mem: 6925
Epoch: [256] [600/625] eta: 0:00:47 lr: 0.000230 min_lr: 0.000230 loss: 2.2387 (2.2210) class_acc: 0.6875 (0.7080) weight_decay: 0.0500 (0.0500) grad_norm: 1.2895 (1.4747) time: 1.7751 data: 0.0008 max mem: 6925
Epoch: [256] [624/625] eta: 0:00:01 lr: 0.000229 min_lr: 0.000229 loss: 2.2495 (2.2218) class_acc: 0.7070 (0.7077) weight_decay: 0.0500 (0.0500) grad_norm: 1.3189 (1.4686) time: 0.6415 data: 0.0020 max mem: 6925
Epoch: [256] Total time: 0:19:27 (1.8675 s / it)
Averaged stats: lr: 0.000229 min_lr: 0.000229 loss: 2.2495 (2.2219) class_acc: 0.7070 (0.7064) weight_decay: 0.0500 (0.0500) grad_norm: 1.3189 (1.4686)
Test: [ 0/50] eta: 0:10:57 loss: 1.0062 (1.0062) acc1: 79.2000 (79.2000) acc5: 92.0000 (92.0000) time: 13.1570 data: 13.1264 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 1.0062 (1.0172) acc1: 79.2000 (78.6182) acc5: 92.0000 (91.9273) time: 2.0959 data: 2.0663 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.1347 (1.1574) acc1: 73.6000 (74.7810) acc5: 91.2000 (91.1238) time: 1.0602 data: 1.0310 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.2841 (1.1860) acc1: 70.4000 (73.9355) acc5: 90.4000 (90.9419) time: 1.0930 data: 1.0636 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1791 (1.1809) acc1: 72.8000 (74.0488) acc5: 90.4000 (90.9463) time: 0.8746 data: 0.8451 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1375 (1.1857) acc1: 72.8000 (73.7120) acc5: 91.2000 (90.9120) time: 0.8244 data: 0.7951 max mem: 6925
Test: Total time: 0:00:52 (1.0579 s / it)
* Acc@1 74.314 Acc@5 91.816 loss 1.144
Accuracy of the model on the 50000 test images: 74.3%
Max accuracy: 74.48%
Epoch: [257] [ 0/625] eta: 4:01:52 lr: 0.000229 min_lr: 0.000229 loss: 2.1972 (2.1972) class_acc: 0.7305 (0.7305) weight_decay: 0.0500 (0.0500) time: 23.2204 data: 16.5471 max mem: 6925
Epoch: [257] [200/625] eta: 0:14:10 lr: 0.000226 min_lr: 0.000226 loss: 2.2227 (2.2155) class_acc: 0.6992 (0.7065) weight_decay: 0.0500 (0.0500) grad_norm: 1.4516 (1.4005) time: 1.7161 data: 0.0008 max mem: 6925
Epoch: [257] [400/625] eta: 0:07:26 lr: 0.000223 min_lr: 0.000223 loss: 2.2440 (2.2171) class_acc: 0.6992 (0.7057) weight_decay: 0.0500 (0.0500) grad_norm: 1.4354 (1.4362) time: 1.9528 data: 0.0008 max mem: 6925
Epoch: [257] [600/625] eta: 0:00:49 lr: 0.000219 min_lr: 0.000219 loss: 2.2298 (2.2180) class_acc: 0.7109 (0.7068) weight_decay: 0.0500 (0.0500) grad_norm: 1.3562 (1.4416) time: 2.0137 data: 0.0008 max mem: 6925
Epoch: [257] [624/625] eta: 0:00:01 lr: 0.000219 min_lr: 0.000219 loss: 2.2059 (2.2184) class_acc: 0.7070 (0.7068) weight_decay: 0.0500 (0.0500) grad_norm: 1.3678 (1.4396) time: 0.9161 data: 0.0016 max mem: 6925
Epoch: [257] Total time: 0:20:15 (1.9452 s / it)
Averaged stats: lr: 0.000219 min_lr: 0.000219 loss: 2.2059 (2.2145) class_acc: 0.7070 (0.7079) weight_decay: 0.0500 (0.0500) grad_norm: 1.3678 (1.4396)
Test: [ 0/50] eta: 0:10:29 loss: 1.0158 (1.0158) acc1: 78.4000 (78.4000) acc5: 92.0000 (92.0000) time: 12.5898 data: 12.5474 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 0.9538 (1.0064) acc1: 79.2000 (78.4727) acc5: 92.0000 (92.4364) time: 2.0832 data: 2.0530 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.0864 (1.1394) acc1: 75.2000 (74.6286) acc5: 91.2000 (91.4286) time: 1.0740 data: 1.0440 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.1922 (1.1652) acc1: 71.2000 (73.7290) acc5: 91.2000 (91.2516) time: 1.1077 data: 1.0778 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1773 (1.1650) acc1: 72.0000 (73.9902) acc5: 91.2000 (91.1805) time: 0.8762 data: 0.8472 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1709 (1.1665) acc1: 72.8000 (73.8720) acc5: 91.2000 (91.1040) time: 0.7498 data: 0.7191 max mem: 6925
Test: Total time: 0:00:52 (1.0513 s / it)
* Acc@1 74.566 Acc@5 91.820 loss 1.129
Accuracy of the model on the 50000 test images: 74.6%
Max accuracy: 74.57%
Epoch: [258] [ 0/625] eta: 4:07:46 lr: 0.000219 min_lr: 0.000219 loss: 2.1696 (2.1696) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 23.7856 data: 19.9997 max mem: 6925
Epoch: [258] [200/625] eta: 0:14:13 lr: 0.000216 min_lr: 0.000216 loss: 2.2285 (2.2039) class_acc: 0.6992 (0.7112) weight_decay: 0.0500 (0.0500) grad_norm: 1.3899 (1.4505) time: 1.9447 data: 0.0576 max mem: 6925
Epoch: [258] [400/625] eta: 0:07:16 lr: 0.000212 min_lr: 0.000212 loss: 2.1983 (2.2060) class_acc: 0.7109 (0.7105) weight_decay: 0.0500 (0.0500) grad_norm: 1.4695 (1.4844) time: 1.8366 data: 0.0197 max mem: 6925
Epoch: [258] [600/625] eta: 0:00:48 lr: 0.000209 min_lr: 0.000209 loss: 2.1754 (2.2078) class_acc: 0.7188 (0.7097) weight_decay: 0.0500 (0.0500) grad_norm: 1.4178 (1.5007) time: 1.8967 data: 0.2396 max mem: 6925
Epoch: [258] [624/625] eta: 0:00:01 lr: 0.000209 min_lr: 0.000209 loss: 2.2043 (2.2088) class_acc: 0.7070 (0.7096) weight_decay: 0.0500 (0.0500) grad_norm: 1.3962 (1.4961) time: 0.7712 data: 0.0792 max mem: 6925
Epoch: [258] Total time: 0:19:56 (1.9137 s / it)
Averaged stats: lr: 0.000209 min_lr: 0.000209 loss: 2.2043 (2.2153) class_acc: 0.7070 (0.7079) weight_decay: 0.0500 (0.0500) grad_norm: 1.3962 (1.4961)
Test: [ 0/50] eta: 0:10:08 loss: 0.9949 (0.9949) acc1: 75.2000 (75.2000) acc5: 92.0000 (92.0000) time: 12.1633 data: 12.1308 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 0.9382 (0.9772) acc1: 77.6000 (78.3273) acc5: 92.8000 (92.8727) time: 1.9802 data: 1.9493 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.0737 (1.1136) acc1: 76.0000 (74.5143) acc5: 92.8000 (92.0762) time: 1.0087 data: 0.9785 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.2336 (1.1467) acc1: 71.2000 (74.0129) acc5: 91.2000 (91.8710) time: 0.9025 data: 0.8732 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.1415 (1.1506) acc1: 72.0000 (73.9512) acc5: 92.0000 (91.6293) time: 0.5496 data: 0.5208 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1272 (1.1511) acc1: 72.0000 (73.7600) acc5: 92.0000 (91.4560) time: 0.4958 data: 0.4668 max mem: 6925
Test: Total time: 0:00:44 (0.8956 s / it)
* Acc@1 74.664 Acc@5 91.866 loss 1.114
Accuracy of the model on the 50000 test images: 74.7%
Max accuracy: 74.66%
Epoch: [259] [ 0/625] eta: 3:23:26 lr: 0.000209 min_lr: 0.000209 loss: 2.2169 (2.2169) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 19.5309 data: 18.9184 max mem: 6925
Epoch: [259] [200/625] eta: 0:13:41 lr: 0.000206 min_lr: 0.000206 loss: 2.1765 (2.1980) class_acc: 0.7070 (0.7114) weight_decay: 0.0500 (0.0500) grad_norm: 1.4179 (1.3742) time: 1.8220 data: 0.0040 max mem: 6925
Epoch: [259] [400/625] eta: 0:07:13 lr: 0.000203 min_lr: 0.000203 loss: 2.2056 (2.2086) class_acc: 0.7109 (0.7097) weight_decay: 0.0500 (0.0500) grad_norm: 1.3628 (1.4071) time: 1.8060 data: 0.0007 max mem: 6925
Epoch: [259] [600/625] eta: 0:00:48 lr: 0.000199 min_lr: 0.000199 loss: 2.1936 (2.2099) class_acc: 0.7188 (0.7098) weight_decay: 0.0500 (0.0500) grad_norm: 1.4094 (1.4202) time: 2.0922 data: 0.0007 max mem: 6925
Epoch: [259] [624/625] eta: 0:00:01 lr: 0.000199 min_lr: 0.000199 loss: 2.1894 (2.2102) class_acc: 0.7031 (0.7097) weight_decay: 0.0500 (0.0500) grad_norm: 1.3530 (1.4207) time: 0.7201 data: 0.0017 max mem: 6925
Epoch: [259] Total time: 0:19:59 (1.9192 s / it)
Averaged stats: lr: 0.000199 min_lr: 0.000199 loss: 2.1894 (2.2117) class_acc: 0.7031 (0.7089) weight_decay: 0.0500 (0.0500) grad_norm: 1.3530 (1.4207)
Test: [ 0/50] eta: 0:10:43 loss: 0.9892 (0.9892) acc1: 79.2000 (79.2000) acc5: 91.2000 (91.2000) time: 12.8682 data: 12.8257 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 0.9892 (0.9991) acc1: 79.2000 (78.7636) acc5: 92.0000 (92.4364) time: 2.1600 data: 2.1299 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.1518 (1.1321) acc1: 72.8000 (74.5143) acc5: 92.0000 (91.3905) time: 1.1541 data: 1.1252 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.2643 (1.1587) acc1: 70.4000 (73.9871) acc5: 90.4000 (91.4323) time: 1.1941 data: 1.1654 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1450 (1.1612) acc1: 73.6000 (73.9707) acc5: 91.2000 (91.3171) time: 1.0333 data: 1.0034 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1167 (1.1608) acc1: 74.4000 (73.9360) acc5: 91.2000 (91.2480) time: 0.9145 data: 0.8824 max mem: 6925
Test: Total time: 0:00:57 (1.1553 s / it)
* Acc@1 74.616 Acc@5 91.898 loss 1.123
Accuracy of the model on the 50000 test images: 74.6%
Max accuracy: 74.66%
Epoch: [260] [ 0/625] eta: 3:19:42 lr: 0.000199 min_lr: 0.000199 loss: 2.3543 (2.3543) class_acc: 0.6680 (0.6680) weight_decay: 0.0500 (0.0500) time: 19.1718 data: 15.9224 max mem: 6925
Epoch: [260] [200/625] eta: 0:14:20 lr: 0.000196 min_lr: 0.000196 loss: 2.2293 (2.1982) class_acc: 0.7148 (0.7116) weight_decay: 0.0500 (0.0500) grad_norm: 1.6200 (inf) time: 1.8584 data: 0.0008 max mem: 6925
Epoch: [260] [400/625] eta: 0:07:21 lr: 0.000193 min_lr: 0.000193 loss: 2.2336 (2.2050) class_acc: 0.6992 (0.7105) weight_decay: 0.0500 (0.0500) grad_norm: 1.3970 (inf) time: 1.9950 data: 0.0207 max mem: 6925
Epoch: [260] [600/625] eta: 0:00:48 lr: 0.000190 min_lr: 0.000190 loss: 2.1938 (2.2090) class_acc: 0.7070 (0.7092) weight_decay: 0.0500 (0.0500) grad_norm: 1.3845 (inf) time: 1.8187 data: 0.0009 max mem: 6925
Epoch: [260] [624/625] eta: 0:00:01 lr: 0.000189 min_lr: 0.000189 loss: 2.1931 (2.2088) class_acc: 0.7109 (0.7092) weight_decay: 0.0500 (0.0500) grad_norm: 1.4120 (inf) time: 0.8894 data: 0.0156 max mem: 6925
Epoch: [260] Total time: 0:19:45 (1.8965 s / it)
Averaged stats: lr: 0.000189 min_lr: 0.000189 loss: 2.1931 (2.2092) class_acc: 0.7109 (0.7096) weight_decay: 0.0500 (0.0500) grad_norm: 1.4120 (inf)
Test: [ 0/50] eta: 0:09:49 loss: 1.0309 (1.0309) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 11.7873 data: 11.7526 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 0.9940 (1.0056) acc1: 78.4000 (78.6909) acc5: 92.8000 (92.8000) time: 2.0392 data: 2.0090 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.1688 (1.1443) acc1: 75.2000 (74.7429) acc5: 91.2000 (91.5048) time: 1.1175 data: 1.0881 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.2935 (1.1698) acc1: 71.2000 (73.9871) acc5: 90.4000 (91.5097) time: 1.0277 data: 0.9990 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1556 (1.1723) acc1: 71.2000 (73.8732) acc5: 91.2000 (91.3366) time: 0.6076 data: 0.5768 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1556 (1.1723) acc1: 72.0000 (73.6640) acc5: 91.2000 (91.2320) time: 0.5529 data: 0.5222 max mem: 6925
Test: Total time: 0:00:47 (0.9462 s / it)
* Acc@1 74.594 Acc@5 91.916 loss 1.136
Accuracy of the model on the 50000 test images: 74.6%
Max accuracy: 74.66%
Epoch: [261] [ 0/625] eta: 4:42:50 lr: 0.000189 min_lr: 0.000189 loss: 2.2137 (2.2137) class_acc: 0.7266 (0.7266) weight_decay: 0.0500 (0.0500) time: 27.1525 data: 18.7620 max mem: 6925
Epoch: [261] [200/625] eta: 0:14:29 lr: 0.000186 min_lr: 0.000186 loss: 2.1640 (2.2026) class_acc: 0.7070 (0.7101) weight_decay: 0.0500 (0.0500) grad_norm: 1.3460 (1.4363) time: 1.7911 data: 0.0011 max mem: 6925
Epoch: [261] [400/625] eta: 0:07:26 lr: 0.000183 min_lr: 0.000183 loss: 2.1873 (2.2018) class_acc: 0.7070 (0.7105) weight_decay: 0.0500 (0.0500) grad_norm: 1.2765 (1.4397) time: 1.9152 data: 0.0016 max mem: 6925
Epoch: [261] [600/625] eta: 0:00:49 lr: 0.000180 min_lr: 0.000180 loss: 2.2157 (2.2056) class_acc: 0.7031 (0.7097) weight_decay: 0.0500 (0.0500) grad_norm: 1.3381 (1.4660) time: 1.9430 data: 0.0010 max mem: 6925
Epoch: [261] [624/625] eta: 0:00:01 lr: 0.000180 min_lr: 0.000180 loss: 2.1759 (2.2056) class_acc: 0.7188 (0.7100) weight_decay: 0.0500 (0.0500) grad_norm: 1.3967 (1.4649) time: 0.9243 data: 0.0028 max mem: 6925
Epoch: [261] Total time: 0:20:12 (1.9403 s / it)
Averaged stats: lr: 0.000180 min_lr: 0.000180 loss: 2.1759 (2.2097) class_acc: 0.7188 (0.7100) weight_decay: 0.0500 (0.0500) grad_norm: 1.3967 (1.4649)
Test: [ 0/50] eta: 0:10:28 loss: 0.9689 (0.9689) acc1: 81.6000 (81.6000) acc5: 92.0000 (92.0000) time: 12.5719 data: 12.5341 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 0.8981 (0.9808) acc1: 80.0000 (79.2000) acc5: 92.8000 (92.3636) time: 2.1046 data: 2.0733 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.0837 (1.1113) acc1: 76.0000 (75.2000) acc5: 92.0000 (91.4286) time: 1.0892 data: 1.0596 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.2170 (1.1372) acc1: 71.2000 (74.3226) acc5: 91.2000 (91.3807) time: 0.9816 data: 0.9531 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1298 (1.1416) acc1: 72.0000 (74.6146) acc5: 91.2000 (91.2781) time: 0.6249 data: 0.5959 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1151 (1.1428) acc1: 73.6000 (74.4000) acc5: 92.0000 (91.2160) time: 0.6462 data: 0.6160 max mem: 6925
Test: Total time: 0:00:48 (0.9781 s / it)
* Acc@1 74.808 Acc@5 92.004 loss 1.105
Accuracy of the model on the 50000 test images: 74.8%
Max accuracy: 74.81%
Epoch: [262] [ 0/625] eta: 3:24:56 lr: 0.000180 min_lr: 0.000180 loss: 2.2050 (2.2050) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 19.6748 data: 19.4500 max mem: 6925
Epoch: [262] [200/625] eta: 0:14:03 lr: 0.000177 min_lr: 0.000177 loss: 2.2099 (2.2049) class_acc: 0.7109 (0.7123) weight_decay: 0.0500 (0.0500) grad_norm: 1.3289 (1.4284) time: 2.0340 data: 0.0007 max mem: 6925
Epoch: [262] [400/625] eta: 0:07:14 lr: 0.000174 min_lr: 0.000174 loss: 2.2045 (2.2056) class_acc: 0.7148 (0.7119) weight_decay: 0.0500 (0.0500) grad_norm: 1.4909 (1.4737) time: 1.8621 data: 0.0007 max mem: 6925
Epoch: [262] [600/625] eta: 0:00:48 lr: 0.000171 min_lr: 0.000171 loss: 2.2133 (2.2038) class_acc: 0.6992 (0.7115) weight_decay: 0.0500 (0.0500) grad_norm: 1.5643 (1.4751) time: 2.0324 data: 0.0011 max mem: 6925
Epoch: [262] [624/625] eta: 0:00:01 lr: 0.000171 min_lr: 0.000171 loss: 2.2031 (2.2046) class_acc: 0.7070 (0.7111) weight_decay: 0.0500 (0.0500) grad_norm: 1.5643 (1.4819) time: 0.4716 data: 0.0025 max mem: 6925
Epoch: [262] Total time: 0:19:58 (1.9179 s / it)
Averaged stats: lr: 0.000171 min_lr: 0.000171 loss: 2.2031 (2.2029) class_acc: 0.7070 (0.7113) weight_decay: 0.0500 (0.0500) grad_norm: 1.5643 (1.4819)
Test: [ 0/50] eta: 0:10:18 loss: 0.9168 (0.9168) acc1: 79.2000 (79.2000) acc5: 94.4000 (94.4000) time: 12.3605 data: 12.3270 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 0.9168 (0.9550) acc1: 79.2000 (78.8364) acc5: 92.8000 (92.5091) time: 1.9257 data: 1.8930 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.0929 (1.0942) acc1: 76.0000 (75.0857) acc5: 92.0000 (91.4667) time: 0.9114 data: 0.8793 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.1743 (1.1293) acc1: 71.2000 (74.2968) acc5: 90.4000 (91.4065) time: 1.0275 data: 0.9969 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1258 (1.1324) acc1: 72.0000 (74.2829) acc5: 91.2000 (91.2976) time: 0.9802 data: 0.9509 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1033 (1.1322) acc1: 72.8000 (74.1280) acc5: 91.2000 (91.2160) time: 0.5922 data: 0.5628 max mem: 6925
Test: Total time: 0:00:53 (1.0745 s / it)
* Acc@1 74.862 Acc@5 92.002 loss 1.092
Accuracy of the model on the 50000 test images: 74.9%
Max accuracy: 74.86%
Epoch: [263] [ 0/625] eta: 3:44:08 lr: 0.000171 min_lr: 0.000171 loss: 2.1825 (2.1825) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 21.5172 data: 18.8522 max mem: 6925
Epoch: [263] [200/625] eta: 0:13:44 lr: 0.000168 min_lr: 0.000168 loss: 2.1763 (2.2071) class_acc: 0.7031 (0.7087) weight_decay: 0.0500 (0.0500) grad_norm: 1.3600 (1.4536) time: 1.8894 data: 0.0013 max mem: 6925
Epoch: [263] [400/625] eta: 0:07:10 lr: 0.000165 min_lr: 0.000165 loss: 2.1903 (2.2075) class_acc: 0.7070 (0.7092) weight_decay: 0.0500 (0.0500) grad_norm: 1.4664 (1.4833) time: 1.7492 data: 0.3427 max mem: 6925
Epoch: [263] [600/625] eta: 0:00:48 lr: 0.000162 min_lr: 0.000162 loss: 2.1855 (2.2055) class_acc: 0.7109 (0.7099) weight_decay: 0.0500 (0.0500) grad_norm: 1.3449 (1.4877) time: 2.0380 data: 0.0012 max mem: 6925
Epoch: [263] [624/625] eta: 0:00:01 lr: 0.000162 min_lr: 0.000162 loss: 2.2220 (2.2063) class_acc: 0.7109 (0.7097) weight_decay: 0.0500 (0.0500) grad_norm: 1.3332 (1.4819) time: 0.5823 data: 0.0031 max mem: 6925
Epoch: [263] Total time: 0:19:51 (1.9063 s / it)
Averaged stats: lr: 0.000162 min_lr: 0.000162 loss: 2.2220 (2.2021) class_acc: 0.7109 (0.7117) weight_decay: 0.0500 (0.0500) grad_norm: 1.3332 (1.4819)
Test: [ 0/50] eta: 0:09:24 loss: 0.9758 (0.9758) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0000) time: 11.2893 data: 11.2511 max mem: 6925
Test: [10/50] eta: 0:01:16 loss: 0.9250 (0.9753) acc1: 78.4000 (79.2000) acc5: 92.8000 (92.8000) time: 1.9059 data: 1.8757 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.0440 (1.1054) acc1: 76.0000 (75.5810) acc5: 92.0000 (91.6571) time: 0.8633 data: 0.8341 max mem: 6925
Test: [30/50] eta: 0:00:22 loss: 1.1995 (1.1385) acc1: 72.0000 (74.3484) acc5: 91.2000 (91.6129) time: 0.7158 data: 0.6870 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1533 (1.1425) acc1: 72.8000 (74.1463) acc5: 91.2000 (91.4732) time: 0.8410 data: 0.8117 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1161 (1.1434) acc1: 72.8000 (73.9520) acc5: 91.2000 (91.4080) time: 0.7406 data: 0.7108 max mem: 6925
Test: Total time: 0:00:50 (1.0028 s / it)
* Acc@1 74.910 Acc@5 92.076 loss 1.103
Accuracy of the model on the 50000 test images: 74.9%
Max accuracy: 74.91%
Epoch: [264] [ 0/625] eta: 3:19:15 lr: 0.000162 min_lr: 0.000162 loss: 2.1785 (2.1785) class_acc: 0.7266 (0.7266) weight_decay: 0.0500 (0.0500) time: 19.1290 data: 16.8475 max mem: 6925
Epoch: [264] [200/625] eta: 0:14:03 lr: 0.000159 min_lr: 0.000159 loss: 2.2305 (2.2016) class_acc: 0.7070 (0.7126) weight_decay: 0.0500 (0.0500) grad_norm: 1.3238 (1.4190) time: 1.9495 data: 0.1748 max mem: 6925
Epoch: [264] [400/625] eta: 0:07:20 lr: 0.000156 min_lr: 0.000156 loss: 2.1765 (2.1991) class_acc: 0.7148 (0.7135) weight_decay: 0.0500 (0.0500) grad_norm: 1.3533 (1.4194) time: 1.8605 data: 0.0008 max mem: 6925
Epoch: [264] [600/625] eta: 0:00:48 lr: 0.000154 min_lr: 0.000154 loss: 2.1511 (2.2010) class_acc: 0.7305 (0.7130) weight_decay: 0.0500 (0.0500) grad_norm: 1.2886 (1.4226) time: 2.0072 data: 0.0008 max mem: 6925
Epoch: [264] [624/625] eta: 0:00:01 lr: 0.000153 min_lr: 0.000153 loss: 2.1849 (2.2012) class_acc: 0.7070 (0.7129) weight_decay: 0.0500 (0.0500) grad_norm: 1.3683 (1.4276) time: 0.8645 data: 0.0021 max mem: 6925
Epoch: [264] Total time: 0:19:51 (1.9066 s / it)
Averaged stats: lr: 0.000153 min_lr: 0.000153 loss: 2.1849 (2.2000) class_acc: 0.7070 (0.7126) weight_decay: 0.0500 (0.0500) grad_norm: 1.3683 (1.4276)
Test: [ 0/50] eta: 0:09:18 loss: 0.9693 (0.9693) acc1: 77.6000 (77.6000) acc5: 92.0000 (92.0000) time: 11.1661 data: 11.1301 max mem: 6925
Test: [10/50] eta: 0:01:15 loss: 0.9317 (0.9642) acc1: 79.2000 (78.9818) acc5: 93.6000 (93.1636) time: 1.8891 data: 1.8597 max mem: 6925
Test: [20/50] eta: 0:00:40 loss: 1.0377 (1.0949) acc1: 74.4000 (75.3524) acc5: 92.0000 (91.9619) time: 0.8483 data: 0.8196 max mem: 6925
Test: [30/50] eta: 0:00:22 loss: 1.1464 (1.1286) acc1: 72.0000 (74.6839) acc5: 90.4000 (91.7161) time: 0.7002 data: 0.6708 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.1314 (1.1380) acc1: 71.2000 (74.4000) acc5: 91.2000 (91.5512) time: 0.7626 data: 0.7333 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1247 (1.1398) acc1: 73.6000 (74.3200) acc5: 92.0000 (91.5040) time: 0.5943 data: 0.5657 max mem: 6925
Test: Total time: 0:00:46 (0.9358 s / it)
* Acc@1 75.064 Acc@5 92.090 loss 1.101
Accuracy of the model on the 50000 test images: 75.1%
Max accuracy: 75.06%
Epoch: [265] [ 0/625] eta: 3:16:40 lr: 0.000153 min_lr: 0.000153 loss: 2.2857 (2.2857) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 18.8800 data: 16.9545 max mem: 6925
Epoch: [265] [200/625] eta: 0:14:22 lr: 0.000150 min_lr: 0.000150 loss: 2.2036 (2.2053) class_acc: 0.7070 (0.7116) weight_decay: 0.0500 (0.0500) grad_norm: 1.3784 (1.4854) time: 2.0146 data: 0.1951 max mem: 6925
Epoch: [265] [400/625] eta: 0:07:26 lr: 0.000148 min_lr: 0.000148 loss: 2.1841 (2.2004) class_acc: 0.7109 (0.7138) weight_decay: 0.0500 (0.0500) grad_norm: 1.4788 (1.4762) time: 1.7911 data: 0.0012 max mem: 6925
Epoch: [265] [600/625] eta: 0:00:48 lr: 0.000145 min_lr: 0.000145 loss: 2.1624 (2.2002) class_acc: 0.7188 (0.7131) weight_decay: 0.0500 (0.0500) grad_norm: 1.3602 (1.4906) time: 2.0363 data: 0.0620 max mem: 6925
Epoch: [265] [624/625] eta: 0:00:01 lr: 0.000145 min_lr: 0.000145 loss: 2.2165 (2.2009) class_acc: 0.7031 (0.7130) weight_decay: 0.0500 (0.0500) grad_norm: 1.5362 (1.5011) time: 1.1662 data: 0.0014 max mem: 6925
Epoch: [265] Total time: 0:19:59 (1.9187 s / it)
Averaged stats: lr: 0.000145 min_lr: 0.000145 loss: 2.2165 (2.1982) class_acc: 0.7031 (0.7128) weight_decay: 0.0500 (0.0500) grad_norm: 1.5362 (1.5011)
Test: [ 0/50] eta: 0:09:46 loss: 0.9929 (0.9929) acc1: 74.4000 (74.4000) acc5: 93.6000 (93.6000) time: 11.7338 data: 11.6926 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 0.9272 (0.9712) acc1: 78.4000 (78.9818) acc5: 93.6000 (93.0909) time: 2.0563 data: 2.0224 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.0992 (1.1094) acc1: 76.8000 (75.3905) acc5: 91.2000 (91.5429) time: 1.1905 data: 1.1592 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.2207 (1.1498) acc1: 71.2000 (74.1419) acc5: 89.6000 (91.4065) time: 1.2120 data: 1.1831 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1652 (1.1545) acc1: 72.0000 (74.0488) acc5: 91.2000 (91.3171) time: 0.8527 data: 0.8237 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1177 (1.1555) acc1: 72.8000 (73.8240) acc5: 92.0000 (91.2960) time: 0.7240 data: 0.6950 max mem: 6925
Test: Total time: 0:00:53 (1.0708 s / it)
* Acc@1 74.870 Acc@5 91.966 loss 1.115
Accuracy of the model on the 50000 test images: 74.9%
Max accuracy: 75.06%
Epoch: [266] [ 0/625] eta: 3:39:58 lr: 0.000145 min_lr: 0.000145 loss: 2.2109 (2.2109) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 21.1180 data: 18.9477 max mem: 6925
Epoch: [266] [200/625] eta: 0:14:18 lr: 0.000142 min_lr: 0.000142 loss: 2.2029 (2.1864) class_acc: 0.7148 (0.7163) weight_decay: 0.0500 (0.0500) grad_norm: 1.4429 (1.5560) time: 1.9697 data: 0.0012 max mem: 6925
Epoch: [266] [400/625] eta: 0:07:21 lr: 0.000139 min_lr: 0.000139 loss: 2.2471 (2.1936) class_acc: 0.7109 (0.7143) weight_decay: 0.0500 (0.0500) grad_norm: 1.4831 (1.5195) time: 1.8995 data: 0.0008 max mem: 6925
Epoch: [266] [600/625] eta: 0:00:48 lr: 0.000137 min_lr: 0.000137 loss: 2.1943 (2.1949) class_acc: 0.7148 (0.7137) weight_decay: 0.0500 (0.0500) grad_norm: 1.3101 (1.4800) time: 1.9917 data: 0.0009 max mem: 6925
Epoch: [266] [624/625] eta: 0:00:01 lr: 0.000137 min_lr: 0.000137 loss: 2.1955 (2.1948) class_acc: 0.7148 (0.7137) weight_decay: 0.0500 (0.0500) grad_norm: 1.3358 (1.4751) time: 0.8932 data: 0.0017 max mem: 6925
Epoch: [266] Total time: 0:19:54 (1.9113 s / it)
Averaged stats: lr: 0.000137 min_lr: 0.000137 loss: 2.1955 (2.1956) class_acc: 0.7148 (0.7133) weight_decay: 0.0500 (0.0500) grad_norm: 1.3358 (1.4751)
Test: [ 0/50] eta: 0:10:04 loss: 1.0173 (1.0173) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0000) time: 12.0997 data: 12.0491 max mem: 6925
Test: [10/50] eta: 0:01:31 loss: 0.9205 (0.9664) acc1: 79.2000 (78.9818) acc5: 92.8000 (92.8727) time: 2.2895 data: 2.2551 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.0756 (1.0975) acc1: 76.0000 (75.2381) acc5: 92.0000 (91.8857) time: 1.2708 data: 1.2394 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.1479 (1.1291) acc1: 72.0000 (74.1419) acc5: 90.4000 (91.8194) time: 0.9920 data: 0.9618 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1208 (1.1306) acc1: 72.0000 (74.1073) acc5: 92.0000 (91.7463) time: 0.5527 data: 0.5230 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1208 (1.1342) acc1: 73.6000 (73.9360) acc5: 92.0000 (91.6000) time: 0.5101 data: 0.4808 max mem: 6925
Test: Total time: 0:00:49 (0.9911 s / it)
* Acc@1 75.058 Acc@5 92.138 loss 1.097
Accuracy of the model on the 50000 test images: 75.1%
Max accuracy: 75.06%
Epoch: [267] [ 0/625] eta: 3:39:38 lr: 0.000136 min_lr: 0.000136 loss: 2.0794 (2.0794) class_acc: 0.7266 (0.7266) weight_decay: 0.0500 (0.0500) time: 21.0858 data: 16.6609 max mem: 6925
Epoch: [267] [200/625] eta: 0:14:12 lr: 0.000134 min_lr: 0.000134 loss: 2.1396 (2.1834) class_acc: 0.7109 (0.7161) weight_decay: 0.0500 (0.0500) grad_norm: 1.4042 (1.4103) time: 1.9805 data: 0.0641 max mem: 6925
Epoch: [267] [400/625] eta: 0:07:10 lr: 0.000131 min_lr: 0.000131 loss: 2.2414 (2.1905) class_acc: 0.7031 (0.7146) weight_decay: 0.0500 (0.0500) grad_norm: 1.3811 (inf) time: 1.7322 data: 0.0008 max mem: 6925
Epoch: [267] [600/625] eta: 0:00:47 lr: 0.000129 min_lr: 0.000129 loss: 2.1994 (2.1931) class_acc: 0.7031 (0.7142) weight_decay: 0.0500 (0.0500) grad_norm: 1.3335 (inf) time: 1.9333 data: 0.0008 max mem: 6925
Epoch: [267] [624/625] eta: 0:00:01 lr: 0.000129 min_lr: 0.000129 loss: 2.1764 (2.1928) class_acc: 0.7109 (0.7143) weight_decay: 0.0500 (0.0500) grad_norm: 1.4291 (inf) time: 0.9801 data: 0.0015 max mem: 6925
Epoch: [267] Total time: 0:19:31 (1.8746 s / it)
Averaged stats: lr: 0.000129 min_lr: 0.000129 loss: 2.1764 (2.1939) class_acc: 0.7109 (0.7136) weight_decay: 0.0500 (0.0500) grad_norm: 1.4291 (inf)
Test: [ 0/50] eta: 0:09:48 loss: 0.9817 (0.9817) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 11.7758 data: 11.7273 max mem: 6925
Test: [10/50] eta: 0:01:14 loss: 0.9204 (0.9673) acc1: 80.0000 (79.3455) acc5: 93.6000 (92.9455) time: 1.8683 data: 1.8369 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.0760 (1.0968) acc1: 76.0000 (75.9238) acc5: 92.0000 (91.7714) time: 0.9206 data: 0.8907 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.1849 (1.1328) acc1: 71.2000 (74.5548) acc5: 90.4000 (91.5871) time: 0.8438 data: 0.8134 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.0984 (1.1346) acc1: 72.8000 (74.2829) acc5: 91.2000 (91.5512) time: 0.6789 data: 0.6493 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.0878 (1.1355) acc1: 73.6000 (74.1920) acc5: 91.2000 (91.5040) time: 0.4614 data: 0.4331 max mem: 6925
Test: Total time: 0:00:46 (0.9354 s / it)
* Acc@1 75.100 Acc@5 92.224 loss 1.093
Accuracy of the model on the 50000 test images: 75.1%
Max accuracy: 75.10%
Epoch: [268] [ 0/625] eta: 3:46:12 lr: 0.000128 min_lr: 0.000128 loss: 1.9998 (1.9998) class_acc: 0.7383 (0.7383) weight_decay: 0.0500 (0.0500) time: 21.7161 data: 18.7690 max mem: 6925
Epoch: [268] [200/625] eta: 0:14:01 lr: 0.000126 min_lr: 0.000126 loss: 2.2172 (2.1913) class_acc: 0.7109 (0.7145) weight_decay: 0.0500 (0.0500) grad_norm: 1.4998 (1.5167) time: 1.9792 data: 1.0464 max mem: 6925
Epoch: [268] [400/625] eta: 0:07:16 lr: 0.000123 min_lr: 0.000123 loss: 2.1888 (2.1902) class_acc: 0.7109 (0.7145) weight_decay: 0.0500 (0.0500) grad_norm: 1.3134 (1.4688) time: 1.8935 data: 1.0276 max mem: 6925
Epoch: [268] [600/625] eta: 0:00:48 lr: 0.000121 min_lr: 0.000121 loss: 2.2082 (2.1917) class_acc: 0.7070 (0.7145) weight_decay: 0.0500 (0.0500) grad_norm: 1.5771 (1.4881) time: 1.8882 data: 1.1768 max mem: 6925
Epoch: [268] [624/625] eta: 0:00:01 lr: 0.000121 min_lr: 0.000121 loss: 2.1773 (2.1914) class_acc: 0.7109 (0.7146) weight_decay: 0.0500 (0.0500) grad_norm: 1.5338 (1.4885) time: 0.8792 data: 0.4045 max mem: 6925
Epoch: [268] Total time: 0:19:36 (1.8826 s / it)
Averaged stats: lr: 0.000121 min_lr: 0.000121 loss: 2.1773 (2.1927) class_acc: 0.7109 (0.7141) weight_decay: 0.0500 (0.0500) grad_norm: 1.5338 (1.4885)
Test: [ 0/50] eta: 0:10:36 loss: 0.9810 (0.9810) acc1: 78.4000 (78.4000) acc5: 92.0000 (92.0000) time: 12.7291 data: 12.6849 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 0.9711 (0.9766) acc1: 79.2000 (79.4182) acc5: 92.8000 (92.4364) time: 2.1021 data: 2.0696 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.1130 (1.1126) acc1: 76.0000 (75.5810) acc5: 92.0000 (91.4667) time: 1.0595 data: 1.0289 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.2107 (1.1431) acc1: 72.8000 (74.6839) acc5: 90.4000 (91.2774) time: 1.0415 data: 1.0117 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1315 (1.1472) acc1: 73.6000 (74.6732) acc5: 91.2000 (91.2000) time: 0.7527 data: 0.7220 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1101 (1.1464) acc1: 74.4000 (74.5440) acc5: 91.2000 (91.1840) time: 0.6966 data: 0.6637 max mem: 6925
Test: Total time: 0:00:49 (0.9926 s / it)
* Acc@1 75.130 Acc@5 92.060 loss 1.107
Accuracy of the model on the 50000 test images: 75.1%
Max accuracy: 75.13%
Epoch: [269] [ 0/625] eta: 3:54:37 lr: 0.000121 min_lr: 0.000121 loss: 2.1433 (2.1433) class_acc: 0.7266 (0.7266) weight_decay: 0.0500 (0.0500) time: 22.5234 data: 19.2057 max mem: 6925
Epoch: [269] [200/625] eta: 0:13:48 lr: 0.000118 min_lr: 0.000118 loss: 2.1623 (2.1882) class_acc: 0.7070 (0.7154) weight_decay: 0.0500 (0.0500) grad_norm: 1.3688 (1.4099) time: 1.9259 data: 0.0021 max mem: 6925
Epoch: [269] [400/625] eta: 0:07:03 lr: 0.000116 min_lr: 0.000116 loss: 2.2070 (2.1878) class_acc: 0.7109 (0.7145) weight_decay: 0.0500 (0.0500) grad_norm: 1.3635 (1.4492) time: 1.8044 data: 0.0011 max mem: 6925
Epoch: [269] [600/625] eta: 0:00:47 lr: 0.000113 min_lr: 0.000113 loss: 2.2570 (2.1926) class_acc: 0.7031 (0.7134) weight_decay: 0.0500 (0.0500) grad_norm: 1.3399 (1.4458) time: 1.8268 data: 0.0011 max mem: 6925
Epoch: [269] [624/625] eta: 0:00:01 lr: 0.000113 min_lr: 0.000113 loss: 2.1970 (2.1933) class_acc: 0.7109 (0.7133) weight_decay: 0.0500 (0.0500) grad_norm: 1.3456 (1.4443) time: 0.8703 data: 0.0019 max mem: 6925
Epoch: [269] Total time: 0:19:14 (1.8472 s / it)
Averaged stats: lr: 0.000113 min_lr: 0.000113 loss: 2.1970 (2.1881) class_acc: 0.7109 (0.7152) weight_decay: 0.0500 (0.0500) grad_norm: 1.3456 (1.4443)
Test: [ 0/50] eta: 0:10:33 loss: 0.9845 (0.9845) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0000) time: 12.6633 data: 12.6324 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 0.9498 (0.9805) acc1: 78.4000 (78.9091) acc5: 92.8000 (92.8000) time: 2.0270 data: 1.9962 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.0798 (1.1056) acc1: 76.0000 (75.5810) acc5: 92.0000 (91.8095) time: 0.9805 data: 0.9508 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.2182 (1.1392) acc1: 71.2000 (74.5548) acc5: 90.4000 (91.4839) time: 0.9810 data: 0.9523 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1666 (1.1408) acc1: 72.8000 (74.4585) acc5: 91.2000 (91.4927) time: 0.7875 data: 0.7586 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1124 (1.1408) acc1: 74.4000 (74.4000) acc5: 92.0000 (91.4080) time: 0.6992 data: 0.6691 max mem: 6925
Test: Total time: 0:00:50 (1.0132 s / it)
* Acc@1 75.198 Acc@5 92.272 loss 1.099
Accuracy of the model on the 50000 test images: 75.2%
Max accuracy: 75.20%
Epoch: [270] [ 0/625] eta: 3:29:07 lr: 0.000113 min_lr: 0.000113 loss: 2.1434 (2.1434) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 20.0765 data: 19.8410 max mem: 6925
Epoch: [270] [200/625] eta: 0:13:33 lr: 0.000111 min_lr: 0.000111 loss: 2.1709 (2.1877) class_acc: 0.7070 (0.7150) weight_decay: 0.0500 (0.0500) grad_norm: 1.4706 (1.4628) time: 1.7247 data: 0.0715 max mem: 6925
Epoch: [270] [400/625] eta: 0:07:08 lr: 0.000109 min_lr: 0.000109 loss: 2.2112 (2.1837) class_acc: 0.7070 (0.7162) weight_decay: 0.0500 (0.0500) grad_norm: 1.3664 (1.4597) time: 1.9696 data: 0.0125 max mem: 6925
Epoch: [270] [600/625] eta: 0:00:47 lr: 0.000106 min_lr: 0.000106 loss: 2.2332 (2.1853) class_acc: 0.7031 (0.7154) weight_decay: 0.0500 (0.0500) grad_norm: 1.5082 (1.4697) time: 2.0107 data: 0.0014 max mem: 6925
Epoch: [270] [624/625] eta: 0:00:01 lr: 0.000106 min_lr: 0.000106 loss: 2.1447 (2.1842) class_acc: 0.7227 (0.7155) weight_decay: 0.0500 (0.0500) grad_norm: 1.4144 (1.4667) time: 0.6296 data: 0.0014 max mem: 6925
Epoch: [270] Total time: 0:19:38 (1.8855 s / it)
Averaged stats: lr: 0.000106 min_lr: 0.000106 loss: 2.1447 (2.1885) class_acc: 0.7227 (0.7149) weight_decay: 0.0500 (0.0500) grad_norm: 1.4144 (1.4667)
Test: [ 0/50] eta: 0:10:04 loss: 0.9765 (0.9765) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 12.0987 data: 12.0265 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 0.9254 (0.9583) acc1: 79.2000 (79.4909) acc5: 92.8000 (92.8727) time: 2.1022 data: 2.0687 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.0565 (1.0891) acc1: 76.8000 (76.3429) acc5: 92.0000 (91.7714) time: 1.1461 data: 1.1171 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.1681 (1.1256) acc1: 72.0000 (75.2000) acc5: 91.2000 (91.3806) time: 1.1275 data: 1.0984 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1361 (1.1288) acc1: 73.6000 (74.9659) acc5: 92.0000 (91.3951) time: 0.7463 data: 0.7172 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1088 (1.1283) acc1: 74.4000 (74.7840) acc5: 91.2000 (91.3120) time: 0.6765 data: 0.6481 max mem: 6925
Test: Total time: 0:00:50 (1.0154 s / it)
* Acc@1 75.256 Acc@5 92.228 loss 1.089
Accuracy of the model on the 50000 test images: 75.3%
Max accuracy: 75.26%
Epoch: [271] [ 0/625] eta: 3:28:41 lr: 0.000106 min_lr: 0.000106 loss: 2.2935 (2.2935) class_acc: 0.6953 (0.6953) weight_decay: 0.0500 (0.0500) time: 20.0352 data: 18.3225 max mem: 6925
Epoch: [271] [200/625] eta: 0:14:01 lr: 0.000104 min_lr: 0.000104 loss: 2.1029 (2.1721) class_acc: 0.7266 (0.7207) weight_decay: 0.0500 (0.0500) grad_norm: 1.6945 (1.5478) time: 1.7220 data: 0.0355 max mem: 6925
Epoch: [271] [400/625] eta: 0:07:15 lr: 0.000101 min_lr: 0.000101 loss: 2.1825 (2.1783) class_acc: 0.7031 (0.7187) weight_decay: 0.0500 (0.0500) grad_norm: 1.3331 (1.4919) time: 1.8107 data: 0.0344 max mem: 6925
Epoch: [271] [600/625] eta: 0:00:47 lr: 0.000099 min_lr: 0.000099 loss: 2.1682 (2.1824) class_acc: 0.7148 (0.7172) weight_decay: 0.0500 (0.0500) grad_norm: 1.4616 (1.4768) time: 1.8815 data: 1.0454 max mem: 6925
Epoch: [271] [624/625] eta: 0:00:01 lr: 0.000099 min_lr: 0.000099 loss: 2.1615 (2.1822) class_acc: 0.7148 (0.7172) weight_decay: 0.0500 (0.0500) grad_norm: 1.4767 (1.4778) time: 0.8079 data: 0.4733 max mem: 6925
Epoch: [271] Total time: 0:19:45 (1.8972 s / it)
Averaged stats: lr: 0.000099 min_lr: 0.000099 loss: 2.1615 (2.1851) class_acc: 0.7148 (0.7162) weight_decay: 0.0500 (0.0500) grad_norm: 1.4767 (1.4778)
Test: [ 0/50] eta: 0:09:00 loss: 0.9851 (0.9851) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 10.8112 data: 10.7683 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 0.9722 (0.9781) acc1: 77.6000 (78.6182) acc5: 92.8000 (92.7273) time: 1.9309 data: 1.9003 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.0775 (1.1048) acc1: 76.0000 (74.8571) acc5: 92.0000 (91.8095) time: 1.0778 data: 1.0488 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.2194 (1.1377) acc1: 71.2000 (74.0903) acc5: 91.2000 (91.4839) time: 1.0159 data: 0.9859 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1266 (1.1394) acc1: 72.8000 (74.0683) acc5: 92.0000 (91.4927) time: 0.6619 data: 0.6320 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1254 (1.1402) acc1: 72.8000 (73.9680) acc5: 92.0000 (91.5040) time: 0.5450 data: 0.5157 max mem: 6925
Test: Total time: 0:00:47 (0.9418 s / it)
* Acc@1 75.158 Acc@5 92.202 loss 1.098
Accuracy of the model on the 50000 test images: 75.2%
Max accuracy: 75.26%
Epoch: [272] [ 0/625] eta: 3:20:03 lr: 0.000099 min_lr: 0.000099 loss: 2.0266 (2.0266) class_acc: 0.7617 (0.7617) weight_decay: 0.0500 (0.0500) time: 19.2052 data: 18.3426 max mem: 6925
Epoch: [272] [200/625] eta: 0:13:26 lr: 0.000097 min_lr: 0.000097 loss: 2.1507 (2.1866) class_acc: 0.7266 (0.7176) weight_decay: 0.0500 (0.0500) grad_norm: 1.4078 (1.5341) time: 1.8873 data: 0.0015 max mem: 6925
Epoch: [272] [400/625] eta: 0:07:10 lr: 0.000094 min_lr: 0.000094 loss: 2.1987 (2.1875) class_acc: 0.7266 (0.7169) weight_decay: 0.0500 (0.0500) grad_norm: 1.3632 (1.4994) time: 2.0447 data: 0.0012 max mem: 6925
Epoch: [272] [600/625] eta: 0:00:49 lr: 0.000092 min_lr: 0.000092 loss: 2.2160 (2.1862) class_acc: 0.7188 (0.7167) weight_decay: 0.0500 (0.0500) grad_norm: 1.3296 (1.5228) time: 2.0305 data: 0.0011 max mem: 6925
Epoch: [272] [624/625] eta: 0:00:01 lr: 0.000092 min_lr: 0.000092 loss: 2.1855 (2.1871) class_acc: 0.7148 (0.7166) weight_decay: 0.0500 (0.0500) grad_norm: 1.3819 (1.5205) time: 0.7355 data: 0.0188 max mem: 6925
Epoch: [272] Total time: 0:20:14 (1.9433 s / it)
Averaged stats: lr: 0.000092 min_lr: 0.000092 loss: 2.1855 (2.1840) class_acc: 0.7148 (0.7167) weight_decay: 0.0500 (0.0500) grad_norm: 1.3819 (1.5205)
Test: [ 0/50] eta: 0:10:48 loss: 0.9783 (0.9783) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 12.9656 data: 12.9273 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 0.9658 (0.9855) acc1: 79.2000 (78.7636) acc5: 92.8000 (92.5818) time: 2.2727 data: 2.2429 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.0640 (1.1082) acc1: 75.2000 (75.4667) acc5: 92.0000 (91.6952) time: 1.2695 data: 1.2404 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.1998 (1.1463) acc1: 72.0000 (74.4516) acc5: 90.4000 (91.4581) time: 1.2475 data: 1.2178 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1539 (1.1464) acc1: 72.0000 (74.4781) acc5: 92.0000 (91.3951) time: 0.8288 data: 0.7985 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1178 (1.1442) acc1: 73.6000 (74.3680) acc5: 91.2000 (91.3600) time: 0.6818 data: 0.6513 max mem: 6925
Test: Total time: 0:00:56 (1.1213 s / it)
* Acc@1 75.224 Acc@5 92.186 loss 1.105
Accuracy of the model on the 50000 test images: 75.2%
Max accuracy: 75.26%
Epoch: [273] [ 0/625] eta: 4:00:53 lr: 0.000092 min_lr: 0.000092 loss: 2.2084 (2.2084) class_acc: 0.6953 (0.6953) weight_decay: 0.0500 (0.0500) time: 23.1254 data: 19.3686 max mem: 6925
Epoch: [273] [200/625] eta: 0:14:43 lr: 0.000090 min_lr: 0.000090 loss: 2.1422 (2.1693) class_acc: 0.7266 (0.7188) weight_decay: 0.0500 (0.0500) grad_norm: 1.6015 (1.4849) time: 1.8956 data: 0.0014 max mem: 6925
Epoch: [273] [400/625] eta: 0:07:33 lr: 0.000088 min_lr: 0.000088 loss: 2.2335 (2.1817) class_acc: 0.6992 (0.7166) weight_decay: 0.0500 (0.0500) grad_norm: 1.2608 (1.4726) time: 2.0553 data: 0.0010 max mem: 6925
Epoch: [273] [600/625] eta: 0:00:50 lr: 0.000086 min_lr: 0.000086 loss: 2.1563 (2.1796) class_acc: 0.7031 (0.7179) weight_decay: 0.0500 (0.0500) grad_norm: 1.3989 (inf) time: 1.9815 data: 0.0014 max mem: 6925
Epoch: [273] [624/625] eta: 0:00:01 lr: 0.000085 min_lr: 0.000085 loss: 2.1643 (2.1799) class_acc: 0.7148 (0.7178) weight_decay: 0.0500 (0.0500) grad_norm: 1.3483 (inf) time: 1.0299 data: 0.0015 max mem: 6925
Epoch: [273] Total time: 0:20:36 (1.9781 s / it)
Averaged stats: lr: 0.000085 min_lr: 0.000085 loss: 2.1643 (2.1817) class_acc: 0.7148 (0.7173) weight_decay: 0.0500 (0.0500) grad_norm: 1.3483 (inf)
Test: [ 0/50] eta: 0:09:30 loss: 0.9952 (0.9952) acc1: 77.6000 (77.6000) acc5: 92.0000 (92.0000) time: 11.4178 data: 11.3859 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 0.9440 (0.9698) acc1: 79.2000 (79.6364) acc5: 92.8000 (92.7273) time: 1.9590 data: 1.9275 max mem: 6925
Test: [20/50] eta: 0:00:46 loss: 1.0453 (1.1001) acc1: 76.8000 (76.1143) acc5: 92.0000 (91.6952) time: 1.0661 data: 1.0357 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.1963 (1.1387) acc1: 73.6000 (75.0452) acc5: 90.4000 (91.4323) time: 1.1011 data: 1.0723 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1293 (1.1393) acc1: 72.8000 (74.8098) acc5: 91.2000 (91.3171) time: 1.0041 data: 0.9754 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1126 (1.1379) acc1: 73.6000 (74.6240) acc5: 91.2000 (91.3440) time: 0.8531 data: 0.8239 max mem: 6925
Test: Total time: 0:00:57 (1.1553 s / it)
* Acc@1 75.376 Acc@5 92.220 loss 1.094
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.38%
Epoch: [274] [ 0/625] eta: 3:48:25 lr: 0.000085 min_lr: 0.000085 loss: 2.1040 (2.1040) class_acc: 0.7305 (0.7305) weight_decay: 0.0500 (0.0500) time: 21.9293 data: 17.8444 max mem: 6925
Epoch: [274] [200/625] eta: 0:14:45 lr: 0.000083 min_lr: 0.000083 loss: 2.1739 (2.1873) class_acc: 0.7148 (0.7165) weight_decay: 0.0500 (0.0500) grad_norm: 1.5174 (1.5237) time: 1.9240 data: 0.0010 max mem: 6925
Epoch: [274] [400/625] eta: 0:07:37 lr: 0.000081 min_lr: 0.000081 loss: 2.2014 (2.1897) class_acc: 0.7109 (0.7160) weight_decay: 0.0500 (0.0500) grad_norm: 1.3903 (1.5031) time: 2.0533 data: 0.0010 max mem: 6925
Epoch: [274] [600/625] eta: 0:00:50 lr: 0.000079 min_lr: 0.000079 loss: 2.1808 (2.1841) class_acc: 0.7070 (0.7171) weight_decay: 0.0500 (0.0500) grad_norm: 1.3637 (1.4893) time: 2.1364 data: 0.0009 max mem: 6925
Epoch: [274] [624/625] eta: 0:00:01 lr: 0.000079 min_lr: 0.000079 loss: 2.1686 (2.1844) class_acc: 0.7109 (0.7169) weight_decay: 0.0500 (0.0500) grad_norm: 1.3355 (1.4846) time: 0.7421 data: 0.0019 max mem: 6925
Epoch: [274] Total time: 0:20:28 (1.9654 s / it)
Averaged stats: lr: 0.000079 min_lr: 0.000079 loss: 2.1686 (2.1820) class_acc: 0.7109 (0.7168) weight_decay: 0.0500 (0.0500) grad_norm: 1.3355 (1.4846)
Test: [ 0/50] eta: 0:09:19 loss: 0.9980 (0.9980) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 11.1925 data: 11.1510 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 0.9478 (0.9626) acc1: 78.4000 (79.5636) acc5: 94.4000 (93.3091) time: 2.0306 data: 1.9997 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.0679 (1.0900) acc1: 76.0000 (76.1143) acc5: 92.0000 (92.0381) time: 1.1464 data: 1.1164 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.1843 (1.1271) acc1: 72.0000 (75.1226) acc5: 90.4000 (91.6903) time: 1.0528 data: 1.0224 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1168 (1.1296) acc1: 72.0000 (75.1220) acc5: 91.2000 (91.5512) time: 0.6073 data: 0.5772 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1168 (1.1282) acc1: 73.6000 (74.8800) acc5: 91.2000 (91.6160) time: 0.4696 data: 0.4396 max mem: 6925
Test: Total time: 0:00:47 (0.9480 s / it)
* Acc@1 75.370 Acc@5 92.256 loss 1.089
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.38%
Epoch: [275] [ 0/625] eta: 3:34:25 lr: 0.000079 min_lr: 0.000079 loss: 2.3777 (2.3777) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 20.5848 data: 15.4601 max mem: 6925
Epoch: [275] [200/625] eta: 0:13:56 lr: 0.000077 min_lr: 0.000077 loss: 2.1837 (2.1783) class_acc: 0.7109 (0.7198) weight_decay: 0.0500 (0.0500) grad_norm: 1.6940 (1.4748) time: 1.8826 data: 0.0010 max mem: 6925
Epoch: [275] [400/625] eta: 0:07:09 lr: 0.000075 min_lr: 0.000075 loss: 2.1684 (2.1791) class_acc: 0.7070 (0.7188) weight_decay: 0.0500 (0.0500) grad_norm: 1.4735 (1.4816) time: 1.8681 data: 0.0007 max mem: 6925
Epoch: [275] [600/625] eta: 0:00:47 lr: 0.000073 min_lr: 0.000073 loss: 2.1471 (2.1800) class_acc: 0.7109 (0.7180) weight_decay: 0.0500 (0.0500) grad_norm: 1.2641 (1.4652) time: 1.8762 data: 0.0007 max mem: 6925
Epoch: [275] [624/625] eta: 0:00:01 lr: 0.000073 min_lr: 0.000073 loss: 2.1796 (2.1798) class_acc: 0.7227 (0.7181) weight_decay: 0.0500 (0.0500) grad_norm: 1.4223 (1.4696) time: 0.8687 data: 0.0014 max mem: 6925
Epoch: [275] Total time: 0:19:29 (1.8713 s / it)
Averaged stats: lr: 0.000073 min_lr: 0.000073 loss: 2.1796 (2.1768) class_acc: 0.7227 (0.7185) weight_decay: 0.0500 (0.0500) grad_norm: 1.4223 (1.4696)
Test: [ 0/50] eta: 0:09:45 loss: 0.9873 (0.9873) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0000) time: 11.7010 data: 11.6687 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 0.9304 (0.9603) acc1: 78.4000 (78.9091) acc5: 93.6000 (92.9455) time: 1.9820 data: 1.9519 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.0571 (1.0862) acc1: 76.8000 (75.5810) acc5: 92.0000 (91.9238) time: 1.0234 data: 0.9934 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.1776 (1.1213) acc1: 72.0000 (74.7355) acc5: 90.4000 (91.5871) time: 0.9696 data: 0.9400 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1461 (1.1247) acc1: 72.0000 (74.7122) acc5: 91.2000 (91.4146) time: 0.6798 data: 0.6508 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1035 (1.1244) acc1: 74.4000 (74.6400) acc5: 91.2000 (91.4720) time: 0.5668 data: 0.5380 max mem: 6925
Test: Total time: 0:00:46 (0.9391 s / it)
* Acc@1 75.308 Acc@5 92.256 loss 1.086
Accuracy of the model on the 50000 test images: 75.3%
Max accuracy: 75.38%
Epoch: [276] [ 0/625] eta: 3:28:49 lr: 0.000073 min_lr: 0.000073 loss: 2.1457 (2.1457) class_acc: 0.7266 (0.7266) weight_decay: 0.0500 (0.0500) time: 20.0480 data: 17.8866 max mem: 6925
Epoch: [276] [200/625] eta: 0:14:13 lr: 0.000071 min_lr: 0.000071 loss: 2.1805 (2.1714) class_acc: 0.7031 (0.7198) weight_decay: 0.0500 (0.0500) grad_norm: 1.4572 (1.5131) time: 1.9235 data: 0.0008 max mem: 6925
Epoch: [276] [400/625] eta: 0:07:23 lr: 0.000069 min_lr: 0.000069 loss: 2.1654 (2.1712) class_acc: 0.7188 (0.7196) weight_decay: 0.0500 (0.0500) grad_norm: 1.4763 (1.4896) time: 1.9678 data: 0.0031 max mem: 6925
Epoch: [276] [600/625] eta: 0:00:49 lr: 0.000067 min_lr: 0.000067 loss: 2.1822 (2.1740) class_acc: 0.7148 (0.7192) weight_decay: 0.0500 (0.0500) grad_norm: 1.4383 (1.4626) time: 1.9664 data: 0.0008 max mem: 6925
Epoch: [276] [624/625] eta: 0:00:01 lr: 0.000067 min_lr: 0.000067 loss: 2.1758 (2.1749) class_acc: 0.7188 (0.7190) weight_decay: 0.0500 (0.0500) grad_norm: 1.4251 (1.4602) time: 0.7662 data: 0.0015 max mem: 6925
Epoch: [276] Total time: 0:20:07 (1.9323 s / it)
Averaged stats: lr: 0.000067 min_lr: 0.000067 loss: 2.1758 (2.1783) class_acc: 0.7188 (0.7179) weight_decay: 0.0500 (0.0500) grad_norm: 1.4251 (1.4602)
Test: [ 0/50] eta: 0:10:30 loss: 0.9707 (0.9707) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 12.6005 data: 12.5665 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 0.9367 (0.9674) acc1: 78.4000 (79.1273) acc5: 93.6000 (92.8727) time: 2.0612 data: 2.0316 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.0649 (1.0881) acc1: 76.8000 (75.8857) acc5: 92.8000 (92.0381) time: 1.0743 data: 1.0453 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.1763 (1.1244) acc1: 71.2000 (74.8645) acc5: 91.2000 (91.7936) time: 1.0967 data: 1.0673 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1405 (1.1259) acc1: 72.0000 (74.8683) acc5: 91.2000 (91.6683) time: 0.9397 data: 0.9104 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1028 (1.1271) acc1: 72.8000 (74.6240) acc5: 91.2000 (91.6000) time: 0.8281 data: 0.7980 max mem: 6925
Test: Total time: 0:00:55 (1.1026 s / it)
* Acc@1 75.412 Acc@5 92.310 loss 1.086
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.41%
Epoch: [277] [ 0/625] eta: 3:20:12 lr: 0.000067 min_lr: 0.000067 loss: 2.0030 (2.0030) class_acc: 0.7500 (0.7500) weight_decay: 0.0500 (0.0500) time: 19.2200 data: 18.3038 max mem: 6925
Epoch: [277] [200/625] eta: 0:14:08 lr: 0.000065 min_lr: 0.000065 loss: 2.1339 (2.1633) class_acc: 0.7305 (0.7202) weight_decay: 0.0500 (0.0500) grad_norm: 1.5025 (1.4818) time: 1.6140 data: 0.0261 max mem: 6925
Epoch: [277] [400/625] eta: 0:07:15 lr: 0.000064 min_lr: 0.000064 loss: 2.2059 (2.1743) class_acc: 0.6992 (0.7187) weight_decay: 0.0500 (0.0500) grad_norm: 1.4655 (1.4671) time: 1.7524 data: 0.0009 max mem: 6925
Epoch: [277] [600/625] eta: 0:00:48 lr: 0.000062 min_lr: 0.000062 loss: 2.1503 (2.1722) class_acc: 0.6992 (0.7188) weight_decay: 0.0500 (0.0500) grad_norm: 1.4496 (1.4716) time: 1.8476 data: 0.0012 max mem: 6925
Epoch: [277] [624/625] eta: 0:00:01 lr: 0.000062 min_lr: 0.000062 loss: 2.1613 (2.1728) class_acc: 0.7227 (0.7189) weight_decay: 0.0500 (0.0500) grad_norm: 1.4712 (1.4737) time: 0.7986 data: 0.0016 max mem: 6925
Epoch: [277] Total time: 0:19:51 (1.9061 s / it)
Averaged stats: lr: 0.000062 min_lr: 0.000062 loss: 2.1613 (2.1756) class_acc: 0.7227 (0.7185) weight_decay: 0.0500 (0.0500) grad_norm: 1.4712 (1.4737)
Test: [ 0/50] eta: 0:08:46 loss: 0.9606 (0.9606) acc1: 77.6000 (77.6000) acc5: 93.6000 (93.6000) time: 10.5277 data: 10.4951 max mem: 6925
Test: [10/50] eta: 0:01:00 loss: 0.9387 (0.9602) acc1: 80.0000 (79.5636) acc5: 93.6000 (93.1636) time: 1.5014 data: 1.4714 max mem: 6925
Test: [20/50] eta: 0:00:36 loss: 1.0717 (1.0868) acc1: 76.0000 (75.9619) acc5: 92.0000 (92.1143) time: 0.7642 data: 0.7350 max mem: 6925
Test: [30/50] eta: 0:00:23 loss: 1.1813 (1.1236) acc1: 72.0000 (74.8903) acc5: 91.2000 (91.7677) time: 0.9730 data: 0.9442 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.1253 (1.1253) acc1: 73.6000 (74.8098) acc5: 92.0000 (91.6293) time: 0.9120 data: 0.8828 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1083 (1.1258) acc1: 72.8000 (74.5440) acc5: 92.0000 (91.5840) time: 0.5617 data: 0.5321 max mem: 6925
Test: Total time: 0:00:47 (0.9463 s / it)
* Acc@1 75.380 Acc@5 92.292 loss 1.087
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.41%
Epoch: [278] [ 0/625] eta: 3:44:52 lr: 0.000062 min_lr: 0.000062 loss: 2.1674 (2.1674) class_acc: 0.7227 (0.7227) weight_decay: 0.0500 (0.0500) time: 21.5877 data: 17.6301 max mem: 6925
Epoch: [278] [200/625] eta: 0:13:46 lr: 0.000060 min_lr: 0.000060 loss: 2.1831 (2.1798) class_acc: 0.7148 (0.7193) weight_decay: 0.0500 (0.0500) grad_norm: 1.4785 (1.4640) time: 1.8729 data: 0.0019 max mem: 6925
Epoch: [278] [400/625] eta: 0:07:10 lr: 0.000058 min_lr: 0.000058 loss: 2.1799 (2.1825) class_acc: 0.7070 (0.7178) weight_decay: 0.0500 (0.0500) grad_norm: 1.4731 (1.4473) time: 1.8044 data: 0.0011 max mem: 6925
Epoch: [278] [600/625] eta: 0:00:47 lr: 0.000056 min_lr: 0.000056 loss: 2.1348 (2.1809) class_acc: 0.7266 (0.7181) weight_decay: 0.0500 (0.0500) grad_norm: 1.4525 (1.4418) time: 1.8029 data: 0.0014 max mem: 6925
Epoch: [278] [624/625] eta: 0:00:01 lr: 0.000056 min_lr: 0.000056 loss: 2.1617 (2.1804) class_acc: 0.7227 (0.7181) weight_decay: 0.0500 (0.0500) grad_norm: 1.4320 (1.4417) time: 0.6615 data: 0.0027 max mem: 6925
Epoch: [278] Total time: 0:19:40 (1.8888 s / it)
Averaged stats: lr: 0.000056 min_lr: 0.000056 loss: 2.1617 (2.1733) class_acc: 0.7227 (0.7191) weight_decay: 0.0500 (0.0500) grad_norm: 1.4320 (1.4417)
Test: [ 0/50] eta: 0:10:31 loss: 0.9604 (0.9604) acc1: 79.2000 (79.2000) acc5: 92.8000 (92.8000) time: 12.6216 data: 12.5882 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 0.9429 (0.9559) acc1: 79.2000 (79.8545) acc5: 92.8000 (93.1636) time: 1.9667 data: 1.9366 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.0653 (1.0824) acc1: 76.0000 (76.1524) acc5: 92.0000 (92.0381) time: 0.9385 data: 0.9087 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.1933 (1.1200) acc1: 72.8000 (75.0452) acc5: 91.2000 (91.6645) time: 0.9388 data: 0.9096 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1196 (1.1196) acc1: 72.8000 (75.0634) acc5: 92.0000 (91.6683) time: 0.7574 data: 0.7280 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1036 (1.1196) acc1: 73.6000 (74.8320) acc5: 92.0000 (91.6320) time: 0.5826 data: 0.5524 max mem: 6925
Test: Total time: 0:00:48 (0.9654 s / it)
* Acc@1 75.420 Acc@5 92.288 loss 1.083
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.42%
Epoch: [279] [ 0/625] eta: 3:16:04 lr: 0.000056 min_lr: 0.000056 loss: 2.1327 (2.1327) class_acc: 0.7227 (0.7227) weight_decay: 0.0500 (0.0500) time: 18.8239 data: 16.5639 max mem: 6925
Epoch: [279] [200/625] eta: 0:14:02 lr: 0.000055 min_lr: 0.000055 loss: 2.1894 (2.1681) class_acc: 0.7070 (0.7196) weight_decay: 0.0500 (0.0500) grad_norm: 1.2931 (1.4285) time: 1.8995 data: 0.0010 max mem: 6925
Epoch: [279] [400/625] eta: 0:07:13 lr: 0.000053 min_lr: 0.000053 loss: 2.1492 (2.1681) class_acc: 0.7266 (0.7201) weight_decay: 0.0500 (0.0500) grad_norm: 1.4963 (1.4658) time: 1.9653 data: 0.0014 max mem: 6925
Epoch: [279] [600/625] eta: 0:00:47 lr: 0.000051 min_lr: 0.000051 loss: 2.1766 (2.1676) class_acc: 0.7148 (0.7207) weight_decay: 0.0500 (0.0500) grad_norm: 1.3778 (1.4671) time: 2.0722 data: 0.0010 max mem: 6925
Epoch: [279] [624/625] eta: 0:00:01 lr: 0.000051 min_lr: 0.000051 loss: 2.1972 (2.1687) class_acc: 0.7109 (0.7206) weight_decay: 0.0500 (0.0500) grad_norm: 1.5242 (1.4801) time: 0.7205 data: 0.0020 max mem: 6925
Epoch: [279] Total time: 0:19:26 (1.8659 s / it)
Averaged stats: lr: 0.000051 min_lr: 0.000051 loss: 2.1972 (2.1718) class_acc: 0.7109 (0.7196) weight_decay: 0.0500 (0.0500) grad_norm: 1.5242 (1.4801)
Test: [ 0/50] eta: 0:10:37 loss: 0.9633 (0.9633) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 12.7499 data: 12.7107 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 0.9292 (0.9575) acc1: 79.2000 (79.5636) acc5: 92.8000 (93.0182) time: 1.9843 data: 1.9531 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.0550 (1.0837) acc1: 76.8000 (75.6952) acc5: 91.2000 (92.1524) time: 0.9496 data: 0.9199 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.1962 (1.1188) acc1: 72.0000 (74.6839) acc5: 91.2000 (91.8194) time: 0.9415 data: 0.9126 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1479 (1.1200) acc1: 73.6000 (74.7317) acc5: 91.2000 (91.6683) time: 0.7402 data: 0.7112 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1055 (1.1195) acc1: 74.4000 (74.5920) acc5: 91.2000 (91.5520) time: 0.5464 data: 0.5171 max mem: 6925
Test: Total time: 0:00:49 (0.9909 s / it)
* Acc@1 75.408 Acc@5 92.324 loss 1.080
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.42%
Epoch: [280] [ 0/625] eta: 3:33:56 lr: 0.000051 min_lr: 0.000051 loss: 2.2191 (2.2191) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 20.5383 data: 17.2459 max mem: 6925
Epoch: [280] [200/625] eta: 0:14:01 lr: 0.000050 min_lr: 0.000050 loss: 2.1174 (2.1585) class_acc: 0.7266 (0.7226) weight_decay: 0.0500 (0.0500) grad_norm: 1.4552 (1.4986) time: 1.8404 data: 0.0008 max mem: 6925
Epoch: [280] [400/625] eta: 0:07:15 lr: 0.000048 min_lr: 0.000048 loss: 2.1341 (2.1627) class_acc: 0.7227 (0.7219) weight_decay: 0.0500 (0.0500) grad_norm: 1.3505 (1.4531) time: 1.8468 data: 0.0264 max mem: 6925
Epoch: [280] [600/625] eta: 0:00:48 lr: 0.000046 min_lr: 0.000046 loss: 2.1669 (2.1674) class_acc: 0.7188 (0.7208) weight_decay: 0.0500 (0.0500) grad_norm: 1.3207 (1.4466) time: 1.9204 data: 0.0009 max mem: 6925
Epoch: [280] [624/625] eta: 0:00:01 lr: 0.000046 min_lr: 0.000046 loss: 2.1277 (2.1668) class_acc: 0.7344 (0.7211) weight_decay: 0.0500 (0.0500) grad_norm: 1.3522 (1.4459) time: 0.6846 data: 0.0030 max mem: 6925
Epoch: [280] Total time: 0:19:47 (1.9002 s / it)
Averaged stats: lr: 0.000046 min_lr: 0.000046 loss: 2.1277 (2.1709) class_acc: 0.7344 (0.7199) weight_decay: 0.0500 (0.0500) grad_norm: 1.3522 (1.4459)
Test: [ 0/50] eta: 0:10:13 loss: 0.9932 (0.9932) acc1: 76.8000 (76.8000) acc5: 91.2000 (91.2000) time: 12.2631 data: 12.1998 max mem: 6925
Test: [10/50] eta: 0:01:18 loss: 0.9401 (0.9635) acc1: 79.2000 (79.4182) acc5: 92.8000 (92.8000) time: 1.9740 data: 1.9419 max mem: 6925
Test: [20/50] eta: 0:00:44 loss: 1.0609 (1.0858) acc1: 77.6000 (75.9619) acc5: 92.0000 (91.9619) time: 0.9606 data: 0.9314 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.1966 (1.1221) acc1: 71.2000 (74.8903) acc5: 91.2000 (91.7161) time: 0.9417 data: 0.9128 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1306 (1.1255) acc1: 72.8000 (74.8878) acc5: 92.0000 (91.5512) time: 0.7114 data: 0.6824 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1133 (1.1258) acc1: 72.8000 (74.6720) acc5: 92.0000 (91.5520) time: 0.5212 data: 0.4922 max mem: 6925
Test: Total time: 0:00:48 (0.9745 s / it)
* Acc@1 75.432 Acc@5 92.244 loss 1.087
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.43%
Epoch: [281] [ 0/625] eta: 3:31:32 lr: 0.000046 min_lr: 0.000046 loss: 2.2921 (2.2921) class_acc: 0.7031 (0.7031) weight_decay: 0.0500 (0.0500) time: 20.3085 data: 20.0693 max mem: 6925
Epoch: [281] [200/625] eta: 0:13:26 lr: 0.000045 min_lr: 0.000045 loss: 2.1606 (2.1646) class_acc: 0.7109 (0.7216) weight_decay: 0.0500 (0.0500) grad_norm: 1.4859 (1.4581) time: 1.7440 data: 0.0007 max mem: 6925
Epoch: [281] [400/625] eta: 0:07:00 lr: 0.000043 min_lr: 0.000043 loss: 2.1305 (2.1679) class_acc: 0.7305 (0.7205) weight_decay: 0.0500 (0.0500) grad_norm: 1.4165 (1.4574) time: 1.8058 data: 0.1310 max mem: 6925
Epoch: [281] [600/625] eta: 0:00:46 lr: 0.000042 min_lr: 0.000042 loss: 2.1456 (2.1701) class_acc: 0.7305 (0.7197) weight_decay: 0.0500 (0.0500) grad_norm: 1.3403 (1.4489) time: 1.7065 data: 0.0006 max mem: 6925
Epoch: [281] [624/625] eta: 0:00:01 lr: 0.000042 min_lr: 0.000042 loss: 2.1889 (2.1713) class_acc: 0.7109 (0.7197) weight_decay: 0.0500 (0.0500) grad_norm: 1.3403 (1.4483) time: 0.8483 data: 0.0012 max mem: 6925
Epoch: [281] Total time: 0:19:23 (1.8612 s / it)
Averaged stats: lr: 0.000042 min_lr: 0.000042 loss: 2.1889 (2.1701) class_acc: 0.7109 (0.7202) weight_decay: 0.0500 (0.0500) grad_norm: 1.3403 (1.4483)
Test: [ 0/50] eta: 0:09:33 loss: 0.9811 (0.9811) acc1: 75.2000 (75.2000) acc5: 92.8000 (92.8000) time: 11.4677 data: 11.4369 max mem: 6925
Test: [10/50] eta: 0:01:19 loss: 0.9445 (0.9670) acc1: 80.0000 (79.4909) acc5: 92.8000 (92.8727) time: 1.9931 data: 1.9629 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.0582 (1.0922) acc1: 76.8000 (76.1143) acc5: 92.0000 (92.0381) time: 1.1105 data: 1.0810 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.1698 (1.1274) acc1: 72.0000 (75.0710) acc5: 91.2000 (91.7161) time: 1.1403 data: 1.1116 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1489 (1.1281) acc1: 73.6000 (75.0244) acc5: 92.0000 (91.6098) time: 0.8688 data: 0.8401 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1126 (1.1277) acc1: 73.6000 (74.8960) acc5: 92.0000 (91.6160) time: 0.7708 data: 0.7406 max mem: 6925
Test: Total time: 0:00:52 (1.0434 s / it)
* Acc@1 75.498 Acc@5 92.250 loss 1.088
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.50%
Epoch: [282] [ 0/625] eta: 3:13:07 lr: 0.000042 min_lr: 0.000042 loss: 2.1305 (2.1305) class_acc: 0.7422 (0.7422) weight_decay: 0.0500 (0.0500) time: 18.5393 data: 15.0837 max mem: 6925
Epoch: [282] [200/625] eta: 0:13:45 lr: 0.000040 min_lr: 0.000040 loss: 2.1746 (2.1622) class_acc: 0.7188 (0.7225) weight_decay: 0.0500 (0.0500) grad_norm: 1.4069 (inf) time: 1.8214 data: 0.0248 max mem: 6925
Epoch: [282] [400/625] eta: 0:07:04 lr: 0.000039 min_lr: 0.000039 loss: 2.1870 (2.1600) class_acc: 0.7148 (0.7216) weight_decay: 0.0500 (0.0500) grad_norm: 1.4432 (inf) time: 1.8990 data: 0.0656 max mem: 6925
Epoch: [282] [600/625] eta: 0:00:47 lr: 0.000037 min_lr: 0.000037 loss: 2.1499 (2.1586) class_acc: 0.7266 (0.7218) weight_decay: 0.0500 (0.0500) grad_norm: 1.3524 (inf) time: 1.9463 data: 0.0009 max mem: 6925
Epoch: [282] [624/625] eta: 0:00:01 lr: 0.000037 min_lr: 0.000037 loss: 2.1161 (2.1584) class_acc: 0.7227 (0.7218) weight_decay: 0.0500 (0.0500) grad_norm: 1.4504 (inf) time: 0.6201 data: 0.0013 max mem: 6925
Epoch: [282] Total time: 0:19:25 (1.8647 s / it)
Averaged stats: lr: 0.000037 min_lr: 0.000037 loss: 2.1161 (2.1673) class_acc: 0.7227 (0.7200) weight_decay: 0.0500 (0.0500) grad_norm: 1.4504 (inf)
Test: [ 0/50] eta: 0:09:31 loss: 0.9936 (0.9936) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 11.4368 data: 11.4033 max mem: 6925
Test: [10/50] eta: 0:01:17 loss: 0.9251 (0.9607) acc1: 79.2000 (79.2000) acc5: 92.8000 (92.6546) time: 1.9294 data: 1.8988 max mem: 6925
Test: [20/50] eta: 0:00:45 loss: 1.0784 (1.0850) acc1: 76.8000 (75.8857) acc5: 92.0000 (91.9238) time: 1.0193 data: 0.9896 max mem: 6925
Test: [30/50] eta: 0:00:26 loss: 1.1787 (1.1199) acc1: 73.6000 (75.0452) acc5: 91.2000 (91.6387) time: 1.0037 data: 0.9749 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1192 (1.1204) acc1: 73.6000 (74.9854) acc5: 91.2000 (91.5707) time: 0.7393 data: 0.7109 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.0965 (1.1194) acc1: 73.6000 (74.8960) acc5: 91.2000 (91.5520) time: 0.7279 data: 0.6996 max mem: 6925
Test: Total time: 0:00:47 (0.9598 s / it)
* Acc@1 75.444 Acc@5 92.256 loss 1.082
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.50%
Epoch: [283] [ 0/625] eta: 4:03:47 lr: 0.000037 min_lr: 0.000037 loss: 1.9957 (1.9957) class_acc: 0.7656 (0.7656) weight_decay: 0.0500 (0.0500) time: 23.4039 data: 17.8947 max mem: 6925
Epoch: [283] [200/625] eta: 0:14:24 lr: 0.000036 min_lr: 0.000036 loss: 2.1858 (2.1631) class_acc: 0.7109 (0.7216) weight_decay: 0.0500 (0.0500) grad_norm: 1.4050 (1.5061) time: 1.8910 data: 0.0009 max mem: 6925
Epoch: [283] [400/625] eta: 0:07:25 lr: 0.000035 min_lr: 0.000035 loss: 2.1480 (2.1645) class_acc: 0.7227 (0.7207) weight_decay: 0.0500 (0.0500) grad_norm: 1.4467 (1.4826) time: 1.9590 data: 0.0441 max mem: 6925
Epoch: [283] [600/625] eta: 0:00:49 lr: 0.000033 min_lr: 0.000033 loss: 2.1655 (2.1648) class_acc: 0.7109 (0.7208) weight_decay: 0.0500 (0.0500) grad_norm: 1.4571 (1.4804) time: 1.9180 data: 0.0101 max mem: 6925
Epoch: [283] [624/625] eta: 0:00:01 lr: 0.000033 min_lr: 0.000033 loss: 2.2262 (2.1664) class_acc: 0.7070 (0.7205) weight_decay: 0.0500 (0.0500) grad_norm: 1.3921 (1.4770) time: 1.0217 data: 0.0127 max mem: 6925
Epoch: [283] Total time: 0:20:07 (1.9315 s / it)
Averaged stats: lr: 0.000033 min_lr: 0.000033 loss: 2.2262 (2.1691) class_acc: 0.7070 (0.7204) weight_decay: 0.0500 (0.0500) grad_norm: 1.3921 (1.4770)
Test: [ 0/50] eta: 0:10:22 loss: 0.9739 (0.9739) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 12.4453 data: 12.4122 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 0.9265 (0.9569) acc1: 79.2000 (79.3455) acc5: 92.8000 (92.8727) time: 2.0268 data: 1.9969 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.0479 (1.0843) acc1: 76.8000 (75.8095) acc5: 92.0000 (92.0000) time: 1.0296 data: 1.0005 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.1808 (1.1204) acc1: 72.8000 (74.8129) acc5: 91.2000 (91.6903) time: 1.0258 data: 0.9971 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1258 (1.1217) acc1: 73.6000 (74.8488) acc5: 91.2000 (91.6098) time: 0.7748 data: 0.7460 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1169 (1.1221) acc1: 73.6000 (74.7200) acc5: 91.2000 (91.5360) time: 0.7355 data: 0.7065 max mem: 6925
Test: Total time: 0:00:50 (1.0084 s / it)
* Acc@1 75.514 Acc@5 92.322 loss 1.084
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.51%
Epoch: [284] [ 0/625] eta: 3:33:09 lr: 0.000033 min_lr: 0.000033 loss: 2.1664 (2.1664) class_acc: 0.7383 (0.7383) weight_decay: 0.0500 (0.0500) time: 20.4636 data: 20.0698 max mem: 6925
Epoch: [284] [200/625] eta: 0:14:36 lr: 0.000032 min_lr: 0.000032 loss: 2.1873 (2.1761) class_acc: 0.7070 (0.7193) weight_decay: 0.0500 (0.0500) grad_norm: 1.3143 (1.3937) time: 1.9052 data: 0.0011 max mem: 6925
Epoch: [284] [400/625] eta: 0:07:28 lr: 0.000031 min_lr: 0.000031 loss: 2.2046 (2.1744) class_acc: 0.7070 (0.7196) weight_decay: 0.0500 (0.0500) grad_norm: 1.3927 (1.4175) time: 1.8914 data: 0.0015 max mem: 6925
Epoch: [284] [600/625] eta: 0:00:49 lr: 0.000029 min_lr: 0.000029 loss: 2.1326 (2.1705) class_acc: 0.7227 (0.7211) weight_decay: 0.0500 (0.0500) grad_norm: 1.3455 (1.4471) time: 2.1214 data: 0.0009 max mem: 6925
Epoch: [284] [624/625] eta: 0:00:01 lr: 0.000029 min_lr: 0.000029 loss: 2.2068 (2.1722) class_acc: 0.7148 (0.7207) weight_decay: 0.0500 (0.0500) grad_norm: 1.3201 (1.4445) time: 0.9186 data: 0.0020 max mem: 6925
Epoch: [284] Total time: 0:20:06 (1.9307 s / it)
Averaged stats: lr: 0.000029 min_lr: 0.000029 loss: 2.2068 (2.1676) class_acc: 0.7148 (0.7208) weight_decay: 0.0500 (0.0500) grad_norm: 1.3201 (1.4445)
Test: [ 0/50] eta: 0:09:46 loss: 0.9801 (0.9801) acc1: 76.0000 (76.0000) acc5: 92.8000 (92.8000) time: 11.7378 data: 11.6920 max mem: 6925
Test: [10/50] eta: 0:01:12 loss: 0.9391 (0.9596) acc1: 80.0000 (79.3455) acc5: 92.8000 (92.9455) time: 1.8127 data: 1.7816 max mem: 6925
Test: [20/50] eta: 0:00:38 loss: 1.0582 (1.0881) acc1: 76.0000 (75.9238) acc5: 92.0000 (92.0381) time: 0.7770 data: 0.7459 max mem: 6925
Test: [30/50] eta: 0:00:25 loss: 1.1767 (1.1243) acc1: 72.0000 (74.9677) acc5: 91.2000 (91.6903) time: 0.9757 data: 0.9450 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1207 (1.1246) acc1: 73.6000 (74.8878) acc5: 91.2000 (91.6293) time: 1.0328 data: 1.0036 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1168 (1.1253) acc1: 72.8000 (74.6880) acc5: 91.2000 (91.6000) time: 0.5932 data: 0.5635 max mem: 6925
Test: Total time: 0:00:51 (1.0295 s / it)
* Acc@1 75.412 Acc@5 92.280 loss 1.087
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.51%
Epoch: [285] [ 0/625] eta: 3:54:05 lr: 0.000029 min_lr: 0.000029 loss: 2.1644 (2.1644) class_acc: 0.7539 (0.7539) weight_decay: 0.0500 (0.0500) time: 22.4726 data: 17.9818 max mem: 6925
Epoch: [285] [200/625] eta: 0:14:03 lr: 0.000028 min_lr: 0.000028 loss: 2.1627 (2.1749) class_acc: 0.7227 (0.7188) weight_decay: 0.0500 (0.0500) grad_norm: 1.3539 (1.4108) time: 1.9452 data: 0.0007 max mem: 6925
Epoch: [285] [400/625] eta: 0:07:25 lr: 0.000027 min_lr: 0.000027 loss: 2.1902 (2.1671) class_acc: 0.7148 (0.7209) weight_decay: 0.0500 (0.0500) grad_norm: 1.4572 (1.4273) time: 2.1114 data: 0.4597 max mem: 6925
Epoch: [285] [600/625] eta: 0:00:49 lr: 0.000026 min_lr: 0.000026 loss: 2.1312 (2.1669) class_acc: 0.7188 (0.7198) weight_decay: 0.0500 (0.0500) grad_norm: 1.3165 (1.4336) time: 1.9837 data: 0.0009 max mem: 6925
Epoch: [285] [624/625] eta: 0:00:01 lr: 0.000026 min_lr: 0.000026 loss: 2.1420 (2.1660) class_acc: 0.7109 (0.7199) weight_decay: 0.0500 (0.0500) grad_norm: 1.3165 (1.4348) time: 0.9362 data: 0.0013 max mem: 6925
Epoch: [285] Total time: 0:20:11 (1.9387 s / it)
Averaged stats: lr: 0.000026 min_lr: 0.000026 loss: 2.1420 (2.1658) class_acc: 0.7109 (0.7209) weight_decay: 0.0500 (0.0500) grad_norm: 1.3165 (1.4348)
Test: [ 0/50] eta: 0:11:01 loss: 0.9789 (0.9789) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 13.2363 data: 13.2051 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 0.9335 (0.9536) acc1: 79.2000 (79.2000) acc5: 93.6000 (93.0909) time: 2.1558 data: 2.1261 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.0691 (1.0816) acc1: 76.8000 (75.7333) acc5: 92.0000 (92.1905) time: 1.0188 data: 0.9897 max mem: 6925
Test: [30/50] eta: 0:00:27 loss: 1.1595 (1.1179) acc1: 71.2000 (74.7871) acc5: 90.4000 (91.8968) time: 0.9626 data: 0.9340 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1263 (1.1193) acc1: 73.6000 (74.7707) acc5: 91.2000 (91.7854) time: 0.8844 data: 0.8547 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0947 (1.1198) acc1: 73.6000 (74.7520) acc5: 92.0000 (91.8080) time: 0.8742 data: 0.8428 max mem: 6925
Test: Total time: 0:00:54 (1.0857 s / it)
* Acc@1 75.442 Acc@5 92.320 loss 1.082
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.51%
Epoch: [286] [ 0/625] eta: 3:47:32 lr: 0.000026 min_lr: 0.000026 loss: 2.2325 (2.2325) class_acc: 0.6992 (0.6992) weight_decay: 0.0500 (0.0500) time: 21.8443 data: 19.2664 max mem: 6925
Epoch: [286] [200/625] eta: 0:14:40 lr: 0.000025 min_lr: 0.000025 loss: 2.1551 (2.1669) class_acc: 0.7188 (0.7202) weight_decay: 0.0500 (0.0500) grad_norm: 1.3693 (1.4358) time: 1.8991 data: 0.0012 max mem: 6925
Epoch: [286] [400/625] eta: 0:07:33 lr: 0.000023 min_lr: 0.000023 loss: 2.1247 (2.1688) class_acc: 0.7227 (0.7206) weight_decay: 0.0500 (0.0500) grad_norm: 1.3418 (1.4441) time: 1.8395 data: 0.0008 max mem: 6925
Epoch: [286] [600/625] eta: 0:00:50 lr: 0.000022 min_lr: 0.000022 loss: 2.1280 (2.1662) class_acc: 0.7266 (0.7219) weight_decay: 0.0500 (0.0500) grad_norm: 1.4290 (1.4223) time: 1.9938 data: 0.0009 max mem: 6925
Epoch: [286] [624/625] eta: 0:00:01 lr: 0.000022 min_lr: 0.000022 loss: 2.1510 (2.1668) class_acc: 0.7148 (0.7217) weight_decay: 0.0500 (0.0500) grad_norm: 1.4263 (1.4235) time: 0.9210 data: 0.0016 max mem: 6925
Epoch: [286] Total time: 0:20:37 (1.9806 s / it)
Averaged stats: lr: 0.000022 min_lr: 0.000022 loss: 2.1510 (2.1636) class_acc: 0.7148 (0.7218) weight_decay: 0.0500 (0.0500) grad_norm: 1.4263 (1.4235)
Test: [ 0/50] eta: 0:09:58 loss: 0.9846 (0.9846) acc1: 76.0000 (76.0000) acc5: 92.0000 (92.0000) time: 11.9755 data: 11.9430 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 0.9299 (0.9539) acc1: 78.4000 (79.0545) acc5: 92.8000 (92.9455) time: 2.0792 data: 2.0487 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.0575 (1.0851) acc1: 76.8000 (75.8476) acc5: 92.0000 (92.0762) time: 1.1605 data: 1.1310 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.1727 (1.1216) acc1: 71.2000 (74.8903) acc5: 91.2000 (91.6903) time: 1.1912 data: 1.1603 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1270 (1.1214) acc1: 72.8000 (74.9073) acc5: 91.2000 (91.6878) time: 0.8776 data: 0.8463 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0930 (1.1204) acc1: 73.6000 (74.7680) acc5: 91.2000 (91.6320) time: 0.7635 data: 0.7343 max mem: 6925
Test: Total time: 0:00:54 (1.0847 s / it)
* Acc@1 75.478 Acc@5 92.340 loss 1.081
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.51%
Epoch: [287] [ 0/625] eta: 3:36:23 lr: 0.000022 min_lr: 0.000022 loss: 2.0734 (2.0734) class_acc: 0.7461 (0.7461) weight_decay: 0.0500 (0.0500) time: 20.7738 data: 17.6347 max mem: 6925
Epoch: [287] [200/625] eta: 0:15:08 lr: 0.000021 min_lr: 0.000021 loss: 2.1763 (2.1632) class_acc: 0.7148 (0.7224) weight_decay: 0.0500 (0.0500) grad_norm: 1.3925 (1.4629) time: 2.1368 data: 0.0052 max mem: 6925
Epoch: [287] [400/625] eta: 0:07:55 lr: 0.000020 min_lr: 0.000020 loss: 2.1166 (2.1609) class_acc: 0.7344 (0.7223) weight_decay: 0.0500 (0.0500) grad_norm: 1.4673 (1.4591) time: 2.1501 data: 0.0049 max mem: 6925
Epoch: [287] [600/625] eta: 0:00:52 lr: 0.000019 min_lr: 0.000019 loss: 2.1597 (2.1606) class_acc: 0.7188 (0.7227) weight_decay: 0.0500 (0.0500) grad_norm: 1.3371 (1.4528) time: 1.9784 data: 0.0010 max mem: 6925
Epoch: [287] [624/625] eta: 0:00:02 lr: 0.000019 min_lr: 0.000019 loss: 2.1318 (2.1613) class_acc: 0.7109 (0.7224) weight_decay: 0.0500 (0.0500) grad_norm: 1.4181 (1.4536) time: 0.8637 data: 0.0016 max mem: 6925
Epoch: [287] Total time: 0:21:26 (2.0585 s / it)
Averaged stats: lr: 0.000019 min_lr: 0.000019 loss: 2.1318 (2.1641) class_acc: 0.7109 (0.7216) weight_decay: 0.0500 (0.0500) grad_norm: 1.4181 (1.4536)
Test: [ 0/50] eta: 0:11:11 loss: 0.9852 (0.9852) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 13.4279 data: 13.3717 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 0.9369 (0.9530) acc1: 80.0000 (79.4909) acc5: 92.8000 (93.0909) time: 2.1083 data: 2.0750 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.0608 (1.0850) acc1: 76.8000 (75.8476) acc5: 92.0000 (92.2286) time: 1.0504 data: 1.0208 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.1899 (1.1187) acc1: 72.0000 (74.9161) acc5: 91.2000 (91.8452) time: 1.1054 data: 1.0770 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1162 (1.1192) acc1: 72.0000 (74.9659) acc5: 91.2000 (91.7659) time: 0.9913 data: 0.9620 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1056 (1.1195) acc1: 74.4000 (74.8320) acc5: 91.2000 (91.6960) time: 0.9863 data: 0.9562 max mem: 6925
Test: Total time: 0:00:59 (1.1809 s / it)
* Acc@1 75.430 Acc@5 92.350 loss 1.081
Accuracy of the model on the 50000 test images: 75.4%
Max accuracy: 75.51%
Epoch: [288] [ 0/625] eta: 3:43:26 lr: 0.000019 min_lr: 0.000019 loss: 2.1895 (2.1895) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 21.4506 data: 16.4771 max mem: 6925
Epoch: [288] [200/625] eta: 0:14:32 lr: 0.000018 min_lr: 0.000018 loss: 2.1096 (2.1640) class_acc: 0.7188 (0.7220) weight_decay: 0.0500 (0.0500) grad_norm: 1.3752 (1.4216) time: 2.1607 data: 0.0086 max mem: 6925
Epoch: [288] [400/625] eta: 0:07:40 lr: 0.000017 min_lr: 0.000017 loss: 2.1279 (2.1623) class_acc: 0.7148 (0.7218) weight_decay: 0.0500 (0.0500) grad_norm: 1.3856 (inf) time: 1.9879 data: 0.0006 max mem: 6925
Epoch: [288] [600/625] eta: 0:00:51 lr: 0.000016 min_lr: 0.000016 loss: 2.1727 (2.1628) class_acc: 0.7109 (0.7222) weight_decay: 0.0500 (0.0500) grad_norm: 1.4828 (inf) time: 2.0688 data: 0.0008 max mem: 6925
Epoch: [288] [624/625] eta: 0:00:02 lr: 0.000016 min_lr: 0.000016 loss: 2.1580 (2.1632) class_acc: 0.7148 (0.7219) weight_decay: 0.0500 (0.0500) grad_norm: 1.4938 (inf) time: 0.7536 data: 0.0016 max mem: 6925
Epoch: [288] Total time: 0:21:07 (2.0282 s / it)
Averaged stats: lr: 0.000016 min_lr: 0.000016 loss: 2.1580 (2.1643) class_acc: 0.7148 (0.7215) weight_decay: 0.0500 (0.0500) grad_norm: 1.4938 (inf)
Test: [ 0/50] eta: 0:09:26 loss: 0.9778 (0.9778) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 11.3236 data: 11.2860 max mem: 6925
Test: [10/50] eta: 0:01:22 loss: 0.9336 (0.9468) acc1: 78.4000 (79.7091) acc5: 93.6000 (93.0182) time: 2.0651 data: 2.0347 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.0475 (1.0785) acc1: 76.8000 (76.1524) acc5: 92.0000 (92.1905) time: 1.2346 data: 1.2054 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.1754 (1.1156) acc1: 72.0000 (75.2000) acc5: 91.2000 (91.8452) time: 1.2980 data: 1.2685 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.1220 (1.1171) acc1: 72.0000 (75.2000) acc5: 92.0000 (91.7659) time: 1.0705 data: 1.0407 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0986 (1.1171) acc1: 73.6000 (75.0560) acc5: 92.0000 (91.7440) time: 1.0634 data: 1.0340 max mem: 6925
Test: Total time: 0:00:57 (1.1572 s / it)
* Acc@1 75.512 Acc@5 92.312 loss 1.078
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.51%
Epoch: [289] [ 0/625] eta: 3:40:45 lr: 0.000016 min_lr: 0.000016 loss: 2.2863 (2.2863) class_acc: 0.6562 (0.6562) weight_decay: 0.0500 (0.0500) time: 21.1922 data: 16.6951 max mem: 6925
Epoch: [289] [200/625] eta: 0:14:55 lr: 0.000015 min_lr: 0.000015 loss: 2.1548 (2.1720) class_acc: 0.7305 (0.7188) weight_decay: 0.0500 (0.0500) grad_norm: 1.3575 (1.4117) time: 2.0925 data: 0.1407 max mem: 6925
Epoch: [289] [400/625] eta: 0:07:51 lr: 0.000014 min_lr: 0.000014 loss: 2.1289 (2.1743) class_acc: 0.7188 (0.7182) weight_decay: 0.0500 (0.0500) grad_norm: 1.2935 (1.4520) time: 2.2121 data: 0.0561 max mem: 6925
Epoch: [289] [600/625] eta: 0:00:52 lr: 0.000014 min_lr: 0.000014 loss: 2.1660 (2.1703) class_acc: 0.7227 (0.7191) weight_decay: 0.0500 (0.0500) grad_norm: 1.2704 (1.4334) time: 2.0740 data: 0.0164 max mem: 6925
Epoch: [289] [624/625] eta: 0:00:02 lr: 0.000014 min_lr: 0.000014 loss: 2.1573 (2.1702) class_acc: 0.7188 (0.7192) weight_decay: 0.0500 (0.0500) grad_norm: 1.4096 (1.4330) time: 0.7895 data: 0.0301 max mem: 6925
Epoch: [289] Total time: 0:21:11 (2.0348 s / it)
Averaged stats: lr: 0.000014 min_lr: 0.000014 loss: 2.1573 (2.1626) class_acc: 0.7188 (0.7214) weight_decay: 0.0500 (0.0500) grad_norm: 1.4096 (1.4330)
Test: [ 0/50] eta: 0:11:16 loss: 0.9754 (0.9754) acc1: 76.0000 (76.0000) acc5: 92.8000 (92.8000) time: 13.5286 data: 13.4975 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 0.9399 (0.9522) acc1: 78.4000 (79.4909) acc5: 92.8000 (93.0182) time: 2.1806 data: 2.1510 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.0609 (1.0829) acc1: 77.6000 (76.2667) acc5: 92.0000 (92.0762) time: 1.0897 data: 1.0597 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.1748 (1.1181) acc1: 72.0000 (75.3290) acc5: 91.2000 (91.7161) time: 1.1310 data: 1.1014 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1192 (1.1188) acc1: 72.8000 (75.1220) acc5: 92.0000 (91.7073) time: 1.0171 data: 0.9883 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0998 (1.1193) acc1: 73.6000 (74.9120) acc5: 92.0000 (91.6800) time: 0.9374 data: 0.9068 max mem: 6925
Test: Total time: 0:00:58 (1.1745 s / it)
* Acc@1 75.574 Acc@5 92.314 loss 1.079
Accuracy of the model on the 50000 test images: 75.6%
Max accuracy: 75.57%
Epoch: [290] [ 0/625] eta: 3:25:42 lr: 0.000014 min_lr: 0.000014 loss: 2.3649 (2.3649) class_acc: 0.6875 (0.6875) weight_decay: 0.0500 (0.0500) time: 19.7473 data: 17.3662 max mem: 6925
Epoch: [290] [200/625] eta: 0:14:45 lr: 0.000013 min_lr: 0.000013 loss: 2.1671 (2.1668) class_acc: 0.7227 (0.7188) weight_decay: 0.0500 (0.0500) grad_norm: 1.4105 (1.4408) time: 2.0718 data: 0.1363 max mem: 6925
Epoch: [290] [400/625] eta: 0:07:46 lr: 0.000012 min_lr: 0.000012 loss: 2.1374 (2.1682) class_acc: 0.7109 (0.7195) weight_decay: 0.0500 (0.0500) grad_norm: 1.3625 (1.4187) time: 1.9942 data: 0.0009 max mem: 6925
Epoch: [290] [600/625] eta: 0:00:51 lr: 0.000011 min_lr: 0.000011 loss: 2.1713 (2.1673) class_acc: 0.7148 (0.7194) weight_decay: 0.0500 (0.0500) grad_norm: 1.3558 (1.4088) time: 2.2189 data: 0.0007 max mem: 6925
Epoch: [290] [624/625] eta: 0:00:02 lr: 0.000011 min_lr: 0.000011 loss: 2.1604 (2.1669) class_acc: 0.7266 (0.7199) weight_decay: 0.0500 (0.0500) grad_norm: 1.4163 (1.4079) time: 0.7393 data: 0.0014 max mem: 6925
Epoch: [290] Total time: 0:21:03 (2.0213 s / it)
Averaged stats: lr: 0.000011 min_lr: 0.000011 loss: 2.1604 (2.1619) class_acc: 0.7266 (0.7224) weight_decay: 0.0500 (0.0500) grad_norm: 1.4163 (1.4079)
Test: [ 0/50] eta: 0:09:43 loss: 0.9791 (0.9791) acc1: 76.0000 (76.0000) acc5: 92.8000 (92.8000) time: 11.6617 data: 11.6228 max mem: 6925
Test: [10/50] eta: 0:01:23 loss: 0.9330 (0.9495) acc1: 79.2000 (79.2000) acc5: 92.8000 (92.9455) time: 2.0814 data: 2.0508 max mem: 6925
Test: [20/50] eta: 0:00:51 loss: 1.0537 (1.0796) acc1: 76.8000 (75.7333) acc5: 92.0000 (92.0381) time: 1.2114 data: 1.1818 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.1632 (1.1159) acc1: 72.0000 (74.9677) acc5: 91.2000 (91.7419) time: 1.2803 data: 1.2511 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.1195 (1.1170) acc1: 72.8000 (74.9073) acc5: 92.0000 (91.7268) time: 1.0808 data: 1.0516 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1063 (1.1179) acc1: 73.6000 (74.8160) acc5: 91.2000 (91.6640) time: 0.9603 data: 0.9303 max mem: 6925
Test: Total time: 0:00:58 (1.1756 s / it)
* Acc@1 75.480 Acc@5 92.304 loss 1.079
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [291] [ 0/625] eta: 3:28:31 lr: 0.000011 min_lr: 0.000011 loss: 2.1873 (2.1873) class_acc: 0.7383 (0.7383) weight_decay: 0.0500 (0.0500) time: 20.0187 data: 19.0099 max mem: 6925
Epoch: [291] [200/625] eta: 0:14:12 lr: 0.000010 min_lr: 0.000010 loss: 2.1499 (2.1691) class_acc: 0.7305 (0.7187) weight_decay: 0.0500 (0.0500) grad_norm: 1.3237 (1.3917) time: 1.8828 data: 0.1623 max mem: 6925
Epoch: [291] [400/625] eta: 0:07:40 lr: 0.000010 min_lr: 0.000010 loss: 2.1768 (2.1664) class_acc: 0.7188 (0.7199) weight_decay: 0.0500 (0.0500) grad_norm: 1.3822 (1.4246) time: 2.0524 data: 0.0317 max mem: 6925
Epoch: [291] [600/625] eta: 0:00:51 lr: 0.000009 min_lr: 0.000009 loss: 2.1354 (2.1637) class_acc: 0.7266 (0.7211) weight_decay: 0.0500 (0.0500) grad_norm: 1.4769 (1.4305) time: 2.1421 data: 0.0121 max mem: 6925
Epoch: [291] [624/625] eta: 0:00:02 lr: 0.000009 min_lr: 0.000009 loss: 2.1181 (2.1627) class_acc: 0.7305 (0.7212) weight_decay: 0.0500 (0.0500) grad_norm: 1.4553 (1.4307) time: 1.1352 data: 0.0013 max mem: 6925
Epoch: [291] Total time: 0:21:00 (2.0167 s / it)
Averaged stats: lr: 0.000009 min_lr: 0.000009 loss: 2.1181 (2.1625) class_acc: 0.7305 (0.7223) weight_decay: 0.0500 (0.0500) grad_norm: 1.4553 (1.4307)
Test: [ 0/50] eta: 0:10:43 loss: 0.9779 (0.9779) acc1: 76.8000 (76.8000) acc5: 92.0000 (92.0000) time: 12.8753 data: 12.8183 max mem: 6925
Test: [10/50] eta: 0:01:26 loss: 0.9386 (0.9531) acc1: 78.4000 (79.3455) acc5: 92.8000 (92.9455) time: 2.1542 data: 2.1221 max mem: 6925
Test: [20/50] eta: 0:00:50 loss: 1.0539 (1.0824) acc1: 77.6000 (76.1524) acc5: 92.0000 (92.0762) time: 1.1337 data: 1.1045 max mem: 6925
Test: [30/50] eta: 0:00:30 loss: 1.1836 (1.1183) acc1: 72.0000 (75.1742) acc5: 91.2000 (91.7419) time: 1.1513 data: 1.1215 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1177 (1.1192) acc1: 72.0000 (75.1220) acc5: 92.0000 (91.7073) time: 1.0760 data: 1.0459 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1045 (1.1194) acc1: 73.6000 (75.0080) acc5: 92.0000 (91.6640) time: 0.9976 data: 0.9672 max mem: 6925
Test: Total time: 0:00:58 (1.1774 s / it)
* Acc@1 75.492 Acc@5 92.312 loss 1.081
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [292] [ 0/625] eta: 3:34:29 lr: 0.000009 min_lr: 0.000009 loss: 2.2113 (2.2113) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 20.5920 data: 20.3589 max mem: 6925
Epoch: [292] [200/625] eta: 0:15:00 lr: 0.000008 min_lr: 0.000008 loss: 2.1675 (2.1585) class_acc: 0.7148 (0.7252) weight_decay: 0.0500 (0.0500) grad_norm: 1.3637 (1.3906) time: 2.0661 data: 0.2263 max mem: 6925
Epoch: [292] [400/625] eta: 0:07:50 lr: 0.000008 min_lr: 0.000008 loss: 2.1526 (2.1588) class_acc: 0.7148 (0.7241) weight_decay: 0.0500 (0.0500) grad_norm: 1.4068 (1.4057) time: 2.1272 data: 0.0807 max mem: 6925
Epoch: [292] [600/625] eta: 0:00:52 lr: 0.000007 min_lr: 0.000007 loss: 2.1464 (2.1585) class_acc: 0.7227 (0.7237) weight_decay: 0.0500 (0.0500) grad_norm: 1.4422 (inf) time: 2.0684 data: 0.6269 max mem: 6925
Epoch: [292] [624/625] eta: 0:00:02 lr: 0.000007 min_lr: 0.000007 loss: 2.2469 (2.1599) class_acc: 0.7031 (0.7234) weight_decay: 0.0500 (0.0500) grad_norm: 1.3739 (inf) time: 0.8005 data: 0.2103 max mem: 6925
Epoch: [292] Total time: 0:21:11 (2.0338 s / it)
Averaged stats: lr: 0.000007 min_lr: 0.000007 loss: 2.2469 (2.1612) class_acc: 0.7031 (0.7227) weight_decay: 0.0500 (0.0500) grad_norm: 1.3739 (inf)
Test: [ 0/50] eta: 0:10:49 loss: 0.9769 (0.9769) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 12.9929 data: 12.9538 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 0.9284 (0.9520) acc1: 78.4000 (79.7818) acc5: 92.8000 (93.0909) time: 2.0429 data: 2.0133 max mem: 6925
Test: [20/50] eta: 0:00:48 loss: 1.0614 (1.0809) acc1: 76.8000 (76.3429) acc5: 92.0000 (92.1905) time: 1.0311 data: 1.0014 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.1803 (1.1179) acc1: 72.0000 (75.2516) acc5: 91.2000 (91.8452) time: 1.0721 data: 1.0422 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1210 (1.1188) acc1: 72.8000 (75.2000) acc5: 92.0000 (91.8049) time: 0.9901 data: 0.9614 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1031 (1.1192) acc1: 73.6000 (75.0720) acc5: 92.0000 (91.7440) time: 0.9382 data: 0.9082 max mem: 6925
Test: Total time: 0:00:58 (1.1789 s / it)
* Acc@1 75.518 Acc@5 92.364 loss 1.081
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [293] [ 0/625] eta: 3:42:14 lr: 0.000007 min_lr: 0.000007 loss: 2.0451 (2.0451) class_acc: 0.7383 (0.7383) weight_decay: 0.0500 (0.0500) time: 21.3356 data: 16.2506 max mem: 6925
Epoch: [293] [200/625] eta: 0:14:51 lr: 0.000007 min_lr: 0.000007 loss: 2.1007 (2.1583) class_acc: 0.7305 (0.7219) weight_decay: 0.0500 (0.0500) grad_norm: 1.3242 (1.4424) time: 1.9215 data: 0.0010 max mem: 6925
Epoch: [293] [400/625] eta: 0:07:47 lr: 0.000006 min_lr: 0.000006 loss: 2.1355 (2.1565) class_acc: 0.7188 (0.7225) weight_decay: 0.0500 (0.0500) grad_norm: 1.4043 (1.4290) time: 2.2413 data: 0.0016 max mem: 6925
Epoch: [293] [600/625] eta: 0:00:51 lr: 0.000006 min_lr: 0.000006 loss: 2.1433 (2.1595) class_acc: 0.7266 (0.7223) weight_decay: 0.0500 (0.0500) grad_norm: 1.4181 (1.4243) time: 1.9105 data: 0.0008 max mem: 6925
Epoch: [293] [624/625] eta: 0:00:02 lr: 0.000006 min_lr: 0.000006 loss: 2.1577 (2.1597) class_acc: 0.7109 (0.7222) weight_decay: 0.0500 (0.0500) grad_norm: 1.3685 (1.4217) time: 0.9956 data: 0.0014 max mem: 6925
Epoch: [293] Total time: 0:21:00 (2.0165 s / it)
Averaged stats: lr: 0.000006 min_lr: 0.000006 loss: 2.1577 (2.1621) class_acc: 0.7109 (0.7221) weight_decay: 0.0500 (0.0500) grad_norm: 1.3685 (1.4217)
Test: [ 0/50] eta: 0:10:57 loss: 0.9716 (0.9716) acc1: 78.4000 (78.4000) acc5: 92.0000 (92.0000) time: 13.1549 data: 13.1236 max mem: 6925
Test: [10/50] eta: 0:01:28 loss: 0.9281 (0.9511) acc1: 78.4000 (79.7091) acc5: 92.8000 (92.9455) time: 2.2227 data: 2.1929 max mem: 6925
Test: [20/50] eta: 0:00:53 loss: 1.0545 (1.0809) acc1: 77.6000 (76.1143) acc5: 92.0000 (92.0762) time: 1.2223 data: 1.1930 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.1789 (1.1174) acc1: 72.0000 (75.1484) acc5: 91.2000 (91.7936) time: 1.2784 data: 1.2496 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.1260 (1.1185) acc1: 72.0000 (75.1415) acc5: 92.0000 (91.7854) time: 1.0156 data: 0.9867 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0982 (1.1187) acc1: 73.6000 (75.0400) acc5: 92.0000 (91.7760) time: 0.9340 data: 0.9047 max mem: 6925
Test: Total time: 0:00:58 (1.1683 s / it)
* Acc@1 75.484 Acc@5 92.352 loss 1.081
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [294] [ 0/625] eta: 3:24:28 lr: 0.000006 min_lr: 0.000006 loss: 1.9333 (1.9333) class_acc: 0.8125 (0.8125) weight_decay: 0.0500 (0.0500) time: 19.6297 data: 18.5204 max mem: 6925
Epoch: [294] [200/625] eta: 0:14:41 lr: 0.000005 min_lr: 0.000005 loss: 2.1922 (2.1633) class_acc: 0.7070 (0.7215) weight_decay: 0.0500 (0.0500) grad_norm: 1.3432 (1.4431) time: 2.4112 data: 0.0089 max mem: 6925
Epoch: [294] [400/625] eta: 0:07:41 lr: 0.000005 min_lr: 0.000005 loss: 2.1655 (2.1631) class_acc: 0.7109 (0.7222) weight_decay: 0.0500 (0.0500) grad_norm: 1.3662 (1.4093) time: 2.0928 data: 0.0009 max mem: 6925
Epoch: [294] [600/625] eta: 0:00:51 lr: 0.000004 min_lr: 0.000004 loss: 2.1323 (2.1581) class_acc: 0.7188 (0.7237) weight_decay: 0.0500 (0.0500) grad_norm: 1.3947 (1.4008) time: 2.2017 data: 0.0009 max mem: 6925
Epoch: [294] [624/625] eta: 0:00:02 lr: 0.000004 min_lr: 0.000004 loss: 2.0876 (2.1573) class_acc: 0.7266 (0.7238) weight_decay: 0.0500 (0.0500) grad_norm: 1.2973 (1.4000) time: 0.7387 data: 0.0218 max mem: 6925
Epoch: [294] Total time: 0:21:12 (2.0358 s / it)
Averaged stats: lr: 0.000004 min_lr: 0.000004 loss: 2.0876 (2.1598) class_acc: 0.7266 (0.7230) weight_decay: 0.0500 (0.0500) grad_norm: 1.2973 (1.4000)
Test: [ 0/50] eta: 0:10:37 loss: 0.9653 (0.9653) acc1: 79.2000 (79.2000) acc5: 92.8000 (92.8000) time: 12.7463 data: 12.7115 max mem: 6925
Test: [10/50] eta: 0:01:27 loss: 0.9307 (0.9481) acc1: 79.2000 (79.7091) acc5: 92.8000 (92.9455) time: 2.1982 data: 2.1684 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.0663 (1.0797) acc1: 76.8000 (76.0762) acc5: 92.0000 (92.0381) time: 1.2118 data: 1.1826 max mem: 6925
Test: [30/50] eta: 0:00:31 loss: 1.1767 (1.1162) acc1: 72.0000 (75.0452) acc5: 91.2000 (91.6903) time: 1.2487 data: 1.2199 max mem: 6925
Test: [40/50] eta: 0:00:14 loss: 1.1347 (1.1169) acc1: 72.0000 (75.0829) acc5: 92.0000 (91.7073) time: 1.0462 data: 1.0174 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1056 (1.1166) acc1: 74.4000 (75.0080) acc5: 92.0000 (91.6320) time: 0.9898 data: 0.9609 max mem: 6925
Test: Total time: 0:00:58 (1.1790 s / it)
* Acc@1 75.566 Acc@5 92.288 loss 1.078
Accuracy of the model on the 50000 test images: 75.6%
Max accuracy: 75.57%
Epoch: [295] [ 0/625] eta: 3:43:07 lr: 0.000004 min_lr: 0.000004 loss: 2.1991 (2.1991) class_acc: 0.7188 (0.7188) weight_decay: 0.0500 (0.0500) time: 21.4202 data: 16.6047 max mem: 6925
Epoch: [295] [200/625] eta: 0:14:54 lr: 0.000004 min_lr: 0.000004 loss: 2.1499 (2.1536) class_acc: 0.7305 (0.7243) weight_decay: 0.0500 (0.0500) grad_norm: 1.3591 (1.3973) time: 2.0877 data: 0.0008 max mem: 6925
Epoch: [295] [400/625] eta: 0:07:48 lr: 0.000003 min_lr: 0.000003 loss: 2.1678 (2.1627) class_acc: 0.7188 (0.7229) weight_decay: 0.0500 (0.0500) grad_norm: 1.3199 (1.4185) time: 2.0180 data: 0.0009 max mem: 6925
Epoch: [295] [600/625] eta: 0:00:51 lr: 0.000003 min_lr: 0.000003 loss: 2.1660 (2.1628) class_acc: 0.7266 (0.7227) weight_decay: 0.0500 (0.0500) grad_norm: 1.3074 (1.4008) time: 2.1769 data: 0.0011 max mem: 6925
Epoch: [295] [624/625] eta: 0:00:02 lr: 0.000003 min_lr: 0.000003 loss: 2.1551 (2.1629) class_acc: 0.7227 (0.7228) weight_decay: 0.0500 (0.0500) grad_norm: 1.3665 (1.4040) time: 0.7728 data: 0.0020 max mem: 6925
Epoch: [295] Total time: 0:21:05 (2.0246 s / it)
Averaged stats: lr: 0.000003 min_lr: 0.000003 loss: 2.1551 (2.1609) class_acc: 0.7227 (0.7224) weight_decay: 0.0500 (0.0500) grad_norm: 1.3665 (1.4040)
Test: [ 0/50] eta: 0:10:18 loss: 0.9681 (0.9681) acc1: 77.6000 (77.6000) acc5: 92.8000 (92.8000) time: 12.3718 data: 12.3361 max mem: 6925
Test: [10/50] eta: 0:01:21 loss: 0.9341 (0.9476) acc1: 79.2000 (79.9273) acc5: 92.8000 (93.0909) time: 2.0382 data: 2.0082 max mem: 6925
Test: [20/50] eta: 0:00:47 loss: 1.0560 (1.0794) acc1: 77.6000 (76.1905) acc5: 92.0000 (92.1524) time: 1.0599 data: 1.0294 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.1684 (1.1155) acc1: 72.8000 (75.0968) acc5: 91.2000 (91.8194) time: 1.1094 data: 1.0794 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1150 (1.1164) acc1: 72.8000 (75.0829) acc5: 91.2000 (91.7854) time: 1.0428 data: 1.0136 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0986 (1.1165) acc1: 73.6000 (75.0560) acc5: 92.0000 (91.7920) time: 0.9688 data: 0.9396 max mem: 6925
Test: Total time: 0:00:58 (1.1676 s / it)
* Acc@1 75.564 Acc@5 92.340 loss 1.078
Accuracy of the model on the 50000 test images: 75.6%
Max accuracy: 75.57%
Epoch: [296] [ 0/625] eta: 3:59:16 lr: 0.000003 min_lr: 0.000003 loss: 2.1094 (2.1094) class_acc: 0.7109 (0.7109) weight_decay: 0.0500 (0.0500) time: 22.9711 data: 19.7834 max mem: 6925
Epoch: [296] [200/625] eta: 0:14:36 lr: 0.000003 min_lr: 0.000003 loss: 2.1227 (2.1537) class_acc: 0.7188 (0.7243) weight_decay: 0.0500 (0.0500) grad_norm: 1.4140 (1.4206) time: 1.9695 data: 0.0010 max mem: 6925
Epoch: [296] [400/625] eta: 0:07:38 lr: 0.000002 min_lr: 0.000002 loss: 2.1774 (2.1535) class_acc: 0.7188 (0.7248) weight_decay: 0.0500 (0.0500) grad_norm: 1.3059 (1.3959) time: 2.0430 data: 0.0537 max mem: 6925
Epoch: [296] [600/625] eta: 0:00:50 lr: 0.000002 min_lr: 0.000002 loss: 2.1493 (2.1615) class_acc: 0.7148 (0.7230) weight_decay: 0.0500 (0.0500) grad_norm: 1.3063 (1.4102) time: 2.0161 data: 0.0013 max mem: 6925
Epoch: [296] [624/625] eta: 0:00:01 lr: 0.000002 min_lr: 0.000002 loss: 2.1529 (2.1614) class_acc: 0.7266 (0.7233) weight_decay: 0.0500 (0.0500) grad_norm: 1.3331 (1.4112) time: 0.8082 data: 0.0017 max mem: 6925
Epoch: [296] Total time: 0:20:34 (1.9753 s / it)
Averaged stats: lr: 0.000002 min_lr: 0.000002 loss: 2.1529 (2.1623) class_acc: 0.7266 (0.7221) weight_decay: 0.0500 (0.0500) grad_norm: 1.3331 (1.4112)
Test: [ 0/50] eta: 0:10:42 loss: 0.9723 (0.9723) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 12.8567 data: 12.8238 max mem: 6925
Test: [10/50] eta: 0:01:24 loss: 0.9405 (0.9503) acc1: 78.4000 (79.5636) acc5: 92.8000 (93.0182) time: 2.1116 data: 2.0823 max mem: 6925
Test: [20/50] eta: 0:00:49 loss: 1.0644 (1.0818) acc1: 77.6000 (76.0000) acc5: 92.0000 (92.0762) time: 1.0880 data: 1.0592 max mem: 6925
Test: [30/50] eta: 0:00:29 loss: 1.1759 (1.1190) acc1: 72.0000 (75.1484) acc5: 91.2000 (91.7936) time: 1.1145 data: 1.0854 max mem: 6925
Test: [40/50] eta: 0:00:12 loss: 1.1348 (1.1197) acc1: 72.0000 (75.1024) acc5: 92.0000 (91.7659) time: 0.8409 data: 0.8110 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.0995 (1.1194) acc1: 72.8000 (74.9280) acc5: 92.0000 (91.7440) time: 0.7743 data: 0.7440 max mem: 6925
Test: Total time: 0:00:52 (1.0412 s / it)
* Acc@1 75.478 Acc@5 92.318 loss 1.081
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [297] [ 0/625] eta: 3:32:31 lr: 0.000002 min_lr: 0.000002 loss: 2.2634 (2.2634) class_acc: 0.6914 (0.6914) weight_decay: 0.0500 (0.0500) time: 20.4021 data: 16.9738 max mem: 6925
Epoch: [297] [200/625] eta: 0:14:29 lr: 0.000002 min_lr: 0.000002 loss: 2.1574 (2.1511) class_acc: 0.7227 (0.7258) weight_decay: 0.0500 (0.0500) grad_norm: 1.3204 (1.3893) time: 2.1157 data: 0.1131 max mem: 6925
Epoch: [297] [400/625] eta: 0:07:32 lr: 0.000002 min_lr: 0.000002 loss: 2.1584 (2.1548) class_acc: 0.7227 (0.7233) weight_decay: 0.0500 (0.0500) grad_norm: 1.3956 (1.3980) time: 1.9370 data: 0.0013 max mem: 6925
Epoch: [297] [600/625] eta: 0:00:50 lr: 0.000002 min_lr: 0.000002 loss: 2.1304 (2.1542) class_acc: 0.7305 (0.7239) weight_decay: 0.0500 (0.0500) grad_norm: 1.3758 (1.3929) time: 2.0350 data: 0.0017 max mem: 6925
Epoch: [297] [624/625] eta: 0:00:01 lr: 0.000002 min_lr: 0.000002 loss: 2.1437 (2.1546) class_acc: 0.7109 (0.7236) weight_decay: 0.0500 (0.0500) grad_norm: 1.3087 (1.3896) time: 0.7499 data: 0.0017 max mem: 6925
Epoch: [297] Total time: 0:20:20 (1.9533 s / it)
Averaged stats: lr: 0.000002 min_lr: 0.000002 loss: 2.1437 (2.1592) class_acc: 0.7109 (0.7222) weight_decay: 0.0500 (0.0500) grad_norm: 1.3087 (1.3896)
Test: [ 0/50] eta: 0:10:35 loss: 0.9542 (0.9542) acc1: 78.4000 (78.4000) acc5: 92.8000 (92.8000) time: 12.7127 data: 12.6758 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 0.9250 (0.9445) acc1: 78.4000 (79.5636) acc5: 92.8000 (93.0182) time: 2.2709 data: 2.2400 max mem: 6925
Test: [20/50] eta: 0:00:52 loss: 1.0620 (1.0758) acc1: 76.8000 (76.0381) acc5: 92.0000 (92.0381) time: 1.2074 data: 1.1770 max mem: 6925
Test: [30/50] eta: 0:00:28 loss: 1.1709 (1.1125) acc1: 72.0000 (74.9936) acc5: 91.2000 (91.7419) time: 0.9942 data: 0.9633 max mem: 6925
Test: [40/50] eta: 0:00:11 loss: 1.1144 (1.1135) acc1: 72.0000 (74.9659) acc5: 91.2000 (91.7268) time: 0.5726 data: 0.5423 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.1009 (1.1135) acc1: 73.6000 (74.8640) acc5: 91.2000 (91.6480) time: 0.4931 data: 0.4639 max mem: 6925
Test: Total time: 0:00:49 (0.9817 s / it)
* Acc@1 75.490 Acc@5 92.336 loss 1.075
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [298] [ 0/625] eta: 3:51:27 lr: 0.000002 min_lr: 0.000002 loss: 2.0688 (2.0688) class_acc: 0.7422 (0.7422) weight_decay: 0.0500 (0.0500) time: 22.2199 data: 18.8956 max mem: 6925
Epoch: [298] [200/625] eta: 0:14:19 lr: 0.000001 min_lr: 0.000001 loss: 2.1380 (2.1622) class_acc: 0.7148 (0.7221) weight_decay: 0.0500 (0.0500) grad_norm: 1.3825 (1.3869) time: 1.7920 data: 0.0012 max mem: 6925
Epoch: [298] [400/625] eta: 0:07:26 lr: 0.000001 min_lr: 0.000001 loss: 2.1374 (2.1577) class_acc: 0.7227 (0.7227) weight_decay: 0.0500 (0.0500) grad_norm: 1.3837 (1.3894) time: 1.9199 data: 0.0014 max mem: 6925
Epoch: [298] [600/625] eta: 0:00:50 lr: 0.000001 min_lr: 0.000001 loss: 2.1450 (2.1619) class_acc: 0.7227 (0.7219) weight_decay: 0.0500 (0.0500) grad_norm: 1.4079 (1.3945) time: 2.0190 data: 0.0014 max mem: 6925
Epoch: [298] [624/625] eta: 0:00:01 lr: 0.000001 min_lr: 0.000001 loss: 2.1410 (2.1624) class_acc: 0.7266 (0.7218) weight_decay: 0.0500 (0.0500) grad_norm: 1.2803 (1.3920) time: 1.1389 data: 0.0020 max mem: 6925
Epoch: [298] Total time: 0:20:29 (1.9674 s / it)
Averaged stats: lr: 0.000001 min_lr: 0.000001 loss: 2.1410 (2.1617) class_acc: 0.7266 (0.7223) weight_decay: 0.0500 (0.0500) grad_norm: 1.2803 (1.3920)
Test: [ 0/50] eta: 0:10:44 loss: 0.9752 (0.9752) acc1: 76.8000 (76.8000) acc5: 92.8000 (92.8000) time: 12.8876 data: 12.8515 max mem: 6925
Test: [10/50] eta: 0:01:30 loss: 0.9363 (0.9516) acc1: 78.4000 (79.4182) acc5: 92.8000 (93.1636) time: 2.2735 data: 2.2428 max mem: 6925
Test: [20/50] eta: 0:00:54 loss: 1.0632 (1.0827) acc1: 77.6000 (76.0000) acc5: 92.0000 (92.1905) time: 1.2713 data: 1.2416 max mem: 6925
Test: [30/50] eta: 0:00:32 loss: 1.1714 (1.1198) acc1: 72.8000 (74.9419) acc5: 91.2000 (91.8968) time: 1.2632 data: 1.2342 max mem: 6925
Test: [40/50] eta: 0:00:13 loss: 1.1271 (1.1211) acc1: 72.8000 (74.9659) acc5: 92.0000 (91.8439) time: 0.8652 data: 0.8360 max mem: 6925
Test: [49/50] eta: 0:00:01 loss: 1.1102 (1.1210) acc1: 73.6000 (74.8640) acc5: 92.0000 (91.8240) time: 0.7394 data: 0.7095 max mem: 6925
Test: Total time: 0:00:56 (1.1318 s / it)
* Acc@1 75.482 Acc@5 92.378 loss 1.082
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Epoch: [299] [ 0/625] eta: 3:31:23 lr: 0.000001 min_lr: 0.000001 loss: 2.1946 (2.1946) class_acc: 0.7148 (0.7148) weight_decay: 0.0500 (0.0500) time: 20.2937 data: 18.7209 max mem: 6925
Epoch: [299] [200/625] eta: 0:14:16 lr: 0.000001 min_lr: 0.000001 loss: 2.1638 (2.1567) class_acc: 0.7148 (0.7244) weight_decay: 0.0500 (0.0500) grad_norm: 1.4463 (1.3813) time: 1.9189 data: 0.0007 max mem: 6925
Epoch: [299] [400/625] eta: 0:07:30 lr: 0.000001 min_lr: 0.000001 loss: 2.2076 (2.1576) class_acc: 0.7148 (0.7241) weight_decay: 0.0500 (0.0500) grad_norm: 1.3523 (1.3987) time: 1.8133 data: 0.0012 max mem: 6925
Epoch: [299] [600/625] eta: 0:00:49 lr: 0.000001 min_lr: 0.000001 loss: 2.1015 (2.1553) class_acc: 0.7383 (0.7241) weight_decay: 0.0500 (0.0500) grad_norm: 1.3833 (1.4029) time: 2.0819 data: 0.0009 max mem: 6925
Epoch: [299] [624/625] eta: 0:00:01 lr: 0.000001 min_lr: 0.000001 loss: 2.1572 (2.1560) class_acc: 0.7227 (0.7240) weight_decay: 0.0500 (0.0500) grad_norm: 1.3833 (1.4012) time: 0.9351 data: 0.0015 max mem: 6925
Epoch: [299] Total time: 0:20:21 (1.9552 s / it)
Averaged stats: lr: 0.000001 min_lr: 0.000001 loss: 2.1572 (2.1591) class_acc: 0.7227 (0.7231) weight_decay: 0.0500 (0.0500) grad_norm: 1.3833 (1.4012)
Test: [ 0/50] eta: 0:09:42 loss: 0.9730 (0.9730) acc1: 77.6000 (77.6000) acc5: 92.0000 (92.0000) time: 11.6569 data: 11.6128 max mem: 6925
Test: [10/50] eta: 0:01:15 loss: 0.9348 (0.9518) acc1: 78.4000 (79.6364) acc5: 92.8000 (92.8727) time: 1.8844 data: 1.8520 max mem: 6925
Test: [20/50] eta: 0:00:43 loss: 1.0599 (1.0825) acc1: 77.6000 (76.0381) acc5: 92.0000 (92.0000) time: 0.9444 data: 0.9146 max mem: 6925
Test: [30/50] eta: 0:00:24 loss: 1.1810 (1.1190) acc1: 72.8000 (75.0452) acc5: 91.2000 (91.7677) time: 0.8308 data: 0.8023 max mem: 6925
Test: [40/50] eta: 0:00:10 loss: 1.1315 (1.1202) acc1: 72.8000 (75.0634) acc5: 91.2000 (91.7268) time: 0.5963 data: 0.5674 max mem: 6925
Test: [49/50] eta: 0:00:00 loss: 1.0934 (1.1196) acc1: 73.6000 (74.9440) acc5: 91.2000 (91.6960) time: 0.3964 data: 0.3677 max mem: 6925
Test: Total time: 0:00:45 (0.9085 s / it)
* Acc@1 75.502 Acc@5 92.332 loss 1.081
Accuracy of the model on the 50000 test images: 75.5%
Max accuracy: 75.57%
Training time 4 days, 9:27:45