Introduction to Chainer

Posted by 徐志平 on December 14, 2017

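This is the first part of the Chainer tutorial. In this part, you will learn the following:

  • The pros and cons of existing frameworks, and why we are developing Chainer
  • Simple examples of forward and backward computation
  • How to use links and compute their gradients
  • How to construct chains (i.e. what most frameworks call "models")
  • Parameter optimization
  • Serialization of links and optimizers

After reading this part, you will be able to:

  • Compute the gradients of some arithmetic expressions
  • Write a multi-layer perceptron with Chainer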

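Core Concept

Chainer is a flexible framework for neural networks. Flexibility is our main goal: it must let us write complex architectures simply and intuitively.

Most existing deep learning frameworks are based on the "Define-and-Run" scheme. That is, a network is first defined and fixed, and the user then repeatedly feeds it mini-batches for training. Since the network is statically defined before any forward or backward computation, all of the logic must be embedded into the network architecture as data. Consequently, defining a network architecture in such systems (e.g. Caffe) follows a declarative approach. (Note: a static network definition can still be produced with imperative code, as in torch.nn, Theano-based frameworks, and TensorFlow.)

Define-by-Run

Chainer instead adopts a "Define-by-Run" scheme, i.e. the network is defined on the fly by the actual forward computation. More precisely, Chainer stores the history of computation rather than the programming logic. This strategy lets us fully leverage the power of programming logic in Python. For example, Chainer needs no magic to introduce conditionals and loops into network definitions. Define-by-Run is the core concept of Chainer. This tutorial shows how to define networks dynamically.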
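
To make "storing the history of computation" concrete, here is a minimal sketch in plain Python. The Var class and the power_until function are hypothetical names invented for this illustration; they are not Chainer's implementation. The graph is recorded while an ordinary Python while loop runs, so its shape depends on the data.

```python
class Var:
    """Scalar that records how it was computed (illustrative, not chainer.Variable)."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents      # pairs of (input Var, local derivative)
        self.grad = 0.0

    def __mul__(self, other):
        # The graph edge is recorded at the moment the multiplication runs.
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Order the recorded nodes so each one is processed after its consumers.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


def power_until(x, limit):
    """Multiply by x until the value reaches limit: an ordinary Python loop."""
    y, steps = x, 0
    while y.value < limit:          # data-dependent control flow
        y = y * x
        steps += 1
    return y, steps


x = Var(3.0)
y, steps = power_until(x, 20.0)
y.backward()
print(y.value, steps, x.grad)       # 27.0 2 27.0
```

With x = 3 the loop runs twice, so the recorded computation is x * x * x and the gradient is 3x^2 = 27. A different input would record a different graph without any change to the code.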

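This strategy also makes it easy to write multi-GPU parallelization, since the logic comes closer to network manipulation. We will review such amenities in later sections of this tutorial.

Chainer represents a network as an execution path on a computational graph. A computational graph is a series of function applications, so it can be described with multiple Function objects. When such a function is a layer of a neural network, its parameters are updated through training. The function therefore needs to keep trainable parameters inside it, so Chainer has the Link class, whose objects can hold trainable parameters. The parameters of the function performed inside a Link object are represented as Variable objects. In short, the difference between Link and Function is whether the object contains trainable parameters. A neural network model is typically described as a series of Functions and Links.

You can build a computational graph by dynamically "chaining" various kinds of Links and Functions to define a Chain. In this framework, a network is defined by running the chained graph, hence the name Chainer.

In the example code of this tutorial, we assume for simplicity that the following symbols have already been imported: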

import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, report, training, utils, Variable
from chainer import datasets, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions

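These imports appear widely in Chainer code and examples. For simplicity, we omit them in the rest of this tutorial.

Forward/Backward Computation

As described above, Chainer uses the "Define-by-Run" scheme, so the forward computation itself defines the network. To start a forward computation, we have to wrap the input array in a Variable object. Here we start with a simple ndarray that has only one element: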

x_data = np.array([5], dtype=np.float32)
x = Variable(x_data)

A Variable object supports basic arithmetic operators. To compute $y = x^2 - 2x + 1$, just write:

y = x**2 - 2 * x + 1

The resulting y is also a Variable object, whose value can be extracted by accessing the data attribute:

y.data
array([ 16.], dtype=float32)

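y holds more than the resulting value. It also holds the history of the computation (i.e. the computational graph), which makes it possible to compute its derivative. This is done by calling its backward() method: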

y.backward()

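This runs error backpropagation (a.k.a. backprop or reverse-mode automatic differentiation). The gradient is then computed and stored in the grad attribute of the input variable x: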

x.grad
array([ 8.], dtype=float32)
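
The value 8 can be cross-checked without Chainer. The following sketch compares the analytic derivative dy/dx = 2x - 2 with a central-difference approximation at x = 5. The helper name numerical_grad and the step size eps are choices made for this illustration only.

```python
def f(x):
    """The polynomial from the text: y = x^2 - 2x + 1."""
    return x ** 2 - 2 * x + 1


def numerical_grad(func, x, eps=1e-4):
    """Central-difference approximation of d func / d x."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


y_value = f(5.0)
approx_grad = numerical_grad(f, 5.0)
analytic_grad = 2 * 5.0 - 2          # dy/dx = 2x - 2
print(y_value, analytic_grad)        # 16.0 8.0
print(abs(approx_grad - analytic_grad) < 1e-6)   # True
```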

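We can also compute the gradients of intermediate variables. Note that, by default, Chainer releases the gradient arrays of intermediate variables for memory efficiency. To preserve gradient information, pass the retain_grad argument to the backward method: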

z = 2*x
y = x**2 - z + 1
y.backward(retain_grad=True)
z.grad
array([-1.], dtype=float32)

Otherwise, z.grad will be None, as follows:

z = 2*x
y = x**2 - z + 1
y.backward()
z.grad
z.grad is None
True

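All of these computations generalize easily to multi-element array input. Note that if we want to start a backward computation from a variable holding a multi-element array, we must set the initial error manually. When the size of a variable (the number of elements in its array) is 1, it is regarded as a variable representing a loss value, so its grad attribute is automatically filled with 1. When the size of a variable is larger than 1, the grad attribute remains None, and the initial error must be set explicitly before running backward(). This is done simply by setting the grad attribute of the output variable, as follows: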

x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
y = x**2 - 2*x + 1
y.grad = np.ones((2, 3), dtype=np.float32)
y.backward()
x.grad
array([[  0.,   2.,   4.],
       [  6.,   8.,  10.]], dtype=float32)
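
The role of the initial error can be reproduced by hand. In this plain-Python sketch, each element of the input gradient is the local derivative 2x - 2 multiplied by the matching element of the initial error gy. The variable names are chosen for this illustration.

```python
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
gy = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]      # initial error, same shape as y

# dy/dx = 2x - 2 element-wise, scaled by the incoming error gy.
gx = [[g * (2 * v - 2) for v, g in zip(x_row, g_row)]
      for x_row, g_row in zip(x, gy)]
print(gx)   # [[0.0, 2.0, 4.0], [6.0, 8.0, 10.0]]
```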

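Many functions that take Variable objects are defined in the functions module. You can combine them to build complicated functions with automatic backward computation.

Links

To write neural networks, we have to combine functions with parameters and optimize those parameters. You can use links to do this. A Link is an object that holds parameters (i.e. optimization targets).

The most fundamental links behave like regular functions whose arguments are partly replaced by parameters. We will introduce higher-level links later, but for now think of a link simply as a function with parameters.

One of the most frequently used links is the Linear link (a.k.a. fully connected layer or affine transformation). It represents the mathematical function $f(x) = Wx + b$, where the matrix W and the vector b are parameters. This link corresponds to the function linear(), which accepts x, W, and b as arguments. A linear link from three-dimensional space to two-dimensional space is defined by the following line: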

f = L.Linear(3, 2)

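Most functions and links accept only mini-batch input, where the first dimension of the input array is treated as the batch dimension. For the Linear link above, the input must have shape (N, 3), where N is the mini-batch size.

The parameters of a link are stored as attributes. Each parameter is an instance of Variable. The Linear link stores two parameters, W and b. By default, the matrix W is initialized randomly, while the vector b is initialized with zeros.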

f.W.data
array([[ 0.19792122,  0.29951876, -0.31833425],
       [-0.59501284, -0.65519476, -0.00605371]], dtype=float32)
f.b.data
array([ 0.,  0.], dtype=float32)

An instance of the Linear link acts like an ordinary function:

x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
y = f(x)
y.data
array([[-0.15804404, -1.9235636 ],
       [ 0.37927318, -5.69234705]], dtype=float32)
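
The output above can be reproduced by applying $f(x) = Wx + b$ to each row of the mini-batch by hand. This plain-Python sketch copies the W and b values printed earlier. The function name linear is reused only for illustration and is not Chainer's implementation.

```python
W = [[0.19792122, 0.29951876, -0.31833425],
     [-0.59501284, -0.65519476, -0.00605371]]
b = [0.0, 0.0]


def linear(batch, W, b):
    """Apply f(x) = Wx + b to every row of a mini-batch of shape (N, 3)."""
    return [[sum(w * v for w, v in zip(w_row, x)) + bias
             for w_row, bias in zip(W, b)]
            for x in batch]


batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
y = linear(batch, W, b)
for row in y:
    print([round(v, 5) for v in row])
```

The printed rows agree with y.data above up to float32 rounding.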

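Sometimes it is cumbersome to work out the dimension of the input space. The Linear link and some of the (de)convolution links let you omit the input dimension at instantiation and infer it from the first mini-batch.

For example, the following line creates a linear link whose output dimension is two: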

g = L.Linear(2)

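If we feed it a mini-batch of shape (N, M), the input dimension is inferred as M, which means g.W will be a 2 x M matrix. Note that its parameters are initialized lazily at the first mini-batch. Therefore, g does not hold an initialized W until data has been passed through the link.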
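
The idea of lazy initialization can be sketched in a few lines of plain Python. LazyLinear, its constructor signature, and the Gaussian initialization are all assumptions made for this illustration; they do not describe how L.Linear is implemented.

```python
import random


class LazyLinear:
    """Hypothetical affine layer that defers creating W until the first batch."""

    def __init__(self, out_size):
        self.out_size = out_size
        self.W = None                      # unknown until the input size is seen
        self.b = [0.0] * out_size

    def _init_params(self, in_size):
        scale = 1.0 / in_size ** 0.5
        self.W = [[random.gauss(0.0, scale) for _ in range(in_size)]
                  for _ in range(self.out_size)]

    def __call__(self, batch):
        if self.W is None:
            self._init_params(len(batch[0]))   # infer M from shape (N, M)
        return [[sum(w * v for w, v in zip(w_row, x)) + bias
                 for w_row, bias in zip(self.W, self.b)]
                for x in batch]


g = LazyLinear(2)
before = g.W                               # None: no data seen yet
out = g([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
shape_after = (len(g.W), len(g.W[0]))
print(before, shape_after)                 # None (2, 3)
```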

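Gradients of parameters are computed by the backward() method. Note that this method accumulates gradients rather than overwriting them. You must therefore clear the gradients first to start a fresh computation, which is done by calling the cleargrads() method.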

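x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
g = L.Linear(2)
p = g(x)
p
variable([[-2.64461255,  2.90179563],
          [-6.81166267,  4.94405651]])
g.cleargrads()
# set the initial error on the output p (not on the link g), then backpropagate
p.grad = np.ones((2, 2), dtype=np.float32)
p.backward()
g.W.grad
array([[ 5.,  7.,  9.],
       [ 5.,  7.,  9.]], dtype=float32)
g.b.grad
array([ 2.,  2.], dtype=float32)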
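
The difference between accumulating and overwriting can be worked out by hand for the same mini-batch. This sketch computes the gradients of $f(x) = Wx + b$ in plain Python, with an initial error of ones. The helper names linear_backward and accumulate are hypothetical, and None stands in for a cleared gradient.

```python
def linear_backward(batch, gy):
    """Gradients of f(x) = Wx + b with respect to W and b for one mini-batch."""
    n_out, n_in = len(gy[0]), len(batch[0])
    gW = [[sum(g[i] * x[j] for g, x in zip(gy, batch)) for j in range(n_in)]
          for i in range(n_out)]
    gb = [sum(g[i] for g in gy) for i in range(n_out)]
    return gW, gb


def accumulate(total, new):
    """Add new to total element-wise; a cleared gradient is represented by None."""
    if total is None:
        return [row[:] for row in new]
    return [[t + n for t, n in zip(t_row, n_row)]
            for t_row, n_row in zip(total, new)]


batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
gy = [[1.0, 1.0], [1.0, 1.0]]            # initial error of ones, shape (2, 2)

gW, gb = linear_backward(batch, gy)
once = accumulate(None, gW)              # backward after clearing
twice = accumulate(once, gW)             # second backward without clearing
print(once)    # [[5.0, 7.0, 9.0], [5.0, 7.0, 9.0]]
print(twice)   # [[10.0, 14.0, 18.0], [10.0, 14.0, 18.0]]
print(gb)      # [2.0, 2.0]
```

Running backward twice without clearing doubles the stored gradient, which is why the gradients are cleared before each new computation.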

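Write a Model as a Chain

Most neural network architectures contain multiple links. For example, a multi-layer perceptron consists of multiple linear layers. We can write complex procedures with trainable parameters by combining multiple links: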

l1 = L.Linear(4, 3)
l2 = L.Linear(3, 2)

def my_forward(x):
    h = l1(x)
    return l2(h)

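Here L refers to the links module. A procedure whose parameters are defined in this way is hard to reuse. A more Pythonic way is to combine the links and the procedure into a class: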

class MyProc(object):
    def __init__(self):
        self.l1 = L.Linear(4, 3)
        self.l2 = L.Linear(3, 2)

    def forward(self, x):
        h = self.l1(x)
        return self.l2(h)

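To make it more reusable, we want support for parameter management, CPU/GPU migration, robust and flexible save/load features, and so on. All of these features are provided by the Chain class in Chainer. All we have to do is define the class above as a subclass of Chain: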

class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(4, 3)
            self.l2 = L.Linear(3, 2)
            
    def __call__(self, x):
        h = self.l1(x)
        return self.l2(h)

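This shows how a complex chain is constructed from simpler links. Links such as l1 and l2 are called child links of MyChain. Note that Chain itself inherits from Link. This means we can define more complex chains that hold MyChain objects as their child links.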
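
One benefit of such a hierarchy is that every parameter can be reached from the top-level object. The following plain-Python sketch shows the idea with a hypothetical SimpleLink container. The class, its methods, and the path format are illustrative assumptions, not Chainer's Chain implementation.

```python
class SimpleLink:
    """Illustrative container: owns parameters and named child links."""

    def __init__(self, **params):
        self.params = params
        self.children = {}

    def add_child(self, name, link):
        self.children[name] = link

    def namedparams(self, prefix=""):
        # Own parameters first, then every child's, recursively.
        for name in sorted(self.params):
            yield prefix + "/" + name, self.params[name]
        for name in sorted(self.children):
            yield from self.children[name].namedparams(prefix + "/" + name)


# Parameter values are just shape tuples here.
mlp = SimpleLink()
mlp.add_child("l1", SimpleLink(W=(3, 4), b=(3,)))
mlp.add_child("l2", SimpleLink(W=(2, 3), b=(2,)))

outer = SimpleLink()                     # a container holding another container
outer.add_child("body", mlp)
outer.add_child("head", SimpleLink(W=(1, 2), b=(1,)))

paths = [path for path, _ in outer.namedparams()]
print(paths)
```

Because the traversal is recursive, features such as saving, loading, or clearing gradients only need to be written once at the container level.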

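We often define the single forward method of a link with the __call__ operator. Such links and chains are callable and behave like regular functions of Variables.

Another way to define a chain is to use the ChainList class, which behaves like a list of links: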

class MyChain2(ChainList):
    def __init__(self):
        super(MyChain2, self).__init__(
            L.Linear(4, 3),
            L.Linear(3, 2),
        )

    def __call__(self, x):
        h = self[0](x)
        return self[1](h)

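ChainList is convenient for handling an arbitrary number of links. If the number of links is fixed, as in the case above, the Chain class is recommended as the base class.

Optimizer

To obtain good parameter values, we have to optimize them with the Optimizer class, which runs a numerical optimization algorithm on a given link. Many algorithms are implemented in the optimizers module. Here we use the simplest one, Stochastic Gradient Descent (SGD):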

model = MyChain()
optimizer = optimizers.SGD()
optimizer.setup(model)

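The setup() method prepares the optimizer for the given link.

Some parameter and gradient manipulations, such as weight decay and gradient clipping, can be done by registering hook functions with the optimizer. Hook functions are called after the gradient computation and right before the actual parameter update. For example, we can enable weight decay regularization by running the following line beforehand: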

optimizer.add_hook(chainer.optimizer.WeightDecay(0.0005))

Of course, you can write your own hook functions. A hook should be a function or a callable object that takes the optimizer as its argument.
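
The timing of a hook (after the gradients exist, before the parameters change) can be sketched in plain Python. TinySGD and weight_decay are hypothetical names for this illustration, operating on a dict of scalar parameters. They mimic the idea only and are not Chainer's optimizer classes.

```python
class TinySGD:
    """Hypothetical SGD optimizer over dicts of scalar parameters and gradients."""

    def __init__(self, lr=0.01):
        self.lr = lr
        self.hooks = []
        self.t = 0                   # number of updates performed

    def setup(self, params, grads):
        self.params, self.grads = params, grads

    def add_hook(self, hook):
        self.hooks.append(hook)

    def update(self):
        for hook in self.hooks:      # hooks see the gradients before they are used
            hook(self)
        for name in self.params:
            self.params[name] -= self.lr * self.grads[name]
        self.t += 1


def weight_decay(rate):
    """Return a hook that adds rate * parameter to each gradient."""
    def hook(opt):
        for name in opt.params:
            opt.grads[name] += rate * opt.params[name]
    return hook


params = {"w": 1.0}
grads = {"w": 0.5}                   # pretend backward() produced this gradient
opt = TinySGD(lr=0.1)
opt.setup(params, grads)
opt.add_hook(weight_decay(0.0005))
opt.update()
print(round(params["w"], 5))         # 0.94995
```

The hook changes the gradient from 0.5 to 0.5005, so the update moves w from 1.0 to 1.0 - 0.1 * 0.5005.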

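There are two ways to use an optimizer. One is to use it through a Trainer, which we will see in the following sections. The other is to use it directly, which we review here. If you only want the simple way of using an optimizer, skip this section and go to the next one.

There are in turn two ways to use an optimizer directly. One is to compute the gradients manually and then call the update() method with no arguments. Do not forget to clear the gradients beforehand!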

x = np.random.uniform(-1, 1, (2, 4)).astype('f')
model.cleargrads()
# compute gradient here...
loss = F.sum(model(chainer.Variable(x)))
loss.backward()
optimizer.update()

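The other way is to pass a loss function to the update() method. In this case, cleargrads() is called automatically by the update method, so the user does not have to call it manually.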

def lossfun(arg1, arg2):
    # calculate loss
    loss = F.sum(model(arg1 - arg2))
    return loss
arg1 = np.random.uniform(-1, 1, (2, 4)).astype('f')
arg2 = np.random.uniform(-1, 1, (2, 4)).astype('f')
optimizer.update(lossfun, chainer.Variable(arg1), chainer.Variable(arg2))

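Trainer

When we want to train a neural network, we have to run a training loop that updates the parameters many times. A typical training loop consists of the following steps:

  1. Iterating over the training dataset
  2. Preprocessing the extracted mini-batches
  3. Forward/backward computation of the neural network
  4. Parameter updates
  5. Evaluating the current parameters on a validation dataset
  6. Logging and printing intermediate results

Chainer provides a simple yet powerful way to write such training processes. The training loop abstraction consists mainly of two components:

  • Dataset abstraction. It implements steps 1 and 2 in the list above. The core components are defined in the dataset module. Many implementations of datasets and iterators are also available in the datasets and iterators modules.

  • Trainer. It implements steps 3, 4, 5, and 6 in the list above. The whole procedure is executed by the Trainer. The way parameters are updated (3 and 4) is defined by the Updater, which can be freely customized. Steps 5 and 6 are implemented by instances of Extension, which attach extra procedures to the training loop. Users can freely customize the training procedure by adding extensions, and can also implement their own.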
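
For reference, the six steps listed in this section can be written out by hand. The sketch below fits a single weight w so that w * x approximates 2 * x, using plain Python only. The dataset, learning rate, and batch size are arbitrary choices for this illustration, and the code does not use Chainer's Trainer.

```python
import random

random.seed(0)
train = [(i / 10.0, 2.0 * i / 10.0) for i in range(-20, 21)]   # pairs (x, t)
valid = [(x, 2.0 * x) for x in (-1.5, 0.5, 1.5)]
w, lr, batch_size = 0.0, 0.1, 8
log = []

for epoch in range(1, 6):
    random.shuffle(train)                                  # 1. iterate over the data
    for start in range(0, len(train), batch_size):
        batch = train[start:start + batch_size]            # 2. extract a mini-batch
        # 3. forward/backward: gradient of the mean squared error w.r.t. w
        grad = sum(2 * (w * x - t) * x for x, t in batch) / len(batch)
        w -= lr * grad                                     # 4. update the parameter
    val_loss = sum((w * x - t) ** 2 for x, t in valid) / len(valid)  # 5. evaluate
    log.append((epoch, val_loss))                          # 6. record and print
    print(epoch, val_loss)
```

Every line of this loop is a customization point, which is what the Updater and Extension objects expose in a structured way.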

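Serializer

Before proceeding to the first example, we introduce Serializer, the last core feature described on this page. A serializer is a simple interface for serializing or deserializing an object. Links, optimizers, and trainers all support serialization.

Concrete serializers are defined in the serializers module. It supports the NumPy NPZ and HDF5 formats.

For example, we can serialize a link object into an NPZ file with the serializers.save_npz() function: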

serializers.save_npz('my.model', model)

This saves the parameters of model into the file "my.model" in NPZ format. The saved model can be read back with the serializers.load_npz() function:

serializers.load_npz('my.model', model)

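Note that only the parameters and the persistent values are serialized by this code. Other attributes are not saved automatically. You can register arrays, scalars, or any serializable objects as persistent values with the Link.add_persistent() method. A registered value can be accessed as an attribute under the name passed to add_persistent.

The state of an optimizer can also be saved with the same functions: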

serializers.save_npz('my.state', optimizer)
serializers.load_npz('my.state', optimizer)

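Note that serializing an optimizer only saves its internal state, including the number of iterations, the momentum vectors of MomentumSGD, and so on. It does not save the parameters and persistent values of the target link. To resume optimization from a saved state, we have to save the target link explicitly along with the optimizer.

The HDF5 format is supported if the h5py package is installed. Serialization and deserialization in HDF5 format are almost identical to those in NPZ format; just replace save_npz() and load_npz() with save_hdf5() and load_hdf5(), respectively.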
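
The point that the model and the optimizer state are saved separately, and loaded into objects that already exist, can be illustrated without Chainer. This sketch swaps in JSON and plain dicts instead of NPZ files and real links. The helper names and the dict keys are hypothetical.

```python
import json
import os
import tempfile


def save_json(path, obj):
    with open(path, "w") as f:
        json.dump(obj, f)


def load_json_into(path, target):
    """Fill an already constructed target dict from the file, in place."""
    with open(path) as f:
        target.update(json.load(f))


# Hypothetical stand-ins for a link's parameters and an optimizer's state.
model = {"l1/W": [[0.1, -0.2]], "l1/b": [0.0]}
optimizer = {"t": 120, "momentum/l1/W": [[0.01, 0.0]]}

with tempfile.TemporaryDirectory() as tmp:
    save_json(os.path.join(tmp, "my.model"), model)
    save_json(os.path.join(tmp, "my.state"), optimizer)

    # Resuming needs both files: the state file alone carries no parameters.
    new_model = {"l1/W": [[0.0, 0.0]], "l1/b": [0.0]}
    new_optimizer = {"t": 0, "momentum/l1/W": [[0.0, 0.0]]}
    load_json_into(os.path.join(tmp, "my.model"), new_model)
    load_json_into(os.path.join(tmp, "my.state"), new_optimizer)

print(new_model == model, new_optimizer["t"])   # True 120
```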

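Example: Multi-layer Perceptron on MNIST

Now you can solve a multiclass classification task with a multi-layer perceptron (MLP). We use the handwritten digit dataset MNIST, which has long served as a de facto "hello world" example of machine learning. This MNIST example can also be found in the examples/mnist directory of the official repository. In this section we show how to use a Trainer to construct and run the training loop.

We first have to prepare the MNIST dataset. It consists of 70,000 grayscale images of size 28x28 (i.e. 784 pixels) and the corresponding digit labels. By default, the dataset is divided into 60,000 training images and 10,000 test images. We can obtain the vectorized version (i.e. a set of 784-dimensional vectors) with datasets.get_mnist().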

train, test = datasets.get_mnist()

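This code automatically downloads the MNIST dataset and saves the NumPy arrays to the $(HOME)/.chainer directory. The returned training and test sets can be regarded as lists of image-label pairs (strictly speaking, they are instances of TupleDataset).

We also have to define how to iterate over these datasets. We want to reshuffle the training dataset at the start of every epoch, i.e. at the start of every sweep over the dataset. In this case, we can use iterators.SerialIterator: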

train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)

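On the other hand, we do not have to shuffle the test dataset. In this case, we can pass shuffle=False to disable shuffling. This makes iteration faster when the underlying dataset supports fast slicing.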

test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)

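Setting repeat=False stops the iteration once all examples have been visited. This option is usually required for test/validation datasets; without it, the iteration enters an infinite loop.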
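
The effect of the shuffle and repeat options can be sketched with a plain Python generator. serial_batches is a hypothetical helper for this illustration. It simplifies the behaviour at epoch boundaries and is not Chainer's SerialIterator.

```python
import random
from itertools import islice


def serial_batches(dataset, batch_size, repeat=True, shuffle=True, seed=None):
    """Yield mini-batches in order; reshuffle at the start of every epoch."""
    rng = random.Random(seed)
    while True:
        order = list(range(len(dataset)))
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            yield [dataset[i] for i in order[start:start + batch_size]]
        if not repeat:
            return                       # one sweep only, then stop


data = list(range(10))

# Test-style iteration: fixed order, stops after one sweep.
test_batches = list(serial_batches(data, 4, repeat=False, shuffle=False))
print(test_batches)                      # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# Training-style iteration never ends on its own, so take five batches.
train_batches = list(islice(serial_batches(data, 4, seed=0), 5))
print(len(train_batches))                # 5
```

Calling list() on the repeating generator without islice would never return, which is the infinite loop mentioned above.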

Next, we define the architecture. We use a simple three-layer network with 100 units in each hidden layer.

class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # the size of the inputs to each layer will be inferred
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)    # n_units -> n_out

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

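This link uses relu() as its activation function. Note that the l3 link is the final fully connected layer, whose output corresponds to the scores for the ten digits.

To compute loss values and evaluate the accuracy of the predictions, we define a classifier chain on top of the MLP chain above: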

class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__()
        with self.init_scope():
            self.predictor = predictor

    def __call__(self, x, t):
        y = self.predictor(x)
        loss = F.softmax_cross_entropy(y, t)
        accuracy = F.accuracy(y, t)
        report({'loss': loss, 'accuracy': accuracy}, self)
        return loss

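This Classifier class computes the accuracy and the loss, and returns the loss value. The pair of arguments x and t corresponds to each example in the dataset (a tuple of an image and a label). softmax_cross_entropy() computes the loss value given the predictions and the ground-truth labels. accuracy() computes the prediction accuracy. We can set an arbitrary predictor link for an instance of the classifier.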
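
What these two quantities measure can be written out in plain Python. The functions below reuse the names softmax_cross_entropy and accuracy purely for illustration. They assume the loss is averaged over the mini-batch and take lists of scores and integer labels, and they are not Chainer's implementations.

```python
import math


def softmax_cross_entropy(scores, labels):
    """Mean negative log-probability of the true labels under softmax(scores)."""
    total = 0.0
    for row, t in zip(scores, labels):
        m = max(row)                               # subtract the max for stability
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_z - row[t]
    return total / len(labels)


def accuracy(scores, labels):
    """Fraction of examples whose highest score is at the true label."""
    hits = sum(1 for row, t in zip(scores, labels) if row.index(max(row)) == t)
    return hits / len(labels)


scores = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]        # two examples, three classes
labels = [0, 2]
loss = softmax_cross_entropy(scores, labels)
acc = accuracy(scores, labels)
print(loss, acc)
```

The first example is classified correctly and the second is not, so the accuracy is 0.5, while the loss also reflects how confident each prediction was.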

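The report() function reports the loss and accuracy values to the trainer. For the specific mechanism of collecting training statistics, see Reporter. You can also collect other types of observations, such as activation statistics, in a similar way.

Note that a class similar to the Classifier above is defined as chainer.links.Classifier. We will therefore use this predefined Classifier chain instead of the example above.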

model = L.Classifier(MLP(100, 10))  # the input size, 784, is inferred
optimizer = optimizers.SGD()
optimizer.setup(model)

Now we can build a trainer object.

updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')

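The second argument, (20, 'epoch'), represents the duration of training. We can use either epoch or iteration as the unit. In this case, we train the multi-layer perceptron by iterating over the training set 20 times.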
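
The relation between the two units follows from the dataset and batch sizes used in this example. A short calculation, with variable names chosen for this illustration:

```python
n_train, batch_size, n_epochs = 60000, 100, 20

iters_per_epoch = n_train // batch_size      # mini-batches in one sweep
total_iters = iters_per_epoch * n_epochs     # updates over the whole run
print(iters_per_epoch, total_iters)          # 600 12000
```

Under these settings, a duration of (12000, 'iteration') would describe the same amount of training as (20, 'epoch').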

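To invoke the training loop, we just call the run() method. This method executes the whole training sequence.

The code above only optimizes the parameters. In most cases we also want to see how training is progressing, which we can do by adding extensions before calling the run method.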

trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()  
epoch       main/accuracy  validation/main/accuracy
[ProgressBar output, redrawn every 100 iterations, omitted]
1           0.6581         0.8475
2           0.868483       0.8922
3           0.893583       0.9065
4           0.90485        0.9154
5           0.9128         0.9222
6           0.9182         0.9251
7           0.923683       0.9281
8           0.927233       0.9312
9           0.931317       0.9341
10          0.934733       0.9369
11          0.937883       0.9414
12          0.940583       0.9438
13          0.942633       0.9451
14          0.945083       0.9465
15          0.947233       0.9495
     total [#######################################...........] 78.33%
this epoch [#################################.................] 66.67%
      9400 iter, 15 epoch / 20 epochs
    250.43 iters/sec. Estimated time to finish: 0:00:10.382150.
     total [#######################################...........] 79.17%
this epoch [#########################################.........] 83.33%
      9500 iter, 15 epoch / 20 epochs
    250.59 iters/sec. Estimated time to finish: 0:00:09.976316.
16          0.949033       0.9496                    
     total [########################################..........] 80.00%
this epoch [..................................................]  0.00%
      9600 iter, 16 epoch / 20 epochs
    249.87 iters/sec. Estimated time to finish: 0:00:09.605143.
     total [########################################..........] 80.83%
this epoch [########..........................................] 16.67%
      9700 iter, 16 epoch / 20 epochs
    250.05 iters/sec. Estimated time to finish: 0:00:09.197988.
     total [########################################..........] 81.67%
this epoch [################..................................] 33.33%
      9800 iter, 16 epoch / 20 epochs
    250.32 iters/sec. Estimated time to finish: 0:00:08.788854.
     total [#########################################.........] 82.50%
this epoch [#########################.........................] 50.00%
      9900 iter, 16 epoch / 20 epochs
    250.58 iters/sec. Estimated time to finish: 0:00:08.380646.
     total [#########################################.........] 83.33%
this epoch [#################################.................] 66.67%
     10000 iter, 16 epoch / 20 epochs
    250.77 iters/sec. Estimated time to finish: 0:00:07.975449.
     total [##########################################........] 84.17%
this epoch [#########################################.........] 83.33%
     10100 iter, 16 epoch / 20 epochs
    251.01 iters/sec. Estimated time to finish: 0:00:07.569486.
17          0.9507         0.9526                    
     total [##########################################........] 85.00%
this epoch [..................................................]  0.00%
     10200 iter, 17 epoch / 20 epochs
    250.13 iters/sec. Estimated time to finish: 0:00:07.196375.
     total [##########################################........] 85.83%
this epoch [########..........................................] 16.67%
     10300 iter, 17 epoch / 20 epochs
    250.15 iters/sec. Estimated time to finish: 0:00:06.795972.
     total [###########################################.......] 86.67%
this epoch [################..................................] 33.33%
     10400 iter, 17 epoch / 20 epochs
    250.12 iters/sec. Estimated time to finish: 0:00:06.397005.
     total [###########################################.......] 87.50%
this epoch [#########################.........................] 50.00%
     10500 iter, 17 epoch / 20 epochs
    250.15 iters/sec. Estimated time to finish: 0:00:05.996337.
     total [############################################......] 88.33%
this epoch [#################################.................] 66.67%
     10600 iter, 17 epoch / 20 epochs
    251.26 iters/sec. Estimated time to finish: 0:00:05.571862.
     total [############################################......] 89.17%
this epoch [#########################################.........] 83.33%
     10700 iter, 17 epoch / 20 epochs
    251.44 iters/sec. Estimated time to finish: 0:00:05.170228.
18          0.952383       0.9532                    
     total [#############################################.....] 90.00%
this epoch [..................................................]  0.00%
     10800 iter, 18 epoch / 20 epochs
    250.63 iters/sec. Estimated time to finish: 0:00:04.787898.
     total [#############################################.....] 90.83%
this epoch [########..........................................] 16.67%
     10900 iter, 18 epoch / 20 epochs
    250.76 iters/sec. Estimated time to finish: 0:00:04.386683.
     total [#############################################.....] 91.67%
this epoch [################..................................] 33.33%
     11000 iter, 18 epoch / 20 epochs
     250.8 iters/sec. Estimated time to finish: 0:00:03.987294.
     total [##############################################....] 92.50%
this epoch [#########################.........................] 50.00%
     11100 iter, 18 epoch / 20 epochs
    250.85 iters/sec. Estimated time to finish: 0:00:03.587843.
     total [##############################################....] 93.33%
this epoch [#################################.................] 66.67%
     11200 iter, 18 epoch / 20 epochs
    251.83 iters/sec. Estimated time to finish: 0:00:03.176797.
     total [###############################################...] 94.17%
this epoch [#########################################.........] 83.33%
     11300 iter, 18 epoch / 20 epochs
       252 iters/sec. Estimated time to finish: 0:00:02.777783.
19          0.953817       0.953                     
     total [###############################################...] 95.00%
this epoch [..................................................]  0.00%
     11400 iter, 19 epoch / 20 epochs
    251.32 iters/sec. Estimated time to finish: 0:00:02.387425.
     total [###############################################...] 95.83%
this epoch [########..........................................] 16.67%
     11500 iter, 19 epoch / 20 epochs
    251.59 iters/sec. Estimated time to finish: 0:00:01.987384.
     total [################################################..] 96.67%
this epoch [################..................................] 33.33%
     11600 iter, 19 epoch / 20 epochs
    251.86 iters/sec. Estimated time to finish: 0:00:01.588182.
     total [################################################..] 97.50%
this epoch [#########################.........................] 50.00%
     11700 iter, 19 epoch / 20 epochs
    252.12 iters/sec. Estimated time to finish: 0:00:01.189929.
     total [#################################################.] 98.33%
this epoch [#################################.................] 66.67%
     11800 iter, 19 epoch / 20 epochs
    253.16 iters/sec. Estimated time to finish: 0:00:00.790023.
     total [#################################################.] 99.17%
this epoch [#########################################.........] 83.33%
     11900 iter, 19 epoch / 20 epochs
     253.1 iters/sec. Estimated time to finish: 0:00:00.395094.
20          0.95535        0.9551                    
     total [##################################################] 100.00%
this epoch [..................................................]  0.00%
     12000 iter, 20 epoch / 20 epochs
    252.37 iters/sec. Estimated time to finish: 0:00:00.


These extensions perform the following tasks:

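  • Evaluator evaluates the current model on the test dataset at the end of each epoch. It switches to test mode automatically, so we do not need to take any special action for functions that behave differently in training and test modes (e.g. dropout(), BatchNormalization).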

  • LogReport accumulates the reported values and writes them to a log file in the output directory.

  • PrintReport prints selected items from the LogReport.

  • ProgressBar displays a progress bar.
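The file written by LogReport can be read back after training. The sketch below is a minimal illustration using only the standard library, under two assumptions that are not stated in this tutorial: that the log file is named `log` and that it holds a JSON list with one dictionary per report. The key names and the helper `best_entry` are likewise illustrative. The accuracy values are copied from the last two rows of the output above, on the assumption that its columns are epoch, training accuracy and validation accuracy.

```python
import json
import os
import tempfile

# Sample entries in the shape LogReport is ASSUMED to write:
# a JSON list with one dict per report (here, one per epoch).
# Values are taken from epochs 19 and 20 of the output above.
sample_log = [
    {"epoch": 19, "iteration": 11400,
     "main/accuracy": 0.953817, "validation/main/accuracy": 0.953},
    {"epoch": 20, "iteration": 12000,
     "main/accuracy": 0.95535, "validation/main/accuracy": 0.9551},
]

# A temporary directory stands in for the trainer's output directory.
out_dir = tempfile.mkdtemp()
log_path = os.path.join(out_dir, "log")  # "log" is an assumed file name
with open(log_path, "w") as f:
    json.dump(sample_log, f, indent=4)


def best_entry(path, key="validation/main/accuracy"):
    """Return the log entry with the highest value stored under `key`."""
    with open(path) as f:
        entries = json.load(f)
    # Skip entries that do not report `key` at all.
    return max((e for e in entries if key in e), key=lambda e: e[key])


best = best_entry(log_path)
print(best["epoch"], best["validation/main/accuracy"])
```

Reading the log this way is handy for plotting or for picking the best epoch without re-running the training.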

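Many extensions are implemented in the chainer.training.extensions module. One of the most important is snapshot(), which saves a snapshot of the training procedure (i.e. the Trainer object) to a file in the output directory.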
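When resuming from a snapshot, we first need to find the most recent file. The sketch below is one way to do that with the standard library only. It assumes the snapshots use the naming pattern `snapshot_iter_<iteration>`, which we believe is the default but is not stated in this tutorial. The helper `latest_snapshot` and the placeholder files are hypothetical.

```python
import os
import re
import tempfile

# ASSUMED naming pattern of snapshot files: snapshot_iter_<iteration>
SNAPSHOT_RE = re.compile(r"^snapshot_iter_(\d+)$")


def latest_snapshot(out_dir):
    """Return the path of the snapshot with the highest iteration, or None."""
    best_iter, best_name = -1, None
    for name in os.listdir(out_dir):
        m = SNAPSHOT_RE.match(name)
        # Compare iterations as integers, not as strings.
        if m and int(m.group(1)) > best_iter:
            best_iter, best_name = int(m.group(1)), name
    return os.path.join(out_dir, best_name) if best_name else None


# Demo with empty placeholder files; real snapshots are written by the trainer.
demo_dir = tempfile.mkdtemp()
for iteration in (600, 6000, 12000):
    open(os.path.join(demo_dir, "snapshot_iter_%d" % iteration), "w").close()

path = latest_snapshot(demo_dir)
print(os.path.basename(path))

# To resume, one would rebuild the same Trainer and then load the file:
#   chainer.serializers.load_npz(path, trainer)
```

The iteration numbers are compared as integers because a plain string sort would rank `snapshot_iter_600` after `snapshot_iter_12000`.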

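The example code in the examples/mnist directory also includes GPU support, although its essential part is the same as the code in this tutorial. We will review how to use GPUs in a later section.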