TadaoYamaokaの開発日記

個人開発しているスマホアプリや将棋AIの開発ネタを中心に書いていきます。

Google ColabでTPUを使ってみる

ほぼ自分用のメモです。

Google Colabで、Kerasを使ってTPUでMNISTの学習を試してみた。

TPUを有効にするには、「ランタイムのタイプを変更」からハードウェアアクセラレータを「TPU」に変更する必要がある。

KerasでTPUでMNISTを学習するには以下のように記述する。

import tensorflow as tf
import os

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10)
])

def sparse_categorical_crossentropy(y_true, y_pred):
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

def sparse_categorical_accuracy(y_true, y_pred):
    return tf.keras.metrics.sparse_categorical_accuracy(y_true, tf.nn.softmax(y_pred))

model.compile(optimizer='adam',
              loss=sparse_categorical_crossentropy,
              metrics=[sparse_categorical_accuracy])

# TPU
tpu_grpc_url = "grpc://"+os.environ["COLAB_TPU_ADDR"]
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu_grpc_url)
strategy = tf.contrib.tpu.TPUDistributionStrategy(tpu_cluster_resolver)
model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)
    
model.fit(x_train, y_train, batch_size=1024, epochs=5)
model.evaluate(x_test, y_test)

TPUのコメントがある部分で、CPU/GPUのモデルからTPUのモデルに変換している。

実行結果

INFO:tensorflow:Querying Tensorflow master (grpc://10.58.99.162:8470) for TPU system metadata.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 12055453447884746007)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 18058977096921453459)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 14808883871349854203)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 12787960678777081298)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 12370191287575444036)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 10061541009258236286)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 18415570492467594980)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 6989406478957311628)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 5293577435862379493)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 11465669036580936775)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 17179869184, 13652277581393476684)
WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
Epoch 1/5
INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(128,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(128, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(128, 1), dtype=tf.float32, name='dense_13_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Remapping placeholder for flatten_6_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1974d4c88> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 2.0240049362182617 secs
INFO:tensorflow:Setting weights on TPU model.
INFO:tensorflow:CPU -> TPU lr: 0.0010000000474974513 {0.001}
INFO:tensorflow:CPU -> TPU beta_1: 0.8999999761581421 {0.9}
INFO:tensorflow:CPU -> TPU beta_2: 0.9990000128746033 {0.999}
INFO:tensorflow:CPU -> TPU decay: 0.0 {0.0}
WARNING:tensorflow:Cannot update non-variable config: epsilon
WARNING:tensorflow:Cannot update non-variable config: amsgrad
57344/60000 [===========================>..] - ETA: 0s - loss: 0.5631 - sparse_categorical_accuracy: 0.8493INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(76,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(76, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(76, 1), dtype=tf.float32, name='dense_13_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for flatten_6_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1974d4c88> [<tf.Variable 'tpu_140606887606032/Adam/iterations:0' shape=() dtype=int64>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19708d278>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19708dc18>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19708de80>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19706cfd0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196fb5f60>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196f7e278>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196f49198>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196f10cf8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196ed89e8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196ea4e10>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196e11b70>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196dda7b8>]
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 1.968165397644043 secs
60000/60000 [==============================] - 13s 209us/sample - loss: 0.5511 - sparse_categorical_accuracy: 0.8522
Epoch 2/5
60000/60000 [==============================] - 2s 29us/sample - loss: 0.2344 - sparse_categorical_accuracy: 0.9342
Epoch 3/5
60000/60000 [==============================] - 2s 32us/sample - loss: 0.1832 - sparse_categorical_accuracy: 0.9477
Epoch 4/5
60000/60000 [==============================] - 2s 31us/sample - loss: 0.1496 - sparse_categorical_accuracy: 0.9579
Epoch 5/5
60000/60000 [==============================] - 2s 27us/sample - loss: 0.1225 - sparse_categorical_accuracy: 0.9654
INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(4,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(4, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='dense_13_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Remapping placeholder for flatten_6_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe196162898> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 1.5682730674743652 secs
 9888/10000 [============================>.] - ETA: 0s - loss: 0.1541 - sparse_categorical_accuracy: 0.9534INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(2,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(2, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(2, 1), dtype=tf.float32, name='dense_13_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for flatten_6_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe196162898> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 1.0298957824707031 secs
10000/10000 [==============================] - 7s 655us/sample - loss: 0.1553 - sparse_categorical_accuracy: 0.9530
[0.15529378306418656, 0.95299995]

同じモデルをGPUで学習すると、

Epoch 1/5
60000/60000 [==============================] - 1s 11us/sample - loss: 0.5526 - sparse_categorical_accuracy: 0.8556
Epoch 2/5
60000/60000 [==============================] - 1s 10us/sample - loss: 0.2269 - sparse_categorical_accuracy: 0.9368
Epoch 3/5
60000/60000 [==============================] - 1s 9us/sample - loss: 0.1711 - sparse_categorical_accuracy: 0.9521
Epoch 4/5
60000/60000 [==============================] - 1s 10us/sample - loss: 0.1357 - sparse_categorical_accuracy: 0.9625
Epoch 5/5
60000/60000 [==============================] - 1s 9us/sample - loss: 0.1116 - sparse_categorical_accuracy: 0.9693
10000/10000 [==============================] - 1s 70us/sample - loss: 0.1123 - sparse_categorical_accuracy: 0.9671
[0.11232251176461577, 0.9671]

TPUよりGPUの方が早いという結果だった。

畳み込みのモデルを学習

今度は、畳み込み層のあるモデルに変更して学習してみた。

import tensorflow as tf
import os

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train, x_test = x_train.reshape(x_train.shape[0], 28, 28, 1), x_test.reshape(x_test.shape[0], 28, 28, 1)

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(input_shape=(28, 28, 1), filters=256, kernel_size=3, padding='same', activation=tf.nn.relu),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10)
])

def sparse_categorical_crossentropy(y_true, y_pred):
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

def sparse_categorical_accuracy(y_true, y_pred):
    return tf.keras.metrics.sparse_categorical_accuracy(y_true, tf.nn.softmax(y_pred))

model.compile(optimizer='adam',
              loss=sparse_categorical_crossentropy,
              metrics=[sparse_categorical_accuracy])

# TPU
tpu_grpc_url = "grpc://"+os.environ["COLAB_TPU_ADDR"]
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu_grpc_url)
strategy = tf.contrib.tpu.TPUDistributionStrategy(tpu_cluster_resolver)
model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)
    
model.fit(x_train, y_train, batch_size=1024, epochs=5)
model.evaluate(x_test, y_test)
実行結果
INFO:tensorflow:Querying Tensorflow master (grpc://10.58.99.162:8470) for TPU system metadata.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 12055453447884746007)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 18058977096921453459)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 14808883871349854203)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 12787960678777081298)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 12370191287575444036)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 10061541009258236286)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 18415570492467594980)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 6989406478957311628)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 5293577435862379493)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 11465669036580936775)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 17179869184, 13652277581393476684)
WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
Epoch 1/5
INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(128,), dtype=tf.int32, name='core_id_40'), TensorSpec(shape=(128, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(128, 1), dtype=tf.float32, name='dense_15_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1958053c8> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 11.533315420150757 secs
INFO:tensorflow:Setting weights on TPU model.
INFO:tensorflow:CPU -> TPU lr: 0.0010000000474974513 {0.001}
INFO:tensorflow:CPU -> TPU beta_1: 0.8999999761581421 {0.9}
INFO:tensorflow:CPU -> TPU beta_2: 0.9990000128746033 {0.999}
INFO:tensorflow:CPU -> TPU decay: 0.0 {0.0}
WARNING:tensorflow:Cannot update non-variable config: epsilon
WARNING:tensorflow:Cannot update non-variable config: amsgrad
58368/60000 [============================>.] - ETA: 0s - loss: 0.3569 - sparse_categorical_accuracy: 0.8933INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(76,), dtype=tf.int32, name='core_id_40'), TensorSpec(shape=(76, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(76, 1), dtype=tf.float32, name='dense_15_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1958053c8> [<tf.Variable 'tpu_140606857389224/Adam/iterations:0' shape=() dtype=int64>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195210828>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1951b6160>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1951b6710>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1951f2940>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195139a90>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195103f28>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1950cc4a8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195039550>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1950032e8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194fcab38>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194f91828>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194f5fbe0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194ecbe10>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194e94b00>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194e5e748>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194e25f98>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194d94f28>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194d5da90>]
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 11.685662269592285 secs
60000/60000 [==============================] - 43s 718us/sample - loss: 0.3501 - sparse_categorical_accuracy: 0.8954
Epoch 2/5
60000/60000 [==============================] - 4s 68us/sample - loss: 0.0625 - sparse_categorical_accuracy: 0.9818
Epoch 3/5
60000/60000 [==============================] - 4s 65us/sample - loss: 0.0338 - sparse_categorical_accuracy: 0.9898
Epoch 4/5
60000/60000 [==============================] - 4s 68us/sample - loss: 0.0195 - sparse_categorical_accuracy: 0.9947
Epoch 5/5
60000/60000 [==============================] - 4s 68us/sample - loss: 0.0134 - sparse_categorical_accuracy: 0.9962
INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(4,), dtype=tf.int32, name='core_id_50'), TensorSpec(shape=(4, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='dense_15_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe193c579e8> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 3.3587257862091064 secs
 9824/10000 [============================>.] - ETA: 0s - loss: 0.0863 - sparse_categorical_accuracy: 0.9748INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(2,), dtype=tf.int32, name='core_id_50'), TensorSpec(shape=(2, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(2, 1), dtype=tf.float32, name='dense_15_target_30')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe193c579e8> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 2.2662925720214844 secs
10000/10000 [==============================] - 11s 1ms/sample - loss: 0.0855 - sparse_categorical_accuracy: 0.9749
[0.08545259298430756, 0.97489995]

同じモデルをGPUで学習すると、

Epoch 1/5
60000/60000 [==============================] - 33s 551us/sample - loss: 0.3539 - sparse_categorical_accuracy: 0.8934
Epoch 2/5
60000/60000 [==============================] - 32s 540us/sample - loss: 0.0571 - sparse_categorical_accuracy: 0.9834
Epoch 3/5
60000/60000 [==============================] - 32s 540us/sample - loss: 0.0286 - sparse_categorical_accuracy: 0.9922
Epoch 4/5
60000/60000 [==============================] - 32s 541us/sample - loss: 0.0152 - sparse_categorical_accuracy: 0.9962
Epoch 5/5
60000/60000 [==============================] - 33s 546us/sample - loss: 0.0087 - sparse_categorical_accuracy: 0.9980
10000/10000 [==============================] - 4s 396us/sample - loss: 0.0458 - sparse_categorical_accuracy: 0.9845
[0.04582856161564123, 0.9845]

今度は、TPUの方が圧倒的に速くなった。
モデルの計算量が多いほどTPUが速いようだ。

2019/2/13

公式のページの方法に書き直した。
https://www.tensorflow.org/guide/using_tpuwww.tensorflow.org

import tensorflow as tf
import os

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train, x_test = x_train.reshape(x_train.shape[0], 28, 28, 1), x_test.reshape(x_test.shape[0], 28, 28, 1)

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(input_shape=(28, 28, 1), filters=256, kernel_size=3, padding='same', activation=tf.nn.relu),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10)
])

def sparse_categorical_crossentropy(y_true, y_pred):
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

def sparse_categorical_accuracy(y_true, y_pred):
    return tf.keras.metrics.sparse_categorical_accuracy(y_true, tf.nn.softmax(y_pred))

# TPU
model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    )
)
   
model.compile(optimizer='adam',
              loss=sparse_categorical_crossentropy,
              metrics=[sparse_categorical_accuracy])

model.fit(x_train, y_train, batch_size=1024, epochs=5)
model.evaluate(x_test, y_test)
実行結果
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 2593207894788626939)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 13250067999102773720)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 4557324841645948994)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 17679983793927275190)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 2642697500206372799)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 12155722493614205727)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 16776203674625333233)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 219400119950085365)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 12315577173932374116)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 8568424978166959712)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 17179869184, 11493405250562256329)
WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning.
Epoch 1/5
INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(128,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(128, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(128, 1), dtype=tf.float32, name='dense_3_target_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Remapping placeholder for conv2d_1_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0d48f198> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 10.637016296386719 secs
INFO:tensorflow:Setting weights on TPU model.
INFO:tensorflow:CPU -> TPU lr: 0.0010000000474974513 {0.001}
INFO:tensorflow:CPU -> TPU beta_1: 0.8999999761581421 {0.9}
INFO:tensorflow:CPU -> TPU beta_2: 0.9990000128746033 {0.999}
INFO:tensorflow:CPU -> TPU decay: 0.0 {0.0}
WARNING:tensorflow:Cannot update non-variable config: epsilon
WARNING:tensorflow:Cannot update non-variable config: amsgrad
58368/60000 [============================>.] - ETA: 0s - loss: 0.3567 - sparse_categorical_accuracy: 0.8953INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(76,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(76, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(76, 1), dtype=tf.float32, name='dense_3_target_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_1_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0d48f198> [<tf.Variable 'tpu_140651824059952/Adam/iterations:0' shape=() dtype=int64>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cea81d0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cea8ac8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ce43278>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ce070b8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cdd2240>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cd9a898>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cd5f668>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cd2a6d8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ccf3f60>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cc639b0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cc28358>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cbf29b0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cbbb780>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cb03cc0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0caf5898>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ca3f4e0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ca09e10>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0c9cf860>]
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 11.36061954498291 secs
60000/60000 [==============================] - 39s 655us/sample - loss: 0.3495 - sparse_categorical_accuracy: 0.8974
Epoch 2/5
60000/60000 [==============================] - 4s 66us/sample - loss: 0.0626 - sparse_categorical_accuracy: 0.9814
Epoch 3/5
60000/60000 [==============================] - 4s 66us/sample - loss: 0.0338 - sparse_categorical_accuracy: 0.9900
Epoch 4/5
60000/60000 [==============================] - 4s 67us/sample - loss: 0.0203 - sparse_categorical_accuracy: 0.9941
Epoch 5/5
60000/60000 [==============================] - 4s 68us/sample - loss: 0.0110 - sparse_categorical_accuracy: 0.9972
INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(4,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(4, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='dense_3_target_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Remapping placeholder for conv2d_1_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0b8f72e8> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 2.7844197750091553 secs
 9856/10000 [============================>.] - ETA: 0s - loss: 0.1032 - sparse_categorical_accuracy: 0.9698INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(2,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(2, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(2, 1), dtype=tf.float32, name='dense_3_target_10')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for conv2d_1_input
INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0b8f72e8> []
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 1.8734314441680908 secs
10000/10000 [==============================] - 9s 921us/sample - loss: 0.1027 - sparse_categorical_accuracy: 0.9699
[0.1027357829653658, 0.96989995]

モデル変換→コンパイルの順序にすると少し速くなった。