These are mostly notes for myself.
I tried training MNIST on a TPU with Keras on Google Colab.
To enable the TPU, open "Change runtime type" under the Runtime menu and set the hardware accelerator to "TPU".
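A quick way to confirm that the runtime switch took effect is to look for the COLAB_TPU_ADDR environment variable, which the training code in this note reads. This is only a minimal sketch: the helper name `colab_tpu_grpc_url` is made up for this note, and it assumes the variable is set only on TPU runtimes.

```python
import os


def colab_tpu_grpc_url(environ=os.environ):
    """Return the gRPC URL of the Colab TPU, or None when no TPU is attached.

    Hypothetical helper: it only reads COLAB_TPU_ADDR, the variable used by
    the training code in this note.
    """
    addr = environ.get("COLAB_TPU_ADDR")
    if not addr:
        return None
    return "grpc://" + addr


if __name__ == "__main__":
    url = colab_tpu_grpc_url()
    print("TPU at " + url if url else "No TPU found; check the runtime type.")
```

If this prints the "No TPU found" message, the runtime type is probably still set to CPU or GPU.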
Training MNIST on a TPU with Keras looks like this:

import tensorflow as tf
import os

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Scale pixel values to [0, 1].
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10)
])

# The last layer outputs logits, so the loss uses from_logits=True
# and the accuracy metric applies softmax first.
def sparse_categorical_crossentropy(y_true, y_pred):
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

def sparse_categorical_accuracy(y_true, y_pred):
    return tf.keras.metrics.sparse_categorical_accuracy(y_true, tf.nn.softmax(y_pred))

model.compile(optimizer='adam',
              loss=sparse_categorical_crossentropy,
              metrics=[sparse_categorical_accuracy])

# TPU
tpu_grpc_url = "grpc://" + os.environ["COLAB_TPU_ADDR"]
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu_grpc_url)
strategy = tf.contrib.tpu.TPUDistributionStrategy(tpu_cluster_resolver)
model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)

model.fit(x_train, y_train, batch_size=1024, epochs=5)
model.evaluate(x_test, y_test)
The part under the "# TPU" comment converts the CPU/GPU model into a TPU model.
Output
INFO:tensorflow:Querying Tensorflow master (grpc://10.58.99.162:8470) for TPU system metadata. INFO:tensorflow:Found TPU system: INFO:tensorflow:*** Num TPU Cores: 8 INFO:tensorflow:*** Num TPU Workers: 1 INFO:tensorflow:*** Num TPU Cores Per Worker: 8 INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 12055453447884746007) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 18058977096921453459) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 14808883871349854203) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 12787960678777081298) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 12370191287575444036) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 10061541009258236286) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 18415570492467594980) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 6989406478957311628) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 5293577435862379493) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 11465669036580936775) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 17179869184, 13652277581393476684) WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning. 
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} Epoch 1/5 INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(128,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(128, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(128, 1), dtype=tf.float32, name='dense_13_target_30')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Remapping placeholder for flatten_6_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1974d4c88> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 2.0240049362182617 secs INFO:tensorflow:Setting weights on TPU model. INFO:tensorflow:CPU -> TPU lr: 0.0010000000474974513 {0.001} INFO:tensorflow:CPU -> TPU beta_1: 0.8999999761581421 {0.9} INFO:tensorflow:CPU -> TPU beta_2: 0.9990000128746033 {0.999} INFO:tensorflow:CPU -> TPU decay: 0.0 {0.0} WARNING:tensorflow:Cannot update non-variable config: epsilon WARNING:tensorflow:Cannot update non-variable config: amsgrad 57344/60000 [===========================>..] - ETA: 0s - loss: 0.5631 - sparse_categorical_accuracy: 0.8493INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(76,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(76, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(76, 1), dtype=tf.float32, name='dense_13_target_30')] INFO:tensorflow:Overriding default placeholder. 
INFO:tensorflow:Remapping placeholder for flatten_6_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1974d4c88> [<tf.Variable 'tpu_140606887606032/Adam/iterations:0' shape=() dtype=int64>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19708d278>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19708dc18>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19708de80>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe19706cfd0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196fb5f60>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196f7e278>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196f49198>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196f10cf8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196ed89e8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196ea4e10>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196e11b70>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe196dda7b8>] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. 
Time elapsed: 1.968165397644043 secs 60000/60000 [==============================] - 13s 209us/sample - loss: 0.5511 - sparse_categorical_accuracy: 0.8522 Epoch 2/5 60000/60000 [==============================] - 2s 29us/sample - loss: 0.2344 - sparse_categorical_accuracy: 0.9342 Epoch 3/5 60000/60000 [==============================] - 2s 32us/sample - loss: 0.1832 - sparse_categorical_accuracy: 0.9477 Epoch 4/5 60000/60000 [==============================] - 2s 31us/sample - loss: 0.1496 - sparse_categorical_accuracy: 0.9579 Epoch 5/5 60000/60000 [==============================] - 2s 27us/sample - loss: 0.1225 - sparse_categorical_accuracy: 0.9654 INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(4,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(4, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='dense_13_target_30')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Remapping placeholder for flatten_6_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe196162898> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 1.5682730674743652 secs 9888/10000 [============================>.] - ETA: 0s - loss: 0.1541 - sparse_categorical_accuracy: 0.9534INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(2,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(2, 28, 28), dtype=tf.float32, name='flatten_6_input_10'), TensorSpec(shape=(2, 1), dtype=tf.float32, name='dense_13_target_30')] INFO:tensorflow:Overriding default placeholder. 
INFO:tensorflow:Remapping placeholder for flatten_6_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe196162898> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 1.0298957824707031 secs 10000/10000 [==============================] - 7s 655us/sample - loss: 0.1553 - sparse_categorical_accuracy: 0.9530 [0.15529378306418656, 0.95299995]
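The input shapes in the log above (128, 76, 4 and 2) are consistent with each batch being split evenly across the 8 TPU cores. The sketch below reproduces those numbers under that assumption. The function name is made up for this note, and the evaluation batch size of 32 is the Keras default, since `model.evaluate` was called without `batch_size`.

```python
def per_core_batches(num_samples, batch_size, num_cores=8):
    """Return the distinct per-core batch sizes seen during one pass.

    Assumption: each global batch is split evenly across the cores, and the
    final, smaller batch is split the same way.
    """
    sizes = [batch_size // num_cores]
    remainder = num_samples % batch_size
    if remainder:
        sizes.append(remainder // num_cores)
    return sizes


# Training: 60000 samples, batch_size=1024 as passed to model.fit.
print(per_core_batches(60000, 1024))  # [128, 76]
# Evaluation: 10000 samples, Keras default batch_size of 32.
print(per_core_batches(10000, 32))    # [4, 2]
```

Each new shape is followed by a "(re-)compiling" message in the log, so the leftover batch at the end of an epoch costs an extra compilation on top of the first one.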
Training the same model on a GPU gives:

Epoch 1/5
60000/60000 [==============================] - 1s 11us/sample - loss: 0.5526 - sparse_categorical_accuracy: 0.8556
Epoch 2/5
60000/60000 [==============================] - 1s 10us/sample - loss: 0.2269 - sparse_categorical_accuracy: 0.9368
Epoch 3/5
60000/60000 [==============================] - 1s 9us/sample - loss: 0.1711 - sparse_categorical_accuracy: 0.9521
Epoch 4/5
60000/60000 [==============================] - 1s 10us/sample - loss: 0.1357 - sparse_categorical_accuracy: 0.9625
Epoch 5/5
60000/60000 [==============================] - 1s 9us/sample - loss: 0.1116 - sparse_categorical_accuracy: 0.9693
10000/10000 [==============================] - 1s 70us/sample - loss: 0.1123 - sparse_categorical_accuracy: 0.9671
[0.11232251176461577, 0.9671]

The GPU turned out to be faster than the TPU: about 1 s per epoch on the GPU versus about 2 s on the TPU (13 s for the first TPU epoch, which includes compilation).
Training a convolutional model
Next, I switched to a model with a convolutional layer and trained it.

import tensorflow as tf
import os

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Scale pixel values to [0, 1].
x_train, x_test = x_train / 255.0, x_test / 255.0
# Add a channel dimension for Conv2D.
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(input_shape=(28, 28, 1), filters=256, kernel_size=3,
                           padding='same', activation=tf.nn.relu),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10)
])

def sparse_categorical_crossentropy(y_true, y_pred):
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

def sparse_categorical_accuracy(y_true, y_pred):
    return tf.keras.metrics.sparse_categorical_accuracy(y_true, tf.nn.softmax(y_pred))

model.compile(optimizer='adam',
              loss=sparse_categorical_crossentropy,
              metrics=[sparse_categorical_accuracy])

# TPU
tpu_grpc_url = "grpc://" + os.environ["COLAB_TPU_ADDR"]
tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(tpu_grpc_url)
strategy = tf.contrib.tpu.TPUDistributionStrategy(tpu_cluster_resolver)
model = tf.contrib.tpu.keras_to_tpu_model(model, strategy=strategy)

model.fit(x_train, y_train, batch_size=1024, epochs=5)
model.evaluate(x_test, y_test)

Output
INFO:tensorflow:Querying Tensorflow master (grpc://10.58.99.162:8470) for TPU system metadata. INFO:tensorflow:Found TPU system: INFO:tensorflow:*** Num TPU Cores: 8 INFO:tensorflow:*** Num TPU Workers: 1 INFO:tensorflow:*** Num TPU Cores Per Worker: 8 INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 12055453447884746007) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 18058977096921453459) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 14808883871349854203) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 12787960678777081298) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 12370191287575444036) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 10061541009258236286) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 18415570492467594980) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 6989406478957311628) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 5293577435862379493) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 11465669036580936775) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 17179869184, 13652277581393476684) WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning. 
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} Epoch 1/5 INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(128,), dtype=tf.int32, name='core_id_40'), TensorSpec(shape=(128, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(128, 1), dtype=tf.float32, name='dense_15_target_30')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Remapping placeholder for conv2d_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1958053c8> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 11.533315420150757 secs INFO:tensorflow:Setting weights on TPU model. INFO:tensorflow:CPU -> TPU lr: 0.0010000000474974513 {0.001} INFO:tensorflow:CPU -> TPU beta_1: 0.8999999761581421 {0.9} INFO:tensorflow:CPU -> TPU beta_2: 0.9990000128746033 {0.999} INFO:tensorflow:CPU -> TPU decay: 0.0 {0.0} WARNING:tensorflow:Cannot update non-variable config: epsilon WARNING:tensorflow:Cannot update non-variable config: amsgrad 58368/60000 [============================>.] - ETA: 0s - loss: 0.3569 - sparse_categorical_accuracy: 0.8933INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(76,), dtype=tf.int32, name='core_id_40'), TensorSpec(shape=(76, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(76, 1), dtype=tf.float32, name='dense_15_target_30')] INFO:tensorflow:Overriding default placeholder. 
INFO:tensorflow:Remapping placeholder for conv2d_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe1958053c8> [<tf.Variable 'tpu_140606857389224/Adam/iterations:0' shape=() dtype=int64>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195210828>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1951b6160>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1951b6710>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1951f2940>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195139a90>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195103f28>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1950cc4a8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe195039550>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe1950032e8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194fcab38>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194f91828>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194f5fbe0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194ecbe10>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194e94b00>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194e5e748>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194e25f98>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194d94f28>, 
<tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fe194d5da90>] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 11.685662269592285 secs 60000/60000 [==============================] - 43s 718us/sample - loss: 0.3501 - sparse_categorical_accuracy: 0.8954 Epoch 2/5 60000/60000 [==============================] - 4s 68us/sample - loss: 0.0625 - sparse_categorical_accuracy: 0.9818 Epoch 3/5 60000/60000 [==============================] - 4s 65us/sample - loss: 0.0338 - sparse_categorical_accuracy: 0.9898 Epoch 4/5 60000/60000 [==============================] - 4s 68us/sample - loss: 0.0195 - sparse_categorical_accuracy: 0.9947 Epoch 5/5 60000/60000 [==============================] - 4s 68us/sample - loss: 0.0134 - sparse_categorical_accuracy: 0.9962 INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(4,), dtype=tf.int32, name='core_id_50'), TensorSpec(shape=(4, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='dense_15_target_30')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Remapping placeholder for conv2d_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe193c579e8> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 3.3587257862091064 secs 9824/10000 [============================>.] 
- ETA: 0s - loss: 0.0863 - sparse_categorical_accuracy: 0.9748INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(2,), dtype=tf.int32, name='core_id_50'), TensorSpec(shape=(2, 28, 28, 1), dtype=tf.float32, name='conv2d_input_10'), TensorSpec(shape=(2, 1), dtype=tf.float32, name='dense_15_target_30')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Remapping placeholder for conv2d_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fe193c579e8> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 2.2662925720214844 secs 10000/10000 [==============================] - 11s 1ms/sample - loss: 0.0855 - sparse_categorical_accuracy: 0.9749 [0.08545259298430756, 0.97489995]
Training the same model on a GPU gives:

Epoch 1/5
60000/60000 [==============================] - 33s 551us/sample - loss: 0.3539 - sparse_categorical_accuracy: 0.8934
Epoch 2/5
60000/60000 [==============================] - 32s 540us/sample - loss: 0.0571 - sparse_categorical_accuracy: 0.9834
Epoch 3/5
60000/60000 [==============================] - 32s 540us/sample - loss: 0.0286 - sparse_categorical_accuracy: 0.9922
Epoch 4/5
60000/60000 [==============================] - 32s 541us/sample - loss: 0.0152 - sparse_categorical_accuracy: 0.9962
Epoch 5/5
60000/60000 [==============================] - 33s 546us/sample - loss: 0.0087 - sparse_categorical_accuracy: 0.9980
10000/10000 [==============================] - 4s 396us/sample - loss: 0.0458 - sparse_categorical_accuracy: 0.9845
[0.04582856161564123, 0.9845]
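This time the TPU was overwhelmingly faster: about 4 s per epoch after the first, versus about 32 s per epoch on the GPU.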
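To put numbers on both comparisons, the per-sample times reported in the logs above can be averaged and compared. This is a rough sketch based on a single run per configuration, so the ratios are only indicative. The dictionary layout and function name are made up for this note.

```python
from statistics import mean

# Per-sample times (us/sample) for epochs 2-5, copied from the logs above.
# Epoch 1 is left out because on the TPU it includes compilation.
timings_us = {
    "dense": {"tpu": [29, 32, 31, 27], "gpu": [10, 9, 10, 9]},
    "conv": {"tpu": [68, 65, 68, 68], "gpu": [540, 540, 541, 546]},
}


def gpu_over_tpu(model_name):
    """Ratio of mean GPU time to mean TPU time (>1 means the TPU is faster)."""
    t = timings_us[model_name]
    return mean(t["gpu"]) / mean(t["tpu"])


for name in timings_us:
    print(f"{name}: GPU/TPU time ratio = {gpu_over_tpu(name):.2f}")
```

With these figures the ratio is roughly 0.3 for the dense model (GPU about three times faster) and roughly 8 for the convolutional model (TPU about eight times faster).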
It seems that the more computation a model needs, the more the TPU pulls ahead.
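As a rough check on how much heavier the second model is, the multiply-accumulate operations (MACs) per sample in the forward pass can be counted by hand. This is a back-of-the-envelope estimate that ignores biases, activations and the backward pass. The helper names are made up for this note.

```python
def dense_macs(inputs, units):
    """Multiply-accumulates for one sample through a fully connected layer."""
    return inputs * units


def conv2d_same_macs(height, width, in_ch, filters, kernel):
    """Multiply-accumulates for one sample through a stride-1 'same' Conv2D."""
    return height * width * filters * kernel * kernel * in_ch


# First model: Flatten -> Dense(512) -> Dense(10)
mlp = dense_macs(28 * 28, 512) + dense_macs(512, 10)

# Second model: Conv2D(256, 3x3, same) -> Flatten -> Dense(512) -> Dense(10)
cnn = (
    conv2d_same_macs(28, 28, 1, 256, 3)
    + dense_macs(28 * 28 * 256, 512)
    + dense_macs(512, 10)
)

print(mlp)  # 406528
print(cnn)  # 104571904
print(round(cnn / mlp))  # 257
```

By this count the convolutional model does about 257 times more work per sample. Most of that comes from the Dense(512) layer that follows the 200,704-element flattened convolution output, not from the convolution itself.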
2019/2/13
I rewrote the code to follow the method on the official page.
https://www.tensorflow.org/guide/using_tpu

import tensorflow as tf
import os

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Scale pixel values to [0, 1].
x_train, x_test = x_train / 255.0, x_test / 255.0
# Add a channel dimension for Conv2D.
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(input_shape=(28, 28, 1), filters=256, kernel_size=3,
                           padding='same', activation=tf.nn.relu),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10)
])

def sparse_categorical_crossentropy(y_true, y_pred):
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

def sparse_categorical_accuracy(y_true, y_pred):
    return tf.keras.metrics.sparse_categorical_accuracy(y_true, tf.nn.softmax(y_pred))

# TPU: convert the model first, then compile the converted model.
model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    )
)

model.compile(optimizer='adam',
              loss=sparse_categorical_crossentropy,
              metrics=[sparse_categorical_accuracy])

model.fit(x_train, y_train, batch_size=1024, epochs=5)
model.evaluate(x_test, y_test)

Output
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 2593207894788626939) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 13250067999102773720) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 4557324841645948994) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 17679983793927275190) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 2642697500206372799) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 12155722493614205727) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 16776203674625333233) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 219400119950085365) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 12315577173932374116) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 8568424978166959712) INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 17179869184, 11493405250562256329) WARNING:tensorflow:tpu_model (from tensorflow.contrib.tpu.python.tpu.keras_support) is experimental and may change or be removed at any time, and without warning. 
Epoch 1/5 INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(128,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(128, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(128, 1), dtype=tf.float32, name='dense_3_target_10')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Remapping placeholder for conv2d_1_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0d48f198> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 10.637016296386719 secs INFO:tensorflow:Setting weights on TPU model. INFO:tensorflow:CPU -> TPU lr: 0.0010000000474974513 {0.001} INFO:tensorflow:CPU -> TPU beta_1: 0.8999999761581421 {0.9} INFO:tensorflow:CPU -> TPU beta_2: 0.9990000128746033 {0.999} INFO:tensorflow:CPU -> TPU decay: 0.0 {0.0} WARNING:tensorflow:Cannot update non-variable config: epsilon WARNING:tensorflow:Cannot update non-variable config: amsgrad 58368/60000 [============================>.] - ETA: 0s - loss: 0.3567 - sparse_categorical_accuracy: 0.8953INFO:tensorflow:New input shapes; (re-)compiling: mode=train (# of cores 8), [TensorSpec(shape=(76,), dtype=tf.int32, name='core_id_20'), TensorSpec(shape=(76, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(76, 1), dtype=tf.float32, name='dense_3_target_10')] INFO:tensorflow:Overriding default placeholder. 
INFO:tensorflow:Remapping placeholder for conv2d_1_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0d48f198> [<tf.Variable 'tpu_140651824059952/Adam/iterations:0' shape=() dtype=int64>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cea81d0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cea8ac8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ce43278>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ce070b8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cdd2240>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cd9a898>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cd5f668>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cd2a6d8>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ccf3f60>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cc639b0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cc28358>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cbf29b0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cbbb780>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0cb03cc0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0caf5898>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ca3f4e0>, <tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0ca09e10>, 
<tensorflow.contrib.tpu.python.tpu.keras_tpu_variables.ReplicatedVariable object at 0x7fec0c9cf860>] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 11.36061954498291 secs 60000/60000 [==============================] - 39s 655us/sample - loss: 0.3495 - sparse_categorical_accuracy: 0.8974 Epoch 2/5 60000/60000 [==============================] - 4s 66us/sample - loss: 0.0626 - sparse_categorical_accuracy: 0.9814 Epoch 3/5 60000/60000 [==============================] - 4s 66us/sample - loss: 0.0338 - sparse_categorical_accuracy: 0.9900 Epoch 4/5 60000/60000 [==============================] - 4s 67us/sample - loss: 0.0203 - sparse_categorical_accuracy: 0.9941 Epoch 5/5 60000/60000 [==============================] - 4s 68us/sample - loss: 0.0110 - sparse_categorical_accuracy: 0.9972 INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(4,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(4, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='dense_3_target_10')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False} INFO:tensorflow:Remapping placeholder for conv2d_1_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0b8f72e8> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 2.7844197750091553 secs 9856/10000 [============================>.] 
- ETA: 0s - loss: 0.1032 - sparse_categorical_accuracy: 0.9698INFO:tensorflow:New input shapes; (re-)compiling: mode=eval (# of cores 8), [TensorSpec(shape=(2,), dtype=tf.int32, name='core_id_30'), TensorSpec(shape=(2, 28, 28, 1), dtype=tf.float32, name='conv2d_1_input_10'), TensorSpec(shape=(2, 1), dtype=tf.float32, name='dense_3_target_10')] INFO:tensorflow:Overriding default placeholder. INFO:tensorflow:Remapping placeholder for conv2d_1_input INFO:tensorflow:KerasCrossShard: <tensorflow.python.keras.optimizers.Adam object at 0x7fec0b8f72e8> [] INFO:tensorflow:Started compiling INFO:tensorflow:Finished compiling. Time elapsed: 1.8734314441680908 secs 10000/10000 [==============================] - 9s 921us/sample - loss: 0.1027 - sparse_categorical_accuracy: 0.9699 [0.1027357829653658, 0.96989995]
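Converting the model first and then compiling made it slightly faster: the first epoch took 39 s instead of 43 s, while later epochs stayed at about 4 s.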
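Since the first epoch includes compilation, it helps to subtract the compile times reported in the logs before comparing the two orderings. This is a small sketch using figures copied from the two convolutional TPU logs above. The run labels and function name are made up for this note. Each ordering was run only once and epoch times are rounded to whole seconds, so the gap may be within noise.

```python
# Figures copied from the two convolutional TPU logs above (seconds).
runs = {
    "compile -> convert": {"epoch1": 43, "compiles": [11.53, 11.69]},
    "convert -> compile": {"epoch1": 39, "compiles": [10.64, 11.36]},
}


def epoch1_without_compile(run):
    """First-epoch wall time minus the compile times reported in the log."""
    return run["epoch1"] - sum(run["compiles"])


for name, run in runs.items():
    print(f"{name}: {epoch1_without_compile(run):.1f} s outside compilation")
```

By this estimate about 1.2 s of the 4 s difference comes from shorter compile times, and the rest from the remainder of the first epoch.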