What is TensorBoard?

Training a deep learning model is a long, iterative process, so it is important to have a tool that visualizes training progress and helps monitor the learning process.

TensorBoard is an open source toolkit that helps us understand training progress and improve model performance by tuning hyperparameters. It displays a dashboard where logs can be visualized as graphs, images, histograms, embeddings, text, and more. It also helps track information such as gradients, losses, metrics, and intermediate outputs [1, 2].

The arcgis.learn module integrates TensorBoard into the model training process, which makes it possible to monitor training as it runs. In this guide, we will learn how model training can be monitored using TensorBoard within the arcgis.learn module.

Note: TensorBoard is supported in ArcGIS API for Python version 1.8.3 and later.

Prerequisite

The specific Python libraries mentioned below need to be installed in your deep learning environment.

pip install tensorboard==2.2.1
pip install tensorboardX==2.1
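
Before training, it can be useful to confirm that both packages are importable in the active environment. The following is a minimal sketch using only the Python standard library; the helper name missing_packages is hypothetical and is not part of arcgis.learn.

```python
from importlib.util import find_spec

# Import names of the prerequisite packages listed above.
REQUIRED = ("tensorboard", "tensorboardX")


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found in this environment."""
    # find_spec returns None when a top-level module cannot be located.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("TensorBoard prerequisites are installed.")
```

This only checks that the packages can be found; it does not verify the installed versions.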

Model training with TensorBoard

The arcgis.learn module currently supports TensorBoard for the following models:

SingleShotDetector
UnetClassifier
FeatureClassifier
RetinaNet
PSPNetClassifier
MaskRCNN
DeepLab
FasterRCNN
SuperResolution
Pix2Pix
CycleGAN
ImageCaptioner
MultiTaskRoadExtractor

from arcgis.learn import UnetClassifier, prepare_data

# Folder containing the exported training data
data_path = r'training_data'
data = prepare_data(data_path, batch_size=4)

# Any model from the list above can be used here
unet_model = UnetClassifier(data)

After instantiating the model object, we train the model using the model.fit() method with the tensorboard parameter set to True. This trains the model for the specified number of epochs while making the training visible in TensorBoard. By default, the tensorboard parameter is set to False.

unet_model.fit(2, lr=0.0001, tensorboard=True)

Monitor training on Tensorboard using the following command: 'tensorboard --host=DELDEVAL047 --logdir="C:\Users\Karthik\Desktop\Base\Tensorboard\Kent_LULC\training_log"'

epoch	train_loss	valid_loss	accuracy	dice	time
0	1.489619	1.355104	0.522247	0.522247	00:25
1	1.323257	1.155571	0.593830	0.593830	00:24

When the tensorboard parameter is enabled, the command needed to launch TensorBoard is printed, as shown above. If the libraries listed in the Prerequisite section are not installed, model training still continues, but a warning message prompts the user to install them.

Launch TensorBoard on a browser

To view TensorBoard in your default web browser, run the command printed during training in an Anaconda prompt. TensorBoard then prints a message containing the URL at which it is being served.

TensorBoard can be run on a different port by passing the required port number in the command (for example, --port=8008). The default port is 6006.
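
As a rough illustration of how the launch command changes with a custom port, the sketch below assembles the command line in Python. The helper tensorboard_command and the log directory name training_log are hypothetical placeholders; in practice, use the exact command and path printed during training.

```python
def tensorboard_command(logdir, host=None, port=None):
    """Build the TensorBoard launch command as a list of arguments."""
    cmd = ["tensorboard", f"--logdir={logdir}"]
    if host is not None:
        cmd.append(f"--host={host}")
    if port is not None:
        # Without --port, TensorBoard falls back to its default port 6006.
        cmd.append(f"--port={port}")
    return cmd


# "training_log" is a placeholder for the log directory printed by fit().
cmd = tensorboard_command("training_log", port=8008)
print(" ".join(cmd))
```

The returned list can also be passed to subprocess.Popen to start TensorBoard from a script instead of a prompt.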

TensorBoard is now accessible from any web browser by entering the URL that is printed when the TensorBoard command is executed.
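
If the page does not load, a quick check of whether anything is listening on the expected host and port can help. The sketch below uses only the standard library and assumes TensorBoard is served over plain HTTP on localhost at the default port; the helper names tensorboard_url and is_listening are hypothetical.

```python
import socket

DEFAULT_PORT = 6006  # default TensorBoard port, as noted above


def tensorboard_url(host="localhost", port=DEFAULT_PORT):
    """Compose the address to open in a web browser."""
    return f"http://{host}:{port}/"


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


url = tensorboard_url()
if is_listening("localhost", DEFAULT_PORT):
    print("A server is listening at", url)
else:
    print("Nothing is listening at", url)
```

A successful connection only shows that some server is bound to the port, not that it is TensorBoard.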

In the 'SCALARS' tab, graphs of the various metrics and statistics can be visualized.
In the 'IMAGES' tab, the intermediate outputs of the model are displayed. Using this feature, we can visually compare the outputs of the model across different epochs and across different training runs.

This can be done while training is still in progress, because the graphs and images are updated at the end of each epoch and do not wait for the entire training process to complete.
