Introduction
This guide explains how to train and evaluate the multiple network architectures supported by the arcgis.learn API. The arcgis.learn API currently supports more than 30 deep learning networks for object detection. The API provides 4 deep learning networks, along with the MMDetection class, which supports more than twenty object detection networks. Similarly, for pixel classification, the API supports 11 deep learning networks, along with the MMSegmentation class, which supports more than twenty pixel classification networks.
To train a deep learning network using the arcgis.learn API, you must follow the complete pipeline, which involves data preprocessing, network selection, hyperparameter tuning, and evaluation of the networks to select the best performer.
Because it can be difficult to iteratively run and compare all of the different networks, the AutoDL class automatically trains all of the supported networks on the given data within a provided time limit and reports a performance tally for all of them. The AutoDL class also saves every network during the training process, so that each one can be fine-tuned later to improve its performance.
AutoDL supported Networks
Object Detection
- SingleShotDetector
- RetinaNet
- FasterRCNN
- YOLOv3
- ATSS
- CARAFE
- CascadeRCNN
- CascadeRPN
- DCN
- Detectors
- DoubleHeads
- DynamicRCNN
- EmpiricalAttention
- FCOS
- FoveaBox
- FSAF
- GHM
- LibraRCNN
- PaFPN
- PISA
- RegNet
- RepPoints
- Res2Net
- SABL
- VFNet
Pixel Classification
- DeepLab
- UnetClassifier
- PSPNetClassifier
- ANN
- APCNet
- CCNet
- CGNet
- HRNet
- DeepLabV3Plus
- DMNet
- DNLNet
- EMANet
- FastSCNN
- FCN
- GCNet
- MobileNetV2
- NonLocalNet
- OCRNet
- PSANet
- SemFPN
- UperNet
Implementation in arcgis.learn
Let's see how the AutoDL class works with arcgis.learn.
Imports
from arcgis.learn import AutoDL, prepare_data, ImageryModel
Prepare data
Prepare the data for the AutoDL class using prepare_data() in arcgis.learn. The recommended value for the batch_size parameter is None, as the AutoDL class supports automatic evaluation of the batch_size based on the GPU capacity.
data = prepare_data("path_to_data_folder", batch_size=None)
Train networks using AutoDL
The AutoDL class accepts the following parameters:
- data (Required parameter): The data object returned from the prepare_data function in the previous step.
- total_time_limit (Optional parameter): The total time limit, in hours, for the AutoDL class to train and evaluate the networks. This parameter matters most when time is the user's main constraint. The AutoDL class calculates the number of chips from the prepared databunch that can be processed in the given time.
- mode (Optional parameter): Can be "basic" or "advanced".
  - basic: Use when the user wants to train all selected networks.
  - advanced: Use when the user also wants to tune the hyperparameters of the two best performing models from the basic mode.
- networks (Optional parameter): The list of models to use in the training process. If no value is provided, the AutoDL class selects all of the supported networks. To select one or more networks, pass the network names as strings in a list.
  - Supported object detection models: SingleShotDetector, RetinaNet, FasterRCNN, YOLOv3, ATSS, CARAFE, CascadeRCNN, CascadeRPN, DCN, Detectors, DoubleHeads, DynamicRCNN, EmpiricalAttention, FCOS, FoveaBox, FSAF, GHM, LibraRCNN, PaFPN, PISA, RegNet, RepPoints, Res2Net, SABL, VFNet
  - Supported pixel classification models: DeepLab, UnetClassifier, PSPNetClassifier, ANN, APCNet, CCNet, CGNet, HRNet, DeepLabV3Plus, DMNet, DNLNet, EMANet, FastSCNN, FCN, GCNet, MobileNetV2, NonLocalNet, OCRNet, PSANet, SemFPN, UperNet
- verbose (Optional parameter): Boolean. Displays logs while the networks train. The logs show progress over time and, if a failure occurs, can be used to check which network failed, when, and why.
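As a minimal sketch of building the list of network names described above, the helper below checks requested names against the supported lists given in this guide before they are handed to AutoDL. The helper `validate_networks` and the task labels "detection" and "classification" are hypothetical names used only for illustration; they are not part of arcgis.learn.

```python
# Names copied from the supported-network lists in this guide.
OBJECT_DETECTION = [
    "SingleShotDetector", "RetinaNet", "FasterRCNN", "YOLOv3", "ATSS",
    "CARAFE", "CascadeRCNN", "CascadeRPN", "DCN", "Detectors",
    "DoubleHeads", "DynamicRCNN", "EmpiricalAttention", "FCOS", "FoveaBox",
    "FSAF", "GHM", "LibraRCNN", "PaFPN", "PISA", "RegNet", "RepPoints",
    "Res2Net", "SABL", "VFNet",
]
PIXEL_CLASSIFICATION = [
    "DeepLab", "UnetClassifier", "PSPNetClassifier", "ANN", "APCNet",
    "CCNet", "CGNet", "HRNet", "DeepLabV3Plus", "DMNet", "DNLNet",
    "EMANet", "FastSCNN", "FCN", "GCNet", "MobileNetV2", "NonLocalNet",
    "OCRNet", "PSANet", "SemFPN", "UperNet",
]


def validate_networks(requested, task):
    """Return the requested names as a list if all are listed for `task`.

    `task` is "detection" or "classification" (labels invented for this sketch).
    """
    if task == "detection":
        supported = OBJECT_DETECTION
    elif task == "classification":
        supported = PIXEL_CLASSIFICATION
    else:
        raise ValueError(f"Unknown task: {task!r}")
    unknown = [name for name in requested if name not in supported]
    if unknown:
        raise ValueError(f"Unsupported {task} networks: {unknown}")
    return list(requested)


# A short list of object detection networks to train instead of all of them.
selected = validate_networks(["SingleShotDetector", "RetinaNet"], "detection")
```

The validated list is what would be passed as the network-selection argument when initializing AutoDL. Mixing object detection and pixel classification names in one list is rejected here because the two groups apply to different kinds of training data.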
AutoDL Training modes
- Basic
  - In this mode, AutoDL iterates through all of the supported networks once with the default backbone, trains each one on the provided data, and calculates its performance. At the end of each iteration, the model is saved to disk. Based on the allotted time, the program automatically calculates the maximum number of epochs to train each network. However, training stops early if the model does not improve for 5 epochs. A change in the validation loss counts as an improvement only if it is at least 0.001.
- Advanced
  - Use this mode to tune the hyperparameters of the two best performing networks from the basic mode. This mode divides the total time into two halves. In the first half, it works like the basic mode and iterates through all of the supported networks once. In the second half, it identifies the two best performing networks and trains them with the different supported backbones. At the end of each iteration, the model is saved to disk. Based on the allotted time, the program automatically calculates the maximum number of epochs to train each network. However, training stops early if the model does not improve for 5 epochs. A change in the validation loss counts as an improvement only if it is at least 0.001.
  - In this mode, optuna is used to tune the hyperparameters of the network.
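The early-stopping rule described for both modes can be sketched in plain Python. This is an illustration of the rule as stated in this guide (5 epochs without improvement, minimum improvement of 0.001 in validation loss), not the code AutoDL runs internally; the function name `should_stop` and the choice to treat a difference of exactly 0.001 as an improvement are assumptions.

```python
def should_stop(valid_losses, patience=5, min_delta=0.001):
    """Return True if training would stop early on this loss history.

    valid_losses: validation loss recorded after each epoch, in order.
    patience:     epochs allowed without improvement (5 in this guide).
    min_delta:    smallest drop in loss that counts as an improvement.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for loss in valid_losses:
        if best - loss >= min_delta:
            # The loss dropped by enough to count: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return True
    return False


# Loss plateaus near 0.40: five epochs in a row improve by less than 0.001.
plateau = [0.50, 0.40, 0.3995, 0.3996, 0.3999, 0.3993, 0.3992]
# Loss keeps falling by a wide margin every epoch.
improving = [0.50, 0.40, 0.30, 0.20]

stops_on_plateau = should_stop(plateau)
stops_while_improving = should_stop(improving)
```

In the plateau history, the small fluctuations after 0.40 never beat the best loss by 0.001, so the counter reaches 5 and training would stop before the allotted epochs are used up.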
dl = AutoDL(data, total_time_limit=5, verbose=True, mode="advanced")
When the AutoDL class is initialized, it calculates the number of images that can be processed in the given time and the time required to process all of the data. The output of the cell above can then be used to adjust the total_time_limit and networks parameters when initializing the class.
Here is an example of the output:
- Given time to process the dataset is: 5.0 hours
- Number of images that can be processed in the given time: 290
- Time required to process the entire dataset of 3000 images is 52 hours
This output shows how many images can be processed to train all of the selected networks in the selected mode within the given time. It also estimates how long AutoDL will take to train all of the selected networks on the entire dataset.
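The figures in the example output are consistent with a simple proportional estimate, sketched below. This assumes training time scales linearly with the number of images and that the result is rounded up to whole hours; both are assumptions made for illustration, and the function name `estimate_full_training_hours` is hypothetical. How AutoDL actually derives its estimate is not described in this guide.

```python
import math


def estimate_full_training_hours(time_limit_hours, images_in_limit, total_images):
    """Scale the time limit up to the full dataset, assuming linear cost."""
    hours_per_image = time_limit_hours / images_in_limit
    return math.ceil(hours_per_image * total_images)


# Values taken from the example output above:
# 290 images fit into 5.0 hours, and the dataset holds 3000 images.
estimate = estimate_full_training_hours(5.0, 290, 3000)
print(f"Estimated time for the entire dataset: {estimate} hours")
```

With these inputs, 3000 images at 5/290 hours per image come to about 51.7 hours, which rounds up to the 52 hours shown in the example output.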
Supported methods in AutoDL
Supported Classification Models
dl.supported_classification_models()
This function returns a list of the pixel classification models supported by the AutoDL class.
Supported Detection Models
dl.supported_detection_models()
This function returns a list of the object detection models supported by the AutoDL class.
Fit
The fit method trains all of the selected networks automatically within the provided time limit.
dl.fit()
Score
This method returns an evaluation report as a dataframe with several fields, including the model's accuracy along with train/validation loss, dice (for pixel classification), the learning rate used to train the model, train time, and backbone.
dl.score()
Report
dl.report()
This method returns an advanced HTML report of the networks trained by AutoDL. In the basic mode, it shows a leaderboard of all the networks based on their performance during the training phase, along with important parameter details and charts for the best evaluated model. In the advanced mode, it additionally shows details of all the optuna-based trials, with the tuned hyperparameter details and a feature importance chart for each model evaluated during the advanced mode.
Show Results
This method will display the results for the best performing model.
dl.show_results()
MIOU
MIOU is the mean of intersection over union (IoU). This method calculates the mean IoU on the validation set for each class. It is only supported by pixel classification models.
dl.mIOU()
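To make the metric concrete, here is a minimal pure-Python sketch of per-class IoU and its mean over a handful of pixel labels. It illustrates the standard definition only (intersection of predicted and true pixels of a class divided by their union) and is not the arcgis.learn implementation; the function name `iou_per_class` and the tiny label sequences are invented for the example.

```python
def iou_per_class(y_true, y_pred, num_classes):
    """Return the IoU of each class over flat sequences of integer labels."""
    scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        # Pixels labeled c in both the ground truth and the prediction.
        intersection = sum(1 for t, p in pairs if t == c and p == c)
        # Pixels labeled c in either the ground truth or the prediction.
        union = sum(1 for t, p in pairs if t == c or p == c)
        # A class absent from both sequences has no defined IoU.
        scores.append(intersection / union if union else float("nan"))
    return scores


# Six pixels with three classes (made-up labels for illustration).
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]

per_class = iou_per_class(y_true, y_pred, num_classes=3)
mean_iou = sum(per_class) / len(per_class)
```

Here each class has an IoU of 0.5, so the mean is also 0.5. Reporting the per-class values alongside the mean helps show when a rare class is segmented poorly even though the average looks acceptable.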
Average Precision Score
This method computes the average of the precision on the validation set for each class. This function is only supported by object detection models.
dl.average_precision_score()
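As a rough illustration of what an average precision score measures for one class, the sketch below computes non-interpolated average precision from a ranked list of detections. This is one common formulation, chosen here as an assumption; the exact variant used by arcgis.learn (for example, its IoU threshold or interpolation scheme) is not stated in this guide. The function name `average_precision` and the input format are hypothetical.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Non-interpolated average precision for a single class.

    is_true_positive: one flag per detection, ordered by descending
                      confidence; True where the detection matched a
                      ground-truth object.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, matched in enumerate(is_true_positive, start=1):
        if matched:
            hits += 1
            # Precision measured at the rank of each correct detection.
            precision_sum += hits / rank
    return precision_sum / num_ground_truth


# Three detections, the second one a false positive, two objects in total.
ap = average_precision([True, False, True], num_ground_truth=2)
```

In this example the precision is 1/1 at the first correct detection and 2/3 at the second, so the score is their average, about 0.83. A false positive ranked above a correct detection lowers the score, which is why confident mistakes are penalized more than low-confidence ones.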
Fine tune AutoDL models using ImageryModel
Once the best performing network is identified, it can be further fine-tuned using the ImageryModel class. This class supports methods that can be used to load, fine-tune, and save the model for further use.
im = ImageryModel()
Load the model
- The load method loads a model that the AutoDL class saved to disk. It accepts the following parameters:
  - path: Path to the Esri Model Definition (emd or dlpk) file
  - data: The data object returned from the prepare_data function
im.load("path_to_emd_file", data)
Learning rate
The lr_find method runs the Learning Rate Finder, which helps in choosing the optimum learning rate for training the model.
im.lr_find()
Train the model
The loaded model can be trained further using the fit method. This method trains the model for the specified number of epochs while using the specified learning rates.
im.fit(10)
Save the model
This method saves the model weights and creates an Esri Model Definition and a Deep Learning Package zip for deployment to Image Server or ArcGIS Pro.
im.save("path_to_save_model")
Conclusion
This guide has explained how the AutoDL class can be used to automate the training and evaluation of multiple deep learning models supported by the arcgis.learn API. For every step in the workflow, we described the relevant method and discussed its usage. This guide can be a starting point for developers to train and evaluate the performance of multiple arcgis.learn supported models.
For more information about the API, refer to the API reference.