- URL: https://<geoanalytics-url>/ForestBasedClassificationAndRegression
- Methods: GET
- Version Introduced: 10.7
Description
The ForestBasedClassificationAndRegression operation creates models and generates predictions using an adaptation of Leo Breiman's random forest algorithm, a supervised machine learning method. Predictions can be performed for both categorical variables (classification) and continuous variables (regression). Explanatory variables can take the form of fields in the attribute table of the training features. In addition to validating model performance on the training data, you can make predictions for another feature dataset.
The following are examples:
- You have seagrass occurrence and a number of environmental explanatory variables that have been enriched using a multivariable grid to calculate distances to factories upstream and major ports. Future seagrass occurrence can be predicted based on future projections for those same environmental explanatory variables.
- You have crop yield data at hundreds of farms across the country, along with other attributes at each of those farms (number of employees, acreage, and so on). Using this data, you can provide a set of features representing farms where you don't have crop yield (but you do have all of the other variables), and make a prediction about crop yield.
- Housing values can be predicted based on the prices of houses sold in the current year. The sale price of homes sold, along with information about the number of bedrooms, distance to schools, proximity to major highways, average income, and crime counts, can be used to predict sale prices of similar homes.
Request parameters
Parameter | Details |
---|---|
predictionType | Specifies the operation mode of the tool. The tool can be run to train a model to only assess performance or to train a model and predict features. Prediction types are as follows: |
inFeatures | The features that will be used to train the model. This layer must include fields representing the variable to predict and the explanatory variables. Syntax: As described in Feature input, this parameter can be one of the following: |
featuresToPredict | A feature layer representing locations where predictions will be made. This layer must include explanatory variable fields that correspond to fields used in Syntax: As described in Feature input, this parameter can be one of the following: |
variablePredict | The variable from the |
explanatoryVariables | A list of fields representing the explanatory variables and a Boolean value denoting whether the fields are categorical. The explanatory variables help predict the value or category of the |
numberOfTrees | The number of trees to create in the forest model. More trees generally result in more accurate model predictions, but the model takes longer to calculate. The default number of trees is 100. Values must be greater than 0. |
minimumLeafSize | The minimum number of observations required to keep a leaf (that is, the terminal node on a tree without further splits). The default minimum for regression is 5, and the default for classification is 1. For very large datasets, increasing these numbers decreases the run time of the tool. |
maximumTreeDepth | The maximum number of splits that will be made down a tree. With a large maximum depth, more splits are created, which may increase the chances of overfitting the model. The default is data driven and depends on the number of trees created and the number of variables included. |
sampleSize | The percentage of the |
randomVariables | The number of explanatory variables used to create each decision tree. Each decision tree in the forest is created using a random subset of the specified explanatory variables. Increasing the number of variables used in each decision tree increases the chances of overfitting the model, particularly if there are one or two dominant variables. A common practice is to use the square root of the total number of explanatory variables if your |
percentageForValidation | The percentage (between 0 percent and 50 percent) of |
createVariableOfImportanceTab | A Boolean that specifies whether an output table will be generated that contains information describing the importance of each explanatory variable used in the model created. Values: |
explanatoryVariableMatching | A list of the |
outputTrainedName | The task will create a feature service of the results. You define the name of the service. |
context | The Syntax: |
f | The response format. The default response format is Values: |
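The defaults and limits stated in the table can be checked on the client before a request is sent. The following is a minimal sketch only: the function name `check_forest_parameters` and the `categorical_target` flag are hypothetical, the parameter names are taken from the sample request, and the 0 to 50 range is assumed to be inclusive. The service may apply its own defaults and validation differently.

```python
def check_forest_parameters(params, categorical_target):
    """Return a copy of params with the documented defaults filled in.

    Hypothetical client-side helper; raises ValueError when a value falls
    outside a range stated in the parameter table.
    """
    checked = dict(params)

    # The default number of trees is 100; values must be greater than 0.
    trees = checked.setdefault("numberOfTrees", 100)
    if trees <= 0:
        raise ValueError("numberOfTrees must be greater than 0")

    # Default minimum leaf size: 1 for classification, 5 for regression.
    checked.setdefault("minimumLeafSize", 1 if categorical_target else 5)

    # The validation percentage is documented as between 0 and 50 percent
    # (treated as inclusive here, which is an assumption).
    validation = checked.get("percentageForValidation")
    if validation is not None and not 0 <= validation <= 50:
        raise ValueError("percentageForValidation must be between 0 and 50")

    return checked


regression = check_forest_parameters({"numberOfTrees": 20}, categorical_target=False)
classification = check_forest_parameters({}, categorical_target=True)
```

Here `regression` keeps the supplied 20 trees and picks up a leaf size of 5, while `classification` falls back to 100 trees and a leaf size of 1.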
Example usage
Below is a sample request URL for ForestBasedClassificationAndRegression:
https://machine.domain.com/webadaptor/rest/services/System/GeoAnalyticsTools/GPServer/ForestBasedClassificationAndRegression/submitJob?
predictionType=Train&inFeatures={"url":"https://webadaptor.domain.com/server/rest/services/Hurricane/hurricaneTrack/0"}&featuresToPredict={"url":"https://webadaptor.domain.com/server/rest/services/USA/cities/0"}&variablePredict={"fieldName":"shelterCapacity","categorical":true}&explanatoryVariables={"fieldName":"townDensity","categorical":true}&numberOfTrees=20&minimumLeafSize=6&maximumTreeDepth=10&sampleSize=95&randomVariables=3&percentageForValidation=10&createVariableOfImportanceTab=false&explanatoryVariableMatching=[{"predictionLayerField":"Hurricane2019","trainingLayerField":"hurricanesIn2019"},{"predictionLayerField":"ShelterLocations","trainingLayerField":"CorpusChristiShelters"}]&outputTrainedName=myOutput&context={"extent":{"xmin":-122.68,"ymin":45.53,"xmax":-122.45,"ymax":45.6,"spatialReference":{"wkid":4326}}}&f=json
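Because several parameters carry JSON values, it can be easier to assemble the query string in code than by hand. The sketch below uses only the Python standard library and a subset of the values from the sample request. The helper name `build_forest_query` is hypothetical, and the sketch only builds and decodes the string; it does not contact a server.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_forest_query(params):
    """Serialize a parameter dict into a URL-encoded query string.

    dict, list, and bool values are JSON-encoded first (True -> true),
    and everything else is converted with str().
    """
    encoded = {}
    for name, value in params.items():
        if isinstance(value, (dict, list, bool)):
            encoded[name] = json.dumps(value, separators=(",", ":"))
        else:
            encoded[name] = str(value)
    return urlencode(encoded)


params = {
    "predictionType": "Train",
    "inFeatures": {
        "url": "https://webadaptor.domain.com/server/rest/services/Hurricane/hurricaneTrack/0"
    },
    "variablePredict": {"fieldName": "shelterCapacity", "categorical": True},
    "explanatoryVariables": {"fieldName": "townDensity", "categorical": True},
    "numberOfTrees": 20,
    "minimumLeafSize": 6,
    "outputTrainedName": "myOutput",
    "f": "json",
}

query = build_forest_query(params)

# Decode the string again to confirm that the JSON values survive encoding.
decoded = parse_qs(query)
variable_predict = json.loads(decoded["variablePredict"][0])
```

Percent-encoding the JSON values this way avoids the unbalanced brackets and braces that are easy to introduce when a long URL is typed manually.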
Response
When you submit a request, the service assigns a unique job ID for the transaction.
Syntax:
{
"jobId": "<unique job identifier>",
"jobStatus": "<job status>"
}
After the initial request is submitted, you can use jobId to periodically check the status of the job and messages as described in Check job status. Once the job has successfully completed, use jobId to retrieve the results. To track the status, you can make a request of the following form:
https://<analysis-url>/ForestBasedClassificationAndRegression/jobs/<jobId>
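Periodic status checks can be sketched as a small polling loop. In the example below, `wait_for_job` and its `fetch_status` argument are hypothetical: `fetch_status` stands in for whatever function issues the status request and returns the parsed JSON response. The terminal status values listed in the code are assumptions that this page does not enumerate, so confirm them against Check job status. A canned sequence of responses replaces the real service so that the sketch runs without a network connection.

```python
import time


def wait_for_job(fetch_status, job_id, interval=5.0, max_checks=60, sleep=time.sleep):
    """Call fetch_status(job_id) until the job reaches a terminal status.

    fetch_status must return a dict with "jobId" and "jobStatus" keys,
    matching the response syntax shown above.
    """
    # Assumed terminal status values; not listed in this document.
    terminal = {
        "esriJobSucceeded",
        "esriJobFailed",
        "esriJobCancelled",
        "esriJobTimedOut",
    }
    for _ in range(max_checks):
        response = fetch_status(job_id)
        if response["jobStatus"] in terminal:
            return response
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_checks} checks")


# Stand-in for the real service: one running response, then a finished one.
_canned = iter([
    {"jobId": "j1", "jobStatus": "esriJobExecuting"},
    {"jobId": "j1", "jobStatus": "esriJobSucceeded"},
])
result = wait_for_job(lambda job_id: next(_canned), "j1", sleep=lambda seconds: None)
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP client and makes it straightforward to test.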
Access results
When the status of the job request indicates that the job succeeded, you can access the results of the analysis by making a request of the following form:
https://<analysis-url>/ForestBasedClassificationAndRegression/jobs/<jobId>/results/<response type>?token=<your token>&f=json
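The request form above can be filled in programmatically. This is a minimal sketch: the helper `result_url`, the base URL, the job ID, the token, and the result name `output` are all placeholder values chosen for illustration and are not taken from this page.

```python
from urllib.parse import quote, urlencode


def result_url(base_url, job_id, result_name, token):
    """Build a results URL of the form shown above."""
    path = (
        f"{base_url.rstrip('/')}/ForestBasedClassificationAndRegression"
        f"/jobs/{quote(job_id)}/results/{quote(result_name)}"
    )
    # token and f are sent as query parameters, in that order.
    return f"{path}?{urlencode({'token': token, 'f': 'json'})}"


# Placeholder values only; substitute your own URL, job ID, and token.
url = result_url("https://example.com/analysis/", "j123", "output", "abc")
```

With these placeholder inputs, `url` is `https://example.com/analysis/ForestBasedClassificationAndRegression/jobs/j123/results/output?token=abc&f=json`.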
Response | Description |
---|---|
| The input features that are fit to the model. The type of feature (point, line, or polygon) depends on the input layers.
The result has properties for parameter name, data type, and value. The contents of
See Feature output for more information about how the result layer is accessed. |
| The features predicted using the model. The type of feature (table, point, line, or polygon) depends on the input layers. This result is optional and is only returned when
|
| A table representing the variable of importance from the model fit. This result is optional and is only returned when
|
| The
|