Find Point Clusters

URL: https://<geoanalytics-url>/FindPointClusters
Methods: GET
Version Introduced: 10.6.1

Description

The FindPointClusters operation extracts clusters from your input point features and identifies any surrounding noise. Two clustering methods are available: DBSCAN and HDBSCAN. Both methods can find clusters in space; only DBSCAN can also find spatiotemporal clusters, and only in time-enabled point layers. For example, a nongovernmental organization is studying a particular pest-borne disease. It has a point dataset representing households in a study area, some of which are infested and some of which are not. By using the Find Point Clusters tool, an analyst can identify clusters of infested households to help pinpoint an area to begin treatment and extermination of pests. To learn more, see the ArcGIS Pro documentation on How Density-based Clustering works.

Request parameters

Parameter	Details
`inputLayer` (Required)	The point features from which clusters will be found. Syntax: As described in Feature input, this parameter can be one of the following: (1) a URL to a feature service layer with an optional filter to select specific features; (2) a URL to a big data catalog service layer with an optional filter to select specific features; (3) a feature collection. REST web example: `{"url": "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter": "Month = 'September'"}` REST scripting example: `"inputLayer": {"url": "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter": "Month = 'September'"}`
`clusterMethod` (Required)	The algorithm used for cluster analysis. This parameter must be specified as either `DBSCAN` or `HDBSCAN`. The `DBSCAN` algorithm uses a specified distance to separate dense clusters from sparser noise. `DBSCAN` is faster than `HDBSCAN`, but is only appropriate if there is a very clear `searchDistance` that works well to define all clusters that may be present. `DBSCAN` finds clusters that have similar densities. The `HDBSCAN` algorithm finds clusters of points in a similar way to `DBSCAN` but uses varying distances, allowing for clusters with varying densities based on cluster probability (or stability). `HDBSCAN` is very data-driven and does not require or use `searchDistance`, but it is a more time-consuming calculation than `DBSCAN`. By default, `DBSCAN` finds clusters in two-dimensional space only. When `timeMethod` is set to `Linear` and `inputLayer` is time enabled and of type instant, `DBSCAN` will discover clusters in both space and time. When searching for cluster members, at least `minFeaturesCluster` features must be found within the specified search distance and search duration to form a cluster. Temporal clustering is available at ArcGIS Enterprise 10.8. `HDBSCAN` currently only supports spatial clustering and will not use time to discover clusters. Note: When using the `HDBSCAN` algorithm with an input layer containing more than 3 million features, the tool may fail unless you increase the value of the `javaHeapSize` parameter on the GeoAnalyticsTools GP Service. Roughly 2 GB of heap size is needed per 3 million features. The amount of RAM specified by `javaHeapSize` should be available on each GeoAnalytics Server machine in addition to the 16 GB normally required by GeoAnalytics Server. For example, if you want to cluster 9 million features with `HDBSCAN`, you should set `javaHeapSize` to no less than 6144 MB (or 6 GB). In this case, each GeoAnalytics Server machine should have a total of at least 22 GB of RAM available. REST web example: `DBSCAN` REST scripting example: `"clusterMethod": "DBSCAN"`
`timeMethod` (Optional)	When this parameter is set to `Linear` and `clusterMethod` is `DBSCAN`, both space and time will be used to find point clusters. If `clusterMethod` is `HDBSCAN`, this parameter will be ignored and clusters will be found in space only. This parameter can only be used if `inputLayer` has time enabled and is of type instant. Temporal clustering is available at ArcGIS Enterprise 10.8. REST web example: `Linear` REST scripting example: `"timeMethod": "Linear"`
`minFeaturesCluster` (Required)	This parameter is used differently depending on the clustering method chosen. For `DBSCAN`, this parameter specifies the number of features that must be found within a search range of a point for that point to start forming a cluster. The results may include clusters with fewer features than this value. The search range distance is set using the `searchDistance` parameter. For `HDBSCAN`, the `minFeaturesCluster` parameter specifies the number of features neighboring each point (including the point itself) that will be considered when estimating density. This number is also the minimum cluster size allowed when extracting clusters. REST web example: `10` REST scripting example: `"minFeaturesCluster": 5`
`searchDistance` (Optional)	When using `DBSCAN`, this parameter is the distance within which `minFeaturesCluster` must be found. This parameter is not used when `HDBSCAN` is chosen as the clustering method. REST web example: `108.3` REST scripting example: `"searchDistance": 100`
`searchDistanceUnit` (Optional)	The units used for the `searchDistance` parameter. This parameter is required when using `DBSCAN` but will not be used with `HDBSCAN`. Values: `Meters` | `Kilometers` | `Feet` | `FeetInt` | `FeetUS` | `Miles` | `MilesInt` | `MilesUS` | `NauticalMiles` | `NauticalMilesInt` | `NauticalMilesUS` | `Yards` | `YardsInt` | `YardsUS` REST web example: `Meters` REST scripting example: `"searchDistanceUnit": "Miles"`
`searchDuration` (Optional)	When using `DBSCAN` with `timeMethod` set to `Linear`, this parameter is the time duration within which `minFeaturesCluster` must be found. This parameter is not used when `HDBSCAN` is chosen as the clustering method or when `timeMethod` is not used.
`searchDurationUnit` (Optional)	The units used for the `searchDuration` parameter. This parameter is required when using `DBSCAN` with `timeMethod` set to `Linear`. It is not used with `HDBSCAN` or with space-only `DBSCAN`.
`outputName` (Required)	The task will create a feature service of the results. You define the name of the service. REST web example: `myOutput` REST scripting example: `"outputName": "myOutput"`
`context` (Optional)	The `context` parameter contains additional settings that affect task execution. For this task, there are four settings. Extent (`extent`): A bounding box that defines the analysis area. Only those features that intersect the bounding box will be analyzed. Processing spatial reference (`processSR`): The features will be projected into this coordinate system for analysis. Output spatial reference (`outSR`): The features will be projected into this coordinate system after the analysis to be saved. The output spatial reference for the spatiotemporal big data store is always WGS84. Data store (`dataStore`): Results will be saved to the specified data store. The default is the spatiotemporal big data store. Syntax: `{ "extent": {extent}, "processSR": {spatial reference}, "outSR": {spatial reference}, "dataStore": {data store} }`
`f`	The response format. The default response format is `html`. Values: `html` | `json` | `pjson`
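
The way `clusterMethod` interacts with the distance, duration, and time parameters can be sketched in Python. This is a minimal illustration and not part of the REST API: `build_cluster_params` and `hdbscan_heap_mb` are hypothetical helper names. Treating `searchDistance` (for `DBSCAN`) and `searchDuration` (for `Linear` time) as mandatory is an inference from the descriptions above, and the heap estimate assumes the rough figure of 2 GB per 3 million features is applied in whole 3-million-feature steps.

```python
import math


def build_cluster_params(cluster_method, min_features_cluster, output_name,
                         search_distance=None, search_distance_unit=None,
                         time_method=None, search_duration=None,
                         search_duration_unit=None):
    """Return a dict of FindPointClusters parameters for the chosen method."""
    if cluster_method not in ("DBSCAN", "HDBSCAN"):
        raise ValueError("clusterMethod must be DBSCAN or HDBSCAN")

    params = {
        "clusterMethod": cluster_method,
        "minFeaturesCluster": min_features_cluster,
        "outputName": output_name,
    }

    if cluster_method == "HDBSCAN":
        # HDBSCAN does not use distance, duration, or time settings: omit them.
        return params

    # Assumption: DBSCAN cannot run without a distance and its unit.
    if search_distance is None or search_distance_unit is None:
        raise ValueError("DBSCAN needs searchDistance and searchDistanceUnit")
    params["searchDistance"] = search_distance
    params["searchDistanceUnit"] = search_distance_unit

    if time_method == "Linear":
        # Assumption: spatiotemporal DBSCAN needs a duration and its unit.
        if search_duration is None or search_duration_unit is None:
            raise ValueError("Linear time needs searchDuration and searchDurationUnit")
        params["timeMethod"] = "Linear"
        params["searchDuration"] = search_duration
        params["searchDurationUnit"] = search_duration_unit

    return params


def hdbscan_heap_mb(feature_count):
    """Estimate javaHeapSize in MB: 2048 MB per started block of 3 million features."""
    return math.ceil(feature_count / 3_000_000) * 2048


heap_for_9m = hdbscan_heap_mb(9_000_000)  # matches the 6144 MB example in the note
hdbscan_params = build_cluster_params("HDBSCAN", 10, "myOutput",
                                      search_distance=100,
                                      search_distance_unit="Meters")
print(heap_for_9m, hdbscan_params)
```

Adding the 16 GB baseline to the 9-million-feature estimate gives the 22 GB total mentioned in the `clusterMethod` note.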

Example usage

Below is a sample request URL for FindPointClusters:

https://hostname.domain.com/webadaptor/rest/services/System/GeoAnalyticsTools/GPServer/FindPointClusters/submitJob?inputLayer={"url":"https://hostname.domain.com/webadaptor/rest/services/Hurricane/hurricaneTrack/0"}&clusterMethod=HDBSCAN&minFeaturesCluster=10&searchDistance=&searchDistanceUnit=&outputName=myOutput&context={"extent":{"xmin":-122.68,"ymin":45.53,"xmax":-122.45,"ymax":45.6,"spatialReference":{"wkid":4326}}}&f=json
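
A request URL of this shape can be assembled programmatically so that the JSON-valued parameters are percent-encoded correctly. The following Python sketch uses only the standard library. `build_submit_url` is a hypothetical helper name, and the host names and layer URL are the placeholder values from the sample above, not a real service.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_submit_url(tool_url, input_layer, params, context=None, response_format="json"):
    """Build a submitJob URL; JSON-valued parameters are serialized, then encoded."""
    query = dict(params)
    query["inputLayer"] = json.dumps(input_layer, separators=(",", ":"))
    if context is not None:
        query["context"] = json.dumps(context, separators=(",", ":"))
    query["f"] = response_format
    return tool_url.rstrip("/") + "/submitJob?" + urlencode(query)


# Placeholder values taken from the sample request above.
tool_url = ("https://hostname.domain.com/webadaptor/rest/services/System/"
            "GeoAnalyticsTools/GPServer/FindPointClusters")
input_layer = {"url": "https://hostname.domain.com/webadaptor/rest/services/Hurricane/hurricaneTrack/0"}
context = {"extent": {"xmin": -122.68, "ymin": 45.53, "xmax": -122.45, "ymax": 45.6,
                      "spatialReference": {"wkid": 4326}}}
params = {"clusterMethod": "HDBSCAN", "minFeaturesCluster": 10, "outputName": "myOutput"}

url = build_submit_url(tool_url, input_layer, params, context=context)

# Decoding the query string recovers the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["clusterMethod"][0], json.loads(decoded["inputLayer"][0])["url"])
```

Because `HDBSCAN` does not use `searchDistance` or `searchDistanceUnit`, this sketch leaves them out rather than sending empty values; whether the service treats the two forms identically is not confirmed here.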

Response

When you submit a request, the service assigns a unique job ID for the transaction.

Syntax:

{
  "jobId": "<unique job identifier>",
  "jobStatus": "<job status>"
}

After the initial request is submitted, you can use jobId to periodically check the status of the job and messages as described in Check job status. Once the job has successfully completed, use jobId to retrieve the results. To track the status, you can make a request of the following form:

https://<analysis-url>/FindPointClusters/jobs/<jobId>
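
Periodic status checks are usually written as a polling loop. The Python sketch below is one way to do that under stated assumptions: `wait_for_job` and `http_fetcher` are hypothetical helper names, the status response is assumed to carry the same `jobStatus` key shown in the syntax above, and the status names other than `esriJobSucceeded` come from the general ArcGIS geoprocessing job model rather than from this page. The fetch step is passed in as a callable so the loop can be exercised without a server.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Statuses after which the job will not change any further.
TERMINAL_STATUSES = {"esriJobSucceeded", "esriJobFailed",
                     "esriJobCancelled", "esriJobTimedOut"}


def http_fetcher(job_url, token=None):
    """Return a callable that requests the job URL and parses the JSON reply."""
    query = {"f": "json"}
    if token is not None:
        query["token"] = token

    def fetch():
        with urlopen(job_url + "?" + urlencode(query)) as response:
            return json.load(response)

    return fetch


def wait_for_job(fetch_status, interval_seconds=5.0, max_checks=120, sleep=time.sleep):
    """Call fetch_status until the job reaches a terminal status or checks run out."""
    for attempt in range(max_checks):
        info = fetch_status()
        if info.get("jobStatus") in TERMINAL_STATUSES:
            return info
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    raise TimeoutError("job did not finish within the allowed number of checks")


# Offline demonstration with canned replies instead of a real server.
canned = iter([{"jobId": "j1", "jobStatus": "esriJobSubmitted"},
               {"jobId": "j1", "jobStatus": "esriJobExecuting"},
               {"jobId": "j1", "jobStatus": "esriJobSucceeded"}])
final = wait_for_job(lambda: next(canned), sleep=lambda seconds: None)
print(final["jobStatus"])
```

Against a real deployment, `wait_for_job(http_fetcher(job_url, token))` would poll the job URL shown above, with `job_url` and `token` supplied by the caller.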

Access results

When the status of the job request is esriJobSucceeded, you can access the results of the analysis by making a request of the following form:

https://<analysis-url>/FindPointClusters/jobs/<jobId>/results/output?token=<your token>&f=json

Response Description

output

The output parameter will contain the cluster results. Fields added to output include all the fields from the inputLayer and the following:

CLUSTER_ID: A numeric value showing which cluster a feature falls into. A feature with a CLUSTER_ID of -1 does not fall into a cluster and is noise.
COLOR_ID: An ID value used for rendering results. Multiple clusters will each be assigned a different color. Colors will be assigned and repeated so that each cluster is visually distinct from its neighboring clusters.

When the HDBSCAN algorithm is used to find clusters, the following fields will also be added to output:

PROB: The probability that a feature belongs in its assigned cluster.
OUTLIER: The likelihood that a feature is an outlier within its own cluster. A larger value indicates that the feature is more likely to be an outlier.
EXEMPLAR: Indicates which features are most representative of each cluster. These features are indicated by a value of 1.
STABILITY: The persistence of each cluster across a range of scales. A larger score indicates that a cluster persists over a wider range of distance scales.

{"url": "https://<analysis-url>/FindPointClusters/jobs/<jobId>/results/output"}

The result has properties for parameter name, data type, and value. The contents of value depend on the outputName parameter provided in the initial request. The value contains the URL of the feature service layer.

{
  "paramName": "output",
  "dataType": "GPRecordSet",
  "value":{"url": "<hosted featureservice layer url>"}
}

See Feature output for more information about how the result layer is accessed.
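
Once the result JSON has been retrieved, the layer URL and the cluster fields can be read with a few lines of Python. The sketch below is illustrative only: `result_layer_url` and `summarize_clusters` are hypothetical helper names, the layer URL is a placeholder, and the attribute rows are made-up stand-ins for features queried from the output layer.

```python
from collections import Counter


def result_layer_url(result):
    """Return the output layer URL from a result shaped like the JSON above."""
    if result.get("paramName") != "output":
        raise ValueError("expected the 'output' result parameter")
    return result["value"]["url"]


def summarize_clusters(attribute_rows):
    """Count features per CLUSTER_ID and report noise (CLUSTER_ID of -1) separately."""
    sizes = Counter(row["CLUSTER_ID"] for row in attribute_rows)
    noise_count = sizes.pop(-1, 0)
    return dict(sizes), noise_count


# Made-up example data; real rows would come from the output feature layer.
result = {"paramName": "output", "dataType": "GPRecordSet",
          "value": {"url": "https://example.invalid/FeatureServer/0"}}
rows = [{"CLUSTER_ID": 1}, {"CLUSTER_ID": 1}, {"CLUSTER_ID": 2}, {"CLUSTER_ID": -1}]

layer_url = result_layer_url(result)
cluster_sizes, noise_count = summarize_clusters(rows)
print(layer_url, cluster_sizes, noise_count)
```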