The tracks module in GeoAnalytics Engine includes a collection of functions for managing and analyzing track data.
Tracks are linestrings that represent the change in an entity's location over time. Each vertex in the linestring has a timestamp (stored as the M-value) and the vertices are ordered sequentially.
Tracks are typically created from point observations of an entity over time. Some examples of track data include:
- Delivery vehicle locations—The path of travel of each truck can be represented using tracks. The tracks can then be used to find anomalies in delivery routes, quantify road usage, create visualizations, and more.
- Marine automatic identification system (AIS) records—Historical vessel locations can be turned into tracks to see the path of travel of each vessel. These tracks can then be used to summarize vessel movement or to find outliers that could indicate illegal or dangerous activities.
- Locations of mobile personnel—Data collected by individuals working in the field (for example, with ArcGIS Field Maps) can be analyzed to assess staffing needs and optimize future dispatch of personnel.
Like the functions in the geoanalytics.sql module, GeoAnalytics Engine track functions can be called with Python functions or in a PySpark SQL query statement. Unlike the functions in geoanalytics.sql, all functions in the geoanalytics.tracks module operate on one or more linestring columns that contain tracks.
Create tracks
In most cases, tracks are created from a point column and a timestamp column using TRK_AggrCreateTrack. This function orders the input points sequentially using the timestamp column and then connects the points with linestrings to form a track. Because TRK_AggrCreateTrack is an aggregate function, it can operate on a grouped DataFrame to create one track for each group in the DataFrame. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.
For example, if you have a DataFrame of marine vessel observations, you may group on a column of vessel names to create one track for each vessel. Each track in the result would represent the path of travel of a single vessel. You can also group by more than one column, for example both vessel name and day of the week. In this case, each track in the result would represent the path of travel of a single vessel on a specific day of the week.
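The grouping and ordering steps can be illustrated without Spark. The following is a minimal plain-Python sketch of the idea, not the TRK_AggrCreateTrack implementation. The vessel names, coordinates, and the create_tracks helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical observations: (vessel name, x, y, Unix timestamp in seconds).
observations = [
    ("Aurora", 2.0, 1.0, 1700000120),
    ("Borealis", 5.0, 5.0, 1700000000),
    ("Aurora", 0.0, 0.0, 1700000000),
    ("Aurora", 1.0, 0.5, 1700000060),
    ("Borealis", 6.0, 5.5, 1700000090),
]


def create_tracks(rows):
    """Group rows by entity, then order each group's vertices by timestamp."""
    groups = defaultdict(list)
    for name, x, y, t in rows:
        groups[name].append((x, y, t))
    # Each track is a list of (x, y, m) vertices, where m holds the timestamp.
    return {name: sorted(pts, key=lambda v: v[2]) for name, pts in groups.items()}


tracks = create_tracks(observations)
for name, vertices in tracks.items():
    print(name, vertices)
```

Grouping by more than one column corresponds to using a tuple such as (name, day of week) as the dictionary key in this sketch.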
Tracks can also be imported from text or binary as linestrings. Any single-part linestring is considered a valid track if it has M-values that represent the Unix time at each vertex. In addition, track vertices must be ordered by their M-values, and there can be no duplicate M-values. To check whether a track is valid, use TRK_IsValid. In most cases, track functions will return null for invalid tracks.
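The ordering and uniqueness rules above amount to requiring strictly increasing M-values. The sketch below expresses that check in plain Python. It is an illustration of the stated rules only; TRK_IsValid may apply additional checks, and the is_valid_track helper and the two-vertex minimum are assumptions.

```python
def is_valid_track(vertices):
    """Return True if M-values strictly increase from vertex to vertex.

    `vertices` is a list of (x, y, m) tuples describing a single-part
    linestring, where m is the Unix time at that vertex.
    """
    if len(vertices) < 2:
        return False  # assumption: a linestring needs at least two vertices
    ms = [m for _, _, m in vertices]
    # Strictly increasing M-values imply both ordering and no duplicates.
    return all(a < b for a, b in zip(ms, ms[1:]))


print(is_valid_track([(0, 0, 100), (1, 1, 160), (2, 1, 220)]))  # ordered
print(is_valid_track([(0, 0, 100), (1, 1, 100), (2, 1, 220)]))  # duplicate M
print(is_valid_track([(0, 0, 160), (1, 1, 100), (2, 1, 220)]))  # out of order
```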
Query tracks
Many of the functions in geoanalytics.tracks can be used to describe tracks or extract geometries from tracks.
For example, you can find track properties that may be useful in further analysis or visualization like length, duration, speed, and more.
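To make these properties concrete, here is a minimal plain-Python sketch that derives length, duration, and average speed from a list of (x, y, m) vertices. It assumes planar coordinates and M-values in seconds. The helper names are hypothetical and are not the GeoAnalytics Engine functions.

```python
import math


def track_length(vertices):
    """Sum the planar length of every segment in the track."""
    return sum(math.dist(a[:2], b[:2]) for a, b in zip(vertices, vertices[1:]))


def track_duration(vertices):
    """Time elapsed between the first and last vertex."""
    return vertices[-1][2] - vertices[0][2]


def track_speed(vertices):
    """Average speed over the whole track, or None for a zero duration."""
    duration = track_duration(vertices)
    return track_length(vertices) / duration if duration > 0 else None


# Hypothetical track: coordinates in meters, M-values in seconds.
track = [(0, 0, 0), (3, 4, 10), (3, 10, 20)]
print(track_length(track), track_duration(track), track_speed(track))
```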
You can also extract track segments that meet certain criteria. For example, you can find the subset of a track that is before, after, or between given distances or durations.
Similarly, you can query for a point on a track that is a certain distance or duration from the start of the track. The reverse operation is to find the distance or duration along a track at which a point observation occurs, provided the point intersects the track.
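Querying a point by duration can be thought of as linear interpolation between the two vertices whose M-values bracket the requested time. The sketch below assumes a valid track with strictly increasing M-values and planar coordinates. The point_at_duration helper is hypothetical and only illustrates the concept.

```python
def point_at_duration(vertices, seconds):
    """Interpolate the position `seconds` after the track's first vertex.

    Returns None when the requested time falls outside the track.
    """
    target = vertices[0][2] + seconds
    if target < vertices[0][2] or target > vertices[-1][2]:
        return None
    for (x0, y0, m0), (x1, y1, m1) in zip(vertices, vertices[1:]):
        if m0 <= target <= m1:
            # Fraction of this segment's time span that has elapsed.
            f = (target - m0) / (m1 - m0)
            return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
    return None


# Hypothetical track: M-values are seconds since the start.
track = [(0, 0, 0), (10, 0, 100), (10, 10, 200)]
print(point_at_duration(track, 50))
print(point_at_duration(track, 150))
```

The reverse lookup runs the same interpolation in the other direction: find the segment containing the point, then interpolate the M-value at that position.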
Split tracks
Splitting tracks into smaller track segments may be useful in certain cases for improving performance, enhancing visualization, or optimizing storage.
For example, long tracks might occur when point observations have been collected over a long period of time, over a long distance, or both. Representing these long tracks as a single linestring can limit performance because the geometry data is stored in a single row of a DataFrame. By splitting long tracks into segments and creating a new row for each segment, you can increase the potential parallelism of your data and thus improve performance.
There are several ways to split tracks:
- Split at a specified distance or duration interval—Use this approach when you want to retain all segments of the original track. This can be useful for performance reasons as described above, but also for analysis and visualization by allowing you to quantify change along a track at discrete intervals.
For example, instead of finding the speed of an entire track, you could split the track into 1 kilometer segments and find the speed for each segment to see how the speed changes over distance or time.
For more information see TRK_SplitByDistance, TRK_SplitByDuration, and ST_Segments.
- Split at a specified gap between two observations—Use this approach to remove track segments that may be unimportant or a result of bad data. Track segments that are longer in distance or time than the specified value are removed from the track. This creates smaller tracks that contain more frequent or more proximate observations.
For example, consider a truck that is recording its location over time. The truck completes a delivery and then returns to a warehouse where it stops recording location data. More than two hours later, the truck begins recording its location again and leaves the warehouse to complete another delivery. If you created a track from the resulting data, the pause of more than two hours between deliveries would be included. By splitting this track with a two-hour time gap, you could obtain two tracks representing the individual deliveries and remove the line segment between them.
For more information see TRK_SplitByDistanceGap and TRK_SplitByTimeGap.
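The truck example can be sketched in plain Python as follows. This is an illustration of splitting on a time gap, not the TRK_SplitByTimeGap implementation. The split_by_time_gap helper, the sample timestamps, and the choice to drop single-vertex leftovers are assumptions.

```python
def split_by_time_gap(vertices, max_gap):
    """Split a track wherever consecutive vertices are more than
    `max_gap` seconds apart, discarding the segment that spans the gap."""
    parts, current = [], [vertices[0]]
    for prev, curr in zip(vertices, vertices[1:]):
        if curr[2] - prev[2] > max_gap:
            parts.append(current)  # close the track before the gap
            current = [curr]       # start a new track after the gap
        else:
            current.append(curr)
    parts.append(current)
    # Assumption: a single leftover vertex cannot form a linestring, so drop it.
    return [p for p in parts if len(p) >= 2]


# Hypothetical truck track: two deliveries separated by a three-hour pause.
truck = [
    (0, 0, 0), (5, 0, 600), (0, 0, 1200),         # first delivery
    (0, 0, 12000), (0, 7, 12600), (0, 0, 13200),  # second delivery
]
deliveries = split_by_time_gap(truck, max_gap=2 * 60 * 60)
print(len(deliveries))
```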
Densify tracks
Tracks can be densified like any other linestring using ST_Densify or ST_GeodesicDensify. These functions will interpolate the track's M-values along the densified vertices, ensuring that a track is still valid after densification. This can be useful if a track has very long segments due to observations being recorded at a low frequency or due to gaps in data. Densifying a track in these scenarios will improve the accuracy of track distance calculations. Geodesic densification is recommended for tracks that span large distances but is less performant than planar densification.
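The effect of interpolating M-values during planar densification can be sketched in plain Python. This is a conceptual illustration under the assumption of linear interpolation along each segment, not the ST_Densify implementation. The densify helper and its max_segment_length parameter are hypothetical.

```python
import math


def densify(vertices, max_segment_length):
    """Insert evenly spaced vertices so that no segment is longer than
    `max_segment_length`, interpolating M-values linearly."""
    out = [vertices[0]]
    for (x0, y0, m0), (x1, y1, m1) in zip(vertices, vertices[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        # Number of pieces this segment is divided into (at least one).
        n = max(1, math.ceil(length / max_segment_length))
        for i in range(1, n + 1):
            f = i / n
            out.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0), m0 + f * (m1 - m0)))
    return out


# Hypothetical track with one long segment: 10 units covered in 100 seconds.
dense = densify([(0, 0, 0), (10, 0, 100)], max_segment_length=2.5)
print(dense)
```

Because each interpolated M-value lies strictly between the M-values of the segment's endpoints, the densified track keeps strictly increasing timestamps.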
Compare tracks
Tracks are often compared to other tracks in order to find similar paths of travel, cotravelers, or outliers. Techniques that are used to compare linestrings can also be used to compare tracks in many cases. For a complete description of linestring similarity metrics in GeoAnalytics Engine see Similarity measures.
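As one illustration of comparing two tracks as linestrings, the sketch below computes a vertex-based (discrete) Hausdorff distance in plain Python. Hausdorff distance is used here only as an assumed example of a widely known similarity metric. The helper names are hypothetical, and the sketch ignores M-values and measures only between vertices.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any vertex of `a` to its nearest vertex of `b`."""
    return max(min(math.dist(p[:2], q[:2]) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical tracks that follow roughly parallel paths.
track_a = [(0, 0, 0), (10, 0, 100)]
track_b = [(0, 3, 0), (10, 4, 100)]
print(hausdorff(track_a, track_b))
```

A small value suggests the two paths stay close to each other, while a large value flags at least one vertex that is far from the other track.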
Other tools for track analysis
Several tools in GeoAnalytics Engine work specifically with track data. These tools operate on point observations and do not require you to create track linestrings prior to analysis. These tools include:
- Calculate Motion Statistics—Enrich track observations with statistics like speed, acceleration, and bearing.
- Detect Incidents—Classify track observations as an incident or not using an Arcade expression.
- Find Dwell Locations—Discover where an entity has been stationary using specified distance and time criteria.
- Reconstruct Tracks—Create track linestrings and buffer the tracks using a numeric column.
- Snap Tracks—Shift track observations to align with roads using road directionality and traversability.
- Trace Proximity Events—Detect possible transmission events by finding where entities have been proximate to other entities of interest.