geoanalytics.tracks.functions
after
- geoanalytics.tracks.functions.after(track, offset)
Returns a linestring column representing the subset of the input track that comes after the offset distance or offset duration from the start of the track. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, "kilometers") or (5, "minutes")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_After
- Parameters
track – Linestring column.
offset (pyspark.sql.Column) – The offset distance or offset duration. The offset must be greater than zero.
- Returns
Linestring column representing the subset of the input track that comes after the offset distance or offset duration from the start of the track.
- Return type
pyspark.sql.Column
aggr_create_track
- geoanalytics.tracks.functions.aggr_create_track(point, time)
Operates on a grouped DataFrame and creates tracks using the points in each group, where each point represents an entity’s observed location at an instant. The output tracks are linestrings that represent the shortest path between each observation. Each vertex in the linestring has a timestamp (stored as the M-value) and the vertices are ordered sequentially. You can group your DataFrame using DataFrame.groupBy() or with a GROUP BY clause in a SQL statement.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Aggr_CreateTrack
- Parameters
point (pyspark.sql.Column) – Point geometry column.
time – Timestamp column to order points by.
- Returns
Linestring column representing the result tracks.
- Return type
pyspark.sql.Column
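The grouping and ordering described for aggr_create_track can be illustrated without Spark. The following is a minimal plain-Python sketch, not the library's implementation: the helper name create_tracks, the tuple layout (entity_id, x, y, time), and the choice to store the M-value as a POSIX timestamp are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def create_tracks(observations):
    """Group (entity_id, x, y, time) observations and order each group by time.

    Returns {entity_id: [(x, y, m), ...]}. Storing m as a POSIX timestamp is
    an assumption of this sketch, not a statement about the library.
    """
    groups = defaultdict(list)
    for entity_id, x, y, time in observations:
        groups[entity_id].append((x, y, time.timestamp()))
    # Order the vertices of each group by their M-value (the timestamp).
    return {eid: sorted(verts, key=lambda v: v[2]) for eid, verts in groups.items()}


obs = [
    ("a", 1.0, 1.0, datetime(2021, 1, 1, 0, 10, tzinfo=timezone.utc)),
    ("b", 5.0, 5.0, datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc)),
    ("a", 0.0, 0.0, datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc)),
]
tracks = create_tracks(obs)
```

In this sketch, entity "a" ends up with two vertices ordered by time even though its observations arrived out of order, which mirrors the role of the time parameter.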
before
- geoanalytics.tracks.functions.before(track, offset)
Returns a linestring column representing the subset of the input track that is between the track start and the offset distance or offset duration. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, "kilometers") or (5, "minutes")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Before
- Parameters
track – Linestring column.
offset (pyspark.sql.Column) – The offset distance or offset duration. The offset must be greater than zero.
- Returns
Linestring column representing the subset of the input track that is between the track start and the offset distance or offset duration.
- Return type
pyspark.sql.Column
between
- geoanalytics.tracks.functions.between(track, start_offset, end_offset)
Returns a linestring column representing the subset of the input track that comes between the two offset distances or offset durations. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, "kilometers") or (5, "minutes")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Between
- Parameters
track – Linestring column.
start_offset (pyspark.sql.Column) – The start offset distance or start offset duration. The offset must be greater than zero.
end_offset (pyspark.sql.Column) – The end offset distance or end offset duration. The offset must be greater than zero.
- Returns
Linestring column representing the subset of the input track that comes between the two offset distances or offset durations on the track.
- Return type
pyspark.sql.Column
distance_along
- geoanalytics.tracks.functions.distance_along(track, point, max_deviation=0.0, output_units=None)
Returns a double column representing the length of the track between the track start and where the point intersects the track. You can optionally specify a max_deviation, which is the maximum distance a point can be from the track while still being considered on the track. The value is in the units of the track’s spatial reference.
If the input track and point do not have the same spatial reference, the point will be transformed to the spatial reference of the track.
The result is returned in the units specified by output_units. When output_units is None, the result is in the units of the input track’s spatial reference if it is projected; otherwise, the result is in meters.
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_DistanceAlong
- Parameters
track – Linestring column.
point (pyspark.sql.Column) – Point column.
max_deviation (float/int, optional) – Numeric value representing the maximum distance a point can be from the track while still being considered on the track.
output_units (str, optional) – The units of the result. Choose from Meters, Kilometers, Feet, Yards, Miles, or NauticalMiles.
- Returns
DoubleType column representing the length of the track between the track start and where the point intersects the track.
- Return type
pyspark.sql.Column
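The role of max_deviation can be illustrated with a planar sketch in plain Python. This is not the library's implementation: the helper below is hypothetical, it ignores spatial references and output units, and returning None for a point farther than max_deviation from the track is an assumption, since the reference does not state the result in that case.

```python
import math


def distance_along(track, point, max_deviation=0.0):
    """Planar length from the track start to the location closest to `point`.

    `track` is a list of (x, y) vertices. Returns None when the point is
    farther than `max_deviation` from the track (an assumption of this sketch).
    """
    best = None  # (deviation, distance along the track)
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(track, track[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue
        # Fraction along the segment of the closest location, clamped to [0, 1].
        t = ((point[0] - x1) * dx + (point[1] - y1) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        deviation = math.hypot(point[0] - (x1 + t * dx), point[1] - (y1 + t * dy))
        if best is None or deviation < best[0]:
            best = (deviation, travelled + t * seg_len)
        travelled += seg_len
    if best is None or best[0] > max_deviation:
        return None
    return best[1]


track = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
on_track = distance_along(track, (10.0, 4.0))
near_track = distance_along(track, (5.0, 0.5), max_deviation=1.0)
too_far = distance_along(track, (5.0, 0.5))
```

Here the point (5.0, 0.5) is only treated as being on the track once max_deviation is raised to cover its 0.5-unit offset.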
duration
- geoanalytics.tracks.functions.duration(track, output_units='seconds')
Returns a double column representing the duration of the input track. The duration is the difference between the first and last timestamps in the track. The result is returned in the units specified by output_units. Returns null for invalid tracks.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Duration
- Parameters
track – Linestring column.
output_units (str, optional) – The units of the result. Choose from Milliseconds, Seconds, Minutes, Hours, or Days.
- Returns
DoubleType column representing the track duration.
- Return type
pyspark.sql.Column
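The duration definition above (last timestamp minus first timestamp, expressed in the requested units) can be sketched in plain Python. The helper name track_duration and the representation of a track as a list of datetime objects are assumptions for illustration only.

```python
from datetime import datetime, timedelta

# One unit of each supported output unit, keyed by lowercase name.
UNIT = {
    "milliseconds": timedelta(milliseconds=1),
    "seconds": timedelta(seconds=1),
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}


def track_duration(timestamps, output_units="seconds"):
    """Difference between the last and first timestamps, in `output_units`."""
    if not timestamps:
        return None
    # Dividing one timedelta by another yields a float.
    return (timestamps[-1] - timestamps[0]) / UNIT[output_units.lower()]


stamps = [
    datetime(2021, 1, 1, 0, 0),
    datetime(2021, 1, 1, 0, 20),
    datetime(2021, 1, 1, 1, 30),
]
in_seconds = track_duration(stamps)
in_minutes = track_duration(stamps, "Minutes")
in_hours = track_duration(stamps, "Hours")
```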
duration_along
- geoanalytics.tracks.functions.duration_along(track, point, max_deviation=0.0, output_units='seconds')
Returns a double column representing the duration of the track between the track start and where the point intersects the track. You can optionally specify a max_deviation, which is the maximum distance a point can be from the track while still being considered on the track. The value is in the units of the track’s spatial reference.
The result is returned in the units specified by output_units. The default is seconds.
If the input track and point do not have the same spatial reference, the point will be transformed to the spatial reference of the track.
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_DurationAlong
- Parameters
track (pyspark.sql.Column) – Linestring column.
point (pyspark.sql.Column) – Point column.
max_deviation (float/int, optional) – Numeric value representing the maximum distance a point can be from the track while still being considered on the track.
output_units (str, optional) – The units of the result. Choose from Milliseconds, Seconds, Minutes, Hours, or Days.
- Returns
DoubleType column representing the duration of the track between the track start and where the point intersects the track.
- Return type
pyspark.sql.Column
end_timestamp
- geoanalytics.tracks.functions.end_timestamp(track)
Returns a timestamp column containing the last timestamp of each input track. Returns null for invalid tracks.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_EndTimestamp
- Parameters
track – Linestring column.
- Returns
Timestamp column with the last timestamp of each track.
- Return type
pyspark.sql.Column
is_valid
- geoanalytics.tracks.functions.is_valid(track)
Returns a boolean column where the result is True if the input linestring is a valid track; otherwise, it returns False. A linestring is a valid track if it is non-null, non-empty, and has M-values that are distinct and strictly increasing.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_IsValid
- Parameters
track – Linestring column.
- Returns
Boolean column indicating whether each input linestring is a valid track.
- Return type
pyspark.sql.Column
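The validity rule above (non-null, non-empty, strictly increasing M-values) can be expressed as a short plain-Python check. This is a sketch of the stated rule only: the helper name is hypothetical, a track is modeled as a list of (x, y, m) tuples, and any additional checks the library may perform (for example on vertex count) are not covered.

```python
def is_valid_track(vertices):
    """Return True if `vertices` is non-null, non-empty, and has strictly increasing M."""
    if not vertices:  # covers both None and an empty list
        return False
    m_values = [v[2] for v in vertices]
    # Strictly increasing also rules out repeated M-values.
    return all(a < b for a, b in zip(m_values, m_values[1:]))


valid = is_valid_track([(0, 0, 10), (1, 0, 20), (2, 0, 30)])
repeated_m = is_valid_track([(0, 0, 10), (1, 0, 10), (2, 0, 30)])
decreasing_m = is_valid_track([(0, 0, 30), (1, 0, 20)])
```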
lcss
- geoanalytics.tracks.functions.lcss(track1, track2, search_distance, search_duration=None)
Returns a double column representing the size of the longest common subsequence between the two input tracks.
Returns null if a track is invalid.
The longest common subsequence is a count of all pairs of observations, one from each track, that are within the search distance and search duration thresholds of each other.
The ST_CreateDistance and ST_CreateDuration functions can be used to define the search distance and search duration parameters. You can also define them with a tuple containing a number and a unit (e.g., (10, "kilometers") or (5, "minutes")).
TRK_LCSS uses planar distance calculations when the tracks are in a projected coordinate system and geodesic distance calculations when the tracks are in a geographic coordinate system. If one of the tracks has an unknown spatial reference, the function will use planar distance calculations.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_LCSS
- Parameters
track1 (pyspark.sql.Column) – Linestring column.
track2 (pyspark.sql.Column) – Linestring column.
search_distance (pyspark.sql.Column/struct/tuple) – Distance used to calculate the longest common subsequence. It can be set using ST_CreateDistance.
search_duration (pyspark.sql.Column/struct/tuple, optional) – Duration used to calculate the longest common subsequence. It can be set using ST_CreateDuration.
- Returns
DoubleType column representing the size of the longest common subsequence between the two tracks.
- Return type
pyspark.sql.Column
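For intuition, the textbook dynamic-programming formulation of the longest common subsequence with distance and time thresholds is sketched below in plain Python. It is not taken from the library: the exact matching rules of TRK_LCSS may differ, the helper name is hypothetical, tracks are modeled as lists of (x, y, t) tuples, and only planar distance is used.

```python
import math


def lcss(track1, track2, search_distance, search_duration=None):
    """Classic LCSS size for two tracks given as lists of (x, y, t) tuples."""
    n, m = len(track1), len(track2)
    # table[i][j] holds the LCSS size of the first i and first j observations.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        x1, y1, t1 = track1[i - 1]
        for j in range(1, m + 1):
            x2, y2, t2 = track2[j - 1]
            close = math.hypot(x1 - x2, y1 - y2) <= search_distance
            if search_duration is not None:
                close = close and abs(t1 - t2) <= search_duration
            if close:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


a = [(0, 0, 0), (1, 0, 10), (2, 0, 20), (3, 0, 30)]
b = [(0, 0.5, 0), (1, 0.5, 12), (5, 5, 20), (3, 0.5, 60)]
distance_only = lcss(a, b, search_distance=1.0)
with_duration = lcss(a, b, search_distance=1.0, search_duration=5)
```

Adding the duration threshold lowers the result in this example because the last pair of observations is close in space but 30 time units apart.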
length
- geoanalytics.tracks.functions.length(track, output_units=None)
Returns a double column representing the length of the input track. Returns null for invalid tracks.
The result is returned in the units specified by output_units. When output_units is None, the result is in the units of the input track’s spatial reference if it is projected; otherwise, the result is in meters.
Planar distance calculations are used if the input tracks have a projected spatial reference or no spatial reference. Chordal distance calculations are used if the input tracks have a geographic spatial reference. For more information, see Coordinate systems and transformations.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Length
- Parameters
track – Linestring column.
output_units (str, optional) – The units of the result. Choose from Meters, Kilometers, Feet, Yards, Miles, or NauticalMiles.
- Returns
DoubleType column representing the track length.
- Return type
pyspark.sql.Column
query
- geoanalytics.tracks.functions.query(track, offset)
Returns a point column representing the location that is the offset distance or offset duration along the input track, measured from the track start. An offset column can be created with ST_CreateDistance or ST_CreateDuration. You can also define an offset with a tuple containing a number and a unit (e.g., (10, "kilometers") or (5, "minutes")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Query
- Parameters
track – Linestring column.
offset (pyspark.sql.Column) – The offset distance or offset duration. The offset must be greater than zero.
- Returns
Point column representing the location that is the offset distance or offset duration along the input track.
- Return type
pyspark.sql.Column
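Locating a point at an offset distance along a track amounts to walking the segments and interpolating within the one that contains the offset. The plain-Python sketch below shows this for planar coordinates only. The helper name is hypothetical, and returning None when the offset exceeds the track length is an assumption, since the reference does not describe that case.

```python
import math


def point_at_distance(track, offset):
    """Return the (x, y) location `offset` planar units along a list of (x, y) vertices."""
    remaining = offset
    for (x1, y1), (x2, y2) in zip(track, track[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        if seg_len > 0 and remaining <= seg_len:
            # Interpolate linearly within the segment that contains the offset.
            f = remaining / seg_len
            return (x1 + f * (x2 - x1), y1 + f * (y2 - y1))
        remaining -= seg_len
    return None  # offset is past the end of the track (assumed behavior)


track = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
location = point_at_distance(track, 14.0)
past_end = point_at_distance(track, 25.0)
```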
speed
- geoanalytics.tracks.functions.speed(track, output_units='meterspersecond')
Returns a double column representing the speed of the input track. The speed is the length of the track (see TRK_Length) divided by the duration of the track (see TRK_Duration). The result is returned in the units specified by output_units. Returns null for invalid tracks.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_Speed
- Parameters
track – Linestring column.
output_units (str, optional) – The units of the result. Choose from MetersPerSecond, MilesPerHour, NauticalMilesPerHour, FeetPerSecond, or KilometersPerHour.
- Returns
DoubleType column representing the track speed.
- Return type
pyspark.sql.Column
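The relationship stated above, speed as track length divided by track duration, can be checked with a small plain-Python sketch. The helper name and the (x, y, t) tuple layout are assumptions, with x and y in meters, t in seconds, and planar length only.

```python
import math


def track_speed(track):
    """Length divided by duration for a list of (x, y, t) vertices, in meters per second."""
    length = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(track, track[1:])
    )
    # A track with strictly increasing timestamps and at least two
    # vertices always has a positive duration.
    duration = track[-1][2] - track[0][2]
    return length / duration


track = [(0.0, 0.0, 0.0), (30.0, 40.0, 10.0), (30.0, 100.0, 20.0)]
meters_per_second = track_speed(track)
kilometers_per_hour = meters_per_second * 3.6
```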
split_by_distance
- geoanalytics.tracks.functions.split_by_distance(track, distance)
Returns an array of tracks created by splitting the input track into segments with each segment no longer than the specified distance. The distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, "kilometers")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_SplitByDistance
- Parameters
track – Linestring column.
distance (pyspark.sql.Column) – The maximum length of result tracks. The distance must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
split_by_distance_gap
- geoanalytics.tracks.functions.split_by_distance_gap(track, gap_distance)
Returns an array of tracks created by splitting the input track wherever two vertices are farther apart than the specified gap distance. The track is split by removing the segment between the two vertices. The distance can be created with ST_CreateDistance or with a tuple containing a number and a unit (e.g., (10, "kilometers")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_SplitByDistanceGap
- Parameters
track – Linestring column.
gap_distance (pyspark.sql.Column) – The maximum distance allowed between two track vertices. The distance must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
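Splitting at a gap means dropping the segment between two vertices that are too far apart and starting a new track at the second vertex. The plain-Python sketch below illustrates this with planar distances. The helper name is hypothetical, and how the library treats pieces left with a single vertex is not stated in this reference, so the sketch simply keeps them.

```python
import math


def split_by_distance_gap(track, gap_distance):
    """Split a list of (x, y) vertices wherever consecutive vertices exceed the gap."""
    parts = [[track[0]]]
    for prev, curr in zip(track, track[1:]):
        if math.hypot(curr[0] - prev[0], curr[1] - prev[1]) > gap_distance:
            # Drop the long segment by starting a new part at the current vertex.
            parts.append([curr])
        else:
            parts[-1].append(curr)
    return parts


track = [(0, 0), (1, 0), (2, 0), (50, 0), (51, 0)]
parts = split_by_distance_gap(track, gap_distance=10)
unsplit = split_by_distance_gap(track, gap_distance=100)
```

The same loop applies to a time gap if the distance test is replaced by a comparison of consecutive timestamps.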
split_by_duration
- geoanalytics.tracks.functions.split_by_duration(track, duration)
Returns an array of tracks created by splitting the input track into segments with each segment no longer than the specified duration. The duration can be created with ST_CreateDuration or with a tuple containing a number and a unit (e.g., (5, "minutes")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_SplitByDuration
- Parameters
track – Linestring column.
duration (pyspark.sql.Column) – The maximum duration of result tracks. The duration must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
split_by_time_gap
- geoanalytics.tracks.functions.split_by_time_gap(track, gap_duration)
Returns an array of tracks created by splitting the input track wherever two vertices are farther apart in time than the specified gap duration. The track is split by removing the segment between the two vertices. The duration can be created with ST_CreateDuration or with a tuple containing a number and a unit (e.g., (5, "minutes")).
Returns null if a track is invalid.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_SplitByTimeGap
- Parameters
track – Linestring column.
gap_duration (pyspark.sql.Column) – The maximum duration allowed between two track vertices. The duration must be greater than zero.
- Returns
Array column representing the tracks created by splitting the input track.
- Return type
pyspark.sql.Column
start_timestamp
- geoanalytics.tracks.functions.start_timestamp(track)
Returns a timestamp column containing the first timestamp of each input track. Returns null for invalid tracks.
Refer to the GeoAnalytics Engine guide for examples and usage notes: TRK_StartTimestamp
- Parameters
track – Linestring column.
- Returns
Timestamp column with the first timestamp of each track.
- Return type
pyspark.sql.Column