TRK_SplitByDuration takes a track column and a duration and returns an array of tracks. The result array contains the input track split into segments with each segment no longer than the specified duration.
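To make the splitting behavior concrete, here is a minimal pure-Python sketch. It is not the engine's implementation. It assumes that timestamps are in seconds, that cuts fall at fixed multiples of the duration measured from the first vertex, and that cut positions are linearly interpolated between vertices. The function name `split_by_duration_sketch` is hypothetical, and the engine may place cuts differently.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]  # (x, y, t), with t assumed to be in seconds


def split_by_duration_sketch(track: List[Vertex], max_seconds: float) -> List[List[Vertex]]:
    """Split a time-ordered track into pieces that each span at most max_seconds."""
    if max_seconds <= 0:
        raise ValueError("max_seconds must be positive")
    if len(track) < 2:
        return [list(track)]

    pieces = []
    current = [track[0]]
    cut = track[0][2] + max_seconds  # time of the next cut (assumption: measured from the first vertex)

    for (x0, y0, t0), (x1, y1, t1) in zip(track, track[1:]):
        # Insert interpolated vertices for every cut that falls strictly inside this leg.
        while cut < t1:
            f = (cut - t0) / (t1 - t0)
            p = (x0 + f * (x1 - x0), y0 + f * (y1 - y0), cut)
            current.append(p)
            pieces.append(current)
            current = [p]  # the next piece starts where the previous one ended
            cut += max_seconds
        current.append((x1, y1, t1))
        # A cut that lands exactly on a vertex closes the piece at that vertex.
        if cut == t1:
            pieces.append(current)
            current = [(x1, y1, t1)]
            cut += max_seconds

    if len(current) > 1:
        pieces.append(current)
    return pieces


# First track from the Examples section, split into pieces of at most 10 minutes (600 s).
track = [
    (-117.27, 34.05, 1633455010),
    (-117.22, 33.91, 1633456062),
    (-116.96, 33.64, 1633457132),
]
pieces = split_by_duration_sketch(track, 600)
print(len(pieces))  # 4
```

Under these assumptions, consecutive pieces share their boundary vertex, so the pieces together cover the original track without gaps.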
The duration can be defined using ST_CreateDuration or with a tuple containing a number and a unit string (e.g., (5, "minutes")).
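As a rough illustration of what such a tuple represents, the sketch below converts a (number, unit) pair into seconds using Python's standard library. The helper name `duration_to_seconds` is hypothetical and is not part of GeoAnalytics Engine. It only handles unit strings that `datetime.timedelta` accepts as keyword arguments (such as "seconds", "minutes", "hours", "days"), which may not match the full set of units the engine accepts.

```python
from datetime import timedelta


def duration_to_seconds(duration):
    """Convert a (number, unit) tuple such as (5, "minutes") into seconds."""
    value, unit = duration
    # Assumption: the unit string is a valid timedelta keyword, e.g. "minutes".
    return timedelta(**{unit: value}).total_seconds()


print(duration_to_seconds((5, "minutes")))  # 300.0
```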
Tracks are linestrings that represent the change in an entity's location over time. Each vertex in the linestring has a timestamp (stored as the M-value) and the vertices are ordered sequentially.
For more information on using tracks in GeoAnalytics Engine, see the core concept topic on tracks.
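The vertex structure described above can be inspected without Spark. The following sketch parses a LINESTRING M string with plain Python, checks that the M-values are in order, and converts the first one to a date. The helper name `parse_linestring_m` is hypothetical, and reading the M-values as Unix epoch seconds is an assumption based on the magnitude of the values used in the Examples section.

```python
from datetime import datetime, timezone


def parse_linestring_m(wkt):
    """Parse 'LINESTRING M (x y m, ...)' into a list of (x, y, m) tuples."""
    prefix = "LINESTRING M"
    text = wkt.strip()
    if not text.upper().startswith(prefix):
        raise ValueError("expected a LINESTRING M geometry")
    body = text[len(prefix):].strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("expected a parenthesized vertex list")
    vertices = []
    for part in body[1:-1].split(","):
        x, y, m = part.split()
        vertices.append((float(x), float(y), float(m)))
    return vertices


wkt = "LINESTRING M (-117.27 34.05 1633455010, -117.22 33.91 1633456062, -116.96 33.64 1633457132)"
vertices = parse_linestring_m(wkt)

# Vertices of a track are expected to be ordered by timestamp.
is_ordered = all(a[2] <= b[2] for a, b in zip(vertices, vertices[1:]))

# Assumption: M-values are seconds since the Unix epoch.
first_time = datetime.fromtimestamp(vertices[0][2], tz=timezone.utc)
print(is_ordered, first_time.isoformat())
```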
Function | Syntax |
---|---|
Python | split_by_duration(track, duration) |
SQL | TRK_SplitByDuration(track, duration) |
For more details, go to the GeoAnalytics Engine API reference for split_by_duration.
Examples
# Assumes an active SparkSession named `spark`
from geoanalytics.sql import functions as ST
from geoanalytics.tracks import functions as TRK
from pyspark.sql import functions as F

# Tracks as WKT linestrings; the M-value of each vertex is its timestamp
data = [
    ("LINESTRING M (-117.27 34.05 1633455010, -117.22 33.91 1633456062, -116.96 33.64 1633457132)",),
    ("LINESTRING M (-116.89 33.96 1633575895, -116.71 34.01 1633576982, -116.66 34.08 1633577061)",),
    ("LINESTRING M (-116.24 33.88 1633575234, -116.33 34.02 1633576336)",)
]

# Build a track (linestring) column from the WKT strings
df = spark.createDataFrame(data, ["wkt"]).withColumn("track", ST.line_from_text("wkt", srid=4326))

# Split each track into segments no longer than 10 minutes
result = df.withColumn("split_by_duration", TRK.split_by_duration("track", (10, "minutes")))

# Explode the array so each segment is its own row, then plot the segments colored by id
result.select(F.explode("split_by_duration"), F.monotonically_increasing_id().alias("id")) \
    .st.plot(is_categorical=True, cmap_values="id", cmap="prism", linewidths=10, figsize=(15, 8))
Version table
Release | Notes |
---|---|
1.4.0 | Function introduced |