Line similarity measures calculate the resemblance between two lines. There are multiple methods to quantify line similarity, each returning a numeric value.
Line similarity measures can answer questions like the following:
- Which of my lines are more similar?
- Is Line A more similar to Line B or Line C?
- What portion of my lines are most similar?
GeoAnalytics Engine uses point-based similarity measures. Point-based methods compute similarity by finding matching observations (vertices) between a pair of lines. These methods consider observation locations and, depending on the measure, may also consider the timestamp associated with each location.
A similarity measure can be applied to linestrings or tracks. The difference between the two is that line similarity ignores the timestamp value stored in each vertex, while track similarity takes it into account.
The table below summarizes the line and track similarity functions included in GeoAnalytics Engine.
Function | Type of similarity | Input type |
---|---|---|
ST_EuclideanDistance | Line | Linestring |
ST_FréchetDistance | Line | Linestring |
ST_HausdorffDistance | Geometry | Point, linestring, polygon |
TRK_LCSS | Track | Track |
Line similarity
ST_EuclideanDistance—Represents the Euclidean distance between two linestrings. A distance is calculated from each vertex in the first input linestring to the corresponding vertex in the second input linestring. The Euclidean distance is the average of these distances.
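As a rough illustration of the averaging described above, the following pure-Python sketch computes the mean distance between corresponding vertices. It is not the GeoAnalytics Engine implementation: the helper name `mean_vertex_distance` is hypothetical, lines are plain lists of (x, y) tuples in a planar coordinate system, and it assumes both lines have the same number of vertices.

```python
import math


def mean_vertex_distance(line_a, line_b):
    """Average distance between corresponding vertices of two lines.

    Hypothetical helper for illustration only. Assumes planar (x, y)
    coordinates and an equal number of vertices in both lines.
    """
    if not line_a or len(line_a) != len(line_b):
        raise ValueError("both lines need the same, non-zero number of vertices")
    # Pair the i-th vertex of line_a with the i-th vertex of line_b.
    total = sum(math.dist(p, q) for p, q in zip(line_a, line_b))
    return total / len(line_a)


line_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
line_b = [(0.0, 3.0), (1.0, 4.0), (2.0, 5.0)]
# Vertex distances are 3, 4 and 5, so the average is 4.0.
print(mean_vertex_distance(line_a, line_b))
```

A smaller value means the two lines stay closer together vertex by vertex, which is what makes the measure usable for ranking lines against a query line.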
The Euclidean distance is commonly used in vehicle routing problems and can also be applied to route planning for short distances like open-space hiking. It can also be used to rank lines based on similarity to a query line or find the most dissimilar lines in a road network.
For more information, see ST_EuclideanDistance.
ST_FréchetDistance—Represents the discrete Fréchet distance between two linestrings. The calculation spatially aligns vertices in the first input linestring to the closest vertices in the second input linestring. The discrete Fréchet distance is the greatest distance among all aligned pairs of vertices.
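The textbook dynamic-programming formulation of the discrete Fréchet distance can be sketched as follows. This is a generic illustration rather than the function's actual implementation: the name `discrete_frechet` is hypothetical, and lines are assumed to be lists of planar (x, y) tuples. The alignment must preserve vertex order in both lines, and the result is the smallest achievable value of the largest distance between aligned pairs.

```python
import math


def discrete_frechet(line_a, line_b):
    """Discrete Fréchet distance via the standard dynamic program.

    Hypothetical helper for illustration only. coupling[i][j] holds the
    distance for the prefixes line_a[:i + 1] and line_b[:j + 1].
    """
    n, m = len(line_a), len(line_b)
    if n == 0 or m == 0:
        raise ValueError("both lines need at least one vertex")
    coupling = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(line_a[i], line_b[j])
            if i == 0 and j == 0:
                coupling[i][j] = d
            elif i == 0:
                coupling[i][j] = max(coupling[0][j - 1], d)
            elif j == 0:
                coupling[i][j] = max(coupling[i - 1][0], d)
            else:
                # Advance along line_a, along line_b, or along both.
                best_previous = min(
                    coupling[i - 1][j],
                    coupling[i - 1][j - 1],
                    coupling[i][j - 1],
                )
                coupling[i][j] = max(best_previous, d)
    return coupling[n - 1][m - 1]


path_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
path_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
# Two parallel paths one unit apart.
print(discrete_frechet(path_a, path_b))
```

Unlike the vertex-by-vertex average, this formulation does not require the two lines to have the same number of vertices.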
The Fréchet distance is typically used to find the similarity between two pedestrian paths.
For more information, see ST_FréchetDistance.
ST_HausdorffDistance—Represents the Hausdorff distance between two geometries. For each vertex of a given geometry, the distance to the closest vertex in the reference geometry is found; the Hausdorff distance is the greatest of these distances. This function works with all types of geometries.
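A vertex-based sketch of this definition is shown below. The helper names are hypothetical and geometries are reduced to lists of planar (x, y) vertices. The first function follows the one-directional description above; the second takes the larger of the two directions, which is the usual symmetric Hausdorff distance. Which of the two ST_HausdorffDistance returns is not stated here, so treat the symmetric form as an assumption.

```python
import math


def directed_hausdorff(geometry, reference):
    """Greatest distance from any vertex of `geometry` to its closest
    vertex in `reference` (hypothetical helper, vertices only)."""
    return max(
        min(math.dist(p, q) for q in reference)
        for p in geometry
    )


def hausdorff(geom_a, geom_b):
    """Symmetric form: the larger of the two directed distances."""
    return max(
        directed_hausdorff(geom_a, geom_b),
        directed_hausdorff(geom_b, geom_a),
    )


street = [(0.0, 0.0), (1.0, 0.0)]
trail = [(0.0, 0.0), (1.0, 0.0), (1.0, 5.0)]
# Every street vertex coincides with a trail vertex, but the trail's
# last vertex is 5 units from the nearest street vertex.
print(directed_hausdorff(street, trail))
print(directed_hausdorff(trail, street))
print(hausdorff(street, trail))
```

The example shows why direction matters: the directed value depends on which geometry is treated as the reference.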
Use cases for Hausdorff distance include finding the nearest entry point to a nature walking trail from a street or finding the nearest shelter.
For more information, see ST_HausdorffDistance.
Track similarity
TRK_LCSS—Represents the Longest Common Subsequence similarity between two tracks. This function is considered a track similarity measure as it takes timestamps (stored as m-values) into consideration.
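One common formulation of Longest Common Subsequence similarity for trajectories can be sketched as follows. Everything specific in it is an assumption for illustration: the name `lcss_similarity`, the spatial tolerance `eps`, the temporal tolerance `delta`, the (x, y, t) tuple layout, and the normalization by the shorter track's length. It does not describe the parameters or output of TRK_LCSS itself.

```python
import math


def lcss_similarity(track_a, track_b, eps, delta):
    """LCSS similarity between two tracks of (x, y, t) observations.

    Hypothetical helper for illustration only. Two observations match
    when they are within `eps` in space and within `delta` in time.
    """
    n, m = len(track_a), len(track_b)
    if n == 0 or m == 0:
        return 0.0
    # table[i][j] = LCSS length of track_a[:i] and track_b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        xa, ya, ta = track_a[i - 1]
        for j in range(1, m + 1):
            xb, yb, tb = track_b[j - 1]
            close_in_space = math.dist((xa, ya), (xb, yb)) <= eps
            close_in_time = abs(ta - tb) <= delta
            if close_in_space and close_in_time:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Assumed normalization: share of the shorter track that is matched.
    return table[n][m] / min(n, m)


track_a = [(0.0, 0.0, 0), (1.0, 0.0, 10), (2.0, 0.0, 20), (3.0, 0.0, 30)]
track_b = [(0.0, 0.1, 1), (1.0, 0.1, 11), (2.0, 5.0, 21), (3.0, 0.1, 31)]
# Three of the four observations match; the third one in track_b is
# far away in space and is left unmatched.
print(lcss_similarity(track_a, track_b, eps=0.5, delta=5))
```

Under this formulation, the observations left out of the common subsequence are the ones that fall outside the tolerances, which is how the measure relates to counting outlier points.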
The Longest Common Subsequence measure can be used to identify the total number of outlier points when comparing two tracks.
For more information, see TRK_LCSS.