geoanalytics.tools¶
Aggregate Points¶
- class geoanalytics.tools.AggregatePoints¶
Aggregates points into square or hexagon bins, or existing polygons.
The tool first determines which points fall within each specified area. After determining this point-in-area spatial relationship, statistics about all points in the area are calculated and assigned to the area.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Aggregate Points
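The point-in-area step described above can be pictured with a small pure-Python sketch. It does not call the GeoAnalytics Engine API: the function name `aggregate_to_square_bins`, the grid origin at (0, 0), and the `(x, y, value)` point layout are all assumptions made for illustration, and the real tool may place bins differently.

```python
from collections import defaultdict
from math import floor


def aggregate_to_square_bins(points, bin_size):
    """Group (x, y, value) points into square bins with side length `bin_size`.

    Returns {(column, row): {"count": n, "sum": total}}.
    Assumes planar coordinates and a grid anchored at (0, 0).
    """
    bins = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for x, y, value in points:
        # The bin index is the floor of each coordinate divided by the bin size.
        key = (floor(x / bin_size), floor(y / bin_size))
        bins[key]["count"] += 1
        bins[key]["sum"] += value
    return dict(bins)


points = [(10, 10, 2.0), (90, 40, 3.0), (150, 20, 5.0), (-5, 5, 1.0)]
result = aggregate_to_square_bins(points, bin_size=100)
```

Each key in `result` identifies one bin, and the count and sum play the role of the statistics that the tool assigns to each area.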
- addSummaryField(summary_field, statistic, alias=None)¶
Adds a summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from Count, Sum, Mean, Max, Min, Range, Stddev, Var, or Any.
alias (str) – The name of the result field containing the statistic. The default is the field name and statistic separated by an underscore.
- run(dataframe)¶
Runs the AggregatePoints tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column.
- Returns
A DataFrame containing a polygon column, count of points within the polygon, and any summary statistics for each polygon.
- Return type
DataFrame
- setBins(bin_size, bin_size_unit, bin_type='square')¶
Sets the size and shape of the bins into which the input points are aggregated.
Note
This method will override setPolygons.
- Parameters
bin_size (int/float) – Distance between parallel sides of a bin or H3 resolution.
bin_size_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards, or use H3Res for H3 bins.
bin_type (str) – Choose from Square, Hexagon or H3.
- setPolygons(polygons)¶
Sets the DataFrame containing a column of polygons into which the input points will be aggregated.
Note
This method will override setBins.
- Parameters
polygons (pyspark.sql.DataFrame) – A DataFrame containing a column of polygons.
- setTimeStep(interval_duration, interval_unit, repeat_duration=None, repeat_unit=None, reference_time=None)¶
Sets the time step interval, time step repeat, and reference time. If set, points will be aggregated into each bin for each time step. The input DataFrame must have a datetime column to use this setter.
- Parameters
interval_duration (int) – Duration of each time step.
interval_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
repeat_duration (int) – Time between one time step and the next time step.
repeat_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
reference_time (int/long/datetime.datetime) – A reference datetime to align the time steps to. The default is epoch time 0.
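As a rough mental model of how the interval, repeat, and reference time interact, the sketch below finds the time steps that contain a given timestamp. It is not the tool's implementation. It assumes that all values are in seconds, that a step starts at `reference + k * repeat`, that each step covers the half-open range from its start to start plus interval, and that the repeat equals the interval when it is not set. The function name is hypothetical.

```python
def time_steps_containing(t, interval, repeat=None, reference=0):
    """Return the start times of every time step that contains `t`.

    All values are in seconds. A step starts at reference + k * repeat and
    covers the half-open range [start, start + interval).
    """
    if repeat is None:
        repeat = interval  # assumption: steps are back to back by default
    # Largest k whose step starts at or before t.
    k = (t - reference) // repeat
    starts = []
    # Walk backward while the step starting at k still covers t.
    while reference + k * repeat + interval > t:
        starts.append(reference + k * repeat)
        k -= 1
    return sorted(starts)


hourly = time_steps_containing(5400, interval=3600)
overlapping = time_steps_containing(5400, interval=7200, repeat=3600)
in_gap = time_steps_containing(5900, interval=1800, repeat=3600)
```

Under these assumptions, an interval longer than the repeat produces overlapping steps, so one point can contribute to several steps, while an interval shorter than the repeat leaves gaps in which points belong to no step.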
Calculate Density¶
- class geoanalytics.tools.CalculateDensity¶
Calculates the density of points and their attributes.
Each point represents the location of some event or incident, and the result calculation represents a count of incidents per unit area. A higher density value in a new location means that there are more points near that location.
In many cases, the result layer can be interpreted as a risk surface for future events. For example, if the input points represent locations of lightning strikes, the result layer can be interpreted as a risk surface for future lightning strikes.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Calculate Density
- run(dataframe)¶
Runs the CalculateDensity tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column with a spatial reference.
- Returns
A DataFrame of square or hexagon bins with a column of calculated density values.
- Return type
DataFrame
- setAreaUnit(area_unit)¶
Sets the desired output units of the density values. The default is SquareKilometers. If density values are very small, you can increase the scale of the area units to return larger values.
- Parameters
area_unit (str) – Choose from SquareMeters, SquareKilometers, Hectares, SquareFeet, SquareYards, SquareMiles or Acres.
- setBins(bin_size, bin_size_unit, bin_type='square')¶
Sets the size and shape of bins used to calculate density.
- Parameters
bin_size (float) – Distance between parallel sides of a bin.
bin_size_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
bin_type (str) – Choose from Square or Hexagon.
- setFields(*fields)¶
Sets one or more fields specifying the number of incidents at each location. You can calculate the density on multiple fields. The density of the count of points will always be calculated.
- Parameters
fields (*str) – The names of one or more fields from the input DataFrame.
- setNeighborhood(distance, distance_unit)¶
Sets the size of the neighborhood within which to calculate density. The distance must be larger than the bin size.
- Parameters
distance (float) – Radius of the neighborhood, measured from each bin center.
distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setTimeStep(interval_duration, interval_unit, repeat_duration=None, repeat_unit=None, reference_time=None)¶
Sets the time step interval, time step repeat, and reference time. If set, density will be calculated for each time step at each bin location. The input DataFrame must have a datetime column to use this setter.
- Parameters
interval_duration (int) – Duration of each time step.
interval_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
repeat_duration (int) – Time between one time step and the next time step.
repeat_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
reference_time (int/long/datetime.datetime) – A reference datetime to align the time steps to. The default is epoch time 0.
- setWeightType(weight_type)¶
Sets the type of weighting applied to density calculations. This parameter supports two options:
Uniform: calculates density as magnitude-per-area. This is the default.
Kernel: calculates density by applying a kernel function to fit a smooth tapered surface to each point.
- Parameters
weight_type (str) – Choose from Uniform or Kernel.
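The Uniform option can be read as "magnitude divided by neighborhood area". The sketch below works through that arithmetic for a single bin center. It is an illustration only: it assumes planar coordinates in meters and a circular neighborhood of area πr², and the function name `uniform_density` is hypothetical.

```python
from math import hypot, pi


def uniform_density(bin_center, points, radius_m):
    """Count of points within `radius_m` of `bin_center`, per square kilometer."""
    cx, cy = bin_center
    count = sum(1 for x, y in points if hypot(x - cx, y - cy) <= radius_m)
    # Assumption: the neighborhood is a circle around the bin center.
    area_km2 = pi * (radius_m / 1000.0) ** 2
    return count / area_km2


points = [(0, 0), (500, 0), (0, 900), (2000, 0)]
density = uniform_density((0, 0), points, radius_m=1000)
```

Three of the four points fall inside the 1 km neighborhood, so the density is 3 divided by π square kilometers. Choosing a larger area unit, such as square miles instead of square kilometers, scales the reported value up accordingly.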
Calculate Field¶
- class geoanalytics.tools.CalculateField¶
Creates and populates a new field or edits an existing field using ArcGIS Arcade.
Your calculation can optionally be track aware. Track-aware equations use Arcade expressions that include track functions. To include a track-aware calculation, setTrackFields must be called and the input DataFrame must have datetime and track ID columns.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Calculate Field
- run(dataframe)¶
Runs the CalculateField tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame.
- Returns
A copy of the input DataFrame with the calculated field appended or overwritten.
- Return type
DataFrame
- setExpression(expression)¶
Sets an Arcade expression used to calculate the new field values. You can use any of the Date, Logical, Mathematical, or Text functions available with Arcade expressions.
- Parameters
expression (str) – An Arcade expression.
- setField(field_name, field_type)¶
Sets the name and type of the new field. If the name already exists in the dataset, the field will be overwritten.
- Parameters
field_name (str) – The name of the column that will be appended to the input DataFrame.
field_type (str) – Choose from Date, Double, Integer, or String.
- setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None)¶
Sets boundaries to limit calculations to defined spans of time. For example, if you use a time boundary of 1 day starting on January 1, 1980, tracks will be analyzed one day at a time.
- Parameters
time_boundary_split (int) – The scale of the time boundary.
time_boundary_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
time_boundary_reference (int/long/datetime.datetime) – A reference datetime to align the time boundaries to. The default is epoch time 0.
- setTrackFields(*track_fields)¶
Sets one or more fields used to identify distinct tracks.
- Parameters
track_fields (*str) – The names of one or more fields from the input DataFrame.
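The setters above might be combined as in the following sketch. It relies only on the method names and parameter names listed in this section. The helper name `configure_speed_conversion`, the field names `speed_mph` and `speed_kph`, and the Arcade expression are hypothetical, and the helper takes the tool as an argument so that it does not assume anything about how the tool is constructed.

```python
def configure_speed_conversion(tool):
    """Apply CalculateField setters to `tool` and return it.

    `tool` is expected to be a geoanalytics.tools.CalculateField instance.
    The field names `speed_mph` and `speed_kph` are hypothetical.
    """
    # Name and type of the field that will hold the result.
    tool.setField(field_name="speed_kph", field_type="Double")
    # Arcade expression that reads an existing field and converts its units.
    tool.setExpression(expression="$feature.speed_mph * 1.609344")
    return tool
```

Assuming the class can be created without arguments, the result would then come from something like `configure_speed_conversion(CalculateField()).run(df)`, where `df` is an input DataFrame that has a `speed_mph` column.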
Calculate Motion Statistics¶
- class geoanalytics.tools.CalculateMotionStatistics¶
Calculates motion statistics and descriptors for time-enabled points that represent one or more moving entities.
Points are grouped together into tracks representing each entity using a unique identifier. Motion statistics are calculated at each point using one or more points in the track history. Calculations include summaries of distance traveled, duration, elevation, speed, acceleration, bearing, and idle status.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Calculate Motion Statistics
- run(dataframe)¶
Runs the CalculateMotionStatistics tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a track ID column and a datetime column.
- Returns
A copy of the input DataFrame with motion statistics appended to each row.
- Return type
DataFrame
- setDistanceMethod(distance_method)¶
Sets the method used to calculate distances between track observations. There are two methods to choose from:
Planar: measures distances using a Euclidean plane and will not calculate statistics across the date line.
Geodesic: calculations will cross the date line when appropriate. This is the default. If the spatial reference cannot be panned, calculations will be limited to the coordinate system extent and may not wrap.
- Parameters
distance_method (str) – Choose from Planar or Geodesic.
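The practical difference between the two methods shows up near the date line. The sketch below compares a naive planar distance in degrees with a great-circle distance for two points on either side of the 180° meridian. The haversine formula on a sphere stands in for a true geodesic calculation here, so it is an approximation and not the tool's implementation. The function names and the mean Earth radius constant are assumptions for the example.

```python
from math import asin, cos, hypot, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def planar_degrees(lon1, lat1, lon2, lat2):
    """Euclidean distance in degrees; unaware of the date line."""
    return hypot(lon2 - lon1, lat2 - lat1)


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine distance in kilometers on a sphere."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Two observations on the equator, one on each side of the date line.
planar = planar_degrees(179.0, 0.0, -179.0, 0.0)
spherical = great_circle_km(179.0, 0.0, -179.0, 0.0)
```

The planar measure treats the points as 358 degrees apart, while the spherical measure sees them as 2 degrees of longitude apart, which is roughly 222 km at the equator.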
- setIdleTolerance(distance_tolerance, distance_tolerance_unit, time_tolerance, time_tolerance_unit)¶
Sets the tolerances to use to decide if an entity is idling. An entity is idling when it hasn’t moved more than the distance tolerance in at least the time tolerance.
- Parameters
distance_tolerance (float) – Spatial idling tolerance.
distance_tolerance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
time_tolerance (int) – Temporal idling tolerance.
time_tolerance_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
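One way to read the idle rule is sketched below: an observation counts as idle if, looking back through the track, every earlier observation stays within the distance tolerance until at least the time tolerance has elapsed. This is a simplified interpretation for illustration, not the tool's algorithm. It assumes planar coordinates, timestamps in seconds, and a track already sorted by time; the function name and tuple layout are hypothetical.

```python
from math import hypot


def is_idle(track, i, distance_tolerance, time_tolerance):
    """Return True if observation `i` of `track` is idling.

    `track` is a time-ordered list of (x, y, t) tuples.
    """
    xi, yi, ti = track[i]
    # Walk backward through earlier observations of the same track.
    for x, y, t in reversed(track[:i]):
        if hypot(x - xi, y - yi) > distance_tolerance:
            return False  # the entity moved too far before enough time passed
        if ti - t >= time_tolerance:
            return True  # stayed close for at least the time tolerance
    return False  # not enough history to cover the time tolerance


track = [(0, 0, 0), (1, 0, 60), (0, 1, 120), (50, 0, 180)]
flags = [is_idle(track, i, distance_tolerance=5, time_tolerance=100)
         for i in range(len(track))]
```

Only the third observation is flagged: it is the first one with 100 seconds of history in which the entity stayed within 5 units.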
- setMotionStatistics(*motion_statistics)¶
Sets the statistic groups that will be calculated.
- Parameters
motion_statistics (*str) – Choose from Distance, Speed, Acceleration, Duration, Elevation, Slope, Idle, and Bearing.
- setStatisticUnits(distance_unit='Meters', duration_unit='Seconds', speed_unit='MetersPerSecond', acceleration_unit='MetersPerSecondSquared', elevation_unit='Meters')¶
Sets the output units for each statistic group.
- Parameters
distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
duration_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
speed_unit (str) – Choose from MetersPerSecond, KilometersPerHour, FeetPerSecond, MilesPerHour, or NauticalMilesPerHour.
acceleration_unit (str) – Choose from MetersPerSecondSquared or FeetPerSecondSquared.
elevation_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None)¶
Sets boundaries to limit calculations to defined spans of time. For example, if you use a time boundary of 1 day starting on January 1, 1980, tracks will be analyzed one day at a time.
- Parameters
time_boundary_split (int) – The scale of the time boundary.
time_boundary_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
time_boundary_reference (int/long/datetime.datetime) – A reference datetime to align the time boundaries to. The default is epoch time 0.
- setTrackFields(*track_fields)¶
Sets one or more fields used to identify distinct tracks.
- Parameters
track_fields (*str) – The names of one or more fields from the input DataFrame.
- setTrackHistoryWindow(track_history_window)¶
Sets the number of observations (including the current observation) that will be used when calculating summary statistics that are not instantaneous. This includes minimum, maximum, average, and total statistics.
The default track history window is 3, which means that at each point in a track, summary statistics will be calculated using the current observation and the previous two observations.
Note
This setter does not affect instantaneous statistics or idle classification.
- Parameters
track_history_window (int) – Number of observations.
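The effect of the window on a summary statistic can be illustrated with a rolling average. The sketch below is not the tool's code; it assumes that observations near the start of a track simply use however many earlier observations exist, and the function name `windowed_mean` is hypothetical.

```python
def windowed_mean(values, window=3):
    """Mean of each value and up to `window - 1` values before it."""
    means = []
    for i in range(len(values)):
        # The window includes the current observation.
        start = max(0, i - window + 1)
        history = values[start:i + 1]
        means.append(sum(history) / len(history))
    return means


speeds = [10, 20, 30, 40]
average_speeds = windowed_mean(speeds, window=3)
```

With a window of 3, the last value is the mean of 20, 30, and 40. A larger window smooths the summary further but makes it respond more slowly to changes.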
Clip¶
- class geoanalytics.tools.Clip¶
Extracts geometries that overlay clip geometries.
Note
This tool operates on the entire input DataFrame and thus can be more performant than equivalent row-wise operations using SQL functions.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Clip
- run(input_dataframe, clip_dataframe)¶
Runs the Clip tool using the provided DataFrames.
- Parameters
input_dataframe (DataFrame) – A DataFrame containing a geometry column.
clip_dataframe (DataFrame) – A DataFrame containing a polygon column to clip with.
- Returns
A DataFrame containing the result of the clip.
- Return type
DataFrame
Create Routes¶
- class geoanalytics.tools.CreateRoutes¶
Uses a network dataset to understand the connectivity of a transportation network in order to find the best route between a series of input points. The resulting DataFrame contains a linestring column with the routes that visit the input points.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Create Routes
- run(dataframe)¶
Runs the CreateRoutes tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a column with an array of points representing the stops for which a route will be created.
- Returns
A copy of the input DataFrame that will also contain the result route if specified, the total travel time for the route in minutes and the total distance in meters.
- Return type
DataFrame
- setNetwork(path)¶
Sets the network data source from a mobile map package or a mobile geodatabase.
- Parameters
path (str) – The path to the network data source.
- setRouteGeometry(route_geometry)¶
Sets the shape of the route between stops. The following options are supported:
AlongNetwork: returns a route that has the exact shape of the underlying network dataset.
StraightLines: returns a route that will be a straight line between the stops.
NoLines: doesn’t return any route geometry.
- Parameters
route_geometry (str) – Choose from AlongNetwork, StraightLines or NoLines.
- setSequence(find_best, preserve_first, preserve_last)¶
Determines the order in which the input points will be used to create the route.
- Parameters
find_best (bool) – True to find the best sequence or False to use the current sequence of the provided points in the array.
preserve_first (bool) – True to preserve the first point in the array, False to honor the best sequence and not preserve the first point.
preserve_last (bool) – True to preserve the last point in the array, False to honor the best sequence and not preserve the last point.
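The interplay of the three flags can be sketched with a brute-force search over stop orderings. This is an illustration under strong assumptions: it scores routes by straight-line distance instead of network travel cost, it tries every permutation (practical only for a handful of stops), and the function name `order_stops` is hypothetical.

```python
from itertools import permutations
from math import dist


def order_stops(stops, find_best=True, preserve_first=True, preserve_last=True):
    """Return the stops in visiting order.

    `stops` is a list of (x, y) tuples in their original sequence.
    """
    if not find_best or len(stops) < 3:
        return list(stops)
    # Stops pinned to the start or end are excluded from reordering.
    head = [stops[0]] if preserve_first else []
    tail = [stops[-1]] if preserve_last else []
    middle = stops[len(head):len(stops) - len(tail)]

    def length(route):
        return sum(dist(a, b) for a, b in zip(route, route[1:]))

    candidates = (head + list(p) + tail for p in permutations(middle))
    return min(candidates, key=length)


stops = [(0, 0), (10, 0), (1, 0), (11, 0)]
best = order_stops(stops, find_best=True, preserve_first=True, preserve_last=True)
as_given = order_stops(stops, find_best=False)
```

With both ends preserved, only the two middle stops are reordered, which shortens the route from 29 units to 11.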
- setStops(*stops)¶
Sets the stops, which are locations that the returned route will visit.
- Parameters
stops (*pyspark.sql.Column) – An array of points that will be used to create the route.
- setTime(day_of_week, time, time_zone='UTC')¶
Sets the start time for the route.
- Parameters
day_of_week (str) – Choose from Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, or Saturday.
time (str/datetime.time) – The time of the day as a string in HH:mm:ss format, or as a datetime.time object.
time_zone (str) – The time zone, represented as a UTC offset or time zone identifier. Refer to this list of tz database time zones for allowed identifiers. The default is “UTC”.
- setTravelMode(travel_mode)¶
Sets the travel mode. A travel mode refers to the mode of transportation, such as driving or walking. By default, the tool uses the default travel mode in the network dataset.
- Parameters
travel_mode (str) – The mode of transportation. The parameter accepts any travel mode that is defined on the network data source or a JSON format for a custom travel mode.
Create Service Areas¶
- class geoanalytics.tools.CreateServiceAreas¶
Generates service areas around facilities. Each service area contains all streets that can be reached within a specified travel distance or travel time.
For example, the 10-minute walk-time service area around a subway station indicates a region where residents can walk to the station within ten minutes.
This tool requires that the input DataFrame contains a point column representing the facilities around which the service areas will be created.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Create Service Areas
- accumulateAttributes(*attributes)¶
Accumulates cost attributes along the path from the facility to the reachable location. By default, the cost attribute of the travel mode will be accumulated and returned in the output DataFrame.
- Parameters
attributes (*str) – The cost attributes to accumulate.
- run(dataframe)¶
Runs the CreateServiceAreas tool using the provided DataFrame representing the facilities.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column representing the facilities around which the service areas will be created.
- Returns
A copy of the input DataFrame with service area polygons or linestrings representing the reachable service areas around the facilities.
- Return type
DataFrame
- setCutoffs(cutoffs, unit=None)¶
Sets impedance cutoffs to determine the extent of the service areas.
There are two types of cutoffs, distance and time cutoffs.
Distance cutoffs specify the maximum distance that can be traveled from or to the facilities.
Time cutoffs specify the maximum time allowed to travel from or to the facilities.
- Parameters
cutoffs (*int/*float/str) – The impedance cutoffs used to calculate the extent of the service areas. It accepts a single cutoff value, or one or multiple values in an array format. For analysis on a per-facility basis, it accepts a string representing the name of the cutoff field in the input DataFrame.
unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards for distance cutoffs. Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years for time cutoffs. By default, the cutoff values are in the units of the impedance attribute used by the selected travel mode.
- setGeometryAtCutoff(geometry_at_cutoff)¶
Specifies whether concentric service area polygons will be created as rings or disks.
Rings: the polygons representing larger breaks will exclude the polygons of smaller breaks. This creates polygons between consecutive breaks. Use this option to find the area from one break to another. For instance, if you create 5- and 10-minute service areas, the 10-minute service area polygon will exclude the area under the 5-minute service area polygon. This is the default.
Disks: the polygons will be created from the facility to the break. For instance, if you create 5- and 10-minute service areas, the 10-minute service area polygon will include the area under the 5-minute service area polygon.
- Parameters
geometry_at_cutoff (str) – Choose from Rings (default) or Disks.
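The difference between rings and disks can be expressed as a question: for a location reached after a given travel time, which service area polygons contain it? The sketch below answers that for a list of cutoffs. It is a conceptual model only, with a hypothetical function name, and it ignores the actual polygon geometry.

```python
def areas_containing(travel_time, cutoffs, geometry_at_cutoff="Rings"):
    """Return the cutoffs whose service area contains the location."""
    # Every cutoff at or above the travel time reaches the location.
    reachable = [c for c in sorted(cutoffs) if travel_time <= c]
    if geometry_at_cutoff == "Disks":
        return reachable  # larger disks include the smaller ones
    if geometry_at_cutoff == "Rings":
        return reachable[:1]  # only the innermost ring that reaches it
    raise ValueError("geometry_at_cutoff must be 'Rings' or 'Disks'")


rings = areas_containing(3, [5, 10], "Rings")
disks = areas_containing(3, [5, 10], "Disks")
```

A location 3 minutes away lies only in the 5-minute ring, but it lies in both the 5-minute and the 10-minute disk.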
- setNetwork(path)¶
Sets the network data source from a mobile map package or a mobile geodatabase.
- Parameters
path (str) – The path to the network data source.
- setOutputType(output_type: str = 'polygons')¶
Specifies the type of output to be generated. Service area output can be linestrings representing the roads reachable before the cutoffs are exceeded or polygons representing the reachable area that encompasses these linestrings.
Polygons: polygons that cover the areas of the network that can be reached within the given cutoffs. This is the default.
Polylines: linestrings that cover the streets or network edges that can be reached within the given cutoffs. Lines are a truer representation of a service area than polygons since service area analyses are based on measurements along the network lines.
- Parameters
output_type (str) – Choose from Polygons (default) or Polylines.
- setPolygonDetail(polygon_detail)¶
Sets the level of detail for the output polygons representing the reachable areas within the specified impedance cutoffs. Supported options are “Standard” and “High”.
Standard: polygons will be created with a standard level of detail. Standard polygons are generated quickly and are fairly accurate, but quality deteriorates as you move closer to the borders of the service area polygons. This is the default.
High: polygons will be created with the highest level of detail. Holes in the polygon may exist; they represent islands of network elements, such as streets, that couldn’t be reached without exceeding the cutoff impedance or due to travel restrictions. Use this option for applications in which precise results are important.
- Parameters
polygon_detail (str) – Choose from Standard (default) or High.
- setTime(day_of_week, time, time_zone='UTC')¶
Sets the time to depart from or arrive at the facilities of the service area analysis. The time represents the departure time if travel direction is set to from facilities, and it represents the arrival time if travel direction is set to toward facilities.
- Parameters
day_of_week (str) – Choose from Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, or Saturday.
time (str/datetime.time) – The time of the day as a string in HH:mm:ss format, or as a datetime.time object.
time_zone (str) – The time zone, represented as a UTC offset or time zone identifier. Refer to this list of tz database time zones for allowed identifiers. The default is “UTC”.
- setTravelDirection(travel_direction)¶
Sets the direction of travel to or from the facilities. This parameter supports two options:
FromFacilities: the service area is calculated starting from the facilities and extending outward to the periphery. It means that the tool calculates how far you can travel from the facilities within the specified impedance cutoffs.
ToFacilities: the service area is calculated in the opposite direction, from the periphery toward the facilities, within the specified impedance cutoffs.
- Parameters
travel_direction (str) – The direction of travel to or from the facilities. Choose from ‘FromFacilities’ (default) or ‘ToFacilities’.
- setTravelMode(travel_mode)¶
Sets the travel mode. A travel mode refers to the mode of transportation, such as driving or walking. By default, the tool uses the default travel mode in the network dataset.
- Parameters
travel_mode (str) – The mode of transportation. The parameter accepts any travel mode that is defined on the network data source or a JSON format for a custom travel mode.
Detect Incidents¶
- class geoanalytics.tools.DetectIncidents¶
Determines which observations are incidents of interest using a specified condition.
Rows in the input DataFrame are grouped using a track ID and ordered sequentially before an incident condition is applied. Rows that meet the starting incident condition are marked as an incident. An ending incident condition can be applied; when the end condition is true, the track is no longer in an incident. You can return all input rows or only rows that are incidents.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Detect Incidents
- run(dataframe)¶
Runs the DetectIncidents tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a track ID column and a datetime column.
- Returns
A copy of the input DataFrame with incident status appended to each row.
- Return type
DataFrame
- setEndConditionExpression(end_condition_expression)¶
Sets the condition used to end incidents. If there is an end condition, any feature that meets the start condition expression and does not meet the end condition expression is an incident.
- Parameters
end_condition_expression (str) – Arcade expression used to end incidents.
- setOutputMode(output_mode)¶
Sets which observations are returned. There are two options:
All: all of the input observations are returned. This is the default.
Incidents: only observations that were found to be incidents are returned.
- Parameters
output_mode (str) – Choose from All or Incidents.
- setStartConditionExpression(start_condition_expression)¶
Sets the condition used to start incidents. If there is no end condition expression specified, any feature that meets this condition is an incident. If there is an end condition, any feature that meets the start condition expression and does not meet the end condition expression is an incident.
- Parameters
start_condition_expression (str) – Arcade expression used to identify incidents.
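The start and end conditions behave like a small state machine applied to each track in time order. The sketch below models that reading with plain Python predicates in place of Arcade expressions. It is an interpretation for illustration, not the tool's implementation, and the function name `flag_incidents` is hypothetical.

```python
def flag_incidents(rows, start, end=None):
    """Return one boolean per row: True while the track is in an incident.

    `rows` is a time-ordered list for a single track; `start` and `end`
    are predicates standing in for the Arcade condition expressions.
    """
    in_incident = False
    flags = []
    for row in rows:
        if end is None:
            # Without an end condition, each row is judged on its own.
            in_incident = start(row)
        elif in_incident:
            # An open incident continues until the end condition is met.
            if end(row):
                in_incident = False
        elif start(row) and not end(row):
            in_incident = True
        flags.append(in_incident)
    return flags


readings = [1, 5, 3, 3, 0, 6]
start_only = flag_incidents(readings, start=lambda v: v > 4)
with_end = flag_incidents(readings, start=lambda v: v > 4, end=lambda v: v < 2)
```

With only a start condition, the two readings above 4 are incidents. With an end condition, the incident that begins at the reading of 5 stays open through the two readings of 3 and closes at 0.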
- setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None)¶
Sets boundaries to limit calculations to defined spans of time. For example, if you set a time boundary of 1 day starting on January 1, 1980, tracks will be analyzed one day at a time.
- Parameters
time_boundary_split (int) – The scale of the time boundary.
time_boundary_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
time_boundary_reference (int/long/datetime.datetime) – A reference datetime to align the time boundaries to. The default is epoch time 0.
- setTrackFields(*track_fields)¶
Sets one or more fields used to identify distinct tracks.
- Parameters
track_fields (*str) – The names of one or more fields from the input DataFrame.
Find Closest Facilities¶
- class geoanalytics.tools.FindClosestFacilities¶
Finds the specified number of closest facilities for each incident within the specified travel time or travel distance, and returns the best routes between the incidents and the chosen facilities. When finding closest facilities, you can specify whether the direction of travel is to or away from the facilities. Examples of using this tool include finding the closest fire stations to fire incidents, the closest healthcare providers to residents’ addresses, or the closest hospitals for emergency responses.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Find Closest Facilities
- accumulateAttributes(*attributes)¶
Accumulates cost attributes along the network between the incident and the identified facility. No accumulated cost is returned by default.
- Parameters
attributes (*str) – The cost attributes to accumulate.
- run(incidents_df, facilities_df)¶
Runs the FindClosestFacilities tool using the provided DataFrames.
- Parameters
incidents_df (DataFrame) – A DataFrame containing points that represent the incidents.
facilities_df (DataFrame) – A DataFrame containing points that represent the facilities.
- Returns
A copy of the inputs combined into one DataFrame that will also contain the rank of the closest facilities, the travel time in minutes, the travel distance in meters between the incident and the identified facility, and if specified, the resulting route and accumulated cost attributes.
- Return type
DataFrame
- setCutoff(cutoff, unit=None)¶
Sets the maximum travel distance or travel time when searching for facilities for each incident. Its unit should match the travel mode. For example, if the travel mode is set in the units of distance, the impedance cutoff must be set in distance.
There are two types of cutoffs, distance and time cutoffs.
Distance cutoffs specify the maximum travel distance between incidents and facilities. For example, when analyzing walking distance from schools (incident DataFrame) to subway stations (facility DataFrame), a cutoff value of 1 mile (e.g., setCutoff(1, “mile”)) means that the tool will search for the closest subway stations within 1 mile walking from each school.
Time cutoffs specify the maximum travel time between incidents and facilities. For example, when analyzing driving time from fire stations (facility DataFrame) to fire incidents (incident DataFrame), a cutoff value of 15 minutes (e.g., setCutoff(15, “minutes”)) means that the tool will search for the closest fire stations within a 15-minute drive of each fire incident.
- Parameters
cutoff (int/float) – The impedance cutoff used to calculate the maximum travel distance or travel time when searching for facilities for each incident. It accepts a positive value.
unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards for distance cutoffs. Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years for time cutoffs. If the unit is missing, the tool will use the distance or time unit defined in the travel mode.
- setNetwork(path)¶
Sets the network data source from a mobile map package or a mobile geodatabase.
- Parameters
path (str) – The path to the network data source.
- setNumFacilities(count)¶
Specifies the maximum number of closest facilities to find for each incident. If there are multiple facilities with an equal travel cost to an incident, the tool will break ties by randomly selecting one or more records from the equidistant facilities to ensure the specified number of closest facilities.
- Parameters
count (int) – The number of facilities to find. The default is 1.
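How the facility count and the cutoff work together can be sketched as a ranking followed by two filters. The example below uses straight-line distance in place of network travel cost and breaks ties alphabetically by facility name, whereas the tool is described as breaking ties randomly. The function name and the facility dictionary are hypothetical.

```python
from math import dist


def closest_facilities(incident, facilities, count=1, cutoff=None):
    """Return up to `count` (rank, name, distance) tuples for one incident.

    `facilities` maps a facility name to its (x, y) location.
    """
    # Rank all facilities by distance from the incident.
    ranked = sorted(
        (dist(incident, location), name) for name, location in facilities.items()
    )
    # Drop facilities beyond the cutoff, if one is set.
    if cutoff is not None:
        ranked = [(d, name) for d, name in ranked if d <= cutoff]
    return [(rank, name, d) for rank, (d, name) in enumerate(ranked[:count], start=1)]


facilities = {"A": (0, 3), "B": (4, 0), "C": (0, 10)}
two_closest = closest_facilities((0, 0), facilities, count=2)
within_cutoff = closest_facilities((0, 0), facilities, count=3, cutoff=5)
```

The count is an upper bound: when the cutoff excludes facilities, fewer results than requested are returned.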
- setRouteGeometry(route_geometry)¶
Sets the shape of the route between the incident and the identified facility. You can also choose not to return the line geometry for better performance. The following options are supported:
AlongNetwork: returns the true shape of the result route that is based on the streets along the network.
StraightLines: returns a straight line between the incident and the identified facility.
NoLines: doesn’t return any route geometry.
- Parameters
route_geometry (str) – Choose from AlongNetwork, StraightLines or NoLines.
- setTime(day_of_week, time, time_zone='UTC', usage='departure')¶
Sets the time at which the routes will begin or end. When usage is set to Departure, the time is interpreted as the departure time from the facility or incident. When usage is set to Arrival, the time is interpreted as the arrival time at the facility or incident.
- Parameters
day_of_week (str) – Choose from Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, or Saturday.
time (str/datetime.time) – The time of the day as a string in HH:mm:ss format, or as a datetime.time object.
time_zone (str) – The time zone, represented as a UTC offset or time zone identifier. Refer to this list of tz database time zones for allowed identifiers. The default is “UTC”.
usage (str) – Choose from Departure or Arrival.
- setTravelDirection(travel_direction)¶
Sets the direction of travel to or from the facilities. This parameter supports two options:
FromFacilities: the closest facilities are searched along the network from the facilities to the incidents within the specified impedance cutoff. This is the default.
ToFacilities: the closest facilities are searched along the network from the incidents to the facilities within the specified impedance cutoff.
- Parameters
travel_direction (str) – The direction of travel to or from the facilities. Choose from ‘FromFacilities’ (default) or ‘ToFacilities’.
- setTravelMode(travel_mode)¶
Sets the travel mode. A travel mode refers to the mode of transportation, such as driving or walking. By default, the tool uses the default travel mode in the network dataset.
- Parameters
travel_mode (str) – The mode of transportation. The parameter accepts any travel mode that is defined on the network data source or a JSON format for a custom travel mode.
Find Dwell Locations¶
- class geoanalytics.tools.FindDwellLocations¶
Finds where entities dwell within a specific distance and duration using a record of their location through time.
Dwell locations are determined using time and distance tolerances. First, the tool groups points into tracks representing each entity using a track identifier and orders them sequentially. Next, the distance between the first point in a track and the next is calculated. If two temporally consecutive points stay within the given distance for at least the given duration, they are considered part of a dwell. When two points are found to be part of a dwell, the first point in the dwell is used as a reference point, and the tool finds consecutive points that are within the specified distance of the reference point in the dwell.
Once all points within the specified distance are found, the tool collects the dwell points and calculates their mean center. Features before and after the current dwell are added to the dwell if they are within the given distance of the dwell location’s mean center. This process continues until the end of the track.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Find Dwell Locations
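The dwell criteria described above can be illustrated with a small, pure-Python sketch. It is a simplified stand-in and not the tool's implementation: it works on a single time-ordered track in planar coordinates, uses the first point of each candidate dwell as the reference point, and skips the later step that extends a dwell around its mean center. The function name `find_dwells` and the tuple layout `(t, x, y)` are assumptions made for this example.

```python
from math import hypot


def find_dwells(track, max_distance, min_duration):
    """Return dwell summaries for one time-ordered track of (t, x, y) tuples.

    Simplified illustration: planar distances only, no mean-center expansion.
    """
    dwells = []
    i = 0
    while i < len(track):
        t0, x0, y0 = track[i]  # reference point for the candidate dwell
        j = i + 1
        # Collect consecutive points that stay within max_distance of the reference.
        while j < len(track) and hypot(track[j][1] - x0, track[j][2] - y0) <= max_distance:
            j += 1
        group = track[i:j]
        if len(group) > 1 and group[-1][0] - t0 >= min_duration:
            center_x = sum(p[1] for p in group) / len(group)
            center_y = sum(p[2] for p in group) / len(group)
            dwells.append({
                "start": t0,
                "end": group[-1][0],
                "count": len(group),
                "center": (center_x, center_y),  # mean center of the dwell points
            })
            i = j  # continue after the dwell
        else:
            i += 1
    return dwells


# Hypothetical track: three observations close together, then movement away.
track = [(0, 0.0, 0.0), (60, 1.0, 0.0), (120, 0.0, 1.0), (180, 50.0, 50.0), (240, 100.0, 100.0)]
dwells = find_dwells(track, max_distance=5.0, min_duration=100)
```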
- addSummaryField(summary_field, statistic, alias=None)¶
Adds a summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from First, Last, Count, Sum, Mean, Max, Min, Range, Stddev, Var, or Any.
alias (str) – The name of the result field containing the statistic. The default is the field name and statistic separated by an underscore.
- run(dataframe)¶
Runs the FindDwellLocations tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column with a spatial reference, a track ID column, and a datetime column.
- Return type
DataFrame
- setDistanceMethod(distance_method)¶
Sets the method used to calculate distances between track observations. There are two methods to choose from:
Planar: measures distances using a Euclidean plane and will not calculate statistics across the date line.
Geodesic: calculations will cross the date line when appropriate. This is the default. If the spatial reference cannot be panned, calculations will be limited to the coordinate system extent and may not wrap.
- Parameters
distance_method (str) – Choose from Planar or Geodesic.
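The practical difference between the two methods near the date line can be sketched as follows. This is an illustration only: it treats longitude and latitude as planar coordinates for the Planar case and uses a spherical haversine formula as a rough stand-in for a geodesic calculation. Both function names and the Earth radius constant are assumptions of this example, not part of the tool.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, used for the spherical approximation


def planar_degrees(lon1, lat1, lon2, lat2):
    """Euclidean distance in degrees; it cannot wrap across the date line."""
    return sqrt((lon2 - lon1) ** 2 + (lat2 - lat1) ** 2)


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine distance on a sphere; it takes the short way across the date line."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Two observations on the equator, one on each side of the date line.
planar = planar_degrees(179.0, 0.0, -179.0, 0.0)      # spans 358 degrees of longitude
spherical = great_circle_km(179.0, 0.0, -179.0, 0.0)  # spans only 2 degrees of arc
```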
- setDwellMaxDistance(max_distance, max_distance_unit)¶
Sets the maximum distance between points for them to be considered part of a single dwell event.
Note
This method is used along with setDwellMinDuration to define dwell criteria.
- Parameters
max_distance (float) – The maximum distance between points to be considered in a single dwell location.
max_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setDwellMinDuration(min_duration, min_duration_unit)¶
Sets the minimum time between points for them to be considered part of a single dwell event.
Note
This method is used along with setDwellMaxDistance to define dwell criteria.
- Parameters
min_duration (int) – The minimum time duration of a dwell to be considered in a single dwell location.
min_duration_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
- setOutputType(output_type)¶
Sets the output type.
DwellMeanCenters: A point representing the centroid of each discovered dwell location. This is the default.
DwellConvexHulls: Polygons representing the convex hull of each dwell group.
DwellPoints: All of the input points determined to belong to a dwell are returned.
AllPoints: All of the input points are returned.
- Parameters
output_type (str) – Choose from DwellMeanCenters, DwellConvexHulls, DwellPoints, or AllPoints.
- Returns
The result DataFrame specified by output_type
- setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None)¶
Sets boundaries to limit calculations to defined spans of time. For example, if you use a time boundary of 1 day starting on January 1, 1980, tracks will be analyzed one day at a time.
- Parameters
time_boundary_split (int) – The scale of the time boundary.
time_boundary_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
time_boundary_reference (int/long/datetime.datetime) – A reference datetime to align the time boundaries to. The default is epoch time 0.
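One way to picture a time boundary is as a fixed grid of spans aligned to the reference time, with each observation assigned to the span that contains it. The helper below is a hypothetical sketch of that idea using only the standard library. The name `boundary_start` and the use of fixed-length `timedelta` spans are assumptions, and calendar units such as Months or Years would need different handling.

```python
from datetime import datetime, timedelta, timezone


def boundary_start(timestamp, span, reference):
    """Return the start of the time boundary that contains timestamp."""
    steps = (timestamp - reference) // span  # floor division also works before the reference
    return reference + steps * span


reference = datetime(1980, 1, 1, tzinfo=timezone.utc)
one_day = timedelta(days=1)

after = boundary_start(datetime(1980, 1, 3, 15, 30, tzinfo=timezone.utc), one_day, reference)
before = boundary_start(datetime(1979, 12, 31, 23, 0, tzinfo=timezone.utc), one_day, reference)
```

Observations whose `boundary_start` values differ would be analyzed in separate spans.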
- setTrackFields(*track_fields)¶
Sets one or more fields used to identify distinct tracks.
- Parameters
track_fields (*str) – The names of one or more fields from the input DataFrame.
Find Hot Spots¶
- class geoanalytics.tools.FindHotSpots¶
Aggregates points into square bins and finds statistically significant bins of high incidents (hot spots) and low incidents (cold spots).
This tool finds hot and cold spots using the Getis-Ord Gi* statistic. The local counts of points for a bin and its neighbors are compared proportionally to the sum of points in all bins. A local sum is considered statistically significant (larger z-score) when it is very different from the expected local sum and when that difference is too large to be the result of random chance.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Find Hot Spots
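The comparison of a local sum against its expected value can be sketched with the standard Getis-Ord Gi* formula and binary neighborhood weights. This is a minimal pure-Python illustration, not the tool's implementation. The function name `gi_star`, the dictionary layout keyed by bin column and row, and the radius expressed in bin units are all assumptions of this example.

```python
from math import hypot, sqrt


def gi_star(counts, radius):
    """Return a Gi* z-score per bin for counts keyed by (column, row)."""
    n = len(counts)
    values = list(counts.values())
    mean = sum(values) / n
    std = sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = {}
    for (ci, ri) in counts:
        # Binary weights: 1 for bins within the radius (including the bin itself), else 0.
        local = [v for (cj, rj), v in counts.items() if hypot(cj - ci, rj - ri) <= radius]
        k = len(local)
        numerator = sum(local) - mean * k            # local sum minus expected local sum
        denominator = std * sqrt((n * k - k * k) / (n - 1))
        scores[(ci, ri)] = numerator / denominator
    return scores


# Hypothetical 5 x 5 grid: a background of 1 point per bin with a dense 3 x 3 core.
counts = {(c, r): 1 for c in range(5) for r in range(5)}
for c in range(1, 4):
    for r in range(1, 4):
        counts[(c, r)] = 10

scores = gi_star(counts, radius=1.5)
```

In this grid the center bin receives a large positive z-score (a hot spot candidate), while the corner bins receive negative scores.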
- run(dataframe)¶
Runs the FindHotSpots tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column with a projected spatial reference.
- Returns
A DataFrame of square bins assigned a z-score, p-value, and confidence level.
- Return type
DataFrame
- setBins(bin_size, bin_size_unit)¶
Sets the size of square bins used to find hot spots.
- Parameters
bin_size (float) – Distance between parallel sides of a bin.
bin_size_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setNeighborhood(distance, distance_unit)¶
Sets the size of the neighborhood used to find hot spots. The neighborhood size must be larger than the bin size.
- Parameters
distance (float) – Radius of the neighborhood, measured from each bin center.
distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setTimeStep(interval_duration, interval_unit, reference_time=None, alignment=None)¶
Sets the time step interval, time step alignment, and reference time. If set, hot spots will be calculated for each time step at each bin location. The input DataFrame must have a datetime column to use this setter.
- Parameters
interval_duration (int) – Duration of each time step.
interval_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
reference_time (int/long/datetime.datetime) – A reference datetime to align the time steps to if alignment is ReferenceTime. The default is epoch time 0.
alignment (str) – Defines how aggregation will occur based on a given interval duration. Choose from StartTime, EndTime, or ReferenceTime.
Find Point Clusters¶
- class geoanalytics.tools.FindPointClusters¶
Finds clusters of points within surrounding noise based on their spatial or spatiotemporal distribution.
Two clustering methods are supported: DBSCAN and HDBSCAN. Both methods can find clusters in space; only DBSCAN can also find spatiotemporal clusters in time-enabled point layers.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Find Point Clusters
- run(dataframe)¶
Runs the FindPointClusters tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column with a projected spatial reference.
- Returns
A copy of the input DataFrame with a cluster ID assigned to each point.
- Return type
DataFrame
- setClusterMethod(cluster_method)¶
Sets the algorithm used for cluster analysis. Supported options are “DBSCAN” and “HDBSCAN”.
The DBSCAN algorithm uses a specified distance to separate dense clusters from sparser noise. DBSCAN is faster than HDBSCAN, but is only appropriate if there is a clear search distance to use that works well to define all clusters that may be present.
DBSCAN finds clusters that have similar densities. The HDBSCAN algorithm allows for clusters with varying densities based on cluster probability (or stability).
HDBSCAN is data-driven and does not use a search distance, but is a more time-consuming calculation than DBSCAN. The DBSCAN algorithm finds clusters in two-dimensional space by default. When setSearchDuration is called, DBSCAN will discover clusters in both space and time.
- Parameters
cluster_method (str) – Choose from DBSCAN or HDBSCAN.
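To make the role of the search distance and the minimum point count concrete, here is a compact textbook-style DBSCAN in pure Python. It is a sketch for illustration and says nothing about how the tool is implemented. The function name `dbscan`, the planar point tuples, and the convention that a point counts as its own neighbor are assumptions of this example.

```python
from math import hypot

NOISE = -1


def dbscan(points, search_distance, min_points):
    """Label each (x, y) point with a cluster ID starting at 1, or NOISE."""
    labels = [None] * len(points)

    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (x, y) in enumerate(points) if hypot(x - xi, y - yi) <= search_distance]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_points:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_points:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, search_distance=1.5, min_points=3)
```

A single search distance works here because both groups are equally dense, which is the situation the description above calls appropriate for DBSCAN.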
- setMinPointsCluster(min_points_cluster)¶
This setter is used differently depending on the clustering method chosen. For DBSCAN, min_points_cluster specifies the number of points that must be found within a search range of a point for that point to start forming a cluster. The results may include clusters with fewer points than this value.
For HDBSCAN, min_points_cluster specifies the number of points neighboring each point (including the point itself) that will be considered when estimating density. This number is also the minimum cluster size allowed when extracting clusters.
- Parameters
min_points_cluster (int) – Number of points.
- setSearchDistance(search_distance, search_distance_unit)¶
Sets the search distance within which the number of points specified by setMinPointsCluster must be found (in addition to being within the search duration, if applicable) to form a cluster using the DBSCAN algorithm. No search distance is used by HDBSCAN.
- Parameters
search_distance (float) – Distance within which min_points_cluster must be found to start forming a cluster. Results may include clusters with fewer points than min_points_cluster.
search_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards if the input DataFrame has a spatial reference. Use None if the input DataFrame has no spatial reference.
- setSearchDuration(search_duration, search_duration_unit)¶
Sets the search duration within which the number of points specified by setMinPointsCluster must be found (in addition to being within the search distance) to form a cluster using the DBSCAN algorithm.
Warning
The input DataFrame must have a datetime column to use this setter.
Note
This method is not used by HDBSCAN.
- Parameters
search_duration (int) – Duration within which min_points_cluster must be found to start forming a cluster. Results may include clusters with fewer points than min_points_cluster.
search_duration_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
Find Similar Locations¶
- class geoanalytics.tools.FindSimilarLocations¶
Measures the similarity of candidate locations to one or more reference locations.
This tool requires two DataFrames, one containing the reference locations and one containing candidate locations. Using specified fields representing the criteria to match, the tool will rank all of the candidate locations by how closely they match the reference locations.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Find Similar Locations
- run(reference_dataframe, search_dataframe)¶
Runs the FindSimilarLocations tool using the provided DataFrames.
- Parameters
reference_dataframe (DataFrame) – A DataFrame containing one or more reference rows with attributes.
search_dataframe (DataFrame) – A DataFrame containing candidate locations that will be evaluated for similarity to the reference rows.
- Returns
The similarity statistics with appended fields.
- Return type
DataFrame
- setAnalysisFields(*analysis_fields)¶
Sets the fields that will be used to determine similarity. They must be numeric fields, and the fields must exist on both input DataFrames. Depending on the match method selected, the tool will find rows that are most similar based on values or profiles of the fields.
- Parameters
analysis_fields (*str) – The names of one or more fields from the input DataFrames.
- setAppendFields(*append_fields)¶
Sets which fields from the search DataFrame are included in the result. By default, all fields from the search DataFrame are appended.
- Parameters
append_fields (*str) – The names of one or more fields from the search DataFrame.
- setMatchMethod(match_method)¶
Sets the method that specifies how matching is determined. There are two options:
AttributeValues: uses the squared differences of standardized values. This is the default.
AttributeProfiles: uses cosine similarity mathematics to compare the profile of standardized values. This option requires the use of at least two analysis fields.
- Parameters
match_method (str) – Choose from AttributeValues or AttributeProfiles.
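The difference between the two match methods can be shown on already standardized values. The sketch below is illustrative only: the helper names are hypothetical, and how the tool standardizes fields or combines several reference rows is not covered here.

```python
from math import sqrt


def squared_difference(reference, candidate):
    """Sum of squared differences; smaller means more similar values."""
    return sum((r - c) ** 2 for r, c in zip(reference, candidate))


def cosine_similarity(reference, candidate):
    """Cosine of the angle between two profiles; larger means more similar shape."""
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_ref = sqrt(sum(r * r for r in reference))
    norm_cand = sqrt(sum(c * c for c in candidate))
    return dot / (norm_ref * norm_cand)


reference = [1.0, 2.0]
same_profile = [2.0, 4.0]   # same proportions, larger magnitude
close_values = [1.0, 1.5]   # similar magnitude, different proportions

by_values = (squared_difference(reference, same_profile),
             squared_difference(reference, close_values))
by_profile = (cosine_similarity(reference, same_profile),
              cosine_similarity(reference, close_values))
```

With these numbers, `close_values` ranks as more similar under the squared-difference measure, while `same_profile` ranks as more similar under cosine similarity. This also shows why a profile comparison needs at least two analysis fields.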
- setMostOrLeastSimilar(most_or_least_similar)¶
Sets the rows that will be returned. Options include returning the rows that are most similar to the reference, the rows that are least similar, or both.
- Parameters
most_or_least_similar (str) – Choose from MostSimilar, LeastSimilar, or Both.
- setNumberOfResults(number_of_results)¶
Sets the number of ranked candidate rows to return. The default is 10 and the maximum allowed is 10000.
- Parameters
number_of_results (int) – Number of most or least similar locations to return.
Generate OD Matrix¶
- class geoanalytics.tools.GenerateODMatrix¶
Creates an origin-destination cost matrix from multiple origins to multiple destinations. It returns a table that contains the travel cost, including travel time and travel distance from each origin to each destination within the specified impedance cutoff.
This tool accepts two point DataFrames as the input, representing the origins and destinations.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Generate OD Matrix
- accumulateAttributes(*attributes)¶
Accumulates cost attributes along the network between the associated origin and destination. No accumulated cost is returned by default.
- Parameters
attributes (*str) – The cost attributes to accumulate.
- run(origins_df, destinations_df)¶
Runs the GenerateODMatrix tool using the provided DataFrames.
- Parameters
origins_df (DataFrame) – A DataFrame containing points that represent the origins.
destinations_df (DataFrame) – A DataFrame containing points that represent the destinations.
- Returns
A copy of the inputs combined into one DataFrame that will also contain the rank of the destinations, the travel time in minutes, the travel distance in meters between the origin and the destination, and if specified, the resulting straight lines and accumulated cost attributes.
- Return type
DataFrame
- setCutoff(cutoff, unit=None)¶
Sets the maximum travel distance or travel time when searching for destinations for each origin. Its unit should match the travel mode. For example, if the travel mode is set in the units of distance, the impedance cutoff must be set in distance.
There are two types of cutoffs, distance and time cutoffs.
Distance cutoffs specify the maximum travel distance between origins and destinations. For example, when analyzing walking distance, a cutoff value of 1 mile (e.g. setCutoff(1, "miles")) means that the tool will search for destinations within 1 mile of walking distance from the origin.
Time cutoffs specify the maximum travel time between origins and destinations. For example, when analyzing driving time, a cutoff value of 15 minutes (e.g. setCutoff(15, "minutes")) means the tool will search for destinations within 15 minutes of driving time from the origin.
- Parameters
cutoff (int/float) – The impedance cutoff used to calculate the maximum travel distance or travel time from an origin to a destination. It accepts a positive value.
unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards for distance cutoffs. Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years for time cutoffs. By default, it is in the unit of the impedance attribute used by the travel mode.
- setNetwork(path)¶
Sets the network data source from a mobile map package or a mobile geodatabase.
- Parameters
path (str) – The path to the network data source.
- setNumDestinations(count)¶
Sets the number of destinations to find for each origin.
- Parameters
count (int) – The number of destinations to find for each origin. The default is returning all destinations within the impedance cutoff.
- setRouteGeometry(route_geometry)¶
Specifies whether to return the straight line between the origins and the destinations. The tool does not output the true shape of routes for performance reasons, but the travel time and travel distance are calculated along the network.
The following options are supported:
StraightLines: returns a straight line from the origin to the destination.
NoLines: doesn’t return any geometry.
- Parameters
route_geometry (str) – Choose from StraightLines or NoLines.
- setTime(day_of_week, time, time_zone='UTC')¶
Sets the departure time from origins.
- Parameters
day_of_week (str) – Choose from Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, or Saturday.
time (str/datetime.time) – The time of the day as a string in HH:mm:ss format, or as a datetime.time object.
time_zone (str) – The time zone, represented as a UTC offset or time zone identifier. Refer to this list of tz database time zones for allowed identifiers. The default is “UTC”.
- setTravelMode(travel_mode)¶
Sets the travel mode. A travel mode refers to the mode of transportation, such as driving or walking. By default, the tool uses the default travel mode in the network dataset.
- Parameters
travel_mode (str) – The mode of transportation. The parameter accepts any travel mode that is defined on the network data source or a JSON format for a custom travel mode.
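A typical configuration might combine the setters documented above as in the following sketch. The helper `nearest_destinations` is hypothetical. It assumes `tool` is an instance of geoanalytics.tools.GenerateODMatrix, that the DataFrames and network path are supplied by the caller, and that the default travel mode of the network is time-based so that a cutoff in minutes is appropriate. Only method names listed in this reference are used.

```python
def nearest_destinations(tool, origins_df, destinations_df, network_path):
    """Configure an OD matrix tool and run it (illustrative sketch).

    Assumption: `tool` exposes the GenerateODMatrix setters documented above.
    """
    tool.setNetwork(network_path)           # mobile map package or mobile geodatabase
    tool.setCutoff(15, "Minutes")           # assumes a time-based default travel mode
    tool.setNumDestinations(3)              # keep the three closest destinations per origin
    tool.setRouteGeometry("StraightLines")  # straight lines only; costs still follow the network
    return tool.run(origins_df, destinations_df)
```

Passing the tool in as an argument keeps the sketch independent of how the tool object and the Spark session are created.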
Geocode¶
- class geoanalytics.tools.Geocode¶
Converts addresses into geographic coordinates.
This tool requires an input DataFrame that contains one or more columns that store the string addresses to be geocoded and a locator accessible to all nodes in the Spark cluster.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Geocode
- run(dataframe)¶
Runs the Geocode tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing string addresses that will be geocoded.
- Returns
A copy of the input DataFrame with output fields specified in setOutFields(), including the geocoded locations as point geometries.
- Return type
DataFrame
- setAddressFields(*address_fields)¶
Sets one or more input address fields used by the locator to geocode addresses.
- Parameters
address_fields (*str) – The names of one or more address fields from the input DataFrame.
- setCountryCode(country_code)¶
Sets the country to search the geocoded addresses in.
- Parameters
country_code (str) – A two-letter or three-letter country code defined in ISO 3166-1.
- setLocator(path)¶
Sets the address locator that will be used to geocode the addresses. The locator must be accessible to all nodes in your Spark cluster. For more information, read about Staging the locators.
- Parameters
path (str) – The file path of a locator (.loc) or a mobile map package (.mmpk).
- setMinScore(min_score)¶
Sets the minimum score of the records that will be matched in the output.
- Parameters
min_score (int/float) – The value of the minimum score. The value should be greater than 0 and less than 100.
- setOutFields(predefined_set)¶
Sets the output fields.
LocationOnly: geocode_location is returned.
Minimal: geocode_location, Status, Score, Match_addr, and Addr_type are returned. This is the default.
MinimalAndUserFields: geocode_location, Status, Score, Match_addr, Addr_type, and any custom output fields available in the locator are returned.
All: All fields are returned including any custom fields defined in your locator.
- Parameters
predefined_set (str) – Choose from LocationOnly, Minimal, MinimalAndUserFields or All.
GWR¶
- class geoanalytics.tools.GWR¶
Performs Geographically Weighted Regression (GWR), a local form of linear regression used to model spatially varying relationships.
GWR provides a local model of a variable by fitting a regression equation to every row in the input DataFrame using the geometry and any specified explanatory variables.
Refer to the GeoAnalytics Engine guide for examples and usage notes: GWR
- Result¶
alias of
GeographicallyWeightedRegressionResult
- run(dataframe)¶
Runs the GWR tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column with a projected spatial reference, a dependent variable, and explanatory variables.
- Returns
A copy of the input DataFrame with model attributes appended to each row.
- Return type
DataFrame
- runIncludeDiagnostics(dataframe)¶
Runs the GWR tool using the provided DataFrame and also returns the model diagnostics.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column with a projected spatial reference, a dependent variable, and explanatory variables.
- Returns
A named tuple containing: outputTrained, a copy of the input DataFrame with model attributes appended to each row; and modelDiagnostics, a dictionary containing the model diagnostics.
- Return type
namedtuple
- setDependentVariable(dependent_variable)¶
Sets the numeric field containing the observed values to model.
- Parameters
dependent_variable (str) – The name of a field in the input DataFrame.
- setDistanceBand(distance_band=None, distance_band_unit=None)¶
Sets the neighborhood size as a fixed distance for each feature.
Note
This method will override setNumNeighbors if called last.
- Parameters
distance_band (float) – The distance for the spatial extent of the neighborhood.
distance_band_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setExplanatoryVariables(*explanatory_variables)¶
Sets one or more fields to represent independent explanatory variables in the model.
- Parameters
explanatory_variables (*str) – The names of one or more fields from the input DataFrame.
- setLocalWeightingScheme(local_weighting_scheme)¶
Sets the kernel type that will be used to provide the spatial weighting in the model. The kernel defines how each point is related to other points within its neighborhood. Two options are supported:
Bisquare: assigns a weight of 0 to any geometry outside the neighborhood. This is the default.
Gaussian: assigns weights to all geometries, but weights become exponentially smaller the farther away they are from the target geometry.
- Parameters
local_weighting_scheme (str) – Choose from Bisquare or Gaussian.
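The behavior of the two kernels can be sketched with commonly used textbook forms. The exact constants the tool uses are not stated in this reference, so the formulas below, the function names, and the `bandwidth` parameter (standing in for the neighborhood size) are assumptions for illustration.

```python
from math import exp


def bisquare(distance, bandwidth):
    """Weight falls smoothly to 0 at the bandwidth and stays 0 beyond it."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def gaussian(distance, bandwidth):
    """Weight decays exponentially with distance but never reaches 0."""
    return exp(-0.5 * (distance / bandwidth) ** 2)


bandwidth = 1000.0  # hypothetical neighborhood size in meters
weights = {
    d: (bisquare(d, bandwidth), gaussian(d, bandwidth))
    for d in (0.0, 500.0, 1000.0, 2000.0)
}
```

At twice the bandwidth the bisquare weight is exactly 0 while the Gaussian weight is small but still positive, which matches the distinction drawn between the two options.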
- setNumNeighbors(number_of_neighbors)¶
Sets the neighborhood size as a function of a specified number of neighbors included in calculations for each point. Where points are dense, the spatial extent of the neighborhood is smaller; where points are sparse, the spatial extent of the neighborhood is larger.
Note
This method will override setDistanceBand if called last.
- Parameters
number_of_neighbors (int) – The number of neighbors included in calculations.
Group By Proximity¶
- class geoanalytics.tools.GroupByProximity¶
Groups geometries that are within spatial or spatiotemporal proximity of each other.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Group By Proximity
- run(dataframe)¶
Runs the GroupByProximity tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a geometry column.
- Returns
A copy of the input DataFrame with a column of group IDs appended.
- Return type
DataFrame
- setAttributeRelationship(expression, expression_type='sql')¶
Sets the attribute relationship expression to further refine groupings.
- Parameters
expression (str) – Expression representing the attribute relationship.
expression_type (str) – Choose from Arcade or SQL.
- setSpatialRelationship(spatial_relationship='Intersects', near_distance=None, near_distance_unit=None)¶
Sets the type of spatial relationship to group by.
- Parameters
spatial_relationship (str) – Choose from Intersects, Touches, NearGeodesic, or NearPlanar.
near_distance (float) – The search distance to determine if geometries are near one another. This is only applied if NearGeodesic or NearPlanar is set as the spatial relationship.
near_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
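Grouping by a near relationship can be pictured as finding connected components: two geometries share a group when a chain of near pairs links them. The sketch below illustrates that idea for points with planar distances using a small union-find structure. The transitive grouping behavior, the function name `group_by_proximity`, and the point tuples are assumptions of this example rather than a description of the tool's internals.

```python
from math import hypot


def group_by_proximity(points, near_distance):
    """Return one group ID per (x, y) point; linked points share an ID."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (xi, yi), (xj, yj) = points[i], points[j]
            if hypot(xj - xi, yj - yi) <= near_distance:
                parent[find(i)] = find(j)  # merge the two groups

    group_ids = {}
    return [group_ids.setdefault(find(i), len(group_ids)) for i in range(len(points))]


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
groups = group_by_proximity(points, near_distance=1.5)
```

The first and third points are 2.0 units apart, farther than the near distance, yet they end up in the same group because the second point links them.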
- setTemporalRelationship(temporal_relationship='Intersects', temporal_distance=None, temporal_distance_unit=None)¶
Sets the type of temporal relationship to group by.
- Parameters
temporal_relationship (str) – Choose from Intersects or Near.
temporal_distance (int) – Sets the temporal search distance to determine if geometries are near one another.
temporal_distance_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
Nearest Neighbors¶
- class geoanalytics.tools.NearestNeighbors¶
Searches for the given number of neighbors to a record in a DataFrame from records in another DataFrame. The records from the input DataFrames are matched based on closest proximity.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Nearest Neighbors
- addSummaryField(summary_field, statistic, alias=None)¶
Adds a summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from Count, Sum, Mean, Max, Min, Range, Stddev, Var, or Any.
alias (str) – The name of the result field containing the statistic. The default is the field name and statistic separated by an underscore.
- run(query_dataframe, data_dataframe=None)¶
Runs the NearestNeighbors tool using the provided DataFrames.
If you only provide a query_dataframe, the DataFrame is used as both the query_dataframe and the data_dataframe. In this case, each record will be joined with other nearby records, excluding itself.
- Parameters
query_dataframe (DataFrame) – A DataFrame containing geometries whose nearest neighbors will be found.
data_dataframe (DataFrame) – A DataFrame containing the neighbor candidates.
- Returns
A DataFrame containing the result of the join.
- Return type
DataFrame
- setDistanceMethod(distance_method)¶
Sets the distance method used to determine relative nearness. There are two methods:
Planar: this is the default when the input DataFrame is in a projected coordinate system.
Geodesic: this is the default when the input DataFrame is in a geographic coordinate system.
- Parameters
distance_method (str) – Choose from Planar or Geodesic.
- setNumNeighbors(k)¶
Sets the number of neighbors to find that are nearest to each query record.
- Parameters
k (int) – The number of nearest neighbors. The number must be greater than 0.
- setOutputUnit(distance_unit)¶
Sets the desired output unit of the distance values. The default is meters.
- Parameters
distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards if the provided DataFrame has a spatial reference. Inapplicable if the input has no spatial reference.
- setResultLayout(layout='long')¶
Sets the layout format for the result DataFrame. There are two options:
long: Each row represents a query record with a single nearest neighbor, and the output is organized by stacking all paired records. This is the default (when summary statistics are not in use).
wide: Each row represents a query record with all nearest neighbors, with the fields in data_dataframe consolidated into one column for each nearest neighbor.
- Parameters
layout (str) – Choose from long format or wide format.
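The two layouts can be compared with a brute-force nearest neighbor search over a handful of planar points. This is an illustrative sketch: the helper names, the output field names (`query_id`, `rank`, `neighbor1`, and so on), and the dictionary inputs are hypothetical and do not reflect the column names the tool produces.

```python
from math import hypot


def nearest_neighbors(queries, data, k):
    """Map each query ID to its k nearest (distance, data ID) pairs."""
    result = {}
    for query_id, (qx, qy) in queries.items():
        ranked = sorted((hypot(dx - qx, dy - qy), data_id) for data_id, (dx, dy) in data.items())
        result[query_id] = ranked[:k]
    return result


def to_long(result):
    """One row per query record and neighbor pair."""
    return [
        {"query_id": query_id, "rank": rank, "neighbor_id": data_id, "distance": distance}
        for query_id, hits in result.items()
        for rank, (distance, data_id) in enumerate(hits, start=1)
    ]


def to_wide(result):
    """One row per query record, with one nested column per neighbor."""
    rows = []
    for query_id, hits in result.items():
        row = {"query_id": query_id}
        for rank, (distance, data_id) in enumerate(hits, start=1):
            row[f"neighbor{rank}"] = {"id": data_id, "distance": distance}
        rows.append(row)
    return rows


queries = {"q1": (0.0, 0.0)}
data = {"a": (1.0, 0.0), "b": (3.0, 0.0), "c": (0.0, 2.0)}
hits = nearest_neighbors(queries, data, k=2)
long_rows = to_long(hits)
wide_rows = to_wide(hits)
```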
- setSearchDistance(search_distance, search_distance_unit)¶
Sets a distance bound within which to search for nearest neighbors.
- Parameters
search_distance (float) – The search distance to determine if geometries are near one another based on the distance method in use.
search_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards if the provided DataFrame has a spatial reference. Use None if the input has no spatial reference.
Overlay¶
- class geoanalytics.tools.Overlay¶
Combines two or more geometry columns into a single column using a spatial overlay operation.
Note
This tool operates on the entire input DataFrame and thus can be more performant than equivalent row-wise operations using SQL functions.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Overlay
- run(input_dataframe, overlay_dataframe)¶
Runs the Overlay tool using the provided DataFrames.
- Parameters
input_dataframe (DataFrame) – A DataFrame containing a geometry column.
overlay_dataframe (DataFrame) – A DataFrame containing a geometry column to overlay.
- Returns
A DataFrame containing the result of the overlay.
- Return type
DataFrame
- setOverlayType(overlay_type)¶
Sets the type of overlay to be performed.
- Parameters
overlay_type (str) – Choose from Intersect, Erase, Union, Identity, or SymmetricalDifference.
Reconstruct Tracks¶
- class geoanalytics.tools.ReconstructTracks¶
Creates a line or polygon representing an entity’s path of movement over time using points or polygons with associated timestamps.
This tool groups input rows into tracks representing unique entities using a track identifier field. It then creates a linestring by connecting the point observations for each entity sequentially. The linestring can be buffered with a variable distance using a field from the input DataFrame.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Reconstruct Tracks
- addSummaryField(summary_field, statistic, alias=None)¶
Adds a summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from First, Last, Count, Sum, Mean, Max, Min, Range, Stddev, Var, or Any.
alias (str) – The name of the result field containing the statistic. The default is the field name and statistic separated by an underscore.
- run(dataframe)¶
Runs the ReconstructTracks tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point or polygon column, a track ID column, and a datetime column.
- Returns
A DataFrame containing the result linestrings or polygons.
- Return type
DataFrame
- setArcadeSplit(arcade_split)¶
Sets an Arcade expression to split tracks with. The expression will be evaluated for each point in a track and the track will be split if the expression equals True.
- Parameters
arcade_split (str) – An Arcade expression.
- setBufferField(buffer_field)¶
Sets a field in the input DataFrame that contains a buffer distance or a buffer expression. A buffer expression must begin with an equal sign (=).
- Parameters
buffer_field (str) – The name of a field from the input DataFrame.
- setDistanceMethod(distance_method)¶
Sets the method used to calculate distances between track observations. There are two methods to choose from:
Planar: measures distances using a Euclidean plane and will not calculate statistics across the date line.
Geodesic: calculations will cross the date line when appropriate. This is the default. If the spatial reference cannot be panned, calculations will be limited to the coordinate system extent and may not wrap.
- Parameters
distance_method (str) – Choose from Planar or Geodesic.
- setDistanceSplit(distance_split, distance_split_unit)¶
Sets the distance used to split tracks. Any rows in the input DataFrame that are in the same track and are farther apart than this distance will be split into a new track. If both the distance split and the time split are used, the track is split when at least one condition is met.
- Parameters
distance_split (float) – The distance used to split tracks.
distance_split_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
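The rule that a track is split when at least one condition is met can be sketched in a few lines of pure Python. The example mimics the Gap boundary option, where no segment connects the two observations on either side of a split. The function name `split_track`, the `(t, x, y)` tuples, and the use of planar distances and numeric timestamps are assumptions for illustration.

```python
from math import hypot


def split_track(points, time_split=None, distance_split=None):
    """Split one time-ordered track of (t, x, y) tuples into segments."""
    if not points:
        return []
    segments = [[points[0]]]
    for previous, current in zip(points, points[1:]):
        too_late = time_split is not None and current[0] - previous[0] > time_split
        too_far = (
            distance_split is not None
            and hypot(current[1] - previous[1], current[2] - previous[2]) > distance_split
        )
        if too_late or too_far:
            segments.append([current])    # either condition starts a new segment
        else:
            segments[-1].append(current)
    return segments


# Hypothetical observations: a long pause after t=10, then a long jump after t=500.
points = [(0, 0.0, 0.0), (10, 1.0, 0.0), (500, 2.0, 0.0), (510, 100.0, 0.0)]
segments = split_track(points, time_split=60, distance_split=50.0)
```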
- setSplitBoundaryOption(split_boundary_option)¶
Sets how the track segment between two points is created when a track is split. The split type is applied to split expressions, distance splits, and time splits. There are three options:
Gap: no segment is created between the two points (this is the default).
FinishLast: a segment is created between the two points that ends after the split.
StartNext: a segment is created between the two points that starts before the split.
- Parameters
split_boundary_option (str) – Choose from Gap, FinishLast, or StartNext.
- setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None)¶
Sets boundaries to limit calculations to defined spans of time. For example, if you use a time boundary of 1 day starting on January 1, 1980, tracks will be analyzed one day at a time.
- Parameters
time_boundary_split (int) – The scale of the time boundary.
time_boundary_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
time_boundary_reference (int/long/datetime.datetime) – A reference datetime to align the time boundaries to. The default is epoch time 0.
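The alignment of time boundaries to a reference time can be pictured with plain datetime arithmetic. The following is a minimal sketch, not the tool's implementation: boundary_window is a hypothetical helper, UTC is assumed, and only fixed-length intervals are handled (Months and Years vary in length and are left out).

```python
from datetime import datetime, timedelta, timezone


def boundary_window(timestamp, interval, reference):
    """Return the (start, end) of the time boundary window containing timestamp."""
    # Floor division counts whole intervals since the reference and also
    # works for timestamps that fall before the reference.
    steps = (timestamp - reference) // interval
    start = reference + steps * interval
    return start, start + interval


# A 1-day boundary aligned to January 1, 1980 (UTC is an assumption of this sketch).
reference = datetime(1980, 1, 1, tzinfo=timezone.utc)
one_day = timedelta(days=1)

observation = datetime(1980, 1, 3, 15, 30, tzinfo=timezone.utc)
window_start, window_end = boundary_window(observation, one_day, reference)
print(window_start.isoformat(), window_end.isoformat())
```

Observations that share a window would be analyzed together; an observation in the next window would start a new span.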
- setTimeSplit(time_split, time_split_unit)¶
Sets the time duration used to split tracks. Any rows in the input DataFrame that are in the same track and are farther apart than this time will be split into a new track. If both the distance split and time split are used, a track is split when at least one condition is met.
- Parameters
time_split (int) – The time duration used to split tracks.
time_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
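The rule that a track is split when at least one of the distance or time conditions is met can be sketched in plain Python. This is an illustration only: split_track is a hypothetical helper, distances are planar, and the observations are assumed to belong to one track and to be sorted by time already.

```python
import math
from datetime import datetime, timedelta


def split_track(observations, max_distance=None, max_gap=None):
    """Split a time-ordered list of (x, y, timestamp) tuples into sub-tracks."""
    tracks = []
    current = []
    for obs in observations:
        if current:
            px, py, pt = current[-1]
            x, y, t = obs
            # Each condition is only checked when its threshold was provided.
            too_far = max_distance is not None and math.hypot(x - px, y - py) > max_distance
            too_late = max_gap is not None and (t - pt) > max_gap
            if too_far or too_late:  # one condition is enough to split
                tracks.append(current)
                current = []
        current.append(obs)
    if current:
        tracks.append(current)
    return tracks


t0 = datetime(2024, 1, 1, 12, 0)
observations = [
    (0.0, 0.0, t0),
    (3.0, 4.0, t0 + timedelta(minutes=1)),
    (100.0, 100.0, t0 + timedelta(minutes=2)),   # far from the previous point
    (101.0, 100.0, t0 + timedelta(minutes=30)),  # long after the previous point
]
tracks = split_track(observations, max_distance=50.0, max_gap=timedelta(minutes=10))
print([len(track) for track in tracks])
```

The first split is triggered by distance alone and the second by time alone, which is the "at least one condition" behavior described above.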
- setTrackFields(*track_fields)¶
Sets one or more fields used to identify distinct tracks.
- Parameters
track_fields (*str) – The names of one or more fields from the input DataFrame.
Reverse Geocode¶
- class geoanalytics.tools.ReverseGeocode¶
Creates addresses from point geometries and returns them as string values.
This tool requires an input DataFrame that contains a column of point geometries and a locator accessible to all nodes in the Spark cluster.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Reverse Geocode
- run(dataframe)¶
Runs the ReverseGeocode tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a column of point geometries with a spatial reference.
- Returns
A copy of the input DataFrame with output fields specified in setOutFields(), including the matched reverse-geocoded addresses as string values.
- Return type
DataFrame
- setFeatureTypes(*feature_types)¶
Sets one or more match types that the reverse-geocoded addresses can be returned as.
- Parameters
feature_types (*str) – Specifies the possible match types from Subaddress, PointAddress, StreetAddress, DistanceMarker, StreetName, StreetInt, Postal, Locality, and POI.
- setLanguageCode(language_code)¶
Sets the language in which reverse geocoded addresses are returned.
- Parameters
language_code (str) – A two-letter or three-letter language code defined in ISO 639.
- setLocator(path)¶
Sets the locator that will be used to reverse geocode the input points. The locator must be accessible to all nodes in your Spark cluster. For more information, read about Staging the locators.
- Parameters
path (str) – The file path of a locator (.loc) or a mobile map package (.mmpk).
- setOutFields(predefined_set)¶
Sets the output fields.
Minimal: Match_addr and Addr_type are returned. This is the default.
MinimalAndUserFields: Match_addr, Addr_type, and any custom output fields available in the locator are returned.
All: All fields are returned including any custom fields defined in your locator.
- Parameters
predefined_set (str) – Choose from Minimal, MinimalAndUserFields or All.
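The setters above are typically applied to one tool object before run is called. The sketch below shows that order of calls using only the method names documented in this section. It makes several assumptions: configure_reverse_geocode is a hypothetical helper, the locator path is a placeholder, and RecordingTool is a stand-in that only records calls so the example runs without the library or a Spark cluster.

```python
class RecordingTool:
    """Stand-in for a tool object; records setter calls so the sketch runs anywhere."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown attribute is treated as a setter that records its arguments.
        def record(*args):
            self.calls.append((name, args))
        return record


def configure_reverse_geocode(tool, locator_path, feature_types, language_code,
                              out_fields="Minimal"):
    """Apply the documented ReverseGeocode setters to an existing tool object."""
    tool.setLocator(locator_path)
    tool.setFeatureTypes(*feature_types)
    tool.setLanguageCode(language_code)
    tool.setOutFields(out_fields)
    return tool


# With the library installed, the first argument would instead be
# geoanalytics.tools.ReverseGeocode(), followed by tool.run(points_dataframe).
tool = configure_reverse_geocode(
    RecordingTool(),
    locator_path="/path/to/locator.loc",  # hypothetical path
    feature_types=["PointAddress", "StreetAddress"],
    language_code="en",
)
for name, args in tool.calls:
    print(name, args)
```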
Snap Tracks¶
- class geoanalytics.tools.SnapTracks¶
Snaps input track points to lines. The points DataFrame must have a timestamp column where each row represents an instant in time. The lines DataFrame must also contain fields indicating the from and to nodes for analysis.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Snap Tracks
- run(points_dataframe, lines_dataframe)¶
Runs the SnapTracks tool using the provided DataFrames.
- Parameters
points_dataframe (DataFrame) – A DataFrame containing points that will be matched to lines.
lines_dataframe (DataFrame) – A DataFrame containing lines to which points will be matched. The input must contain fields with values indicating the from and to nodes of the line.
- Returns
The snapped points DataFrame with appended fields.
- Return type
DataFrame
- setAppendFields(*line_fields)¶
Sets one or more fields from the input lines DataFrame that will be included in the output result.
- Parameters
line_fields (*str) – The names of one or more fields from the line DataFrame.
- setConnectivityFields(from_node, to_node)¶
Sets the line DataFrame fields that will be used to define the connectivity of the input lines.
- Parameters
from_node (str) – The field that represents the from_node, the node that travel along a line moves away from.
to_node (str) – The field that represents the to_node, the node that travel along a line moves toward.
- setDirectionFieldMatching(direction_field, forward_value=None, backward_value=None, both_value=None, none_value=None)¶
Sets the line field and attribute values that will be used to define the direction of the input lines.
- Parameters
direction_field (str) – The field from the line DataFrame that describes the direction of travel.
forward_value (str) – The value from the direction_field that indicates the supported direction of travel is forward along a line.
backward_value (str) – The value from the direction_field that indicates the supported direction of travel is backward along a line.
both_value (str) – The value from the direction_field that indicates both forward and backward directions of travel are supported along a line.
none_value (str) – The value from the direction_field that indicates there are no supported directions of travel along a line.
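The mapping from a direction field value to the supported directions of travel can be sketched as a small lookup. This is an illustration under stated assumptions: travel_directions is a hypothetical helper, the codes "FT", "TF", "B", and "N" are invented sample values, and the handling of values that match none of the four arguments is not described in this reference, so the sketch returns None for them.

```python
def travel_directions(value, forward_value=None, backward_value=None,
                      both_value=None, none_value=None):
    """Return the set of directions of travel that a line's direction value allows."""
    if forward_value is not None and value == forward_value:
        return {"forward"}
    if backward_value is not None and value == backward_value:
        return {"backward"}
    if both_value is not None and value == both_value:
        return {"forward", "backward"}
    if none_value is not None and value == none_value:
        return set()
    # Unmatched values: behavior is not specified here, so signal "unknown".
    return None


# Hypothetical direction codes for a street network.
codes = {"forward_value": "FT", "backward_value": "TF", "both_value": "B", "none_value": "N"}
for sample in ["FT", "TF", "B", "N", "?"]:
    print(sample, travel_directions(sample, **codes))
```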
- setDistanceMethod(distance_method)¶
Sets the method used to calculate distances. There are two methods to choose from: ‘Planar’ or ‘Geodesic’ (default).
- Parameters
distance_method (str) – Choose from Planar or Geodesic.
- setDistanceSplit(distance_split, distance_split_unit)¶
Sets the distance used to split tracks. Any observations in the input DataFrame that are in the same track and are farther apart than this distance will be split into a new track. If both the distance split and the time split are used, the track is split when at least one condition is met.
- Parameters
distance_split (float) – The distance used to split tracks.
distance_split_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setOutputMode(output_mode)¶
Sets the result type. There are two options:
AllPoints: All input points are returned. This is the default.
MatchedPoints: Only input points that matched to a line are returned.
- Parameters
output_mode (str) – Choose from AllPoints or MatchedPoints.
- setSearchDistance(search_distance, search_distance_unit)¶
Sets the maximum distance allowed between a point and any line for the two to be considered a match.
- Parameters
search_distance (float) – Maximum distance between any point and a line.
search_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setTimeBoundarySplit(time_boundary_split, time_boundary_split_unit, time_boundary_reference=None)¶
Sets boundaries to limit calculations to defined spans of time. For example, if you use a time boundary of 1 day, starting on January 1, 1980, tracks will be analyzed one day at a time.
- Parameters
time_boundary_split (int) – The scale of the time boundary.
time_boundary_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
time_boundary_reference (int, long, datetime.datetime) – A reference datetime to align the time boundaries to. The default is epoch time 0.
- setTimeSplit(time_split, time_split_unit)¶
Sets the time duration used to split tracks. Any observations in the point DataFrame that are in the same track and are farther apart than this time will be split into a new track. If both the distance split and time split are used, a track is split when at least one condition is met.
- Parameters
time_split (int) – The time duration used to split tracks.
time_split_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
- setTrackFields(*track_fields)¶
Sets one or more fields used to identify distinct tracks.
- Parameters
track_fields (*str) – The names of one or more fields from the input points DataFrame.
Spatiotemporal Join¶
- class geoanalytics.tools.SpatiotemporalJoin¶
Joins attributes from one DataFrame to another based on spatial, temporal, and attribute relationships or some combination of the three.
The tool determines all input rows that meet the specified join conditions and joins the second DataFrame to the first. You can optionally join all rows to the matching rows or summarize the matching rows.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Spatiotemporal Join
- addSummaryField(summary_field, statistic, alias=None)¶
Adds a summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from Count, Sum, Mean, Max, Min, Range, Stddev, Var, or Any.
alias (str) – The name of the result field containing the statistic. The default is the field name and statistic separated by an underscore.
- includeDistance(include=True, distance_unit=None)¶
Specifies whether to include spatial distance and/or temporal difference in the columns of the result DataFrame (new in version 1.2.0).
- Parameters
include (bool) – True to include, or False to exclude, spatial distance and/or temporal difference.
distance_unit (str) – The desired output unit of the spatial distance values. The default is meters. Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards if the input DataFrames have a spatial reference; otherwise use None.
- run(target_dataframe, join_dataframe)¶
Runs the SpatiotemporalJoin tool using the provided DataFrames.
- Parameters
target_dataframe (DataFrame) – A DataFrame.
join_dataframe (DataFrame) – A DataFrame to join.
- Returns
A DataFrame containing the result of the join.
- Return type
DataFrame
- setAttributeRelationship(attribute_relationship)¶
Sets a target field, relationship, and join field used to join equal attributes.
Two relationships are supported. An equals relationship (equal in JSON, or = in the string format) joins rows whose values match exactly. To match join strings without regard to letter case or leading and trailing white space, use equalIgnoreCaseTrimWhiteSpace in JSON, or ~= in the string format.
- Parameters
attribute_relationship (str) – Expression representing the attribute relationship.
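The difference between the two attribute relationships can be illustrated with ordinary string comparison. This is a rough sketch only: both functions are hypothetical, and the exact case-folding and white space rules applied by the tool are not spelled out in this reference.

```python
def equal(target_value, join_value):
    """Strict equality, as with the equal (=) relationship."""
    return target_value == join_value


def equal_ignore_case_trim_whitespace(target_value, join_value):
    """Equality after trimming leading/trailing white space and ignoring case."""
    return target_value.strip().lower() == join_value.strip().lower()


print(equal("  Main St ", "main st"))
print(equal_ignore_case_trim_whitespace("  Main St ", "main st"))
```

Interior white space and differences in spelling still prevent a match in this sketch, so "Main St" and "Main Street" are not treated as equal.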
- setJoinCondition(join_condition)¶
Sets a condition on specified fields using an Arcade expression. Only rows with columns that meet this condition will be joined.
- Parameters
join_condition (str) – An Arcade expression.
- setJoinOneToMany()¶
Sets the join operation to one to many. If multiple join rows are found that have the same relationships with a single target row, the result DataFrame will contain multiple copies of the target row.
For example, if a single point in the target DataFrame is found within two separate polygons in the join DataFrame, the result DataFrame will contain two copies of the target row: one row with the attributes of one polygon and another row with the attributes of the other polygon. There are no summary statistics available with this method.
Note
This method will override setJoinOneToOne.
- setJoinOneToOne()¶
Sets the join operation to one to one. If multiple join rows are found that have the same relationships with a single target row, the fields from the multiple join rows will be aggregated using the specified summary statistics.
For example, if a point is found within two separate polygons, the fields associated with the two polygons will be aggregated before being returned in the result DataFrame. If one polygon has an attribute value of 3 and the other has a value of 7, and a summary statistic of sum is specified, the aggregated value in the output DataFrame will be 10. A Count field is always calculated; in this example its value is 2, the number of join rows that were aggregated.
Note
This method will override setJoinOneToMany.
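The two join operations can be compared on the example of one point that falls within two polygons with attribute values 3 and 7. The sketch below uses plain dictionaries rather than DataFrames; both functions are hypothetical, and the output field name value_Sum is only a guess at how an alias might look.

```python
def join_one_to_many(target_id, join_rows):
    """One output row per matching join row; the target row is repeated."""
    return [{"target": target_id, **row} for row in join_rows]


def join_one_to_one(target_id, join_rows, field):
    """One output row per target; matching join rows are aggregated with a sum."""
    return {
        "target": target_id,
        "Count": len(join_rows),
        field + "_Sum": sum(row[field] for row in join_rows),  # hypothetical field name
    }


polygons = [{"polygon": "A", "value": 3}, {"polygon": "B", "value": 7}]

many = join_one_to_many("point_1", polygons)
one = join_one_to_one("point_1", polygons, "value")
print(many)
print(one)
```

The one-to-many result keeps each polygon's attributes on its own row, while the one-to-one result collapses them into a single row with a count and the requested statistic.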
- setLeftJoin(left_join=True)¶
Specifies whether all target rows will be returned in the result DataFrame (known as a left or left outer join) or only those that have the specified relationships with the join rows (inner join). Left join can be used with a one-to-one join or a one-to-many join (new in version 1.1.0).
- Parameters
left_join (bool) – If True, a left outer join will be used; if False, an inner join will be used.
- setSpatialRelationship(spatial_relationship, near_distance=None, near_distance_unit=None)¶
Sets the spatial relationship used to spatially join rows.
- Parameters
spatial_relationship (str) – Choose from Equals, Intersects, Contains, Within, Crosses, Touches, Overlaps, NearPlanar, or NearGeodesic.
near_distance (float) – A double value used for the search distance to determine if a target geometry is near a join geometry. This is only applied if NearPlanar or NearGeodesic is the specified spatial relationship.
near_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards if the input DataFrames have a spatial reference; otherwise use None.
- setTemporalRelationship(temporal_relationship, near_duration=None, near_duration_unit=None)¶
Sets the temporal relationship used to temporally join rows.
- Parameters
temporal_relationship (str) – Choose from Equals, Intersects, During, Contains, Finishes, FinishedBy, Meets, MetBy, Overlaps, OverlappedBy, Starts, StartedBy, Near, NearBefore, or NearAfter.
near_duration (int) – An integer value used for the temporal search distance to determine if a target geometry is temporally near a join geometry.
near_duration_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
Summarize Within¶
- class geoanalytics.tools.SummarizeWithin¶
Summarizes geometries from the input DataFrame where they intersect summary polygons or bins using statistics.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Summarize Within
- Result¶
alias of
SummarizeWithinResult
- addRateField(rate_field)¶
Marks a numeric field in the input DataFrame as having quantity type rate/index (rather than count/sum).
- Parameters
rate_field (str) – The name of a field from the input DataFrame.
- addStandardSummaryField(summary_field, statistic, alias=None)¶
Adds a summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from Count, Sum, Mean, Max, Min, Range, Stddev, Var, or Any.
alias (str) – The name of the result field containing the statistic. The default is the field name and statistic separated by an underscore.
- addWeightedSummaryField(summary_field, statistic, alias=None)¶
Adds a weighted summary statistic of a field in the input DataFrame to the result DataFrame.
- Parameters
summary_field (str) – The name of a field from the input DataFrame.
statistic (str) – Choose from Mean, Stddev, or Var.
alias (str) – The name of the result field containing the statistic. The default is ‘p’, the field name, underscore, and statistic.
- includeShapeSummary(include=True, units=None)¶
Sets the inclusion of calculated statistics based on the geometry type of the primary geometry column in the input DataFrame, such as the length of lines or areas of polygons within each summary polygon.
- Parameters
include (bool) – If True, geometry summary statistics will be included in the result.
units (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, Yards, SquareMeters, SquareKilometers, Hectares, SquareFeet, SquareYards, SquareMiles or Acres.
- run(dataframe)¶
Runs the SummarizeWithin tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a geometry column.
- Returns
A named tuple with a DataFrame containing the summary polygons and a DataFrame containing the group-by summary (if applicable).
- Return type
namedtuple
- setGroupBy(group_by_field, include_minor_major_fields=True, include_group_percentages=True)¶
Sets a field from the input DataFrame that will be used to calculate statistics for each unique value.
When setGroupBy is called, the tool will return a DataFrame containing the statistics in addition to a DataFrame containing the summaries.
For example, suppose the input DataFrame contains parcels and the polygons set by setSummaryPolygons are city boundaries. One of the fields of the parcels is Status, which contains two values: VACANT and OCCUPIED. To calculate the total area of vacant and occupied parcels within the boundaries of cities, use Status as the group-by field.
- Parameters
group_by_field (str) – The name of a field from the input DataFrame.
include_minor_major_fields (bool) – If True, the minority (least dominant) or the majority (most dominant) attribute values for each group will be included in the result.
include_group_percentages (bool) – If True, the percentage of each unique field value is calculated for each summary polygon.
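The group-by summary for a single summary polygon can be pictured with a few lines of plain Python. The sketch below is not the tool's implementation: group_summary is a hypothetical helper, the parcel areas are invented, and ranking the majority and minority by summed area (rather than, say, by row count) is an assumption.

```python
from collections import defaultdict


def group_summary(rows, group_field, value_field):
    """Summarize rows that fall within one summary polygon by a group-by field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    grand_total = sum(totals.values())
    # Percentage of the polygon's total contributed by each unique value.
    percentages = {
        key: (100.0 * value / grand_total if grand_total else 0.0)
        for key, value in totals.items()
    }
    majority = max(totals, key=totals.get)  # most dominant value
    minority = min(totals, key=totals.get)  # least dominant value
    return dict(totals), percentages, majority, minority


# Hypothetical parcels that fall inside one city boundary.
parcels = [
    {"Status": "VACANT", "area": 2.0},
    {"Status": "OCCUPIED", "area": 5.0},
    {"Status": "VACANT", "area": 1.0},
]
totals, percentages, majority, minority = group_summary(parcels, "Status", "area")
print(totals, percentages, majority, minority)
```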
- setSummaryBins(bin_size, bin_size_unit, bin_type='square')¶
Sets the size and shape of bins that the input DataFrame will be summarized into.
Note
This method overrides setSummaryPolygons. Use setSummaryPolygons if summarizing into an existing column of polygons.
- Parameters
bin_size (float) – Distance between parallel sides of a bin.
bin_size_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
bin_type (str) – Choose from Square or Hexagon.
- setSummaryPolygons(summary_polygons)¶
Sets the DataFrame containing a column of polygons that the input DataFrame will be summarized into.
Note
This method overrides setSummaryBins. Use setSummaryBins instead if summarizing into square or hexagon bins that are generated when the tool runs.
- Parameters
summary_polygons (pyspark.sql.DataFrame) – A DataFrame containing a polygon column.
Trace Proximity Events¶
- class geoanalytics.tools.TraceProximityEvents¶
Analyzes points representing moving entities. The tool will follow entities of interest in space (location) and time to see which other entities the entities of interest have interacted with. The trace will continue from entity to entity up to a configurable maximum number of degrees of separation from the original entity of interest.
For example, suppose an organization monitors company-issued devices carried by workers. The company is interested in determining which employees were near an individual known to have COVID-19. Using the point layer representing device locations and time, they can identify devices that have been within 6 meters and 5 minutes of the contagious person and other possibly contagious employees.
Refer to the GeoAnalytics Engine guide for examples and usage notes: Trace Proximity Events
- Result¶
alias of
TraceProximityEventsResult
- includeTracksDataFrame()¶
Includes a second DataFrame with the points used in the trace.
- run(dataframe)¶
Runs the TraceProximityEvents tool using the provided DataFrame.
- Parameters
dataframe (DataFrame) – A DataFrame containing a point column, timestamp column, and entity ID column.
- Returns
A named tuple containing a copy of the input DataFrame with proximity event info appended and a DataFrame containing only points used in the trace.
- Return type
namedtuple
- setAttributeMatchCriteria(*attribute_match_criteria)¶
Sets one or more fields used to constrain the proximity events. Entities will only be considered near when the spatial search distance and temporal search distance criteria are met and the two entities have equal values of the fields specified.
- Parameters
attribute_match_criteria (*str) – The names of one or more fields from the input DataFrame.
- setDistanceMethod(distance_method)¶
Sets the method used to calculate distances between track observations. There are two methods to choose from:
Planar: measures distances using a Euclidean plane and will not calculate statistics across the date line.
Geodesic: calculations will cross the date line when appropriate. This is the default. If the spatial reference cannot be panned, calculations will be limited to the coordinate system extent and may not wrap.
- Parameters
distance_method (str) – Choose from Planar or Geodesic.
- setEntitiesOfInterestIds(entities_of_interest_ids)¶
Sets one or more entities that you are interested in tracing from, as well as a time to start tracing from. If you do not specify a time, January 1, 1970, at 12:00 a.m. will be used.
- Parameters
entities_of_interest_ids (str) – A stringified list of dictionaries containing entity IDs and times in epoch ms.
- Example
'[{"entityID": "user5", "epochTimeStamp": 1598390663000}, {"entityID": "user9", "epochTimeStamp": None}]'
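A string of this shape can be assembled from Python datetimes by converting each start time to epoch milliseconds. The following is a sketch under assumptions: to_epoch_ms and entities_of_interest are hypothetical helpers, start times are taken to be timezone-aware, and entities without a start time are left out because the sketch does not take a position on how a missing time should be written in the string.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def entities_of_interest(start_times):
    """Build the stringified list from a mapping of entity ID to start datetime."""
    entries = [
        {"entityID": entity_id, "epochTimeStamp": to_epoch_ms(moment)}
        for entity_id, moment in start_times.items()
    ]
    return json.dumps(entries)


start = datetime(2020, 8, 25, 21, 24, 23, tzinfo=timezone.utc)
ids_string = entities_of_interest({"user5": start})
print(ids_string)
```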
- setEntityIdField(entity_id_field)¶
Sets the field used to identify distinct entities.
- Parameters
entity_id_field (str) – The name of a field from the input DataFrame.
- setMaxTraceDepth(max_trace_depth)¶
Sets the maximum degrees of separation between an entity of interest and an entity further down the trace.
- Parameters
max_trace_depth (int) – Degrees of separation.
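Degrees of separation can be pictured as a breadth-first walk over proximity events. The sketch below is a deliberate simplification: trace is a hypothetical helper, the events are given as plain entity pairs, and the time ordering that the tool follows when tracing is ignored entirely.

```python
from collections import deque


def trace(events, entities_of_interest, max_trace_depth):
    """Return {entity: degrees of separation} for entities reached within the limit."""
    neighbors = {}
    for a, b in events:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    depth = {entity: 0 for entity in entities_of_interest}
    queue = deque(entities_of_interest)
    while queue:
        current = queue.popleft()
        if depth[current] >= max_trace_depth:
            continue  # do not trace beyond the configured depth
        for other in neighbors.get(current, ()):
            if other not in depth:
                depth[other] = depth[current] + 1
                queue.append(other)
    return depth


# Hypothetical proximity events between pairs of entities.
events = [("user5", "user2"), ("user2", "user7"), ("user7", "user9")]
reached = trace(events, ["user5"], max_trace_depth=2)
print(reached)
```

With a maximum depth of 2, user9 is not reached because it is three steps away from user5.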
- setSearchDistance(search_distance, search_distance_unit)¶
Sets the maximum distance between two points to be considered in proximity. Points that are within this distance of each other and that also meet the search duration criteria are considered in proximity of each other.
Note
This method is used along with setSearchDuration to define proximity.
- Parameters
search_distance (float) – The search distance used to determine if points are in proximity.
search_distance_unit (str) – Choose from Meters, Kilometers, Feet, Miles, NauticalMiles, or Yards.
- setSearchDuration(search_duration, search_duration_unit)¶
Sets the maximum duration between two points to be considered in proximity. Points that are within this duration of each other and that also meet the search distance criteria are considered in proximity of each other.
Note
This method is used along with setSearchDistance to define proximity.
- Parameters
search_duration (int) – The search duration used to determine if points are in proximity.
search_duration_unit (str) – Choose from Milliseconds, Seconds, Minutes, Hours, Days, Weeks, Months, or Years.
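The way the search distance and search duration combine can be sketched as a single predicate. This is a minimal illustration, assuming planar coordinates in meters and treating both limits as inclusive, which this reference does not state; in_proximity is a hypothetical helper.

```python
import math
from datetime import datetime, timedelta


def in_proximity(a, b, search_distance, search_duration):
    """Check whether two (x, y, timestamp) observations are close in space and time."""
    (ax, ay, at), (bx, by, bt) = a, b
    close_in_space = math.hypot(ax - bx, ay - by) <= search_distance
    close_in_time = abs(at - bt) <= search_duration
    return close_in_space and close_in_time  # both criteria must be met


t0 = datetime(2020, 8, 25, 21, 24, 23)
person = (0.0, 0.0, t0)
nearby_soon = (3.0, 4.0, t0 + timedelta(minutes=4))    # 5 m away, 4 minutes later
nearby_later = (3.0, 4.0, t0 + timedelta(minutes=10))  # 5 m away, 10 minutes later

limit_distance = 6.0                    # meters
limit_duration = timedelta(minutes=5)
print(in_proximity(person, nearby_soon, limit_distance, limit_duration))
print(in_proximity(person, nearby_later, limit_distance, limit_duration))
```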