geoanalytics.STDataFrameAccessor
create_optimal_sr
- geoanalytics.extensions.STDataFrameAccessor.create_optimal_sr(self, property, custom_name=None, geometry=None)
Creates a spatial reference with a custom projected coordinate system optimal for the data extent and intended purpose of your analysis.
Supported Properties:
EQUAL_AREA - Preserves the relative area of regions everywhere on earth. Shapes and distances will be distorted.
CONFORMAL - Preserves angles in small areas. Shapes, sizes, and distances will be distorted.
EQUIDISTANT_ONE_POINT - Preserves distances when measured through the center of the projection. Areas, shapes, and other distances will be distorted.
EQUIDISTANT_MERIDIANS - Preserves distances when measured along meridians. Area, shape, and other distances will be distorted.
COMPROMISE_WORLD - Does not preserve areas, shapes, or distances specifically, but creates a balance between these geometric properties. Compromise projections are only suggested for very large areas.
- Parameters
property (str) – A property that represents the purpose of the projection. Choose from EQUAL_AREA, CONFORMAL, EQUIDISTANT_ONE_POINT, EQUIDISTANT_MERIDIANS, COMPROMISE_WORLD.
custom_name (str, optional) – The name of the custom projected coordinate system. If unspecified, the name will be Custom_Projection.
geometry (str, optional) – Geometry field name. Required if there is more than one geometry field and the default is not set.
- Returns
A spatial reference object
- Return type
SpatialReference
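As a minimal sketch of how the property names above might be used, the helper below checks a property string against the supported list before forwarding it to create_optimal_sr. It assumes the accessor is reachable as `df.st`, which this section does not state, and `optimal_sr_for` is a hypothetical helper name rather than part of the library.

```python
# Property names listed under "Supported Properties" above.
SUPPORTED_PROPERTIES = frozenset({
    "EQUAL_AREA",
    "CONFORMAL",
    "EQUIDISTANT_ONE_POINT",
    "EQUIDISTANT_MERIDIANS",
    "COMPROMISE_WORLD",
})


def optimal_sr_for(df, prop, custom_name=None, geometry=None):
    """Hypothetical helper: validate `prop`, then call create_optimal_sr.

    Assumption: `df` exposes the STDataFrameAccessor as `df.st`.
    """
    if prop not in SUPPORTED_PROPERTIES:
        raise ValueError(
            f"Unsupported property {prop!r}; "
            f"choose from {sorted(SUPPORTED_PROPERTIES)}"
        )
    # Arguments mirror the documented signature.
    return df.st.create_optimal_sr(prop, custom_name=custom_name, geometry=geometry)
```

Validating the string locally surfaces a typo such as "EQUALAREA" before any work is handed to the accessor.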
get_extent
- geoanalytics.extensions.STDataFrameAccessor.get_extent(self, geometry=None)
Computes the spatial extent of a geometry column in the dataframe and returns it as a BoundingBox.
- Parameters
geometry (str, optional) – Geometry field name. Required if there is more than one geometry field and the default is not set.
- Returns
a bounding box representing the extent
- Return type
BoundingBox
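To make the idea of a spatial extent concrete, here is a small plain-Python sketch that computes the smallest axis-aligned box around a set of points. It is illustrative only: `Extent` and `extent_of` are hypothetical names, and this is neither the library's BoundingBox type nor its implementation.

```python
from typing import Iterable, NamedTuple, Tuple


class Extent(NamedTuple):
    """Hypothetical stand-in for a bounding box; not the library's BoundingBox."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def extent_of(points: Iterable[Tuple[float, float]]) -> Extent:
    """Return the smallest axis-aligned box containing every (x, y) point."""
    pts = list(points)
    if not pts:
        raise ValueError("cannot compute the extent of an empty collection")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return Extent(min(xs), min(ys), max(xs), max(ys))


# Three longitude/latitude pairs.
print(extent_of([(-122.4, 37.8), (-73.9, 40.7), (-87.6, 41.9)]))
# prints Extent(xmin=-122.4, ymin=37.8, xmax=-73.9, ymax=41.9)
```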
get_geometry_field
- geoanalytics.extensions.STDataFrameAccessor.get_geometry_field(self, *, infer=True)
Returns the set geometry field for the Spark DataFrame.
- Parameters
infer (bool, optional, keyword-only) – If True and there is exactly one geometry column, that column is inferred to be the geometry field. The default is True.
- Returns
the geometry field name if set
- Return type
str
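The inference rule described above can be sketched in plain Python. This is a hedged illustration, not the library's code: `resolve_geometry_field`, the `column_types` mapping, and the "geometry" type label are all assumptions made for the example, and returning None when nothing is set is likewise assumed.

```python
from typing import Mapping, Optional


def resolve_geometry_field(
    column_types: Mapping[str, str],
    explicit: Optional[str] = None,
    *,
    infer: bool = True,
) -> Optional[str]:
    """Illustrative only: mimic the documented `infer` rule.

    `column_types` maps column names to simple type labels; the label
    "geometry" is an assumption made for this sketch.
    """
    if explicit is not None:
        return explicit  # a field that was set explicitly always wins
    if infer:
        candidates = [name for name, kind in column_types.items() if kind == "geometry"]
        if len(candidates) == 1:
            return candidates[0]  # exactly one candidate, so infer it
    return None  # nothing set and nothing unambiguous to infer
```

With two geometry columns and no explicit choice the sketch returns None, which matches the requirement elsewhere in this reference that a geometry field name be supplied when there is more than one geometry field and no default is set.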
get_spatial_reference
- geoanalytics.extensions.STDataFrameAccessor.get_spatial_reference(self, geometry_field=None)
Returns the spatial reference for the geometry field.
- Parameters
geometry_field (pyspark.sql.Column, optional) – Geometry type column.
- Returns
A NamedTuple containing the SRID, whether the spatial reference is projected (a PCS), and the spatial reference unit.
- Return type
geoanalytics.sql.SpatialReference
get_time_fields
- geoanalytics.extensions.STDataFrameAccessor.get_time_fields(self, *, infer=True)
Returns the set time field(s) for the Spark DataFrame.
- Parameters
infer (bool, optional, keyword-only) – If True and there is exactly one timestamp column, that column is inferred to be the start time field. The default is True.
- Returns
a list of time field names if set
- Return type
list
plot
- geoanalytics.extensions.STDataFrameAccessor.plot(self, geometry=None, cmap_values=None, is_categorical=None, vmin=None, vmax=None, ax=None, cmap=None, figsize=None, dpi=None, aspect='equal', max_geoms=1000000, legend=False, legend_kwds=None, classification_method=None, classification_kwds=None, basemap=None, xmargin=None, ymargin=None, sr=None, extent=None, **style_kwds)
Plot a geometry column from a PySpark DataFrame.
- Parameters
geometry (str, optional) – Name of the geometry column to plot. Required if the DataFrame has more than one geometry column.
cmap_values (str, optional) – Name of the column to use for color mapping.
classification_method (str, optional) – The name of the mapclassify classification method to use.
classification_kwds (dict, optional) – Keyword arguments to pass to mapclassify.classify, such as ‘k’.
is_categorical (bool, optional) – Set to True when the cmap_values column is categorical. The default is False.
vmin (float, optional) – Cmap minimum value.
vmax (float, optional) – Cmap maximum value.
ax (matplotlib.axes.Axes, optional) – The axes on which to plot. By default new axes are created.
cmap (str, optional) – Name of the matplotlib colormap to use.
figsize ((float, float), optional) – Tuple representing the width and height of the resulting matplotlib.figure.Figure in inches. This parameter is ignored when the ax parameter is set.
dpi (float, optional) – The resolution of the figure in dots-per-inch.
aspect (str or float, optional) – Aspect of the axes. Choose from “equal” (default), “auto”, or set a number representing the ratio of the height to the width.
max_geoms (int, optional) – Maximum number of geometries to plot. The default is 1,000,000.
legend (bool, optional) – Adds a legend to the plot for the cmap_values values if set to True. The default is False.
legend_kwds (dict, optional) – A dictionary of legend keyword arguments. For categorical legends, any argument accepted by matplotlib.axes.Axes.legend is supported. For continuous legends, see the arguments for matplotlib.pyplot.colorbar.
basemap (str, optional) – Adds a basemap to the plot. Choose from “light” (Light Gray Canvas), “dark” (Dark Gray Canvas), “streets” (Esri Streets Basemap) or “osm” (OpenStreetMap Vector Basemap). Basemap labels are not supported.
xmargin (float, optional) – Sets padding of X data. For more information see matplotlib.axes.Axes.set_xmargin.
ymargin (float, optional) – Sets padding of Y data. For more information see matplotlib.axes.Axes.set_ymargin.
sr (SpatialReference, optional) – Spatial reference (SRID or WKT) to set or transform to on the resulting plot.
extent (BoundingBox, optional) – Sets the extent for plotting geometries. Only geometries that intersect the extent will be visible in the plot.
**style_kwds – Additional styling keyword arguments.
zorder (float): Sets the drawing order when multiple geometry columns are plotted on the same axes.
When plotting points and multipoints, any argument accepted by matplotlib.pyplot.scatter is supported. For linestrings and polygons, see the arguments for matplotlib.collections.LineCollection and matplotlib.collections.PatchCollection respectively.
- Returns
Matplotlib axes
- Return type
matplotlib.axes.Axes
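A minimal sketch of a typical call follows, combining several of the parameters documented above to draw a continuous choropleth. It assumes the accessor is reachable as `df.st`, which this section does not state; `plot_choropleth` is a hypothetical wrapper, and the colormap, basemap, and figure size are arbitrary choices.

```python
def plot_choropleth(df, value_column, geometry=None, ax=None):
    """Hypothetical helper: plot `df` colored by a numeric column.

    Assumption: `df` exposes the STDataFrameAccessor as `df.st`.
    """
    return df.st.plot(
        geometry=geometry,          # needed only with several geometry columns
        cmap_values=value_column,   # column that drives the colors
        is_categorical=False,       # treat the column as continuous
        cmap="viridis",             # any matplotlib colormap name
        legend=True,
        legend_kwds={"label": value_column},  # continuous legends take colorbar arguments
        basemap="light",            # Light Gray Canvas
        figsize=(10, 6),            # ignored when `ax` is supplied
        ax=ax,
    )
```

Passing an existing `ax` lets several geometry columns share one set of axes, in which case a `zorder` style keyword could control which layer is drawn on top.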
set_geometry_field
- geoanalytics.extensions.STDataFrameAccessor.set_geometry_field(self, geometry_field)
Returns a Spark DataFrame with the geometry field set.
- Parameters
geometry_field (pyspark.sql.Column) – Geometry type column.
- Returns
Spark DataFrame with the geometry field set.
- Return type
pyspark.sql.dataframe.DataFrame
set_spatial_reference
- geoanalytics.extensions.STDataFrameAccessor.set_spatial_reference(self, srid, geometry_field=None)
Sets the spatial reference on the geometry field.
- Parameters
srid (int) – The well-known ID (WKID) of the spatial reference.
geometry_field (pyspark.sql.Column, optional) – Geometry type column.
- Returns
Spark DataFrame with the spatial reference set on the geometry field.
- Return type
pyspark.sql.dataframe.DataFrame
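Because this method returns a DataFrame instead of modifying one in place, the result has to be captured. The sketch below sets a spatial reference and reads it back with get_spatial_reference. It assumes the accessor is reachable as `df.st`, which this section does not state, and `set_and_read_srid` is a hypothetical helper name.

```python
def set_and_read_srid(df, srid):
    """Hypothetical helper: set an SRID, then read the spatial reference back.

    Assumption: `df` exposes the STDataFrameAccessor as `df.st`.
    """
    # Keep the returned DataFrame; `df` itself is left unchanged.
    tagged = df.st.set_spatial_reference(srid)
    return tagged, tagged.st.get_spatial_reference()
```

A call such as `set_and_read_srid(df, 4326)` would use WKID 4326, the identifier for WGS 84.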
set_time_fields
- geoanalytics.extensions.STDataFrameAccessor.set_time_fields(self, start_time_field, end_time_field=None)
Returns a Spark DataFrame with the time field(s) set.
- Parameters
start_time_field (pyspark.sql.Column) – TimestampType column or StringType column that will be cast to TimestampType.
end_time_field (pyspark.sql.Column, optional) – TimestampType column or StringType column that will be cast to TimestampType.
- Returns
Spark DataFrame with the time field(s) set.
- Return type
pyspark.sql.dataframe.DataFrame
to_pandas_sdf
- geoanalytics.extensions.STDataFrameAccessor.to_pandas_sdf(self, geometry=None)
Converts a Spark DataFrame to a Pandas Spatially Enabled DataFrame.
Note
The map viewer widget is only supported in Jupyter Notebooks.
- Parameters
geometry (pyspark.sql.Column, optional) – Geometry type column to use as the geometry column of the Pandas Spatially Enabled DataFrame. The default is None, in which case the first valid geometry type column is used.
- Returns
A Pandas Spatially Enabled DataFrame.
- Return type
pandas.core.frame.DataFrame
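A pandas DataFrame is held in the memory of a single Python process, so it can be prudent to cap the number of rows before converting. The sketch below does that with the standard PySpark `DataFrame.limit` method. It assumes the accessor is reachable as `df.st`, which this section does not state; `to_sdf_sample` and its default of 1000 rows are hypothetical.

```python
def to_sdf_sample(df, max_rows=1000, geometry=None):
    """Hypothetical helper: convert at most `max_rows` rows to a pandas SDF.

    Assumption: `df` is a PySpark DataFrame exposing the accessor as `df.st`.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # DataFrame.limit is standard PySpark; it returns a new DataFrame.
    return df.limit(max_rows).st.to_pandas_sdf(geometry=geometry)
```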