Vector tiles contain vector representations of data across a range of
scales and can be used to visualize geometries in a Spark DataFrame. Writing a
DataFrame with the vector-tile format creates a directory of tiles at the output
location. These tiles can be visualized in applications such as ArcGIS Pro and Mapbox.
Below is an example of a vector tile result stored in PBF files.
This layout includes a collection of folders with subfolders and .pbf files, as well as the vector-tile.json file that contains metadata like the path to the tiles folder, layer name, spatial reference, and symbology.
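To check what a write produced, a short script can walk the output folder. The sketch below is a minimal example rather than a GeoAnalytics Engine utility: the function name `summarize_tiles` is hypothetical, and it assumes that each .pbf file sits in a level/row/column folder layout, which you should confirm against your own output.

```python
from collections import Counter
from pathlib import Path


def summarize_tiles(output_dir):
    """Return (has_metadata, tiles_per_level) for a vector tile output folder.

    Assumption: tiles are stored as <level>/<row>/<column>.pbf somewhere
    under output_dir, so the level is the name of the grandparent folder.
    """
    root = Path(output_dir)
    # Look for the metadata file recursively in case it is nested.
    has_metadata = any(root.rglob("vector-tile.json"))
    # Count .pbf files per assumed zoom-level folder.
    tiles_per_level = Counter(
        tile.parent.parent.name for tile in root.rglob("*.pbf")
    )
    return has_metadata, dict(tiles_per_level)
```

Calling `summarize_tiles` on the folder passed to `save` returns whether the metadata file was found and how many tiles were counted for each level folder.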
The following table shows differences in terms between GeoAnalytics Engine and GIS software (such as ArcGIS Pro).
Description | GeoAnalytics Engine term | GIS software term |
---|---|---|
A single record in your dataset. | Record or row | Feature |
A DataFrame column or a field in a GIS dataset | Column | Field or attribute |
Create vector tiles in GeoAnalytics Engine
The following table shows examples of the Python syntax for writing to vector tiles with GeoAnalytics Engine, where path is a path to a directory of vector tiles.
Load | Save |
---|---|
Not supported | df.write.format("vector-tile").save(path) |
Not supported | df.write.save(path, format="vector-tile") |
When you write to vector tiles, specify the export format as vector-tile. There are also required and optional parameters, explained in the table below.
Parameter name | Explanation | Type | Example |
---|---|---|---|
layerName | The name of the output vector tile layer. | Required | "earthquakes" |
minLevel | The minimum zoom level created. If minScale is provided, minLevel will be ignored. The default is 0. | Optional | 5 In this case, zoom levels from 5 to 15 will be included in the vector tile result. The folder will include subfolders 5-15. |
maxLevel | The maximum zoom level created. The default is 15. | Optional | 12 In this case, zoom levels from 0 to 12 will be included in the vector tile result. The folder will include subfolders 0-12. |
minScale | The minimum (smallest) scale at which the tiles are created, for zooming out as far as possible. This parameter accepts the reciprocal of the scale; the scale itself is 1:minScale. If the minScale parameter is not provided, minLevel will be used. | Optional | 295828764 A value of 295828764 specifies a scale of 1:295828764. When zooming in from whole-world extent, data will start to appear at a scale of 1:295828764. |
maxScale | The maximum (largest) scale at which records will display when zooming in. This parameter accepts the reciprocal of the scale; the scale itself is 1:maxScale. If maxScale is greater than the scale corresponding to maxLevel, the higher scale range is supported in ArcGIS viewers by overzoom, without writing additional levels of tiles. Clients that do not support overzoom will only zoom in to the scale corresponding to maxLevel. If the maxScale parameter is not provided, maxLevel will be used. | Optional | 564 A value of 564 specifies a scale of 1:564. ArcGIS Pro will overzoom in to 1:564; other clients may not overzoom, so maxLevel should be increased in this case. |
maxPointsPerTile | The maximum number of records that will be included in a single tile. If the number of records exceeds this limit, not all records will draw. In this case, the records will be chosen arbitrarily. The default is 50000. Set this parameter to -1 to include all features in the vector tile result. | Optional | 1000 |
prioField | The prioritization field determines which records will be prioritized to be included in the vector tile result. Use this field when the number of records in a tile exceeds maxPointsPerTile. | Optional | "magnitude" In this example, if the number of records in a tile exceeds maxPointsPerTile, the records with the highest magnitude value will be prioritized to be included in the vector tile result. |
attributes | The columns to be included in the vector tiles. By default, no columns are included. | Optional | "magnitude,depth" In this example, two columns will be shown in client applications, such as the popups in ArcGIS Pro. |
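The level and scale parameters describe the same range in two ways, so it can be useful to convert between them. The sketch below rests on an assumption that is not stated in this document: zoom level 0 is drawn at about 1:295828764 (the whole-world value used in the table), and each further level halves the scale denominator, as in common web map tiling schemes. The constant and function names are illustrative and are not part of GeoAnalytics Engine.

```python
import math

# Assumption: zoom level 0 corresponds to 1:295828764 and every further
# level halves the scale denominator.
LEVEL_0_SCALE = 295_828_764


def scale_for_level(level):
    """Approximate scale denominator for a zoom level."""
    return LEVEL_0_SCALE / (2 ** level)


def level_for_scale(scale):
    """Nearest zoom level for a scale denominator such as 564."""
    return round(math.log2(LEVEL_0_SCALE / scale))


# Print the approximate scale denominator for a few zoom levels.
for level in (0, 12, 15):
    print(level, round(scale_for_level(level)))
```

Under this assumption, a scale value of 564 lands near zoom level 19, which is beyond the default maximum level of 15 and would therefore rely on overzoom in clients that support it.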
The example below writes the DataFrame df to vector tiles with a maximum zoom level of 12 and a maximum of 10,000 points per tile, prioritizes records by magnitude, and includes the magnitude field as an attribute in the result.
df.write.format("vector-tile") \
.option("layerName", "earthquakes") \
.option("maxLevel", 12) \
.option("maxPointsPerTile", 10000) \
.option("prioField", "magnitude") \
.option("attributes", "magnitude") \
.save(r"C:\vector_tile_output\earthquakes")
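When the same settings are reused across several writes, it can help to collect them in a dictionary first. The sketch below is one possible approach and is not part of GeoAnalytics Engine: `vector_tile_options` and `write_vector_tiles` are hypothetical helper names, and the option keys are the ones used in the example above. It relies on PySpark's `DataFrameWriter.options(**options)` to apply them.

```python
def vector_tile_options(layer_name, max_level=None, max_points_per_tile=None,
                        prio_field=None, attributes=None):
    """Build the option dictionary for a vector-tile write.

    Only settings that were given are included, so the documented defaults
    apply to everything else.
    """
    options = {"layerName": layer_name}
    if max_level is not None:
        options["maxLevel"] = max_level
    if max_points_per_tile is not None:
        options["maxPointsPerTile"] = max_points_per_tile
    if prio_field is not None:
        options["prioField"] = prio_field
    if attributes:
        # The attributes option is a comma-separated string of column names.
        options["attributes"] = ",".join(attributes)
    return options


def write_vector_tiles(df, path, **kwargs):
    """Write df (a Spark DataFrame of points) to path as vector tiles."""
    options = vector_tile_options(**kwargs)
    df.write.format("vector-tile").options(**options).save(path)
```

For example, `write_vector_tiles(df, path, layer_name="earthquakes", max_level=12, attributes=["magnitude"])` would issue a write equivalent to the one above without the point limit and prioritization field.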
Visualize vector tiles
Most client applications require vector tiles to be hosted on a web server before they can be visualized. There are multiple ways to host vector tiles. See the following tutorials for some example workflows:
Visualize hosted vector tiles in ArcGIS Pro
The steps below outline how to view vector tiles in ArcGIS Pro.
- Create a new map or open an existing one.
- Click Map > Add Data > Data From Path.
- For the path, provide the URL to the hosted vector-tile.json file that was written by GeoAnalytics Engine for the vector tiles you want to visualize. For service type, select "Vector Tile Service" and click Add.
The geometries will draw on the map and you will see a vector tile layer in the Contents pane.
You can also enable popups on vector tile layers and view attribute values. To enable popups, right-click on the vector tile layer in Contents and select "Enable Pop-ups".
Visualize hosted vector tiles in Mapbox
- Create a basic HTML application using the Mapbox API.
- Add the vector tile layer that references the hosted location.
window.map.addSource('tiles', {
  "type": "vector",
  "minzoom": 0,
  "maxzoom": 9,
  "tiles": [`{z}/{x}/{y}.pbf`]
});
- Add the layer to the map along with properties.
map.addLayer({
  "id": "test",
  "type": "circle",
  "source": "tiles",
  "source-layer": "earthquakes",
  "paint": {
    "circle-color": "red"
  }
}, "waterway-label");
Example Mapbox app:
<!DOCTYPE html>
<html>
<head>
  <meta charset='utf-8' />
  <title>Display a map</title>
  <meta name='viewport' content='initial-scale=1,maximum-scale=1,user-scalable=no' />
  <script src='https://api.tiles.mapbox.com/mapbox-gl-js/v1.12.0/mapbox-gl.js'></script>
  <link href='https://api.tiles.mapbox.com/mapbox-gl-js/v1.12.0/mapbox-gl.css' rel='stylesheet' />
  <style>
    body { margin: 0; padding: 0; }
    #map { position: absolute; top: 0; bottom: 0; width: 100%; }
  </style>
</head>
<body>
  <div id='map'></div>
  <script>
    const agsKey = '<YOUR_KEY>';
    const agsStyle = 'ArcGIS:Community';
    var map = new mapboxgl.Map({
      container: 'map', // container id
      style: `https://basemaps-api.arcgis.com/arcgis/rest/services/styles/${agsStyle}/?type=style&token=${agsKey}`,
      center: [-85.50, 40],
      zoom: 0
    });
    map.on('load', function () {
      window.map.addSource('tiles', {
        "type": "vector",
        "minzoom": 0,
        "maxzoom": 15,
        "tiles": [`http://<YOUR_PATH>/{z}/{y}/{x}.pbf`] // Add the path to your hosted tiles here
      });
      map.addLayer({
        "id": "test",
        "type": "circle",
        "source": "tiles",
        "source-layer": "vector_layer",
        "paint": {
          "circle-color": "red"
        }
      });
    });
  </script>
</body>
</html>
Best practices
If you are using a large, dense dataset and wish to export multiple zoom
levels, it may take a long time to export and render the vector tiles.
In this case, use the maxPointsPerTile parameter to limit the number of records included in each tile.
The records to include are prioritized in an arbitrary way by default. You can also use a prioritization field in cases where you wish to include and display certain records first.
For example, if you are visualizing a dataset of world cities, using the population as the prioritization field will prioritize displaying the largest cities first in the shallower levels.
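The effect of a record limit combined with a prioritization field can be pictured with plain Python. This is only an illustration of the idea, not how GeoAnalytics Engine selects records internally; the function name `select_for_tile` and the sample cities are made up for the example.

```python
def select_for_tile(records, max_points, prio_field=None):
    """Pick at most max_points records for one tile.

    With prio_field, records with the highest values are kept first.
    Without it, the first max_points records are kept, standing in for an
    arbitrary choice.
    """
    if max_points < 0 or len(records) <= max_points:
        return list(records)  # a negative limit means "include everything"
    if prio_field is None:
        return list(records[:max_points])
    ranked = sorted(records, key=lambda r: r[prio_field], reverse=True)
    return ranked[:max_points]


# Hypothetical sample data for one tile.
cities = [
    {"name": "A", "population": 50_000},
    {"name": "B", "population": 9_000_000},
    {"name": "C", "population": 1_200_000},
]
largest = select_for_tile(cities, max_points=2, prio_field="population")
print([city["name"] for city in largest])
```

With a limit of two and population as the prioritization field, the two most populous sample cities are kept and the smallest one is dropped from the tile.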
In case you want to get more detailed information about individual records in your vector tile layer, you can include attributes when exporting in GeoAnalytics Engine and then enable popups in ArcGIS Pro.
Limitations
Only points are supported. The only supported vector tile layout is folders of .pbf files.