Introduction
In part-2 of this guide series, we saw how GIS data can be accessed from various data formats using the Spatially enabled DataFrame (SeDF). In this part, we will look at how a SeDF can be used to export data to various spatial and non-spatial formats. We will also explore how local data can be easily overwritten using a SeDF.
The data used in this guide is provided as an item. We will start by importing some libraries and then downloading and extracting the data needed for the analysis in this guide.
# Import Libraries
import pandas as pd
from arcgis.features import GeoAccessor, GeoSeriesAccessor
from arcgis.gis import GIS
from IPython.display import display
import zipfile
import os
import shutil
# Create a GIS connection
gis = GIS()
agol_gis = GIS("https://www.arcgis.com", "arcgis_python", "amazing_arcgis_123")
# Get the data item
data_item = gis.content.get('c7140ae3d7ae4fd0817181461019aa75')
data_item
The cell below downloads and extracts the data from the data item to your machine.
# Download and extract the data
def unzip_data():
    r"""
    This function:
    - creates a directory `sedf_data` to download the data from the item
    - downloads the item as `sedf_guide_data.zip` file in the sedf_data directory
    - unzips and extracts the data to '.\sedf_data\cities'.
    """
    try:
        # path to downloaded data folder
        data_dir = os.path.join(os.getcwd(), 'sedf_data')
        # remove the existing data directory if it exists, then recreate it
        if os.path.isdir(data_dir):
            shutil.rmtree(data_dir)
            print('Removed existing data directory')
        os.makedirs(data_dir)
        data_item.download(data_dir)  # download the data item
        # path to zipped file inside data folder
        zipped_file_path = os.path.join(data_dir, 'sedf_guide_data.zip')
        # unzip the data
        zip_ref = zipfile.ZipFile(zipped_file_path, 'r')
        zip_ref.extractall(data_dir)
        zip_ref.close()
        # path to new cities directory
        cities_dir = os.path.join(data_dir, 'cities')
        print(f'Dataset unzipped at: {os.path.relpath(cities_dir)}')
    except Exception as e:
        print(f'Error unzipping file: {e}')
# Extract data
unzip_data()
Dataset unzipped at: sedf_data\cities
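The extraction step above opens and closes the archive explicitly. As a minimal sketch using only the standard library, the same step can be written with a context manager so that the archive is closed even if extraction fails. Note that `extract_archive` is a hypothetical helper name used for illustration and is not part of the ArcGIS API for Python.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract zip_path into target_dir and return the archive member names."""
    target = Path(target_dir)
    # Create the target directory (and any parents) if it does not exist yet
    target.mkdir(parents=True, exist_ok=True)
    # The context manager closes the archive even if extraction raises
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `extract_archive("sedf_data/sedf_guide_data.zip", "sedf_data")` would unpack the archive downloaded above.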
Create a SeDF
Here, we will create a SeDF and then export the data to various data formats.
gis = GIS()
item = gis.content.search(
"USA Major Cities", item_type="Feature layer", outside_org=True)[0]
item
# Obtain the first feature layer from the item
flayer = item.layers[0]
# Use the `from_layer` static method in the 'spatial' namespace on the Pandas' DataFrame
sdf = pd.DataFrame.spatial.from_layer(flayer)
# Check shape
sdf.shape
(3886, 50)
# Check first few records
sdf.head()
| | AGE_10_14 | AGE_15_19 | AGE_20_24 | AGE_25_34 | AGE_35_44 | AGE_45_54 | AGE_55_64 | AGE_5_9 | AGE_65_74 | AGE_75_84 | ... | PLACEFIPS | POP2010 | POPULATION | POP_CLASS | RENTER_OCC | SHAPE | ST | STFIPS | VACANT | WHITE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1313 | 1058 | 734 | 2031 | 1767 | 1446 | 1136 | 1503 | 665 | 486 | ... | 1601990 | 13816 | 15181 | 6 | 1271 | {"x": -12462673.723706163, "y": 5384674.994080... | ID | 16 | 271 | 13002 |
| 1 | 890 | 817 | 818 | 1799 | 1235 | 1330 | 1143 | 1099 | 721 | 579 | ... | 1607840 | 11899 | 11946 | 6 | 1441 | {"x": -12506251.313993266, "y": 5341537.793529... | ID | 16 | 318 | 9893 |
| 2 | 12750 | 13959 | 16966 | 32135 | 27048 | 29595 | 24177 | 12933 | 12176 | 7087 | ... | 1608830 | 205671 | 225405 | 8 | 33359 | {"x": -12938676.6836459, "y": 5403597.04949123... | ID | 16 | 6996 | 182991 |
| 3 | 790 | 768 | 699 | 1445 | 1136 | 1134 | 935 | 959 | 679 | 464 | ... | 1611260 | 10345 | 10727 | 6 | 1461 | {"x": -12667411.402393516, "y": 5241722.820606... | ID | 16 | 241 | 7984 |
| 4 | 3803 | 3779 | 3687 | 7571 | 5559 | 4744 | 3624 | 4397 | 2296 | 1222 | ... | 1612250 | 46237 | 53942 | 7 | 5196 | {"x": -12989383.674504517, "y": 5413226.487333... | ID | 16 | 1428 | 35856 |
5 rows × 50 columns
# Check type of sdf
type(sdf)
pandas.core.frame.DataFrame
# Access spatial namespace
sdf.spatial.geometry_type
['point']
We can see that the dataset has 3886 records and 50 columns. Inspecting the type of the sdf object and accessing the spatial namespace shows us that a Spatially enabled DataFrame has been created from all the data in the layer.
Writing GIS Data
The Spatially enabled DataFrame can export data to various data formats for use in other applications. Let's dive into the details of exporting GIS data to various sources.
Publish as a Feature Layer
Data in a Spatially enabled DataFrame can be exported to feature layers hosted on ArcGIS Online or ArcGIS Enterprise using the to_featurelayer() method.
Let's export the sdf DataFrame, created above, to a feature layer stored in an ArcGIS Online organization.
# Export to feature layer
lyr = sdf.spatial.to_featurelayer('census_cities_export', gis=agol_gis)
lyr
# Check type
type(lyr.layers[0])
arcgis.features.layer.FeatureLayer
The census_cities_export feature layer has been created at the ArcGIS Online connection specified.
Write to JSON based formats
Data in a Spatially enabled DataFrame can be exported to JSON based formats, such as FeatureSet or FeatureCollection, using the to_featureset() and to_feature_collection() methods. Let's take a look.
Write to FeatureSet
The to_featureset() method can be used to export data from a SeDF into a FeatureSet.
# Write to FeatureSet
fset_exp = sdf.spatial.to_featureset()
# Check type
type(fset_exp)
arcgis.features.feature.FeatureSet
A FeatureSet object has been created from the data in the SeDF.
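To get a feel for what a FeatureSet holds, it can help to look at the general shape of its JSON representation. The following is a minimal, hand-built sketch of an Esri-JSON-style point FeatureSet written with plain Python dictionaries. It does not call to_featureset() and is not guaranteed to match its exact output; `points_to_featureset_dict` is a hypothetical helper, the sample records are illustrative, and the `wkid` of 102100 (Web Mercator) is an assumption.

```python
import json


def points_to_featureset_dict(records, wkid=102100):
    """Build an Esri-JSON-style FeatureSet dictionary from point records.

    Each record is a dict with 'x', 'y' and any number of attribute keys.
    """
    features = []
    for record in records:
        # Split each record into its attributes and its point geometry
        attributes = {k: v for k, v in record.items() if k not in ("x", "y")}
        features.append({
            "attributes": attributes,
            "geometry": {"x": record["x"], "y": record["y"]},
        })
    return {
        "geometryType": "esriGeometryPoint",
        "spatialReference": {"wkid": wkid},  # assumed spatial reference
        "features": features,
    }


# Two illustrative records in the spirit of the cities data
sample = [
    {"NAME": "Ammon", "ST": "ID", "x": -12462673.72, "y": 5384674.99},
    {"NAME": "Blackfoot", "ST": "ID", "x": -12506251.31, "y": 5341537.79},
]
fset_dict = points_to_featureset_dict(sample)
print(json.dumps(fset_dict["features"][0], indent=2))
```

Each feature pairs an `attributes` dictionary with a `geometry` dictionary, which mirrors how each row of a SeDF pairs its attribute columns with its SHAPE column.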
Write to FeatureCollection
The to_feature_collection() method can be used to export data from a SeDF into a FeatureCollection.
# Write to FeatureCollection
fc_exp = sdf.spatial.to_feature_collection()
# Check type
type(fc_exp)
arcgis.features.feature.FeatureCollection
A FeatureCollection object has been created from the data in the SeDF.
Write to a local file
Data in a Spatially enabled DataFrame can be exported to local spatial file formats, such as Feature classes or shapefiles, and non-spatial formats, such as csv files or tables. Let's take a look.
Write to local databases
The to_featureclass() method can be used to export spatial data from a SeDF into various local databases, such as a File geodatabase, a Mobile geodatabase (.geodatabase), or a SQLite Database.
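In the examples in this guide, the output format follows from the location path passed to to_featureclass(): a .gdb, .geodatabase, or .sqlite container followed by a dataset name, or a path ending in .shp. The sketch below only illustrates that naming convention. `describe_location` is a hypothetical helper and makes no claim about how the library itself resolves paths.

```python
from pathlib import PurePosixPath

# Container suffixes used by the examples in this guide
_CONTAINER_FORMATS = {
    ".gdb": "File Geodatabase",
    ".geodatabase": "Mobile Geodatabase",
    ".sqlite": "SQLite Database",
}


def describe_location(location):
    """Return the (format, dataset name) implied by an output location string."""
    # Normalize Windows separators so the path parses the same on any platform
    path = PurePosixPath(location.replace("\\", "/"))
    if path.suffix.lower() == ".shp":
        return ("Shapefile", path.stem)
    # Otherwise the parent folder is expected to be a database container
    container = _CONTAINER_FORMATS.get(path.parent.suffix.lower())
    if container is None:
        raise ValueError(f"Unrecognized output location: {location}")
    return (container, path.name)


print(describe_location("./sedf_data/cities/cities.gdb/major_cities_export"))
print(describe_location("./sedf_data/cities/major_cities_export.shp"))
```

A check like this can catch a mistyped container extension before any data is written.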
File Geodatabase
Note: In the absence of arcpy, the Fiona package must be present in your current conda environment to perform this operation.
# Export to a feature class in File Geodatabase
sdf.spatial.to_featureclass(
location="./sedf_data/cities/cities.gdb/major_cities_export")
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\major_cities_export'
A Feature Class has been created in a File Geodatabase from the data in the SeDF.
Mobile Geodatabase
Note: This operation requires arcpy.
# Export to a feature class in Mobile Geodatabase
sdf.spatial.to_featureclass(
location="./sedf_data/cities/cities_mobile.geodatabase/major_cities_export")
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities_mobile.geodatabase\\main.major_cities_export'
A Feature Class has been created in a Mobile Geodatabase from the data in the SeDF.
SQLite Database
Note: This operation requires arcpy.
# Export to a feature class in SQLite Database
sdf.spatial.to_featureclass(
location="./sedf_data/cities/cities.sqlite/major_cities_export")
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.sqlite\\major_cities_export'
A Feature Class has been created in a SQLite Database from the data in the SeDF.
Write to a shapefile
The to_featureclass() method can also be used to export spatial data from a SeDF into a shapefile.
Note: In the absence of arcpy, the Fiona package must be present in your current conda environment to perform this operation.
# Export to a shapefile
sdf.spatial.to_featureclass(
location="./sedf_data/cities/major_cities_export.shp")
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\major_cities_export.shp'
A Shapefile has been created from the data in the SeDF.
Write to Non-spatial formats
The to_table() method can be used to export data from a SeDF into non-spatial formats, such as csv files or tables.
Write to a csv file
# Export to a csv file
sdf.spatial.to_table(location="./sedf_data/cities/cities_table_export.csv")
'./sedf_data/cities/cities_table_export.csv'
A csv file has been created from the data in the SeDF.
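A quick way to confirm the export without loading it back into a DataFrame is to count its rows and columns with the standard library. This is a minimal sketch: `csv_shape` is a hypothetical helper, and it assumes the file is UTF-8 encoded and has a single header row.

```python
import csv


def csv_shape(path):
    """Return (data_rows, columns) for a CSV file with one header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            return (0, 0)  # empty file
        # Count the remaining rows without holding them in memory
        rows = sum(1 for _ in reader)
    return (rows, len(header))
```

For example, `csv_shape("./sedf_data/cities/cities_table_export.csv")` returns the number of exported records and columns.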
Write to a table in a File Geodatabase
Note: This operation requires arcpy.
# Export to a table in a File Geodatabase
sdf.spatial.to_table(
location="./sedf_data/cities/cities.gdb/cities_table_export")
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\cities_table_export'
A table has been created in a File Geodatabase from the data in the SeDF.
Overwriting GIS Data
The GIS data stored locally can be easily overwritten using the Spatially enabled DataFrame. Let's take a look.
Overwrite a Featureclass
The default overwrite=True argument in the to_featureclass() method can be used to overwrite an existing feature class with the data in a SeDF.
The major_cities_export feature class was created in a section above using sdf. We will overwrite this feature class with a subset of the data from sdf.
# Subset the data
sub_df = sdf.iloc[:10, -13:].copy()
sub_df.shape
(10, 13)
# Check head
sub_df.head(2)
| | NAME | OTHER | OWNER_OCC | PLACEFIPS | POP2010 | POPULATION | POP_CLASS | RENTER_OCC | SHAPE | ST | STFIPS | VACANT | WHITE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ammon | 307 | 3205 | 1601990 | 13816 | 15181 | 6 | 1271 | {"x": -12462673.723706163, "y": 5384674.994080... | ID | 16 | 271 | 13002 |
| 1 | Blackfoot | 1077 | 2788 | 1607840 | 11899 | 11946 | 6 | 1441 | {"x": -12506251.313993266, "y": 5341537.793529... | ID | 16 | 318 | 9893 |
Note: In the absence of arcpy, the Fiona package must be present in your current conda environment to perform this operation.
# Export sub_df to the existing major_cities_export featureclass
sub_df.spatial.to_featureclass(
location="./sedf_data/cities/cities.gdb/major_cities_export", overwrite=True)
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\major_cities_export'
# Check if the featureclass is updated
fc_new_df = pd.DataFrame.spatial.from_featureclass(
location="./sedf_data/cities/cities.gdb/major_cities_export")
fc_new_df.shape
(10, 14)
The featureclass has been overwritten with new data.
Overwrite a table
The default overwrite=True argument in the to_table() method can be used to overwrite an existing non-spatial table with the data in a SeDF.
The cities_table_export table was created in a section above using sdf. We will overwrite this table with sub_df, the subset of the data defined above.
Table in a csv file
# Export sub_df to an existing cities_table_export.csv file
sub_df.spatial.to_table(
location="./sedf_data/cities/cities_table_export.csv", overwrite=True)
'./sedf_data/cities/cities_table_export.csv'
# Check if the csv file is updated
tbl_new_df = pd.DataFrame.spatial.from_table(
filename="./sedf_data/cities/cities_table_export.csv")
tbl_new_df.shape
(10, 14)
The csv file has been overwritten with new data.
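Because overwrite=True is the default, an existing file is replaced by the export. If the previous contents matter, one option is to copy the file aside first. The sketch below uses only the standard library; `backup_file` is a hypothetical helper, and it applies to single-file outputs such as a csv file, not to tables stored inside a geodatabase.

```python
import shutil
from pathlib import Path


def backup_file(path, suffix=".bak"):
    """Copy path to path + suffix and return the backup path, or None if absent."""
    source = Path(path)
    if not source.is_file():
        return None  # nothing to back up
    backup = source.with_name(source.name + suffix)
    shutil.copy2(source, backup)  # copy2 also preserves file timestamps
    return backup
```

Calling `backup_file("./sedf_data/cities/cities_table_export.csv")` before the export would leave a `cities_table_export.csv.bak` copy next to the file.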
Table in a File Geodatabase
Note: This operation requires arcpy.
# Export sub_df to an existing table in a File Geodatabase
sub_df.spatial.to_table(
location="./sedf_data/cities/cities.gdb/cities_table_export")
'C:\\Users\\mohi9282\\Documents\\sedf_guides\\sedf_data\\cities\\cities.gdb\\cities_table_export'
# Check if the table file is updated
tbl_new_df2 = pd.DataFrame.spatial.from_table(
filename="./sedf_data/cities/cities.gdb/cities_table_export")
tbl_new_df2.shape
(10, 13)
The table file has been overwritten with new data.
Memory-based Workspace
Writing geoprocessing outputs to memory is an alternative to writing output to a geodatabase or file-based format. It is often significantly faster than writing to on-disk formats. Data written into memory is temporary and is deleted when the application is closed, so it is an ideal location to write intermediate data.
ArcGIS provides two memory-based workspaces where geoprocessing outputs can be written.
- memory is a new memory-based workspace developed for ArcGIS Pro that supports output feature classes, tables, and raster datasets.
- in_memory is the legacy memory-based workspace built for ArcMap that supports output feature classes, tables, and raster datasets.
Let's look at an example of writing to a memory workspace. Here, we will:
- write data from a SeDF to a memory workspace.
- use the data in the memory workspace to generate buffers and export the results to another memory workspace.
- see how results in a memory workspace can be converted to a feature class.
- delete memory workspaces.
Note:
- Memory-based workspaces do not support geodatabase elements, such as feature datasets, representations, topologies, geometric networks, or network datasets.
- Folders cannot be created in memory-based workspaces.
- Since memory-based workspaces are stored in your system's physical memory, or RAM, your system may run low on memory if you write large datasets into the workspace. This can negatively impact processing performance.
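Since folders cannot be created in memory-based workspaces, every dataset path has exactly two parts: the workspace name and the dataset name. The following sketch builds such paths and rejects nested names. `memory_path` is a hypothetical helper written for illustration; it does not use arcpy.

```python
def memory_path(name, workspace="memory"):
    """Build a dataset path in a memory-based workspace, rejecting nested folders."""
    if workspace not in ("memory", "in_memory"):
        raise ValueError(f"Unknown memory workspace: {workspace}")
    if not name or "\\" in name or "/" in name:
        # Folders cannot be created in memory-based workspaces
        raise ValueError(f"Dataset name must be a single path segment: {name!r}")
    return f"{workspace}\\{name}"


print(memory_path("sub_df"))
```

The resulting string has the same form as the `memory\sub_df` paths used in this section.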
Note: This operation requires arcpy.
# Import arcpy
import arcpy
# Check head
sub_df.head(2)
| | NAME | OTHER | OWNER_OCC | PLACEFIPS | POP2010 | POPULATION | POP_CLASS | RENTER_OCC | SHAPE | ST | STFIPS | VACANT | WHITE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ammon | 307 | 3205 | 1601990 | 13816 | 15181 | 6 | 1271 | {"x": -12462673.723706163, "y": 5384674.994080... | ID | 16 | 271 | 13002 |
| 1 | Blackfoot | 1077 | 2788 | 1607840 | 11899 | 11946 | 6 | 1441 | {"x": -12506251.313993266, "y": 5341537.793529... | ID | 16 | 318 | 9893 |
# Write data from SeDF to a memory workspace.
sub_df.spatial.to_featureclass(r"memory\sub_df")
'memory\\sub_df'
# Use data in memory to generate buffers, exporting output to memory
arcpy.Buffer_analysis(in_features=r"memory\sub_df",
                      out_feature_class=r"memory\subBuffers",
                      buffer_distance_or_field=1)
Output
memory\subBuffers
Messages
Start Time: Friday, November 12, 2021 12:14:14 PM
Succeeded at Friday, November 12, 2021 12:14:15 PM (Elapsed Time: 0.08 seconds)
# Read buffer output into a SeDF
buffered_df = pd.DataFrame.spatial.from_featureclass(r"memory\subBuffers")
buffered_df.shape
(10, 16)
# Check head
buffered_df.head(2)
| | OBJECTID | name | other | owner_occ | placefips | pop2010 | population | pop_class | renter_occ | st | stfips | vacant | white | BUFF_DIST | ORIG_FID | SHAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Ammon | 307 | 3205 | 1601990 | 13816 | 15181 | 6 | 1271 | ID | 16 | 271 | 13002 | 1.0 | 1 | {"curveRings": [[[-12462673.7237, 5384675.9940... |
| 1 | 2 | Blackfoot | 1077 | 2788 | 1607840 | 11899 | 11946 | 6 | 1441 | ID | 16 | 318 | 9893 | 1.0 | 2 | {"curveRings": [[[-12506251.314, 5341538.79349... |
# Convert buffer results to a featureclass
arcpy.Dissolve_management(r"memory\subBuffers",
"./sedf_data/cities/cities.gdb/memBuffers2")
Output
.\sedf_data\cities\cities.gdb\memBuffers2
Messages
Start Time: Friday, November 12, 2021 12:19:46 PM
Dissolving...
Succeeded at Friday, November 12, 2021 12:19:46 PM (Elapsed Time: 0.47 seconds)
# Delete the in-memory item
arcpy.Delete_management(r"memory\sub_df")
Output
true
Messages
Start Time: Friday, November 12, 2021 12:20:45 PM
Succeeded at Friday, November 12, 2021 12:20:45 PM (Elapsed Time: 0.00 seconds)
# Delete the in-memory item
arcpy.Delete_management(r"memory\subBuffers")
Output
true
Messages
Start Time: Friday, November 12, 2021 12:20:47 PM
Succeeded at Friday, November 12, 2021 12:20:47 PM (Elapsed Time: 0.00 seconds)
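Data in a memory workspace stays there until the application is closed, so explicit cleanup is easy to miss when an intermediate step fails. One way to make cleanup automatic is a small context manager, sketched below. `temporary_datasets` is a hypothetical helper; the delete function is passed in as an argument so that the example runs without arcpy, and in practice you would pass `arcpy.Delete_management`.

```python
from contextlib import contextmanager


@contextmanager
def temporary_datasets(delete_func, *names):
    """Yield the dataset paths and delete each one on exit, even after an error."""
    try:
        yield names
    finally:
        for name in names:
            delete_func(name)


# Stand-in for arcpy.Delete_management so the sketch runs anywhere
deleted = []
with temporary_datasets(deleted.append, r"memory\sub_df", r"memory\subBuffers") as (src, buf):
    # Geoprocessing that writes src and buf would go here
    pass
print(deleted)
```

With arcpy, deleting a dataset that was never created may raise an error, so a fuller version might check `arcpy.Exists` before each delete.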
Conclusion
In this guide, we explored how the Spatially enabled DataFrame (SeDF) can be used to export spatial data to various formats. We started by exporting the data to web feature layers and to in-memory JSON based formats, such as FeatureSet and FeatureCollection. Next, we explored writing the data to various local data sources, such as a File Geodatabase, a Mobile Geodatabase, a SQLite Database, and a shapefile. We also discussed exporting the data to non-spatial formats, such as a csv file or a table. We then saw how data in local formats, such as a feature class or a table in a File Geodatabase, can be overwritten using a SeDF. Towards the end, we discussed how the data from a SeDF can be exported to memory-based workspaces.
In the next part of the guide series, you will learn about the various properties of a SeDF and how they can be used to pre-process a SeDF.
Creating quality documentation is time-consuming and exhaustive, but we are committed to providing you with the best experience possible. With that in mind, we will be rolling out the revamped guides on this topic as different parts of a guide series (like the Data Engineering or Geometry guide series). This is "part-3" of the guide series for Spatially Enabled DataFrame. You will continue to see the existing documentation as we revamp it to add new parts. Stay tuned for more on this topic.