Learn how to automate downloading data from a portal using the ArcGIS API for Python.
In this tutorial, you will download and import data taken from the Los Angeles GeoHub using the ArcGIS API for Python. The datasets include Trailheads (CSV), Trails (GeoJSON), and Parks and Open Space (Shapefile) files.
The data will be stored locally on your machine.
Prerequisites
The ArcGIS API for Python tutorials use Jupyter Notebooks to execute Python code. If you are new to this environment, please see the guide to install the API and use notebooks locally.
Steps
Import modules and log in
- Import the GIS class and create a connection to ArcGIS Online. You will also load Path from pathlib and ZipFile from zipfile, both in the Python standard library. Because the data is public, you can use an anonymous connection to ArcGIS Online to download it.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()
Access the item by ID
- Create a variable to store the ID of the public data item.

    public_data_item_id = 'a04933c045714492bda6886f355416f2'
- The content property of a GIS object is an instance of the ContentManager class, which can be used to manage content in ArcGIS Online. The ContentManager.get() method makes an HTTP request to retrieve an Item object.

    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item
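Because the lookup returns None rather than raising when the ID is unknown, it can help to fail early with a clear message. The following is a minimal sketch of that idea. The require_item helper and the FakeContentManager stand-in are hypothetical names introduced only for illustration; they are not part of the ArcGIS API for Python. The stand-in lets the example run without a network connection.

```python
def require_item(content_manager, item_id):
    """Return the item for item_id, or raise if the lookup returned None.

    `content_manager` is any object with a `get(item_id)` method, such as
    `gis.content` in this tutorial. This helper is hypothetical and is not
    provided by the ArcGIS API for Python.
    """
    item = content_manager.get(item_id)
    if item is None:
        raise LookupError(f"No item found with ID {item_id!r}")
    return item


class FakeContentManager:
    """Stand-in for gis.content so the sketch runs offline (assumption)."""

    def __init__(self, items):
        self._items = items

    def get(self, item_id):
        # Returns None when the ID is unknown, as the tutorial comment describes.
        return self._items.get(item_id)


fake_content = FakeContentManager(
    {"a04933c045714492bda6886f355416f2": "LA Hub datasets"}
)
found = require_item(fake_content, "a04933c045714492bda6886f355416f2")
print(found)
```

In the notebook, the same helper could be called as require_item(gis.content, public_data_item_id) in place of the direct gis.content.get call.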
Download the item
- Download LA_Hub_datasets.zip to a data folder in the notebook server's current location.

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)
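The directory setup in this step can also be written with the mkdir options that pathlib provides. The sketch below is one possible variant, not the tutorial's own code: it uses a temporary directory (an assumption, so the example does not touch your working directory) in place of the notebook's current location, and it does not call the ArcGIS API.

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for the notebook's working directory (assumption).
base_dir = Path(tempfile.mkdtemp())
data_path = base_dir / "data"

# exist_ok=True makes the call safe to repeat; parents=True creates any
# missing parent folders, so no separate exists() check is needed.
data_path.mkdir(parents=True, exist_ok=True)
data_path.mkdir(parents=True, exist_ok=True)  # second call does nothing

# Same file and folder names as in the tutorial step.
zip_path = data_path / "LA_Hub_Datasets.zip"
extract_path = data_path / "LA_Hub_datasets"
print(zip_path.name, extract_path.name)
```

The / operator on Path objects is equivalent to joinpath, so either spelling works here.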
- Use ZipFile to extract the contents of the dataset.

    zip_file = ZipFile(zip_path)
    zip_file.extractall(path=data_path)
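The extraction step can be made a little more defensive by checking the archive first and by opening it in a with block, which closes the file handle automatically. The sketch below illustrates this with a small archive built on the spot. The file sample.zip and its single member are hypothetical stand-ins, not the real GeoHub download.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile, is_zipfile

work_dir = Path(tempfile.mkdtemp())
zip_path = work_dir / "sample.zip"  # hypothetical archive, not the real download

# Build a tiny archive so the sketch has something to extract.
with ZipFile(zip_path, "w") as archive:
    archive.writestr("LA_Hub_datasets/Trailheads.csv", "name\nExample Trailhead\n")

# is_zipfile returns False for a missing file or a file that is not a ZIP archive.
if not is_zipfile(zip_path):
    raise RuntimeError(f"{zip_path} is missing or is not a ZIP archive")

# The with block closes the archive even if extraction raises an error.
with ZipFile(zip_path) as archive:
    member_names = archive.namelist()
    archive.extractall(path=work_dir)

print(member_names)
```

A check like this also surfaces a mismatch between the expected file name and the name of the file that was actually downloaded before extraction is attempted.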
- Call glob('*') on extract_path to list the contents of the data directory.

    from arcgis.gis import GIS
    from pathlib import Path
    from zipfile import ZipFile

    gis = GIS()

    public_data_item_id = 'a04933c045714492bda6886f355416f2'
    # `ContentManager.get` will return `None` if there is no Item with ID `a04933c045714492bda6886f355416f2`
    data_item = gis.content.get(public_data_item_id)
    data_item

    # configure where to save the data, and where the ZIP file is located
    data_path = Path('./data')
    if not data_path.exists():
        data_path.mkdir()
    zip_path = data_path.joinpath('LA_Hub_Datasets.zip')
    extract_path = data_path.joinpath('LA_Hub_datasets')
    data_item.download(save_path=data_path)

    zip_file = ZipFile(zip_path)
    zip_file.extractall(path=data_path)

    files = [file.name for file in extract_path.glob('*')]
    files
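The same glob call accepts narrower patterns, which is handy for picking out one file type from the extracted folder. The sketch below assumes three placeholder file names chosen to match the formats mentioned in the introduction. The real names inside the archive may differ.

```python
import tempfile
from pathlib import Path

# Recreate an extract folder with placeholder files (hypothetical names).
extract_path = Path(tempfile.mkdtemp()) / "LA_Hub_datasets"
extract_path.mkdir()
for name in ["Trailheads.csv", "Trails.geojson", "Parks_and_Open_Space.shp"]:
    (extract_path / name).touch()

# glob does not guarantee an order, so sort for a stable listing.
files = sorted(path.name for path in extract_path.glob("*"))

# A narrower pattern selects a single file type.
csv_files = sorted(path.name for path in extract_path.glob("*.csv"))

print(files)
print(csv_files)
```

Sorting is optional in the notebook, but it keeps the listing identical across operating systems.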