- URL: https://<geoanalytics-url>/RunPythonScript
- Methods: GET
- Version Introduced: 10.7
Description
The RunPythonScript operation runs a Python script on an ArcGIS GeoAnalytics Server site. In the script, you can create an analysis pipeline by chaining together multiple GeoAnalytics Tools without writing intermediate results to a data store. You can also use other Python functionality in the script that can be distributed across the GeoAnalytics Server.
For example, suppose that each week you receive a new dataset of vehicle locations containing billions of point features. Each time you receive a new dataset, you must perform the same workflow involving multiple GeoAnalytics Tools to create an information product that you share within your organization. This workflow creates several large intermediate layers that take up a large amount of space in your data store. By scripting this workflow in Python and running the code in the RunPythonScript operation, you can avoid creating these unnecessary intermediate layers while simplifying the steps to create the information product.
When you use RunPythonScript, the Python code is run on the GeoAnalytics Server. The script runs in the Python 3.9 environment that is installed with GeoAnalytics Server, and all console output is returned as job messages. Some Python modules can be used in the script to distribute code across multiple cores of one or more machines in the GeoAnalytics Server site using Apache Spark 3.3.0 (the compute platform that distributes analysis for GeoAnalytics Tools).
A geoanalytics module is available and allows you to run GeoAnalytics Tools in the script. This module is imported automatically when you use RunPythonScript. To learn more, see Using GeoAnalytics Tools in Run Python Script.
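For example, a minimal sketch of calling a tool through the geoanalytics module might look like the following; the tool name, parameter values, and use of the first input layer are illustrative assumptions, not a complete reference:

# Sketch: run a GeoAnalytics tool inside Run Python Script.
# geoanalytics and layers are provided automatically; the tool and
# parameters below are illustrative assumptions.
bins = geoanalytics.aggregate_points(point_layer=layers[0],
                                     bin_type="Square",
                                     bin_size=1,
                                     bin_size_unit="Kilometers")
bins.printSchema()  # the result is returned as a Spark DataFrame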
To interact directly with Spark in the RunPythonScript operation, use the pyspark module, which is imported automatically when you run the task. The pyspark module is the Python API for Spark and provides a collection of distributed analysis tools for data management, clustering, regression, and more that can be called in RunPythonScript and run across GeoAnalytics Server.
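As a sketch, standard pyspark DataFrame operations can be applied directly to an input layer (the field name here is an assumption, matching the filter in the example request below):

# Sketch: plain pyspark DataFrame operations on an input layer.
# layers[0] is the first layer from inputLayers; "Month" is an assumed field.
september = layers[0].filter("Month = 'September'")
september.groupBy("Month").count().show()  # output appears in the job messages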
For examples demonstrating how to use the geoanalytics and pyspark modules, see Examples: Scripting custom analysis with the Run Python Script task.
When using the geoanalytics and pyspark modules, most functions return analysis results in memory as Spark DataFrames. Spark DataFrames can be written to a data store or used further in the script. This allows you to chain together multiple geoanalytics and pyspark tools while writing out only the final result to a data store, eliminating the need to create intermediate result layers. To learn more, see Reading and writing layers in pyspark.
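For instance, a sketch of chaining a tool result with a pyspark filter and persisting only the final DataFrame; the webgis writer follows the Reading and writing layers in pyspark topic, while the tool, field, and output name are assumptions:

# Sketch: chain geoanalytics and pyspark, writing only the final result.
bins = geoanalytics.aggregate_points(point_layer=layers[0],
                                     bin_type="Square",
                                     bin_size=1,
                                     bin_size_unit="Kilometers")
busy = bins.filter(bins["COUNT"] > 100)        # intermediate result stays in memory
busy.write.format("webgis").save("busy_bins")  # only the final layer reaches the data store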
For advanced users, an instance of SparkContext is instantiated automatically as sc and can be used in the script to interact with Spark. This allows custom distributed analysis across GeoAnalytics Server.
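For example, a minimal sketch of custom distributed work with sc:

# Sketch: use the provided SparkContext for custom distributed analysis.
rdd = sc.parallelize(range(1000), 8)   # distribute a sequence across 8 partitions
print(rdd.map(lambda x: x * x).sum())  # computed in parallel; printed as a job message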
It is recommended that you write the Python script in an integrated development environment (IDE) and copy the script text into the RunPythonScript tool. This way, you can identify syntax and typographical errors before running the script. It is also recommended that you first run the script using a small subset of the input data to verify that there are no logic errors or exceptions. You can use the DescribeDataset task to create a sample layer for this purpose.
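Alternatively, a script under development can be pointed at a small subset of an input layer using standard pyspark (a sketch):

# Sketch: test the workflow on a small subset before a full run.
sample = layers[0].limit(1000)  # keep only 1,000 rows while debugging
print(sample.count())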
When ArcGIS GeoAnalytics Server is installed on Linux, additional configuration steps are required before using the RunPythonScript operation. These steps are not required in Windows environments. To use RunPythonScript on Linux, install and configure Python 3.7 or later for Linux on each machine in the GeoAnalytics Server site, ensuring that Python is installed in the same directory on each machine. Then, update the ArcGIS Server properties on the GeoAnalytics Server site with the pysparkPython property. The value of this property should be the path to the Python executable on the GeoAnalytics Server machines, for example, {"pysparkPython": "<path to Python executable>"}.
Request parameters
Parameter | Details |
---|---|
pythonScript | (Required) The Python script that will run on GeoAnalytics Server. This must be the full script as a string. The layers provided in inputLayers can be accessed in the script using the layers object. |
inputLayers | A list of input layers that will be used in the Python script. Each input layer follows the same formatting as described in the Feature input topic, for example, a URL to a feature service layer with an optional filter. |
userParameters | A JSON object that will be automatically loaded into the script environment as a local variable named user_variables. |
context | Contains additional settings that affect task operation. To control the output data store, use the dataStore setting. |
f | The response format. The default response format is html. Values: html \| json |
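For example, if userParameters is set to {"minCount": 100}, the script can read it through the user_variables variable described above (a sketch; the key name is hypothetical):

# Sketch: read a value passed in through userParameters.
# user_variables is loaded automatically; "minCount" is a hypothetical key.
min_count = user_variables["minCount"]
print("Using minimum count:", min_count)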
Example usage
The following is a sample request URL for RunPythonScript:
https://hostname.domain.com/webadaptor/rest/services/System/GeoAnalyticsTools/GPServer/RunPythonScript/submitJob?pythonScript=print("Hello world!")&inputLayers=[{"url":"https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter":"Month = 'September'"}]
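The same request can be submitted from a Python client, for example with the requests library (a sketch; the host names and token handling are deployment-specific assumptions):

import requests

# Sketch: submit a RunPythonScript job. URL, token, and layer are
# deployment-specific assumptions.
base = "https://hostname.domain.com/webadaptor/rest/services/System/GeoAnalyticsTools/GPServer/RunPythonScript"
params = {
    "pythonScript": 'print("Hello world!")',
    "inputLayers": '[{"url": "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter": "Month = \'September\'"}]',
    "f": "json",
    "token": "<your token>",
}
job = requests.get(base + "/submitJob", params=params).json()
print(job["jobId"], job["jobStatus"])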
Response
When you submit a request, the service assigns a unique job ID for the transaction.
Syntax:
{
"jobId": "<unique job identifier>",
"jobStatus": "<job status>"
}
After the initial request is submitted, you can use the jobId to periodically check the status of the job and its messages as described in Check job status. Once the job has successfully completed, use the jobId to retrieve the results. To track the status, you can make a request of the following form:
https://<analysis url>/RunPythonScript/jobs/<jobId>
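For example, a sketch of polling the job until it finishes (assuming the json response format and a token, as in the submission sketch above):

import time
import requests

# Sketch: poll a submitted job until it completes. The URL and token are
# deployment-specific assumptions; <jobId> comes from the submitJob response.
job_url = ("https://hostname.domain.com/webadaptor/rest/services/System/"
           "GeoAnalyticsTools/GPServer/RunPythonScript/jobs/<jobId>")
while True:
    info = requests.get(job_url, params={"f": "json", "token": "<your token>"}).json()
    if info["jobStatus"] in ("esriJobSucceeded", "esriJobFailed"):
        break
    time.sleep(5)  # wait between status checks
print(info["jobStatus"])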
Any Python console output will be returned as an informative job message. In the following example, "Hello World!" is printed to the console using the Python print function, and a job message containing the printed text is returned as shown:
{
"type": "esriJobMessageTypeInformative",
"description": "{\"messageCode\":\"BD_101138\",\"message\":\"[Python] Hello World!\",\"params\":{\"text\":\"Hello World!\"}}"
}
Access results
All results written to ArcGIS Enterprise are available in your portal contents.