HTML table to Pandas Data Frame to Portal Item

Read HTML table to Pandas Data Frame

Informative articles often present data in tabular form. When such data includes location information, presenting it on a map can be far more insightful. This sample shows how pandas can extract a table from a web page (in this case, a Wikipedia article) and how the result can then be brought into the GIS for further analysis and visualization.

Note: to run this sample, you need a few extra libraries in your conda environment. If you don't have them, install them by running the following commands from cmd.exe or your shell:

conda install lxml
conda install html5lib
conda install beautifulsoup4
conda install matplotlib

import pandas as pd

from arcgis.gis import GIS

Let us read the table from the Wikipedia article on the estimated number of guns per capita by country into a pandas DataFrame:

df = pd.read_html("https://en.wikipedia.org/wiki/Number_of_guns_per_capita_by_country")[0]
df.shape
(230, 10)
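
Since pd.read_html returns a list with one DataFrame per table it finds, the [0] above picks the first table on the page, which could change if the article is edited. As a minimal sketch, the match parameter can select tables by the text they contain instead. The HTML fragment below is made up for illustration and only borrows two rows from the table; it is wrapped in io.StringIO because recent pandas versions prefer a file-like object over a literal HTML string.

```python
import io

import pandas as pd

# Made-up page fragment with two tables; only the second holds the data we want.
html = """
<table>
  <tr><th>Code</th><th>Meaning</th></tr>
  <tr><td>1</td><td>Survey</td></tr>
</table>
<table>
  <tr><th>Location</th><th>Firearms per 100</th></tr>
  <tr><td>United States</td><td>120.5</td></tr>
  <tr><td>Yemen</td><td>52.8</td></tr>
</table>
"""

# Without match, every table on the page is returned.
all_tables = pd.read_html(io.StringIO(html))

# With match, only tables containing text that matches the pattern are returned.
matching = pd.read_html(io.StringIO(html), match="Firearms per 100")
sample_df = matching[0]

print(len(all_tables), len(matching))
print(sample_df)
```

Selecting by text is only one option; checking df.shape and df.head() after the read, as this sample does, remains a useful sanity check either way.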

Let us clean up the table by dropping the Notes column, which is not needed for mapping.

df.head()
   Location          Firearms per 100  Region    Subregion        Population 2017  Civilian firearms  Computation method  Registered firearms  Unregistered firearms  Notes
0  United States     120.5             Americas  North America    326474000        393347000          1                   1073743.0            392,273,257 Est.       [note 2]
1  Falkland Islands  62.1              Americas  South America    3000             2000               2                   1705.0               295                    NaN
2  Yemen             52.8              Asia      Western Asia     28120000         14859000           2                   NaN                  NaN                    NaN
3  New Caledonia     42.5              Oceania   Melanesia        270000           115000             2                   55000.0              60000                  NaN
4  Serbia            39.1              Europe    Southern Europe  6946000          2719000            2                   1186086.0            1532914                NaN

del df['Notes']
df.head()
   Location          Firearms per 100  Region    Subregion        Population 2017  Civilian firearms  Computation method  Registered firearms  Unregistered firearms
0  United States     120.5             Americas  North America    326474000        393347000          1                   1073743.0            392,273,257 Est.
1  Falkland Islands  62.1              Americas  South America    3000             2000               2                   1705.0               295
2  Yemen             52.8              Asia      Western Asia     28120000         14859000           2                   NaN                  NaN
3  New Caledonia     42.5              Oceania   Melanesia        270000           115000             2                   55000.0              60000
4  Serbia            39.1              Europe    Southern Europe  6946000          2719000            2                   1186086.0            1532914
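
The output above shows that the Unregistered firearms column mixes plain numbers with text such as "392,273,257 Est.". This sample does not use that column for mapping, but if it were needed for analysis, the values could be normalized first. The following is a minimal sketch in plain Python; parse_count is a hypothetical helper, not part of pandas or the ArcGIS API.

```python
import re


def parse_count(value):
    """Extract an integer from text such as '392,273,257 Est.'.

    Returns None when the value contains no digits (for example 'NaN').
    """
    match = re.search(r"\d[\d,]*", str(value))
    if match is None:
        return None
    # Drop thousands separators before converting.
    return int(match.group().replace(",", ""))


# Values as they appear in the first rows of the table.
samples = ["392,273,257 Est.", "295", "1532914", "NaN"]
parsed = [parse_count(s) for s in samples]
print(parsed)

# With the DataFrame from this sample, this could be applied as:
# df["Unregistered firearms"] = df["Unregistered firearms"].map(parse_count)
```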

Next, let's replace the spaces in the column names with underscores, because spaces in field names can cause inconsistencies with some GIS operations:

df.columns = df.columns.str.replace(" ", "_")
df.head()
   Location          Firearms_per_100  Region    Subregion        Population_2017  Civilian_firearms  Computation_method  Registered_firearms  Unregistered_firearms
0  United States     120.5             Americas  North America    326474000        393347000          1                   1073743.0            392,273,257 Est.
1  Falkland Islands  62.1              Americas  South America    3000             2000               2                   1705.0               295
2  Yemen             52.8              Asia      Western Asia     28120000         14859000           2                   NaN                  NaN
3  New Caledonia     42.5              Oceania   Melanesia        270000           115000             2                   55000.0              60000
4  Serbia            39.1              Europe    Southern Europe  6946000          2719000            2                   1186086.0            1532914

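
Replacing spaces is enough for this table, but other tables may have column labels with punctuation or leading digits. As a hedged sketch, a slightly more defensive cleanup could look like the following. The sanitize_column_name helper, the "f_" prefix, and the label "2017 total (est.)" are hypothetical and chosen only for illustration.

```python
import re


def sanitize_column_name(name):
    """Return a field-name-friendly version of a column label."""
    # Collapse every run of non-word characters into a single underscore.
    cleaned = re.sub(r"\W+", "_", str(name).strip())
    cleaned = cleaned.strip("_")
    # Prefix names that are empty or start with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "f_" + cleaned
    return cleaned


labels = ["Firearms per 100", "Population 2017", " Civilian firearms ", "2017 total (est.)"]
cleaned_labels = [sanitize_column_name(label) for label in labels]
print(cleaned_labels)

# With a DataFrame this could be applied as:
# df.columns = [sanitize_column_name(c) for c in df.columns]
```
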
gis = GIS(profile="your_online_admin_profile")
dpath = r"/Users/john3092/Job/"

df.to_csv(path_or_buf=dpath + "worldwide_gun_ownership_df.csv", index=False)
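
The path above is built by string concatenation, which only works while dpath ends with a separator and the directory already exists. A minimal alternative sketch using pathlib is shown below; build_csv_path and the file name "example.csv" are hypothetical, and the demo writes nothing outside a temporary directory.

```python
import tempfile
from pathlib import Path


def build_csv_path(directory, filename):
    """Return directory/filename as a Path, creating the directory if it is missing."""
    out_dir = Path(directory).expanduser()
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


# Demonstrate with a throwaway directory; in the sample, dpath would be passed instead.
with tempfile.TemporaryDirectory() as tmp:
    target = build_csv_path(Path(tmp) / "exports", "example.csv")
    target_name = target.name
    parent_existed = target.parent.is_dir()
    print(target_name, parent_existed)

# With the DataFrame from this sample, the call could look like:
# df.to_csv(build_csv_path(dpath, "example.csv"), index=False)
```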

Plot as a map

Let us connect to our GIS to geocode this data and present it as a map. You can connect either by specifying a username and password, e.g. gis = GIS("https://www.arcgis.com", "arcgis_python", "P@ssword123"), or via an existing profile:

from arcgis.gis import GIS
import json

#gis = GIS("home")
gis = GIS(profile="your_online_admin_profile")

The table uses the Location column for the country name, so we'll create a feature collection by passing the mapping below to the geocoder through the import_data() method:

fc = gis.content.import_data(df, {"CountryCode":"Location"})
gun_fset = fc.query()
map1 = gis.map(
    location = 'Brazil'
)
map1
map1.zoom = 1

We'll use smart mapping to render the points with varying sizes representing the number of firearms per 100 residents:

map1.content.add(gun_fset)
smart_map_mgr = map1.content.renderer(0).smart_mapping()
smart_map_mgr.class_breaks_renderer(
    break_type="size",
    field="Firearms_per_100",
    num_classes=4
)
map1.legend.enabled=True
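
The num_classes=4 argument above asks for four size classes. To make the idea of class breaks concrete, here is a minimal sketch of equal-interval classification in plain Python. This method is an assumption chosen for illustration: the sample does not state which classification method the renderer applies, and the helpers equal_interval_breaks and class_index are hypothetical, not part of the ArcGIS API.

```python
from bisect import bisect_left


def equal_interval_breaks(values, num_classes):
    """Return the upper bound of each class for equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_classes
    breaks = [lo + width * i for i in range(1, num_classes)]
    breaks.append(hi)  # pin the last break to the maximum to avoid float drift
    return breaks


def class_index(value, breaks):
    """Return the zero-based class whose upper bound is the first one >= value."""
    return min(bisect_left(breaks, value), len(breaks) - 1)


# First five Firearms_per_100 values from the table in this sample.
rates = [120.5, 62.1, 52.8, 42.5, 39.1]
breaks = equal_interval_breaks(rates, num_classes=4)
classes = [class_index(r, breaks) for r in rates]
print(breaks)
print(classes)
```

With these five values, the United States falls into the top class on its own while three of the remaining four share the lowest class, which illustrates how a single outlier can dominate an equal-interval scheme.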

Publish as Portal Item

Let us publish this layer as a feature collection item in our GIS. We'll use the FolderManager to get the logged-in user's root folder and add the feature collection to that folder:

fmgr = gis.content.folders
root = fmgr.get(owner=gis.users.me)
from arcgis.gis import ItemProperties, ItemTypeEnum
iprops = ItemProperties(title="Worldwide Firearms Ownership Folder stream",
                        item_type=ItemTypeEnum.FEATURE_COLLECTION,
                        tags=["guns", "violence"],
                        snippet = "GSR Worldwide firearms ownership",
                        description = "test description",
                        type_keywords = ["Data", "Feature Collection", "Singlelayer"],
                        extent = "-102.5272,-41.7886,172.5967,64.984")
rf_item = root.add(item_properties=iprops,
                   text = json.dumps({"layers": [dict(fc.properties)]}))
rf_item.result()
Worldwide Firearms Ownership Folder stream
GSR Worldwide firearms ownership
Feature Collection by ArcGISPyAPIBot
Last Modified: October 11, 2024
0 comments, 0 views
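
The extent passed to ItemProperties above is a hard-coded string that appears to follow the "xmin,ymin,xmax,ymax" pattern in longitude and latitude. As a sketch, such a string could be derived from point coordinates instead of being typed by hand. The extent_string helper and the three sample points are hypothetical; the points were picked only so that the result reproduces the extent used above.

```python
def extent_string(lon_lat_pairs, precision=4):
    """Build an 'xmin,ymin,xmax,ymax' string from (longitude, latitude) pairs."""
    lons = [lon for lon, _ in lon_lat_pairs]
    lats = [lat for _, lat in lon_lat_pairs]
    bounds = (min(lons), min(lats), max(lons), max(lats))
    return ",".join(str(round(b, precision)) for b in bounds)


# Made-up points whose bounding box matches the extent in this sample.
points = [(-102.5272, 23.6345), (172.5967, -41.7886), (10.0, 64.984)]
extent = extent_string(points)
print(extent)
```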
