Just another programming blog

Azure, HowTo

Visualize geospatial data with Azure Notebooks, Python, Pandas and Folium

Introduction

Today we are going to walk through an example of visualizing geospatial data as a heatmap, straight from a CSV file. Our main goal is to understand the purpose of each tool used in this simple but very valuable use case.

Our data

The data used in this tutorial is taken from the Town of Cary OpenData portal. We are going to take a closer look at the crime mapping dataset.

Why would we want that particular dataset? For example: if you were considering buying a house in that area, it would be great to know which parts of the town are the safest for your children! 🙂

Tools used

Azure Notebooks

The first tool we are going to use is Azure Notebooks. It is an Azure-powered, Software as a Service version of Jupyter Notebook. To put its purpose in simple words: it allows you to easily write, run (in the cloud) and share your Python code directly in the browser!

Python

Not much to explain here, but it is worth mentioning that in this article we are going to use Python 3.6.

Pandas

Pandas is something I have known about for a while, but recently I had a chance to put it to work for our client at HugeServices, and I have to admit, it is my discovery of the year. When it comes to data science, I have used numpy, for example, but pandas is on another level.

What exactly does it help us with? According to the official page: "pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language".

The most basic thing to know about pandas is what Series and DataFrames are. I strongly recommend learning that, for example here.
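
To make those two terms a bit more concrete, here is a minimal sketch. The coordinate values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (here: hypothetical latitudes).
lat = pd.Series([35.79, 35.80, 35.78], name="lat")

# A DataFrame is a table of named columns; each column is itself a Series.
points = pd.DataFrame({
    "lat": [35.79, 35.80, 35.78],
    "lon": [-78.78, -78.77, -78.80],
})

print(type(points["lat"]).__name__)  # Series
print(points.shape)                  # (3, 2)
print(lat.mean())                    # average of the three latitudes
```

Selecting a single column from a DataFrame gives you a Series back, which is why the two concepts are usually learned together.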

Folium

Folium is a visualization library for Python based on Leaflet.js maps.

Tutorial

Azure Notebooks configuration

  1. Go to the Azure Notebooks Portal
    Azure Notebooks Portal
  2. Log in with your Microsoft Account
  3. Open the My Projects tab
    Azure Notebooks My Projects Tab
  4. Click the New Project button
    Azure Notebooks New Project
  5. Fill in the creation dialog and hit Create
    Azure Notebooks New Project Dialog
  6. Click the New button and select Notebook from the dropdown list
    Azure Notebooks New Notebook
  7. Fill in the Notebook Name, select Python 3.6 and hit New
    Azure Notebooks New Notebook Dialog
  8. Open the newly created notebook
    Azure Notebooks Open Notebook
  9. A view like the one below should appear
    Azure Notebooks Notebook View
  10. Fill in the first cell with the code below
import numpy as np
import pandas as pd
import folium
from folium.plugins import HeatMap

# Read the data from the remote resource as DataFrame
df = pd.read_csv('https://data.townofcary.org/api/v2/catalog/datasets/crime-mapping/exports/csv',
                 sep=';',
                 usecols=['lat', 'lon'])
# Remove all rows with missing values
df = df.dropna(axis=0, how='any')
# Display first 5 rows of the DataFrame
df.head()
  11. Hit run and you should be able to see the top 5 rows of the DataFrame
    Azure Notebooks First Cell
  12. Fill in the newly created cell (visible under the first cell after running it) with the following code and hit run again
# Calculate an approximate center point for our map view
center = [np.mean(df.lat.values), np.mean(df.lon.values)]
# Set up our map (named crime_map to avoid shadowing the built-in map)
crime_map = folium.Map(location=center, zoom_start=12)
# Set up our heatmap layer
# Note: newer folium releases may warn that max_val is no longer needed
heat_map = HeatMap(list(zip(df.lat.values, df.lon.values)),
                   min_opacity=0.1,
                   max_val=5,
                   radius=5, blur=15,
                   max_zoom=1)
# Add the heatmap to the map; add_child returns the map itself, so it is displayed as the cell result
crime_map.add_child(heat_map)
  13. After this step, a map view should appear
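
If you would like to see what the loading and cleaning steps do without hitting the remote dataset, here is a small offline sketch. The sample rows and the `crime_type` column are hypothetical; only the `lat` and `lon` column names and the `;` separator come from the cells above.

```python
import io
import pandas as pd

# Hypothetical sample in the same shape as the export: ';'-separated,
# with an extra column and one row missing its coordinates.
sample_csv = io.StringIO(
    "crime_type;lat;lon\n"
    "LARCENY;35.79;-78.78\n"
    "BURGLARY;;\n"
    "VANDALISM;35.81;-78.76\n"
)

# Same options as the tutorial cell: custom separator, only two columns.
sample = pd.read_csv(sample_csv, sep=";", usecols=["lat", "lon"])

# Drop every row where at least one value is missing.
clean = sample.dropna(axis=0, how="any")

# Column means give an approximate center for the map view.
center = [clean["lat"].mean(), clean["lon"].mean()]

print(len(sample), len(clean))  # 3 2
print(center)
```

The row without coordinates is read as missing values and then dropped, which matters because the heatmap layer cannot place a point that has no location.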

Results

If everything went well, you should be able to see an interactive map with a heatmap on top of it.

Azure Notebooks Results

Summary

There is a lot of geospatial data out there waiting to be visualized. I hope you find this approach useful. I highly encourage you to share in the comments section what kind of data you have visualized and what you concluded from it.

P.S. This tutorial can also be followed using plain Jupyter Notebook instead of Azure Notebooks. To set up the environment on your own machine, I highly recommend Anaconda. Still, our choice here is Azure Notebooks, since it is easier and faster to start with (i.e. the pandas and folium dependencies are already preinstalled for us).
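
If you do go the local route, a quick way to check whether the packages used in this tutorial are available is a sketch like the one below. It only looks the packages up and does not install anything. The list of names is simply what the cells above import.

```python
import importlib.util

# Packages imported by the tutorial cells.
required = ["numpy", "pandas", "folium"]

# find_spec returns None when a top-level package cannot be found.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All tutorial dependencies are available.")
```

Anything reported as missing can then be installed with your package manager of choice before opening the notebook.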
