
Creating spatial data analytics dashboards in Cartoframes

With Carto’s strength in spatial science and location intelligence, and the easy access to data science packages in Python, Carto’s new project ‘Cartoframes’ has a lot of potential to provide excellent mapping dashboards for data-hungry workflows.

Below is a quick tutorial I have made which will hopefully help new users figure out how to use it. It is in no way comprehensive, and there are probably some pieces missing, but it should be enough to get started! The tutorial covers some of the elements of creating a ‘live’ weather data dashboard for New South Wales in Australia.

What is Cartoframes? (from https://github.com/CartoDB/cartoframes)
A Python package for integrating CARTO maps, analysis, and data services into data science workflows.

Python data analysis workflows often rely on the de facto standards pandas and Jupyter notebooks. Integrating CARTO into this workflow saves data scientists time and energy by not having to export datasets as files or retain multiple copies of the data. Instead, CARTOframes give the ability to communicate reproducible analysis while providing the ability to gain from CARTO’s services like hosted, dynamic or static maps and Data Observatory augmentation.


Write pandas DataFrames to CARTO tables
Read CARTO tables and queries into pandas DataFrames
Create customizable, interactive CARTO maps in a Jupyter notebook
Interact with CARTO’s Data Observatory
Use CARTO’s spatially-enabled database for analysis

Step 1 – Install libraries
Install all of the relevant libraries. I’m using Canopy, which provides Python 2.7 and 3.5, with easy installation and updates via a graphical package manager of over 450 pre-built and tested scientific and analytic Python packages from the Enthought Python Distribution. These include NumPy, Pandas, SciPy, matplotlib, scikit-learn, and Jupyter / IPython. You can get Canopy for free here.

Once installed, open the console and install the packages:
pip install cartoframes
pip install pandas

Step 2 – Import libraries

In a new Jupyter notebook, start by importing the libraries in the first block. These are the ones you’ll generally need (though you can go to town with other numerical / statistical packages here!):

import cartoframes
import pandas as pd
import numpy as np

Step 3 – Set up a Carto account and register for an API key

Start by going to Carto.com and signing up through the prompts.

Once you have signed up, in the top-right of your home page there should be a settings toggle which shows you:

View your public profile
Your account
Your API keys
Close session

Click on ‘Your API keys’ and copy what shows up on the next page. It should be a long string of text, looking something like this:


Step 4 – Connecting to your Carto account in Python
Try the following line of code in your next Jupyter code block, where xxxxxxxxxxxx is your new API key. This key allows Cartoframes to communicate directly with the data in your Carto account.

Where it says ‘oclock’ you should put your own username.

cc = cartoframes.CartoContext(base_url='https://oclock.carto.com',api_key='xxxxxxxxxxxx')

When you run this code and call ‘cc’ it should provide you with a message such as this:
<cartoframes.context.CartoContext at 0x1ea3fa2c518>

This means that cartoframes has successfully accessed your Carto account, and you can use ‘cc’ to refer to this connection from now on. Make sure you keep your API key safe!
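One way to keep the key out of the notebook itself is to read it from environment variables. This is a minimal sketch: the variable names CARTO_USERNAME and CARTO_API_KEY are my own choice rather than anything Carto requires, and the helper only builds the arguments without contacting Carto.

```python
import os


def carto_credentials(env=os.environ):
    """Build CartoContext keyword arguments from environment variables.

    CARTO_USERNAME and CARTO_API_KEY are hypothetical variable names.
    """
    username = env.get('CARTO_USERNAME')
    api_key = env.get('CARTO_API_KEY')
    if not username or not api_key:
        raise RuntimeError('Set CARTO_USERNAME and CARTO_API_KEY first')
    return {
        'base_url': 'https://{}.carto.com'.format(username),
        'api_key': api_key,
    }


# Usage in the notebook (needs cartoframes and a real account):
# cc = cartoframes.CartoContext(**carto_credentials())
```

With this approach the key never appears in a notebook that you might share or commit.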

Step 5 – Upload some data to Carto
For this task, I downloaded the shapefile components of weather locations from the Australian Bureau of Meteorology. These are all of the spatial files (.shp, .shx, .dbf etc.) for IDM00013 from:

That is, every file prefixed by IDM00013 and suffixed by .dbf, .prj, .sbn, .sbx, .shp, .shx or .shp.xml. Carto will need these all in a .zip file before you upload them.

The metadata for this dataset can be found here:

IDM00013 – point places (precis, fire, marine)

Once you have downloaded these you can upload the shapefile and it should give you a series of geolocated dots covering all of Australia, with many attributes as described in the metadata above. For this I called the dataset ‘idm00013’.

Step 6 – Read the data in Jupyter
Let’s test if everything is working. The following should display a dataframe of all of the aspatial information stored in each weather location:

carto_df = cc.read('idm00013')

The following should give you a list of all of the variables available to you to access and change:
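One way to do this, assuming carto_df is an ordinary pandas DataFrame, is its `columns` attribute. The small DataFrame below is a made-up stand-in for carto_df (the column names and values are illustrative) so the example runs without a Carto account.

```python
import pandas as pd

# Made-up stand-in for carto_df; the real table has many more columns
carto_df = pd.DataFrame({
    'aac': ['NSW_PT001', 'NSW_PT002'],
    'pt_name': ['Place A', 'Place B'],
    'elevation': [12.0, 340.5],
})

# List every column (variable) available in the DataFrame
column_names = list(carto_df.columns)
print(column_names)
```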

Step 7 – Making a map
Now for the exciting bit – creating a Carto map inside the Jupyter notebook.
Here I’ve picked the elevation column with a brown colour scheme. Try:

from cartoframes import Layer, BaseMap, styling
cc.map(layers=[BaseMap('light'),Layer('idm00013',color={'column': 'elevation','scheme': styling.brwnYl(7)},size=5)])

The following map should display, with light brown showing where the weather points are at a low elevation, and high points shown in a darker brown.

Extension – Accessing and parsing a live data feed

The code below retrieves the latest weather forecasts for the weekend ahead from the Bureau of Meteorology’s API. It is stored in a dataframe ‘df’.

Note that the indentation matters here, as it defines the nested loops:

import xml.etree.ElementTree as ET
import urllib.request
import pandas as pd

# Download the forecast XML from the Bureau of Meteorology
req = urllib.request.Request('ftp://ftp.bom.gov.au/anon/gen/fwo/IDN11060.xml')
with urllib.request.urlopen(req) as response:
    xml_data = response.read()

list_dict = []
root = ET.XML(xml_data)
for forecast_block in root.findall('forecast'):
    for area in forecast_block:
        for forecast in area:
            # Start each forecast period with empty temperatures
            min_temp = ''
            max_temp = ''
            aac_id = area.get('aac')
            forecast_date = forecast.get('start-time-local')
            for item in forecast:
                # .get() avoids a KeyError if an item has no 'type' attribute
                if item.get('type') == 'air_temperature_minimum':
                    min_temp = item.text
                elif item.get('type') == 'air_temperature_maximum':
                    max_temp = item.text
            # One row per area and forecast period
            list_dict.append({'aac':aac_id, 'forecast_date':forecast_date, 'low_temp': min_temp, 'max_temp':max_temp})
df = pd.DataFrame(list_dict)
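To check the parsing logic without hitting the FTP server, the same loops can be run on a small inline document. The XML below is a hand-written sample that only mimics the nesting the code above relies on (forecast, area, forecast period, typed elements). The tag names, aac codes and values are illustrative assumptions, not real Bureau data.

```python
import xml.etree.ElementTree as ET
import pandas as pd

# Hand-written sample; structure assumed from what the parsing code reads
SAMPLE_XML = """
<product>
  <forecast>
    <area aac="NSW_PT001" description="Place A">
      <forecast-period start-time-local="2018-01-16T00:00:00+11:00">
        <element type="air_temperature_minimum">18</element>
        <element type="air_temperature_maximum">29</element>
      </forecast-period>
    </area>
    <area aac="NSW_PT002" description="Place B">
      <forecast-period start-time-local="2018-01-16T00:00:00+11:00">
        <element type="air_temperature_maximum">33</element>
        <text type="precis">Sunny.</text>
      </forecast-period>
    </area>
  </forecast>
</product>
"""


def parse_forecasts(xml_data):
    """Return one row per area and forecast period."""
    rows = []
    root = ET.XML(xml_data.strip())
    for forecast_block in root.findall('forecast'):
        for area in forecast_block:
            for period in area:
                min_temp = ''
                max_temp = ''
                for item in period:
                    if item.get('type') == 'air_temperature_minimum':
                        min_temp = item.text
                    elif item.get('type') == 'air_temperature_maximum':
                        max_temp = item.text
                rows.append({
                    'aac': area.get('aac'),
                    'forecast_date': period.get('start-time-local'),
                    'low_temp': min_temp,
                    'max_temp': max_temp,
                })
    return pd.DataFrame(rows)


sample_df = parse_forecasts(SAMPLE_XML)
print(sample_df)
```

The second area has no minimum temperature in the sample, so its low_temp stays as an empty string, which is the same behaviour as the original loop.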

Extension Part 1 – Joining in a live data source

We now want to join the geographical data from the first exercise with this live data feed.
This is done with a ‘left’ join, so we keep all of the weather forecast records and add the geographic data to them.

merged_data = pd.merge(df, carto_df, on='aac', how='left')
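With a left join, forecast rows that have no matching weather location are kept but end up with empty geographic columns. A quick way to spot them is pandas’ `indicator` option, sketched here on two made-up stand-in tables (the aac codes and values are illustrative).

```python
import pandas as pd

# Made-up stand-ins for the forecast feed and the Carto table
forecasts = pd.DataFrame({
    'aac': ['NSW_PT001', 'NSW_PT002', 'NSW_PT999'],
    'max_temp': ['29', '33', '25'],
})
locations = pd.DataFrame({
    'aac': ['NSW_PT001', 'NSW_PT002'],
    'elevation': [12.0, 340.5],
})

# indicator=True adds a '_merge' column saying where each row came from
checked = pd.merge(forecasts, locations, on='aac', how='left', indicator=True)
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched['aac'].tolist())
```

Any codes listed here have a forecast but no point to draw on the map.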

Extension Part 2 – Selecting some data

Now we filter the records to get one particular day’s forecast (you will need to change the date here to the current date).
The filtered data is then written to a new dataset in Carto called ‘merged_weathermap’.

one_forecast = merged_data[merged_data['forecast_date']=='2018-01-16T00:00:00+11:00']
cc.write(one_forecast, 'merged_weathermap',overwrite=True)
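To avoid editing the date by hand each time, the filter can be driven by today’s date instead. This is a sketch under a few assumptions: the forecast_date strings begin with a YYYY-MM-DD date as in the example above, your machine’s local date is the one you want, and the helper name forecasts_for_day is my own. It also converts max_temp from text to numbers, since the feed values are parsed as strings.

```python
import datetime
import pandas as pd


def forecasts_for_day(data, day=None):
    """Return rows whose forecast_date falls on `day` (defaults to today)."""
    if day is None:
        day = datetime.date.today()
    prefix = day.isoformat()  # e.g. '2018-01-16'
    on_day = data['forecast_date'].str.startswith(prefix, na=False)
    result = data[on_day].copy()
    # Temperatures arrive as text; blanks become NaN after conversion
    result['max_temp'] = pd.to_numeric(result['max_temp'], errors='coerce')
    return result


# Made-up stand-in for merged_data
merged_sample = pd.DataFrame({
    'aac': ['NSW_PT001', 'NSW_PT001', 'NSW_PT002'],
    'forecast_date': ['2018-01-16T00:00:00+11:00',
                      '2018-01-17T00:00:00+11:00',
                      '2018-01-16T00:00:00+11:00'],
    'max_temp': ['29', '31', ''],
})
one_day = forecasts_for_day(merged_sample, datetime.date(2018, 1, 16))
print(one_day)
```

In the notebook you would call forecasts_for_day(merged_data) with no date and pass the result to cc.write as before.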

Extension Part 3 – Putting it all together

Now let’s add the data from the weather feed to a Cartoframes map. The following reads in the merged_weathermap dataset we just created and colours in the maximum temperature of the forecast for each weather point in New South Wales, with pink being a high temperature and blue being a lower temperature.

from cartoframes import Layer, BaseMap, styling
cc.map(layers=[BaseMap('light'),Layer('merged_weathermap',color={'column': 'max_temp','scheme': styling.tropic(10)},size=10)])

That’s it! From here, it is feasible to see that, with a bit of extra work and some scripts that continuously ping the APIs, we are only a few steps away from creating live dashboards which integrate other statistical and mathematical packages, even including machine learning.
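As a rough idea of what such a script could look like, here is a minimal polling loop. It is only a sketch: the function name poll and its parameters are my own, and the fetch and publish steps are passed in as functions so the loop itself does not depend on Carto or the weather feed.

```python
import time


def poll(fetch, publish, interval_seconds=3600, max_runs=None, sleep=time.sleep):
    """Repeatedly fetch fresh data and publish it.

    fetch and publish are caller-supplied functions (hypothetical here).
    max_runs=None means run until interrupted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            publish(fetch())
        except Exception as error:
            # Keep the loop alive if one update fails
            print('Update failed, will retry:', error)
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


# Example wiring in the notebook (download_forecasts is a hypothetical
# function wrapping the feed parsing and merge steps):
# poll(download_forecasts,
#      lambda data: cc.write(data, 'merged_weathermap', overwrite=True))
```

Passing the sleep function in as a parameter makes the loop easy to try out without actually waiting an hour between runs.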

Looking forward to seeing developments in this space, and if you have any feedback or ideas, let me know!

For more information on Cartoframes have a look at their documentation.


Measuring accessibility – on the 30 minute city


One of the recent projects I’ve been involved in at Arup has been developing spatial, analytical tools to understand transport accessibility. In particular, this is to do with destination-based accessibility: rather than assessing how well a city delivers transport at particular points (which could go anywhere), we looked at how well it delivers access to all other places in the city. In particular, we were looking at places that are important to creating liveable environments, such as education, parks, healthcare and our jobs.

For me, this topic built on research I had done in 2015 (see ‘Where to From Here? A Modelling Methodology for Measuring Land-Use and Public Transport Accessibility in Melbourne’), which assessed destination-based accessibility within transport modelling software, restricted to travel zones. This time there were some major improvements to the method, mostly from moving from a software shell to raw code, and from using much more disaggregate units of analysis.
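To make the idea of destination-based accessibility concrete, here is a toy sketch of one common way to score it: for each origin, add up the opportunities (jobs, say) at every destination reachable within 30 minutes. This is not the toolkit used in the project, and the zones, travel times and job counts are made up for illustration.

```python
def reachable_opportunities(travel_minutes, opportunities, threshold=30):
    """Sum the opportunities reachable from each origin within `threshold`.

    travel_minutes: {origin: {destination: minutes}}
    opportunities:  {destination: count}
    """
    scores = {}
    for origin, times in travel_minutes.items():
        scores[origin] = sum(
            opportunities.get(destination, 0)
            for destination, minutes in times.items()
            if minutes <= threshold
        )
    return scores


# Made-up travel times (minutes) between three zones, and jobs per zone
travel = {
    'A': {'A': 0, 'B': 20, 'C': 45},
    'B': {'A': 20, 'B': 0, 'C': 25},
    'C': {'A': 45, 'B': 25, 'C': 0},
}
jobs = {'A': 100, 'B': 50, 'C': 400}

scores = reachable_opportunities(travel, jobs)
print(scores)
```

Zone B scores highest here because it sits within 30 minutes of every zone, even though it has the fewest jobs itself.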

We assessed Greater Sydney at a 300m x 300m grid level, producing over a million travel time isochrones for driving (including traffic), public transport and walking to assign accessibility values to liveability variables in approximately 120,000 small cells in the city. In a nutshell, our toolkit involved a bit of OpenTripPlanner, Python, Amazon Web Services and FME, all using Open Data sources. This means the method is highly reproducible for other cities, and applicable to the same city with a different network (which could be used to evaluate transport network changes or alternate land use scenarios). A web map has been produced to showcase some of the work done in this space so far, exploring what the ’30 minute city’ means for Sydney:



It is certainly exciting to see the potential of this thinking and method being applied to both Sydney and other cities. Accessibility and its impact on individual opportunities is often overlooked and undervalued in many forms of transport analysis. With the increasing richness of the data that is becoming available from the Government and other forms of Open Data, combined with open analytical and visual methods like these, it is encouraging and clear that these analyses can potentially produce insight towards tackling some of our growing issues in Australian cities, such as housing affordability, transport disadvantage and sustainability.


Processing Simulations Part II – Cellular Automata


Continuing my revision, here are some examples of cellular automata used for simulation.

A cellular automaton consists of a grid of cells, a neighbourhood around each cell, a set of states that a cell can take on, and a set of rules for how what happens in a cell’s neighbourhood affects the cell.

One of the most well-known of these models is Conway’s Game of Life.

It has the following basic rules:
Any live cell with fewer than two live neighbours dies, as if caused by under-population.
Any live cell with two or three live neighbours lives on to the next generation.
Any live cell with more than three live neighbours dies, as if by overcrowding.
Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
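The four rules above can be sketched in a few lines of Python. This version assumes an unbounded grid and stores only the live cells as (row, col) pairs, with each cell’s neighbourhood being the eight cells around it. The function name life_step is my own.

```python
def life_step(live_cells):
    """Advance a set of (row, col) live cells by one generation."""
    # Count how many live neighbours each nearby cell has
    neighbour_counts = {}
    for row, col in live_cells:
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                if d_row == 0 and d_col == 0:
                    continue
                key = (row + d_row, col + d_col)
                neighbour_counts[key] = neighbour_counts.get(key, 0) + 1

    next_generation = set()
    for cell, count in neighbour_counts.items():
        # Exactly three neighbours: birth, or survival of a live cell.
        # Exactly two neighbours: survival only if already alive.
        if count == 3 or (count == 2 and cell in live_cells):
            next_generation.add(cell)
    return next_generation


# A 'blinker': three cells in a row flip between horizontal and vertical
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = life_step(blinker)
after_two = life_step(after_one)
print(sorted(after_one))
```

Cells with fewer than two or more than three live neighbours are simply never added to the next generation, which covers the under-population and overcrowding rules.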

The ways in which this simulation evolves complex patterns provide a good example of emergence and self-organisation.

Below is my own simple cellular automaton example of three cities, all starting with different amounts of resources. These cities then grow according to a number of rules based on proximity to neighbours. One can observe that the city on the left grows much faster, the middle one sometimes never grows at all, and the right one grows slowly, with the left eventually enveloping them all.