Graphing data from Airtable

This guide will walk you through how to connect to an Airtable base from a Jupyter notebook, pull in your data and plot it, all with python.

Introduction

Airtable is an awesome tool for centralising data and running multiple different segments of your business. However, sometimes we can have so much tabular data that it is hard to really grasp week-to-week developments.

It is possible (on the pro plan) to build charts using Airtable's code blocks. Another option is to export to excel. But to really supercharge your analytics, this work should be as automated as possible. So this guide will show how you can fetch Airtable data from a Jupyter Notebook, manipulate and visualise this data, all in Python.

This post is for you if:

  • You are already using Airtable for some specific knowledge management - your sales CRM, for example - but are frustrated with the built-in data visualisation features.

  • If you think your team could benefit from a more automated process for tracking metrics that impact decision-making.

Getting started

Airtable Python uses Requests, which you can install by running:

conda install -c anaconda requests -y

Note that you will need to generate an API key for your account if you haven't already. You can do this in your Airtable account settings.

It is bad practice to have your API keys hard-coded in your script. Instead, we can export a global variable in our ~/.bash_profile. Copy your API key to the clipboard. Open your bash profile in your preferred editor and enter the following:

export 'AIRTABLE_API_KEY' = "API_KEY"

replacing "API_KEY" with the string you just copied.

From the command line, run:

source ~/.bash_profile

We are now ready to start using Airtable Python.

In Jupyterlab

Fetching our data

For the purpose of this guide we will work with one of Airtable's sample templates - the Sales Pipeline base. Let's fetch the data:

from airtable import airtable
at = airtable.Airtable('BASE_ID', 'AIRTABLE_API_KEY')
response1 = at.get('Sales Deals')
response2 = at.get('Sales Reps')

where 'AIRTABLE_API_KEY' is the global variable we just set in our bash profile and 'BASE_ID' is the ID for the specific base we are working with. Airtable's REST API interface can be found here: https://airtable.com/api, where you can access a list of all your Airtable bases. Each Airtable base will provide its own API to create, read, update, and destroy records.

We have pulled in 2 different responses the correspond to the two tables inside our Sales Pipeline base.

Converting the response into a Pandas DataFrame

import pandas as pd
sales = pd.DataFrame(response1, columns=response.keys())
reps = pd.DataFrame(response2, columns=response.keys())
sales = pd.concat([sales.drop(['records'], axis=1), sales['records'].apply(pd.Series)], axis=1)
sales = pd.concat([salesdf.drop(['fields'], axis=1), sales['fields'].apply(pd.Series)], axis=1).fillna('')
reps = pd.concat([reps.drop(['records'], axis=1), reps['records'].apply(pd.Series)], axis=1)
reps = pd.concat([reps.drop(['fields'], axis=1), reps['fields'].apply(pd.Series)], axis=1).fillna('')

Above we've assigned our responses to variables and read them into their respective pandas DataFrames. The response from Airtable comes in the form of an ordered dictionary, which we've had to parse above.

Visualising our data

Now it is time to start gaining insights from our data, but first up:

Define your business questions

What business questions do you want to answer? We've laid out some example questions for our sample sales pipeline data.

  1. What is the value of our closed deals? Who has secured the most revenue?

  2. What stages are other deals at in the pipeline? What is the weighted value of these deals?

  3. What is our actual monthly revenue? What are our forecasted targets?

  4. How are our sales reps doing? Are they hitting their quotas?

Install required packages

At Kyso, we work a lot with Plotly. For use in JupyterLab, we need to install the jupyterlab and ipywidgets packages:

conda install -c conda-forge jupyterlab-plotly-extension -y

Plotly has a wrapper for pandas (data manipulation library) called Cufflinks, which is currently having compatibility issues with plotly's latest version. A workaround for this at the moment is to downgrade plotly and install cufflinks with the following commands. Once you see how easy it is to generate graphs with Cufflinks, you'll understand why we go through this hassle. The Cufflinks guys have been working on the fix for sometime, but it's yet to emerge.

As of September 2019 this will get you up and running:

conda install -c conda-forge plotly=2.7.0 -y
conda install -c conda-forge cufflinks-py -y

Now we're ready to answer our questions above:

1. Closed deals by contract size and sales representative

sales[sales['Sales Stage']=='Closed-Won'][['Signed Contract Value', 'Sales Rep']].set_index(
'Sales Rep').iplot(kind='bar', color = 'rgba(50, 171, 96, 1.0)', showgrid=False,
title='Closed Deals', yTitle='Signed Contract Value')

2. Deal valuations by stage & sales representative

pd.DataFrame(sales[sales['Sales Stage']!='Closed-Won'][['Name', 'Sales Stage', 'Weighted Value', 'Sales Rep']].set_index('Sales Stage').groupby(
['Sales Stage', 'Sales Rep'])['Weighted Value'].sum().unstack(level=0)).iplot(kind='bar', mode='group', barmode='stack',
showgrid=False, title='Deal Valuation by Stage',
yTitle='Weighted Value (Forecasted Value * Probability of Closing)')

3. Actual revenue & Forecasted deal valuations

sales['Last Contact'] = pd.to_datetime(sales['Last Contact'])
sales[['Last Contact', 'Forecast Value', 'Signed Contract Value']].sort_values(by='Last Contact').set_index('Last Contact').iplot(kind='scatter', mode='markers',
showgrid=True, title='Actual Revenue and Forecasted Deals- January 2017',
yTitle='Signed Conntract Value')

4. Success rates of our sales reps

fig = reps[['Name', 'Quota', 'Closed Amount', '% of Quota']].set_index('Name').iplot(kind='bar', mode='group', secondary_y=['% of Quota'],
showgrid=False, title = 'Sales Rep Quota Statuses', asFigure=True)
fig['layout']['yaxis1'].update(title='Quota & Closed Amounts')
fig['layout']['yaxis2'].update(ticksuffix="%")
fig['layout']['yaxis1'].update(ticksuffix="$")
fig['data'][0].marker.update(opacity=1)
fig['data'][1].marker.update(opacity=1)
fig['data'][2].marker.update(opacity=0.5)
fig.iplot()

Conclusion

And there you have it. Easy-to-build data dashboards are now right at your fingertips! Note that you can check out this guide on Github here, with an example notebook and the python code ready for a quick set up. To really automate this workflow, you could fork the repository & edit the notebook (i.e. insert your own Airtable API key, the name of your base and table names).

Follow the guide below on syncing your repository to your Kyso account:

Now, every time you push a new commit - like running the jupyter otebook weekly - these changes will be reflected on the Kyso post, where your team can learn and collaborate. Remember - data is only useful to you if you actually use the insights gained!