Introduction to Workbench

Chris Knox

chris@functionalvis.com

10 July, 2021

Workbench

Scrape, clean, combine and analyze data without code

Unfortunately, data is almost never in the form we want it

  • Learn how to rearrange and combine data sets to get
    stories others can’t

Workbench is one tool, but there are lots of others. Learn the one you are most comfortable with.

Why? An example

Screenshot of a line graph with two variables in a single plot

lol

What’s going on?

  • There are two ways of putting data in tables
Year  Variable  Value
2016  tui       10
2016  kereru    15
2016  kākā      8
2017  tui       12
2017  kereru    11
2017  kākā      13

When what you need is:

Year  tui  kereru  kākā
2016  10   15      8
2017  12   11      13
  • The operation you need is called a pivot. Let’s see how to do it using Workbench.
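For anyone curious what a pivot does under the hood, here is a minimal sketch in plain Python using the bird counts from the tables above. It is only an illustration of the idea, not how Workbench implements its Reshape step, and the function name pivot_long_to_wide is made up for this example.

```python
# Long-format rows copied from the example table: one row per (year, variable).
long_rows = [
    {"Year": 2016, "Variable": "tui", "Value": 10},
    {"Year": 2016, "Variable": "kereru", "Value": 15},
    {"Year": 2016, "Variable": "kākā", "Value": 8},
    {"Year": 2017, "Variable": "tui", "Value": 12},
    {"Year": 2017, "Variable": "kereru", "Value": 11},
    {"Year": 2017, "Variable": "kākā", "Value": 13},
]


def pivot_long_to_wide(rows, index, column, value):
    """Build one wide row per distinct `index` value (hypothetical helper)."""
    wide = {}
    for row in rows:
        # The first time a year is seen, start a new wide row for it.
        wide_row = wide.setdefault(row[index], {index: row[index]})
        # Each variable name becomes a column holding that row's value.
        wide_row[row[column]] = row[value]
    return list(wide.values())


wide_rows = pivot_long_to_wide(long_rows, "Year", "Variable", "Value")
for row in wide_rows:
    print(row)
```

Each year ends up as a single row with one column per bird, which is the shape of the second table.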

Using Workbench

The first tab in this workflow takes a set of long data from Figure.NZ and converts it into wide data for use in Datawrapper.

The steps needed are:

  • Get the data URL from Figure.NZ (/data.csv)
  • Use a Load from URL step to get the data into Workbench
  • Use a Select columns step to focus your attention (not strictly necessary)
  • Use a Reshape step to do a Long to wide conversion
  • Make the workflow public so that Datawrapper can access the data
  • Use the export button to get a CSV URL for Datawrapper
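If you ever want to do the first two steps in code instead, a rough Python equivalent might look like the sketch below. The function names are invented for this example, the file is assumed to be UTF-8 encoded, and the Notes column is a made-up extra column so there is something to drop. The sample text stands in for a real download so the sketch runs offline.

```python
import csv
import io
import urllib.request


def load_from_url(url):
    """Download a CSV file and return its text (like the Load from URL step)."""
    with urllib.request.urlopen(url) as response:
        # Assumption: the file is UTF-8 encoded.
        return response.read().decode("utf-8")


def parse_csv(text):
    """Turn CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def select_columns(rows, columns):
    """Keep only the named columns (like the Select columns step)."""
    return [{name: row[name] for name in columns} for row in rows]


# Inline stand-in for a downloaded file, with a hypothetical Notes column.
# With a real address you would call: text = load_from_url(your_csv_url)
text = "Year,Variable,Value,Notes\n2016,tui,10,\n2016,kereru,15,\n"

rows = parse_csv(text)
selected = select_columns(rows, ["Year", "Variable", "Value"])
print(selected)
```

Note that the csv module reads every value as text, so numbers would need converting before any arithmetic.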

A few points

  • You can go back and forwards in the workflow
    • Highlighting a step will show you the state of the data after that step
    • You can export the data from each step too
    • The effect of adding, or changing, steps will flow through the workflow
  • We are making workflows public to make it easy to get the data to Datawrapper
    • The workflows really are public
    • If you don’t want someone to see it, keep it private, download the data and upload it to Datawrapper yourself
    • Or pay for Workbench

Workbench can do more

Twitter analysis

Let’s see how often Judith Collins uses the Prime Minister’s name on Twitter

Caution

Data journalism always runs the risk of only reporting on things where there is data

  • There is a lot of data on Twitter about Judith Collins’ tweets because she used Twitter
  • Jacinda Ardern does not use Twitter so there isn’t much to say

Police Data

Is there a difference in the use of non-court based proceedings between police districts?

The police release a lot of data here

  • Use the download tab to make a selection
  • The download button is on the bottom right
  • You want Full data and Show all columns
  • It isn’t actually CSV data, it’s TSV data
    • Just rename the file
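Renaming the file is all you need for Workbench. If you are reading the download in code instead, the equivalent is to tell the reader that the separator is a tab. A minimal Python sketch follows. The column names and values in the sample are placeholders, not real police data, and read_tab_separated is a made-up helper name.

```python
import csv
import io


def read_tab_separated(handle):
    """Read tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Placeholder contents standing in for a downloaded file (not real data).
sample = "District\tYear\tProceedings\nDistrict A\t2020\t5\nDistrict B\t2020\t7\n"

rows = read_tab_separated(io.StringIO(sample))
for row in rows:
    print(row["District"], row["Proceedings"])
```

For a file on disk you would pass an open file handle instead of the io.StringIO object.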