Workbench
Scrape, clean, combine and analyze data without code
Unfortunately, data is almost never in the form we want it
- Learn how to rearrange and combine data sets to get stories others can’t
Workbench is one tool, but there are lots of others; learn the one you are most comfortable with
Why? An example
What’s going on?
- There are two ways of putting data in tables, long and wide. Often what you have is long:
Year | Variable | Value |
---|---|---|
2016 | tui | 10 |
2016 | kereru | 15 |
2016 | kākā | 8 |
2017 | tui | 12 |
2017 | kereru | 11 |
2017 | kākā | 13 |
When what you need is:
Year | tui | kereru | kākā |
---|---|---|---|
2016 | 10 | 15 | 8 |
2017 | 12 | 11 | 13 |
- The operation you need is called a pivot. Let’s see how to do it using Workbench
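For readers who want to see what a pivot does under the hood, here is a minimal sketch in plain Python using the bird counts from the long table above. The function name `pivot_long_to_wide` is made up for this illustration; Workbench does the same job without any code.

```python
from collections import defaultdict

# Long-format rows copied from the table above: (year, variable, value).
long_rows = [
    (2016, "tui", 10),
    (2016, "kereru", 15),
    (2016, "kākā", 8),
    (2017, "tui", 12),
    (2017, "kereru", 11),
    (2017, "kākā", 13),
]


def pivot_long_to_wide(rows):
    """Turn (year, variable, value) rows into one dict per year."""
    wide = defaultdict(dict)
    for year, variable, value in rows:
        # Each distinct variable becomes its own column for that year.
        wide[year][variable] = value
    # One output row per year, ordered by year.
    return [{"Year": year, **values} for year, values in sorted(wide.items())]


wide_rows = pivot_long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

Each year ends up as a single row with one column per bird, which is the shape of the second table.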
Using Workbench
The first tab in this workflow takes a set of long data from Figure.NZ and converts it into wide data for use in Datawrapper.
The steps needed are:
- Get the data URL from Figure.NZ (/data.csv)
- Use a Load from URL step to get the data into Workbench
- Use a Select columns step to focus your attention (not strictly necessary)
- Use a Reshape step to do a Long to wide conversion
- Make the workflow public so that Datawrapper can access the data
- Use the export button to get a CSV URL for Datawrapper
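The same load, select, reshape and export sequence can be sketched in plain Python. This is only an illustration of what each step does, not how Workbench implements it. The sample text stands in for a downloaded CSV file, and its column names (`Year`, `Variable`, `Value`, `Notes`) and all function names are assumptions made for this example.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded data.csv file.
# The column names are assumptions, not Figure.NZ's actual headers.
SAMPLE_CSV = """Year,Variable,Value,Notes
2016,tui,10,a
2016,kereru,15,b
2017,tui,12,c
2017,kereru,11,d
"""


def load_csv(text):
    """'Load' step: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def select_columns(rows, columns):
    """'Select columns' step: keep only the named columns."""
    return [{name: row[name] for name in columns} for row in rows]


def reshape_long_to_wide(rows, key, variable, value):
    """'Reshape' step: one output row per key, one column per variable."""
    wide = {}
    for row in rows:
        wide.setdefault(row[key], {key: row[key]})[row[variable]] = row[value]
    return list(wide.values())


def to_csv(rows):
    """'Export' step: write the rows back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = load_csv(SAMPLE_CSV)
rows = select_columns(rows, ["Year", "Variable", "Value"])
wide = reshape_long_to_wide(rows, "Year", "Variable", "Value")
print(to_csv(wide))
```

Keeping each step as its own function mirrors the way a workflow lets you inspect the data after every step.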
A few points
- You can go back and forwards in the workflow
- Highlighting a step will show you the state of the data after that step
- You can export the data from each step too
- The effect of adding, or changing, steps will flow through the workflow
- We are making workflows public to make it easy to get the data to Datawrapper
- The workflows really are public
- If you don’t want someone to see it, keep it private, download the data and upload it yourself
- Or pay for Workbench
Workbench can do more
Let’s see how often Judith Collins uses the Prime Minister’s name on Twitter
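The counting itself is simple once the tweet text is in a table. Below is a rough sketch in plain Python, assuming the search term is the surname "Ardern". The texts are made-up placeholders, not real tweets, and `count_mentions` is a name invented for this example.

```python
import re

# Made-up placeholder texts, NOT real tweets.
tweets = [
    "Placeholder text that mentions Jacinda Ardern once.",
    "Placeholder text with no name in it.",
    "ARDERN in capitals, and Ardern again.",
]


def count_mentions(texts, name):
    """Return (number of texts containing the name, total mentions)."""
    # Whole-word, case-insensitive match on the name.
    pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
    per_text = [len(pattern.findall(text)) for text in texts]
    return sum(1 for n in per_text if n > 0), sum(per_text)


tweets_with_name, total_mentions = count_mentions(tweets, "Ardern")
print(tweets_with_name, total_mentions)
```

Reporting both numbers matters because one tweet can use the name several times.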
Caution
Data journalism always runs the risk of reporting on things where there is data
- There is a lot of data on Twitter about Judith Collins’ tweets because she uses Twitter
- Jacinda Ardern does not use Twitter so there isn’t much to say
Police Data
Is there a difference in the use of non-court based proceedings between police districts?
The police release a lot of data here
- Use the download tab to make a selection
- The download button is on the bottom right
- You want Full data and Show all columns
- It isn’t actually CSV data; it’s TSV (tab-separated) data
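If you open a file like this with code rather than Workbench, the practical difference is the delimiter. Here is a minimal sketch in plain Python, assuming a tab-separated file with a header row. The sample rows and column names are placeholders, not the real police data.

```python
import csv
import io

# Hypothetical tab-separated sample; column names and values are placeholders.
SAMPLE_TSV = (
    "District\tMethod\tProceedings\n"
    "District A\tCourt action\t120\n"
    "District A\tNon-court action\t80\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


tsv_rows = read_tsv(SAMPLE_TSV)
for row in tsv_rows:
    print(row["District"], row["Method"], row["Proceedings"])
```

Reading the same text with the default comma delimiter would put each whole line into a single column, which is the usual sign that a ".csv" file is really tab-separated.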