Paul Bradshaw from Birmingham City University says:
Data can be the source of data journalism, or it can be the tool with which the story is told — or it can be both.
The Bureau of Investigative Journalism says:
Data journalism is simply journalism.
The former is a new and trendy term, but ultimately it is just a way of describing journalism in the modern world.
Both in the sense of spending some of your career as a data journalist, and in the sense of the work you produce. Ask yourself:
Are there stories in this data that are of public interest?
Journalists are overworked and deadline-driven.
The easier your data is to understand and consume, the more likely it is to be picked up by a journalist.
Narrative/Cognitive tension?
Not sure exactly what to call it - but I think it is important.
You should be able to run a single command that updates your data, runs your analysis, creates your assets and then publishes your article/report
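As a minimal sketch of what that single command might look like, here is a pipeline written with Shake, a Haskell build tool. Every specific - the file names, the source URL, the Rscript/pandoc/rsync steps - is a placeholder for whatever your own pipeline actually uses:

```haskell
-- Build.hs: one command updates data, runs analysis, builds assets, publishes.
-- All file names, the URL, and the external commands are hypothetical.
import Development.Shake

main :: IO ()
main = shakeArgs shakeOptions $ do
  -- the default invocation goes all the way to a published article
  want ["publish"]

  -- update the data: re-fetch the source file on every run
  "data.csv" %> \out -> do
    alwaysRerun
    cmd_ "curl" ["-sL", "-o", out, "https://example.com/source.csv"]

  -- run the analysis, producing the assets the article needs
  "chart.svg" %> \out -> do
    need ["data.csv", "analysis.R"]
    cmd_ "Rscript" ["analysis.R", "data.csv", out]

  -- render the article from markdown plus the generated assets
  "article.html" %> \out -> do
    need ["chart.svg", "article.md"]
    cmd_ "pandoc" ["article.md", "-o", out]

  -- publish the rendered article (placeholder command)
  phony "publish" $ do
    need ["article.html"]
    cmd_ "rsync" ["article.html", "chart.svg", "user@host:/var/www/"]
```

A nice property of a build tool over a shell script: rerunning it rebuilds only the steps whose inputs have changed, so the update-analyse-publish loop stays cheap.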
Things change all the time - and more interesting things change more often. Don’t become King Canute and try to stop the tide coming in.
If you are not in control of data collection, but your workflow tries to control how the data is collected and collated, your workflow will break.
Automated workflows can lock you into a single technology, restricting your ability to use the best tool for each job.
Use the compiler, Luke
Flexible languages (dynamic and often weakly typed) are the mainstay of analysis - especially exploratory analysis.
Haskell is a wonderful language with a steep learning curve - find a mentor if you want to learn it
You probably don’t want to do this as part of your actual analysis workflow - it is possible, but I have not found it to be very efficient.
The point is that your Haskell pipeline will break - on your computer - if your assumptions are no longer true
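As a minimal sketch of that idea, assuming a hypothetical results.csv with constituency and votes columns: a record type plus cassava's named-record decoding encodes your assumptions about the feed, and the moment a column is renamed or changes type, the decode fails loudly:

```haskell
{-# LANGUAGE DeriveGeneric #-}
-- Check.hs: encode assumptions about the data feed in a record type.
-- The file name and columns ("constituency", "votes") are made up.
import qualified Data.ByteString.Lazy as BL
import Data.Csv (FromNamedRecord, decodeByName)
import qualified Data.Vector as V
import GHC.Generics (Generic)

data Row = Row
  { constituency :: String  -- assumption: this column exists and is text
  , votes        :: Int     -- assumption: this column exists and is an integer
  } deriving (Generic, Show)

instance FromNamedRecord Row  -- derived generically from the field names

main :: IO ()
main = do
  bytes <- BL.readFile "results.csv"
  case decodeByName bytes of
    -- a missing column or a non-numeric vote count lands here,
    -- breaking the pipeline on your machine instead of in print
    Left err -> fail ("Data no longer matches assumptions: " ++ err)
    Right (_, rows) ->
      putStrLn ("Rows OK: " ++ show (V.length (rows :: V.Vector Row)))
```

The record type is the contract: when the data changes, you update the type, and the compiler then walks you through every downstream use that needs to change with it.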
This approach can be implemented in other languages too.