Why Data Journalism is Important

After studying Data Journalism for a year at City University I have come to appreciate the importance of having the skillset to make the most out of numbers and statistics. Many aspiring journalists still see data as something that is separate from journalism, and as something that does not interest them. In response, I have compiled some reasons why data is increasingly important:

1. Make sense of Mass Information

Having the skills to scrape, analyse, clean and present data allows journalists to present complicated and otherwise incomprehensible information in a clear way. It is an essential part of journalism to find material and present it to the public. Understanding data allows journalists to do this with large amounts of information, which would otherwise be impossible to understand.

2. New Approaches to Storytelling

Because they can create infographics and visualisations, data journalists can see and present information in new and interesting ways. Stories no longer need to be linear and based solely on text. Data can be grafted into a narrative which people can read visually. Interactive elements of data visualisations allow people to explore the information presented and make sense of it in their own way.

3. Data Journalism is the Future

Understanding data now will put journalists ahead of the game. Information is increasingly being sourced and presented using data. Journalists who refuse to adapt to the modern, increasingly technological world will be unable to get the best stories, by-lines and scoops and their careers will suffer as a result.

4. Save Time

Journalists no longer need to pore over spreadsheets and numbers for hours when there is a simpler way to organise the information. Being technologically savvy and knowing which techniques to apply to a data set saves time when cleaning, organising and making sense of data. Avoiding mistakes caused by a lack of knowledge saves time too.
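
To give a feel for the kind of cleaning that saves time, here is a minimal sketch in Python using only the standard library. The sample data, the column names and the clean_rows helper are all made up for illustration; a real spreadsheet export will have its own quirks.

```python
import csv
import io

# Hypothetical messy export: stray spaces, a thousands separator,
# a blank line and a missing value. None of this is real data.
RAW = """area , conceptions
 Northtown ,"1,204"
Southville, 987

Eastham ,  n/a
"""

def clean_rows(text):
    """Return (area, count) tuples, skipping rows that cannot be parsed."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header row
    cleaned = []
    for row in reader:
        if len(row) != 2:
            continue  # blank or malformed line
        area = row[0].strip()
        value = row[1].strip().replace(",", "")  # "1,204" -> "1204"
        if not value.isdigit():
            continue  # e.g. "n/a"
        cleaned.append((area, int(value)))
    return cleaned

rows = clean_rows(RAW)
print(rows)
```

A few lines like these replace the manual work of fixing each cell by hand, and the same function can be run again whenever the source data is updated.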

5. A way to see things you might otherwise not see

Understanding large data sets can allow journalists to see significant information that they might otherwise have overlooked. Equally, some stories are best told using data visualisations as this enables people to see things that they might otherwise have been unable to understand.

6. A way to tell richer stories

Combining traditional methods of storytelling with data visualisations, infographics, video or photographs, creates richer, more interesting and detailed stories.

7. Data is an essential part of Journalism

Many journalists do not see data as a specialist and separate area of journalism, but as an interwoven, essential and important element of it. It is not there to replace traditional methods of finding information, but to enhance them. The journalist who can combine a good contact book with an understanding of data will be invaluable in the future.

Visualisation Analysis #3

http://www.guardian.co.uk/news/datablog/interactive/2012/mar/26/office-for-national-statistics-health

Simon Rogers has published a fantastic interactive graphic for the Guardian Datastore that maps teenage pregnancy rates in England and Wales from 1998 to 2010.

The visualisation shows the conception rate among under-18s, per 1,000 women, in different counties across England and Wales. The interactive map is an ideal way to present the information, as it presents a large amount of data in a comprehensible way. From the graphic we can see that the number of teenage pregnancies has declined over the last decade, although this varies by area.

To focus on a specific county the user can move the mouse over the map and click on an area, labelled by county at the side of the map. Once you click on a county the line graph changes to show that county’s change in the number of teenage pregnancies by year and how this compares to the England and Wales average. This gives the user more detailed, local data simply by clicking on the graphic.

By using this tool the user can focus on various localised data, and see how they compare with each other. For example, in Wales it is apparent that poorer counties, such as Merthyr Tydfil and the South Wales Valleys, have teenage pregnancy rates significantly above the national average. In contrast, geographically close but wealthier counties like Monmouthshire and Powys are below the national average. In most cases this has not altered over the decade.
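
The comparison the graphic makes for each county can be sketched in a few lines of Python. The area names and figures below are invented for illustration, and the "national" rate here is assumed to be the pooled rate across all areas, which may not be exactly how the published average was calculated.

```python
def rate_per_1000(conceptions, women_in_age_group):
    """Conceptions per 1,000 women in the age group."""
    return 1000 * conceptions / women_in_age_group

# Hypothetical figures: area -> (conceptions, women in the age group).
areas = {
    "Area A": (120, 2000),
    "Area B": (45, 1500),
    "Area C": (60, 2500),
}

# Pooled "national" rate: total conceptions over total women.
total_conceptions = sum(c for c, _ in areas.values())
total_women = sum(w for _, w in areas.values())
national = rate_per_1000(total_conceptions, total_women)

# Label each area as above or below the national rate.
comparison = {}
for name, (conceptions, women) in areas.items():
    rate = rate_per_1000(conceptions, women)
    label = "above" if rate > national else "below"
    comparison[name] = (round(rate, 1), label)

print(f"National: {national:.1f} per 1,000")
for name, (rate, label) in comparison.items():
    print(f"{name}: {rate} per 1,000 ({label} national)")
```

Working with rates rather than raw counts is what makes a large area and a small one comparable on the same map.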

The map thus shows that looking only at the aggregate data can give a limited understanding: the national figures show a decline in teenage pregnancies but do not reveal that many individual counties have changed little. In this way a graphic of this kind presents the ‘big picture’ alongside the local detail more clearly than text alone.
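
A small worked example shows how a national decline can hide a local area that has barely moved. All of the numbers are hypothetical, and for simplicity the sketch assumes each area's population is the same in both years.

```python
# Hypothetical populations and rates per 1,000 for two years.
women = {"Area A": 2000, "Area B": 1500, "Area C": 2500}
rates_1998 = {"Area A": 60.0, "Area B": 50.0, "Area C": 40.0}
rates_2010 = {"Area A": 59.0, "Area B": 30.0, "Area C": 24.0}

def pooled_rate(rates):
    """Overall rate per 1,000 across all areas, weighted by population."""
    conceptions = sum(rates[a] * women[a] / 1000 for a in women)
    return 1000 * conceptions / sum(women.values())

def percent_change(old, new):
    """Percentage change from old to new (negative means a fall)."""
    return 100 * (new - old) / old

national_change = percent_change(pooled_rate(rates_1998), pooled_rate(rates_2010))
area_a_change = percent_change(rates_1998["Area A"], rates_2010["Area A"])

print(f"National change: {national_change:.1f}%")
print(f"Area A change:   {area_a_change:.1f}%")
```

In this made-up case the national rate falls by roughly a quarter while Area A falls by under two per cent, which is exactly the kind of gap a headline figure conceals.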

The graphic also allows users to ignore information that is not of interest to them and to focus on geographical locations that are. This gives users a certain amount of control over the visualisation, as information is not decided for the user, as would be the case with textual narrative.

The interactive element of the visualisation allows users to find the story or information for themselves with no difficulty. This is more satisfying than simply being told information. At a time when the general public’s trust in journalism is low, visualisations such as this demonstrate that the journalist has not played around and sifted information but presented all of it to the user and allowed them to draw their own conclusions. In this way the user can get a more detailed, accurate and neutral understanding of the issue presented. It also breaks down the barrier between journalist and user and implies trust in the user to interpret and organise the data in an intelligent way.

The graphic also uses visual symbols to organise the large amount of data. The map of England and Wales is easily recognisable, as are many of the counties. The counties that are under the national average are a light shade of blue and this gets darker as the rate increases. The use of blue and purple makes the map visually attractive and the differences in shade easily identifiable. It is apparent that darker areas cluster together and that generally the North of England is darker than the South. In this way the user can obtain information from the visualisation by looking at it alone. The darker shade of purple stands out amongst the generally lighter shades and thus the graphic signals to the reader some of the most dramatic information. Thus, although the user is given control and the freedom to explore the data and draw their own conclusions, visual signals guide them to the most extreme data.
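
The idea of shading areas more darkly as the rate rises can be sketched as a simple lookup. The class breaks and shade names below are my own assumptions for illustration and are not taken from the Guardian graphic.

```python
import bisect

# Hypothetical class breaks (rates per 1,000) and shades, lightest first.
# There is one more shade than there are breaks.
BREAKS = [30, 40, 50, 60]
SHADES = ["lightest blue", "light blue", "mid blue", "dark blue", "purple"]

def shade_for(rate):
    """Pick the shade for a rate: higher rates get darker shades."""
    return SHADES[bisect.bisect_right(BREAKS, rate)]

for rate in (24.0, 37.5, 60.0):
    print(rate, "->", shade_for(rate))
```

Where the breaks are drawn decides which areas stand out, so the choice of classes is itself an editorial decision.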

The orange circle that is drawn around a county when it is selected contrasts with the blue, making the selection clear. It also matches the colour of the line graph, making the visualisation easily readable.

By pressing ‘play’ the user can focus on one county and see how it breaks down by each year, as well as how the colours across England and Wales have changed by year, thus presenting more information.

The visualisation thus works as it presents a large amount of data comprehensibly. It allows the user to interpret and organise the data, but gives them visual signals to guide them. It also gives information for the whole country, as well as localised data, thus presenting the ‘big picture’. It is clear and easy to read and breaks down the barrier between journalist and user. It is therefore an excellent way to present the data.

Visualisation Analysis #2

Simon Rogers has created a visualisation showing death penalty statistics, country by country, for the Guardian Data Blog.

http://bit.ly/hdFOpa

http://bit.ly/hflX1V

The visualisation uses a bubble graph on a map of the world to depict how many people were given death sentences and how many were executed in 2011. This is then broken down by country, giving users the opportunity to compare and contrast regions.


Visualisation Analysis #1

Following on from my earlier post exploring different ways to present data, I have decided to analyse two examples of visualisations from the Guardian Data Store.

http://bit.ly/HsqsLf

The first is a map of UK fuel shortages, ‘The Petrol Panic Mapped’. The map works because it is clear, simple and easy to use. The map is interactive, giving the user control and allowing them to display the information in the way that best suits them, prioritising data that they find most interesting. It also makes viewing the map a more entertaining experience, keeping users on the page for longer.


How to do a good visualisation and why it’s important

Visualisations are an important tool when presenting data, and can be used to show patterns, correlations and the ‘big picture’.

Ben Fry has said that visualisations ‘answer questions in a meaningful way that makes answers accessible to others’ and Paul Bradshaw explains that ‘visualisation is the process of giving a visual form to information which is otherwise dry or impenetrable.’

Traditionally stories have been conveyed through text, and visualisations have been used to display additional or supporting information. Recently, however, improved software has allowed journalists to create sophisticated narrative visualisations that are increasingly being used as standalone stories. These can be linear and interactive, inviting verification, new questions and alternative explanations.


The Data Journalism Handbook: Teaching the World how to work with data [VIDEO]

This video is cross posted on DataDrivenJournalism.net, the Open Knowledge Foundation blog and on the Data Journalism Blog.

The Data Journalism Handbook is a project coordinated by the European Journalism Centre and the Open Knowledge Foundation, launched at the Mozilla Festival in London on 5 November 2011.

Journalists and data experts gathered to create the first-ever handbook on data journalism during a two-day challenge.

Read more about the Data Journalism Handbook in this article by Federica Cocco.

What data tool or great example of data journalism would you add to the handbook? Let’s make this comments section useful!

Every contribution, big or small, to the Data Journalism Handbook is very much appreciated. So use this space to give us links and examples to what you think should be included in the manual.

And if you feel more chatty, email us at editor@datajournalismblog.com

Hacks and hackers gather to write the first Data Journalism Handbook

By Federica Cocco

This article is cross posted on DataDrivenJournalism.net, the Open Knowledge Foundation blog and on the Data Journalism Blog.

Ravensbourne college is an ultramodern cubist design school which abuts the O2 arena on the Greenwich peninsula. It is perhaps an unusual and yet apt setting for journalists to meet.

Members of the Open Knowledge Foundation and the European Journalism Centre saw this as a perfect opportunity to herd together a number of prominent journalists and developers who, fuelled by an unlimited supply of mochaccinos, started work on the first Data Journalism Handbook.

The occasion was the yearly Mozilla Festival, which acts as an incubator for many such gatherings. This year the focus was on media, freedom and the web.

The manual aims to address one crucial problem: “There are a lot of useful resources on the web,” Liliana Bounegru of the EJC said, “but they are all scattered in different places. So what we’re trying to do is put everything together and have a comprehensive step-by-step guide”.

In data journalism, most people are self-taught, and many find it hard to keep up to date with every tool produced by the industry. “It could be vital having a handbook that really explains to journalists how you can approach data journalism from scratch with no prior knowledge,” says Caelainn Barr of the Bureau of Investigative Journalism.

Friedrich Lindenberg of the OKF believes there is a real urgency in making newsrooms data-literate: “If journalists want to keep up with the information, they need to learn coding, and some bits of data analysis and data-slicing techniques. That will make much better journalism and increase accountability.”

And who better than the New York Times’ Interactive Editor Aron Pilhofer, the Guardian Data Blog’s Simon Rogers and others to lead the ambitious effort?

Around 40 people joined them on the sixth floor of the college for a 48-hour session, charged with sorting the wheat from the chaff.

The first draft of the handbook should be ready in the coming months, as other contributors from every corner of the web are still working on their input.

Of course the first data journalism handbook had to be open source. How else would it be able to age gracefully and be relevant in years to come?

Workshops of this sort represent a decisive break from the past. Aspiring data journalists will know that hands-on sessions are a cut above the usual lectures featuring knowledgeable speakers and PowerPoint presentations. Discussing the topic and citing examples is not enough. After all, if you give a man a fish you have fed him for a day. But if you teach a man how to fish, you have him fed for a lifetime.

Jonathan Gray concurs: “Rather than just provide examples of things that have been done with data, we want to make it easier for journalists to understand what data is available, what tools they can use to work with data, how they can visualise data sets and how they can integrate that with the existing workflows of their news organisations.”

At the event itself, after a brief introduction, the crowd split into five groups and began collaborating on each chapter of the handbook. Some were there to instill knowledge, others were there to absorb and ask questions.

“I like the fact that everyone is bringing a different skillset to the table, and we’re all challenging each other”, one participant said.

Francis Irving, CEO of ScraperWiki, led the session on new methods of data acquisition. He believes the collaboration between journalists, programmers, developers and designers, though crucial, can generate a culture clash: “When working with data, there’s a communication question, how do you convey what you need to someone more technical and how do they then use that to find it in a way that’s useful.”

“A project like this is quite necessary,” noted Pilhofer, “It’s kind of surprising someone hasn’t tried to do this until now.”

The free e-book will be downloadable from the European Journalism Centre’s DataDrivenJournalism.net/handbook in the coming months. If you want to follow our progress or contribute to the handbook you can get in touch via the data journalism mailing list, the Twitter hashtags #ddj and #ddjbook, or email bounegru@ejc.net.

Watch the full video report from the Data Journalism Handbook session at the Mozilla Festival, 4-6 November in London, here.

The organisers would like to thank everyone who is contributing to the handbook for their input, and Kate Hudson for the beautiful graphics.

 
About the author: Federica Cocco is a freelance journalist and the former editor of Owni.eu, a data-driven investigative journalism site based in Paris. She has also worked with Wired, Channel 4 and the Guardian. 

 

6 Data Journalism Blogs To Bookmark, Part 2

10,000 WORDS – By Elana Zak

Editor’s Note: This is the second part of a post from 10,000 Words. Find the first one here, which included the Guardian’s Data Blog, ProPublica and your very own Data Journalism Blog…

Last week, I started a list of six data journalism blogs you should take note of. The post stemmed from a project some journalists are leading to develop a data-driven journalism handbook that covers all aspects of the field. This weekend, thanks to a massive effort by attendees at the Mozilla Festival in London, the project morphed from the bare bones of an idea into something very tangible.

In just two days, 55 contributors, from organizations such as the New York Times, the Guardian and Medill School of Journalism, were able to draft 60 pages, 20,000 words, and six chapters of the handbook. The goal is to have a comprehensive draft completed by the end of the year, said Liliana Bounegru of the European Journalism Centre, which is co-sponsoring production of the handbook. If you’re interested in contributing, email Bounegru at bounegru@ejc.net. You can see what the group has so far at bit.ly/ddjbook.

Since the handbook is still being tweaked, why not check out these data journalism blogs?

Open
Like the Guardian, the New York Times is widely known for its spectacular use of data journalism and news apps. Open is written by the news organization’s developers, highlighting hacking events and describing general news of interest to the bloggers.

Data Desk
The Los Angeles Times is at the forefront of data journalism, with its Data Desk blog covering topics from crime to the Lakers to vehicle complaints. Everything on the site is a great example of how to use data to find and craft stories that will matter to your readers. One project I highly recommend you take a look at is mapping LA’s neighborhoods. It is something that could be replicated in almost any town and would grab your audience’s attention.

News Apps Blog
The News Apps blog is where developers from the Chicago Tribune discuss “matters of interest” and give their tips and suggestions on how to make some of the stunning maps and apps that appear in the paper. This is one of the places, if not the place, to go to see what experts in the field are talking about.

What data journalism blogs do you go to?

Image from the Data Journalism Handbook presentation.

The Data Journalism Handbook at #MozFest 2011 in London

The following post is from Jonathan Gray, Community Coordinator at the Open Knowledge Foundation.

With the Mozilla Festival approaching fast, we’re getting really excited about getting stuck into drafting the Data Journalism Handbook, in a series of sessions run by the Open Knowledge Foundation and the European Journalism Centre.

As we blogged about last month, a group of leading data journalists, developers and others are meeting to kickstart work on the handbook, which will aim to get aspiring data journalists started with everything from finding and requesting the data they need and using off-the-shelf tools for data analysis and visualisation to hunting for stories in big databases, using data to augment stories, and plenty more.

We’ve got a stellar line up of contributors confirmed, including:

Here’s a sneak preview of our draft table of contents:

  • Introduction
    • What is data journalism?
    • Why is it important?
    • How is it done?
    • Examples, case studies and interviews
      • Data powered stories
      • Data served with stories
      • Data driven applications
    • Making the case for data journalism
      • Measuring impact
      • Sustainability and business models
    • The purpose of this book
    • Add to this book
    • Share this book
  • Getting data
    • Where does data live?
      • Open data portals
      • Social data services
      • Research data
    • Asking for data
      • Freedom of Information laws
      • Helpful public servants
      • Open data initiatives
    • Getting your own data
      • Scraping data
      • Crowdsourcing data
      • Forms, spreadsheets and maps
  • Understanding data
    • Data literacy
    • Working with data
    • Tools for analysing data
    • Putting data into context
    • Annotating data
  • Delivering data
    • Knowing the law
    • Publishing data
    • Visualising data
    • Data driven applications
    • From datasets to stories
  • Appendix
    • Further resources

If you’re interested in contributing you can either:

  1. Come and find us at the Mozilla Festival in London this weekend!
  2. Contribute material virtually! You can pitch in your ideas via the public data-driven-journalism mailing list, via the #ddj hashtag on Twitter, or by sending an email to bounegru@ejc.net.

We hope to see you there!

Data-Driven Journalism In A Box: what do you think needs to be in it?

The following post is from Liliana Bounegru (European Journalism Centre), Jonathan Gray (Open Knowledge Foundation), and Michelle Thorne (Mozilla), who are planning a Data-Driven Journalism in a Box session at the Mozilla Festival 2011, which we recently blogged about here. This is cross posted at DataDrivenJournalism.net and on the Mozilla Festival Blog.

We’re currently organising a session on Data-Driven Journalism in a Box at the Mozilla Festival 2011, and we want your input!

In particular:

  • What skills and tools are needed for data-driven journalism?
  • What is missing from existing tools and documentation?

If you’re interested in the idea, please come and say hello on our data-driven-journalism mailing list!

Following is a brief outline of our plans so far…

What is it?

The last decade has seen an explosion of publicly available data sources – from government databases, to data from NGOs and companies, to large collections of newsworthy documents. There is an increasing pressure for journalists to be equipped with tools and skills to be able to bring value from these data sources to the newsroom and to their readers.

But where can you start? How do you know what tools are available, and what those tools are capable of? How can you harness external expertise to help to make sense of complex or esoteric data sources? How can you take data-driven journalism into your own hands and explore this promising, yet often daunting, new field?

A group of journalists, developers, and data geeks want to compile a Data-Driven Journalism In A Box, a user-friendly kit that includes the most essential tools and tips for data. What is needed to find, clean, sort, create, and visualize data — and ultimately produce a story out of data?

There are many tools and resources already out there, but we want to bring them together into one easy-to-use, neatly packaged kit, specifically catered to the needs of journalists and news organisations. We also want to draw attention to missing pieces and encourage sprints to fill in the gaps as well as tighten documentation.

What’s needed in the Box?

  • Introduction
    • What is data?
    • What is data-driven journalism?
    • Different approaches: Journalist coders vs. Teams of hacks & hackers vs. Geeks for hire
    • Investigative journalism vs. online eye candy
  • Understanding/interpreting data:
    • Analysis: resources on statistics, university course material, etc. (OER)
    • Visualization tools & guidelines – Tufte 101, bubbles or graphs?
  • Acquiring data
    • Guide to data sources
    • Methods for collecting your own data
    • FOI / open data
    • Scraping
  • Working with data
    • Guide to tools for non-technical people
    • Cleaning
  • Publishing data
    • Rights clearance
    • How to publish data openly.
    • Feedback loop on correcting, annotating, adding to data
    • How to integrate data story with existing content management systems

What bits are already out there?

What bits are missing?

  • Tools that are shaped to newsroom use
  • Guide to browser plugins
  • Guide to web-based tools

Opportunities with Data-Driven Journalism:

  • Reduce costs and time by building on existing data sources, tools, and expertise.
  • Harness external expertise more effectively
  • Towards more trust and accountability of journalistic outputs by publishing supporting data with stories. Towards a “scientific journalism” approach that appreciates transparent, empirically-backed sources.
  • News outlets can find their own story leads rather than relying on press releases
  • Increased autonomy when journalists can produce their own datasets
  • Local media can better shape and inform media campaigns. Information can be tailored to local audiences (hyperlocal journalism)
  • Increase traffic by making sense of complex stories with visuals.
  • Interactive data visualizations allow users to see the big picture & zoom in to find information relevant to them
  • Improved literacy. Better understanding of statistics, datasets, how data is obtained & presented.
  • Towards employable skills.