Mapping election data for North Wales

This is how I, Andrew Stuart, prepared the data behind the 2012 interactive election map on the website of the Daily Post, a regional daily newspaper in North Wales. I used Google Docs and Excel to work with the data we got hold of.

How the story appeared in the newspaper, with what we found through the data.

As a British citizen, I know that getting information for council elections is pretty difficult. How do you decide how to vote? Yes, you can vote along party lines, but those are generally dictated by national policy, wherever that may be. For local council elections, you generally have to wait for the information to drop through the letter box, or for a story to be written about the candidates.

However, local councils really are where the stuff that we see and use on a day-to-day basis is done: rubbish collections, inspecting where we go to eat, repairing the roads, street lighting, and planning. So, the people who decide this policy are important. And knowing what they’re for, against, or couldn’t give two hoots about, matters.

Sadly, writing individual feature pieces on 243 wards, with over 753 residents putting their names forward, for a regional paper covering 6 counties (5 of which are to have elections) is next to impossible. I say next to, because nothing is impossible.

So, when I was at the Daily Post, we decided to use the web to show who was standing where. That way, readers are only a quick Google search or a reference away from finding out more about the candidates. This is what we came up with:

The Election Map. Click the image to go to the Fusion Table

So, how did we do it?

First, you need to gather data. This sounds easier than it is. Some councils had a nice list of each statement of nomination that you can scroll through. Some had a good Word document for reference. Some had the images saved as PDF files, listed individually on the website. Some had three different areas of the council because the county is so big! None of them used the same format.

So, we have to type them out. Not the best way, but the only way. These are new candidates, and the data is not online in any sort of format I can import to Google Docs. Claire Miller for WalesOnline had to do the same thing. For every council in Wales, bar the 5 I did. I do not envy her job.

I typed all the names for a ward into one cell in the format “Name Middle name Surname (party), etc”. The comma is important. I saved three files: the online version, the reference version, and a raw backup.
Typing everything in a uniform way means I can parse each cell easily at the comma. It also allows the file to be shared among different journalists, so they can cover their patches and get the story behind the story. The single-cell version for online works in the info box.
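Parsing at the comma can be sketched in a few lines of Python. This is only an illustration, not what I ran at the time (I worked in Google Docs and Excel): the function name and the example candidates are made up, and it assumes every entry follows the “Name (party)” pattern with no commas inside a name or party.

```python
import re

# Assumed entry format: "First Middle Surname (Party)"
CANDIDATE_PATTERN = re.compile(r"^(?P<name>.+?)\s*\((?P<party>[^()]*)\)$")


def parse_ward_cell(cell):
    """Split one ward's cell into (name, party) pairs at the commas."""
    candidates = []
    for entry in cell.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        match = CANDIDATE_PATTERN.match(entry)
        if match is None:
            # Flag typing slips instead of silently dropping a candidate
            raise ValueError(f"Entry does not match 'Name (party)': {entry!r}")
        candidates.append(
            (match.group("name").strip(), match.group("party").strip())
        )
    return candidates


# Hypothetical candidates, purely for illustration
print(parse_ward_cell("Jane Ann Jones (Labour), Dafydd Evans (Plaid Cymru)"))
```

Raising an error on a badly typed entry is a deliberate choice here: with hundreds of hand-typed names, it is better to find the typo than to lose a candidate.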

The next bit was to make the map work. For this, I need the KML files. There is an easy way of doing this using ScraperWiki. That would bring all the children of each County Council into a file. What I did, however, was to save each file from mapit.mysociety.org (not that strenuous), then create individual county shapefiles in Google Earth. I then have individual maps, and joining all the wards together allows me to create a whole North Wales map.
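I joined the wards by hand in Google Earth, but the same step can be sketched in Python with the standard library. The sketch below assumes each saved file is KML 2.2 with its ward boundaries stored as Placemark elements; the function name is made up, and it copies the Placemarks only, so any shared styles in the source files would be left behind.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# Write the KML namespace as the default namespace (no prefix) on output
ET.register_namespace("", KML_NS)


def merge_kml(kml_documents):
    """Combine the Placemarks from several KML strings into one KML string."""
    root = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(root, f"{{{KML_NS}}}Document")
    for text in kml_documents:
        source = ET.fromstring(text)
        # iter() finds Placemarks at any depth, for example inside Folders
        for placemark in source.iter(f"{{{KML_NS}}}Placemark"):
            document.append(placemark)
    return ET.tostring(root, encoding="unicode")
```

Feeding it the text of each county file, in order, gives one document containing every ward.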

Then, merge the two tables (the one with all the prospective councillor details and the one with the shapefiles) in Google Fusion Tables, and check for errors. The one which did flag up was Mostyn. There is a Mostyn ward in both Conwy and Flintshire. The way around it? Type “Mostyn, Conwy” and “Mostyn, Flintshire”. It worked.
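The merge itself happened inside Fusion Tables, but the logic of the Mostyn fix can be sketched in plain Python. Everything here is illustrative: the function names, the column names (“ward”, “county”, “ward_key”) and the sample rows are assumptions, not the real spreadsheet.

```python
from collections import Counter


def find_duplicate_wards(rows):
    """Return ward names that appear more than once in a table."""
    counts = Counter(row["ward"] for row in rows)
    return sorted(name for name, n in counts.items() if n > 1)


def disambiguate(rows):
    """Add a 'ward_key' column, appending the county to duplicated names."""
    duplicates = set(find_duplicate_wards(rows))
    keyed = []
    for row in rows:
        if row["ward"] in duplicates:
            key = f'{row["ward"]}, {row["county"]}'
        else:
            key = row["ward"]
        keyed.append({**row, "ward_key": key})
    return keyed


def merge_tables(candidates, shapes):
    """Join candidate rows to shape rows on 'ward_key'; report misses."""
    shapes_by_key = {shape["ward_key"]: shape for shape in shapes}
    merged, unmatched = [], []
    for row in candidates:
        shape = shapes_by_key.get(row["ward_key"])
        if shape is None:
            unmatched.append(row["ward_key"])
        else:
            merged.append({**shape, **row})
    return merged, unmatched
```

Anything that lands in the unmatched list is a ward whose name was typed differently in the two tables, which is exactly the kind of error worth checking before publishing.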

All you need to do then is to colour the shapefiles by county. To do this, I put the HTML colour codes in a column on the councillor list, and selected that column as the one for the colours for the polygons, and you have coloured counties.

And to get around Anglesey not having elections? In the Anglesey cells, I typed “no election”. The info box then shows “no election”.
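The colour column and the Anglesey placeholder can be sketched together. The colour codes and column names below are invented for illustration; the real ones lived in the councillor spreadsheet.

```python
# Hypothetical HTML colour codes, one per county
COUNTY_COLOURS = {
    "Anglesey": "#999999",
    "Conwy": "#1f77b4",
    "Flintshire": "#2ca02c",
}

# Counties without an election in this round
NO_ELECTION_COUNTIES = {"Anglesey"}


def add_display_columns(rows):
    """Add a 'colour' column and blank out candidates where no vote is held."""
    display_rows = []
    for row in rows:
        county = row["county"]
        if county in NO_ELECTION_COUNTIES:
            info = "no election"
        else:
            info = row["candidates"]
        display_rows.append(
            {**row, "colour": COUNTY_COLOURS[county], "candidates": info}
        )
    return display_rows
```

Keeping the colour in its own column means the map styling can point at that column, which is how the polygons end up coloured by county.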

That’s how you inform 243 wards of who’s standing where, in one fell swoop and, if I may say so, quite beautifully too.

This was originally posted on andrewwgstuart.com. Trinity Mirror own copyright for the cuttings used in this piece. Andrew Stuart created the map. 

Visualisation Analysis #3

http://www.guardian.co.uk/news/datablog/interactive/2012/mar/26/office-for-national-statistics-health

Simon Rogers has published a fantastic interactive graphic for the Guardian Datastore that maps teenage pregnancy rates in England and Wales from 1998 to 2010.

The visualisation shows the conception rate among under-eighteen-year-olds, per 1,000 women, in different counties across England and Wales. The interactive map is an ideal way to present the information, as the visualisation presents a large amount of data in a comprehensible way. From the graphic we can see that the number of teenage pregnancies has declined in the last decade, although this varies by area.

To focus on a specific county, the user can move the mouse over the map and click on an area, labelled by county at the side of the map. Once you click on a county, the line graph changes to show that county’s change in the number of teenage pregnancies by year and how this compares to the England and Wales average. This gives the user more detailed and specific information simply by clicking on the infographic. Thus the graphic allows users to see more personalised, local data.

By using this tool the user can focus on various localised data, and see how they compare with each other. For example, in Wales it is apparent that poorer counties, such as Merthyr Tydfil and the South Wales Valleys, are significantly over the national average regarding the number of teenage pregnancies. In contrast, geographically close but wealthier counties like Monmouthshire and Powys are below the national average. In most cases this has not altered over the decade.

The map thus shows that in certain circumstances seeing only the national data can give a limited understanding, as it shows a national decline in the number of teenage pregnancies but does not tell us that many individual counties have not changed significantly. In this way a graphic of this kind presents to users the ‘big picture’ in a clearer way than text alone.

The graphic also allows users to ignore information that is not of interest to them and to focus on geographical locations that are. This gives users a certain amount of control over the visualisation, as information is not decided for the user, as would be the case with textual narrative.

The interactive element of the visualisation allows users to find the story or information for themselves with no difficulty. This is more satisfying than simply being told information. At a time when the general public’s trust in journalism is low, visualisations such as this demonstrate that the journalist has not played around and sifted information but presented all of it to the user and allowed them to draw their own conclusions. In this way the user can get a more detailed, accurate and neutral understanding of the issue presented. It also breaks down the barrier between journalist and user and implies trust in the user to interpret and organise the data in an intelligent way.

The graphic also uses visual symbols to organise the large amount of data. The map of England and Wales is easily recognisable, as are many of the counties. The counties that are under the national average are a light shade of blue, and this gets darker as the rate increases. The use of blue and purple makes the map visually attractive and the differences in shade easily identifiable. It is apparent that darker areas cluster together and that generally the North of England is darker than the South. In this way the user can obtain information from the visualisation by looking at it alone. The darker shade of purple stands out amongst the generally lighter shades and thus the graphic signals to the reader some of the most dramatic information. Thus, although the user is given control and the freedom to explore the data and draw their own conclusions, visual signals guide them to the most extreme data.

The orange circle that is drawn around a county when it is selected contrasts with the blue, making it clear. It also correlates with the colour of the line graph, making the visualisation easily readable.

By pressing ‘play’ the user can focus on one county and see how it breaks down by year, as well as how the colours across England and Wales have changed by year, thus presenting more information.

The visualisation thus works as it presents a large amount of data comprehensibly. It allows the user to interpret and organise the data, but gives them visual signals to guide them. It also gives information for the whole country, as well as localised data, thus presenting the ‘big picture’. It is clear and easy to read and breaks down the barrier between journalist and user. It is therefore an excellent way to present the data.

Visualisation Analysis #2

Simon Rogers has created a visualisation showing death penalty statistics, country by country, for the Guardian Data Blog.

http://bit.ly/hdFOpa

http://bit.ly/hflX1V

The visualisation uses a bubble graph on a map of the world to depict how many people were given death sentences and how many people were executed in 2011. This is then broken down by country, giving users the opportunity to compare and contrast regions.


Visualisation Analysis #1

Following on from my earlier post exploring different ways to present data, I have decided to analyse two examples of visualisations from the Guardian Data Store.

http://bit.ly/HsqsLf

The first is a map of UK fuel shortages: ‘The Petrol Panic Mapped’. The map works because it is clear, simple and easy to use. The map is interactive, giving the user control and allowing them to display the information in the way that best suits them, prioritising data that they find most interesting. It also makes viewing the map a more entertaining experience, keeping users on the page for longer.


How much does society actually mix?

The Office for National Statistics recently released results from the Citizenship Survey, conducted by the Department for Communities and Local Government for 2010-2011. The survey aimed to find out the level of integration in the UK between different ethnic and religious groups. The data was categorised by the places where people are likely to mix, for example at the shops, within schools, at work and within the home.

How to do a good visualisation and why it’s important

Visualisations are an important tool when presenting data, and can be used to show patterns, correlations and the ‘big picture’.

Ben Fry has said that visualisations ‘answer questions in a meaningful way that makes answers accessible to others’ and Paul Bradshaw explains that ‘visualisation is the process of giving a visual form to information which is otherwise dry or impenetrable.’

Traditionally, stories have been conveyed through text, and visualisations have been used to display additional or supporting information. Recently, however, improved software has allowed journalists to create sophisticated narrative visualisations that are increasingly being used as standalone stories. These can be linear and interactive, inviting verification, new questions and alternative explanations.


How to find Data

This post is for people who are new to data sourcing, or interested in Data Journalism but unsure of where to begin.

First, it is useful to start with an idea, question or hypothesis. In Story-Based Inquiry, Mark Lee Hunter emphasises the importance of having an idea of what you are looking for in data.

He said: “We do not think that the only issue is finding information. Instead, we think that the core task is telling the story. Stories are the cement which holds together every step of the investigative process, from conception to research, writing, quality control and publication.”

Data stories and visualisations are part of journalism and, when looking for information, a good starting place is to use traditional journalistic methods. Contacts, tip-offs, interviews and research can all point you in the direction of interesting data, and of questions that could be answered by statistics. This is known as Active Data Journalism.


Visualisation showing patients detained under the Mental Health Act 1983

Here I have created a visualisation showing patients detained under the Mental Health Act 1983 over the last six years.

I took statistics from the mental health pages of the NHS website and downloaded them into an Excel spreadsheet. I then cleaned the data, taking out any information that was unnecessary and that would confuse the image. I rearranged the columns and data to make them easier to understand and visually clearer.

I then experimented with Many Eyes, Google Docs and Excel graphs to create the visualisation. I tried other ways of presenting the image, in a pie chart and a line graph, but found that the bar chart worked best.

The information is broken down by gender as well as by type of hospital: NHS facilities and independent hospitals. The graph shows that more men than women have been detained under the Mental Health Act in every year. This holds for both NHS facilities and independent hospitals. The number of men detained has also gone up marginally in the last two years, though it has stayed relatively consistent over the last six years.

This is interesting because statistics have indicated that more women than men are diagnosed with mental health disorders, such as depression and anxiety. However, when it comes to severe cases, where patients are legally detained due to mental illness, men are significantly more likely to be affected.

 

Metropolitan Police expenditure on cars is on the rise

The Metropolitan Police’s expenditure on brand new unmarked cars is rising, a Freedom of Information request has revealed. In excess of £28 million was spent on new cars between December 2006 and November 2011.

Annual spending figures show that expenditure has risen from £4.8 million in 2007 to £6 million in 2010, with spending exceeding the £1 million mark in single months: £1.6 million was spent in April 2008, and £1 million in August 2011. Unmarked police cars do not display any police logos and are used by police to assist in operations and respond to incidents. These figures therefore exclude any money spent on the Metropolitan Police’s marked fleet cars and, considering the cuts being made to policing across the country, the increasing spend is concerning.

When questioned about the figures, a Metropolitan Police spokesperson said:

“The MPS Fleet contains over 5,000 vehicles with a wide range of functions, capabilities and specialisms. Maintaining the fleet requires the purchase and maintenance of vehicles – some of which are without police livery. Procurement of vehicles is through a competitive tender process and framework, ensuring we obtain best value for money. Unmarked vehicles are often directly involved with or support operational activity and although not instantly recognisable to the public, play a crucial role as part of our fleet and supporting policing London.”

The Metropolitan Police refused to comment on how many cars were bought with the money, or on the specific types of cars bought.

Emma Boon, Campaign Director of the TaxPayers’ Alliance, commented: “It’s worrying that Met police spending on fast cars increased by over a million pounds in just a couple of years. Clearly the force will need some high performance vehicles to be able to do their job properly, but these figures show spending in this area increasing, even during the recent recession. With police budgets tight, the Met must get this spending under control and ensure that they are getting the best value for taxpayers’ money”.