€20,000 to win in The Open Data Challenge: Get crackin’!

So you are a data enthusiast? Here is a great opportunity to get noticed…

The Open Data Challenge is a data competition organised by the Open Knowledge Foundation, in conjunction with the OpenForum Academy and Share-PSI.eu.

European public bodies produce thousands upon thousands of datasets every year – about everything from how our tax money is spent to the quality of the air we breathe. With the Open Data Challenge, the Open Knowledge Foundation and the OpenForum Academy are challenging designers, developers, journalists and researchers to come up with something useful, valuable or interesting using open public data.

Anyone in the EU can submit an idea, app, visualization or dataset to the competition between 5th April and 5th June. The winners will be announced in mid-June at the European Digital Assembly in Brussels. A total of €20,000 in prizes could be another motivator if you’re not convinced yet.

All entries must use or depend on open, freely reusable data from local, regional or national public bodies from European member states or from European institutions (e.g. Eurostat, EEA, …).

Some starting points for you to get data are http://publicdata.eu or http://lod2.okfn.org. The organisers are focused on solutions that are reusable in different countries, cover pan-European issues and use open licenses for any code, content and data. Get all the info about the competition and on how to join here.

We are very eager to see what you come up with, so share your work with us in the Data Art Corner or in the comments!

 

 

Breaking Bin Laden: visualizing the power of a single tweet

The shape of rumours on Twitter by Social Flow

 

SOCIAL FLOW

A full hour before the formal announcement of Bin Laden’s death, Keith Urbahn posted his speculation on the emergency presidential address. Little did he know that this Tweet would trigger an avalanche of reactions, Retweets and conversations that would beat mainstream media as well as the White House announcement.

Keith Urbahn wasn’t the first to speculate about Bin Laden’s death, but he was the one who gained the most trust from the network. Why did this happen?

Before May 1st, not even the smartest of machine learning algorithms could have predicted Keith Urbahn’s online relevancy score, or his potential to spark an incredibly viral information flow. While politicos “in the know” certainly knew him or of him, his previous interactions and the size and nature of his social graph did little to reflect his potential to win the trust of thousands of people within a matter of minutes.

While connections, authority, trust and persuasiveness play a key role in influencing others, they are only part of a complex set of dynamics that affect people’s perception of a person, a piece of information or a product. Timing, initiating a network effect at the right time, and frankly, a dash of pure luck matter equally. [Read more…]

 

10 CHARTS ABOUT SEX [Infographics]

OWNI.EU

Data journalism can make sense of very complicated and sometimes unusual information. But some creative minds have also come up with really good data visualisations about our daily lives and, in this instance, sex. So here is an article from OWNI.eu, originally published on OkCupid’s blog, dealing with many aspects of our tumultuous sex lives… Enjoy!

This was one of the first infographics ever made:

Later remembered as “the map that made a nation cry”, it depicts Napoleon’s failed invasion of Russia in 1812. The wide tan swath shows his Grande Armée, almost half a million strong, marching East to Moscow; the black trickle shows the few who straggled back. It’s an elegant fusion of geography, time, and temperature into a single statement of military disaster.

Of course, using modern tools of analysis, like circles and the color blue, we can get an even clearer picture of history:

It is our goal today to create graphics of similar concision and power, but about something more useful than war—sex.

All the data below, even the most personal stuff, has been gleaned from real user activity on OkCupid. Some of it our users have told us outright by answering match questions; some of it we’ve had to learn from observation.

Other than the unifying theme, sex, there’s no big point or thesis to this post: just comparisons, correlations, and quirky trends.

Chart #1

We found this by crossing the match questions Do you like to exercise? and Is it difficult for you to have an orgasm?, and, as you can see, women who don’t like working out report twice the orgasm problems of women who do.

Chart #2

Here, we took a single question—Is your ideal sex rough or gentle?—and scraped people’s profile text for the words that most correlated to each answer. Here are word clouds for women and men in their 20s.

The text is basically Hot Topic versus, I dunno, Burberry. But beyond the words the interesting thing is how men’s and women’s preferences change with age:

This dataset only includes single people, of course, but I was still very surprised at how many old men like it rough. Looks like I’m going to have to rethink a cherished part of my worldview.

Chart #3

The odds shown in this chart, and the others like it later in the post, are odds “in favor”—in this case, odds in favor of being into giving oral sex. The higher a group’s odds, the more into it they are.
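For readers who want the arithmetic behind “odds in favor”, here is a minimal sketch. It assumes only the standard definition (odds in favor = p / (1 − p), where p is the share of a group giving the answer); the function names and the example shares are hypothetical illustrations, not OkCupid’s code or data.

```python
from fractions import Fraction


def odds_in_favor(p):
    """Convert a share p (0 <= p < 1) into odds in favor, p / (1 - p)."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def as_ratio(odds):
    """Format odds the way the post quotes them, e.g. '2-to-1'."""
    return f"{odds.numerator}-to-{odds.denominator}"


# Hypothetical share: two thirds of a group answering "yes".
print(as_ratio(odds_in_favor(Fraction(2, 3))))  # 2-to-1
```

So a group quoted at 2-to-1 is one where two people say yes for every one who says no, and higher odds simply mean a larger share in favor.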

Since so much sexual slang involves meat—“hot dog,” “sausage,” “burger,” “beef injection,” “another beef injection,” and so on—I thought this would be a fine occasion to point out that there are plenty of veggie alternatives:

Vegetarian-Friendly Sex Slang
Peeling the banana.
Tossing the salad.
Squeezing the melons.
Zeroing in on a grown man’s nuts and nutsack.
Putting Monsanto in yoursanto.
Ordering the split pea soup.
Sorry, that’s got ham.

Cornholing others.

Charts #4 & #5

Frequent tweeters have shorter real-life relationships than everyone else, probably via some bit.ly hack. Unfortunately, we have no way to tell who’s dumping who here; whether the twitterati are more annoying or just more flighty than everyone else. There is also this:

If someone tweets every day, it’s 2-to-1 that they’re #ingthemselves just as often. Like the “shorter relationships” thing, this is true across all age and gender groups.

Chart #6

In the Bible, in between the part where Reuben kills a he-goat so he can dip some clothes in the blood of the he-goat and where Judah tries to give Tamar a goat but decides maybe she should be burned to death instead, God kills a man named Onan because Onan intentionally spills his seed on the ground.

(1) Thou shalt not whack off. (2) Mo goats mo problems.

Life lessons! From the Iron Age!

Charts #7 & #8

This bubble chart, plotting body type, sex drive, and self-confidence, is dynamic—you can use the slider at the bottom to change it. As you can see as you move the control from left to right, a woman’s sexuality peaks in her twenties, holds more or less steady for twenty years, and then falls to the floor. And while sex drive waxes and wanes, self-confidence steadily grows.

Remember, the women themselves select their body-descriptions; the bubbles show the size of each group. Though many of the words are just a shade of meaning apart, there are dramatic differences in the traits of the people who choose them. Go through the animation and compare full-figured to curvy or skinny to thin.

It’s particularly interesting to isolate skinny—a deprecating way to say something generally considered positive (being thin)—and curvy—an empowering way to say something generally considered negative (being heavy). Here are those bubbles’ complete paths across the graph:

Curvy women pass skinny ones in self-confidence at age 29 and never look back. They also consistently have the highest sex drive among the groups. Curvy, as a word, has the strongest sensual overtones of all our self-descriptions. So we’re getting a little insight into the real-world implications of a label.

This is the “complete path” plot for men:

Things to notice: (1) almost no men choose curvy or full-figured as self-descriptions, so those words aren’t plotted here; (2) men of all body types have roughly the same peak sex drive; (3) and the thing that matters most for guys is simply to not be overweight. The other four body types are clustered relatively together at most ages.

Chart #9

For this chart, we took our own data and mixed it with a little outside stuff: college tuitions from U.S. News & World Report.

Generally speaking, the more your parents are paying for your education, the more horny you are. If only Freud were still around to help us understand; instead we have psychology majors, those Adidas shower sandals, and darkness.

You can think of the dotted best-fit line as dividing the good sex-ed values (above the line) from the bad ones (below). The line also gives us a handy sliding scale: given a 36-week school year and the average partner, every $2,000 spent on your college tuition is an extra time you could be having sex that year.

Chart #10

The correlation between sex and money is robust for colleges, but it gets even stronger when extended to entire nations.

We were amazed at this result—money seems to be a more powerful influence on sex drive than culture or even religion.

You have, for example, Portugal, Oman, Slovenia, and Taiwan within a few pixels of each other on the right side of the graph, and Syria, Sri Lanka, and Guatemala almost stacked on the left, and all of them sit along the trend line.

—-

This post was originally published on OkCupid’s blog

Photo Credits: OkCupid and Flickr CC HikingArtist.com

 

Data Journalism: The Story So Far

DATA MINER UK – by Nicola Hughes

This is such a great article on the story of data journalism by Nicola Hughes that we decided to reproduce it in full! Get the original article on Data Miner UK

[youtube 3YcZ3Zqk0a8]

And here’s what Tim Berners-Lee, inventor of the World Wide Web, said regarding the subject of data journalism:

Journalists need to be data-savvy… [it’s] going to be about poring over data and equipping yourself with the tools to analyse it and picking out what’s interesting. And keeping it in perspective, helping people out by really seeing where it all fits together, and what’s going on in the country

How the Media Handle Data:

Data has sprung onto the journalistic platform of late in the form of the Iraq War Logs (mapped by The Guardian), the MPs’ expenses (bought by The Telegraph) and the leaked US Embassy Cables (visualized by Der Spiegel). What strikes me about these big hitters is that the existence of the data is a story in itself. Which is why they had to be covered. And how they can be sold to an editor. These data events force the journalistic platform into handling large amounts of data. The leaks are stories, so there’s your headline before you start actually looking for stories. In fact, the Fleet Street Blues blog pointed out the sorry lack of stories from such a rich source of data, noting the quick turn to headlines about Wikileaks and Assange.

Der Spiegel - The US Embassy Dispatches

 

So journalism so far has had to handle large data dumps, which has spurred on the area of data journalism. But they also serve to highlight the fact that the journalistic platform as yet cannot handle data. Not the steady stream of public data eking out of government offices and public bodies. What has caught the attention of news organizations is social media. And that’s a steady stream of useful information. But again, all that’s permitted is some fancy graphics hammered out by programmers who are glad to be dealing with something more challenging than picture galleries (here’s an example of how CNN used Twitter data).

So infographics (see the Stanford project: Journalism in the Age of Data) and interactives (e.g. New York Times: A Peek into Netflix Queues) have been the keystone from which the journalism data platform is being built. But there are stories and not just pictures to be found in data. There are strange goings-on that need to be unearthed. And there are players outside of the newsroom doing just that.

How the Data Journalists Handle Data:

Data, before it was made sociable or leakable, was the beat of the computer-assisted reporters (CAR). They date as far back as 1989, with the setting up of the National Institute for Computer-Assisted Reporting in the States, which is soon to be followed by the European Centre for Computer Assisted Reporting. The French group OWNI are the latest (and coolest) revolutionaries when it comes to new-age journalism and are exploring the data avenues with aplomb. CAR then morphed into Hacks/Hackers when reporters realized that computers were tools that every journalist should use for reporting. There’s no such thing as telephone-assisted reporting. So some wacky journalists (myself now included) decided to pair up with developers to see what can be done with web data.

This now seems to be catching on in the newsroom. The Chicago Tribune has a data center, to name just one. In fact, the data center at the Texas Tribune drives the majority of the site’s traffic. Data journalism is growing alongside the growing availability of data and the tools that can be used to extract, refine and probe it. However, at the core of any data-driven story is the journalist. And what needs to be fostered now, I would argue, is the data nose of a (any) journalist. Journalism, in its purest form, is interrogation. The world of data is an untapped goldmine and what’s lacking now is the data acumen to get digging. There are Pulitzers embedded in the data strata which can be struck with little use of heavy machinery. Data-driven journalism, and indeed CAR, have been around since long before social media, web 2.0 and even the internet. One of the earliest examples of computer-assisted reporting was in 1967, after riots in Detroit, when Philip Meyer used survey research, analyzed on a mainframe computer, to show that people who had attended college were equally likely to have rioted as were high school dropouts. This turned the public’s attention to the pervasive racial discrimination in policing and housing in Detroit.

Where Data Fits into Journalism:

I’ve been looking at the States, and the broadsheets’ reputation for investigative journalism has produced some real gems. What struck me, looking at news data across the Atlantic, is that data journalism has been seeded earlier and possibly more prolifically than in the UK. I’m not sure if it’s more established but I suspect so (but not by a wide margin). For example, at the end of 2004, The Dallas Morning News analyzed the school test scores of the Texas Assessment of Knowledge and Skills and uncovered one school’s alleged cheating on standardized tests. This then turned into a story on cheating across the state. The Seattle Times piece of 2008, logging and landslides, revealed how a logging company was blatantly allowed to clear-cut unstable slopes. Not only did they produce an interactive, but the beauty of data journalism (which is becoming a trend) is to write about how the story was uncovered using the requested data.

The Seattle Times: Landslides in the Upper Chehalis River Basin

 

Newspapers in the US are clearly beginning to realize that data is a commodity with which you can buy trust from your consumer. The need for speed seems to be diminishing as social media gets there first, and viewers turn to the web for richer information. News, in the sense of something new to you, is being condensed into 140-character alerts, newsletters, status updates and things that go bing on your mobile device. News companies are starting to think about news online as exploratory information that speaks to the individual (which is web 2.0). So The New York Times has mapped the census data in its project “Mapping America: Every City, Every Block”. The Los Angeles Times has also added crime data so that its readers are informed citizens, not just site surfers. My personal heroes are the investigative reporters at ProPublica, who not only partner with mainstream news outlets for projects like Dollars for Doctors but also blog about the new tools they’re using to dig the data. Proof the US is heading down the data mine is the fact that Pulitzer finalists for local journalism included a two-year data dig by the Las Vegas Sun into preventable medical mistakes in Las Vegas hospitals.

Lessons in Data Journalism:

Another sign that data journalism is on the up is the recent uptake at teaching centres for the next generation of journalists. Here in the UK, City University has introduced an MA in Interactive Journalism which includes a module in data journalism. Across the pond, the US is again ahead of the game, with Columbia University offering a dual master’s in Computer Science and Journalism. Voices from the journalism underground are now muttering terms like Google Refine, Ruby and ScraperWiki. O’Reilly Radar has talked about data journalism.

The beauty of the social and semantic web is that I can learn from the journalists working with data, the miners carving out the pathways I intend to follow. They share what they do. Big shot correspondents get a blog on the news site. Data journalists don’t, but they blog because they know that collaboration and information are the key to selling what it is they do (e.g. Anthony DeBarros, database editor at USA Today). They are still trying to sell damned good journalism to the media sector! Multimedia journalists for local news are getting it (e.g. David Higgerson, Trinity Mirror Regionals). Even grassroots community bloggers are at it (e.g. Joseph Stashko of Blog Preston). Looks like data journalism is working its way from the bottom up.

Back in Business:

Here are two interesting articles relating to the growing area of data and data journalism as a business. Please have a look: Data is the New Oil and News organizations must become hubs of trusted data in a market seeking (and valuing) trust.

 

Open Data And Emergent Digital Horizons At Future Everything 2011 [Event]

PSFK: by Stephen Fortune

Picture from the PSFK website

Now in its 16th year, the recently-renamed FutureEverything Festival will continue to showcase and illuminate creative technologies and digital innovation this coming May in Manchester, UK.

Befitting its role in leading Manchester’s recent Open Data revolution, FutureEverything will provide centre stage consideration of Open Data as part of its two-day conference. Open Data is shifting the digital landscape in a manner comparable to the sea changes which followed in the wake of social media, and FutureEverything 2011 offers the means to understand how it will transform the way consumers engage with brands, and the ways citizens engage in local government. The topics under consideration range from the enterprise that can be fomented with open data to what shape algorithm-driven journalism will take. [Read more…]

 

#opendata: What is open government data? What is it good for? [VIDEO]

#opendata film

[vimeo 21711338]

This short film by the Open Knowledge Foundation deals with the rise of open government data and can be found on the Open Government Data website. Open data is changing the relationship between citizens and their government. People are now more aware of government spending, who is representing them, and the companies that do business with the government. Some say that open data is bringing a global social change, that it is modifying the way society works. Watch this film and tell us what you think…

 

The Royal Wedding: An experiment in data journalism

WANNABE HACKS: by Matthew Caines

Graph by Matthew Caines using ManyEyes

UPDATE: After having a stab at data journalism today, my first ever piece has since been featured on the MANY EYES homepage. Not too bad for a first-timer…

Seeing as today is all about taking the plunge and tying the knot, I’ve been thinking about being joined in holy matrimony myself… to data journalism! I say taking the plunge because it’s not necessarily a match made in heaven – data journalism is something I’ve often shied away from, always assuming the tech geeks + web guys are the only ones who can do it and do it well.

My cold feet were that anyone who saw my Microsoft Word multi-coloured pie-chart would surely scoff at my horrendous attempt at interactive data. But I need this marriage to work because data journalism is fast becoming a valuable skill for any aspiring journalist. [Read more…]

 

The Social Media Buzz Behind the Royal Wedding [INFOGRAPHIC]

MASHABLE: by Ben Parr

 

Infographic from Mashable website

With hours to go until the Royal Wedding, online buzz surrounding the big event has surpassed the chatter that surrounded the Egypt uprising and the Japan earthquake.

New stats gathered and analyzed by Webtrends reveal that the world simply can’t stop talking about the Royal Wedding (not that you needed us to tell you). According to the web analytics company, people have sent 911,000 tweets in the last 30 days, or just a little more than 30,000 tweets per day, which accounts for 71% of the buzz Webtrends tracked. For comparison, there were approximately 217,000 Facebook status updates and 145,000 blog posts about William and Kate’s big day. [Read more…]
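As a quick sanity check on those figures, here is a minimal sketch that recomputes the daily rate and Twitter’s share from the three counts quoted above. It assumes those three channels make up all of the buzz Webtrends tracked, which the excerpt does not state; the variable names are ours.

```python
# Counts quoted in the excerpt above (last 30 days).
tweets = 911_000
facebook_updates = 217_000
blog_posts = 145_000
days = 30

tweets_per_day = tweets / days

# Assumption: these three channels are the whole of the tracked buzz.
total = tweets + facebook_updates + blog_posts
twitter_share = tweets / total

print(f"{tweets_per_day:,.0f} tweets per day")       # 30,367 tweets per day
print(f"{twitter_share:.1%} of tracked mentions")    # 71.6% of tracked mentions
```

Under that assumption both results line up with the article’s “a little more than 30,000 tweets per day” and “71% of the buzz”.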

 

10 things every journalist should know about data

NEWS:REWIRED: by SARAH MARSHALL

Picture from News:Rewired website

Every journalist needs to know about data. It is not just the preserve of the investigative journalist but can – and should – be used by reporters writing for local papers, magazines, the consumer and trade press and for online publications.

Think about crime statistics, government spending, bin collections, hospital infections and missing kittens and tell me data journalism is not relevant to your title.

If you think you need to be a hacker as well as a hack then you are wrong. Although data journalism combines journalism, research, statistics and programming, you may dabble but you don’t need to know much maths or code to get started. It can be as simple as copying and pasting data from an Excel spreadsheet. [Read more…]

 

Journalism in the Age of Data: A video report on Data Visualisation as a storytelling medium

STANFORD.EDU: Geoff McGhee

[vimeo 14777910]

Journalists are coping with the rising information flood by borrowing data visualization techniques from computer scientists, researchers and artists. Some newsrooms are already beginning to retool their staffs and systems to prepare for a future in which data becomes a medium. But how do we communicate with data, and how can traditional narratives be fused with sophisticated, interactive information displays? [Watch the full version with annotations and links on Stanford.edu]