psmag.com http://www.psmag.com/navigation/business-economics/man-whos-quantifying-new-york-city-i-quant-ny-92977/

The Man Who’s Quantifying New York City


Noah Davis talks to the proprietor of I Quant NY. His methodology: a little something called
“addition.”

Ben Wellington didn’t mean to start a data revolution. He just wanted to teach his students. As
the visiting assistant professor in Pratt Institute’s City & Regional Planning program, he
teaches a statistics course based on NYC Open Data, a repository of information provided by
the city. After he started playing around with the data sets, he decided to launch a blog. I Quant
NY features his discoveries—from the farthest Manhattan apartment from the subway to the
fire hydrant that brought in the most ticket revenue—and has become something of a media
sensation, even affecting policy. Wellington spoke with Pacific Standard about asking the right
questions, the simplicity and power of summing, and why posts don’t need to be difficult.

Did you anticipate the blog getting attention beyond your students?

I was definitely surprised by the interest. I think there are a lot of compelling stories to be told
within the different data sets. Often, people do really fancy visualizations and beautiful things. I
don’t have the skills to make beautiful things. I’m taking the opposite approach. I take small
slices of things, things that people can actually internalize. On one hand, that’s probably more
relatable. Watching every CitiBike fly around the city is cool, but exploring where the most
females and males are is simpler to internalize and easier to draw conclusions from.

I was surprised that the simple approach had as much or even more interest than the people
with JavaScript skills. That’s pretty cool.

It can be really hard to have an idea and express it in a simple way with stats and data.
You seem to have mastered that.

The key is figuring out the right question to ask. If you look through Freakonomics, they aren’t
doing the fanciest analysis. They are looking at one variable, but they are looking at it from a
new angle. For the most part, all my work is just counts and means. The fanciest thing is a
correlation. After I did the post about the fire hydrant that was generating the most ticket
revenue I got a call from the New York Post asking about my methodology. I told them that I
summed. I added. It was an awkward conversation.

There’s a fear of numbers. I work at Pratt to teach city planners that you don’t need to be a
statistician or computer scientist to do this type of work. If you can learn a little bit about Excel,
you can go on Open Data and download a .csv file and do the same CitiBike study I did with a
few clicks.
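
As an editorial illustration of the "I summed. I added." approach described above, here is a minimal Python sketch that totals ticket fines per location and picks the largest. It is not Wellington's actual code: the column names `location` and `fine`, the function name, and the sample rows are all invented for the example, and a real NYC Open Data export would have its own schema.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows standing in for a downloaded .csv file.
# The column names "location" and "fine" are assumptions, not a real schema.
SAMPLE_CSV = """location,fine
Hydrant A,115
Hydrant B,115
Hydrant A,115
Hydrant C,115
Hydrant A,115
"""


def total_fines_by_location(csv_file):
    """Add up the fine column separately for each location."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["location"]] += int(row["fine"])
    return dict(totals)


totals = total_fines_by_location(io.StringIO(SAMPLE_CSV))

# The "top-grossing" location is simply the one with the largest sum.
top_location = max(totals, key=totals.get)
print(top_location, totals[top_location])
```

With a real download, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an opened file, for example `open("tickets.csv", newline="")`, where the file name is again hypothetical.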

How do you think of the questions to ask?

It’s probably three ways. I keep an ear open to interesting current trends that are happening. If
a hearing about Vision Zero is coming up, I might have an extra look at their data. I live in the
city, and I’m always asking myself, “Why, why, why?” My experience as a New Yorker really
leads me to a fascination with a certain type of data. I’ve gotten my share of parking tickets in
my life. The third way is to look at a data set and start rattling off questions. I’ll ask myself a
question, look at the answer and if it seems cool, write about it.

What percentage of the time is there an interesting conclusion?

More than half, honestly. There’s no data set that I couldn’t find something. Some are more
“pop culture” exciting. The recent work I did about finding the apartment farthest from the
subway somehow got picked up by a bunch of media outlets. I find that less interesting, but it’s
more relatable to people. There’s a balance between interesting and useful, where maybe I can
help with policy decisions, and exciting to the Internet. That’s a conflicting battle I have.

I first found the blog after the MTA post about the perfect value to add to your
Metrocard. That’s similar in the not-all-that-useful-but-Internet-fascinating way.

Yeah, that’s another example. I hope they fix it, but it’s not that compelling analysis. It’s more
“this is quirky, let’s fix it.” Those aren’t the things I’m most excited about, but they are the most
relatable, I think. People flock to those types of things.

The apartment one was fun but also sort of silly. It’s just an apartment on the water.
That’s not all that surprising.

I thought it was funny to find the apartment, see it listed, and see the price. [It's on the market
for $18.9 million.] But to do the same thing in Brooklyn would be more compelling because you
could talk about public transportation and places where you have a real distance. You could
do a lot, though, like count the number of apartments that will have closer access to the
subway when the Second Avenue line starts. What percentage of Manhattan was affected by
that?

You’ve done a lot of work with CitiBike.

One of the cool things I saw in the CitiBike data was the median age of riders per station. The
Lower East Side has the oldest riders. The median age is 41 or 42. That was a data point that
stuck out to me and got me thinking. It could mean one of two things: Either they need more
bikes and that’s a good thing or they feel their only option is to bike to get to work and maybe
that’s a sign that we need better public transportation. You’d have to go figure out what is
going on, but it’s certainly an outlier.

That was cool, too, because the youngest area is the East Village. I colored it by age and you
have these opposing things going on.
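
For readers curious how a per-station median age could be computed, the following Python sketch groups riders by starting station and takes the median of their ages. It is an illustration under stated assumptions, not the analysis used on the blog: the column names `start_station` and `birth_year`, the `ride_year` parameter, and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Invented sample rows; the column names are assumptions for illustration.
SAMPLE_CSV = """start_station,birth_year
Station X,1970
Station X,1975
Station X,1980
Station Y,1988
Station Y,1990
"""


def median_age_by_station(csv_file, ride_year):
    """Return the median rider age for each starting station."""
    ages = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Age is approximated as ride year minus birth year.
        ages[row["start_station"]].append(ride_year - int(row["birth_year"]))
    return {station: median(values) for station, values in ages.items()}


medians = median_age_by_station(io.StringIO(SAMPLE_CSV), ride_year=2014)
for station, age in sorted(medians.items()):
    print(station, age)
```

Sorting the result by age would surface the oldest and youngest stations, which is the kind of outlier the answer above describes.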

Are you surprised that public agencies have responded?

I put the data out there. I don’t consider myself a journalist. Usually what happens is a
journalist finds the data, reaches out to the agencies, and the agencies respond to them. I will
read that response. This is all a brand new thing for government.

Open data plays two roles. You’re leveraging the power of people who are passionate to find
things. The fact that you can help find issues with streets and have them fixed is maybe useful
to the Department of Transportation. If the Fire Department were to release information about
the times of fires, I bet people would start modeling the likelihood of fires. Maybe the Fire
Department could do that, but there are a bunch of scientists out there who would have a great
time doing it, and the Fire Department could leverage that work for free. On the other hand, it’s
also a bit of a watchdog with transparency and accountability.

There are two sides to the coin. Anytime you point out something, it could go either way. If you
tell the Department of Health that there’s something wrong with the rating system, they could
either say, “Wow, let’s look into that” or they could play defensive. Generally, agencies are
defensive, but there’s also not a good mechanism for them to take in information like this. They
get caught off guard. I hope in the coming years they build in ways to reach out like this. If
there were a liaison I could reach out to, maybe I would go that route. But right now, the only
way to get attention is through the media. Unfortunately, that can create an adversarial
relationship, which I think is the wrong way to look at open data. I really believe that if you
empower people, you’ll get much more out than you’ll get criticism.

Is this happening in other cities?

It’s definitely growing. A lot of cities have open data portals. I think New York is a leader in its
size and scope, but the federal government has open data and the state has a large one as
well. Los Angeles does as well. There are dozens and dozens of cities that have jumped on.

Have you had interest from other people in other cities?

A lot of people have reached out and asked me to do something I’ve done for New York for
their city. If I do something for New York, they ask about Philly. Or if I do something for
Manhattan, someone will ask if I can do it for Brooklyn. There is a demand for this kind of
thing.

People study data sets all over the country, but most studies are these long, in-depth analyses.
You do real reporting, show up somewhere, look around, ask questions, and then write a
report that makes recommendations. I’ve taken the opposite approach, a breadth over depth
approach. I want to dig up as many interesting tidbits as possible and put them out there. I let
the experts dig in. I don’t know much about traffic safety, but I can analyze the most dangerous
places, give that to people who actually know about it, and let them go with it. I’m not going to
sit there and make infrastructural suggestions.

Do you think we’ll see more of what you’re doing? There’s definitely a trend in this type
of data journalism.

I’ve had a lot of people call it “data journalism,” and I’ve never thought about it that way. It
never occurred to me at all to call it that. It’s just analysis.

One of my favorite things was a tongue-in-cheek 4th of July post I did about the best place to
see illegal fireworks. I looked at the number of complaints about illegal fireworks and made a
map. It turns out that Inwood was the best place. An hour or so after I posted it, the Times had
an article about Inwood fireworks. It had nothing to do with my work. They actually had done a
real story. They went there, had the pictures, and had done the reporting. The fact that we had
zeroed in on the same place—me from my couch in a few hours; them in a much more
compelling, interesting, and deep way—was an affirmation that there was value. It was cool to
see.
