(This post is part of News.me’s ongoing series, “Getting the News.” In our efforts to understand everything about social news, we’re reaching out to writers and thinkers we like to ask them how they get their daily news. Read the first post here. See all of the posts, from writers and thinkers like Chris Dixon, Zach Seward, and Megan Garber, here.)
This week we sat down with Hilary Mason, chief scientist at our sister company bitly. Most people know bitly as a link-shortening service, but behind the scenes, bitly has a huge amount of data to work with. Hilary’s job is to play with that data, and we’re always astounded by what she comes up with. A few months ago, when we were writing our post “Watching News Break,” we sent a quick email poll to everyone in the betaworks offices: How did you find out Moammar Qaddafi had been killed? We got back a variety of responses, ranging from the Today Show to “this email,” but Hilary’s quick response was the most surprising. “I actually found out because ‘killed’ and ‘Gaddafi’ were trending through bitly data in our new trends API.” We might work in the same offices, but we did not know what they were up to next door, so we cornered Hilary for a few minutes to talk more about how she gets her news. It’s a different perspective from almost anyone else we’ve talked to: a highly technical and personalized way of sifting through news data. What else did we expect from a data scientist? In addition to working with huge amounts of data, Hilary is a co-founder of HackNY, one of Fortune’s 40 under 40, and a cookie enthusiast.
How do you get your news in the morning?
I have a group of friends who often email around links, and so that’s where I look first. Those links tend to be either breaking news or mostly silly but interesting nerdy stuff on the Internet. After that I’ll usually go to Twitter and page back through to see what’s going on. I have some tools I’ve written on my GitHub page for going through Twitter. Really simple things, like, “Show me any link that’s been tweeted by more than two people I follow in the last 24 hours.” It’s configurable, so I can tag things by category. I’m able to go through all the tweets really quickly, filter out the sports tweets, because I don’t care about that, filter out the celebrity gossip, and then elevate the things that matter most.
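(For readers who want to try the filter Hilary describes, here is a minimal Python sketch of the idea. It is not her code. It assumes the tweets have already been fetched and reduced to hypothetical `(user, url, posted_at)` tuples, and the function name and parameters are our own.)

```python
from collections import defaultdict
from datetime import datetime, timedelta


def popular_links(tweets, now, min_sharers=3, window=timedelta(hours=24)):
    """Return links shared by at least `min_sharers` distinct users within `window`.

    `tweets` is an iterable of (user, url, posted_at) tuples. This shape is an
    assumption made for the sketch, not the format of any real Twitter API.
    """
    sharers = defaultdict(set)
    for user, url, posted_at in tweets:
        age = now - posted_at
        # Keep only tweets posted inside the time window (and not in the future).
        if timedelta(0) <= age <= window:
            sharers[url].add(user)
    # "More than two people" means three or more distinct sharers.
    return sorted(url for url, users in sharers.items() if len(users) >= min_sharers)
```

(Counting distinct users rather than raw tweets means one person tweeting the same link repeatedly does not push it over the threshold.)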
How does your hack categorize the tweets?
There’s a trained classifier based on labeled data that I’ve given it over time. Some of the categories are pretty standard, like “sports” or “technology.” I also have a “narcissism” category. It finds things like people using the words “I,” “my,” or “mine” in the same tweet, which catches people constantly promoting their own blog posts.
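(A rough illustration of the “narcissism” signal, in Python. Hilary’s tool uses a trained classifier; the keyword rule below is only a stand-in for that idea, and the threshold of two first-person words is our assumption.)

```python
import re

# First-person words mentioned in the interview.
FIRST_PERSON = {"i", "my", "mine"}


def first_person_count(text):
    """Count occurrences of "I", "my", or "mine" in a tweet, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for word in words if word in FIRST_PERSON)


def looks_narcissistic(text, threshold=2):
    """Flag a tweet that uses at least `threshold` first-person words.

    The threshold is a guess for illustration; a trained classifier would
    learn its own weighting from labeled examples.
    """
    return first_person_count(text) >= threshold
```

(A real classifier would weigh many more features than these three words, but the rule shows the kind of signal involved.)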
That’s brilliant, actually.
It’s more of a hack for dealing with the noise and the full stream of data. And then I do get a couple of emails, like the News.me email, that I really like for finding big things I might have missed.
I also have a script that reads the list of birthdays off my Facebook feed and automatically writes the birthday emails for me, which is just a hack. I feel like leaving the “happy birthday” comment gets you partial credit, but when you write the email, you get full credit. But I can still automate it off the same data source.
Your group of friends’ system is really intriguing. We know people mostly share news via email, but sharing breaking news is an interesting phenomenon. What does that typically look like?
It’s usually a link with a “Wow, have you seen this?” or “Are you following this story that happened?” One of the things we look at through bitly is how an idea can jump from a comment, to a blog post, to a blog, to a mainstream news source. It’s fun to see when people gather the pieces together on their own. It might be — “Did you see that this GitHub project has had a new push that allows you to do….” whatever. And then someone else will say, “Oh yeah, there’s an article about it over here.”
What were you looking at on the day of the Qaddafi assassination?
It’s not currently demo-able, which is a shame, but one of the systems we’re working on is something designed to tell us what the world is paying attention to at any given moment — and not at the link level, but at the idea level, where we consider ideas to be collections of phrases. We’re measuring the click-rate on any given phrase, so we can tell you that “Jennifer Lopez,” for example, gets a typical 0.01 clicks/second click-rate, but when there was a potential dress malfunction at the Oscars, that went up to over 20 clicks/second. By watching this you’re able to see whenever something happens that gets enough attention from people that they’re actually clicking links about it. With Qaddafi, what I saw was what we call a “burst” in attention to that phrase.
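(To make the “burst” idea concrete, here is a small Python sketch that compares a phrase’s current click rate to its typical rate. It is not bitly’s system. The function names and the tenfold threshold are our assumptions, and the example rates are the ones Hilary quotes above.)

```python
def click_rate(clicks, seconds):
    """Clicks per second over an observation window."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return clicks / seconds


def is_burst(current_rate, baseline_rate, ratio=10.0):
    """Return True when the current rate is at least `ratio` times the baseline.

    `ratio` is an illustrative threshold, not a value from the interview.
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return current_rate >= ratio * baseline_rate


# Rates quoted in the interview: 0.01 clicks/second normally, over 20 during the Oscars.
typical = 0.01
during_oscars = 20.0
print(is_burst(during_oscars, typical))
```

(A production system would presumably also need to estimate the baseline per phrase and smooth out noise, which this sketch leaves out.)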
Then what, do you Google the phrase?
Then I can go back and see through bitly which URLs are leading to that burst. The difference is that bitly is what people are paying attention to, and Google is the whole Internet. If you Google a restaurant’s name in New York, you’d find that restaurant’s homepage. If that restaurant happens to be on fire, with Google you’d still see that restaurant’s homepage, but through our data you’d see all the people saying “Oh god, this place is on fire!”
Who on Twitter do you find particularly valuable or interesting?
I follow a lot of people in the data and machine-learning community, which is something I’m pretty involved and interested in. I want to know all the interesting things that happen in that community, whether it’s a new code release or somebody getting a new job. I follow that very deeply, and then I follow people who tend to retweet things that show up on PandoDaily, Mashable, or ReadWriteWeb — but I don’t really care to know everything in that sphere, I just want to know when big things happen.
I also follow NewYorkology, which is a great account if you live in New York. The woman who runs it is always tweeting beautiful photos of New York City, events that are happening, museum exhibits that are opening, subway service changes. I also follow museums that I like, like the American Museum of Natural History (@amnh). And I follow @WNYC for local news.
Do you ever watch local television news?
Well, I don’t have a TV. I do have an Xbox… but that doesn’t count.
What platforms do you use to get your news content?
Mostly my iPhone and my laptop. I have a Kindle, but I use that for reading books.
I still use Google Reader a lot. But it depends on whether I’m waking up and catching up on the news, or whether I’ve been programming for an hour and I need to take a break. The RSS reader tends to come into the latter piece, where I’m sitting at a computer and I don’t want to look at code or email for a while. I just want my brain to be distracted. So there I follow the same mix of academic data blogs and tech blogs as well as random things that are entertaining or interesting, like Boing Boing.
There are no news sites whose homepage I feel like I have to check. I tend to use CNN as my default domain when I’m trying to get on free WiFi, because it’s short to type, and it’s not an https domain, so it’ll get me through the authentication process quickly. Every so often I’ll find something interesting there, but that’s mostly an accident.
What was the last great article you read?
Last night I read a fascinating story on NPR’s site. It was about this near-extinct species of insect. I don’t have to tell you the whole story. Okay, I’ll tell you the whole story. They call them tree lobsters, and they’re huge, and they have hideous legs. They used to live on this tiny island off of Australia that houses maybe a couple hundred people. A hundred years ago a boat was shipwrecked on the island, and a bunch of rats came off the boat and ate all these insects. So they were thought to be extinct. But researchers just discovered some living on a huge rock about a mile away, under one bush. The 24 remaining tree lobsters in the world.
They managed to take a few away from that rock and breed them in a zoo, and now there are hundreds of these things, so they’re contemplating: do they keep them all in captivity, or do they go to this little island, kill off all the rats, and try to reintroduce these bugs? They’re doing a public service campaign to convince people these bugs are more desirable than rats. So they made a video of one of them hatching out of an egg that is supposed to be cute but is horrible. It’s not the kind of thing you want to read right before you go to sleep. Which is probably why it made quite an impression on me.
It’s not breaking news, really, but I really like this type of story because it teaches you something remarkable about the world.
You’ve built your own tools to manage your news consumption. Are there any other tools you wish you had?
There are two reasons to read the news. One is so you’re not missing out on something you need to know to be successful in the society in which you interact. The other is to find these delightful, intriguing stories about the world. For the former, applications like News.me are great examples of things that are sort of inching towards that superpower of ambient awareness of what’s happening in the world, without having to invest too much energy in searching it out every day. But I don’t think the problem’s solved yet.
If we take a step back, there’s this universe of data that’s happening around us, and some of it is really relevant to the things we need to know to do our jobs or the things we’re really interested in. The problem is then — out of that whole universe of data, how do you find the things you need to know at the time you need to know them in a way that is least intrusive into your life?
How are we doing?
(laughs) We’re getting there.
(All interviews conducted by Sonia Saraiya.)