Craig Silverman has dedicated his life’s work to researching and analyzing rumors in the media and on the Internet. A fellow at Columbia University’s Tow Center for Digital Media, Silverman has published a book, Regret the Error, and an eponymous blog, examining media errors and corrections, and trends regarding veracity and verification.
Now he is taking his passion for accuracy online in a deeper way with an ambitious data-driven startup, Emergent.info. The site sorts through myriad online rumors to discern which are true, which are false and which fall into that unverifiable “gray zone.”
How does Emergent work? Silverman and an assistant scour the Web, searching social media and news sites before entering rumors or claims that require vetting into a database. An algorithm then scans URLs of related news stories, blogs, social media posts and other data to monitor whether the content in question has been changed, updated, corrected or verified. Final decisions on the status of any given claim are always made by a real person.
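The monitoring step described above can be illustrated with a small sketch. This is not Emergent's actual code, which the article does not show; it is a minimal example assuming that change detection works by comparing fingerprints of successive snapshots of a page. The names `fingerprint`, `TrackedArticle` and `record_snapshot` are hypothetical, and fetching the page text is deliberately left out.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing it,
    so that trivial formatting differences do not count as a change."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class TrackedArticle:
    """One URL being watched for a given rumor (hypothetical structure)."""
    url: str
    last_fingerprint: Optional[str] = None
    history: List[str] = field(default_factory=list)

    def record_snapshot(self, text: str) -> bool:
        """Store a new snapshot; return True if the content differs
        from the previous snapshot. The first snapshot returns False."""
        new = fingerprint(text)
        changed = self.last_fingerprint is not None and new != self.last_fingerprint
        self.last_fingerprint = new
        self.history.append(new)
        return changed


# Usage: flag an article for human review when its text changes.
article = TrackedArticle("https://example.com/story")
first = article.record_snapshot("Pop star is moving to Atlanta.")
same = article.record_snapshot("Pop star  is moving to Atlanta. ")
updated = article.record_snapshot("UPDATE: The report was a hoax.")
```

In a design like this, a detected change would only queue the article for a person to look at, in keeping with the rule that a human makes the final call on every claim.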
Emergent differs from other rumor-debunking sites like Snopes, which Silverman greatly respects, in that it uses a data-driven approach to analyzing rumors and claims. This higher level of automation, always backed up by human decision, allows Emergent to process far more information than its competitors.
B2B NN recently spoke with Silverman about the focus and direction of Emergent.
B2B NN: Do you envision Emergent growing to become something akin to PolitiFact, perhaps one day becoming the Internet’s go-to source for debunking or confirming rumors?
CS: Yes, that’s definitely what I’m working toward. The challenge is the huge amount of information that is out there online. A lot of it is confusing, incorrect, incomplete and often lacks the right context for people to be able to understand it. The core idea [of Emergent] is offering a smart filter that’s biased toward determining veracity.
The other thing I’ll mention is that Emergent is a data-driven initiative, so we’re collecting data that we believe will also enable us to deliver other kinds of products based on the data we’re bringing in about media credibility, source credibility and other things.
B2B NN: Do you think that your data-driven approach allows you to filter more information, be more accurate with a larger volume of information?
CS: Yes. Snopes is incredible. They’ve been around for 10, 20 years and have a huge amount of information. But the question is, how do you scale? If you look at PolitiFact and Snopes, the way that they deliver their verdicts is through written articles that might run several hundred words. There are only so many of those that you can churn out.
So for me, the idea is, how do we deal with the scale of the problem, and also how do we go after things that are in the middle ground? There’s a lot of stuff out there that isn’t yet true or isn’t yet false, it’s in the gray zone, and so how do you communicate that level of uncertainty in a way that actually helps people understand what’s going on? And gathering the data about what’s being said about a given claim or rumor is a way to get into the conversation and show what’s out there right now and show what evidence exists without having to rewrite the article every hour.
B2B NN: Speaking of those gray areas, what about controversial topics for which the answers aren’t easily classified as true, false or unverified?
CS: The approach with things like that is, the first thing we do is identify a factual claim. Otherwise it’s not something we can look at. Is somebody a jerk? That’s not really a factual claim that we can look at. Is there a well-accepted definition of genocide, for example? If there is, and you can gather evidence to support that it is happening, then potentially that’s a claim that can be checked. But at the core of it, we try to find really small atomic units that are factually verifiable.
B2B NN: There are high-profile hoaxes that can fool even the most esteemed corporate mainstream media outlets, who sometimes run with stories that turn out to be utter bunk. This year’s hoax of Justin Bieber moving to Atlanta comes to mind. Are you surprised that mainstream media outlets are so often so easily bamboozled?
CS: For 10 years I’ve been looking at errors and corrections in the media. And so for me, I certainly wouldn’t say that I’m surprised to see that news organizations pick up stuff that’s dubious and make mistakes. That’s been happening forever. To a certain extent there’s [much more] information out there now, news organizations are trying to find content that’s going to increase traffic and stimulate engagement, so the incentive is there for them to jump on stuff quickly and get it out there so they can capture attention.
And you see that play out again and again, like with the story about the woman with three breasts, where news organizations jumped on it but they didn’t really treat it with the skepticism that they should have. They treat it as if it were true, and then a day or two later it comes out that it was fake, and what we’ve seen in the data we’ve been collecting is that unfortunately news organizations will not come back to a story once it’s been debunked.
B2B NN: Do you envision any B2B applications for Emergent?
CS: One piece of this is giving people a definitive resource to check stuff that’s out there and spreading. The other thing that’s important is that as we gather more data about news organizations, about individuals on Twitter, about companies, about sources and about others, the goal is to get to the point where we can actually do projections and potentially credibility ratings to say, this rumor is out there and based on the entities involved in it, we think it’s 70 percent likely to turn out to be false.
We believe that data feed will be of interest to people in the financial world. We believe that having credibility ratings for media sources will be of interest to media databases, and so the idea is that the data we have can certainly power our website, and it would also be used in other products that we would build, or that we would work with others on, to create a layer of credibility and information prediction to add value to what they have. That’s the vision for the business. Creating a public resource and building up as big an audience around that as possible, but also monetizing the data we’re gathering and building products from that.