There’s an old proverb that says “seeing is believing.” But in the age of artificial intelligence, it’s becoming increasingly difficult to take anything at face value—literally.
The rise of so-called “deepfakes,” in which different types of AI-based techniques are used to manipulate video content, has reached the point where Congress held its first hearing last month on the potential abuses of the technology. The congressional investigation coincided with the release of a doctored video of Facebook CEO Mark Zuckerberg delivering what appeared to be a sinister speech.
Scientists are scrambling for ways to combat deepfakes, even as others continue to refine the underlying techniques for less nefarious purposes, such as automating video content for the film industry.
At one end of the spectrum, for example, researchers at New York University’s Tandon School of Engineering have proposed implanting a type of digital watermark using a neural network that can spot manipulated photos and videos.
The idea is to embed the system directly into a digital camera. Many smartphone cameras and other digital devices already use AI to boost image quality and make other corrections. The authors of the study out of NYU say their prototype platform increased the chances of detecting manipulation from about 45 percent to more than 90 percent without sacrificing image quality.
On the other hand, researchers at Carnegie Mellon University recently hit on a technique for automatically and rapidly converting large amounts of video content from one source into the style of another. In one example, the scientists transferred the facial expressions of comedian John Oliver onto the bespectacled face of late night show host Stephen Colbert.
The CMU team says the method could be a boon to the movie industry, such as by converting black and white films to color, though it also conceded that the technology could be used to develop deepfakes.
Words Matter with Fake News
While the current spotlight is on combating video and image manipulation, prolonged trench warfare against fake news is being waged by academia, nonprofits, and the tech industry.
This isn’t the “fake news” that some invoke as a knee-jerk reaction to fact-based reporting that is less than flattering to its subject. Rather, fake news is deliberately created misinformation that is spread via the internet.
In a recent Pew Research Center poll, Americans said fake news is a bigger problem than violent crime, racism, and terrorism. Fortunately, many of the linguistic tools that have been applied to determine when people are being deliberately deceitful can be baked into algorithms for spotting fake news.
That’s the approach taken by a team at the University of Michigan (U-M) to develop an algorithm that was better than humans at identifying fake news—76 percent versus 70 percent—by focusing on linguistic cues like grammatical structure, word choice, and punctuation.
For example, fake news tends to be filled with hyperbole and exaggeration, using terms like “overwhelming” or “extraordinary.”
“I think that’s a way to make up for the fact that the news is not quite true, so trying to compensate with the language that’s being used,” Rada Mihalcea, a computer science and engineering professor at U-M, told Singularity Hub.
The paper “Automatic Detection of Fake News” was based on the team’s previous studies on how people lie in general, without necessarily having the intention of spreading fake news, she said.
“Deception is a complicated and complex phenomenon that requires brain power,” Mihalcea noted. “That often results in simpler language, where you have shorter sentences or shorter documents.”
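The U-M team’s exact feature set and model are not spelled out here, but the general idea of scoring text on linguistic cues like hyperbole, sentence length, and punctuation can be sketched as follows. The hyperbole word list and the specific features below are illustrative assumptions, not the team’s actual implementation.

```python
import re
import statistics

# Illustrative only: a tiny sample of intensifiers like those the article
# cites ("overwhelming", "extraordinary"); a real system would use a much
# larger lexicon learned from labeled data.
HYPERBOLE = {"overwhelming", "extraordinary", "unbelievable", "shocking"}

def linguistic_features(text):
    """Extract simple cues associated with deceptive writing."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-zA-Z']+", text.lower())
    return {
        # Fake news tends to over-use intensifiers and exaggeration.
        "hyperbole_rate": sum(w in HYPERBOLE for w in words) / max(len(words), 1),
        # Deception "often results in simpler language ... shorter sentences".
        "avg_sentence_len": (statistics.mean(len(s.split()) for s in sentences)
                             if sentences else 0.0),
        # Punctuation is another cue; heavy exclamation use is a common one.
        "exclamations_per_sentence": text.count("!") / max(len(sentences), 1),
    }

print(linguistic_features("Overwhelming proof! Shocking truth! They lied!"))
```

In practice, feature vectors like these would be fed to a standard classifier trained on labeled real and fake articles, which is how an algorithm can reach the accuracy figures the article reports without any understanding of the claims themselves.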
AI Versus AI
While most fake news is still churned out by humans with identifiable patterns of lying, according to Mihalcea, other researchers are already anticipating how to detect misinformation manufactured by machines.
A group led by Yejin Choi, with the Allen Institute for Artificial Intelligence and the University of Washington in Seattle, is one such team. The researchers recently introduced the world to Grover, an AI platform that is particularly good at catching autonomously-generated fake news because it’s equally good at creating it.
“This is due to a finding that is perhaps counterintuitive: strong generators for neural fake news are themselves strong detectors of it,” wrote Rowan Zellers, a PhD student and team member, in a Medium blog post. “A generator of fake news will be most familiar with its own peculiarities, such as using overly common or predictable words, as well as the peculiarities of similar generators.”
The team found that the best current discriminators can distinguish neural fake news from real, human-created text with 73 percent accuracy. Grover clocks in with 92 percent accuracy based on a training set of 5,000 neural network-generated fake news samples. Zellers wrote that Grover got better at scale, identifying 97.5 percent of made-up machine mumbo jumbo when trained on 80,000 articles.
It performed almost as well against fake news created by a powerful new text-generation system called GPT-2 built by OpenAI, a nonprofit research lab founded by Elon Musk, classifying 96.1 percent of the machine-written articles.
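Grover itself is a large neural model, but the counterintuitive “strong generators are strong detectors” finding can be illustrated with a toy stand-in: a generator assigns high probability to text that shares its own statistical quirks, so its own likelihood score can flag machine-written text. The word-bigram model below is an invented miniature, not Grover’s architecture; it only demonstrates the principle.

```python
from collections import Counter, defaultdict
import math

def train_bigram(corpus_words):
    """A trivial 'generator': word-bigram counts over its training text."""
    counts = defaultdict(Counter)
    for a, b in zip(corpus_words, corpus_words[1:]):
        counts[a][b] += 1
    return counts

def avg_log_prob(model, words, alpha=1.0, vocab=1000):
    """Average smoothed log-probability of a word sequence under the model.
    Higher means the text looks more like the generator's own output."""
    total = 0.0
    for a, b in zip(words, words[1:]):
        c = model[a]
        total += math.log((c[b] + alpha) / (sum(c.values()) + alpha * vocab))
    return total / max(len(words) - 1, 1)

# A repetitive "machine-style" corpus with obvious peculiarities.
machine_style = "the model said the model said the model said".split()
model = train_bigram(machine_style)

# Text matching the generator's own quirks scores higher than novel text,
# which is the detection signal: suspiciously predictable text gets flagged.
print(avg_log_prob(model, "the model said".split()) >
      avg_log_prob(model, "reporters verified the claim".split()))
```

This is the same intuition Zellers describes: a generator is “most familiar with its own peculiarities, such as using overly common or predictable words,” so unusually high likelihood under a strong generator is evidence that a machine wrote the text.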
OpenAI so feared that the platform could be abused that it has released only limited versions of the software. The public can play with a scaled-down version posted by a machine learning engineer named Adam King, where the user types in a short prompt and GPT-2 bangs out a short story or poem based on the snippet of text.
No Silver AI Bullet
While real progress is being made against fake news, the challenges of using AI to detect and correct misinformation are abundant, according to Hugo Williams, outreach manager for Logically, a UK-based startup that is developing different detectors using elements of deep learning and natural language processing, among others. He explained that the Logically models analyze information based on a three-pronged approach.
Publisher metadata: Is the article from a known, reliable, and trustworthy publisher with a history of credible journalism?
Network behavior: Is the article proliferating through social platforms and networks in ways typically associated with misinformation?
Content: The AI scans articles for hundreds of known indicators typically found in misinformation.
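The three-pronged approach above can be sketched as a scoring pipeline that routes low-scoring articles to a human reviewer, matching the “human layer in the pipeline” Williams describes. The weights, threshold, and signal names below are invented for illustration; Logically’s actual models and indicators are not public in this article.

```python
# Hypothetical sketch of a three-signal credibility pipeline: publisher
# metadata, network behavior, and content indicators, each pre-scored
# in [0, 1] where 1.0 means "looks reliable". Weights are assumptions.

def credibility_score(publisher_score, network_score, content_score):
    """Weighted combination of the three signals."""
    weights = {"publisher": 0.4, "network": 0.3, "content": 0.3}
    return (weights["publisher"] * publisher_score
            + weights["network"] * network_score
            + weights["content"] * content_score)

def triage(article_signals, threshold=0.5):
    """No single algorithm decides; borderline or low-scoring articles
    are escalated to a human fact-checker."""
    score = credibility_score(**article_signals)
    return "auto-pass" if score >= threshold else "human review"

print(triage({"publisher_score": 0.9,
              "network_score": 0.8,
              "content_score": 0.7}))
```

The design choice worth noting is the escalation step: as Williams says below, even a decent combination of algorithms only yields indications of unreliability, so the pipeline ends with a person rather than a verdict.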
“There is no single algorithm which is capable of doing this,” Williams wrote in an email to Singularity Hub. “Even when you have a collection of different algorithms which—when combined—can give you relatively decent indications of what is unreliable or outright false, there will always need to be a human layer in the pipeline.”
The company released a consumer app in India back in February, just before that country’s election cycle, which proved a “great testing ground” for refining its technology ahead of the next app release, scheduled for the UK later this year. Users can submit articles for further scrutiny by a real person.
“We see our technology not as replacing traditional verification work, but as a method of simplifying and streamlining a very manual process,” Williams said. “In doing so, we’re able to publish more fact checks at a far quicker pace than other organizations.”
“With heightened analysis and the addition of more contextual information around the stories that our users are reading, we are not telling our users what they should or should not believe, but encouraging critical thinking based upon reliable, credible, and verified content,” he added.
AI may never be able to detect fake news entirely on its own, but it can help us be smarter about what we read on the internet.
New threats to effective scientific communication make it more difficult to separate science from science fiction. Patients can be harmed by misinformation or by misplaced trust; for example, patients with cancer using complementary medicine are more likely than patients not using it to refuse evidence-based therapies and have higher mortality.1 Researchers who produce objective science can no longer focus on simply disseminating the message. Now they must also defend that evidence from challenges to the validity and interpretation of their research and, at times, be proactive to ensure that unsubstantiated messages do not compete with the correct message. For instance, the unfounded, and yet persistent, beliefs linking autism with vaccination demonstrate both the health dangers of misinformation and the effort required to counteract that misinformation. The adversarial stance seems destined to decrease trust in the scientific enterprise, but the alternatives seem worse.
Three related factors contribute to current circumstances. One is the rapid decrease in the cost of publishing information. When getting information to the public was expensive, communication could come only from governments or highly resourced private interests. Communication also came from publishing houses that developed editorial processes to protect the value of their capital investments. These organizations could communicate correct or incorrect information as they saw fit, but there were fewer of them, and they were typically identifiable, which made their biases easier to understand. Now, anyone can Tweet or post on Facebook. Social media is indeed democratizing, but its novel dynamics allow strategic content to infiltrate trusted social networks, posing and propagating as influential commentary.
Second is the increasing ability to select what information is heard. When the public was restricted to the local newspaper or radio station, everyone heard the same thing. The powerful urge to favor information that confirms prior views paired with a new ability to filter out the alternative creates the echo chamber of contemporary media. Twitter accounts presumed to be bots have generated positive online sentiment about the use of e-cigarettes.2 Clinicians and scientists are also vulnerable, with the increased ability to selectively expose themselves to confirming evidence.
Third, and more recently, is that the ubiquity of misinformation has created a tool to perpetuate it. Opponents of the content of a report or a message need only decry it as “fake news” to invoke a conspiracy against that content. This single phrase almost seems to initiate an anamnestic response among those disinclined to accept or believe the content, automating cascades of disbelief and dismissal. Misinformation has no constraints and can be strategically designed for spread. For instance, false information about the Zika pandemic had greater uptake than accurate posts.3
Social media has created an unprecedented ability to spread sentiment and exert influence. Individuals exposed to fewer positive expressions on social media are then more likely to post fewer positive expressions on social media.4 The world has been alarmed at revelations of the politically motivated release of misinformation through social media channels and the reach that information has achieved. Science and health are just as vulnerable to strategic manipulation.
How can scientists and institutions that communicate scientific information anticipate and respond to these threats to their value? What countermeasures can they deploy?
When accounts surfaced about the use of Facebook to influence political thought, the evidence was largely from 2 avenues: either by revealing the identity and motives of groups who have used these media strategically or by revealing the provenance of specific messages that have propagated through it. The ability to credit information from scientific journals, and, in turn, to discredit information without such sources, is perhaps the most conventional countermeasure to misinformation. Information from journals usually comes with explicit identification of sources and their conflicts of interest, and is curated through peer review.
Although each of these steps occasionally fails, journals offer provenance structurally designed for the precise purpose of separating fiction from nonfiction and helping readers understand the difference. Because of this critical role, journals may be the ally in greatest need of support. If the peer review process did not exist, one of the first actions scientists would take to counteract misinformation would be to invent it. Thus, it is surprising that some scientists are now embracing preprint publication that eliminates many of the protections between the creation of information and its dissemination.5 Science may benefit more from strengthening the reality and perception of its review than sacrificing these factors for the sake of speed.
Scientists engaging thoughtfully on social media is important but incomplete. Uncoordinated efforts of individual scientists cannot take on resourced interests with fleets of bots. The bot capable of reaching millions will generate more messages and activity than the researcher with 1000 followers every time. What is needed is a campaign, engaging the platforms that patients use. In some cases, fake news could be seen as a teachable moment and an opportunity for researchers to clarify scientific findings. Significant occurrences of misinformation may require stronger responses. Recently, the task has required defending good research from attack. The more aggressive stance is disabling misinformers. However, because moving down that slope puts credibility at risk, evidence-based organizations that trade most on that credibility must consider those risks.
One element that makes misinformation so potent is that it can target those who are most receptive to the information. Precision marketing recapitulates precision medicine. When individuals share their symptoms, diet, medication usage, and medical histories, they leave enough digital residue to define a targetable persona. Facebook posts can be used to predict a diagnosis of depression.6 Because there is an increased focus in research to return findings to patients,7 there could also be a concerted focus to assist patient access to the information underlying these personas and how those personas may distort their world view.
Evocative stories are typically far more emotionally persuasive than multiple tables reporting systematic findings. In experimental settings, participants randomized to read about a person who is experiencing poverty donated more money than those randomized to read ostensibly more systematic and objective statistics reflecting the broad extent of that poverty; however, more concerning is that participants randomized to read both donated an amount intermediate to the other 2 groups.8 The lesson is not simply that evocative anecdotes are more emotionally persuasive than systematic data, but that adding data may actually weaken an emotional appeal rather than strengthen it.
Social media is leaving peer-reviewed communication behind as some scientists begin to worry less about their citation index (which takes years to develop) and more about their Twitter response (measurable in hours). Science is not supposed to be a popularity contest and yet humans delight in competitive rankings. Published college rankings have used more dimensionalized criteria to unseat what are, literally, old schools. At the same time, the organizations that produce such rankings may have merely substituted their own metrics to elevate themselves rather than the cause of higher education. Some journals now link to aggregators like Altmetric, which report Tweets about articles with the immediacy of stock tickers. The appeal is irresistible: Altmetric ratings deliver fame in 15-minute doses. Like the college rankings, these alternative metrics broaden the understanding of the value of a scientific contribution. One approach is to develop additional indices that offer immediacy and yet are not so subject to flights of fancy.
Scientific information and misinformation are amplified through social media. As those channels become vulnerabilities for scientific integrity, there are opportunities to develop countermeasures and specific strategies for vigilance and response.
A decline in public trust in physicians and the rise of misinformation and “fake news” spread via the Internet have led to an increase in the use of unproven, unconventional treatments by cancer patients, claims an editorial that says efforts to communicate genuine medical advances need to be redoubled.
The editorial, published in the September issue of the Lancet Oncology, says that the “collision” between greater patient autonomy, falling trust, and the rise in social media has led to an increase in self-diagnosis and the use of alternative therapies by cancer patients.
This, it warns, may result in patients refusing conventional, proven therapies and increase their risk for death compared with patients who follow recommended treatment regimens, as previously reported by Medscape Medical News.
The editorial urges all those working in the oncology world to tackle the “disinformation and…lies” that are spread across social media, news platforms, and marketing channels by focusing on the communication of accurate information.
It points to a National Institutes of Health website that aims to help users evaluate health information on the Internet, as well as the recent hiring of a digital nurse by UK charity Macmillan Cancer Support to debunk fake news via a question-and-answer service.
There is nothing new in having to deal with fake news and misinformation in oncology, says Martin Ledwick, head information nurse at the leading charity Cancer Research UK. “This is something that has been around for a long time, and as a charity, our position is to challenge where there isn’t a decent evidence base for a treatment that’s being promoted,” he told Medscape Medical News.
Ledwick explained that Cancer Research UK set up its online forum around 10 years ago partly because “we could see when we looked around at the time that there wasn’t really a properly moderated forum out there for cancer patients.
“You could see that people were having suggestions made to them about alternative therapies and things like that, and no one was really picking that up,” he commented.
Ledwick noted that Cancer Research UK’s science blog also “spends a lot of time debunking myths and reinforcing the value of proper evidence base before people make decisions.”
Their team of nurses staffing their helpline “respond to a lot of inquiries from people who have heard about something perhaps through the Internet and want to explore it but haven’t understood that it’s not as good as it looks,” he said.
Ledwick believes that the growth of the Internet in recent years has created “much more of a platform for ideas to spread” and that that has led to a shift in the nature of the inquiries they receive.
He said that complementary treatments are “usually pretty harmless.” As long as the people offering them “are not overclaiming about them,” there is no problem with people taking them, he said.
“What gets tricky is when you’ve got someone saying this will definitely work and getting people signing up to it for that reason,” he warns.
Another problem, Ledwick noted, is when patients think they can use alternative treatments instead of conventional therapies.
“That, I think, is a worry,” he said, “that sometimes people think, well, maybe I will put off having conventional treatment and see if this works first, because, to be honest, it won’t.
“If it’s not a scientifically based, properly researched therapy, you’re basing your choices in hearsay and anecdotes rather than proper evidence,” he warns.
Decline in Trust in Health Professionals
The editorial notes that a “major challenge” in oncology today is a decline in trust by the lay public in professional opinion, at the center of which is “a collision between personal autonomy, specious journalism, social media, widespread disinformation, and political marginalisation.”
The editorial states that together, these factors undermine the standing of science and academic endeavor, which, in oncology, has led to self-diagnosis and patients’ “demand for specific treatments irrespective of their doctors’ advice.”
Patients are also turning to “alternative unproven therapies,” and clinicians are practicing what has been termed “defensive medicine” to avoid lawsuits, primarily through the overuse of diagnostic tests.
The editorial points to two studies, one published earlier this year and one in 2017, that show that cancer patients who use complementary medicine are more likely to refuse surgery, radiotherapy, and chemotherapy and are more than twice as likely to die as those who receive conventional medicine.
“How has society got to this point, where unproven interventions are being chosen in preference to evidence-based, effective treatments?” the editorial asks.
“Unfortunately, disinformation and — frankly — lies are propagated widely and with the same magnitude as verified evidence due to the ease with which social media, ubiquitous online news platforms, and disreputable marketing exercises can populate information channels, which often do not have sufficient funding to employ subject-specific journalists to weed out facts from fiction,” it comments.
The editorial asserts that to tackle this problem and to stem the decline in public trust, greater efforts need to be made to communicate medical advances accurately to both patients and the lay public “to ensure genuine knowledge can be separated from false material.”
It adds that oncologists need to be better protected from “spurious legal proceedings, bureaucracy, and unnecessary stresses.
“If these challenges are not addressed soon, the great advances in science and medicine that have markedly improved human health worldwide could be easily undone and society will come to regret such inaction and reliance on unreliable sources of information,” it concludes.
This week, more than a dozen high-profile social scientists and legal scholars charged their profession to help fix democracy by studying the crisis of fake news. Their call to action, published in Science, was notable for listing all that researchers still do not know about the phenomenon. How common is fake news, how does it work, and what can online platforms do to defang it? “There are surprisingly few scientific answers to these basic questions,” the authors write.
But just as notable as their admission was the language used to make it. I was surprised to find this group of scholars using the term fake news at all—even though they were calling for research into fake news.
That may sound odd. How can you study something and not call it by its name? Yet over the past year, academics and tech companies have increasingly shied away from the phrase. Facebook has pushed an alternative term, false news. And some scholars have worried that by using the term, they amplify President Trump’s penchant for calling all negative media coverage of himself “fake.”
The authors of the Science essay—who include Cass Sunstein, a Harvard Law School professor and former Obama administration official, and Duncan Watts, a social scientist at Microsoft Research—argue that avoiding the term distorts the issue. Fake news refers to a distinct phenomenon with a specific name, they say, and we should just use that name (fake news) to talk about that problem (fake news). “We can’t shy away from phrases because they’ve been somehow weaponized. We have to stick to our guns and say there is a real phenomenon here,” said David Lazer, one of the authors of the essay and a professor of political science and computer science at Northeastern University.
“We think it’s a phrase that should sometimes be used,” he told me. “We define it in a very particular way. It’s content that is being put out there that has all the dressings of something that looks legitimate. It’s not just something that is false—it’s something that is manufactured to hide the fact that it is false.”
Facebook now almost exclusively uses the term false news to talk about fake news. First Draft, a nonprofit research group within Harvard University, also prefers false news, arguing that fake news fails to capture the scope of the misinformation problem online. (Claire Wardle, First Draft’s director of research, goes so far as to call it “f-asterisk-asterisk-asterisk news.”) But Lazer rejected this phrase as imprecise. Not all false news, he said, is fake.
“I’m sure The Atlantic has sometimes gotten things wrong and published incorrect reporting,” he told me. “Those reports may be false, but I wouldn’t call them fake. For fake news, the incorrect nature of it is a feature, not a bug. Whereas when The Atlantic publishes something that’s incorrect, it’s a bug.”
“The term fake news, describing this problem, has been around for a long time,” he added. “There’s a wonderful Harper’s article about the role of fake news and how information technology is rapidly spreading fake news around the world. It used that term, and it was published in 1925.”
None of the political scientists endorsed President Trump’s tack of calling almost any news coverage he dislikes fake news. “We see that usage getting picked up by authoritarian types around the world,” Lazer said. But he does hope that by using the eye-grabbing term, scholars can reinforce the idea that there is something wrong with the information ecosystem, even though “it may not be the pathology that Donald Trump wants you to believe in.”
Just saying fake news won’t make the pathology go away, though. Nor is fake news the internet’s only truth affliction. “I think there’s a whole menagerie of animals in the false-information zoo,” Lazer told me. They include rumors, hoaxes, outright lies, and disinformation from foreign governments or hostile entities. “It’s clearly the case that there was a coordinated Russian campaign around disinformation, but that’s another animal in the zoo,” he said.
Yet no research has pointed to effective ways of reducing the spread of falsehoods online. Some still-unpublished studies have suggested that labeling fake news as such on Facebook could cause more people to share it. The same goes for relying on fact-checking sites like Snopes and Politifact. “Despite the apparent elegance of fact checking, the science supporting its efficacy is, at best, mixed,” say the authors.
At times, seeing a fact-checked rumor may cause people to remember the rumor itself as true. “People tend to remember information, or how they feel about it, while forgetting the context within which they encountered it,” they write. “There is thus a risk that repeating false information, even in a fact-checking context, may increase an individual’s likelihood of accepting it as true.”
“People are not going to fact-check every sort of information they come across online,” said Brendan Nyhan, a professor of government at Dartmouth College and one of the authors of the recent Science essay. “So we have to help them make better decisions and more accurately evaluate the information they encounter.” The fight against misinformation is two-fold, he told me. First, powerful individuals and popular Twitter users have to lead the fight against fake news and bad information.
“Research has found that people who are important nodes in the network play an important role in dissemination,” especially on Twitter, Nyhan told me. “Stories are being refracted through these big hubs. And I’m not a big hub, but I think it’s important to practice what I preach.”
Nyhan, who has about 65,000 Twitter followers, tries to correct incorrect information that he’s tweeted as quickly as possible, and he also tries to courteously notify other users when they’ve been tricked by unreliable information.
“We will all inadvertently share false or misleading information—that’s part of being online in 2018,” said Nyhan. “But I think we’ve seen people in public life be wildly irresponsible.” Users who repeatedly share bad information or fake news should suffer “reputational consequences,” he said.
He specifically criticized Laurence Tribe, a widely respected Harvard Law professor who has argued dozens of cases in the Supreme Court. Tribe also has more than 300,000 Twitter followers. “He’s one of the most important constitutional-law scholars in the country, but he has repeatedly retweeted the most dubious anti-Trump information,” said Nyhan. “He’s gotten better, but I think what he did was irresponsible.”
(In an email, Tribe responded: “I do my best to avoid retweeting or relying in any way on dubiously sourced material and assume that, with experience, I’m coming closer to my own ideal. But no source is infallible, and anyone who pretends to reach that goal is guilty of self-deception or worse.”) But individuals can never fight fake news or bad information by themselves, Nyhan said. Which led him to his second point: that online platforms like Facebook, Google, and YouTube have to work with researchers and civil-society organizations to learn how to combat the spread of falsehood.
“There are lots of people in these companies trying to do their best, but they can’t solve the problem of our public debate for us, and we shouldn’t expect them to,” he told me.
“We need more research about what works and what doesn’t on the platforms so we can be sure they are intervening in an effective way—but also so we can make sure they’re not intervening in a destructive manner,” he said. “I don’t think people take seriously enough the risks of major public intervention by the platforms. I don’t think we want Twitter, Facebook, and Google deciding what kinds of news and information are shown to people.”
“This,” he said—meaning fake news, falsehood, and the entire debacle of unreliable information online—“is not strictly the fault of the platforms. Part of what it’s revealing are the limitations of human psychology. But human psychology is not going to change.”
So the institutions that buttress that psychology—the journalists and editors, the politicians and judges, the readers and consumers of news, and the programmers and executives who design the platforms themselves—must change to accommodate it. Abraham Lincoln once said that one of the great tasks of the United States was “to show to the world that freemen could be prosperous.” Now, Americans and people all over the world must show that they can use every technological blessing of that prosperity—and remain well informed, enlightened, and liberated from falsehood themselves.
The massive new study analyzes every major contested news story in English across the span of Twitter’s existence—some 126,000 stories, tweeted by 3 million users, over more than 10 years—and finds that the truth simply cannot compete with hoax and rumor. By every common metric, falsehood consistently dominates the truth on Twitter, the study finds: Fake news and false rumors reach more people, penetrate deeper into the social network, and spread much faster than accurate stories.
“It seems to be pretty clear [from our study] that false information outperforms true information,” said Soroush Vosoughi, a data scientist at MIT who has studied fake news since 2013 and who led this study. “And that is not just because of bots. It might have something to do with human nature.”
The study has already prompted alarm from social scientists. “We must redesign our information ecosystem in the 21st century,” write a group of 16 political scientists and legal scholars in an essay also published Thursday in Science. They call for a new drive of interdisciplinary research “to reduce the spread of fake news and to address the underlying pathologies it has revealed.”
“How can we create a news ecosystem … that values and promotes truth?” they ask.
The new study suggests that it will not be easy. Though Vosoughi and his colleagues focus only on Twitter—the study was conducted using exclusive data that the company made available to MIT—their work has implications for Facebook, YouTube, and every major social network. Any platform that regularly amplifies engaging or provocative content runs the risk of amplifying fake news along with it.
Though the study is written in the clinical language of statistics, it offers a methodical indictment of the accuracy of information that spreads on these platforms. A false story is much more likely to go viral than a real story, the authors find. A false story reaches 1,500 people six times quicker, on average, than a true story does. And while false stories outperform the truth on every subject—including business, terrorism and war, science and technology, and entertainment—fake news about politics regularly does best.
Twitter users seem almost to prefer sharing falsehoods. Even when the researchers controlled for every difference between the accounts originating rumors—like whether that person had more followers or was verified—falsehoods were still 70 percent more likely to get retweeted than accurate news.
And blame for this problem cannot be laid with our robotic brethren. From 2006 to 2016, Twitter bots amplified true stories as much as they amplified false ones, the study found. Fake news prospers, the authors write, “because humans, not robots, are more likely to spread it.”
Political scientists and social-media researchers largely praised the study, saying it gave the broadest and most rigorous look so far into the scale of the fake-news problem on social networks, though some disputed its findings about bots and questioned its definition of news.
“This is a really interesting and impressive study, and the results around how demonstrably untrue assertions spread faster and wider than demonstrably true ones do, within the sample, seem very robust, consistent, and well supported,” said Rasmus Kleis Nielsen, a professor of political communication at the University of Oxford, in an email.
“I think it’s very careful, important work,” Brendan Nyhan, a professor of government at Dartmouth College, told me. “It’s excellent research of the sort that we need more of.”
“In short, I don’t think there’s any reason to doubt the study’s results,” said Rebekah Tromble, a professor of political science at Leiden University in the Netherlands, in an email.
This new paper operates at a far grander scale, looking at nearly the entire lifespan of Twitter: every piece of controversial news that propagated on the service from September 2006 to December 2016. But to do that, Vosoughi and his colleagues first had to answer a more preliminary question: What is truth? And how do we know?
It’s a question that can have life-or-death consequences.
“[Fake news] has become a white-hot political and, really, cultural topic, but the trigger for us was personal events that hit Boston five years ago,” said Deb Roy, a media scientist at MIT and one of the authors of the new study.
On April 15, 2013, two bombs exploded near the route of the Boston Marathon, killing three people and injuring hundreds more. Almost immediately, wild conspiracy theories about the bombings took over Twitter and other social-media platforms. The mess of information only grew more intense on April 19, when the governor of Massachusetts asked millions of people to remain in their homes as police conducted a huge manhunt.
“I was on lockdown with my wife and kids in our house in Belmont for two days, and Soroush was on lockdown in Cambridge,” Roy told me. Stuck inside, Twitter became their lifeline to the outside world. “We heard a lot of things that were not true, and we heard a lot of things that did turn out to be true” using the service, he said.
The ordeal soon ended. But when the two men reunited on campus, they agreed it seemed silly for Vosoughi—then a Ph.D. student focused on social media—to research anything but what they had just lived through. Roy, his adviser, blessed the project.
Vosoughi made a truth machine: an algorithm that could sort through torrents of tweets and pull out the facts most likely to be accurate. It focused on three attributes of a given tweet: the properties of its author (were they verified?), the kind of language it used (was it sophisticated?), and the way it propagated through the network (how far and how fast did it spread?).
“The model that Soroush developed was able to predict accuracy with a far-above-chance performance,” said Roy. Vosoughi earned his Ph.D. in 2015.
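The kind of feature-based veracity scoring described above can be sketched in a few lines. Everything here is an illustrative assumption—the feature names, weights, and threshold are invented for the example and are not the model from the study:

```python
# Illustrative, pure-Python sketch of a feature-based veracity score.
# The features mirror the three groups named in the text (author,
# language, propagation); the weights and threshold are made-up
# assumptions, not values from the study.

def veracity_score(tweet):
    score = 0.0
    score += 1.5 if tweet["author_verified"] else 0.0  # author: verified account?
    score += 0.5 * tweet["vocab_richness"]             # language: sophistication proxy
    score -= 0.1 * tweet["retweet_depth"]              # propagation: deep chains skew false
    return score

def predict_true(tweet, threshold=0.5):
    """Classify a tweet as likely accurate if its score clears a threshold."""
    return veracity_score(tweet) >= threshold

confirmed = {"author_verified": True, "vocab_richness": 0.8, "retweet_depth": 2}
rumor = {"author_verified": False, "vocab_richness": 0.2, "retweet_depth": 14}

print(predict_true(confirmed))  # True
print(predict_true(rumor))      # False
```

A real model would learn such weights from labeled data rather than hard-code them; the point is only that each tweet is reduced to a handful of author, language, and propagation features before being scored.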
After that, the two men—and Sinan Aral, a professor of management at MIT—turned to examining how falsehoods move across Twitter as a whole. But they were back not only at the “what is truth?” question, but its more pertinent twin: How does the computer know what truth is?
They opted to turn to the ultimate arbiter of fact online: the third-party fact-checking sites. By scraping and analyzing six different fact-checking sites—including Snopes, PolitiFact, and FactCheck.org—they generated a list of tens of thousands of online rumors that had spread on Twitter between 2006 and 2016. Then they searched Twitter for these rumors using Gnip, a proprietary search tool owned by the company.
Ultimately, they found about 126,000 tweets, which, together, had been retweeted more than 4.5 million times. Some linked to “fake” stories hosted on other websites. Some started rumors themselves, either in the text of a tweet or in an attached image. (The team used a special program that could search for words contained within static tweet images.) And some contained true information or linked to it elsewhere.
Then they ran a series of analyses, comparing the popularity of the fake rumors with the popularity of the real news. What they found astounded them.
Speaking from MIT this week, Vosoughi gave me an example: There are lots of ways for a tweet to get 10,000 retweets, he said. If a celebrity sends Tweet A, and they have a couple million followers, maybe 10,000 people will see Tweet A in their timeline and decide to retweet it. Tweet A was broadcast, creating a big but shallow pattern.
Meanwhile, someone without many followers sends Tweet B. It goes out to their 20 followers—but one of those people sees it, and retweets it, and then one of their followers sees it and retweets it too, on and on until tens of thousands of people have seen and shared Tweet B.
Tweet A and Tweet B both have the same size audience, but Tweet B has more “depth,” to use Vosoughi’s term. It chained together retweets, going viral in a way that Tweet A never did. “It could reach 1,000 retweets, but it has a very different shape,” he said.
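The broadcast-versus-chain distinction becomes concrete if a cascade is modeled as a tree of retweets. This minimal sketch (the data structure is an assumption for illustration, not the study's code) computes the two metrics being compared—total audience and chain depth:

```python
# Minimal sketch: a retweet cascade as a tree. Each tweet ID maps to
# the list of retweets made directly from it. The representation is an
# illustrative assumption, not the study's actual data format.

def cascade_size(cascade, root):
    """Total number of users reached: the root tweet plus every retweet."""
    return 1 + sum(cascade_size(cascade, child) for child in cascade.get(root, []))

def cascade_depth(cascade, root):
    """Length of the longest retweet chain starting at the original tweet."""
    children = cascade.get(root, [])
    if not children:
        return 0
    return 1 + max(cascade_depth(cascade, child) for child in children)

# Tweet A: a celebrity broadcast -- many direct retweets, no chaining.
tweet_a = {"A": ["a1", "a2", "a3", "a4", "a5"]}

# Tweet B: a chain -- each retweet is picked up and passed along by the next user.
tweet_b = {"B": ["b1"], "b1": ["b2"], "b2": ["b3"], "b3": ["b4"], "b4": ["b5"]}

print(cascade_size(tweet_a, "A"), cascade_depth(tweet_a, "A"))  # 6 1
print(cascade_size(tweet_b, "B"), cascade_depth(tweet_b, "B"))  # 6 5
```

Both toy cascades reach six users, but the broadcast has depth 1 while the chain has depth 5—exactly the "same size, different shape" contrast Vosoughi describes.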
Here’s the thing: Fake news dominates according to both metrics. It consistently reaches a larger audience, and it tunnels much deeper into social networks than real news does. The authors found that accurate news wasn’t able to chain together more than 10 retweets. Fake news could put together a retweet chain 19 links long—and do it 10 times as fast as accurate news put together its measly 10 retweets.
These results proved robust even when they were checked by humans, not bots. Separate from the main inquiry, a group of undergraduate students fact-checked a random selection of roughly 13,000 English-language tweets from the same period. They found that false information outperformed true information in ways “nearly identical” to the main data set, according to the study.
What does this look like in real life? Take two examples from the last presidential election. In August 2015, a rumor circulated on social media that Donald Trump had let a sick child use his plane to get urgent medical care. Snopes confirmed almost all of the tale as true. But according to the team’s estimates, only about 1,300 people shared or retweeted the story.
In February 2016, a rumor developed that Trump’s elderly cousin had recently died and that he had opposed the magnate’s presidential bid in his obituary. “As a proud bearer of the Trump name, I implore you all, please don’t let that walking mucus bag become president,” the obituary reportedly said. But Snopes could not find evidence of the cousin, or his obituary, and rejected the story as false.
Nonetheless, roughly 38,000 Twitter users shared the story. And it put together a retweet chain three times as long as the sick-child story managed.
Why does falsehood do so well? The MIT team settled on two hypotheses.
First, fake news seems to be more “novel” than real news. Falsehoods are often notably different from all the tweets that appeared in a user’s timeline in the 60 days before the user retweeted them, the team found.
Second, fake news evokes much more emotion than the average tweet. The researchers created a database of the words that Twitter users used to reply to the 126,000 contested tweets, then analyzed it with a state-of-the-art sentiment-analysis tool. Fake tweets tended to elicit words associated with surprise and disgust, while accurate tweets summoned words associated with sadness and trust, they found.
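Lexicon-based emotion tagging of reply words, as described above, can be sketched like this. The tiny word lists are invented stand-ins; the researchers used a full sentiment-analysis lexicon, which is not reproduced here:

```python
# Toy sketch of lexicon-based emotion counting over reply words.
# The word lists are invented stand-ins for illustration; the study
# used a complete sentiment-analysis lexicon.
from collections import Counter

EMOTION_LEXICON = {
    "surprise": {"unbelievable", "shocking", "wow", "whoa"},
    "disgust":  {"gross", "disgusting", "vile"},
    "sadness":  {"sad", "tragic", "heartbreaking"},
    "trust":    {"confirmed", "reliable", "official"},
}

def emotion_counts(replies):
    """Count how many words across the replies fall into each emotion category."""
    counts = Counter()
    for reply in replies:
        for word in reply.lower().split():
            for emotion, words in EMOTION_LEXICON.items():
                if word in words:
                    counts[emotion] += 1
    return counts

fake_replies = ["wow unbelievable", "shocking and disgusting"]
true_replies = ["confirmed by official sources", "tragic but reliable"]

print(emotion_counts(fake_replies))
print(emotion_counts(true_replies))
```

Run over the toy data, the replies to the "fake" tweet skew toward surprise and disgust while the replies to the "true" tweet skew toward trust—mirroring the pattern the researchers report at scale.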
The team wanted to answer one more question: Were Twitter bots helping to spread misinformation?
After using two different bot-detection algorithms on their sample of 3 million Twitter users, they found that the automated bots were spreading false news—but they were retweeting it at the same rate that they retweeted accurate information.
“The massive differences in how true and false news spreads on Twitter cannot be explained by the presence of bots,” Aral told me.
But some political scientists cautioned that this should not be used to disprove the role of Russian bots in seeding disinformation recently. An “army” of Russian-associated bots helped amplify divisive rhetoric after the school shooting in Parkland, Florida, The New York Times has reported.
“It can both be the case that (1) over the whole 10-year data set, bots don’t favor false propaganda and (2) in a recent subset of cases, botnets have been strategically deployed to spread the reach of false propaganda claims,” said Dave Karpf, a political scientist at George Washington University, in an email.
“My guess is that the paper is going to get picked up as ‘scientific proof that bots don’t really matter!’ And this paper does indeed show that, if we’re looking at the full life span of Twitter. But the real bots debate assumes that their usage has recently escalated because strategic actors have poured resources into their use. This paper doesn’t refute that assumption,” he said.
Vosoughi agrees that his paper does not determine whether the use of botnets changed around the 2016 election. “We did not study the change in the role of bots across time,” he told me in an email. “This is an interesting question and one that we will probably look at in future work.”
Some political scientists also questioned the study’s definition of “news.” By turning to the fact-checking sites, the study blurs together a wide range of false information: outright lies, urban legends, hoaxes, spoofs, falsehoods, and “fake news.” It does not just look at fake news by itself—that is, articles or videos that look like news content, and which appear to have gone through a journalistic process, but which are actually made up.
Therefore, the study may undercount “non-contested news”: accurate news that is widely understood to be true. For many years, the most retweeted post in Twitter’s history celebrated Obama’s re-election as president. But as his victory was not a widely disputed fact, Snopes and other fact-checking sites never confirmed it.
The study also blurs the line between content in general and news in particular. “All our audience research suggests a vast majority of users see news as clearly distinct from content more broadly,” Nielsen, the Oxford professor, said in an email. “Saying that untrue content, including rumors, spread faster than true statements on Twitter is a bit different from saying false news and true news spread at different rates.”
But many researchers told me that simply understanding why false rumors travel so far, so fast, was as important as knowing that they do so in the first place.
“The key takeaway is really that content that arouses strong emotions spreads further, faster, more deeply, and more broadly on Twitter,” said Tromble, the political scientist, in an email. “This particular finding is consistent with research in a number of different areas, including psychology and communication studies. It’s also relatively intuitive.”
“False information online is often really novel and frequently negative,” said Nyhan, the Dartmouth professor. “We know those are two features of information generally that grab our attention as human beings and that cause us to want to share that information with others—we’re attentive to novel threats and especially attentive to negative threats.”
“It’s all too easy to create both when you’re not bound by the limitations of reality. So people can exploit the interaction of human psychology and the design of these networks in powerful ways,” he added.
He lauded Twitter for making its data available to researchers and called on other major platforms, like Facebook, to do the same. “In terms of research, the platforms are the whole ballgame. We have so much to learn but we’re so constrained in what we can study without platform partnership and collaboration,” he said.
“These companies now exercise a great deal of power and influence over the news that people get in our democracy. The amount of power that platforms now hold means they have to face a great deal of scrutiny and transparency,” he said. “We can study Twitter all day, but only about 12 percent of Americans are on it. It’s important for journalists and academics, but it’s not how most people get their news.”
In a statement, Twitter said that it was hoping to expand its work with outside experts. In a series of tweets last week, Jack Dorsey, the company’s CEO, said the company hoped to “increase the collective health, openness, and civility of public conversation, and to hold ourselves publicly accountable toward progress.”
Facebook did not respond to a request for comment.
But Tromble, the political-science professor, said that the findings would likely apply to Facebook, too. “Earlier this year, Facebook announced that it would restructure its News Feed to favor ‘meaningful interaction,’” she told me.
“It became clear that they would gauge ‘meaningful interaction’ based on the number of comments and replies to comments a post receives. But, as this study shows, that only further incentivizes creating posts full of disinformation and other content likely to garner strong emotional reactions,” she added.
“Putting my conservative scientist hat on, I’m not comfortable saying how this applies to other social networks. We only studied Twitter here,” said Aral, one of the researchers. “But my intuition is that these findings are broadly applicable to social-media platforms in general. You could run this exact same study if you worked with Facebook’s data.”
Yet these caveats do not encompass the most depressing finding of the study. When they began their research, the MIT team expected that users who shared the most fake news would basically be crowd-pleasers. They assumed they would find a group of people who obsessively use Twitter in a partisan or sensationalist way, accumulating more fans and followers than their more fact-based peers.
In fact, the team found that the opposite is true. Users who share accurate information have more followers, and send more tweets, than fake-news sharers. These fact-guided users have also been on Twitter for longer, and they are more likely to be verified. In short, the most trustworthy users can boast every obvious structural advantage that Twitter, either as a company or a community, can bestow on its best users.
The truth has a running start, in other words—but inaccuracies, somehow, still win the race. “Falsehood diffused further and faster than the truth despite these differences [between accounts], not because of them,” write the authors.
This finding should dispirit every user who turns to social media to find or distribute accurate information. It suggests that no matter how adroitly people plan to use Twitter—no matter how meticulously they curate their feed or follow reliable sources—they can still get snookered by a falsehood in the heat of the moment.
It is unclear which interventions, if any, could reverse this tendency toward falsehood. “We don’t know enough to say what works and what doesn’t,” Aral told me. There is little evidence that people change their opinion because they see a fact-checking site reject one of their beliefs, for instance. Labeling fake news as such, on a social network or search engine, may do little to deter it as well.
In short, social media seems to systematically amplify falsehood at the expense of the truth, and no one—neither experts nor politicians nor tech companies—knows how to reverse that trend. It is a dangerous moment for any system of government premised on a common public reality.
So the institutions that buttress that psychology—the journalists and editors, the politicians and judges, the readers and consumers of news, and the programmers and executives who design the platforms themselves—must change to accommodate it. Abraham Lincoln once said that one of the great tasks of the United States was “to show to the world that freemen could be prosperous.” Now, Americans and people all over the world must show that they can use every technological blessing of that prosperity—and remain well informed, enlightened, and liberated from falsehood themselves.
Fake news evolved from seedy internet sideshow to serious electoral threat so quickly that behavioral scientists had little time to answer basic questions about it, like who was reading what, how much real news they also consumed and whether targeted fact-checking efforts ever hit a target.
Sure, surveys abound, asking people what they remember reading. But these are only as precise as the respondents’ shifty recollections and subject to a malleable definition of “fake.” The term “fake news” itself has evolved into an all-purpose smear, used by politicians and the president to deride journalism they don’t like.
But now the first hard data on fake-news consumption has arrived. Researchers last week posted an analysis of the browsing histories of thousands of adults during the run-up to the 2016 election — a real-time picture of who viewed which fake stories, and what real news those people were seeing at the same time.
The reach of fake news was wide indeed, the study found, yet also shallow. One in four Americans saw at least one false story, but even the most eager fake-news readers — deeply conservative supporters of President Trump — consumed far more of the real kind, from newspaper and network websites and other digital sources.
While the research can’t settle the question of whether misinformation was pivotal in the 2016 election, the findings give the public and researchers the first solid guide to asking how its influence may have played out. That question will become increasingly important as online giants like Facebook and Google turn to shielding their users from influence by Russian operatives and other online malefactors.
“There’s been a lot of speculation about the effect of fake news and a lot of numbers thrown around out of context, which get people exercised,” said Duncan Watts, a research scientist at Microsoft who has argued that misinformation had a negligible effect on the election results. “What’s nice about this paper is that it focuses on the actual consumers themselves.”
In the new study, a trio of political scientists — Brendan Nyhan of Dartmouth College (a regular contributor to The Times’s Upshot), Andrew Guess of Princeton University and Jason Reifler of the University of Exeter — analyzed web traffic data gathered from a representative sample of 2,525 Americans who consented to have their online activity monitored anonymously by the survey and analytic firm YouGov.
The data included website visits made in the weeks before and after the 2016 election, and a measure of political partisanship based on overall browsing habits. (The vast majority of participants favored Mr. Trump or Hillary Clinton.)
The team defined a visited website as fake news if it posted at least two demonstrably false stories, as defined by economists Hunt Allcott and Matthew Gentzkow in research published last year. On 289 such sites, about 80 percent of bogus articles supported Mr. Trump.
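The site-level rule described above can be expressed directly: a domain is flagged once at least two of its stories have been independently judged false. The data structures here are illustrative assumptions, not the researchers' actual pipeline:

```python
# Sketch of the site-level rule described above: a site counts as a
# fake news site once it has published at least two demonstrably
# false stories. Domain names and data structures are invented
# illustrations, not real classifications.
from collections import Counter

def fake_news_sites(false_story_domains, threshold=2):
    """false_story_domains: iterable of domains, one entry per story judged false."""
    counts = Counter(false_story_domains)
    return {site for site, n in counts.items() if n >= threshold}

judged_false = ["example-bogus.com", "example-bogus.com", "oneoff.net"]
print(fake_news_sites(judged_false))  # {'example-bogus.com'}
```

A site with a single debunked story (like the hypothetical `oneoff.net` here) stays off the list, which keeps the definition from sweeping in outlets that merely published one erroneous piece.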
The online behavior of the participants was expected in some ways, but surprising in others. Consumption broke down along partisan lines: the most conservative 10 percent of the sample accounted for about 65 percent of visits to fake news sites.
Pro-Trump users were about three times more likely to visit fake news sites supporting their candidate than Clinton partisans were to visit bogus sites promoting her.
Still, false stories were a small fraction of the participants’ overall news diet, regardless of political preference: just 1 percent among Clinton supporters, and 6 percent among those pulling for Mr. Trump. Even conservative partisans viewed just five fake news articles, on average, over more than five weeks.
There was no way to determine from the data how much, or whether, people believed what they saw on these sites. But many of these stories were patently absurd, like one accusing Mrs. Clinton of a “Sudden Move of $1.8 Billion to Qatar Central Bank,” or a piece headlined “Video Showing Bill Clinton With a 13-Year-Old Plunges Race Into Chaos.”
“For all the hype about fake news, it’s important to recognize that it reached only a subset of Americans, and most of the ones it was reaching already were intense partisans,” Dr. Nyhan said.
“They were also voracious consumers of hard news,” he added. “These are people intensely engaged in politics who follow it closely.”
Given the ratio of truth to fiction, Dr. Watts said, fake news paled in influence beside mainstream news coverage, particularly stories about Mrs. Clinton and her use of a private email server as secretary of state. Coverage of that topic appeared repeatedly and prominently in venues like The New York Times and The Washington Post.
The new study does not rule out the possibility that fake news affected the elections, said David Rand, an associate professor of psychology, economics and management at Yale University.
Americans over age 60 were much more likely to visit a fake news site than younger people, the new study found. Perhaps confusingly, moderately left-leaning people viewed more pro-Trump fake news than they did pro-Clinton fake news.
One interpretation of that finding, Dr. Rand said, may be that older, less educated voters who switched from Obama in 2012 to Trump in 2016 were particularly susceptible to fake news.
“You can see where this might have had an impact in some of those close swing states, like Wisconsin,” Dr. Rand said. “But this of course is a matter of conjecture, reasoning backward from the findings.”
The study found that Facebook was by far the platform through which people most often navigated to a fake news site. Last year, in response to criticism, the company began attaching a red label reading “disputed” to stories on its site that third-party fact-checkers had found to make false claims.
Most people in the new study encountered at least some of these labels, but “we saw no instances of people reading a fake news article and a fact-check of that specific article,” Dr. Nyhan said. “The fact-checking websites have a targeting problem.”
In December, Facebook announced a change to its monitoring approach. Instead of labeling false stories, Facebook will surface the fact-checks along with the fake story in the user’s news feed.