Google’s New AI Is a Master of Games, but How Does It Compare to the Human Mind?

After building AlphaGo to beat the world’s best Go players, Google DeepMind built AlphaZero to take on the world’s best machine players

AI Chess
Google’s new artificial intelligence program, AlphaZero, taught itself to play chess, shogi, and Go in a matter of hours, and outperforms the top-ranking AIs in the gameplay arena.

For humans, chess may take a lifetime to master. But Google DeepMind’s new artificial intelligence program, AlphaZero, can teach itself to conquer the board in a matter of hours.

Building on its past success with the AlphaGo suite—a series of computer programs designed to play the Chinese board game Go—Google boasts that its new AlphaZero achieves a level of “superhuman performance” at not just one board game, but three: Go, chess, and shogi (essentially, Japanese chess). The team of computer scientists and engineers, led by Google’s David Silver, reported its findings recently in the journal Science.

“Before this, with machine learning, you could get a machine to do exactly what you want—but only that thing,” says Ayanna Howard, an expert in interactive computing and artificial intelligence at the Georgia Institute of Technology who did not participate in the research. “But AlphaZero shows that you can have an algorithm that isn’t so [specific], and it can learn within certain parameters.”

AlphaZero’s clever programming certainly ups the ante on gameplay for human and machine alike, but Google has long had its sights set on something bigger: engineering intelligence.

The researchers are careful not to claim that AlphaZero is on the verge of world domination (others have been a little quicker to jump the gun). Still, Silver and the rest of the DeepMind squad are already hopeful that they’ll someday see a similar system applied to drug design or materials science.

So what makes AlphaZero so impressive?

Gameplay has long been revered as a gold standard in artificial intelligence research. Structured, interactive games are simplifications of real-world scenarios: Difficult decisions must be made; wins and losses drive up the stakes; and prediction, critical thinking, and strategy are key.

Encoding this kind of skill is tricky. Older game-playing AIs—including the first prototypes of the original AlphaGo—have traditionally been pumped full of codes and data to mimic the experience typically earned through years of natural, human gameplay (essentially, a passive, programmer-derived knowledge dump). With AlphaGo Zero (the most recent version of AlphaGo), and now AlphaZero, the researchers gave the program just one input: the rules of the game in question. Then, the system hunkered down and actively learned the tricks of the trade itself.

AlphaZero is based on AlphaGo Zero, part of the AlphaGo suite designed to play the Chinese board game Go, pictured above. Early iterations of the original program were fed data from human-versus-human games; later versions engaged in self-teaching, wherein the software played games against itself to learn its own strategy.

This strategy, called self-play reinforcement learning, is pretty much exactly what it sounds like: To train for the big leagues, AlphaZero played itself in iteration after iteration, honing its skills by trial and error. And the brute-force approach paid off. Unlike AlphaGo Zero, AlphaZero doesn’t just play Go: It can beat the best AIs in the business at chess and shogi, too. The learning process is also impressively efficient, requiring only two, four, or 30 hours of self-tutelage to outperform programs specifically tailored to master shogi, chess, and Go, respectively. Notably, the study authors didn’t report any instances of AlphaZero going head-to-head with an actual human, Howard says. (The researchers may have assumed that, given that these programs consistently clobber their human counterparts, such a matchup would have been pointless.)

AlphaZero was also able to trounce Stockfish (the now unseated AI chess master) and Elmo (the former AI shogi expert) despite evaluating fewer possible next moves on each turn during game play. But because the algorithms in question are inherently different, and may consume different amounts of power, it’s difficult to directly compare AlphaZero to other, older programs, points out Joanna Bryson, who studies artificial intelligence at the University of Bath in the United Kingdom and did not contribute to AlphaZero.

Google keeps mum about a lot of the fine print on its software, and AlphaZero is no exception. While don’t know everything about the program’s power consumption, what’s clear is this: AlphaZero has to be packing some serious computational ammo. In those scant hours of training, the program kept itself very busy, engaging in tens or hundreds of thousands of practice rounds to get its board game strategy up to snuff—far more than a human player would need (or, in most cases, could even accomplish) in pursuit of proficiency.

This intensive regimen also used 5,000 of Google’s proprietary machine-learning processor units, or TPUs, which by some estimates consume around 200 watts per chip. No matter how you slice it, AlphaZero requires way more energy than a human brain, which runs on about 20 watts.

The absolute energy consumption of AlphaZero must be taken into consideration, adds Bin Yu, who works at the interface of statistics, machine learning, and artificial intelligence at the University of California, Berkeley. AlphaZero is powerful, but might not be good bang for the buck—especially when adding in the person-hours that went into its creation and execution.

Energetically expensive or not, AlphaZero makes a splash: Most AIs are hyper-specialized on a single task, making this new program—with its triple threat of game play—remarkably flexible. “It’s impressive that AlphaZero was able to use the same architecture for three different games,” Yu says.

So, yes. Google’s new AI does set a new mark in several ways. It’s fast. It’s powerful. But does that make it smart?

This is where definitions start to get murky. “AlphaZero was able to learn, starting from scratch without any human knowledge, to play each of these games to superhuman level,” DeepMind’s Silver said in a statement to the press.

Even if board game expertise requires mental acuity, all proxies for the real world have their limits. In its current iteration, AlphaZero maxes out by winning human-designed games—which may not warrant the potentially alarming label of “superhuman.” Plus, if surprised with a new set of rules mid-game, AlphaZero might get flummoxed. The actual human brain, on the other hand, can store far more than three board games in its repertoire.

What’s more, comparing AlphaZero’s baseline to a tabula rasa (blank slate)as the researchers do—is a stretch, Bryson says. Programmers are still feeding it one crucial morsel of human knowledge: the rules of the game it’s about to play. “It does have far less to go on than anything has before,” Bryson adds, “but the most fundamental thing is, it’s still given rules. Those are explicit.”

And those pesky rules could constitute a significant crutch. “Even though these programs learn how to perform, they need the rules of the road,” Howard says. “The world is full of tasks that don’t have these rules.”

When push comes to shove, AlphaZero is an upgrade of an already powerful program—AlphaGo Zero, explains JoAnn Paul, who studies artificial intelligence and computational dreaming at the Virginia Polytechnic Institute and State University and was not involved in the new research. AlphaZero uses many of the same building blocks and algorithms as AlphaGo Zero, and still constitutes just a subset of true smarts. “I thought this new development was more evolutionary than revolutionary,” she adds. “None of these algorithms can create. Intelligence is also about storytelling. It’s imagining things that are not yet there. We’re not thinking in those terms in computers.”

Part of the problem is, there’s still no consensus on a true definition of “intelligence,” Yu says—and not just in the domain of technology. “It’s still not clear how we are training critically thinking beings, or how we use the unconscious brain,” she adds.

To this point, many researchers believe there are likely multiple types of intelligence. And tapping into one far from guarantees the ingredients for another. For instance, some of the smartest people out there are terrible at chess.

With these limitations, Yu’s vision of the future of artificial intelligence partners humans and machines in a kind of coevolution. Machines will certainly continue to excel at certain tasks, she explains, but human input and oversight may always be necessary to compensate for the unautomated.

Of course, there’s no telling how things will shake out in the AI arena. In the meantime, we have plenty to ponder. “These computers are powerful, and can do certain things better than a human can,” Paul says. “But that still falls short of the mystery of intelligence.”

Particle Physicists Turn to AI to Cope with CERN’s Collision Deluge

Can a competition with cash rewards improve techniques for tracking the Large Hadron Collider’s messy particle trajectories?

Particle Physicists Turn to AI to Cope with CERN's Collision Deluge
A visualization of complex sprays of subatomic particles, produced from colliding proton beams in CERN’s CMS detector at the Large Hadron Collider near Geneva, Switzerland in mid-April of 2018. Credit: CERN

Physicists at the world’s leading atom smasher are calling for help. In the next decade, they plan to produce up to 20 times more particle collisions in the Large Hadron Collider (LHC) than they do now, but current detector systems aren’t fit for the coming deluge. So this week, a group of LHC physicists has teamed up with computer scientists to launch a competition to spur the development of artificial-intelligence techniques that can quickly sort through the debris of these collisions. Researchers hope these will help the experiment’s ultimate goal of revealing fundamental insights into the laws of nature.

At the LHC at CERN, Europe’s particle-physics laboratory near Geneva, two bunches of protons collide head-on inside each of the machine’s detectors 40 million times a second. Every proton collision can produce thousands of new particles, which radiate from a collision point at the centre of each cathedral-sized detector. Millions of silicon sensors are arranged in onion-like layers and light up each time a particle crosses them, producing one pixel of information every time. Collisions are recorded only when they produce potentially interesting by-products. When they are, the detector takes a snapshot that might include hundreds of thousands of pixels from the piled-up debris of up to 20 different pairs of protons. (Because particles move at or close to the speed of light, a detector cannot record a full movie of their motion.)

From this mess, the LHC’s computers reconstruct tens of thousands of tracks in real time, before moving on to the next snapshot. “The name of the game is connecting the dots,” says Jean-Roch Vlimant, a physicist at the California Institute of Technology in Pasadena who is a member of the collaboration that operates the CMS detector at the LHC.

After future planned upgrades, each snapshot is expected to include particle debris from 200 proton collisions. Physicists currently use pattern-recognition algorithms to reconstruct the particles’ tracks. Although these techniques would be able to work out the paths even after the upgrades, “the problem is, they are too slow”, says Cécile Germain, a computer scientist at the University of Paris South in Orsay. Without major investment in new detector technologies, LHC physicists estimate that the collision rates will exceed the current capabilities by at least a factor of 10.

Researchers suspect that machine-learning algorithms could reconstruct the tracks much more quickly. To help find the best solution, Vlimant and other LHC physicists teamed up with computer scientists including Germain to launch the TrackML challenge. For the next three months, data scientists will be able to download 400 gigabytes of simulated particle-collision data—the pixels produced by an idealized detector—and train their algorithms to reconstruct the tracks.

Participants will be evaluated on the accuracy with which they do this. The top three performers of this phase hosted by Google-owned company Kaggle, will receive cash prizes of US$12,000, $8,000 and $5,000. A second competition will then evaluate algorithms on the basis of speed as well as accuracy, Vlimant says.

Prize appeal

Such competitions have a long tradition in data science, and many young researchers take part to build up their CVs. “Getting well ranked in challenges is extremely important,” says Germain. Perhaps the most famous of these contests was the 2009 Netflix Prize. The entertainment company offered US$1 million to whoever worked out the best way to predict what films its users would like to watch, going on their previous ratings. TrackML isn’t the first challenge in particle physics, either: in 2014, teams competed to ‘discover’ the Higgs boson in a set of simulated data (the LHC discovered the Higgs, long predicted by theory, in 2012). Other science-themed challenges have involved data on anything from plankton to galaxies.

From the computer-science point of view, the Higgs challenge was an ordinary classification problem, says Tim Salimans, one of the top performers in that race (after the challenge, Salimans went on to get a job at the non-profit effort OpenAI in San Francisco, California). But the fact that it was about LHC physics added to its lustre, he says. That may help to explain the challenge’s popularity: nearly 1,800 teams took part, and many researchers credit the contest for having dramatically increased the interaction between the physics and computer-science communities.

TrackML is “incomparably more difficult”, says Germain. In the Higgs case, the reconstructed tracks were part of the input, and contestants had to do another layer of analysis to ‘find’ the particle. In the new problem, she says, you have to find in the 100,000 points something like 10,000 arcs of ellipse. She thinks the winning technique might end up resembling those used by the program AlphaGo, which made history in 2016 when it beat a human champion at the complex game of Go. In particular, they might use reinforcement learning, in which an algorithm learns by trial and error on the basis of ‘rewards’ that it receives after each attempt.

Vlimant and other physicists are also beginning to consider more untested technologies, such as neuromorphic computing and quantum computing. “It’s not clear where we’re going,” says Vlimant, “but it looks like we have a good path.”

For all book lovers please visit my friend’s website.

Google’s TPU Chip Helped It Avoid Building Dozens of New Data Centers.

%d bloggers like this: