The Newscripts blog would like to be closer Internet buddies with our glossy print Newscripts column, so here we highlight what’s going on in the current issue of C&EN. The following comes courtesy of the writer of this week’s glossy print column, C&EN Senior Editor Michael Torrice.
The Nintendo Entertainment System came out in the U.S. almost 30 years ago. My family bought the gray video game console in 1988, and my friends and I played it for countless hours. We once staged a Nintendo Olympics, with each kid adopting an official song and flag. Winners received medals or a trophy made from Legos, I think. For this week’s Newscripts column, I relived a bit of my childhood when I wrote about a computer scientist who taught his computer how to play Nintendo games.
Tom W. Murphy VII is the computer scientist, and he works on machine learning, which is basically teaching computers how to perform specific tasks. (Yes, Murphy is the seventh Thomas Murphy in his family. He says the first died in a prisoner-of-war camp during the Civil War.)
The neat thing about Murphy’s Nintendo-playing program is that it uses a simple, general strategy that works on several games, including the classic “Super Mario Bros.” The program can play a wide range of games because it doesn’t know anything specific about a game (for instance, it’s unaware that mushrooms make Mario grow big). Instead it uses a two-phase process to learn what it means to win in a specific game and then looks for the best series of button presses to succeed.
In the first phase, the computer “watches me play the game and peers inside the memory of the Nintendo and looks at what’s going on,” Murphy says. Basically, it finds bytes of memory that increase in value as Murphy plays. These bytes often correspond to things like the score or progress through a game level—although the program doesn’t know what the bytes translate to on the screen.
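For the curious, here is a rough sketch of that first phase in Python. It is a simplified illustration, not Murphy's actual code: it assumes we already have a list of memory snapshots recorded while a person plays, and the function name `find_increasing_bytes` and the sample data are made up for this example.

```python
def find_increasing_bytes(snapshots):
    """Return memory locations whose value never drops and ends higher than it began."""
    if not snapshots:
        return []
    candidates = []
    for index in range(len(snapshots[0])):
        values = [snapshot[index] for snapshot in snapshots]
        # The byte must never go down between consecutive snapshots...
        never_drops = all(a <= b for a, b in zip(values, values[1:]))
        # ...and it must actually go up at some point.
        rises = values[-1] > values[0]
        if never_drops and rises:
            candidates.append(index)
    return candidates


# Hypothetical recording: three snapshots of a four-byte memory.
snapshots = [bytes([0, 5, 9, 3]), bytes([1, 5, 7, 3]), bytes([2, 5, 8, 4])]
print(find_increasing_bytes(snapshots))  # → [0, 3]
```

Bytes 0 and 3 only ever go up, so they are flagged as possible signs of progress. Byte 1 never changes and byte 2 dips, so both are ignored. Like the real program, the sketch has no idea what any of these bytes mean on screen.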
Next, the program starts playing the game itself. It simulates pressing different combinations of the eight buttons on a Nintendo controller. Going forward through the game, the program tests to see if the inputs make those winning bytes increase. If not, the computer goes back in time and tries another combination until it thinks it’s winning.
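The second phase can be sketched the same way. This is a toy under loud assumptions: `toy_step` is a hypothetical stand-in for a real Nintendo emulator, the two-number "memory" is invented, and the search is a plain greedy one rather than whatever Murphy's program actually does. What it does show is the try, rewind, and try-again loop over all combinations of the eight buttons.

```python
from itertools import product

BUTTONS = ("A", "B", "Select", "Start", "Up", "Down", "Left", "Right")


def toy_step(memory, pressed):
    """Hypothetical emulator stand-in: apply one input and return the new memory."""
    progress, score = memory
    if "Right" in pressed and "Left" not in pressed:
        progress += 1
    if "A" in pressed:
        score += 1
    return (progress, score)


def objective(memory, tracked):
    """Add up the bytes that the first phase flagged as signs of winning."""
    return sum(memory[i] for i in tracked)


def all_combinations():
    """Yield every one of the 2**8 = 256 possible sets of held buttons."""
    for mask in product((False, True), repeat=len(BUTTONS)):
        yield frozenset(b for b, held in zip(BUTTONS, mask) if held)


def choose_inputs(memory, tracked, steps):
    """Greedily pick, step by step, the input that raises the objective most."""
    chosen = []
    for _ in range(steps):
        saved = memory  # saved state: every trial starts over from here
        best_combo, best_value = frozenset(), objective(saved, tracked)
        for combo in all_combinations():
            trial = toy_step(saved, combo)  # "go back in time" and try again
            value = objective(trial, tracked)
            if value > best_value:
                best_combo, best_value = combo, value
        memory = toy_step(saved, best_combo)
        chosen.append(best_combo)
    return chosen, memory


chosen, final_memory = choose_inputs((0, 0), tracked=(0, 1), steps=3)
print(final_memory)  # → (3, 3)
```

In this toy, the search settles on holding Right and A at every step because that raises both tracked values, even though nothing in the code knows what "moving right" or "jumping" means.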
I think the program’s most impressive feat is how it stumbles onto moves that exploit bugs in the games. For example, Mario can jump into and kill a goomba (an evil sentient mushroom) as long as the plumber is falling slightly. A person would probably avoid trying this move because running into a goomba is normally deadly. But in this scenario, Mario is immune.
Why does the program try such a risky move? “It doesn’t really know what jumping is; it doesn’t know that goombas are dangerous,” Murphy says. Because the program is just trying a bunch of button presses to maximize the score, it doesn’t know not to try these moves, he says.
I suggest watching Murphy’s YouTube video about the program. He shows footage of its successes and failures. My favorite bit is at the very end when he talks about how poorly the program plays “Tetris.” The computer figures out things aren’t going its way, and to avoid defeat, it pauses the game right before the final block lands at the top of the screen. “And really, the only winning move is not to play,” Murphy says, referencing a famous line from “WarGames,” the 1983 movie about a video-game-playing supercomputer.
Oh, the ’80s nostalgia!