Computers have figured out how to win at chess, checkers and tic-tac-toe, and now, a computer program has conquered the game of poker.
A research team led by Michael Bowling, a professor of computer science at the University of Alberta in Canada, developed a computer program that can outplay humans at a two-player poker game — specifically, heads-up limit hold 'em. The results could have far-reaching implications for other situations that require complex decision-making, such as in foreign policy or medical treatment.
Unlike chess or checkers, in poker, one player doesn't always know the past moves of the other players. Plus, a player can win a hand when the other players fold. Therefore, in mathematical terms, the game has imperfect information. [Top 10 Revolutionary Computers]
"Chess has a perfect play solution — the answer for a given position is, a win for black, a win for white or a draw," Bowling said. "Poker is more probabilistic." In other words, there is no absolutely perfect hand or strategy.
How it works
In the version of hold 'em poker that the computer played, the bets between two players are fixed and the number of raises is limited. The dealer gives each player two cards, called hole cards. A round of betting follows, known as the "pre-flop." After that, three more cards are laid out on the table, called a "flop." The flop is a set of community cards, dealt face up, so both players know what they are. Another round of betting follows, and then a fourth card is put on the table, called the "turn." After a third round of betting, the last community card is dealt (this is known as the "river"), and at that point, the players have to show their hole cards, assuming that one player hasn't folded yet.
The computer doesn't calculate every possible hand as it plays. Instead, it builds a table of results before the game starts. Using some 4,000 central processing units for two months — equal to about 1,000 years of computing time — it simulates billions of hands of poker. The table of results alone took up some 15 terabytes of computer storage, Bowling said. For comparison, a typical backup drive for a desktop is one terabyte. [10 Technologies That Will Transform Your Life]
The algorithm goes through all of the possible hands an opposing player could have, and then tallies up the results for each tactic — for example, raising, folding or calling the bet (i.e., matching the opponent). To get an idea of how big the task is, there are 13.8 trillion different situations that can come up in the game. To get there, every human being on Earth would have to play nearly 4,000 hands of poker.
This differs from chess, where a computer can brute-force calculate moves as the game progresses to get a result that is good enough to win. (Contrary to what many people think, few computer programs actually go through every single permutation, just the ones that produce the best results). Imagine instead if chess-playing computers had to look up the results of billions of previous games with a specific configuration of pieces on the board.
As billions of hands are played, the program comes up with an optimal strategy — that is, it converges on what the best move is for a given hand. "The way this works … it's already played a billion billion hands of poker," Bowling said.
Mastering the game
Because poker isn't solvable the way chess or checkers is, Bowling and his team came up with a different set of requirements for calling the game "solved." In scientific terms, the game is "essentially solved," which means that there is a way to exploit the strategy the computer uses. The researchers assumed a person played the computer for 70 years, 365 days per year, for 24 hours a day. The program they wrote played so well that if the big blind — the fixed bet — is $1,000, the most a perfect player can win is about $1 per hand, or 1/1000 of the big blind.
Other experts have worked on poker-playing computers that are used in casinos, and at least one company says it has designed a machine-learning algorithm that adjusts strategy according to the human player. But none has demonstrated that its exploitability — the ability of a perfect human player to beat the machine — is as small as the program designed by Bowling's team. Nor have any solved the game in the same mathematically rigorous way.
But the algorithm does have limitations. For one, it only works with two-handed games. In a three-player game, it is possible that one player could have a terrible strategy (for instance, perhaps the player has a tendency to raise all the time), and loses less than the second player, who has a better strategy, resulting in a win for the third player.
Another problem is figuring out how to test three-player games fairly. One experiment could have two humans play the machine, but Bowling said the human players may collude against the machine, even if unintentionally. Similar problems could arise in experiments with two machine players and one human: Even if the two programs didn't collude, it might look that way to a human being. "We don't know how to run it fairly," he said.
Bowling said this technology could have diverse uses, ranging from national security, to tracking fare evasion on transit systems, to making decisions about medical treatment. For example, the program could help a doctor who needs to make a decision about treatment but is unsure of the possible outcomes. The methods used in the poker program could help doctors identify treatment options with optimal results, or one with the best probability of success.
The research was described online today (Jan. 8) in the journal Science.