Bowling, Dr. Michael

2017 Finalist: Outstanding Leadership In Alberta Technology

Using Poker to build better Artificial Intelligence

Can artificial intelligence use intuition?

Dr. Michael Bowling, a professor in Computer Science at the University of Alberta, says researchers are getting pretty close. Bowling and his team have recently created DeepStack. This AI system can make decisions and reason about the unknown without having to explicitly think through everything that could happen or could have happened, a capability the researchers are calling intuition.

The competition between artificial intelligences and humans in games has a long tradition. In the mid-’90s, chess led to an important milestone for AI development, as computers were able to play against and beat the world’s best chess players.

Since then, AI technology and machine learning have continued to develop, and it is researchers such as Bowling and his colleagues behind DeepStack who drive the innovation forward.

Working with the unknown

Although developing an AI system to play high-level chess was a major breakthrough, Bowling has taken this a monumental step forward by creating an AI system that can play in the unknown environment of no-limit poker.

“In chess, one can search all the possible moves and figure out the right thing to do, but that seems impossible in a game like poker,” Bowling explains. “It kind of is. The techniques that work in chess don’t work in poker. It’s been almost a decade of really concerted effort to develop new AI techniques and new machine learning techniques that can cope with what happens when you can’t see the other player’s cards.”

Fascinated by the concept of games and machine learning, Bowling started to create AI that could play Atari 2600 games. Although the AI didn’t know what it was doing the first time it saw a game, it would press the fire button, hit an alien and score 100 points. Bowling adds, “You didn’t get 100 points for what you did right before you got the 100 points. The important thing was that you pressed the button one second ago, right when you were lined up to hit the alien right in time.”

This type of learning is referred to as reinforcement learning, which means that actions are periodically either rewarded or punished. The system learns to repeat the actions that caused the rewards.
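The delayed-reward idea in the Atari example can be sketched with tabular Q-learning, a standard reinforcement learning method. This is an illustrative toy, not the team's Atari agent: the three-step game, the state names, and the learning settings below are all assumptions made for the example. The points arrive one tick after the agent fires, yet the learner still ends up crediting the earlier decision.

```python
import random

FIRE, WAIT = 0, 1
ACTIONS = (FIRE, WAIT)

# Hypothetical toy game (not a real Atari title):
#   "aim": choose FIRE or WAIT, no reward yet
#   "shot_in_flight" / "nothing_in_flight": one more tick passes
#   the episode then ends, paying 100 points only if a shot was fired earlier.
def step(state, action):
    """Return (next_state, reward, done) for the toy game."""
    if state == "aim":
        next_state = "shot_in_flight" if action == FIRE else "nothing_in_flight"
        return next_state, 0.0, False
    if state == "shot_in_flight":
        return "end", 100.0, True
    return "end", 0.0, True

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from randomly played episodes."""
    rng = random.Random(seed)
    states = ("aim", "shot_in_flight", "nothing_in_flight")
    q = {s: {a: 0.0 for a in ACTIONS} for s in states}
    for _ in range(episodes):
        state, done = "aim", False
        while not done:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[next_state].values())
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * best_next
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_values = train()
best_first_action = max(ACTIONS, key=lambda a: q_values["aim"][a])
```

After training, the value of firing in the "aim" state is higher than the value of waiting, even though firing itself never pays out immediately. The discount factor `gamma` is what carries the later reward back to the earlier action.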

With this in mind, the team created DeepStack and had it play billions of hands against itself with only the rules of poker as guidance. Bowling says, “It just knows ‘here are actions I could take: I could bet $100. I don’t really know what that means but I just know I can take the $100 bet action.’”

Initially, the system played utterly randomly, with no reasoning behind its actions. As it played more and more games, it started to understand the feedback it was receiving.

“As it’s playing it starts to realize that, ‘Wait, when I have good cards and I bet, I tend to win money. Maybe I should bet more when I have good cards and not be folding in those situations.’ It has some regret for not having bet more so it starts to bet more,” Bowling says.
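The "regret" Bowling mentions can be illustrated with regret matching, a textbook update rule in which each action is played in proportion to how much the learner wishes it had played that action in the past. The sketch below is a simplified assumption, not DeepStack's actual training procedure: it learns against fixed, hypothetical average payoffs rather than through self-play, and the action names and numbers are made up for illustration.

```python
ACTIONS = ("fold", "small_bet", "big_bet")

# Hypothetical average winnings per action; illustrative numbers only.
PAYOFFS = {
    "good_hand": {"fold": 0.0, "small_bet": 1.0, "big_bet": 2.0},
    "bad_hand": {"fold": 0.0, "small_bet": -1.0, "big_bet": -2.0},
}

def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = {a: max(r, 0.0) for a, r in regrets.items()}
    total = sum(positive.values())
    if total > 0:
        return {a: p / total for a, p in positive.items()}
    # No positive regret yet: fall back to playing uniformly at random.
    return {a: 1.0 / len(regrets) for a in regrets}

def learn(payoffs, iterations=100):
    """Run regret matching for one hand type and return the final strategy."""
    regrets = {a: 0.0 for a in ACTIONS}
    for _ in range(iterations):
        strategy = strategy_from_regrets(regrets)
        expected = sum(strategy[a] * payoffs[a] for a in ACTIONS)
        for a in ACTIONS:
            # Regret: how much better this action would have done
            # than the strategy actually being played.
            regrets[a] += payoffs[a] - expected
    return strategy_from_regrets(regrets)

learned = {hand: learn(p) for hand, p in PAYOFFS.items()}
```

Starting from uniformly random play, the learner accumulates regret for not having bet big with a good hand and for not having folded a bad one, so `learned` shifts toward exactly those choices.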

As the games progressed, the AI was also able to consider its opponent. When the AI held poor cards it tended to lose money, and it realized that its opponent bet only when the opponent held good cards.

“Now you realize, ‘Wait, my opponent is folding when they have bad cards. Maybe I should bet when I have mediocre cards or maybe I should even bet when I have lousy cards because they’re still going to fold their lousy cards and I might make money even though I don’t have the best hand,’” explains Bowling. “Suddenly bluffing comes from just learning to play the game against yourself. These quintessential, seemingly human, activities fall out of just learning to play the game.”
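The arithmetic behind that reasoning can be written out in a few lines. The sketch below assumes a simplified situation that is not taken from the article: the bluffing hand always loses if called, there is no further betting, and the pot and bet sizes are hypothetical. Under those assumptions, a bluff wins the pot when the opponent folds and loses the bet otherwise.

```python
def bluff_expected_value(pot, bet, fold_probability):
    """Expected winnings of a pure bluff that always loses when called."""
    return fold_probability * pot - (1.0 - fold_probability) * bet

def break_even_fold_probability(pot, bet):
    """Fold frequency at which the bluff neither wins nor loses on average."""
    return bet / (pot + bet)

# Hypothetical numbers: a $100 pot and a $50 bluff.
pot, bet = 100.0, 50.0
threshold = break_even_fold_probability(pot, bet)
ev_often_folds = bluff_expected_value(pot, bet, 0.75)
ev_rarely_folds = bluff_expected_value(pot, bet, 0.10)
```

With these numbers the opponent needs to fold only about a third of the time for the bluff to break even, so betting a lousy hand against an opponent who folds too often makes money on average, which matches the behaviour Bowling describes emerging from self-play.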

Using these concepts, in 2008 a system developed by Bowling’s team beat top professional players at heads-up limit poker, where players can make only fixed bets. Now, DeepStack plays no-limit poker, where the number of possible options the AI must handle is exponentially greater.

“What if they bet $100, I need to figure out what I should do, but what if they bet $101, does that change things?” says Bowling. “If you try to consider all those possibilities, it’s hopeless. You’ll never be able to build an AI system that can do that.”

Learning and intuition

Intuition comes in when the AI learns to evaluate different situations on the fly and make decisions that leave its position in the game better than its opponent’s. It doesn’t have every possible combination of actions and counteractions programmed in; instead, it makes decisions to reinforce its position in the game based on factors it knows and unknown factors it can intuit.

When the team began to see DeepStack exhibit evidence of intuition, Bowling says, it was very exciting. He adds, “You’re seeing this visceral connection of the AI do something that is having an effect on the world. You get to sit across from it and take actions against it and see it respond.”

Although creating a superhuman poker-playing computer is remarkable, Bowling sees a more important, broader application.

“Games are so powerful and we’re starting to realize this all through society,” he says. “On one level, you have AI systems that can play chess and checkers, where you can see the entire picture of the world and what’s going on, and you don’t have uncertainty about the state of the world. Being able to build systems that can cope with a more complex uncertainty that is closer to what you might see in the real world is really important. I think the way DeepStack views it as this combination of being able to reason about what you don’t know while being able to have intuition about the future is an important paradigm.”

Bowling says creating a program with that level of problem-solving capability will help address many real-world challenges. Whether it’s security, finance or medicine, the robust decision-making born from DeepStack will almost certainly have an impact far beyond the game of poker.

Because of the advances made by Bowling and his team at the University of Alberta, DeepMind, one of the leaders in the AI industry, has chosen Edmonton to house its first research lab outside the UK. Bowling will co-lead the DeepMind Alberta lab while remaining a professor at the University of Alberta.

“That’s exciting because I think it’s going to be a catalyst for creating a whole ecosystem around this next generation technology. AI technology isn’t going to be a small thing; there’s no way it can’t change the world and affect almost every industry in the world,” says Bowling.

“Having an important play in that is going to be huge for Alberta going forward and that constant desire to see the economy diversify. This is, I think, a step in that direction.”