What do we gain by automating a pastime?
It is a bad idea to intuit how broadly intelligent a machine must be, or have the capacity to be, based solely on a single task. The checkers-playing machines of the 1950s amazed researchers and many considered these a huge leap towards human-level reasoning, yet we now appreciate that achieving human or superhuman performance in this game is far easier than achieving human-level general intelligence. In fact, even the best humans can easily be defeated by a search algorithm with simple heuristics. Human or superhuman performance in one task is not necessarily a stepping-stone towards near-human performance across most tasks.
— Luke Hewitt
Then AlphaZero revived interest in chess algorithm research, using a deep-learning approach paired with Monte Carlo techniques instead of the alpha-beta pruning algorithm used by IBM in the 1990s and by Stockfish in the 2000s.
“I’m not sure the ideas in AlphaZero generalize readily. Games are a very unusual thing.” —
I know what some of you readers are thinking. A gamer is a human, and if an algorithm can play a game as well as (or better than) a human, then it has replicated human intelligence. This is a fallacy. An algorithm can almost certainly solve 24653546734 + 5345434534 quicker than you can, but that does not mean it has replicated or outperformed human intelligence. Just because an algorithm has been optimized to do one task (e.g. playing StarCraft) does not mean it can be optimized to do any task. Without explicit heuristics and hard-coding, algorithms fail at doing more than one thing.
Another countering sentiment is that the objective is not to solve the game as efficiently as possible, but to have the algorithm “learn” how to solve the game without explicit guidance and heuristics. I understand the objective here, but I think it is undermined by the fact that the algorithm is only being trained to do one task, and is doing it in a brute-force way (more on that later).
Gaming seems to be the main focus at DeepMind. If you look at a public list of their projects, a great majority of them are game-related. Why is that? And what is the point of running massive computations and thousands of years’ worth of gameplay… only to beat a hardcore gamer who can master a game in a matter of weeks and on far less data?
Even more, heuristics can create a decent AI and do it much more cheaply. We all know the objective is to have a machine “learn” to do a task without being explicitly coded for it, but is it not ironic we train and train and train just so it learns one task before even executing it, resulting in a slow and inefficient implementation? Meanwhile, old school heuristics will do it immediately and effectively by skipping the learning part.
“Most real-world strategic interactions involve hidden information. I feel like that’s been neglected by the majority of the AI community.” —
Noam Brown, AI Research Scientist at Facebook
This fixation on games in artificial intelligence research is hard to ignore, and I think there is a need to explore why. Games have three main advantages in AI research, which we will cover:
Games are a completely self-contained problem where all possible events, variables, and outcomes are known.
Data can be generated in games through randomized gameplay.
Games can have deterministic outcomes due to predictable and controlled environments.
When Games Capture Real-World Problems
You can create game-like Monte Carlo simulations and call it “AI” as well. For those of you unfamiliar, Monte Carlo algorithms utilize randomness to achieve an objective. For example, if you take some simple random distributions describing how long it takes to process a customer (a normal distribution) as well as how frequently a customer walks in (a Poisson distribution), you can create a customer queue simulation.
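The customer queue idea described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: a single server working first-come, first-served, made-up parameter values, and a hypothetical function name `simulate_queue`. Poisson arrivals are modeled through their equivalent, exponentially distributed gaps between customers, and the normal service time is clipped at zero so it can never go negative.

```python
import random


def simulate_queue(n_customers, arrival_rate, service_mean, service_sd, seed=0):
    """Simulate a single-server, first-come-first-served queue.

    Returns the waiting time (in minutes) of every customer.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    clock = 0.0            # arrival time of the current customer
    server_free_at = 0.0   # time at which the server finishes its current job
    waits = []
    for _ in range(n_customers):
        # Poisson arrivals at `arrival_rate` per minute are equivalent to
        # exponentially distributed gaps between consecutive customers.
        clock += rng.expovariate(arrival_rate)
        start = max(clock, server_free_at)
        waits.append(start - clock)
        # Normally distributed service time, clipped at zero.
        service = max(0.0, rng.gauss(service_mean, service_sd))
        server_free_at = start + service
    return waits


if __name__ == "__main__":
    waits = simulate_queue(10_000, arrival_rate=0.5, service_mean=1.5, service_sd=0.5)
    print(f"mean wait: {sum(waits) / len(waits):.2f} min")
    print(f"longest wait: {max(waits):.2f} min")
```

Notice that every number this produces comes from the distributions I chose up front. The simulation can only tell you about the queue you already described to it.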
There is a point where it feels like we are building AI against games for its own sake, which is fine; that is the prerogative of research. However, it is perplexing when the creators of these algorithms allege there is an untapped potential for these algorithms to solve real-world problems at an extraordinary AGI scale, while staying stuck in a loop of finding the next game to automate rather than tackling an industrial problem.
When Games Do Not Capture the Real World
AlphaZero made a lot of headlines in late 2018, with reactions remarkably similar to those that greeted Deep Blue in 1996. There was one notable article in particular.
Note carefully the choice of words in this article, which anthropomorphizes the algorithm with words like “human-like”, “creativity”, and “intuition”. Can we be real here? This is just a better chess algorithm that uses a model fitted to randomized self-play data instead of an alpha-beta tree search, and humanizing words are used to make the algorithm sound like a human rather than a calculator.
I thought it was pretty strange that this article glossed over the massive Monte Carlo generation of data used for training, where the algorithm plays countless random games against itself and a regression is then performed on that data to estimate an optimal move on a given turn. Yet the article dismissed incumbent algorithms like Stockfish as computationally expensive for “calculating millions of possible outcomes as it plays”. Is this not the pot calling the kettle black? Both Stockfish and AlphaZero require heavy computation and generate large numbers of outcomes, and an argument can be made that AlphaZero requires much more.
And we do this to what end? To create a better chess algorithm with a massive data generation/training overhead? That’s fine, it really is an accomplishment for chess research and knowledge. But let’s not kid ourselves and start saying SkyNet is now possible, provided we have a faucet to give us unlimited labeled data to train with.
AlphaZero, like all of DeepMind’s gaming-related AI projects, generated data by playing random games against itself, which you cannot do in the real world.
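To show what "generate data by random self-play, then fit an estimate to it" looks like at toy scale, here is a rough sketch using tic-tac-toe. To be clear, this is not AlphaZero's actual method, and every name in it (`random_game`, `estimate_opening_values`, and so on) is hypothetical. It plays uniformly random games and averages the outcome per opening move, which is about the crudest possible stand-in for the regression step.

```python
import random
from collections import defaultdict

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_game(rng):
    """Play one game with uniformly random moves.

    Returns (X's opening square, result for X), where the result is
    +1 for a win, -1 for a loss, and 0 for a draw.
    """
    board = [None] * 9
    player, first_move = "X", None
    while True:
        empty = [i for i, cell in enumerate(board) if cell is None]
        if not empty:
            return first_move, 0  # board is full with no winner: draw
        move = rng.choice(empty)
        if first_move is None:
            first_move = move
        board[move] = player
        w = winner(board)
        if w is not None:
            return first_move, (1 if w == "X" else -1)
        player = "O" if player == "X" else "X"


def estimate_opening_values(n_games, seed=0):
    """Average the self-play outcome for each opening square."""
    rng = random.Random(seed)
    totals, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_games):
        move, result = random_game(rng)
        totals[move] += result
        counts[move] += 1
    return {m: totals[m] / counts[m] for m in sorted(counts)}


if __name__ == "__main__":
    for square, value in estimate_opening_values(20_000).items():
        print(f"opening square {square}: average result {value:+.3f}")
```

The whole thing works only because the rules give us a free, perfectly labeled game whenever we ask for one. That is exactly the faucet that most real-world problems do not have.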
Why Game AI Fails in the Real World
Common sense can point to three reasons why game AI struggles to find utility in the real world:
Games are a completely self-contained problem where all possible events, variables, and outcomes are known. In the real-world, uncertainty and unknowns are everywhere and ambiguity is the norm.
Data can be generated in games through randomized gameplay, but this cannot be done for most real-world problems. You can generate data with simulations (like the customer queue example above), but the data is only as good as the simulation which likely already has predictive value.
Games can have deterministic outcomes and have all necessary information (other than what the adversarial player will do next), whereas real-world problems can be highly nondeterministic and have limited partial information.
It is for these reasons that games like Go, Chess, StarCraft, and DOTA 2 are easy to build AI for, and yet that AI is difficult to utilize in the real world. On top of that, games have room for error, and poor moves can easily go unnoticed. In real-world applications there is a lot less tolerance for error unless the application is noncritical, like pushing ads or social media posts. And again, the real world is often going to prefer heuristics over experimental deep learning that has struggled to be logistically practical.
“If you’re in an environment where there is unlimited data available to learn, then you can be incredibly great at it, and there are many, many ways you can be great at it. The smarts about AI comes when you have limited data. Human beings like you and me, we actually learn with very limited data, we learn new skills with one-shot guidance. That’s really where AI needs to get to. That’s the challenge. We are working towards enabling true AI.”
From another perspective, one really should consider the P versus NP problem. I am surprised contemporary AI literature seems to eschew this topic, because it really is the key to truly unlocking effective AI. I highly recommend watching this video, and it is worth the 10 minutes.
Although it has been neither proven nor disproven, more scientists are coming to believe that P does not equal NP. This is extremely inconvenient for AI research because it means complexity is always going to limit what we can do. I sometimes wonder if all these data-driven AI models of today are a frustrated attempt to move away from heuristics and try to work around the P versus NP problem. The irony is that the process of optimizing loss in machine learning is still very much in the P versus NP problem space, and is one of the major reasons why machine learning is so hard.
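To make the complexity limit concrete, here is a minimal sketch that brute-forces the Traveling Salesman Problem, which I am picking purely as an illustrative NP-hard problem. The function name `brute_force_tsp` and the four-city distance matrix are made up for the example. The point is not the code itself but how fast the number of candidate tours grows.

```python
import itertools
import math


def brute_force_tsp(dist):
    """Return (best_length, best_tour) by trying every tour that starts and ends at city 0."""
    n = len(dist)
    best_length, best_tour = math.inf, None
    # Fixing city 0 as the start leaves (n - 1)! orderings of the other cities.
    for perm in itertools.permutations(range(1, n)):
        tour = (0, *perm, 0)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour


if __name__ == "__main__":
    # Illustrative symmetric distances between four cities.
    dist = [
        [0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0],
    ]
    print(brute_force_tsp(dist))  # (80, (0, 1, 3, 2, 0))

    # Number of tours the brute-force search must check as cities are added.
    for n in (5, 10, 15, 20):
        print(f"{n} cities: {math.factorial(n - 1):,} tours")
```

Four cities means six tours. Twenty cities means 19!, roughly 1.2 × 10^17 tours, and no amount of training data changes that arithmetic. In practice you fall back on heuristics that settle for a good-enough answer.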
Then again, it is likely real-world problems are not as sexy. Can you really use the Traveling Salesman Problem as a publicity stunt? Or is it cooler to have the algorithm win an adversarial match against a world champion of [put game here]? I guarantee you, the latter is more likely to make headlines and bring in the VC funding.