A virtual robot arm has learned to solve a wide range of puzzles, including stacking blocks, setting the table, and arranging chess pieces, without having to be retrained for each task. It did this by playing against a second robot arm that was trained to give it harder and harder challenges.
Self-play: Developed by researchers at OpenAI, the identical robot arms, Alice and Bob, learn by playing a game against each other in a simulation, without human input. The robots use reinforcement learning, a technique in which AIs learn by trial and error which actions to take in different situations to achieve certain goals. The game involves moving objects around on a virtual tabletop. By arranging objects in specific ways, Alice tries to set puzzles that are hard for Bob to solve. Bob tries to solve Alice’s puzzles. As they learn, Alice sets more complex puzzles and Bob gets better at solving them.
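To make the back-and-forth concrete, here is a deliberately tiny Python sketch of the loop described above. It is not OpenAI's implementation: the `Alice` and `Bob` classes, the integer `difficulty` and `skill` values, and the update rules are all hypothetical stand-ins, with no simulator or neural network involved. It only illustrates how a puzzle setter and a puzzle solver can push each other up a ladder of difficulty.

```python
from dataclasses import dataclass


@dataclass
class Alice:
    """Hypothetical puzzle setter: a puzzle is just an integer difficulty."""
    difficulty: int = 1

    def propose(self) -> int:
        return self.difficulty

    def update(self, bob_solved: bool) -> None:
        # Toy rule (an assumption): once Bob solves a puzzle,
        # Alice moves on to a harder one.
        if bob_solved:
            self.difficulty += 1


@dataclass
class Bob:
    """Hypothetical puzzle solver with a single integer skill level."""
    skill: int = 0

    def attempt(self, puzzle: int) -> bool:
        # Bob succeeds only if the puzzle is within his current skill.
        return puzzle <= self.skill

    def update(self, solved: bool) -> None:
        # Toy rule (an assumption): Bob improves after each failure.
        if not solved:
            self.skill += 1


def self_play(rounds: int):
    """Run the toy game and return both players plus a round-by-round log."""
    alice, bob = Alice(), Bob()
    history = []
    for _ in range(rounds):
        puzzle = alice.propose()
        solved = bob.attempt(puzzle)
        alice.update(solved)
        bob.update(solved)
        history.append((puzzle, solved))
    return alice, bob, history


if __name__ == "__main__":
    alice, bob, history = self_play(10)
    for puzzle, solved in history:
        print(f"difficulty {puzzle}: {'solved' if solved else 'failed'}")
```

In this toy version Bob fails each new difficulty once and then solves it, so the two players climb in lockstep. In the real system both arms are trained with reinforcement learning, and the puzzles are arrangements of objects on a simulated tabletop rather than numbers.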