OpenAI has developed a new system, inspired by the human instinct for learning through imitation, that brings this capability to robots via VR. Their new algorithm, called “one-shot imitation learning,” requires a human to demonstrate a task just once in VR for a robot to learn it.
The demo above shows how the system can learn a behavior from a single demonstration delivered in a simulator, then reproduce that behavior in different real-world setups.
According to OpenAI, the system is powered by two neural networks — a vision network and an imitation network:
“The vision network ingests an image from the robot’s camera and outputs state representing the positions of the objects. As before, the vision network is trained with hundreds of thousands of simulated images with different perturbations of lighting, textures, and objects. (The vision system is never trained on a real image.)”
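To make the idea concrete, here is a minimal toy sketch of the training recipe the quote describes: render simulated observations with randomized lighting and texture, then fit a model that maps pixels to object position. Everything here (the 1-D "renderer," the perturbation ranges, and the linear least-squares model standing in for the vision network) is a hypothetical illustration, not OpenAI's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(pos, n_pixels=64):
    """Toy 'renderer': a 1-D image with a bright blob at the object's
    position, plus randomized lighting (global offset) and texture (noise).
    A stand-in for the domain-randomized simulated images in the quote."""
    xs = np.linspace(0.0, 1.0, n_pixels)
    blob = np.exp(-((xs - pos) ** 2) / 0.005)
    lighting = rng.uniform(-0.3, 0.3)          # random lighting offset
    texture = rng.normal(0.0, 0.05, n_pixels)  # random texture noise
    return blob + lighting + texture

# Training set of (simulated image, object position) pairs -- no real images.
positions = rng.uniform(0.1, 0.9, 2000)
images = np.stack([render(p) for p in positions])

# A linear least-squares "vision network" mapping pixels -> position.
X = np.hstack([images, np.ones((len(images), 1))])  # append a bias column
w, *_ = np.linalg.lstsq(X, positions, rcond=None)

# Evaluate on fresh images with unseen lighting/texture perturbations.
test_pos = rng.uniform(0.1, 0.9, 200)
test_img = np.stack([render(p) for p in test_pos])
pred = np.hstack([test_img, np.ones((200, 1))]) @ w
print(float(np.abs(pred - test_pos).mean()))  # mean position error
```

Because the perturbations are random and uncorrelated with position, the regression learns to read out the blob location while ignoring lighting and texture, which is the intuition behind training on "hundreds of thousands of simulated images with different perturbations."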
“The imitation network observes a demonstration, processes it to infer the intent of the task, and then accomplishes the intent starting from another starting configuration. Thus, the imitation network must generalize the demonstration to a new setting. But how does the imitation network know how to generalize? The network learns this from the distribution of training examples. It is trained on dozens of different tasks with thousands of demonstrations for each task. Each training example is a pair of demonstrations that perform the same task. The network is given the entirety of the first demonstration and a single observation from the second demonstration. We then use supervised learning to predict what action the demonstrator took at that observation. In order to predict the action effectively, the robot must learn how to infer the relevant portion of the task from the first demonstration.”
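The training scheme in the quote — pairs of demonstrations of the same task, with supervised prediction of the demonstrator's action at an observation from the second demo — can be sketched on a toy 1-D reaching task. The scripted demonstrator, the demo summary (its final state), and the linear model standing in for the imitation network are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def demo(target, start, steps=10):
    """Scripted demonstrator for a toy 1-D reaching task: each step moves
    halfway toward the target. Returns the (states, actions) trajectory."""
    states, actions = [], []
    s = start
    for _ in range(steps):
        a = 0.5 * (target - s)       # the demonstrator's action
        states.append(s); actions.append(a)
        s = s + a
    return np.array(states), np.array(actions)

# Training examples: pairs of demos of the SAME task (same target, new start).
# Input: a summary of demo 1 plus one observation from demo 2;
# label: the action the demonstrator took at that observation.
X, y = [], []
for _ in range(50):                  # dozens of tasks
    target = rng.uniform(-1, 1)
    for _ in range(40):              # many demo pairs per task
        s1, _ = demo(target, rng.uniform(-1, 1))
        s2, a2 = demo(target, rng.uniform(-1, 1))
        t = rng.integers(len(s2))    # a single observation from demo 2
        X.append([s1[-1], s2[t]])    # demo-1 summary + observation
        y.append(a2[t])
X, y = np.array(X), np.array(y)

# Linear least-squares stand-in for the imitation network.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-shot test: a single demo of a NEW task, then act from a new start.
target = 0.37
s1, _ = demo(target, -0.8)           # the one demonstration
s = 0.9                              # unseen starting configuration
for _ in range(20):
    s = s + w @ np.array([s1[-1], s])  # apply the predicted action
print(round(float(s), 2))
```

The model never sees the target directly; it must infer it from the demonstration, which is the sense in which the network "learns how to infer the relevant portion of the task from the first demonstration." Here the inference collapses to reading the demo's end state; the real system learns a far richer mapping.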
While this model is still a prototype, its prospective applications for training are substantial. Teaching robots entirely through VR simulation would let humans train robots for complex or dangerous tasks without ever coming into contact with the real, physical environment. Researchers could virtually approximate extreme settings without having to be there: space exploration, deep-sea exploration, and hazardous waste cleanup are natural candidates. The options are virtually limitless.