Feeling, complicated

I was struck by the contrast between two recent OpenAI successes: one, OpenAI Five, beating some of the best humans at a complex game in a sophisticated virtual environment; the other, Dactyl, fumblingly manipulating blocks in ways that children master at young ages. This is not to diminish how much of an achievement Dactyl is - no other reinforcement learning system has come close to this sort of performance in a physical task. But it does show that the real world is very complicated compared with even our most advanced virtual worlds. To be fair, the graphics and physics engines used to render videos are becoming very good (and, as movies show, practically indistinguishable from real life when enough work is put in). Audio generation is worse, except for human voices, which are now very convincing - but background sounds aren't a crucial component of our environment anyway. The biggest experiential difference between current simulations and the real world seems to be tactile sensations, or the lack thereof. OpenAI couldn't get realistic simulation of tactile sensations even in the very simple Dactyl environment (and eventually decided to do without them).

This may be due to the intrinsic difficulty of generating tactile feedback, but may also be because of the type of situation in which it's required. You can get impressive visual output from a static landscape. By contrast, we get tactile feedback mostly from physical interactions with objects in our environment - but modelling objects which can interact with each other is very hard! Consider how many things I can do with a piece of paper: write on it, tear it, crumple it, blow it away, make a paper plane out of it, set it on fire, soak it in water, eat it, cut holes in it, braid it into ropes, and so on. Many of these effects depend on molecular-level physics, which we're very far from being able to simulate on a large scale. And even the macroscopic effects rely on friction, which is also very difficult to model efficiently. If we add in fluid dynamics, then it seems plausible that it will take half a century or more before any simulated world is advanced enough to model all the interactions I listed above in real time. And that's just paper, not machines or electronics!

An alternative approach is to constrain the types of interactions which are allowed - e.g. a simulation with only rigid bodies. In such an environment, we could develop efficient approximations to trade quality for speed (a tradeoff which the graphics community has been wrestling with for some time). Friction would still be a major difficulty, as would the ability to feel surface textures, but it's likely that immersive, interactive simulations with these properties will be developed within the next decade or two. The reason visual rendering is so advanced is that it's so crucial to the multi-billion-dollar video game and film industries. Now that VR is becoming a thing, those market pressures will be pushing hard for realistic environments with tactile feedback - and in doing so, increasing the effective number of people working on AI even faster than the nominal number.

