Eavesdropping Technique Reconstructs Sound from Moving Images Alone
MIT engineers have devised a way to image objects' tiny vibrations and translate them back into the sounds that created them.
When you speak, surrounding objects quake and quiver in response to your voice—you just can’t see their vibrations because they’re so minute.
But now, engineers at MIT led by Abe Davis have devised a way to image those tiny vibrations and translate them back into sound. Using a special high-frame-rate camera to film objects whose surfaces moved by as little as a thousandth of a pixel, they were able to reconstruct, in order, the original frequencies that caused those vibrations. What’s more, they could discern human speech from behind soundproof glass using the same technique.
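The paper’s actual method uses complex steerable pyramids to track motion across the whole image, but the core idea can be sketched far more simply: track a sub-pixel displacement frame by frame, and the resulting time series is the sound. Below is a minimal toy version (all numbers are illustrative assumptions, not the paper’s setup): a soft edge in a synthetic video vibrates by about a thousandth of a pixel at 440 Hz, and the centroid of the image gradient recovers both the motion and its frequency.

```python
import numpy as np

# Toy illustration (NOT the MIT algorithm): recover a vibration frequency
# from sub-pixel image motion. A blurred edge shifts vertically by ~1/1000
# of a pixel per frame; the centroid of its intensity gradient tracks it.

fps = 2200                       # assumed high-speed frame rate
n_frames = 2200                  # 1 s of video -> 1 Hz frequency bins
tone_hz = 440                    # tone driving the "object"
t = np.arange(n_frames) / fps
shift = 0.001 * np.sin(2 * np.pi * tone_hz * t)   # sub-pixel motion

rows = np.arange(100, dtype=float)
frames = np.array([
    # soft edge centered at row 50, displaced by the sub-pixel shift
    1.0 / (1.0 + np.exp(-(rows - 50 - s) / 2.0))
    for s in shift
])

# Per-frame motion estimate: centroid of the vertical intensity gradient
grad = np.diff(frames, axis=1)
centroid = (grad * rows[1:]).sum(axis=1) / grad.sum(axis=1)
signal = centroid - centroid.mean()

# Dominant frequency of the recovered motion signal
spectrum = np.abs(np.fft.rfft(signal * np.hanning(n_frames)))
freqs = np.fft.rfftfreq(n_frames, d=1 / fps)
recovered = freqs[spectrum[1:].argmax() + 1]   # skip the DC bin
print(round(recovered))   # -> 440
```

The recovered waveform is the motion itself; playing `signal` back as audio (suitably amplified) is, in miniature, what the MIT visual microphone does.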
Even more impressive, they adapted their approach to work with an everyday digital camera. The trick was to exploit the rolling shutter found in most consumer cameras, which scans a scene rapidly from top to bottom rather than capturing it all at once. Their reconstructed “Mary Had a Little Lamb” included frequencies up to six times higher than the camera’s actual frame rate of 60 frames per second.
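The reason a rolling shutter helps is worth a sketch. Because each sensor row is exposed at a slightly different instant, a 60 fps camera with, say, 1000 rows effectively samples scene motion at up to 60,000 times per second rather than 60. The numbers below are illustrative assumptions (an idealized sensor with no blanking interval between frames), not the paper’s hardware, but they show why a 360 Hz tone, six times the frame rate, survives:

```python
import numpy as np

# Sketch of why a rolling shutter raises the effective sampling rate.
# Each row is read out at a slightly later instant, so row readouts form
# a fast, regular sample stream (idealized: no blanking between frames).

fps, n_rows, n_frames = 60, 1000, 120
tone_hz = 360                        # 6x the frame rate
row_dt = 1 / (fps * n_rows)          # time between consecutive row readouts

# Timestamps of every row readout across all frames, in order
times = np.arange(n_frames * n_rows) * row_dt
samples = np.sin(2 * np.pi * tone_hz * times)

effective_rate = fps * n_rows
spectrum = np.abs(np.fft.rfft(samples * np.hanning(samples.size)))
freqs = np.fft.rfftfreq(samples.size, d=row_dt)
recovered = freqs[spectrum[1:].argmax() + 1]
print(effective_rate, round(recovered))   # -> 60000 360
```

A global shutter sampling the same tone at 60 fps would alias it down to 0 Hz; the sequential row readout is what preserves it.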
Here’s Hal Hodson, writing for New Scientist:
Davis says that although spying is the obvious application for visual microphones, he is more excited about using them as a new way of measuring the physical properties of objects remotely.
“We look at how light is reflected off an object, and that tells us the colour of that object,” Davis explains. “Now we can see how the object responds to sound. It’s a whole other dimension we could use. How something responds to sound indicates structural material properties that we’re not used to looking at, and our hope is that this project will find completely new applications.”
Davis says that while a regular digital camera with a rolling shutter can’t recover intelligible speech, it can recover vocal qualities such as the speaker’s gender. In many ways, the demonstration shows how sound leaves a literal imprint on our environment.
Photo Credit: betmari / Flickr (CC BY-NC 2.0)