Stare at a blank wall in any room, and you are unlikely to learn much more than the paint color. But a new technology can inconspicuously scan the same surface for shadows and reflections imperceptible to the human eye, then analyze them to determine details including how many people are in the room—and what they are doing. This tool could extrapolate information from a partial view of a space, perhaps spying on activity from around a corner or monitoring someone who avoids a camera’s line of sight.
As people move around a room, their bodies block a portion of any available light to create subtle and indistinct “soft shadows” on walls. Brightly colored clothing can even cast a dim, reflected glow. But these faint signals are usually drowned out by the main source of ambient light. “If we could do something like subtracting this ambient term from whatever we are observing, then you would just be left with camera noise—and signal,” says Prafull Sharma, a graduate student at the Massachusetts Institute of Technology. Sharma and other M.I.T. researchers isolated that ambient term by filming a wall in a room as its occupants moved around and averaging the frames over time. This eliminated the humans’ shifting shadows, leaving only the light from the main source plus shadows from furniture or other stationary objects. Then the researchers removed these features from the video in real time, revealing moving shadows on the wall.
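The averaging-and-subtraction idea can be illustrated with a toy example. The sketch below is not the researchers' code. It assumes a tiny synthetic "wall" stored as nested lists of brightness values, uses a plain mean over a finished batch of frames rather than a real-time estimate, and the function names `estimate_ambient` and `subtract_ambient` are hypothetical.

```python
# Illustrative sketch only: frames are tiny grayscale images stored as nested
# lists. The function names and the synthetic data are assumptions made for
# this example, not part of the published system.

def estimate_ambient(frames):
    """Average every pixel over time to approximate the static ambient term."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) / n for c in range(width)]
        for r in range(height)
    ]


def subtract_ambient(frame, ambient):
    """Remove the static estimate, leaving noise plus any moving signal."""
    return [
        [pixel - base for pixel, base in zip(row, ambient_row)]
        for row, ambient_row in zip(frame, ambient)
    ]


# Synthetic wall: brightness 100 everywhere, with a faint shadow (2 units
# darker) that moves one column per frame across a 1x4 strip of wall.
frames = []
for t in range(4):
    row = [100.0] * 4
    row[t] -= 2.0
    frames.append([row])

ambient = estimate_ambient(frames)
residuals = [subtract_ambient(frame, ambient) for frame in frames]

# In each residual frame, the most negative value marks the shadow's position.
shadow_columns = [res[0].index(min(res[0])) for res in residuals]
```

In this toy case the shadow changes each pixel by only 2 percent, yet once the averaged background is subtracted, the darkest residual in every frame sits exactly where the shadow was placed. Real footage would also carry camera noise, which is why the quality of the camera matters.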
Next, Sharma’s team recorded blank walls in several rooms in which the researchers enacted various scenarios and activities. People moved around, alone or in pairs, outside the camera’s view. Others crouched, jumped or waved their arms. Then the team fed the videos into a machine-learning model to teach it which soft shadow patterns indicated which behavior. The resulting system can automatically analyze footage of a blank wall in any room in real time, determining the number of people and their actions. The work was presented at the 2021 International Conference on Computer Vision in October.
Although this system can function without calibration in any room, it performs poorly in dim lighting or in the presence of a flickering light source such as a television. It can register only group sizes and activities for which it has been trained, and it requires a high-resolution camera; a standard digital camera created too much background noise, and smartphone camera results were even worse.
Despite its limitations, the method highlights how imaging and machine learning can transform imperceptible indicators into surveillance. “It’s a very cool scientific finding that such a low-intensity signal can be used to predict information,” Sharma says. “And of course, as we established, the naked eye cannot do this at all.”
A blank wall is far from the first innocent-looking item to reveal secrets about its surroundings. “In general, these are called side-channel attacks, or side-channel surveillance,” says Bennett Cyphers, staff technologist at the nonprofit Electronic Frontier Foundation, which promotes digital rights. “It’s when you use sources of information that aren’t directly what you’re looking for—that might be outside the box of normal ways of gathering information—to learn things that it doesn’t seem like you’d be able to.”
Side-channel attacks can take advantage of some extremely unassuming inputs. In 2020 researchers used reflections from various shiny objects—including a bag of chips—to reconstruct an image of a surrounding room. Sound and other vibrations can also yield a lot of indirect information. For example, audio of a person typing at a computer can reveal the words being written. And a computer itself can act as a microphone: in a 2019 study, researchers developed software that detected and analyzed how ambient sound waves jiggled a hard drive’s read head over its magnetic disk—and could thus effectively record conversations taking place near the machine.
Scientists have also developed floor-based sensors capable of detecting footstep vibrations, discerning individuals’ identities and even diagnosing them with certain illnesses. Most of these techniques rely on machine learning to detect patterns that human intelligence cannot. With high-resolution audiovisual recording and computational power becoming more widely available, researchers can train systems with many different inputs to glean information from often overlooked clues.
So far at least, the surveillance potential does not seem to be keeping many privacy advocates awake at night. “This blank-wall attack, and other sophisticated side-channel attacks like it, simply should not be a worry for the average person,” says Riana Pfefferkorn, a research scholar at the Stanford Internet Observatory. “They are cool tricks by academic researchers that are a long way off from being operationalized by law enforcement.” Routine use is “way off in the future, if ever—and even then, the police still couldn’t just trespass on your property and stick a camera up against your window.” Cyphers agrees. “Everyone carries a smartphone, tons of people have smart speakers in their houses, and their cars are connected to the Internet,” he notes. “Companies and governments don’t usually have to turn to things like footage of a blank wall to gather the kind of information that they want.”
Although side-channel methods are unlikely to target an average person for now, they could eventually find their way into real-world applications. “The military and intelligence agencies have always had specific uses for any kind of surveillance they can get their hands on,” Cyphers says. Sharma agrees that such uses are possible, but he also suggests some more innocuous ones: for example, vehicles could scan blank walls as part of an autonomous pedestrian-detection system for areas with poor lines of sight, such as parking garages. And some researchers who explore side-channel techniques suggest they could be used to monitor the elderly and detect falls or other problems.
Sharma says his own system would be capable of fall detection—if he had gathered the examples to train it. But, he quips, “I refuse to fall down in 20 different rooms to collect data.”