Why look at reality when you can edit what you see in real time?

The adoption of augmented reality is happening slowly but surely, and it’s easy to see one possible future for the technology: hardware that lets you edit what you see in real time, replacing objects around you with virtual overlays. Call it mixed reality, to be more precise.

Recent research from the AI team at Porsche (yes, the carmaker; we’ll get to that in a bit) shows how this might work. The team has built an AI system dubbed TransforMR that detects objects like cars and people, removes them, then replaces them with CGI alternatives in real time. The end results are hardly flawless (edits are haphazard and the CGI models look like they were borrowed from 3D Movie Maker) but the concept is striking. It’s not hard to imagine applications like this becoming commonplace in decades to come.

The team behind the work told The Verge that although individual elements of their work had been done before, the composite system is novel. TransforMR can run on regular smartphones and tablets, but requires a 4G connection to send data to the cloud. Images are processed so that objects are not just covered up, as with Snapchat AR lenses or Apple’s Memoji, but edited out entirely. Objects are detected, segmented, then “inpainted” (replaced with AI-generated background), and a CGI model is substituted for the original.
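To make the detect, segment, inpaint, and substitute sequence concrete, here is a toy sketch in Python. It is not Porsche's code: the real system reportedly relies on neural networks running in the cloud, whereas this stand-in finds the "object" by a marker pixel value and fills the hole by averaging neighboring pixels. The function names (`segment`, `inpaint`, `substitute`) and the tiny grayscale frame are assumptions made purely for illustration.

```python
from typing import List, Optional, Set, Tuple

Frame = List[List[int]]      # grayscale image as rows of pixel values
Mask = Set[Tuple[int, int]]  # (row, col) pixels that belong to the object


def segment(frame: Frame, object_value: int) -> Mask:
    """Toy stand-in for detection + segmentation: pick pixels with a marker value."""
    return {(r, c)
            for r, row in enumerate(frame)
            for c, value in enumerate(row)
            if value == object_value}


def inpaint(frame: Frame, mask: Mask) -> Frame:
    """Toy inpainting: fill masked pixels from the average of known 4-neighbors,
    working inward from the edge of the hole (a real system would use a model)."""
    out = [row[:] for row in frame]
    unknown = set(mask)
    while unknown:
        filled = {}
        for r, c in unknown:
            known = [out[nr][nc]
                     for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                     if 0 <= nr < len(out) and 0 <= nc < len(out[0])
                     and (nr, nc) not in unknown]
            if known:
                filled[(r, c)] = round(sum(known) / len(known))
        if not filled:
            break  # the whole frame is masked, so there is nothing to borrow from
        for (r, c), value in filled.items():
            out[r][c] = value
        unknown.difference_update(filled)
    return out


def substitute(frame: Frame, sprite: List[List[Optional[int]]],
               top: int, left: int) -> Frame:
    """Paste a replacement sprite onto the frame; None marks transparent pixels."""
    out = [row[:] for row in frame]
    for dr, sprite_row in enumerate(sprite):
        for dc, value in enumerate(sprite_row):
            if value is not None:
                out[top + dr][left + dc] = value
    return out


# A 3x3 background of 10s with one "object" pixel (99) in the middle.
frame = [[10, 10, 10],
         [10, 99, 10],
         [10, 10, 10]]
mask = segment(frame, object_value=99)
clean = inpaint(frame, mask)              # object removed, hole filled with background
edited = substitute(clean, [[7]], 1, 1)   # replacement drawn where the object was
```

The point of the ordering is that the replacement is drawn on a frame from which the original has already been erased, so a substitute with a different shape never leaves parts of the original poking out around it.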

The TransforMR model involves many distinct steps.

There are obviously lots of areas for improvement. The frame rate is just 15fps with low-quality inpainting; the lag is 50 to 100 milliseconds; and the CGI replacements are not the best quality. But the team behind the system says these aspects are relatively easy to improve.

“The main limitation is that large images are very compute-intensive,” Mohamed Kari, a machine learning researcher at Porsche, told The Verge. “So for the inpainting we do this with very small images currently, operating on 512 x 512 images. But the bandwidth [usage] is negligible. If you can do FaceTime you can do TransforMR.”

One of the key elements of the system, says Kari, is its use of pose detection. This means that when the system detects a person, for example, it identifies 18 separate joints in the body. That means the CGI replacement can be anchored to the target’s movement in real time. Kari compares this to other AR systems which simply identify geometric surfaces.
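As a rough illustration of what anchoring a model to detected joints can mean, the Python sketch below computes a scale, rotation, and translation that map two joints of a replacement model onto the same two joints found in the camera image. This is an assumed simplification, not the team's method: the article does not say which of the 18 joints are used or how the fit is done, and the joint names `neck` and `hip` and the function `make_anchor` are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]


def make_anchor(model: Dict[str, Point], detected: Dict[str, Point],
                a: str = "neck", b: str = "hip") -> Callable[[Point], Point]:
    """Build a function that maps model coordinates into image coordinates so
    that model joints `a` and `b` land on the detected joints `a` and `b`."""
    (mx, my), (mx2, my2) = model[a], model[b]
    (dx, dy), (dx2, dy2) = detected[a], detected[b]

    # Vector from joint a to joint b, in the model and in the image.
    mvx, mvy = mx2 - mx, my2 - my
    dvx, dvy = dx2 - dx, dy2 - dy

    scale = math.hypot(dvx, dvy) / math.hypot(mvx, mvy)
    angle = math.atan2(dvy, dvx) - math.atan2(mvy, mvx)
    cos_t, sin_t = math.cos(angle), math.sin(angle)

    def place(p: Point) -> Point:
        # Rotate and scale about model joint a, then move to detected joint a.
        x, y = p[0] - mx, p[1] - my
        return (dx + scale * (x * cos_t - y * sin_t),
                dy + scale * (x * sin_t + y * cos_t))

    return place


# Model torso is 1 unit long; the detected person's torso is 2 pixels long.
model_joints = {"neck": (0.0, 0.0), "hip": (0.0, 1.0)}
detected_joints = {"neck": (10.0, 10.0), "hip": (10.0, 12.0)}
place = make_anchor(model_joints, detected_joints)
hip_on_screen = place(model_joints["hip"])  # lands on the detected hip
```

Recomputing the transform every frame from fresh joint positions is what would let the replacement follow the person as they move, which a flat-surface anchor cannot do.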

Looking at clips of TransforMR in action, it’s not hard to imagine such software being integrated into AR glasses. Users could pick a “theme” for their day, replacing cars, buildings, and people with sci-fi alternatives, or items taken from nature. But, as Kari points out, this would involve a huge hardware challenge. Current augmented reality glasses can only project low-resolution, semi-opaque overlays onto their lenses. Right now, we just don’t have the technology to “edit” what users are seeing with this sort of hardware. (Though it could presumably be done using a “passthrough” VR system, where first-person cameras play a live video feed onto screens that completely occlude the wearer’s vision.)

“We are reproducing the full image on screen, so we can remove whatever we want to, but with augmented reality glasses removing objects is difficult because it adds light intensity,” says Kari. “In HoloLens for example, you are looking through the glass, so removing stuff is more difficult. That question is open to research.”

But why is Porsche investigating this sort of tech in the first place? According to one of the company’s AI architects, Tobias Grosse-Puppendahl, it’s all about improving the experience of passengers and drivers. Future versions of the TransforMR software could be used to entertain people when they’re stuck in traffic, Grosse-Puppendahl tells The Verge. “Our main question was, how can we modify reality in a way that is fun and entertaining to react with? And that’s where our idea originated from.”

Other research projects at Porsche follow a similar theme. For example, the company has also built a prototype system called SoundRide which uses a car’s machine vision to detect changes in scenery and cue up appropriate music. “Maybe, for example, you’re driving through the Alps, driving through a beautiful route, and suddenly you have a wonderful view and maybe the music changes,” says Kari. “We’re thinking how technology can make the experience in the car even more interesting and beautiful.” And that means tinkering with what people would otherwise see and hear.
