This is a pretty clever use of machine learning to solve a prevalent problem.
- A high-definition reference image of person A is sent to person B
- After that, most of the data sent from A to B is a stream of low-definition images
- The high-def reference image is combined with each new image, animating the parts that move (mouth, eyes, head position, etc.)
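The steps above can be sketched as a toy encoder. This is only an illustration of the payload sizes, not the actual system: `detect_keypoints`, the keypoint count, and the frame dimensions are all assumptions standing in for the real learned models.

```python
import numpy as np

FRAME_SHAPE = (720, 1280, 3)   # hypothetical HD frame, 8-bit RGB
NUM_KEYPOINTS = 10             # hypothetical number of facial keypoints

def detect_keypoints(frame: np.ndarray) -> np.ndarray:
    # Placeholder: a real system runs a learned keypoint detector here.
    return np.zeros((NUM_KEYPOINTS, 2))

def encode_frame_traditional(frame: np.ndarray) -> bytes:
    # Traditional path: every frame carries full pixel data
    # (shown uncompressed here purely for scale).
    return frame.tobytes()

def encode_frame_neural(frame: np.ndarray) -> bytes:
    # Neural path: the sender transmits only keypoint positions;
    # the receiver warps its cached high-def reference image to match.
    keypoints = detect_keypoints(frame)              # shape (NUM_KEYPOINTS, 2)
    return keypoints.astype(np.float32).tobytes()

frame = np.zeros(FRAME_SHAPE, dtype=np.uint8)
full = len(encode_frame_traditional(frame))   # 720 * 1280 * 3 = 2,764,800 bytes
tiny = len(encode_frame_neural(frame))        # 10 * 2 * 4 = 80 bytes
print(full, tiny)
```

The per-frame payload collapses from megabytes of pixels to a few dozen bytes of coordinates, which is where the bandwidth savings come from.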
The researchers report remarkable results: by replacing the traditional H.264 video codec with a neural network, they reduced the bandwidth required for a video call by roughly three orders of magnitude. In one example, the required data rate fell from 97.28 KB/frame to a measly 0.1165 KB/frame – about 0.12% of the original bandwidth.
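The ratio follows directly from the two figures quoted above:

```python
h264_kb_per_frame = 97.28      # figure quoted for the H.264 baseline
neural_kb_per_frame = 0.1165   # figure quoted for the neural codec

ratio = neural_kb_per_frame / h264_kb_per_frame
print(f"{ratio:.4%}")          # ~0.12% of the original bandwidth
print(f"{1 / ratio:.0f}x")     # roughly an 835x reduction
```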
Since the image you see is actually an animation and not a video, it can do some pretty spooky stuff, like twisting your head for you to make sure you’re looking into the camera or replacing your face with an animated avatar.
What we have ahead of us is certainly double-edged:
- A potentially incredible jump in the reliability of video calls
- Another nail in the coffin for unverified online identity, since it makes it possible to appear in a Zoom call wearing the face of anyone you have a good reference photo of.