For the most part, Zoom has dominated video conferencing, but it might soon face competition thanks to NVIDIA. Recently, NVIDIA announced its new GPU-Accelerated AI Platform, NVIDIA Maxine, that it says will “vastly improve streaming quality” and offer incredible AI-powered features.
NVIDIA Maxine is a cloud-native video-streaming AI platform, so data doesn’t need to be processed on local servers. Instead, NVIDIA’s servers process the information, so users can use the cool AI features without having to purchase any new specialized hardware.
“NVIDIA Maxine integrates our most advanced video, audio, and conversational AI capabilities to bring breakthrough efficiency and new capabilities to the platforms that are keeping us all connected,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA, in a press release.
Maxine’s “breakthrough efficiency” can be seen in its AI-based video compression technology. The AI tech reduces the bandwidth used on a call to one-tenth of what the H.264 video compression standard requires, without compromising video quality. Because less data is transmitted back and forth, slow internet connections and limited bandwidth should be much less of a problem. Hopefully, this helps bring an end to the dreaded “you have a poor connection, blah, blah, blah” message.
Some of the features that make NVIDIA Maxine stand out are face alignment and gaze correction. These two features allow for a better face-to-face conversation. For instance, people will no longer appear to be staring off into outer space. With face alignment, the software automatically adjusts people’s faces so it looks like they are facing each other. And gaze correction helps simulate eye contact. According to NVIDIA, “These features help people stay engaged in the conversation rather than looking at their camera.”
Also, if developers choose to do so, they can allow users to choose an animated avatar. These avatars offer a realistic feel because they are driven by a person’s “voice and emotional tone in real-time.” Plus, the auto frame feature automatically follows the person in the frame so they are always in view. This is great when you’re doing a presentation or demo.
The feature that stands out to me is the noise cancellation filter that removes background noise. Anyone with a toddler or dog will be a big fan of that one! Continually pressing the mute and unmute button could finally become a thing of the past.
Maxine also has “conversational AI.” With NVIDIA Jarvis (not to be confused with Iron Man’s Just A Rather Very Intelligent System), developers can integrate virtual assistants that take notes, set action items, and answer questions in human-like voices. Additionally, this AI offers translations and closed captions, all in real time.
A look at what NVIDIA Maxine has to offer makes it clear that Zoom has a lot of work to do if it wants to stay on top. Although Zoom did dabble with real-time captioning back in June, its offering was very limited. Meanwhile, Maxine is on its way up.
Early access to the NVIDIA Maxine platform is available to computer vision AI developers, software partners, startups, and computer manufacturers creating audio and video apps and services.
Veronica Garcia has a Bachelor of Journalism and Bachelor of Science in Radio/TV/Film from The University of Texas at Austin. When she’s not writing, she’s in the kitchen trying to attempt every Nailed It! dessert, or on the hunt trying to find the latest Funko Pop! to add to her collection.
