On Sept. 28, podcast host Lex Fridman uploaded a video titled “First Interview in the Metaverse,” in which he sat down with Mark Zuckerberg to discuss and demonstrate a new project Meta is working on called Codec Avatars. The project aims to create photorealistic avatars, derived from scans of users, that mimic their facial expressions and body movements in real time. The podcast featured segments from both Zuckerberg’s and Fridman’s points of view as they spoke to each other’s Codec Avatars from mere virtual feet away, while wearing VR headsets hundreds of miles apart.
As of now, the technology is far from streamlined. Users must undergo detailed scans from more than 100 different cameras, performing a variety of facial and bodily expressions, which are used to construct a detailed computer model. That model is then collapsed into a Codec. Once the user puts on a headset, it detects their facial features and expressions and, using the Codec, transmits only a compact encoded version of their appearance. So, in addition to being photorealistic, these avatars are far more bandwidth-efficient than transmitting full video, let alone an immersive 3D video of an entire scene.
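To make that bandwidth claim concrete, here is a rough back-of-the-envelope sketch in Python. The latent size of 256 values, the frame resolution and the toy “encoder” are illustrative assumptions; Meta has not published the actual dimensions or architecture of its Codec Avatar pipeline.

    import numpy as np

    # Illustrative, made-up numbers: Meta has not published the size of the
    # Codec Avatar representation, so LATENT_DIM is purely a placeholder.
    LATENT_DIM = 256                   # assumed per-frame expression code length
    FRAME_SHAPE = (1080, 1920, 3)      # one uncompressed 1080p RGB video frame

    def encode_expression(sensor_readings: np.ndarray) -> np.ndarray:
        """Stand-in for the headset-side encoder: collapse raw face-tracking
        signals into a compact latent code. A real system would use a trained
        neural encoder; a fixed random projection is used here only to show
        the size of the data that actually gets transmitted."""
        rng = np.random.default_rng(seed=0)
        projection = rng.standard_normal((sensor_readings.size, LATENT_DIM))
        return (sensor_readings.ravel() @ projection).astype(np.float32)

    # Toy face-tracking input: a 64x64 grid of sensor values.
    sensor_readings = np.zeros((64, 64), dtype=np.float32)
    code = encode_expression(sensor_readings)

    print(f"encoded expression: {code.nbytes / 1024:.1f} KiB per frame")
    print(f"raw video frame:    {int(np.prod(FRAME_SHAPE)) / 2**20:.1f} MiB per frame")

Even under these generous assumptions, the encoded expression data comes to about a kilobyte per frame, while a single uncompressed video frame runs to several megabytes, which is the gap behind the bandwidth savings described above.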
This technology has the potential to massively impact the way we interact with one another virtually. Never before has this level of photorealism been coupled with the immersive feel of virtual reality. Despite the complexity involved, ongoing research and development aims to simplify the user experience further, with the end goal of self-scans that users can make from their phones in no more than a few minutes.
Zuckerberg underscored the profound potential of Codec Avatars in mixed-reality spaces, where digital representations of people or objects are superimposed on real-world environments through VR headsets. For instance, a coworker could participate in a work meeting as a photorealistic hologram, faithfully replicating their facial expressions and body language from the comfort of their home. Zuckerberg also pointed out that some users might prefer an avatar version of themselves that is more emotive than their actual face, and a big question his team grapples with is how much control users should be given over that.
“So for example, you know, I always get a lot of critique[s]… for having, like, a relatively stiff expression,” Zuckerberg said, “…For me, […] I’d wanna have my avatar really be able to better express, like, how I’m feeling than how I can do physically.”
Because these avatars are generated from an initial scan, they do not account for subsequent changes such as body modifications, weight fluctuations or other shifts in appearance, which could undermine an avatar’s ability to accurately reflect the user’s current physical characteristics and expressions.
With these and many other problems still to be addressed, the final form of these avatars likely lies several years down the road, but Meta will continue to roll out updates until the product is ready for the mainstream market. Zuckerberg emphasized the importance of scanning more individuals to gather an extensive range of expressions, essentially “over-collecting” data. His team can then extrapolate from these scans to determine how streamlined and efficient the scanning process can become. Once these scans are efficient and accessible enough for the general public, Zuckerberg plans to integrate Codec Avatars into each of Meta’s apps.
Could Codec Avatars become the future of virtual communication? The answer, I think, lies in the accessibility and ease of obtaining these scans, society’s openness to embracing virtual or mixed reality and the growing demand for more dynamic and immersive communication experiences.
There is, however, reason for skepticism. Previous metaverse products developed by Zuckerberg have fallen short in several ways, including a limited sense of immersion caused by graphical constraints and a seeming inability of developers to grasp the kind of virtual experience users are looking for. Those who sought genuine immersion, an experience that transcended physical constraints, found themselves disillusioned with the shallow, cartoonish 3D experience offered in Zuckerberg’s Horizon Worlds and his other virtual reality products.
Codec Avatars seem to be a paradigm shift away from the superficial landscape of Horizon Worlds and toward a richer, more authentic virtual experience. Here, individuals can engage with lifelike avatars of one another, capturing the full range of expressive capacities unique to each person. I believe that the realization of Zuckerberg’s vision for Codec Avatars, and the success of this form of virtual reality, hinges on its ability to fulfill the promises it makes and to capture the sought-after essence of immersion that has long eluded definition. If Zuckerberg and his team are successful, this project could mark a transformative leap into a new era of communication and digital experience, redefining and expanding the boundaries of human interaction in the virtual realm.