This article is based on an article from the Japanese edition of Engadget and was created using the translation tool DeepL.
Japanese startup EmbodyMe has released xpression camera, a virtual camera app for video conferencing, free of charge. It is initially available for Mac only, with a Windows version to follow soon. Downloads are being opened in stages to users who have registered on the official website.
xpression camera is an app that lets you choose a single image of any person and then animate that image in real time with your own facial expressions. Because it can be set up as a virtual camera, it works with any video conferencing tool, such as Zoom or Google Meet. If you register a photo of yourself in a suit, you can attend a meeting even while wearing pajamas.
Another advantage is that it can reduce so-called "Zoom fatigue." Zoom fatigue comes from being constantly aware that your face is on screen and being watched, and using an avatar lessens that feeling.
xpression camera also has the benefit of running on a typical laptop. According to EmbodyMe's President Kazuhoshi Yoshida, similar efforts exist with open-source technology, but "you need a great machine to make it work in real-time," he says, "and it's not perfect in terms of quality. The front is often off, and it is far from practical." By contrast, xpression camera runs on common machines such as MacBooks.
The approach resembles what so-called VTubers do, but VTuber character models are expensive to create and are typically 2D animated characters, which limits the user base. xpression camera, by contrast, lets you animate your own childhood photos or photos of celebrities with your own expressions, so it can reach a wider range of users.
Technically, xpression camera relies on EmbodyMe's proprietary image recognition technology. Rather than using a 3D sensor such as the one behind the iPhone's Face ID, it uses image recognition alone to estimate more than 50,000 3D points on the face and track facial expressions in real time. The system also uses generative adversarial networks (GANs), a deep learning technique that drew wide attention through the deepfake video of former U.S. President Barack Obama. Together, these make it possible to create a 3D-style avatar from a single photo.
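The article does not describe how the tracked points are used internally, so the following is only a simplified Python sketch of the general idea of driving a still image with someone else's expression. It assumes that matching 3D points are available for the user's neutral face, the user's current face, and the avatar photo, and it moves each avatar point by the offset the user's point has made. The function name `transfer_expression`, the `gain` parameter, and the three-point toy data are hypothetical. A real system would track tens of thousands of points and would hand the result to a learned renderer such as a GAN rather than stop here.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def transfer_expression(
    neutral_driver: List[Point],
    current_driver: List[Point],
    neutral_avatar: List[Point],
    gain: float = 1.0,
) -> List[Point]:
    """Shift each avatar point by the offset of the matching driver point.

    Hypothetical helper for illustration only, not EmbodyMe's method.
    """
    if not (len(neutral_driver) == len(current_driver) == len(neutral_avatar)):
        raise ValueError("all point lists must have the same length")

    moved = []
    for (nx, ny, nz), (cx, cy, cz), (ax, ay, az) in zip(
        neutral_driver, current_driver, neutral_avatar
    ):
        # Offset of the user's point from its neutral position,
        # applied to the corresponding point on the avatar.
        moved.append(
            (ax + gain * (cx - nx), ay + gain * (cy - ny), az + gain * (cz - nz))
        )
    return moved


# Toy data: two eye corners and a chin point (made-up coordinates).
neutral_driver = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, -1.0, 0.0)]
current_driver = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, -1.5, 0.0)]  # jaw drops
neutral_avatar = [(10.0, 10.0, 0.0), (12.0, 10.0, 0.0), (11.0, 8.0, 0.0)]

animated = transfer_expression(neutral_driver, current_driver, neutral_avatar)
print(animated)
```

In this toy case only the chin point moves, because it is the only point where the user's face differs from the neutral pose.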
In the future, the company plans to introduce technology that generates facial expressions from voice alone, without tracking the face. This would let users join video conferences from anywhere, even while doing household chores or jogging.
Although the technology is being released for now as a virtual camera for video conferencing, it could be applied widely, including in remote customer service, education, and video production. In education, for example, a familiar character rather than the teacher's own face could teach a class. In video production, a person in existing footage could be made to speak any new line after the fact, with no additional filming.
When I tried a video conference using xpression camera, the result had not yet crossed the so-called "uncanny valley." Even so, although the avatar was generated from a still image, its expressions came across clearly, and it felt like a meeting with the camera turned on. If future advances do carry the technology across the uncanny valley, measures against fake attendees will need to be considered.
EmbodyMe, the developer, most recently raised approximately 230 million yen in September of last year. The funding came from DEEPCORE, Incubate Fund, Deep30 (a VC from the University of Tokyo's Matsuo Lab), Techstars, SMBC Venture Capital, a third-party allotment to Shigeru Urushibara, and a grant from NEDO's research and development venture support program. This brings the total raised to date to approximately 360 million yen.
The Japanese edition of Engadget does not guarantee the accuracy or reliability of this article.