Revolutionising Realism in AI-Generated Facial Animations

With just an audio clip and a face photo, a group of researchers from Nanyang Technological University, Singapore (NTU Singapore) have created a computer programme that generates lifelike videos that mimic the speaker’s facial expressions and head movements.

DIverse yet Realistic Facial Animations, or DIRFA, is an artificial intelligence programme that combines speech recognition and image processing to create a three-dimensional (3D) video in which a subject’s realistic and consistent facial animations are synchronised with spoken audio. The programme created by NTU improves on current methods, which struggle with emotional control and changes in head pose.

The team trained DIRFA on more than one million audiovisual clips from more than 6,000 individuals, obtained from the open-source VoxCeleb2 dataset, to predict cues from speech and associate them with head movements and facial expressions.

According to the researchers, DIRFA has the potential to open up new applications in a variety of fields and industries, including healthcare, by enabling chatbots and virtual assistants that are more realistic and sophisticated, thus enhancing user experiences.

Additionally, it could be a useful tool for people who have difficulty speaking or expressing themselves with their faces or voices, allowing them to convey their feelings and ideas through animated figures or digital representations and so communicate more effectively.

The study’s lead author, Associate Professor Lu Shijian of NTU Singapore’s School of Computer Science and Engineering (SCSE), stated, “Our study could have a significant and wide-ranging impact as it revolutionises multimedia communication by enabling the creation of incredibly lifelike videos of people speaking by combining techniques like AI and machine learning (ML).”

The programme also builds on previous research and represents a technological advancement, as videos created with the programme include accurate lip movements, vivid facial expressions, and natural head poses while using only audio recordings and static images.

“Speech exhibits a multitude of variations,” said first author Dr Wu Rongliang, a PhD graduate of NTU’s SCSE. People pronounce the same words differently in different contexts, with variations in duration, amplitude, tone, and other factors. Beyond its linguistic content, speech also conveys rich information about the speaker’s emotional state, as well as identity factors such as gender, age, ethnicity, and even personality traits.

Dr Wu, a Research Scientist at the Institute for Infocomm Research of Singapore’s Agency for Science, Technology and Research (A*STAR), added that the approach is a pioneering effort to improve performance in AI and ML from the standpoint of audio representation learning.

According to the researchers, creating lifelike facial expressions driven by audio poses a complex challenge. There are numerous possible facial expressions for a given audio signal, and these possibilities multiply when dealing with a sequence of audio signals over time.

Because audio has strong associations with lip movements but weaker associations with facial expressions and head positions, the team set out to create talking faces with precise lip synchronisation, rich facial expressions, and natural head movements that corresponded to the provided audio.

To address this, the team first created DIRFA, an AI model that captures the complex relationships between audio signals and facial animations. The team trained their model on over one million audio and video clips from a publicly available database of over 6,000 people.

The NTU researchers also plan to add more options and improvements to DIRFA’s interface, and to fine-tune its facial expressions using a broader range of datasets that include more diverse facial expressions and voice audio clips.
