Enhancing Data-Driven Molecular Discovery

Image credits: news.mit.edu

Scientists at MIT and a collaborative research lab have created a unified framework that simultaneously predicts molecular properties and generates new molecules more efficiently than conventional deep-learning methods.

To train a machine learning model to predict a molecule’s biological or mechanical properties, researchers typically show it millions of labelled molecular structures. However, such large training datasets are often hard to come by, because discovering and hand-labelling so many structures is difficult and expensive. As a result, the effectiveness of machine learning techniques is limited.

In contrast, the system developed by the MIT researchers can accurately predict molecular properties from only a small amount of data. The system is built on an understanding of the rules that govern how building blocks combine to form valid molecules. These rules capture the similarities between molecular structures, enabling the system to generate new molecules and predict their properties efficiently, even with limited data.

This approach outperformed other machine learning methods on datasets of varying sizes. It predicted molecular properties accurately and generated viable molecules even when given datasets containing fewer than 100 samples.

Minghao Guo, a graduate student in electrical engineering and computer science (EECS) and the study’s lead author, explains that the project’s objective is to use data-driven techniques to speed up the discovery of new molecules.

The aim is to train a model that can make predictions without relying on expensive experiments, reducing the cost and time of molecular discovery.

To get the best results from machine learning models, researchers need training datasets with millions of molecules whose properties are similar to those they hope to discover. In practice, however, such domain-specific datasets are usually very small. Consequently, researchers pre-train models on large datasets of general molecules.

These pre-trained models are then applied to the smaller, targeted datasets. Unfortunately, they tend to perform poorly because they have acquired little domain-specific knowledge.

The researchers at MIT took a different approach. They developed a machine learning system that automatically learns the “language” of molecules, known as a molecular grammar, using only a small, domain-specific dataset. The system then uses the learned grammar to generate viable molecules and predict their properties.
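To make the idea of a grammar with production rules more concrete, the following is a minimal Python sketch. It is not the researchers’ method: the study learns its grammar from data and works on molecular graphs, whereas this toy uses a hand-written string grammar as a stand-in. The rule names (`MOL`, `CHAIN`, `UNIT`, `BRANCH`), the token set and the function names are all hypothetical, and the output is only SMILES-like; chemical validity is not checked.

```python
import random

# Hypothetical toy production rules, NOT the grammar learned in the study.
# Keys are nonterminals; any symbol that is not a key is a terminal token.
RULES = {
    "MOL": [["CHAIN"]],
    "CHAIN": [["UNIT"], ["UNIT", "CHAIN"]],
    "UNIT": [["C"], ["O"], ["C", "(", "BRANCH", ")"]],
    "BRANCH": [["C"], ["N"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` by recursively applying randomly chosen production rules."""
    if symbol not in RULES:
        return [symbol]  # terminal token: emit as-is
    options = RULES[symbol]
    # Past max_depth, always take the first rule, which is non-recursive
    # for every nonterminal above, so the expansion is guaranteed to stop.
    rule = options[0] if depth >= max_depth else rng.choice(options)
    tokens = []
    for part in rule:
        tokens.extend(generate(part, rng, depth + 1, max_depth))
    return tokens


def sample_molecule(seed):
    """Return one SMILES-like string derived from the start symbol MOL."""
    rng = random.Random(seed)  # seeded so each sample is reproducible
    return "".join(generate("MOL", rng))


if __name__ == "__main__":
    for seed in range(5):
        print(sample_molecule(seed))
```

Every string this sketch produces follows the rules by construction, which is the property that lets a grammar generate only structures it considers well-formed. A learned grammar would replace the hand-written `RULES` table with rules inferred from a small dataset.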

Guo emphasised the effectiveness of the grammar-based representation used in this research. Because the grammar is general, it can be applied to other types of graph-based data, and the researchers are looking for domains beyond chemistry and materials science where the representation could be used.

“This grammar-based representation is very efficacious. And due to its general nature, the grammar can be applied to diverse types of graph-based data. Our aim is to explore additional applications beyond the domains of chemistry and material science,” Guo said.

The researchers plan to extend their molecular grammar to include the three-dimensional (3D) geometry of molecules and polymers, which is key to understanding the interactions between polymer chains.

They are also developing an interface that would let users view the learned grammar production rules and give feedback to correct any rules that may be wrong, improving the system’s accuracy.
