PolyU Study Advancing AI with Next Sentence Prediction

Generative artificial intelligence (GenAI) has significantly transformed social interactions, drawing considerable attention to large language models (LLMs) that leverage deep-learning algorithms for language processing. A recent study conducted by The Hong Kong Polytechnic University (PolyU) has revealed that LLMs exhibit brain-like performance when trained in ways analogous to human language processing. This finding offers crucial insights for both brain research and AI model development.

Image credits: Hong Kong Polytechnic University

LLMs today primarily depend on a single pretraining technique: contextual word prediction. This straightforward learning strategy, combined with vast amounts of training data and extensive model parameters, has yielded remarkable success, as exemplified by popular models like ChatGPT. Studies indicate that word prediction in LLMs can serve as a plausible model for human language processing. However, human language comprehension involves more than just predicting the next word; it integrates high-level contextual information.

To explore this, a research team led by Prof. LI Ping from PolyU investigated the next sentence prediction (NSP) task. This task simulates a core process of discourse-level comprehension in the human brain, assessing whether a pair of sentences is coherent. The team examined how this task affects model pretraining and its correlation with brain activity. Their findings were recently published in the journal *Science Advances*.
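To make the task concrete, here is a minimal sketch of how NSP training examples are commonly built: a coherent pair is a sentence followed by its true successor, and an incoherent pair swaps in a different sentence. The article does not describe how the PolyU team constructed its pairs, so the 50/50 sampling and the function name `make_nsp_pairs` are illustrative assumptions, not the study's method.

```python
import random


def make_nsp_pairs(sentences, rng):
    """Build (first, second, is_next) examples from an ordered list of sentences.

    Hypothetical helper for illustration; the 50/50 split between coherent and
    incoherent pairs is an assumption, not taken from the study.
    """
    if len(sentences) < 3:
        raise ValueError("need at least three sentences to sample an incoherent pair")
    pairs = []
    for i in range(len(sentences) - 1):
        first = sentences[i]
        if rng.random() < 0.5:
            # Coherent pair: the sentence that actually follows.
            pairs.append((first, sentences[i + 1], True))
        else:
            # Incoherent pair: any sentence except `first` and its true successor.
            candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
            pairs.append((first, rng.choice(candidates), False))
    return pairs
```

A model pretrained on NSP is then asked to predict the `is_next` label for each pair, alongside its usual word prediction objective.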

The researchers trained two models: one with NSP enhancement and one without, both incorporating word prediction. They collected functional magnetic resonance imaging (fMRI) data from participants reading connected and disconnected sentences. The team then analysed how closely the models’ patterns aligned with brain patterns observed in the fMRI data.
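One common way to compare model patterns with brain patterns is representational similarity analysis: compute how dissimilar each pair of stimuli is in the model, do the same for the brain data, and correlate the two sets of dissimilarities. The article does not say which method the team used, so the sketch below is only an assumed illustration, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def dissimilarities(patterns):
    """Pairwise dissimilarity (1 - Pearson r) for every pair of stimulus patterns."""
    out = []
    for i in range(len(patterns)):
        for j in range(i + 1, len(patterns)):
            out.append(1.0 - pearson(patterns[i], patterns[j]))
    return out


def alignment(model_patterns, brain_patterns):
    """Correlate model and brain dissimilarity structure (assumed RSA-style score).

    Each argument holds one pattern per stimulus, in the same stimulus order:
    for example, model activations and fMRI voxel responses for each sentence pair.
    """
    return pearson(dissimilarities(model_patterns), dissimilarities(brain_patterns))
```

Under this scheme, a higher `alignment` score for one model than another would indicate that its representations are organised more like the measured brain responses.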

The results showed that NSP training provided significant benefits. The NSP-enhanced model's internal patterns matched human brain activity across multiple brain areas more closely than those of the model trained solely on word prediction. Its mechanisms also aligned well with established neural models of human discourse comprehension. The research offers new insights into how the human brain processes extended discourse, such as conversations, revealing that both sides of the brain, not just the left, are involved in understanding longer narratives. Additionally, the NSP-trained model more accurately predicted reading speeds, suggesting that simulating discourse comprehension through NSP enables AI to better understand human language processing.

Recent advancements in LLMs, including ChatGPT, have primarily focused on scaling up training data and model size to enhance performance. However, Prof. Li highlights the limitations of relying solely on such scaling and advocates making models more efficient so that they need less data. The study's findings suggest that incorporating diverse learning tasks like NSP can make LLMs more human-like and potentially closer to human intelligence.

Prof. Li further emphasises that these findings demonstrate how neurocognitive researchers can leverage LLMs to study higher-level language mechanisms in the brain. This promotes collaboration between AI researchers and neurocognitive scientists, fostering studies on AI-informed brain research and brain-inspired AI development.

The PolyU study underscores the potential of NSP to enhance the performance and human-likeness of LLMs. By integrating high-level contextual information, these models can better mimic human language processing, offering a path towards more efficient and intelligent AI. The research highlights the importance of diverse learning tasks in AI training and paves the way for future interdisciplinary collaborations that can advance both AI and brain research.
