A multidisciplinary team of Massachusetts Institute of Technology (MIT) researchers led by Iddo Drori, a lecturer in the MIT Department of Electrical Engineering and Computer Science (EECS), has used a neural network model to solve university-level math problems at a human level in a matter of seconds.
“It will help students improve, and it will help teachers create new content, and it could help increase the level of difficulty in some courses. It also allows us to build a graph of questions and courses, which helps us understand the relationship between courses and their prerequisites, not just by contemplating them historically, but based on data,” explained Drori, who is also an adjunct associate professor in Columbia University’s Department of Computer Science.
Additionally, the model automatically explains solutions and rapidly generates new math problems for university-level courses. When the researchers presented these machine-generated questions to university students, the students were unable to distinguish whether the questions were created by a human or an algorithm.
This approach could simplify the creation of course content, which would be particularly beneficial for large residential courses and massive open online courses (MOOCs) with thousands of students. The technology might also serve as an automated tutor that shows students how to solve basic math problems.
In the past, researchers employed neural networks, such as GPT-3, that were pretrained only on text: the model is shown millions of examples of text and learns the patterns of natural language. This time, they employed a neural network that was pretrained on text and then “fine-tuned” on code.
This network, known as Codex, undergoes what is effectively an additional pretraining step that makes the model perform better: it is exposed to millions of code examples from online repositories. Because its training data contains both millions of natural-language words and millions of lines of code, the model learns the relationships between text and code.
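The program-synthesis idea behind this approach can be sketched as follows: rather than asking the model to state the answer directly, the model is prompted with a course question and produces a short program, and executing that program yields the answer. The snippet below is a minimal illustrative sketch, not the researchers' actual pipeline; the sample question and the `solve` function stand in for what a code-tuned model might generate.

```python
# Illustrative sketch: a code-tuned model is prompted with a course question
# and emits a program; running the program produces the answer.
from fractions import Fraction
from itertools import product

QUESTION = "Two fair six-sided dice are rolled. What is the probability their sum is 7?"

# The kind of program such a model might generate for the question above:
def solve():
    outcomes = list(product(range(1, 7), repeat=2))          # all 36 equally likely rolls
    favourable = sum(1 for a, b in outcomes if a + b == 7)   # the 6 rolls summing to 7
    return Fraction(favourable, len(outcomes))

print(QUESTION)
print("Answer:", solve())  # Answer: 1/6
```

Executing generated code, rather than trusting generated prose, is what lets the system produce exact answers in seconds.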
The machine-generated questions were evaluated by showing them to university students. The researchers assigned students 10 problems from each undergraduate math course, presented in random order: five written by humans and five generated by the model.
Students were unable to discern whether the machine-generated questions were produced by an algorithm or a human, and they scored the difficulty level and course-appropriateness of questions generated by humans and machines similarly.
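The blinded survey setup described above can be sketched in a few lines: for each course, five human-written and five machine-generated questions are tagged with their (hidden) origin and shuffled together, so raters cannot infer origin from position. The question lists here are placeholders, not the study's actual items.

```python
# Illustrative sketch of a blinded question survey (placeholder questions).
import random

def build_survey(human_qs, machine_qs, seed=0):
    """Mix 5 human and 5 machine questions into one randomly ordered list."""
    assert len(human_qs) == len(machine_qs) == 5
    items = [(q, "human") for q in human_qs] + [(q, "machine") for q in machine_qs]
    random.Random(seed).shuffle(items)   # fixed seed only for reproducibility here
    return items

human = [f"Human question {i}" for i in range(5)]      # placeholders
machine = [f"Machine question {i}" for i in range(5)]  # placeholders
survey = build_survey(human, machine)
print([q for q, _ in survey])  # raters see only the questions, in random order
```

Keeping the origin label alongside each question lets the researchers later compare the difficulty and appropriateness ratings of the two groups.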
The researchers emphasised that this effort is not meant to replace human teachers. They note that although the automation has reached 80 per cent accuracy, it will never reach 100 per cent: every time someone figures something out, someone else will pose a more challenging problem.
Rather, this work opens the door for people to begin using machine learning to answer ever-harder questions, and the researchers are optimistic that it will have a significant impact on higher education.
Encouraged by the approach’s effectiveness, the team has expanded the work to handle math proofs, although there are several limitations they intend to address: the model currently cannot answer questions with a visual component, nor solve problems that are computationally intractable.
Beyond addressing these obstacles, the team is scaling the model up to hundreds of courses. Those courses will produce more data, which the researchers can use to improve the automation and to offer insights into course design and curricula.