Recent research by Assistant Professor David Jurgens and doctoral candidate Jiaxin Pei of the University of Michigan School of Information shows that the people who annotate data, including texts, videos, and other online media, exert a substantial influence on it through their demographics, life experiences, and backgrounds.
Jurgens emphasised that annotators interpret the same content in different ways, and that these diverse perspectives significantly shape how data is labelled. The study underscores the importance of understanding annotators' backgrounds and of gathering labels from a demographically balanced cohort of crowd workers, an approach intended to mitigate the biases often baked into datasets.
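To make the idea concrete, here is a minimal Python sketch, assuming a simple numeric rating scale and hypothetical group names rather than the study's actual protocol, of how labels might be aggregated so that each demographic group contributes equally to an item's final label:

```python
from collections import defaultdict

def balanced_label(annotations):
    """Aggregate one item's ratings so each demographic group counts equally.

    `annotations` is a list of (group, rating) pairs; the group names and
    the two-stage averaging rule here are illustrative assumptions.
    """
    by_group = defaultdict(list)
    for group, rating in annotations:
        by_group[group].append(rating)

    # Average within each group first, then across groups, so a group
    # with many annotators cannot dominate the final label.
    group_means = [sum(ratings) / len(ratings) for ratings in by_group.values()]
    return sum(group_means) / len(group_means)

# Three annotators from group "A" and one from group "B": the balanced
# label is roughly 2.67, whereas a naive pooled mean over all four
# ratings would be 3.5.
print(balanced_label([("A", 4), ("A", 5), ("A", 4), ("B", 1)]))
```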
The researchers analysed a dataset of 6,000 comments. That analysis revealed a crucial insight: annotators' beliefs and decision-making processes profoundly affect the performance of the machine learning models used to flag the deluge of content posted online every day.
What one segment of the population deems polite may be rated far less polite by another, highlighting how nuanced and subjective judgements of politeness and offensiveness are in the digital realm. It is a reminder of how differently online interactions and content can be perceived across a diverse society.
Pei underscored that the kind of data an AI system learns from matters, and that a key contribution of the study is showing why it is important to know who labelled that data. When annotation is limited to a narrow subset of the population, the resulting AI system may fail to capture the perspective of the broader population.
Jurgens and Pei's research was driven by the aspiration to understand annotators' divergent identities and the influence their experiences exert on their decision-making. Unlike previous studies that focused on a single aspect of identity, such as gender, their aim was to improve AI models so that they better capture the diverse beliefs and opinions of the full spectrum of individuals.
The study yielded several notable insights. First, it challenges earlier research by finding no statistically significant difference in offensive-language ratings between men and women. It does show, however, that individuals with nonbinary gender identities tend to rate messages as less offensive than those who identify as men or women. In addition, participants aged 60 and above tend to assign higher offensiveness scores than middle-aged participants.
A striking observation concerned the influence of race on offensiveness ratings. Black participants consistently rated the same comments as more offensive than participants from other racial groups did. This suggests that classifiers trained on data annotated by white annotators may underestimate how offensive comments appear to Black and Asian readers.
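As one hedged illustration of how such a group difference might be tested, the Python sketch below applies a nonparametric Mann-Whitney U test to two sets of 1-to-5 offensiveness ratings. The ratings are invented for demonstration and are not the study's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical 1-5 offensiveness ratings of the same comments from two
# annotator groups; these numbers are made up for illustration.
ratings_group_a = np.array([4, 5, 3, 4, 5, 4, 3, 5])
ratings_group_b = np.array([2, 3, 2, 3, 2, 4, 2, 3])

# A nonparametric test is a reasonable default for ordinal ratings: it
# asks whether one group systematically rates comments as more offensive.
stat, p_value = mannwhitneyu(ratings_group_a, ratings_group_b,
                             alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")
```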
Interestingly, annotators' education levels showed no significant relationship to their offensiveness ratings, suggesting that educational background did not meaningfully shape these perceptions.
Building on these findings, Jurgens and Pei developed POPQUORN, the POtato-Prolific dataset for QUestion-Answering, Offensiveness, text Rewriting, and politeness rating with demographic Nuance. The dataset gives social media and AI companies an opportunity to build models that account for diverse, intersectional perspectives and beliefs.
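For readers who want a feel for working with such data, here is a brief Python sketch of computing intersectional rating breakdowns from a POPQUORN-style file. The file name and column names are assumptions for illustration, not the dataset's actual schema.

```python
import pandas as pd

# Hypothetical file and columns: assume one row per (comment, annotator)
# pair, with the annotator's demographic attributes attached.
df = pd.read_csv("popquorn_offensiveness.csv")

# Mean offensiveness broken down by an intersection of demographic
# attributes, rather than a single pooled average across all annotators.
breakdown = (
    df.groupby(["race", "gender", "age_group"])["offensiveness"]
      .agg(["mean", "count"])
      .sort_values("mean", ascending=False)
)
print(breakdown.head())
```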
Jurgens raised a crucial question about the growing use of generative AI in daily tasks: what values are embedded in these trained models? If samples are repeatedly taken without acknowledging demographic differences, certain groups of people may continue to be marginalised.
Pei emphasised that POPQUORN is pivotal in ensuring equitable systems that align with individuals’ beliefs and backgrounds, fostering inclusivity and fairness in AI applications.