Journal of Computational Social Science
Forma mentis networks predict creativity ratings of short texts via interpretable artificial intelligence in human and AI-simulated raters
Creativity is a fundamental skill of human cognition. This study introduces textual forma mentis networks (TFMNs) to extract network and emotional features from stories generated by humans, GPT-3.5, and Sonnet 3.7. Using Explainable Artificial Intelligence (XAI) with XGBoost, we investigate whether features motivated by Mednick's associative theory of creativity explain creativity ratings assigned by humans or by AI. Our findings indicate that GPT-3.5 and Sonnet 3.7 ratings differ significantly from human assessments, with the AI models exhibiting a bias towards their own generated content. Network features are more predictive of human creativity ratings, while emotional features matter more for the AI models' self-ratings. These results highlight critical limitations in current AI models' ability to align with human perceptions of creativity, underscoring the need for caution in deploying AI for creative content evaluation and generation.
Executive Impact: Quantifying AI's Role in Creative Assessment
Our research uncovers critical performance metrics and behavioral patterns of AI models in assessing creativity, providing a quantitative basis for strategic AI integration.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
The introduction sets the stage for understanding creativity as a fundamental human skill and how it's measured in textual data. It highlights the challenges of evaluating text creativity due to subjectivity and introduces the potential of network science and machine learning approaches, along with the gap in research concerning LLM creativity assessment.
This section details the methods used, including the collection of human-, GPT-3.5-, and Sonnet 3.7-generated stories, the process of assigning creativity ratings by human and AI raters, and the computation of Textual Forma Mentis Networks (TFMNs) to extract network and emotional features. Explainable AI (XAI) with SHAP values is employed to interpret model predictions.
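To illustrate this interpretability step, the sketch below fits an XGBoost regressor on a toy table of TFMN-derived features and attributes its predictions with SHAP values. The feature names and rating values are illustrative placeholders, not the study's data or exact pipeline.

```python
# Minimal sketch: fit a gradient-boosted regressor on TFMN-style features
# and explain its predictions with SHAP values.
# Feature names and values below are illustrative, not the paper's schema.
import numpy as np
import pandas as pd
import xgboost as xgb
import shap

# Illustrative feature matrix: one row per story.
X = pd.DataFrame({
    "aspl": [2.1, 2.8, 2.4, 3.0, 2.2, 2.6],            # average shortest path length
    "lcc_size": [55, 72, 61, 80, 58, 66],               # largest connected component size
    "degree_centrality": [0.12, 0.09, 0.11, 0.08, 0.13, 0.10],
    "joy": [0.30, 0.55, 0.42, 0.60, 0.35, 0.48],        # share of joy-laden words
    "anger": [0.05, 0.02, 0.04, 0.01, 0.06, 0.03],
})
y = [3.5, 2.8, 4.1, 2.5, 3.8, 3.2]                       # creativity ratings (toy values)

model = xgb.XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X, y)

# SHAP's TreeExplainer attributes each predicted rating to individual
# features, making the model's decisions inspectable feature by feature.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Rank features by mean absolute SHAP contribution.
mean_abs = np.abs(shap_values).mean(axis=0)
for name, val in sorted(zip(X.columns, mean_abs), key=lambda t: -t[1]):
    print(f"{name}: {val:.3f}")
```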
Textual Forma Mentis Network (TFMN) Construction
TFMNs provide a powerful tool for analyzing the structural and conceptual organization of short stories. They capture both syntactic and emotional links, breaking narratives down into associations between words.
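A minimal sketch of this idea, assuming spaCy and networkx are available: content words are linked to their syntactic heads to form a word-association network. The edge rule is a simplification of the full TFMN pipeline, and emotion tags (e.g., from an affect lexicon) are omitted here.

```python
# Minimal sketch: build a word-association network from a short story.
# Edges link syntactically related content lemmas; this simplifies the
# full TFMN construction and omits the emotion-labelling step.
# Requires: pip install spacy networkx && python -m spacy download en_core_web_sm
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")

def build_tfmn(text: str) -> nx.Graph:
    """Return an undirected network of content-word associations."""
    doc = nlp(text)
    graph = nx.Graph()
    for token in doc:
        # Keep only content words and link each to its syntactic head.
        if (token.is_alpha and not token.is_stop
                and token.head.is_alpha and not token.head.is_stop
                and token.head is not token):
            graph.add_edge(token.lemma_.lower(), token.head.lemma_.lower())
    return graph

story = "The old lighthouse keeper painted impossible colours on the waves."
network = build_tfmn(story)
print(sorted(network.nodes()))
print(sorted(network.edges()))
```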
This section presents the findings of the study, including the comparison of story lengths, statistical differences in network and emotion features between human- and AI-generated stories, and model performance evaluations. It highlights the varying feature importance patterns identified by SHAP scores across different rating scenarios.
AI Rating Divergence: A Core Challenge
p < 0.001: Significant Difference (Human vs. AI Ratings)
Our analysis revealed statistically significant differences between human ratings and both GPT-3.5 and Sonnet 3.7 ratings of human-authored stories, highlighting a fundamental divergence in assessment criteria.
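The sketch below shows the kind of nonparametric comparison that can surface such a divergence, using a Mann-Whitney U test on two illustrative rating samples; it is not a reproduction of the paper's exact test or data.

```python
# Sketch: test whether two sets of creativity ratings differ, using a
# nonparametric Mann-Whitney U test. The rating values below are
# illustrative placeholders, not data from the study.
from scipy.stats import mannwhitneyu

human_ratings = [3.5, 4.0, 2.5, 3.0, 4.5, 3.5, 2.0, 4.0]
gpt35_ratings = [4.5, 4.0, 4.5, 5.0, 4.0, 4.5, 5.0, 4.0]

stat, p_value = mannwhitneyu(human_ratings, gpt35_ratings, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.4f}")
if p_value < 0.001:
    print("Ratings differ significantly at the p < 0.001 level.")
```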
This discussion interprets the findings, focusing on the structural and emotional differences between human- and AI-generated stories. It explores the implications of AI's distinct evaluative frameworks for creativity and emphasizes the need for caution when using AI for creative content generation and assessment.
| Assessment Focus | Human Raters | GPT-3.5 (Self-Rating) | Sonnet 3.7 (Mixed) |
|---|---|---|---|
| Key Features Prioritized | Network features (ASPL, LCC, degree centrality) | Emotional features (joy, anger, anticipation) | Mixed (joy, clustering coefficient, degree centrality) |
| Impact on Rating | Higher scores for balanced cohesion & flexibility | Higher scores for vivid, positive emotional content | Higher scores for blend of structure & emotion |
| Underlying Framework | Associative Theory of Creativity (Mednick) | Emotional elicitation, less structural novelty | Nuanced consideration of both |
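For reference, the network features named in the table (LCC size, ASPL, degree centrality, clustering coefficient) can be computed from a TFMN-style graph with networkx, as in the sketch below; the toy graph stands in for a real story network.

```python
# Sketch: compute the network features listed above from a TFMN-style
# graph with networkx. A built-in toy graph stands in for a story network.
import networkx as nx

graph = nx.karate_club_graph()  # placeholder for a story's TFMN

# Largest connected component (LCC) and its size.
lcc_nodes = max(nx.connected_components(graph), key=len)
lcc = graph.subgraph(lcc_nodes)
lcc_size = lcc.number_of_nodes()

# Average shortest path length (ASPL), computed on the LCC so it is defined.
aspl = nx.average_shortest_path_length(lcc)

# Mean degree centrality and average clustering coefficient.
mean_degree_centrality = sum(nx.degree_centrality(graph).values()) / graph.number_of_nodes()
avg_clustering = nx.average_clustering(graph)

print(f"LCC size: {lcc_size}, ASPL: {aspl:.2f}, "
      f"mean degree centrality: {mean_degree_centrality:.3f}, "
      f"clustering: {avg_clustering:.3f}")
```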
The Homogenisation Effect in AI Narratives
"In a [ADJECTIVE] town/village/empire..." or "The grand cathedral stood...", and even character names repeated across unrelated stories.
Source: Section 6.4
Both GPT-3.5- and Sonnet 3.7-generated stories exhibited strong similarities in narrative and sentence structure, frequently starting with common phrases and repeating character names. This 'homogenisation effect' indicates a lack of semantic diversity and underscores AI's limitations in generating truly varied creative outputs.
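One simple way to quantify such homogenisation is the mean pairwise lexical similarity across stories; the TF-IDF cosine sketch below is an illustrative proxy rather than the measure used in the study.

```python
# Sketch: quantify homogenisation as the mean pairwise cosine similarity
# between stories, using TF-IDF vectors as a simple lexical proxy.
# The stories below are illustrative; the study's own diversity analysis
# is not reproduced here.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

stories = [
    "In a quiet town, the grand cathedral stood over the market square.",
    "In a bustling village, the grand cathedral stood above the river.",
    "The old sailor mended nets while storms argued with the cliffs.",
]

vectors = TfidfVectorizer().fit_transform(stories)
sims = cosine_similarity(vectors)

# Average similarity over distinct story pairs: higher = more homogeneous.
upper = sims[np.triu_indices_from(sims, k=1)]
print(f"Mean pairwise similarity: {upper.mean():.3f}")
```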
The study concludes by summarizing the key findings: significant differences in creativity assessment between human and AI raters, AI models' bias towards their own content, and divergent feature importance (structural vs. emotional) for human and AI evaluations. It highlights limitations in current LLMs' interpretive frameworks for creativity and stresses the need for caution and further alignment with human evaluative criteria.
Calculate Your Potential ROI with AI-Powered Creativity Analysis
Estimate the efficiency gains and cost savings your enterprise could achieve by integrating our interpretable AI creativity assessment solutions.
Your AI Creativity Assessment Journey
A structured roadmap for integrating interpretable AI into your creative content evaluation workflows, ensuring a smooth transition and maximum impact.
Phase 1: Discovery & Strategy
Initial consultation to understand your current creative assessment processes, identify key challenges, and define clear objectives for AI integration. We'll develop a tailored strategy aligned with your enterprise goals.
Phase 2: Data Preparation & Model Training
Work with your team to prepare and annotate creative data. Our experts will then train custom TFMN and XAI models, ensuring they accurately reflect your specific creativity criteria and content types.
Phase 3: Integration & Pilot Deployment
Seamlessly integrate the AI assessment tools into your existing platforms. Conduct pilot programs with a subset of your team to gather feedback and refine the system for optimal performance and user experience.
Phase 4: Full-Scale Rollout & Continuous Improvement
Deploy the AI creativity assessment solution across your enterprise. Establish ongoing monitoring, provide comprehensive training, and implement continuous model improvement based on performance data and evolving needs.
Ready to Unlock Deeper Insights into Creative Content?
Don't let subjective evaluations limit your creative potential. Partner with us to implement AI-powered, interpretable creativity assessment tailored to your enterprise needs.