Enterprise AI Analysis
Trust to Reliance: Measurement Constructs for Human-AI Appropriate Reliance
This paper reviews measurement constructs for human-AI appropriate reliance, distinguishing between trust, reliance, and appropriate reliance. It highlights the lack of consensus among studies and the need for objective metrics to assess whether users rely appropriately on AI advice, aiming to advance research in human-AI decision-making.
Key Performance Indicators
Optimizing human-AI interaction is crucial for enterprise efficiency. Our analysis reveals key metrics influencing effective AI integration.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Artificial Intelligence (AI) systems are increasingly used across diverse domains, yet whether humans can make effective decisions with them remains an open problem. Research has explored various constructs, broadly split between trust and reliance, to measure human perception of and behavior with AI systems. The goal is for users to rely on AI when it provides correct advice and to decide correctly even when the AI is wrong, a pattern termed 'appropriate reliance'. This work reviews measurement constructs for assessing people's appropriate reliance on AI advice, clarifies conceptual differences, and contrasts competing views of reliance and their objective metrics. Measurement constructs for human-AI appropriate reliance are still nascent and lack consensus across studies; this work explores objective metrics for assessing appropriate human reliance on AI advice.
AI-assisted decision-making research operationalizes trust as a user's subjective perception and reliance as observable behavior. Trust captures a broad overall impression, while reliance reflects case-by-case behavior. Early work focused on trust, but recent studies increasingly focus on reliance. Both constructs can be contested in terms of how well they differentiate the ways humans consume AI advice: high trust can lead to over-reliance if not properly calibrated, and low trust can lead to under-reliance even on a highly accurate system. Appropriate reliance means humans can discriminate between correct and incorrect AI recommendations, avoiding both over- and under-reliance. Recent reviews of human-AI decision-making have focused on trust and reliance, but appropriate reliance remains underexplored. This work adds conceptual clarity and explores objective metrics for appropriate reliance.
A systematic protocol (PRISMA framework) was used to identify and analyze research studies. Keywords such as 'over-reliance', 'under-reliance', 'appropriate reliance', 'artificial intelligence', 'machine learning', and 'human-AI decision-making' were queried in SCOPUS and the ACM Digital Library. The 729 search results (Jan 2018 - Dec 2025) were filtered against four inclusion criteria: peer-reviewed full papers, studies with human subjects, human-AI assistive decision-making contexts, and evaluation of objective metrics for appropriate reliance. This resulted in 22 selected papers. The protocol and coding are shared in an Open Science Framework (OSF) repository for evaluation.
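As an illustration, the screening stage of such a protocol can be expressed as a simple filter over exported records. The sketch below is hypothetical: the `Record` fields and criterion flags are illustrative stand-ins, not the coding scheme from the paper's OSF repository.

```python
from dataclasses import dataclass

@dataclass
class Record:
    title: str
    year: int
    peer_reviewed_full_paper: bool    # criterion 1: peer-reviewed full paper
    human_subjects: bool              # criterion 2: study with human subjects
    assistive_decision_making: bool   # criterion 3: human-AI assistive decision-making context
    objective_reliance_metrics: bool  # criterion 4: evaluates objective appropriate-reliance metrics

def passes_inclusion(r: Record) -> bool:
    """Apply the publication window (Jan 2018 - Dec 2025) and all four criteria."""
    return (2018 <= r.year <= 2025
            and r.peer_reviewed_full_paper
            and r.human_subjects
            and r.assistive_decision_making
            and r.objective_reliance_metrics)

# e.g., records exported from SCOPUS and the ACM Digital Library (729 in the review)
records = [Record("Example study", 2021, True, True, True, True)]  # placeholder entry
selected = [r for r in records if passes_inclusion(r)]  # applied to the full export, this yielded 22 papers
print(f"{len(selected)} of {len(records)} records retained")
```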
Human-AI reliance is conceptualized under three views: Traditional, Appropriateness, and Dominance. The Traditional view (45% of studies) assesses objective behavior, asking whether users follow correct AI advice and avoid incorrect advice, but it does not fully capture whether users can identify AI errors. The Appropriateness view (45% of studies) defines Relative Self-Reliance (RSR) and Relative AI Reliance (RAIR), measuring how often users correctly reject wrong AI advice (RSR) or switch to correct AI advice (RAIR), as sketched below. The Dominance view (Cabitza et al.) captures how the technology exerts influence, which can be beneficial (helping users avoid mistakes) or detrimental (inducing more mistakes or over-reliance). This work contributes by distinguishing these views and their implications for measuring appropriate reliance.
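For concreteness, the sketch below shows one common way RAIR and RSR are computed from two-step decision logs (initial human decision, AI advice, final decision, ground truth). The `Case` fields are illustrative, and exact operationalizations vary across the surveyed studies.

```python
from dataclasses import dataclass

@dataclass
class Case:
    initial: int  # human's initial decision
    advice: int   # AI recommendation
    final: int    # human's final decision
    truth: int    # ground-truth answer

def rair(cases: list[Case]) -> float | None:
    """Relative AI Reliance: among cases where the human started wrong and the
    AI advice was correct, the fraction where the human switched to the AI."""
    eligible = [c for c in cases if c.initial != c.truth and c.advice == c.truth]
    return sum(c.final == c.truth for c in eligible) / len(eligible) if eligible else None

def rsr(cases: list[Case]) -> float | None:
    """Relative Self-Reliance: among cases where the human started right and the
    AI advice was wrong, the fraction where the human kept the correct decision."""
    eligible = [c for c in cases if c.initial == c.truth and c.advice != c.truth]
    return sum(c.final == c.truth for c in eligible) / len(eligible) if eligible else None
```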
Studies employed both objective and subjective metrics. Objective metrics include Decision Accuracy (16 studies), Agreement Fraction (8), Switch Fraction (8), AI's effect on accuracy (5), Over-reliance (13), Under-reliance (8), Relative Self-Reliance (9), Relative AI Reliance (10), and Weight of Advice; representative computations are sketched below. Subjective metrics include Confidence (8 studies), Trust in Automation (5), Perceived Trust (4), Mental Demand (5), Usefulness (4), and Understanding of advice (4). The advice protocol (concurrent vs. sequential) shapes which metrics apply; most studies (20) use a two-step, sequential process. Task difficulty, user expertise, and personality traits serve as auxiliary measures. Overall, there is limited consensus on common measurements of human reliance on AI across studies.
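Reusing the `Case` records from the sketch above, the following shows how several of these objective metrics are commonly computed. Definitions differ slightly between papers, so treat these as representative rather than canonical.

```python
def agreement_fraction(cases: list[Case]) -> float:
    """Share of all cases where the final decision matches the AI advice."""
    return sum(c.final == c.advice for c in cases) / len(cases)

def switch_fraction(cases: list[Case]) -> float | None:
    """Among cases where the initial decision disagreed with the AI,
    how often the human switched to the AI advice."""
    disagreed = [c for c in cases if c.initial != c.advice]
    return sum(c.final == c.advice for c in disagreed) / len(disagreed) if disagreed else None

def over_reliance(cases: list[Case]) -> float | None:
    """Among cases where the AI advice was wrong, how often it was still followed."""
    ai_wrong = [c for c in cases if c.advice != c.truth]
    return sum(c.final == c.advice for c in ai_wrong) / len(ai_wrong) if ai_wrong else None

def under_reliance(cases: list[Case]) -> float | None:
    """Among cases where the AI advice was correct, how often it was rejected."""
    ai_right = [c for c in cases if c.advice == c.truth]
    return sum(c.final != c.advice for c in ai_right) / len(ai_right) if ai_right else None

def weight_of_advice(initial: float, advice: float, final: float) -> float | None:
    """WoA = (final - initial) / (advice - initial), for continuous estimation tasks."""
    return (final - initial) / (advice - initial) if advice != initial else None
```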
Assessing human-AI appropriate reliance differs from assessing appropriate trust. Our findings reveal gaps in both constructs and measurement in realistic contexts. We aim to clarify appropriate-reliance constructs and metrics, focusing on objective performance in case-by-case interactions and on users' ability to discriminate correct from incorrect AI assistance. Task complexity, user expertise, and situational factors (time pressure, uncertainty) all shape reliance behavior. Calibrating user self-confidence is crucial, especially with generative AI tools, where reliance is non-binary. Future work should explore appropriate reliance in contexts with user agency and evolving AI interactions.
Enterprise Process Flow
| View | Key Characteristics | Measurement Focus |
|---|---|---|
| Traditional | Assesses objective behavior: following correct AI advice and avoiding incorrect advice (45% of studies) | Agreement and switching behavior; does not fully capture users' ability to identify AI errors |
| Appropriateness | Defines Relative Self-Reliance (RSR) and Relative AI Reliance (RAIR) (45% of studies) | Correct rejection of wrong AI advice (RSR) and switching to correct AI advice (RAIR) |
| Dominance | Captures how the technology exerts influence on the decision-maker (Cabitza et al.) | Beneficial influence (avoiding mistakes) vs. detrimental influence (more mistakes, over-reliance) |
Impact of Explanations on Appropriate Reliance
Research indicates that AI explanations play a crucial role in fostering appropriate reliance; however, studies highlight that the *type* and *quality* of an explanation significantly influence whether users rely on AI advice appropriately.
Challenges
- ✗ Fragmented metrics hinder generalization across domains.
- ✗ Lack of consensus on definitions and measurements for appropriate reliance.
- ✗ Subjective metrics often do not align with actual reliance behavior.
- ✗ Underexplored aspects: user reliance development over time, task complexity, user expertise, and personality traits.
Solutions
- ✓ Standardized objective metrics for over- and under-reliance.
- ✓ Focus on users' ability to discriminate correct/incorrect AI assistance.
- ✓ Calibrating user self-confidence in decision-making (see the calibration sketch after this list).
- ✓ Exploring appropriate reliance in contexts with user agency and evolving AI (e.g., generative AI) interactions.
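As a concrete illustration of the calibration point above, a generic expected-calibration-error (ECE) computation over users' self-reported confidence is sketched below. This is a standard technique offered as an assumption of how calibration could be measured, not a method prescribed by the reviewed paper.

```python
# Generic sketch (not from the paper): expected calibration error of a user's
# self-reported confidence against their final-decision accuracy, binned into deciles.
def expected_calibration_error(confidences: list[float], correct: list[bool], n_bins: int = 10) -> float:
    """confidences: floats in [0, 1]; correct: whether each final decision was right."""
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0 into the top bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / n) * abs(avg_conf - accuracy)  # weighted confidence-accuracy gap
    return ece
```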
Advanced ROI Calculator
Estimate the potential return on investment for optimizing human-AI appropriate reliance in your enterprise. Improved reliance leads to fewer errors, increased efficiency, and better decision quality.
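As an illustration of the arithmetic such a calculator performs, here is a minimal sketch in which every figure is a hypothetical placeholder rather than a result from the underlying research.

```python
# Back-of-envelope ROI sketch; all numbers below are hypothetical placeholders.
def reliance_roi(decisions_per_year: int, cost_per_error: float,
                 baseline_error_rate: float, improved_error_rate: float,
                 program_cost: float) -> float:
    """ROI of a reliance-improvement program, counting avoided decision errors only."""
    errors_avoided = decisions_per_year * (baseline_error_rate - improved_error_rate)
    savings = errors_avoided * cost_per_error
    return (savings - program_cost) / program_cost

# Example: 50,000 AI-assisted decisions/year, $200 per error,
# error rate improved from 8% to 5%, $150k program cost -> 100% ROI.
print(f"ROI: {reliance_roi(50_000, 200, 0.08, 0.05, 150_000):.0%}")
```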
Implementation Timeline
Our phased approach ensures a seamless integration of strategies designed to foster appropriate human-AI reliance within your organization.
Phase 1: Assessment & Strategy
Comprehensive analysis of current AI integration, user reliance patterns, and identification of key areas for improvement. Develop a tailored strategy for fostering appropriate reliance.
Phase 2: Metric Implementation & Training
Deploy standardized objective metrics for monitoring reliance. Conduct targeted training programs for users to improve their ability to discriminate AI advice.
Phase 3: System Calibration & Iteration
Calibrate AI systems and interfaces based on feedback and metric data. Implement continuous improvement cycles to refine human-AI interaction and reliance over time.
Ready to Optimize Human-AI Collaboration?
Schedule a personalized strategy session to discuss how your enterprise can achieve appropriate AI reliance and unlock new levels of efficiency and accuracy.