Enterprise AI Analysis
Modeling Structural Deviation in 10-K Risk Factors: A Semantic Anomaly Detection and Explainable AI Approach
This study presents an exploratory methodological framework for examining structural changes in regulatory risk disclosure using sentence embeddings, multivariate anomaly detection, and explainable artificial intelligence. Prior research typically relies on dictionary-based word frequencies, tone indicators, or topic proportions to quantify risk disclosure. While these measures capture disclosure intensity, they do not directly assess whether the internal semantic organization of risk narratives has shifted relative to historical patterns. We propose a structural semantic deviation framework that represents each company-year disclosure using thematic shares and embedding-based dispersion statistics and evaluates deviations from a historical baseline through unsupervised anomaly detection. Using Item 1A Risk Factors from Wells Fargo and JPMorgan Chase surrounding the 2016 regulatory shock as a focused two-firm case study, we show that traditional lexical metrics do not clearly isolate structural breaks, whereas embedding-based semantic trajectories reveal substantial narrative reconfiguration. Isolation-based modeling provides stable and discriminative anomaly scores in this setting, and SHAP decomposition highlights semantic distance, litigation emphasis, and disclosure contraction as important drivers of deviation in 2025 out-of-sample disclosures. These findings should be interpreted as methodological evidence rather than broad population-level claims. The study demonstrates how structural semantic modeling can be operationalized in regulatory disclosure analysis and provides a transparent framework that can be extended to larger panels and cross-industry settings in future research.
Key Operational Impact
Understand the quantifiable benefits and strategic implications of implementing advanced semantic analysis and Explainable AI in your enterprise risk management.
Deep Analysis & Enterprise Applications
Dictionary-Based Measures
Explores how traditional dictionary-based word counts and sentiment indicators have been used to quantify risk disclosure and their limitations in capturing structural shifts. It highlights that these methods often fail to detect deeper narrative restructuring, even when disaggregated into tone-specific categories.
Traditional Metrics Miss Structural Breaks
2016 regulatory shock undetected by word counts
Traditional frequency-based metrics, including raw risk word counts and dictionary-based intensity measures, failed to capture the structural break associated with the 2016 Wells Fargo scandal, a major regulatory event. They suggested only incremental variation rather than narrative reconfiguration, which underscores their limited ability to detect deeper semantic shifts.
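To illustrate why a frequency-based measure can stay flat while the narrative changes, here is a minimal sketch of a dictionary-based intensity score. The lexicon `RISK_TERMS`, the function name, and the two sample passages are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; published dictionaries are far larger.
RISK_TERMS = {"risk", "risks", "litigation", "penalty", "penalties",
              "adverse", "loss", "losses"}


def risk_word_intensity(text: str) -> float:
    """Return the share of tokens that appear in the risk lexicon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[term] for term in RISK_TERMS)
    return hits / len(tokens)


# Two invented passages with the same words in a different order.
before = "Litigation risk may cause losses. Market conditions may change."
after = "Market conditions may change. Litigation risk may cause losses."

intensity_before = risk_word_intensity(before)
intensity_after = risk_word_intensity(after)
print(intensity_before, intensity_after)
```

Both passages receive the same score because the measure ignores ordering and organization, which is the limitation described above.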
Semantic Modeling
Introduces the use of sentence embeddings and semantic dispersion metrics to capture internal distributional properties of annual filings. This approach goes beyond word frequencies to model the joint distribution of thematic shares and semantic dispersion patterns, identifying deviations from a learned historical baseline.
Enterprise Process Flow
The proposed framework integrates thematic shares, sentence embeddings, and embedding-derived dispersion statistics into a multi-dimensional feature space. This allows for a more nuanced understanding of disclosure patterns, moving beyond simple word counts to capture structural reorganization.
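The feature construction described above might be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the theme keyword sets, the function names, and the choice of cosine distance are all hypothetical, and the sentence embeddings are assumed to arrive as plain lists of floats from whichever pretrained model is in use.

```python
import math
import re

# Hypothetical theme keyword sets (assumption, not from the study).
THEMES = {
    "litigation": {"litigation", "lawsuit", "settlement"},
    "governance": {"governance", "board", "oversight"},
}


def thematic_shares(sentences):
    """Fraction of sentences containing at least one keyword per theme."""
    shares = {}
    for theme, keywords in THEMES.items():
        hits = sum(
            1 for s in sentences
            if keywords & set(re.findall(r"[a-z]+", s.lower()))
        )
        shares[theme] = hits / len(sentences) if sentences else 0.0
    return shares


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dispersion_stats(vectors, baseline_centroid):
    """Distance to the historical centroid and spread around the own centroid."""
    own = centroid(vectors)
    to_baseline = [cosine_distance(v, baseline_centroid) for v in vectors]
    to_own = [cosine_distance(v, own) for v in vectors]
    return {
        "semantic_mean_distance": sum(to_baseline) / len(to_baseline),
        "semantic_dispersion": sum(to_own) / len(to_own),
    }


# Invented toy inputs: three sentences and two 2-dimensional "embeddings".
shares = thematic_shares([
    "We face litigation and a pending settlement.",
    "The board provides oversight.",
    "Interest rates may rise.",
])
stats = dispersion_stats([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
print(shares, stats)
```

Concatenating `shares` and `stats` (plus a length feature such as sentence count) gives one feature vector per company-year.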
Anomaly Detection & XAI
Details the application of unsupervised anomaly detection (e.g., Isolation Forest) to identify structural deviations in risk narratives, and the use of Explainable AI (SHAP values) to decompose anomaly scores into economically meaningful components. This allows for transparent interpretation of what drives structural change.
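For readers unfamiliar with isolation-based scoring, the sketch below is a deliberately simplified, standard-library-only version of the idea. It builds each tree on the full baseline rather than on random subsamples, and the baseline feature rows are invented; a production system would more likely use an established library implementation.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random feature/threshold splits."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands, adjusted for unsplit points in the leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, feature, split, left, right = tree
    branch = left if x[feature] < split else right
    return path_length(branch, x, depth + 1)


def fit_forest(rows, n_trees=200, seed=0):
    """Fit on the baseline; assumes at least two baseline rows."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(max(len(rows), 2)))
    trees = [build_tree(rows, 0, max_depth, rng) for _ in range(n_trees)]
    return trees, len(rows)


def anomaly_score(forest, x):
    """Score in (0, 1]; shorter average paths give higher scores."""
    trees, n = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(n))


# Invented baseline rows: [semantic_mean_distance, litigation_share].
baseline = [[0.10, 0.20], [0.12, 0.22], [0.11, 0.19],
            [0.09, 0.21], [0.13, 0.20], [0.10, 0.18]]
forest = fit_forest(baseline)
typical_score = anomaly_score(forest, [0.11, 0.20])
shifted_score = anomaly_score(forest, [0.60, 0.55])
print(typical_score, shifted_score)
```

A filing whose features sit far outside the baseline range is separated in fewer splits, so it receives the higher score.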
| Feature | JPMorgan Chase | Wells Fargo |
|---|---|---|
| Overall Anomaly Score | Much higher (3x Wells Fargo) | Moderate |
| Semantic Mean Distance | Largest contributor, geometrically farther from historical centroid, implying narrative reconfiguration | Less dramatic contribution, structure closer to baseline |
| Litigation Emphasis | Elevated share, stronger legal framing | Remains relatively low but gradually increases |
| Disclosure Length | Sharp contraction (60% reduction), compounded anomaly effect | Stable volume, incremental adjustments |
While both firms experienced the 2016 regulatory shock, their disclosure responses in 2025 varied significantly. JPMorgan exhibited a stronger structural deviation driven by a substantial reduction in disclosure length, increased semantic distance, and elevated litigation content. Wells Fargo, in contrast, showed more moderate structural adjustments with a balanced SHAP profile, maintaining stable disclosure length and incremental increases in governance-related themes.
Explaining the 2025 Deviation: SHAP Insights for JPM
For JPMorgan Chase, the 2025 disclosure showed significant deviation driven primarily by three factors identified by SHAP: semantic mean distance, litigation share, and sentence count contraction. The large semantic mean distance indicates a fundamental shift in how JPM's narrative is structured, moving geometrically farther from its historical semantic centroid. This suggests a deep narrative reconfiguration beyond simple lexical adjustments. The elevated litigation share highlights a stronger legal framing in its recent disclosures compared to historical norms. Furthermore, the sharp contraction in the total number of sentences in the 2025 filing (a 60% reduction) significantly amplified this structural deviation, altering the proportional weighting across various semantic features.
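The additive decomposition that SHAP provides can be illustrated with exact Shapley values computed by brute-force enumeration. The sketch below does not use the `shap` library, and `toy_score`, its coefficients, and all feature values are invented for illustration; only the 60% sentence-count contraction echoes the text above.

```python
from itertools import permutations


def toy_score(f):
    """Hypothetical deviation score with one interaction term."""
    contraction = 1.0 - f["sentence_count"] / 1000.0
    return (2.0 * f["semantic_mean_distance"]
            + 1.5 * f["litigation_share"]
            + 1.0 * contraction
            + f["semantic_mean_distance"] * f["litigation_share"])


def shapley_values(score_fn, baseline, observed):
    """Average each feature's marginal effect over all feature orderings."""
    names = list(baseline)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)      # start from the historical profile
        previous = score_fn(current)
        for name in order:
            current[name] = observed[name]  # switch one feature to its new value
            updated = score_fn(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orders) for name, total in totals.items()}


# Invented feature values; sentence_count falls by 60% (1000 to 400).
baseline_features = {"semantic_mean_distance": 0.10,
                     "litigation_share": 0.05,
                     "sentence_count": 1000}
observed_features = {"semantic_mean_distance": 0.40,
                     "litigation_share": 0.15,
                     "sentence_count": 400}

contributions = shapley_values(toy_score, baseline_features, observed_features)
total_change = toy_score(observed_features) - toy_score(baseline_features)
print(contributions, total_change)
```

The contributions sum to the change in score between the baseline profile and the observed filing, which is what makes a feature-level reading of the deviation possible. Enumeration scales factorially with the number of features, so it is practical only for small feature sets like this one.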
Calculate Your Potential ROI
Our AI-powered structural semantic analysis identifies critical deviations in regulatory disclosures, enabling proactive risk management and enhanced compliance. By automating the detection of nuanced narrative shifts, enterprises can reduce manual review time, mitigate oversight risks, and optimize resource allocation. This leads to substantial operational efficiencies and improved decision-making.
Your Implementation Roadmap
A phased approach to integrating structural semantic analysis and Explainable AI into your existing enterprise risk management workflows.
Phase 1: Data Ingestion & Baseline Establishment
Collect and preprocess historical 10-K risk factor filings. Establish a robust baseline of 'normal' disclosure patterns using sentence embeddings and thematic features from prior years (e.g., 2014-2018 for this study). Validate baseline stability.
Phase 2: Semantic Model Deployment & Anomaly Detection
Deploy pretrained sentence embedding models and train unsupervised anomaly detectors (e.g., Isolation Forest) on the established baseline. Integrate the system to continuously monitor new filings for structural deviations, generating percentile-based risk scores.
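One simple way to turn a raw anomaly score into a percentile-based risk score is to rank it against the scores of the baseline filings. The function name and the sample scores below are hypothetical.

```python
from bisect import bisect_right


def percentile_score(baseline_scores, new_score):
    """Percent of baseline scores at or below new_score (0 to 100)."""
    ordered = sorted(baseline_scores)
    return 100.0 * bisect_right(ordered, new_score) / len(ordered)


# Invented anomaly scores for five baseline filings.
baseline_scores = [0.40, 0.42, 0.45, 0.47, 0.50]
risk_percentile = percentile_score(baseline_scores, 0.48)
print(risk_percentile)  # 80.0
```

With a short baseline the percentile moves in coarse steps (20 points per filing here), so thresholds should be chosen with the baseline size in mind.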
Phase 3: Explainable AI Integration & Reporting
Integrate SHAP values to decompose anomaly scores, providing clear, feature-level explanations for detected structural deviations. Develop intuitive dashboards and reports that highlight specific linguistic and semantic drivers of risk, enabling targeted compliance actions and strategic adjustments.
Phase 4: Feedback Loop & Continuous Improvement
Establish a feedback mechanism with legal, compliance, and risk teams to refine model parameters and thematic keyword sets. Continuously monitor model performance and retrain with new data, ensuring the system adapts to evolving regulatory landscapes and disclosure practices.
Ready to Transform Your Risk Disclosure Analysis?
Book a personalized consultation with our AI specialists to discuss how structural semantic deviation modeling and Explainable AI can elevate your enterprise risk management strategy.