AI RESEARCH ANALYSIS

Joint Graph Learning for Robust Causal Inference over Knowledge Graphs

Causal inference is critical for understanding cause-effect relationships in real-world domains. However, applying it over knowledge graphs (KGs) poses unique challenges due to two key issues: missing attributes caused by the Open-World Assumption and interference effects arising from complex relational dependencies among entities. Existing methods often assume fully observed data or fail to model inter-unit dependencies, leading to biased or unreliable effect estimates. We introduce BALU, a joint graph learning framework that addresses both challenges through an end-to-end solution. BALU reformulates the causal inference over KGs as two interconnected tasks: (1) attribute imputation as edge prediction between units (entities) and their attributes, and (2) treatment effect estimation as node prediction that accounts for interference through representation learning. BALU employs Graph Neural Networks (GNNs) to capture attribute similarity and relational structure, enabling both accurate imputation and interference-aware message passing. Experiments on four benchmark datasets show that BALU consistently outperforms state-of-the-art baselines—even when enhanced with strong imputation techniques—demonstrating robust performance in incomplete and relationally complex KGs. These results demonstrate that BALU offers a principled and practical solution for robust causal inference in knowledge-driven domains, empowering data-driven decision-making under real-world conditions of incompleteness and relational complexity.

Authors: Hao Huang, Maria-Esther Vidal
Publication: WSDM '26: Proceedings of the Nineteenth ACM International Conference on Web Search and Data Mining (February 2026)

Executive Impact & Strategic Value

The paper introduces BALU, a graph learning framework for robust causal inference over Knowledge Graphs (KGs). It addresses two key challenges: attribute incompleteness (missing data) and relational interference (dependencies between entities). BALU unifies attribute imputation via edge prediction and treatment effect estimation via node prediction, using Graph Neural Networks (GNNs). Experimental results demonstrate BALU's superior performance over state-of-the-art baselines on various datasets, even with strong imputation techniques. This framework provides accurate and reliable causal effect estimation in incomplete and relationally complex KGs, supporting data-driven decision-making.

Performance boost (vs. baselines): errors about one order of magnitude lower

Attribute imputation accuracy: improved by graph learning

Interference-aware estimation: robust treatment effect (TE) estimates

Reduction in RMSE on synthetic data: ~0.8

Reduction in MAE on synthetic data: ~0.0

Deep Analysis & Enterprise Applications

Problem Statement

Causal inference over Knowledge Graphs (KGs) faces two obstacles: attribute incompleteness (data missing under the Open-World Assumption) and relational interference (complex dependencies between entities). Traditional methods assume fully observed data and independent units, which leads to biased estimates. This work aims to estimate the Individual Treatment Effect (ITE) and the Average Treatment Effect (ATE) for units in KGs while addressing both obstacles. A motivating example in the paper shows how missing data and interference can lead to underestimated treatment effects and a biased study population. The proposed solution, BALU, jointly performs attribute imputation and causal effect estimation using graph learning.
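For reference, the two estimands can be written in standard potential-outcome notation. The symbols below (Y_i(1), Y_i(0), \tau_i, n) are conventional choices and are not taken from the paper. Under interference, each potential outcome may also depend on the treatments of related units, which this simplified form leaves out.

```latex
\tau_i = Y_i(1) - Y_i(0), \qquad \mathrm{ATE} = \frac{1}{n} \sum_{i=1}^{n} \tau_i
```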

Methodology

BALU (Bipartite Attribute and Link-based Unit learning) is a graph learning framework with three main components:
1. Graph Representation: Models units and their contextual attributes as a bipartite graph, with attribute values as edge labels. It also defines the observed attributes, the treatment (PT), and the outcome (PY).
2. Data Imputation Component: An L-layer neural network learns node embeddings for units and attributes, and edge embeddings for attribute edges. Each layer applies Unit-Attribute Message Passing and Relational Message Passing, followed by Edge Embedding Updating. A Feedforward Neural Network (FNN) then predicts missing attribute values, which are integrated into the contextual representations.
3. Causal Estimation Component: Takes the enriched contextual representations and estimates ITEs. Interference is modeled with a K-layer Graph Neural Network (GNN) that aggregates causal influence from neighbors. Treatment assignment and potential outcome estimation are treated as node prediction tasks, optimized with a joint loss function that includes cross-entropy and the Wasserstein-1 distance.
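The combination of a cross-entropy term and a Wasserstein-1 term can be illustrated with a toy example. This is a minimal sketch, not the paper's loss: it assumes one-dimensional representations, equal-sized treated and control groups (so Wasserstein-1 reduces to the mean absolute difference between sorted samples), and a hypothetical weight `alpha`. The outcome-prediction term is omitted because the summary does not describe it.

```python
import math


def binary_cross_entropy(probs, labels):
    """Mean cross-entropy between predicted treatment probabilities and 0/1 labels."""
    eps = 1e-12  # guards against log(0)
    total = sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(probs, labels)
    )
    return -total / len(labels)


def wasserstein1_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this simplified form needs equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def joint_loss(treat_probs, treat_labels, reps, alpha=1.0):
    """Cross-entropy on treatment prediction plus a group-balance penalty."""
    treated = [r for r, t in zip(reps, treat_labels) if t == 1]
    control = [r for r, t in zip(reps, treat_labels) if t == 0]
    balance = wasserstein1_1d(treated, control)
    return binary_cross_entropy(treat_probs, treat_labels) + alpha * balance
```

The balance term shrinks as the treated and control representations become more similar in distribution, which is the usual motivation for adding such a penalty.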

Enterprise Process Flow

Graph Representation → Data Imputation (Edge Prediction) → Contextual Representation → Interference Modeling (GNN) → Causal Effect Estimation (Node Prediction)

Experimental Results

BALU was evaluated on synthetic (Instagram, YouTube) and semi-synthetic (BlogCatalog, Flickr) datasets under data-complete (p_miss = 0.0) and data-incomplete (p_miss > 0.0) scenarios. The evaluation addresses three questions:
Q1: Unit relatedness and causal inference (CI) performance: In data-complete scenarios, BALU consistently outperforms all baselines (T-/X-/R-Learner, CausalForest, GNN-HSIC, SAGE-HSIC, NetDeconf, SPNet). On synthetic data its RMSE and MAE are approximately one order of magnitude lower, which supports the claim that modeling unit similarity via relationships improves CI.
Q2: Imputation and CI enhancement: Imputation generally improves performance (5-20% gains in RMSE, 10-40% in MAE). BALU and BALU(-edge) consistently outperform all baselines, even when the baselines are paired with strong imputation techniques.
Q3: Relational signals and imputation: Relational signals significantly boost imputation performance. Edge embeddings help, but BALU maintains strong performance without them, whereas removing relationships from imputation leads to noticeable drops on the semi-synthetic datasets. Statistical significance tests confirm BALU's superior performance for MAE across all datasets and scenarios, and for RMSE in most comparisons.
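A data-incomplete scenario of this kind can be simulated by hiding observed attribute values. The sketch below assumes each value is masked independently with probability `p_miss` (missing completely at random); the function name `mask_attributes` and the dictionary layout are illustrative and not taken from the paper.

```python
import random


def mask_attributes(attributes, p_miss, seed=0):
    """Copy {unit: {attr: value}}, dropping each value with probability p_miss."""
    if not 0.0 <= p_miss <= 1.0:
        raise ValueError("p_miss must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the mask reproducible
    masked = {}
    for unit, attrs in attributes.items():
        # Keep a value only when the random draw falls at or above p_miss.
        masked[unit] = {a: v for a, v in attrs.items() if rng.random() >= p_miss}
    return masked
```

With `p_miss=0.0` the data is returned unchanged, which corresponds to the data-complete setting.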

Feature	Traditional CI Methods	BALU Framework
Missing Data Handling	Assumes complete data; excludes incomplete cases or uses simple imputation (e.g., Mean, KNN)	Jointly performs attribute imputation via graph learning, integrates missing values
Interference Modeling	Assumes unit independence; limited to homogeneous graphs	Explicitly models interference via Graph Neural Networks (GNNs) over relational structures
Data Completeness Assumption	Violated by Open-World Assumption in KGs	Addresses incompleteness via edge prediction and representation learning
Performance on KGs (with missing data)	Biased or unreliable estimates; struggles with complex relational dependencies	Significantly more accurate and robust causal estimates (up to 1 order of magnitude lower errors)

Conclusion

BALU is a novel framework for robust causal inference over KGs, integrating data imputation and interference-aware causal estimation. It addresses attribute incompleteness and relational interference, outperforming state-of-the-art baselines. Future work includes handling missing relationships, other missingness patterns (MAR, MNAR), tabular data settings, multi-type entities, multi-hop causal effects, and deeper exploitation of KG semantics.

About 10x lower RMSE and MAE than baselines in causal effect estimation on synthetic data (data-complete scenarios)

Your BALU Implementation Roadmap

A typical phased approach to integrate robust causal inference into your existing knowledge graph infrastructure.

Data Preparation & Graph Construction

Construct bipartite graph from KG, initialize node and edge embeddings.
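One way to picture this step, assuming the KG attributes arrive as (unit, attribute, value) triples. The triple format and the function names are hypothetical, and embedding initialization is left out.

```python
def build_bipartite_graph(triples):
    """Build a unit-attribute bipartite graph with attribute values as edge labels."""
    units, attributes, edges = set(), set(), {}
    for unit, attribute, value in triples:
        units.add(unit)
        attributes.add(attribute)
        edges[(unit, attribute)] = value  # observed value becomes the edge label
    return units, attributes, edges


def missing_edges(units, attributes, edges):
    """Unit-attribute pairs with no observed edge: the targets for edge prediction."""
    return sorted((u, a) for u in units for a in attributes if (u, a) not in edges)
```

For example, if "alice" has both "age" and "smoker" recorded while "bob" has only "age", then `missing_edges` returns the single pair ("bob", "smoker").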

Attribute Imputation

Train L-layer GNN for message passing, predict missing attributes using FNN.
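The learned GNN and FNN cannot be reproduced from this summary, but the idea of using related units to predict a missing attribute edge can be shown with a much simpler stand-in: averaging the values observed on a unit's neighbors. This is a deliberately swapped-in heuristic, not BALU's imputation model, and all names are hypothetical.

```python
def impute_from_neighbors(edges, neighbors, unit, attribute):
    """Predict a missing numeric attribute as the mean over related units that have it."""
    observed = [
        edges[(n, attribute)]
        for n in neighbors.get(unit, [])
        if (n, attribute) in edges
    ]
    if not observed:
        return None  # no relational signal available for this unit
    return sum(observed) / len(observed)
```

A learned model replaces the plain average with trained message passing, but the inputs are the same: observed attribute edges and the relationships between units.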

Contextual Representation & Interference Modeling

Form enriched representations, model interference via K-layer GNN.
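As a rough illustration of interference-aware message passing, the sketch below runs K rounds of mean aggregation over unit relationships, mixing each unit's representation with the average of its neighbors'. It uses fixed averaging in place of learned weights, scalar representations, and a hypothetical mixing weight `self_weight`; none of these details come from the paper.

```python
def propagate(reps, neighbors, num_layers=2, self_weight=0.5):
    """Run num_layers rounds of mean-neighbor aggregation over scalar representations."""
    current = dict(reps)  # copy so the caller's dictionary is not modified
    for _ in range(num_layers):
        updated = {}
        for unit, value in current.items():
            nbrs = [current[n] for n in neighbors.get(unit, []) if n in current]
            if nbrs:
                nbr_mean = sum(nbrs) / len(nbrs)
                updated[unit] = self_weight * value + (1 - self_weight) * nbr_mean
            else:
                updated[unit] = value  # isolated units keep their representation
        current = updated
    return current
```

Each additional round lets information travel one more hop, so `num_layers` controls how far a neighbor's influence can reach.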

Causal Effect Estimation

Predict treatment assignments and potential outcomes using FNNs, calculate ITE and ATE.
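Once potential outcomes under treatment and control have been predicted for each unit, ITE and ATE follow by subtraction and averaging. The sketch assumes the predictions are already available as two aligned lists; the function names are illustrative.

```python
def individual_treatment_effects(y1_pred, y0_pred):
    """Per-unit effect: predicted outcome if treated minus predicted outcome if not."""
    if len(y1_pred) != len(y0_pred):
        raise ValueError("need one treated and one control prediction per unit")
    return [y1 - y0 for y1, y0 in zip(y1_pred, y0_pred)]


def average_treatment_effect(ites):
    """Mean of the individual treatment effects."""
    if not ites:
        raise ValueError("no units to average over")
    return sum(ites) / len(ites)
```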

Evaluation & Refinement

Assess performance using RMSE and MAE, refine model based on empirical results.
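Both metrics are simple to compute when ground-truth effects are known, as in synthetic or semi-synthetic benchmarks. The sketch assumes the errors are taken over per-unit effect estimates; this summary does not state exactly which quantities the paper's RMSE and MAE are computed over.

```python
import math


def rmse(estimated, true):
    """Root mean squared error between estimated and true effects."""
    if len(estimated) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimated, true)) / len(true))


def mae(estimated, true):
    """Mean absolute error between estimated and true effects."""
    if len(estimated) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - t) for e, t in zip(estimated, true)) / len(true)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of estimation quality.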
