AI-POWERED KNOWLEDGE ASSESSMENT
KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration
By Mohammad Amanlou, Erfan Shafiee Moghaddam, Yasaman Amou Jafari, Mahdi Noori, Farhan Farsi, Behnam Bahrak.
University of Tehran, Independent Researcher, Amirkabir University of Technology, TEIAS Institute.
With the rise of large language models (LLMs), evaluating these systems remains bottlenecked by the time and cost of building specialized assessment datasets. KNIGHT introduces an LLM-based, knowledge-graph-driven framework for generating multiple-choice question (MCQ) datasets from external sources. It constructs a topic-specific knowledge graph (KG): a structured, parsimonious summary of entities and relations that can be reused to generate instructor-controlled difficulty levels, including multi-hop questions, without repeatedly re-feeding the full source text. This KG acts as a compressed, reusable state, making question generation a cheap read over the graph.
Unlocking Knowledge: The KNIGHT Impact
KNIGHT redefines MCQ generation, offering a robust, efficient, and highly customizable solution for educational assessment and LLM evaluation.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Topic-Specific Knowledge Graph (KG) Construction
KNIGHT's first stage involves building a topic-specific knowledge graph. Given a user-specified topic and an optional prompt, it retrieves ranked context from external sources like Wikipedia. A Description Generator synthesizes a structured eight-point gloss for the seed entity, followed by a Relation Extractor that distills explicit facts into subject-predicate-object triples. These triples undergo rigorous curation and pruning, involving type checks (Wikidata), NLI-based consistency checks, and content-policy screening. The graph expands breadth-first up to a user-defined maximum depth, dmax, ensuring a compact, reusable, and token-efficient representation.
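The expansion loop above can be sketched as plain breadth-first traversal. This is an illustrative skeleton, not the paper's implementation: `fetch_triples` stands in for the retrieval-plus-Relation-Extractor stage, and `is_valid` stands in for the combined type, NLI, and policy checks.

```python
from collections import deque

def build_topic_kg(seed, fetch_triples, is_valid, dmax=2):
    """Breadth-first KG expansion sketch: starting from a seed entity,
    pull (subject, predicate, object) triples for each frontier node,
    keep only triples that pass the curation check, and expand to the
    object entities until the user-defined depth dmax is reached."""
    triples, visited = [], {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        entity, depth = frontier.popleft()
        if depth >= dmax:
            continue  # stop expanding past the depth budget
        for s, p, o in fetch_triples(entity):
            if not is_valid(s, p, o):  # type / NLI / policy screening
                continue
            triples.append((s, p, o))
            if o not in visited:
                visited.add(o)
                frontier.append((o, depth + 1))
    return triples
```

Because the graph is built once and persisted, later stages read these triples instead of re-ingesting the full source text, which is where the token savings come from.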
Multi-Hop MCQ Synthesis with Difficulty Control
The framework generates difficulty-calibrated multi-hop MCQs by traversing paths within the validated KG. For each seed node, it enumerates forward or reverse paths of configurable length (hops). These paths are verbalized into compact context templates, which an LLM uses to output an MCQ tuple (question, key, and three semantically proximate distractors). Difficulty is controlled by path length and abstraction, with reverse questions empirically shown to increase model entropy.
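The path-enumeration step can be illustrated as follows. This is a minimal sketch under simplifying assumptions (an in-memory triple list, no cycle handling); the `verbalize` helper is a hypothetical stand-in for the paper's compact context templates.

```python
def enumerate_paths(triples, seed, hops, reverse=False):
    """Enumerate fixed-length relation paths from a seed node.
    Forward paths follow subject -> object edges; reverse paths walk
    object -> subject, the direction reported to raise model entropy."""
    adj = {}
    for s, p, o in triples:
        src, dst = (o, s) if reverse else (s, o)
        adj.setdefault(src, []).append((p, dst))
    paths = [[seed]]
    for _ in range(hops):  # extend every partial path by one hop
        paths = [path + [p, nxt]
                 for path in paths
                 for p, nxt in adj.get(path[-1], [])]
    return paths

def verbalize(path):
    """Flatten an entity/predicate path into a compact context string
    that an LLM can use to draft the (question, key, distractors) tuple."""
    return " -> ".join(path)
```

Difficulty then falls out of two knobs visible here: the `hops` count (longer paths require chaining more facts) and the `reverse` flag.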
LLM-Based Validation and Filtering
To ensure high-quality MCQs, KNIGHT employs an LLM-based validator that scores each candidate question against five criteria: grammatical fluency, single-correct-answer unambiguity, option uniqueness, answerability from source, and topic relevance. Items are retained only if they pass all checks, significantly reducing hallucination and improving pedagogical validity. Human audits corroborate the automated filtering, showing low violation rates and strong correlation between model uncertainty and human-perceived difficulty.
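The pass-all-checks gate can be expressed compactly. In this sketch, `judge` is a placeholder for the LLM validator call (one boolean verdict per criterion); the criterion names simply mirror the five listed above.

```python
CRITERIA = ["fluency", "unambiguity", "option_uniqueness",
            "answerability", "topic_relevance"]

def validate_mcq(item, judge):
    """Retain an MCQ only if the judge passes it on all five criteria.
    `judge(item, criterion)` stands in for an LLM validator returning bool."""
    return all(judge(item, c) for c in CRITERIA)

def filter_mcqs(items, judge):
    """Keep only candidates that survive every check."""
    return [it for it in items if validate_mcq(it, judge)]
```

The conjunctive gate (all five must pass, rather than a weighted score) is what makes the filter conservative: a single ambiguous or off-topic item is dropped outright.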
Enterprise Process Flow: KNIGHT's Pipeline
KNIGHT's difficulty calibration, based on KG depth and abstraction, shows a strong correlation with human cognitive assessments of question complexity (r ≈ 0.78), validating its adaptive hardness tuning.
| KNIGHT Advantages | Traditional Benchmarks (e.g., MMLU) |
|---|---|
| Generates fresh, topic-specific MCQs on demand from a reusable KG | Static, fixed question sets that are costly and slow to author |
| Instructor-controlled, calibrated difficulty (path length, abstraction) | Fixed, uncontrolled difficulty distribution |
| LLM-based validation filters ambiguity and hallucination | Quality depends on one-time manual curation |
Diverse Applications: History, Biology, Mathematics
KNIGHT was instantiated on Wikipedia/Wikidata to generate six MCQ datasets across History, Biology, and Mathematics at two difficulty levels (Level 1 and Level 3). These case studies demonstrate the framework's flexibility and reusability, producing high-quality, grammatical, and difficulty-calibrated items within minutes. The system successfully reduced hallucinations and aligned model rankings with MMLU-style benchmarks, proving its broad applicability.
Calculate Your Potential AI ROI
Estimate the time and cost savings your enterprise could achieve by automating knowledge assessment and content generation with KNIGHT.
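A back-of-envelope version of that estimate is sketched below. All inputs are illustrative assumptions you would replace with your own figures, not numbers from the paper.

```python
def mcq_roi(items_per_month, minutes_per_item_manual,
            minutes_per_item_automated, hourly_rate):
    """Rough monthly savings when MCQ authoring time drops from a
    manual to an automated per-item rate. Returns (hours, cost) saved.
    All parameters are illustrative inputs, not figures from the paper."""
    saved_minutes = items_per_month * (minutes_per_item_manual
                                       - minutes_per_item_automated)
    saved_hours = saved_minutes / 60
    return saved_hours, saved_hours * hourly_rate

# Hypothetical example: 300 items/month, 12 min manual vs 2 min
# automated review, at a $60/hour authoring rate.
hours, cost = mcq_roi(300, 12, 2, 60)  # 50.0 hours, $3000.0 saved
```

The dominant term is the per-item time delta, so the estimate is most sensitive to how much human review your automated pipeline still requires.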
Your AI Implementation Roadmap
A phased approach to integrate KNIGHT into your existing learning and evaluation ecosystems for maximum impact.
Phase 1: Discovery & Strategy
Initial consultation to understand your unique content needs, assessment goals, and existing infrastructure. Define scope and custom KG requirements.
Phase 2: KG Setup & Integration
Build your domain-specific knowledge graphs, integrate with your data sources, and configure KNIGHT for optimal performance and token efficiency.
Phase 3: Content Generation & Calibration
Generate initial MCQ datasets, fine-tune difficulty controls, and perform iterative validation to ensure quality and alignment with your pedagogical standards.
Phase 4: Deployment & Optimization
Deploy KNIGHT within your learning platforms, conduct user training, and establish ongoing monitoring and optimization protocols.
Ready to Transform Your Assessments?
Book a free 30-minute consultation with our AI experts to see how KNIGHT can streamline your content creation and elevate your evaluation process.