AI-POWERED KNOWLEDGE ASSESSMENT

KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration

By Mohammad Amanlou, Erfan Shafiee Moghaddam, Yasaman Amou Jafari, Mahdi Noori, Farhan Farsi, Behnam Bahrak.

University of Tehran, Independent Researcher, Amirkabir University of Technology, TEIAS Institute.

Large language models (LLMs) are now widespread, yet evaluating them remains bottlenecked by the time and cost of building specialized assessment datasets. KNIGHT is an LLM-based, knowledge-graph-driven framework for generating multiple-choice question (MCQ) datasets from external sources. It constructs a topic-specific knowledge graph (KG), a structured, parsimonious summary of entities and relations, that can be reused to generate questions at instructor-controlled difficulty levels, including multi-hop questions, without repeatedly re-feeding the full source text. The KG acts as a compressed, reusable state, so question generation becomes a cheap read over the graph.


Unlocking Knowledge: The KNIGHT Impact

KNIGHT redefines MCQ generation, offering a robust, efficient, and highly customizable solution for educational assessment and LLM evaluation.

Reduced Hallucinations

High Quality MCQs

MMLU Aligned Rankings

0.78 Human Difficulty Correlation


Deep Analysis & Enterprise Applications

The sections below walk through the specific findings from the research, topic by topic.

Topic-Specific Knowledge Graph (KG) Construction

KNIGHT's first stage builds a topic-specific knowledge graph. Given a user-specified topic and an optional prompt, it retrieves ranked context from external sources such as Wikipedia. A Description Generator synthesizes a structured eight-point gloss for the seed entity, and a Relation Extractor then distills explicit facts into subject-predicate-object triples. These triples are curated and pruned through type checks (against Wikidata), NLI-based consistency checks, and content-policy screening. The graph expands breadth-first up to a user-defined maximum depth (dmax), yielding a compact, reusable, and token-efficient representation.
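The breadth-first expansion described above can be sketched roughly as follows. This is a minimal illustration, not KNIGHT's implementation: `TOY_FACTS`, `extract_triples`, `is_valid`, and `build_kg` are hypothetical names, the toy dictionary stands in for retrieval plus LLM-based relation extraction, and the validity check is only a placeholder for the Wikidata type checks, NLI checks, and policy screening.

```python
from collections import deque

# Hypothetical toy data standing in for retrieval + relation extraction.
TOY_FACTS = {
    "Cell": [("Cell", "contains", "Nucleus"), ("Cell", "contains", "Mitochondrion")],
    "Nucleus": [("Nucleus", "stores", "DNA")],
    "Mitochondrion": [("Mitochondrion", "produces", "ATP")],
    "DNA": [("DNA", "encodes", "Protein")],
}


def extract_triples(entity):
    """Assumed extractor: returns (subject, predicate, object) triples for an entity."""
    return TOY_FACTS.get(entity, [])


def is_valid(triple):
    """Placeholder for curation; the real checks are not reproduced here."""
    subject, predicate, obj = triple
    return bool(subject and predicate and obj) and subject != obj


def build_kg(seed, d_max):
    """Expand breadth-first from `seed`, expanding only nodes at depth < d_max."""
    triples = []
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth >= d_max:
            continue  # nodes at the depth limit are kept but not expanded
        for triple in extract_triples(entity):
            if not is_valid(triple):
                continue
            triples.append(triple)
            obj = triple[2]
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, depth + 1))
    return triples


if __name__ == "__main__":
    for triple in build_kg("Cell", d_max=2):
        print(triple)
```

Because each entity is expanded at most once, the resulting triple list can be stored and reused for later question generation without querying the source text again.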

Multi-Hop MCQ Synthesis with Difficulty Control

The framework generates difficulty-calibrated multi-hop MCQs by traversing paths within the validated KG. For each seed node, it enumerates forward or reverse paths of configurable length (number of hops). These paths are verbalized into compact context templates, from which an LLM produces an MCQ tuple: a question, the answer key, and three semantically proximate distractors. Difficulty is controlled by path length and abstraction; reverse questions were empirically shown to increase model entropy.
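As a rough sketch of the path step, the snippet below enumerates fixed-length paths from a seed node and turns them into short context strings. Everything here is an assumption for illustration: `TRIPLES`, `enumerate_paths`, and `verbalize` are hypothetical names, the templates are invented, and the LLM call that would turn the context into an MCQ is omitted.

```python
from collections import defaultdict

# Hypothetical validated triples (subject, predicate, object).
TRIPLES = [
    ("Cell", "contains", "Nucleus"),
    ("Cell", "contains", "Mitochondrion"),
    ("Nucleus", "stores", "DNA"),
    ("Mitochondrion", "produces", "ATP"),
]


def enumerate_paths(triples, seed, hops):
    """Return every forward path of exactly `hops` edges that starts at `seed`."""
    out_edges = defaultdict(list)
    for subject, predicate, obj in triples:
        out_edges[subject].append((predicate, obj))

    paths = []

    def walk(node, path, visited):
        if len(path) == hops:
            paths.append(list(path))
            return
        for predicate, obj in out_edges[node]:
            if obj in visited:
                continue  # avoid cycles
            path.append((node, predicate, obj))
            walk(obj, path, visited | {obj})
            path.pop()

    walk(seed, [], {seed})
    return paths


def verbalize(path, reverse=False):
    """Render a path as a compact context string (illustrative templates only)."""
    if reverse:
        # Read the path from its end node back toward the seed.
        return " ".join(f"{o} <-[{p}]- {s}." for s, p, o in reversed(path))
    return " ".join(f"{s} {p} {o}." for s, p, o in path)


if __name__ == "__main__":
    for path in enumerate_paths(TRIPLES, "Cell", hops=2):
        print(verbalize(path))
        print(verbalize(path, reverse=True))
```

In this sketch, raising `hops` lengthens the reasoning chain a question must cover, which is the lever the text describes for making items harder.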

LLM-Based Validation and Filtering

To ensure high-quality MCQs, KNIGHT employs an LLM-based validator that scores each candidate question against five criteria: grammatical fluency, single-correct-answer unambiguity, option uniqueness, answerability from source, and topic relevance. Items are retained only if they pass all checks, significantly reducing hallucination and improving pedagogical validity. Human audits corroborate the automated filtering, showing low violation rates and strong correlation between model uncertainty and human-perceived difficulty.
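The all-checks-must-pass rule can be sketched as below. This is an illustration under stated assumptions: `MCQ`, `stub_validator`, and `filter_mcqs` are hypothetical names, and the stub uses crude string heuristics in place of the LLM-based validator, which is not reproduced here.

```python
from dataclasses import dataclass

# The five criteria named in the text.
CRITERIA = (
    "grammatical_fluency",
    "single_correct_answer",
    "option_uniqueness",
    "answerable_from_source",
    "topic_relevance",
)


@dataclass
class MCQ:
    question: str
    key: str
    distractors: list


def stub_validator(mcq, topic):
    """Stand-in for an LLM validator: returns a pass/fail verdict per criterion."""
    options = [mcq.key, *mcq.distractors]
    return {
        "grammatical_fluency": mcq.question.strip().endswith("?"),
        "single_correct_answer": mcq.key not in mcq.distractors,
        "option_uniqueness": len(set(options)) == len(options),
        # Cannot be checked without the source text; assumed true in this stub.
        "answerable_from_source": True,
        "topic_relevance": topic.lower() in mcq.question.lower(),
    }


def filter_mcqs(candidates, topic, validator=stub_validator):
    """Keep only candidates that pass every criterion."""
    kept = []
    for mcq in candidates:
        verdict = validator(mcq, topic)
        if all(verdict.get(criterion, False) for criterion in CRITERIA):
            kept.append(mcq)
    return kept


if __name__ == "__main__":
    good = MCQ("Which organelle in a cell stores DNA?", "Nucleus",
               ["Ribosome", "Golgi apparatus", "Lysosome"])
    bad = MCQ("Which organelle in a cell stores DNA?", "Nucleus",
              ["Nucleus", "Ribosome", "Lysosome"])
    print(filter_mcqs([good, bad], topic="cell"))
```

Passing the validator as a parameter keeps the filtering rule separate from whichever model or heuristic produces the verdicts.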

Enterprise Process Flow: KNIGHT's Pipeline

User Input (Topic, Depth) → External Evidence Retrieval → Knowledge Graph Construction → MCQ Generation (Multi-Hop) → LLM-Powered Validation & Filtering → Final MCQ Dataset

0.78 Strong Correlation with Human Difficulty

KNIGHT's difficulty calibration, based on KG depth and abstraction, shows a strong correlation with human cognitive assessments of question complexity (r ≈ 0.78), validating its adaptive hardness tuning.
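For readers who want to run a similar check on their own items, here is a minimal sketch of a correlation computation. It assumes that r denotes a Pearson coefficient, which the text does not state; `pearson_r` is a hypothetical helper, and the numbers in the usage example are made-up toy values, not data from the research.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values only: generator difficulty level per item vs. a human rating.
    generated_level = [1, 1, 3, 3, 1, 3]
    human_rating = [2, 1, 4, 5, 2, 3]
    print(round(pearson_r(generated_level, human_rating), 2))
```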

KNIGHT vs. Traditional MCQ Benchmarks

KNIGHT Advantages	Traditional Benchmarks (e.g., MMLU)
Low-cost, token-efficient generation	High cost, expert-driven creation
User-controlled difficulty (multi-hop design)	Static difficulty, limited control
Topic-specific, refreshable datasets	Broad, static, difficult to update
Explicit multi-hop structure	Limited multi-hop exposure
High quality and validity with LLM-based filtering	Evaluation bottlenecks

Diverse Applications: History, Biology, Mathematics

KNIGHT was instantiated on Wikipedia/Wikidata to generate six MCQ datasets across History, Biology, and Mathematics, each at two difficulty levels (Level 1 and Level 3). These case studies demonstrate the framework's flexibility and reusability: it produced high-quality, grammatical, and difficulty-calibrated items within minutes. The system reduced hallucinations and yielded model rankings aligned with MMLU-style benchmarks, supporting its applicability across domains.


Your AI Implementation Roadmap

A phased approach to integrate KNIGHT into your existing learning and evaluation ecosystems for maximum impact.

Phase 1: Discovery & Strategy

Initial consultation to understand your unique content needs, assessment goals, and existing infrastructure. Define scope and custom KG requirements.

Phase 2: KG Setup & Integration

Build your domain-specific knowledge graphs, integrate with your data sources, and configure KNIGHT for optimal performance and token efficiency.

Phase 3: Content Generation & Calibration

Generate initial MCQ datasets, fine-tune difficulty controls, and perform iterative validation to ensure quality and alignment with your pedagogical standards.

Phase 4: Deployment & Optimization

Deploy KNIGHT within your learning platforms, conduct user training, and establish ongoing monitoring and optimization protocols.


Ready to Transform Your Assessments?

Book a free 30-minute consultation with our AI experts to see how KNIGHT can streamline your content creation and elevate your evaluation process.


