KNOWLEDGE MODEL PROMPTING FOR LLM PLANNING
Unlocking Advanced Reasoning in Large Language Models with the TMK Framework
Large Language Models (LLMs) often struggle with complex reasoning and planning. This paper introduces the Task-Method-Knowledge (TMK) framework as a novel prompting technique for enhancing LLM performance on such tasks. Evaluated on the PlanBench Blocksworld domain, TMK-structured prompts significantly improve LLM accuracy, particularly on opaque symbolic tasks, reaching up to 97.3% where models previously managed only 31.5%. The findings suggest that TMK acts as a 'symbolic steering mechanism,' shifting LLMs from linguistic approximation to formal, code-execution-like pathways and thereby bridging the gap between semantic understanding and symbolic manipulation. The approach aligns with cognitive science principles and demonstrates a path for LLMs to engage in reasoning processes more robust than pattern matching.
Key Impact & Performance Metrics
Key results: TMK prompting raised accuracy on the opaque Random Blocksworld variant to as high as 97.3%, up from a 31.5% baseline.
Deep Analysis & Enterprise Applications
TMK Framework
The Task-Method-Knowledge (TMK) framework is a formal knowledge representation language designed to model the teleology of intelligent agents. It hierarchically decomposes a system into three components: Tasks (goals and conditions), Methods (procedural mechanisms), and Knowledge (domain concepts and relationships). This explicit decomposition helps capture causal, teleological, and hierarchical reasoning structures, enabling LLMs to better understand 'what to do,' 'how to do it,' and 'why actions are taken.'
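To make the decomposition concrete, the sketch below shows how a Blocksworld TMK model might be encoded as JSON for inclusion in a prompt. The schema (field names such as `tasks`, `by_method`, and `transitions`) is an illustrative assumption for this sketch, not the paper's exact representation.

```python
# Illustrative TMK model for Blocksworld, encoded as JSON for prompt injection.
# Field names are assumptions for this sketch, not the paper's exact schema.
import json

tmk_model = {
    "tasks": [
        {
            "name": "AchieveGoalConfiguration",
            "goal": "All blocks match the target stacking arrangement",
            "given": "An arbitrary initial configuration of blocks on a table",
            "by_method": "IterativeRestacking",
        }
    ],
    "methods": [
        {
            "name": "IterativeRestacking",
            "for_task": "AchieveGoalConfiguration",
            "transitions": [
                {"step": 1, "do": "unstack misplaced blocks onto the table"},
                {"step": 2, "do": "pick up the next needed block"},
                {"step": 3, "do": "stack it on its target block"},
                {"step": 4, "do": "repeat until the goal configuration holds"},
            ],
        }
    ],
    "knowledge": {
        "concepts": ["block", "table", "hand"],
        "relations": ["on(x, y)", "clear(x)", "holding(x)", "handempty"],
        "actions": {
            "pick-up": {"pre": ["clear(x)", "on-table(x)", "handempty"],
                        "post": ["holding(x)"]},
            "stack": {"pre": ["holding(x)", "clear(y)"],
                      "post": ["on(x, y)", "clear(x)", "handempty"]},
        },
    },
}

# Serialized once and embedded verbatim in the one-shot prompt.
tmk_json = json.dumps(tmk_model, indent=2)
```

Serializing the model once and embedding it verbatim keeps the prompt identical across evaluation runs, so any performance difference is attributable to the TMK content itself.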
Enterprise Process Flow
TMK's structured approach provides a clear basis for reasoning, mirroring cognitive decomposition. It addresses planning deficiencies by defining clear steps, relationships, and driving goals, equipping LLMs with causally grounded context.
Experimental Setup
The study evaluates TMK prompting on the PlanBench benchmark, specifically the Blocksworld domain and its variants (Classic, Mystery, and Random). OpenAI models (GPT-4, GPT-4o, o1-mini, o1, and GPT-5) were tested. Experiments used one-shot prompting with a JSON-formatted TMK model, compared against zero-shot and one-shot plain-text baselines.
| Variant | Description | TMK Impact |
|---|---|---|
| Classic | Uses canonical English labels (e.g., pick up, stack), letting models rely on semantic memory. | Consistent accuracy gains across models. |
| Mystery | Maps actions to semantically distinct but unrelated words (e.g., attack, feast), testing reasoning from the provided rules rather than semantic associations. | Improved performance, though semantic interference from the misleading labels persists. |
| Random | Replaces labels with opaque alphanumeric strings (e.g., 1jpkithdyjmlikck), steering models toward pure symbolic manipulation. | Largest gains, up to 97.3% from a 31.5% baseline; under TMK, o1's Random accuracy surpassed Mystery. |
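Mechanically, the three variants present the same action schema under different surface labels. The sketch below illustrates how a Classic plan is rewritten into each variant's vocabulary; the Mystery word choices and the random-string generator here are assumptions for illustration, as PlanBench defines the actual obfuscated vocabularies.

```python
# Sketch: the same Blocksworld actions under the three PlanBench surface vocabularies.
# Mystery labels and the random-string generator are illustrative assumptions;
# PlanBench ships its own fixed obfuscated vocabularies.
import random
import string

CLASSIC = ["pick-up", "put-down", "stack", "unstack"]

# Semantically distinct but unrelated words (Mystery variant).
MYSTERY = {"pick-up": "attack", "put-down": "succumb",
           "stack": "overcome", "unstack": "feast"}

def random_label(rng: random.Random, length: int = 16) -> str:
    """Opaque alphanumeric label (Random variant)."""
    return "".join(rng.choices(string.ascii_lowercase + string.digits, k=length))

rng = random.Random(0)
RANDOM = {action: random_label(rng) for action in CLASSIC}

def relabel_plan(plan: list[str], mapping: dict[str, str]) -> list[str]:
    """Rewrite a Classic-label plan into a variant's vocabulary."""
    return [mapping.get(step, step) for step in plan]

plan = ["unstack", "put-down", "pick-up", "stack"]
print(relabel_plan(plan, MYSTERY))  # e.g. ['feast', 'succumb', 'attack', 'overcome']
print(relabel_plan(plan, RANDOM))   # opaque alphanumeric strings
```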
Key Findings & Discussion
TMK prompting consistently improved LLM performance across the Blocksworld variants, with flagship models showing the largest gains. Most notably, the o1 model exhibited a performance inversion: under TMK, the Random tasks became significantly easier than the Mystery tasks. This inversion offers empirical validation of TMK as a 'symbolic steering mechanism' that shifts inference away from linguistic approximation, and its attendant semantic interference, toward formal, code-execution-like pathways aligned with LLMs' code training data.
Our AI Implementation Roadmap
A strategic, phased approach to integrating advanced AI capabilities into your enterprise operations, ensuring maximum impact and minimal disruption.
Phase 1: Initial TMK Model Development
Convert domain knowledge into TMK structure (Tasks, Methods, Knowledge). Focus on core planning actions for Blocksworld.
Phase 2: Prompt Integration & Baseline Testing
Integrate JSON-formatted TMK into one-shot prompts. Run baseline evaluations on PlanBench Blocksworld variants (Classic, Mystery, Random) with plain text and TMK prompts.
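A minimal sketch of such a harness, assuming the OpenAI Python SDK; the prompt wording, default model name, and helper names are placeholder assumptions rather than the paper's exact setup.

```python
# Minimal one-shot evaluation harness sketch (OpenAI Python SDK).
# Prompt wording, model name, and helper names are placeholder assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_prompt(domain_rules: str, example: tuple[str, str],
                 problem: str, tmk_json: str | None) -> str:
    """One-shot prompt; include the JSON TMK block only in the TMK condition."""
    parts = [domain_rules]
    if tmk_json is not None:
        parts.append("Task-Method-Knowledge model of this domain:\n" + tmk_json)
    ex_problem, ex_plan = example
    parts.append(f"Example problem:\n{ex_problem}\nExample plan:\n{ex_plan}")
    parts.append(f"Now solve:\n{problem}\nPlan:")
    return "\n\n".join(parts)

def get_plan(prompt: str, model: str = "gpt-4o") -> str:
    """Query the model once and return its proposed plan text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Passing `tmk_json=None` yields the plain-text baseline prompt, so both conditions run through the same code path.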
Phase 3: Performance Analysis & Iteration
Analyze performance gains, identify 'performance inversion' effects. Refine TMK structure based on model responses and observed reasoning shifts. Document key insights.
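Performance analysis reduces to checking each generated plan against the domain semantics. Below is a simplified Blocksworld plan validator for illustration, assuming a minimal state representation; PlanBench validates plans formally against the PDDL domain, so this sketch only approximates that check.

```python
# Sketch: validate a Blocksworld plan by simulating action semantics.
# State representation and action names are simplified assumptions;
# PlanBench performs formal plan validation against the PDDL domain.

def apply(state: dict, action: str, args: tuple) -> bool:
    """Apply one action in place; return False if a precondition fails."""
    on, holding = state["on"], state["holding"]
    clear = lambda b: b not in on.values() and holding != b
    if action == "pick-up":
        (b,) = args
        if holding is None and clear(b) and on.get(b) == "table":
            del on[b]; state["holding"] = b; return True
    elif action == "put-down":
        (b,) = args
        if holding == b:
            on[b] = "table"; state["holding"] = None; return True
    elif action == "stack":
        b, target = args
        if holding == b and clear(target):
            on[b] = target; state["holding"] = None; return True
    elif action == "unstack":
        b, target = args
        if holding is None and clear(b) and on.get(b) == target:
            del on[b]; state["holding"] = b; return True
    return False

def plan_is_valid(initial_on: dict, goal_on: dict, plan: list) -> bool:
    """Simulate the plan from the initial state and test the goal condition."""
    state = {"on": dict(initial_on), "holding": None}
    if not all(apply(state, action, args) for action, args in plan):
        return False
    return state["holding"] is None and all(
        state["on"].get(b) == t for b, t in goal_on.items())

# Example: A starts on B; goal is B on A.
init = {"A": "B", "B": "table"}
goal = {"B": "A", "A": "table"}
plan = [("unstack", ("A", "B")), ("put-down", ("A",)),
        ("pick-up", ("B",)), ("stack", ("B", "A"))]
assert plan_is_valid(init, goal, plan)
```

Accuracy per condition is then simply the fraction of generated plans that validate, which is how gains such as the Random-variant inversion can be quantified.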
Phase 4: Expansion & Generalization (Future Work)
Apply TMK to other planning domains (Logistics, multi-agent) and compare against alternative frameworks like BDI/HTN for broader validation.