Enterprise AI Analysis: Search Planning

Probe-then-Plan: Environment-Aware Planning for Industrial E-commerce Search

Modern e-commerce search is evolving from keyword matching toward handling complex user intents. Existing LLM-based search paradigms face a blindness-latency dilemma: query rewriting is usually environment-agnostic and can produce invalid plans, while deep search agents are too slow for online serving. We propose Environment-Aware Search Planning (EASP), a paradigm that reformulates search planning as a grounded, single-step reasoning process. EASP introduces a Probe-then-Plan mechanism, in which a lightweight Retrieval Probe exposes the retrieval environment so that the Planner can diagnose execution gaps and generate grounded search plans. The methodology has three stages: Offline Data Synthesis, Planner Training and Alignment, and Adaptive Online Serving. Offline evaluations and online A/B testing on JD.com show that EASP improves relevant recall, UCVR, and GMV at low latency, making it suitable for industrial deployment.

Schedule Your Strategy Session

Key Business Impacts

Our analysis of Probe-then-Plan: Environment-Aware Planning for Industrial E-commerce Search highlights statistically significant conversion and revenue lifts in online A/B testing, delivered at millisecond-level planner latency.

+0.89% UCVR Lift (Online, overall traffic)

+0.57% GMV Lift (Online, overall traffic)

<20ms Planner Latency (p75)

Improved Relevant Recall

Deep Analysis & Enterprise Applications


Methodology Overview

Offline Data Synthesis

Planner Training & Alignment

Adaptive Online Serving

Experimental Results

~80% of traffic processed via Fast Path for sub-20ms latency

Enterprise Process Flow

Offline Data Synthesis (Teacher Agent) → Planner Training & Alignment (SFT + RL) → Adaptive Online Serving (Router + Planner)

The Probe-then-Plan mechanism is central to EASP. Unlike environment-agnostic query rewriting, a lightweight Retrieval Probe first takes a real-time snapshot of the retrieval environment (O_init). With that snapshot in hand, the Planner makes an informed, single-step decision: it generates a grounded search plan that directly addresses the execution gaps it identifies, such as entity drift or attribute misalignment, without slow, iterative interaction with the search engine. This resolves the blindness-latency dilemma of other LLM-based search paradigms by keeping plans grounded while staying fast.
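The control flow described above can be sketched as one probe call followed by one planner call. This is a minimal illustration, not the system's actual code: `SearchPlan`, `probe_then_plan`, `toy_retrieve`, and `toy_plan` are hypothetical names, and the toy planner's "drop the last word" fix is only a stand-in for an LLM's output.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SearchPlan:
    """Queries to execute, plus the diagnosis that motivated them."""
    queries: List[str]
    diagnosis: str


def probe_then_plan(
    query: str,
    retrieve: Callable[[str, int], List[str]],
    plan: Callable[[str, List[str]], SearchPlan],
    probe_k: int = 10,
) -> SearchPlan:
    # Probe: a single cheap retrieval call produces the snapshot O_init.
    o_init = retrieve(query, probe_k)
    # Plan: a single planner call sees both the query and O_init (no loop).
    return plan(query, o_init)


# Toy stand-ins (assumptions) so the sketch runs end to end.
def toy_retrieve(query: str, k: int) -> List[str]:
    catalog = {"running shoes": ["item-1", "item-2"]}
    return catalog.get(query, [])[:k]


def toy_plan(query: str, o_init: List[str]) -> SearchPlan:
    if o_init:
        # The probe found results, so the original query is kept.
        return SearchPlan([query], "effective")
    # Empty snapshot: relax the query by dropping its last word.
    return SearchPlan([query.rsplit(" ", 1)[0]], "recall_failure")


plan = probe_then_plan("running shoes zzz", toy_retrieve, toy_plan)
```

The point of the shape is that the planner is called exactly once per request, which is what separates this from an iterative agent loop.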

The first stage generates a diverse dataset of execution-validated plans. A Teacher Agent (a state-of-the-art LLM) analyzes user queries (q) together with the retrieval snapshots (O_init) returned by the Retrieval Probe. It performs a perceptual diagnosis that assigns each snapshot a retrieval state (Effective, Recall Failure, or Precision Failure), then selects an adaptive planning strategy (Preservation, Sanitization, or Concretization) and formulates a search plan. Stochastic decoding yields a diverse set of candidate reformulations, which are then validated against the retrieval environment.
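The diagnose-then-select step can be pictured with a small rule-based stand-in. Several things here are assumptions rather than details from the research: the Teacher Agent is an LLM, not a threshold rule; the thresholds `min_results` and `min_precision` are invented; and the state-to-strategy pairing simply follows the order in which the two lists are given above.

```python
from enum import Enum
from typing import Dict, List


class RetrievalState(Enum):
    EFFECTIVE = "effective"
    RECALL_FAILURE = "recall_failure"
    PRECISION_FAILURE = "precision_failure"


class Strategy(Enum):
    PRESERVATION = "preservation"
    SANITIZATION = "sanitization"
    CONCRETIZATION = "concretization"


# ASSUMPTION: pairing inferred from list order in the text, not confirmed.
STRATEGY_FOR: Dict[RetrievalState, Strategy] = {
    RetrievalState.EFFECTIVE: Strategy.PRESERVATION,
    RetrievalState.RECALL_FAILURE: Strategy.SANITIZATION,
    RetrievalState.PRECISION_FAILURE: Strategy.CONCRETIZATION,
}


def diagnose(
    relevance: List[bool],
    min_results: int = 5,        # hypothetical threshold
    min_precision: float = 0.6,  # hypothetical threshold
) -> RetrievalState:
    """Classify a probe snapshot from per-item relevance judgments."""
    if len(relevance) < min_results:
        # Too few items came back at all.
        return RetrievalState.RECALL_FAILURE
    precision = sum(relevance) / len(relevance)
    if precision < min_precision:
        # Plenty of items, but most are off-target.
        return RetrievalState.PRECISION_FAILURE
    return RetrievalState.EFFECTIVE


state = diagnose([True, True, True] + [False] * 7)
strategy = STRATEGY_FOR[state]
```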

The Planner, a lightweight LLM, is trained in two phases. Supervised Fine-Tuning (SFT) on the offline dataset teaches it the Teacher's diagnostic and planning patterns. Reinforcement Learning (RL) with Group Relative Policy Optimization (GRPO) then aligns it with business outcomes, specifically conversion rate (UCVR). The reward function incorporates a 'Hard Relevance Gate' that penalizes high-priced but irrelevant items, so conversion optimization stays grounded in semantic accuracy.
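A gated reward combined with group-relative advantages might look like the sketch below. The gate threshold `tau`, the fixed `penalty`, and the use of the population standard deviation are assumptions for illustration; the source does not give the exact reward formula. The advantage normalization (reward minus group mean, divided by group standard deviation) is the standard GRPO form.

```python
from statistics import mean, pstdev
from typing import List


def gated_reward(
    relevance: float,
    conversion: float,
    tau: float = 0.5,       # hypothetical relevance threshold
    penalty: float = -1.0,  # hypothetical penalty for failing the gate
) -> float:
    """Conversion counts only if the plan's results pass the relevance gate."""
    if relevance < tau:
        return penalty
    return conversion


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the other samples for the same query."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three sampled plans for one query: (relevance, conversion signal).
samples = [(0.9, 0.3), (0.2, 0.9), (0.8, 0.6)]
rewards = [gated_reward(rel, conv) for rel, conv in samples]
advantages = group_relative_advantages(rewards)
```

In this toy group the second plan has the highest conversion signal but fails the gate, so it ends up with the lowest advantage.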

In the online serving stage, a Complexity-Aware Router (an even smaller language model) selectively activates the EASP pipeline. For simple queries (e.g., 'iPhone 17'), a Fast Path bypasses the Planner entirely, adding near-zero latency for the majority (~80%) of traffic. Only complex queries trigger the full Complex Path with the Retrieval Probe and Planner, which balances latency against effectiveness. This routing is what makes the approach viable in industrial deployment, where sub-second response times are required.
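The two serving paths can be sketched as a single branch. The router in the source is a small language model; the word-count heuristic `toy_is_complex` below is purely a placeholder for it, and `route_and_plan` is a hypothetical name.

```python
from typing import Callable, List


def route_and_plan(
    query: str,
    is_complex: Callable[[str], bool],
    probe: Callable[[str], List[str]],
    planner: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Return the list of queries the search engine should execute."""
    if not is_complex(query):
        # Fast Path: no probe, no planner, the query runs as-is.
        return [query]
    # Complex Path: probe the environment once, then plan once.
    o_init = probe(query)
    return planner(query, o_init)


def toy_is_complex(query: str) -> bool:
    # ASSUMPTION: stand-in heuristic; the real router is a learned model.
    return len(query.split()) > 3


fast = route_and_plan(
    "iPhone 17",
    toy_is_complex,
    probe=lambda q: [],
    planner=lambda q, o: [q + " (planned)"],
)
```

Because the planner is never invoked on the Fast Path, its cost is paid only by the minority of requests the router flags as complex.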

Offline Evaluation Result Comparison
Method            REL@30   HR@30 (%)   Latency
Blind Rewriter    20.7     28.6        low (ms level)
w/o RL            23.0     29.5        low (ms level)
ReAct Agent       24.1     30.2        high (s level)
EASP (ours)       23.3     31.0        low (ms level)
Note: EASP achieves the highest HR@30 and trails only the ReAct Agent on REL@30 (23.3 vs. 24.1), while keeping millisecond-level latency where the ReAct Agent needs seconds.
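To put the table's numbers in relative terms, the short calculation below computes EASP's gain over the Blind Rewriter baseline. The values are copied from the table; `relative_gain` is a helper introduced here for illustration.

```python
# Offline results from the comparison table above.
results = {
    "Blind Rewriter": {"REL@30": 20.7, "HR@30": 28.6},
    "w/o RL": {"REL@30": 23.0, "HR@30": 29.5},
    "ReAct Agent": {"REL@30": 24.1, "HR@30": 30.2},
    "EASP": {"REL@30": 23.3, "HR@30": 31.0},
}


def relative_gain(new: float, base: float) -> float:
    """Percentage improvement of `new` over `base`."""
    return (new - base) / base * 100


base, easp = results["Blind Rewriter"], results["EASP"]
rel_gain = round(relative_gain(easp["REL@30"], base["REL@30"]), 1)  # 12.6
hr_gain = round(relative_gain(easp["HR@30"], base["HR@30"]), 1)     # 8.4
```

Relative to the Blind Rewriter, that is roughly a 12.6% gain in REL@30 and an 8.4% gain in HR@30.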

Online A/B testing on JD.com's live traffic showed statistically significant lifts: +0.89% UCVR (p<0.05) and +0.57% GMV (p<0.05) on overall traffic. For requests that triggered the new model (complex path), UCVR increased by 4.10% and GMV by 2.59%. End-to-end planner latency was under 20ms at p75 and under 700ms at p99, well within industrial constraints. These results confirm EASP's industrial viability and its successful deployment in JD.com's AI-Search system.


Our Proven Implementation Roadmap

A structured approach to ensure seamless integration and maximum impact.

Phase 1: Foundation & Data Integration

Integrate EASP with existing search infrastructure, establish data pipelines for Retrieval Probe, and synthesize initial offline training data.

Phase 2: Model Training & Business Alignment

Initiate supervised fine-tuning for the Planner, followed by reinforcement learning (GRPO) to align with conversion objectives and business KPIs.

Phase 3: Adaptive Deployment & Monitoring

Deploy the Complexity-Aware Router and EASP pipeline with A/B testing, continuous monitoring, and iterative refinement based on online performance.

Phase 4: Personalization & Advanced Features

Extend EASP to incorporate user behavioral signals for personalized planning, adapting strategies based on individual preferences and interaction history.

Ready to Transform Your Enterprise?

Schedule a personalized strategy session with our AI experts to explore how EASP can revolutionize your search capabilities.


1 888 985 3025

Solutions@OwnYourAi.com
