Enterprise AI Analysis: Step2Motion: Locomotion Reconstruction from Pressure Sensing Insoles

MOTION CAPTURE & AI

Step2Motion: Locomotion Reconstruction from Pressure Sensing Insoles

Human motion is fundamentally driven by continuous physical interaction with the environment. Whether walking, running, or simply standing, the forces exchanged between our feet and the ground provide crucial insights for understanding and reconstructing human movement. Recent advances in wearable insole devices offer a compelling solution for capturing these forces in diverse, real-world scenarios. Sensor insoles pose no constraint on the users' motion (unlike mocap suits) and are unaffected by line-of-sight limitations (in contrast to optical systems). These qualities make sensor insoles an ideal choice for robust, unconstrained motion capture, particularly in outdoor environments. Surprisingly, leveraging these devices with recent motion reconstruction methods remains largely unexplored. Aiming to fill this gap, we present Step2Motion, the first approach to reconstruct human locomotion from multi-modal insole sensors. Our method utilizes pressure and inertial data—accelerations and angular rates—captured by the insoles to reconstruct human motion. We evaluate the effectiveness of our approach across a range of experiments to show its versatility for diverse locomotion styles, from simple ones like walking or jogging up to moving sideways, on tiptoes, slightly crouching, or dancing. The complete source code, trained model, data, and supplementary material used in this paper can be found at: https://vcai.mpi-inf.mpg.de/projects/Step2Motion/

Executive Impact & ROI

Step2Motion introduces a novel deep learning-based approach for reconstructing human locomotion and root motion using pressure and inertial measurements from insole sensors. This system provides an unparalleled solution for robust, unconstrained motion capture, especially in outdoor environments where traditional systems fall short. By leveraging multi-modal insole data and a diffusion-based model with a new multi-head cross-attention mechanism, Step2Motion achieves accurate lower-body motion reconstruction and plausible upper-body movements. This significantly reduces the need for expensive, restrictive motion capture suits, offering a versatile and accessible solution for diverse applications.


Deep Analysis & Enterprise Applications

The following modules break down the key components of the research and their enterprise applications.

Diffusion Probabilistic Framework

Step2Motion employs a diffusion probabilistic model for reconstructing poses. During training, Gaussian noise is progressively added to ground-truth pose sequences and the model learns to reverse this corruption; at inference, it starts from pure noise and iteratively denoises to recover the motion. This approach is highly effective for capturing complex distributions and synthesizing high-quality, temporally consistent data.

The model operates on sequences of poses, enhancing temporal consistency and handling inherent ambiguities where similar sensor readings might correspond to different poses. Conditioning the diffusion process with multi-modal insole data is key to its success.
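As a rough illustration, the sampling loop can be sketched as below. This is a minimal DDPM-style sampler in PyTorch; the noise schedule, sequence dimensions, and the `denoiser` stand-in are illustrative assumptions, not the paper's actual architecture.

```python
import torch

def cosine_alpha_bar(T: int) -> torch.Tensor:
    # Cumulative signal-retention schedule: ~1 at t=0, ~0 at t=T.
    t = torch.linspace(0, 1, T + 1)
    return torch.cos(t * torch.pi / 2) ** 2

def sample_poses(denoiser, cond, T=50, frames=60, pose_dim=69):
    """Start from Gaussian noise and iteratively denoise a pose sequence,
    conditioning each step on the insole features `cond`."""
    a_bar = cosine_alpha_bar(T)
    x = torch.randn(1, frames, pose_dim)      # pure noise
    for t in range(T, 0, -1):
        x0_hat = denoiser(x, t, cond)         # network's clean-pose estimate
        if t > 1:                             # re-noise down to level t-1
            x = a_bar[t - 1].sqrt() * x0_hat \
                + (1 - a_bar[t - 1]).sqrt() * torch.randn_like(x)
        else:
            x = x0_hat
    return x

# Stand-in denoiser; the paper conditions a Transformer on insole data here.
poses = sample_poses(lambda x, t, cond: 0.9 * x, cond=None)
print(poses.shape)  # torch.Size([1, 60, 69])
```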

Insole Multi-head Cross-Attention

A carefully designed cross-attention mechanism allows the network to selectively attend to different sensor modalities (pressure, acceleration, angular rate, total force, CoP) based on the specific body part being reconstructed and the current motion context. This prevents the high dimensionality of the combined signal from diluting critical cues from specific sensors.

For instance, during a heel strike, the network prioritizes heel pressure data, while during a swing phase, it focuses on IMU data to track leg orientation. This dynamic focus significantly improves accuracy and interpretability.
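One way to realize this is to encode each modality into its own token stream and let pose queries cross-attend over all of them; the attention weights then reveal which modality each body-part query relied on. The sketch below uses PyTorch's `nn.MultiheadAttention` with made-up channel counts (16 pressure taxels, 3-axis accel/gyro); the paper's actual widths and layer layout may differ.

```python
import torch
import torch.nn as nn

class InsoleCrossAttention(nn.Module):
    """Pose tokens (queries) attend over per-modality insole tokens (keys/values)."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # One linear encoder per modality: pressure, acceleration,
        # angular rate, total force, and center of pressure (CoP).
        dims = {"pressure": 16, "accel": 3, "gyro": 3, "force": 1, "cop": 2}
        self.enc = nn.ModuleDict({k: nn.Linear(d, d_model) for k, d in dims.items()})

    def forward(self, pose_tokens, sensors):
        # Each modality contributes one token per frame; stack along token axis.
        kv = torch.cat([self.enc[k](v) for k, v in sensors.items()], dim=1)
        out, weights = self.attn(pose_tokens, kv, kv)
        return out, weights  # weights show which modality each query attended to

B, F, d = 2, 30, 64
sensors = {"pressure": torch.randn(B, F, 16), "accel": torch.randn(B, F, 3),
           "gyro": torch.randn(B, F, 3), "force": torch.randn(B, F, 1),
           "cop": torch.randn(B, F, 2)}
out, w = InsoleCrossAttention()(torch.randn(B, F, d), sensors)
print(out.shape, w.shape)  # queries keep their shape; weights span 5*F tokens
```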

Displacement Predictor Network

In addition to reconstructing poses, Step2Motion independently estimates root displacements directly from IMU data using a separate Transformer network. This direct regression approach is suitable for displacement prediction, as a given sequence of IMU readings typically corresponds to a specific displacement pattern.

This two-stage pipeline for pose and displacement allows for more effective processing of distinct feature types and prevents displacement information from being ignored, which can be a common issue with end-to-end models. The displacement predictor is crucial for accurate global root trajectory estimation.
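A minimal sketch of such a regressor is shown below, assuming 12 input channels (two insoles, each with 3-axis accelerometer and gyroscope) and per-frame 3D displacement outputs that are integrated into a root trajectory; layer sizes are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DisplacementPredictor(nn.Module):
    """Regress per-frame root displacement from both insoles' IMU streams."""
    def __init__(self, d_model=64):
        super().__init__()
        # 12 inputs = 2 insoles x (3-axis accel + 3-axis angular rate).
        self.embed = nn.Linear(12, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 3)  # (dx, dy, dz) per frame

    def forward(self, imu):
        return self.head(self.encoder(self.embed(imu)))

imu = torch.randn(1, 120, 12)             # 120 frames of IMU readings
deltas = DisplacementPredictor()(imu)
trajectory = torch.cumsum(deltas, dim=1)  # integrate deltas into a root path
print(deltas.shape, trajectory.shape)
```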

Body-Partitioned Pose Encoding

The system decomposes each pose into three body parts: left leg, right leg, and the remaining body. Distinct vector embeddings are generated for each part at every frame and then concatenated into a single sequence.

This design enhances communication within the Transformer network, allowing self-attention globally across both spatial (body parts) and temporal (frames) dimensions. This guides the network to prioritize pertinent information, especially given that each insole primarily influences its corresponding leg.
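The partitioning can be sketched as follows: per-part joint slices are embedded separately at every frame, and the resulting part tokens are flattened into one sequence so self-attention runs jointly over parts and frames. The joint split (4/4/14 on a 22-joint skeleton) is an illustrative assumption.

```python
import torch
import torch.nn as nn

class BodyPartEncoder(nn.Module):
    """Embed left leg, right leg, and remaining body separately per frame,
    then flatten parts x frames into one token sequence for self-attention."""
    def __init__(self, d_model=64, left=4, right=4, rest=14):
        super().__init__()
        self.slices = {"left": (0, left), "right": (left, left + right),
                       "rest": (left + right, left + right + rest)}
        self.proj = nn.ModuleDict(
            {k: nn.Linear((b - a) * 3, d_model)
             for k, (a, b) in self.slices.items()})

    def forward(self, joints):                # joints: (B, F, J, 3)
        B, F, _, _ = joints.shape
        tokens = [self.proj[k](joints[:, :, a:b].reshape(B, F, -1))
                  for k, (a, b) in self.slices.items()]
        # (B, F, 3 parts, d) -> (B, F*3, d): attention spans parts AND frames.
        return torch.stack(tokens, dim=2).reshape(B, F * 3, -1)

tokens = BodyPartEncoder()(torch.randn(2, 30, 22, 3))
print(tokens.shape)  # torch.Size([2, 90, 64])
```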

7.2 cm MPJPE - Best Pose Accuracy

Step2Motion achieves superior pose reconstruction accuracy, outperforming MLP and Transformer baselines with a Mean Per Joint Position Error (MPJPE) of 7.2 cm. This metric is the average Euclidean distance between predicted and ground-truth joint positions, highlighting the system's ability to capture intricate human movements with high fidelity.
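For reference, MPJPE is straightforward to compute from joint positions; the example below uses synthetic data with a uniform 7.2 cm offset purely to illustrate the metric, not the paper's evaluation.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per Joint Position Error: the average Euclidean distance (same
    units as the inputs) between predicted and ground-truth joints,
    over all frames. pred, gt: arrays of shape (frames, joints, 3)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

gt = np.zeros((10, 22, 3))                 # synthetic ground truth (metres)
pred = gt + np.array([0.072, 0.0, 0.0])    # uniform 7.2 cm offset
print(round(float(mpjpe(pred, gt)) * 100, 1))  # → 7.2 (cm)
```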

Enterprise Process Flow

Insole Sensor Data (Pressure & IMU)
Data Preprocessing & Encoding
Diffusion Model (Pose Reconstruction)
Displacement Predictor (Root Motion)
Full Body Locomotion Animation

The Step2Motion pipeline integrates multi-modal insole sensor data to reconstruct full-body human locomotion. Starting with raw sensor inputs, the data undergoes encoding before feeding into a diffusion model for pose estimation. Simultaneously, a dedicated predictor estimates root motion, culminating in a coherent animation.

Comparison of Motion Capture Systems

Constraint on Motion
  • Mocap suits (IMU-based): specialized suits restrict movement
  • Optical systems: require clear line-of-sight and a limited capture area
  • Step2Motion (insole-based): no constraints on user movement

Environment
  • Mocap suits (IMU-based): require frequent calibration
  • Optical systems: limited to controlled (indoor) environments
  • Step2Motion (insole-based): robust in diverse, real-world, outdoor scenarios

Ease of Use
  • Mocap suits (IMU-based): complex setup with external attachments
  • Optical systems: complex and expensive setup
  • Step2Motion (insole-based): easy to set up and wear (like regular insoles)

Drift/Occlusion
  • Mocap suits (IMU-based): prone to drift errors and sensor looseness
  • Optical systems: affected by line-of-sight limitations and lighting variation
  • Step2Motion (insole-based): reduced sensor displacement (the feet anchor the insoles) and unaffected by occlusions
1.25% In-the-Wild Root Drift - Minimal Global Error

In real outdoor settings, Step2Motion demonstrates exceptional global accuracy with only 1.25% root drift over a 60-meter jogging sequence. This highlights the system's robustness and reliability for unconstrained, long-duration motion capture in challenging environments, a significant improvement over traditional methods prone to accumulated errors.
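Root drift is typically reported as the final-position error relative to the distance travelled; the example below illustrates this on a synthetic 60 m straight-line jog (the 0.75 m endpoint error is fabricated to reproduce the 1.25% figure, not taken from the paper's data).

```python
import numpy as np

def root_drift_pct(pred_root, gt_root):
    """Final-position error as a percentage of the ground-truth path length.
    pred_root, gt_root: (frames, 3) root trajectories."""
    path_len = np.linalg.norm(np.diff(gt_root, axis=0), axis=-1).sum()
    end_err = np.linalg.norm(pred_root[-1] - gt_root[-1])
    return 100.0 * end_err / path_len

# Synthetic 60 m straight jog; the prediction ends 0.75 m off target.
gt = np.stack([np.linspace(0, 60, 601),
               np.zeros(601), np.zeros(601)], axis=1)
pred = gt.copy()
pred[-1, 1] += 0.75
print(round(root_drift_pct(pred, gt), 2))  # → 1.25
```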

Case Study: Reconstructing Diverse Locomotion Styles

Summary: Step2Motion's versatility was demonstrated by reconstructing a wide range of locomotion styles using only insole sensor data. This capability extends beyond simple walking or jogging to more complex movements, proving its adaptability for various applications.

Challenge: Traditional motion capture systems often struggle with diverse and unconstrained motion styles, especially in non-studio environments. Reconstructing complex, in-the-wild movements like dancing, tiptoeing, or sideways walking with minimal sensor data is a significant challenge.

Solution: The multi-modal insole sensors (pressure and IMU) combined with the diffusion-based model and specialized cross-attention allow Step2Motion to capture subtle cues from foot-ground interaction and leg orientation. This enables the model to generalize across different locomotion patterns, even those involving minimal ground contact or unusual foot pressure distributions.

Results: The system successfully reconstructed complex motions such as walking, running, jogging backwards, tiptoeing, walking sideways, jumping, jumping on one leg, and light crouching. The animations were temporally coherent and accurately reflected the diverse styles, validating Step2Motion's ability to provide robust and versatile motion capture for entertainment, sports analytics, and rehabilitation.


Your AI Implementation Roadmap

Our structured approach ensures a seamless integration of AI, delivering measurable results and sustained growth for your enterprise.

Phase 1: Discovery & Strategy

In-depth analysis of current workflows, identification of AI opportunities, and development of a tailored implementation strategy with clear KPIs.

Phase 2: Pilot & Proof of Concept

Deployment of a small-scale AI pilot to validate the solution, gather initial performance data, and refine the approach based on real-world feedback.

Phase 3: Scaled Integration

Full-scale integration of the AI solution across relevant departments, ensuring smooth adoption and minimal disruption to ongoing operations.

Phase 4: Optimization & Future-Proofing

Continuous monitoring, performance optimization, and strategic planning for future AI enhancements and expansion.

Ready to Transform Your Enterprise with AI?

Book a personalized consultation with our AI strategists to discuss your unique needs and unlock the full potential of artificial intelligence.
