Enterprise AI Analysis: Programmable and GPU-Accelerated Edge Inference for Real-Time ISAC on NVIDIA Aerial Testbed


Real-time ISAC on NVIDIA Aerial Testbed enables high-accuracy sensing for 6G with GPU-accelerated AI dApps.

This paper presents a groundbreaking framework that integrates GPU-accelerated Artificial Intelligence (AI) applications into the edge Radio Access Network (RAN) infrastructure for real-time Integrated Sensing and Communication (ISAC). Leveraging the NVIDIA Aerial Testbed, the system processes PHY/MAC signals with minimal overhead (~150 µs), supporting multiple inference engines and AI backends. A key demonstration, 'cuSense,' achieves 77 cm mean localization error for person tracking on a 5G NR deployment without dedicated sensing hardware, showcasing a practical pathway for AI-native RANs and 6G ISAC applications.

Key Executive Impact

Our framework delivers measurable performance advantages for next-generation RANs.

150 µs Framework Overhead
77 cm Mean Localization Error
75% Predictions within 1 m

Deep Analysis & Enterprise Applications


150 µs Framework Overhead for Real-Time Inference on NVIDIA Aerial Testbed

GPU-Accelerated dApp Framework Data Path

cuPHY copies data GPU to CPU
cuBB notifies ADL data ready
ADL copies data CPU to SHM
E3 Agent sends pointers to E3 Manager
E3 Manager dispatches to engine
Engine sets inputs and runs inference
Engine returns results to E3 Manager
E3 Manager sends control to E3 Agent
E3 Agent receives & applies control
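The data path above can be pictured as a zero-copy hand-off: the ADL copies PHY data into shared memory once, after which only buffer descriptors (name, offset, shape) travel between the E3 Agent, E3 Manager, and inference engine. A minimal illustrative sketch, assuming a single host process and hypothetical function names (the real framework spans cuPHY/cuBB and separate dApp containers):

```python
from multiprocessing import shared_memory
import numpy as np

def adl_write(shm: shared_memory.SharedMemory, csi: np.ndarray, offset: int) -> dict:
    """ADL step: copy PHY data into shared memory once; return a descriptor."""
    view = np.ndarray(csi.shape, dtype=csi.dtype, buffer=shm.buf, offset=offset)
    view[:] = csi
    return {"shm": shm.name, "offset": offset, "shape": csi.shape, "dtype": str(csi.dtype)}

def engine_infer(desc: dict, shm: shared_memory.SharedMemory) -> float:
    """Engine step: map the descriptor back onto the same buffer (no payload copy)."""
    view = np.ndarray(desc["shape"], dtype=desc["dtype"], buffer=shm.buf,
                      offset=desc["offset"])
    return float(view.mean())  # stand-in for real model inference

shm = shared_memory.SharedMemory(create=True, size=1 << 16)
try:
    csi = np.arange(64, dtype=np.float32)
    desc = adl_write(shm, csi, offset=0)   # "ADL copies data CPU to SHM"
    result = engine_infer(desc, shm)       # "Engine sets inputs and runs inference"
    print(result)                          # 31.5 (mean of 0..63)
finally:
    shm.close()
    shm.unlink()
```

Passing descriptors instead of payloads is what keeps the per-message framework overhead in the hundreds-of-microseconds range.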

Inference Engine Performance Comparison (GH200 with MIG)

Feature                   Triton C API (TRT)        Triton gRPC (TRT)
Mean Latency              167 µs                    ~350 µs
Serialization Overhead    Minimal                   ~200 µs
Real-Time Suitability     Excellent (317 µs E2E)    Good (500 µs E2E)
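The end-to-end figures in the comparison follow from adding the ~150 µs framework overhead to each engine's mean inference latency; a quick arithmetic check:

```python
framework_overhead_us = 150  # data-path overhead reported for the framework

# Mean inference latency per engine, from the comparison above
engines = {"Triton C API (TRT)": 167, "Triton gRPC (TRT)": 350}

# End-to-end latency = framework overhead + inference latency
e2e = {name: framework_overhead_us + latency for name, latency in engines.items()}
print(e2e)  # C API: 317 µs, gRPC: 500 µs
```

The ~200 µs gap between the two backends is the gRPC serialization overhead, which is why the in-process C API is preferred for sub-millisecond control loops.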

Case Study: cuSense for Indoor Person Localization

Challenge: Extracting accurate sensing estimates in real time from communication signals (DMRS), given high-dimensional, noisy CSI data and static multipath components.

Solution: Developed cuSense, an ISAC dApp using real-time UL CSI estimates, static multipath removal, and a GPU-accelerated neural network for 2D position inference. Runs on NVIDIA Aerial Testbed without dedicated sensing hardware.

Result: Achieved a mean localization error of 77 cm (75% within 1 meter) in a 3GPP-compliant 5G NR deployment, meeting 3GPP sensing requirements.
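The two accuracy metrics reported here (mean error, fraction of predictions within 1 m) can be computed from predicted vs. ground-truth 2D positions; the sample data below is illustrative, not from the paper:

```python
import numpy as np

def localization_metrics(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    """pred, truth: (N, 2) arrays of 2D positions in meters.
    Returns (mean Euclidean error, fraction of errors <= 1 m)."""
    err = np.linalg.norm(pred - truth, axis=1)  # per-sample Euclidean error
    return float(err.mean()), float((err <= 1.0).mean())

# Hypothetical predictions and ground truth for three samples
pred = np.array([[0.0, 0.5], [2.0, 2.0], [1.2, 0.0]])
truth = np.array([[0.0, 0.0], [2.0, 3.5], [1.0, 0.0]])
mean_err, within_1m = localization_metrics(pred, truth)
print(round(mean_err, 3), within_1m)
```

For cuSense, these metrics evaluate to 0.77 m mean error with 75% of predictions within 1 m.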

77 cm Mean Localization Error of cuSense dApp

cuSense Uplink CSI Processing Pipeline

Input CSI Ht
Offline Background-Average CSI Template HB
Temporal Averaging of Ht
Background Filtering (Ht − HB)
Z-score Normalization
Neural Network Inference (2D Probability Map)
Raw Predictions
Kalman Filter Tracking
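The preprocessing stages of the pipeline (temporal averaging, background filtering against the template HB, z-score normalization) can be sketched in NumPy; array shapes and the window length are illustrative assumptions, not values from the paper:

```python
import numpy as np

def preprocess_csi(Ht_window: np.ndarray, HB: np.ndarray,
                   eps: float = 1e-9) -> np.ndarray:
    """Ht_window: (T, subcarriers, antennas) recent CSI magnitude snapshots.
    HB: (subcarriers, antennas) offline background-average CSI template."""
    Ht = Ht_window.mean(axis=0)              # temporal averaging over the window
    Hf = Ht - HB                             # background filtering (Ht − HB) removes
                                             # static multipath components
    z = (Hf - Hf.mean()) / (Hf.std() + eps)  # z-score normalization for the NN input
    return z

rng = np.random.default_rng(0)
window = rng.normal(size=(8, 12, 4))     # 8 snapshots, 12 subcarriers, 4 antennas
background = rng.normal(size=(12, 4))
features = preprocess_csi(window, background)
print(features.shape)
```

The normalized features would then feed the neural network, whose 2D probability map is argmaxed into raw position predictions and smoothed by the Kalman filter.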

Case Study: GPU-Accelerated dApp Framework

Challenge: Enabling real-time, low-latency access to PHY/MAC data, co-locating dApps with gNB while ensuring loose coupling, providing GPU-native AI tooling, ensuring isolation and scalability, and aligning with O-RAN/AI-RAN standards.

Solution: Introduced a dApp framework on NVIDIA ATB 5G-NR stack with Real-time ADL using shared memory, an E3 Agent for communication, and a modular dApp container architecture supporting multiple AI backends.

Result: Achieved ~150 µs framework overhead and sub-millisecond E2E control-loop latency for real-time, high-accuracy ISAC dApps on GPU-accelerated RANs.
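The multi-backend support can be pictured as a small registry inside the E3 Manager that routes each request to whichever engine a dApp registered. A hypothetical sketch (class and backend names are illustrative; the real E3 Manager dispatches shared-memory pointers to engines such as TensorRT or ONNX Runtime):

```python
from typing import Callable, Dict, List

class E3Manager:
    """Toy dispatcher mapping backend names to inference callables."""

    def __init__(self) -> None:
        self._backends: Dict[str, Callable[[List[float]], float]] = {}

    def register(self, name: str, infer_fn: Callable[[List[float]], float]) -> None:
        """A dApp registers its inference engine under a backend name."""
        self._backends[name] = infer_fn

    def dispatch(self, name: str, inputs: List[float]) -> float:
        """Route an inference request to the named backend."""
        if name not in self._backends:
            raise KeyError(f"no backend registered under {name!r}")
        return self._backends[name](inputs)

mgr = E3Manager()
mgr.register("trt", lambda x: max(x))   # stand-in for a TensorRT engine
mgr.register("onnx", lambda x: sum(x))  # stand-in for an ONNX Runtime engine
result = mgr.dispatch("trt", [0.1, 0.9, 0.4])
print(result)  # 0.9
```

Keeping dispatch behind a uniform interface is what lets dApp containers swap AI backends without touching the E3 Agent or the RAN data path.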

Calculate Your Potential ROI

Estimate the efficiency gains and cost savings your enterprise could achieve with GPU-accelerated AI.


Your AI Implementation Roadmap

A typical journey to deploy GPU-accelerated AI for real-time applications.

Phase 1: Discovery & Strategy (2-4 weeks)

Initial assessment of existing infrastructure, data sources, and target use cases. Develop a tailored strategy aligned with business objectives, identifying key performance indicators and potential ROI. Includes stakeholder workshops and technology readiness evaluation.

Phase 2: Framework Integration & Pilot (6-10 weeks)

Integrate the GPU-accelerated dApp framework into your edge RAN. Develop and deploy a pilot ISAC or AI-native RAN application (e.g., cuSense localization) to validate the real-time data access and inference capabilities on a limited scale. Establish baseline performance metrics.

Phase 3: Model Development & Optimization (8-16 weeks)

Iterative development and training of AI/ML models using GPU-native tooling. Optimize models for low-latency inference, leveraging various backends (e.g., TRT, ONNX). Conduct extensive OTA evaluations and dataset refinement to ensure high accuracy and robust generalization.

Phase 4: Scalable Deployment & Expansion (Ongoing)

Roll out the solution across your network, ensuring scalability and isolation for multiple concurrent dApps. Monitor performance, fine-tune models, and explore additional AI-native RAN and ISAC use cases to maximize long-term value and evolve with 6G standards.

Ready to Transform Your RAN?

Unlock the full potential of AI-native 6G with GPU-accelerated edge inference.
