
SIGMACOLLAB: An Application-Driven Dataset for Physically Situated Collaboration

Explore how SIGMACOLLAB's innovative approach to human-AI interaction is setting new standards for physically situated collaboration and what it means for your enterprise.

Unlocking Advanced Human-AI Collaboration

The SIGMACOLLAB dataset advances research in physically situated human-AI collaboration by providing ecologically valid data for training and evaluating AI models. For enterprises, this translates into more fluid, intuitive, and effective mixed-reality assistive systems.

75% Task Success Rate
~14 Hours of Interaction Data
3,296 User Utterances

Deep Analysis & Enterprise Applications

Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.

Dataset Composition & Multimodality

SIGMACOLLAB comprises 85 interactive sessions in which participants are guided through procedural tasks by a mixed-reality AI agent. Data streams include participant and system audio, egocentric camera views (RGB, depth, grayscale), and head, hand, and gaze tracking. Post-hoc annotations add manual transcripts and word-level timings.

The dataset includes a rich set of multimodal data streams (an alignment sketch follows the list):
  • Color Camera View: 896 × 504 pixels @ 15Hz
  • Depth Camera View: 320 × 288 pixels @ 5Hz
  • Grayscale Camera Views: 640 × 480 pixels @ 15Hz (left/right front)
  • Head Pose + Eye Gaze: 30Hz
  • Hands Pose: 20Hz
  • Audio: 1-channel, 32-bit PCM @ 16kHz
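
These streams arrive at different nominal rates, so any model that consumes more than one of them must align samples on a shared timeline. Below is a minimal, purely illustrative Python sketch of the basic rate-alignment arithmetic; the stream names and the nearest-frame policy are assumptions for illustration, not an official loader:

```python
# Nominal sampling rates of the streams listed above. Audio (16 kHz) is
# effectively continuous relative to these and is omitted.
STREAM_RATES_HZ = {
    "color": 15.0,      # 896 x 504 RGB
    "depth": 5.0,       # 320 x 288
    "grayscale": 15.0,  # 640 x 480, left/right front
    "head_gaze": 30.0,  # head pose + eye gaze
    "hands": 20.0,      # hand pose
}

def nearest_frame_index(t_seconds: float, rate_hz: float) -> int:
    """Index of the frame closest to time t in a stream sampled at rate_hz."""
    return round(t_seconds * rate_hz)

def align_at(t_seconds: float) -> dict:
    """Nearest frame index in every stream for a single query time."""
    return {name: nearest_frame_index(t_seconds, hz)
            for name, hz in STREAM_RATES_HZ.items()}

if __name__ == "__main__":
    # At t = 2.0 s: color frame 30, depth frame 10, head/gaze sample 60, ...
    print(align_at(2.0))
```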
85 task-execution sessions by 21 untrained participants across 8 procedural tasks:
  • Tasks: Nespresso, Hard-drive, Skateboard, Button, Notebook, Margarita, Mojito, Whiskey; complexity varies from 10 to 34 sub-steps, with diverse objects and actions
  • Total duration: 13 hours, 45 minutes, 11 seconds
  • 1,583 sub-step executions and 3,296 user utterances recorded
  • Overall task success rate: 75%
Post-hoc annotations enhance data utility (a word-error-rate sketch follows the list):
  • Manual Transcriptions: Correct the runtime speech recognition output, whose average session-level word error rate was 20.2%.
  • Word-Level Timestamps: Computed for user and system utterances via forced alignment.
  • Task Success Classification: Sessions categorized as correctly-completed, incorrectly-completed, abandoned, or system-failure.
  • Gaze Signal Post-Processing: Annotations for gaze-to-interface periods and projected gaze points onto image streams.
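
The 20.2% figure is the standard word error rate: word-level edit distance divided by reference length. The sketch below pairs a SessionOutcome enum mirroring the four success categories above with a minimal WER helper; it is illustrative only, not the project's annotation tooling:

```python
from enum import Enum

class SessionOutcome(Enum):
    """The four task-success categories used in the post-hoc annotations."""
    CORRECTLY_COMPLETED = "correctly-completed"
    INCORRECTLY_COMPLETED = "incorrectly-completed"
    ABANDONED = "abandoned"
    SYSTEM_FAILURE = "system-failure"

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution / match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

if __name__ == "__main__":
    # One substitution over five reference words -> WER of 0.2
    print(word_error_rate("attach the hard drive tray",
                          "attach the heart drive tray"))
```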

Application-Driven Approach & Ecological Validity

The dataset was collected by having participants interact with SIGMA, an open-source mixed-reality AI assistant. This application-driven approach yields more ecologically valid data than human-human interaction datasets or static collections, because it reflects how users naturally interact with an AI agent in physical settings. It also surfaces novel challenges, such as self-talk detection, and offers a platform for testing models in the real world.
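
Self-talk detection is a concrete example of these novel challenges, and the dataset's gaze-to-interface annotations make simple addressee baselines possible. The heuristic below is a hypothetical sketch, not a method from the paper; the 30% overlap threshold and the interval representation are assumptions:

```python
def likely_addressed_to_agent(utt_start: float,
                              utt_end: float,
                              gaze_on_interface: list[tuple[float, float]],
                              min_overlap: float = 0.3) -> bool:
    """Crude addressee heuristic: treat an utterance as directed at the agent
    if the user's gaze rests on the agent's interface for at least
    `min_overlap` of the utterance's duration; otherwise flag possible
    self-talk. Times are in seconds; gaze_on_interface is a list of
    (start, end) intervals from the gaze post-processing annotations.
    """
    duration = max(utt_end - utt_start, 1e-9)
    overlap = sum(
        max(0.0, min(utt_end, g_end) - max(utt_start, g_start))
        for g_start, g_end in gaze_on_interface
    )
    return overlap / duration >= min_overlap

if __name__ == "__main__":
    # Gaze covers 1.5 s of a 2 s utterance -> likely addressed to the agent
    print(likely_addressed_to_agent(10.0, 12.0, [(10.5, 12.0)]))
```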

14 Hours of Ecologically Valid Interaction

Interactive Data Collection Methodology

The collection method has participants perform procedural tasks with a mixed-reality AI assistant, fostering realistic interactions and cognitive states. Because researchers can deploy and test models within the original application, they can evaluate end-to-end effects on task-level performance and user satisfaction, and refine their models iteratively.
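
Because models can be redeployed into the live application, their effect on task-level outcomes is directly measurable. Here is a minimal sketch of one such end-to-end comparison, assuming per-session success counts and a two-proportion z-test; the test choice and the numbers are illustrative, not the paper's protocol:

```python
from math import erfc, sqrt

def compare_success_rates(successes_a: int, n_a: int,
                          successes_b: int, n_b: int) -> float:
    """Two-sided p-value of a two-proportion z-test comparing the task
    success rates of two system variants deployed in live sessions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return erfc(abs(z) / sqrt(2))  # two-sided tail probability

if __name__ == "__main__":
    # Hypothetical: baseline succeeds in 64/85 sessions, a new variant in 75/85
    print(compare_success_rates(64, 85, 75, 85))
```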

Enterprise Process Flow

User interacts with SIGMA AI agent
Performs procedural tasks
Data streams recorded (audio, visual, tracking)
AI provides real-time assistance
Models evaluated in live interactions
Iterative refinement of AI

Comparison with Existing Datasets

While numerous egocentric datasets (e.g., Ego4D, EPIC-KITCHENS) focus on computer vision tasks, SIGMACOLLAB is distinguished by its interactive, human-AI focus. Unlike human-human interactive datasets (e.g., HoloAssist), SIGMACOLLAB uses a standalone AI assistant, reflecting real-world application challenges more accurately.

Feature | SIGMACOLLAB | Traditional Egocentric Datasets (e.g., Ego4D)
Interaction Type | Human-AI collaboration; mixed-reality assistance | Single actor performing an activity; some human-human interaction
Data Modalities | Egocentric RGB-D; audio; head/hand/gaze tracking | Egocentric RGB-D; audio; limited tracking
Focus | Interaction-related challenges; grounding; proactive interventions; user cognitive states | Action recognition; object detection; forecasting
Ecological Validity | High (application-driven); reflects AI interaction style | Varied (spontaneous to structured); human-human interaction style

Future Benchmarks & Research Opportunities

SIGMACOLLAB aims to establish new benchmarks for real-time collaboration in physically situated settings. This includes challenges in proactive interventions, grounding, reference generation/resolution, and detecting user cognitive states (e.g., frustration, confusion). The open-source nature of SIGMA allows researchers to build upon this work.

Catalyzing Future AI Development

Bridging Lab to Real-World

The dataset's application-driven approach ensures that models developed on SIGMACOLLAB are directly applicable to real-world mixed-reality assistance scenarios, offering a unique testing ground for generalizable AI solutions.

Unveiling Novel Interaction Challenges

Beyond traditional computer vision, SIGMACOLLAB highlights specific interaction-related issues such as identifying and responding to user self-talk, understanding nuanced referential expressions, and dynamically adapting assistance based on user state.

Open-Source Ecosystem

Leveraging the open-source SIGMA platform, researchers can integrate and test their models directly, fostering a collaborative environment for continuous improvement and innovation in human-AI collaboration.

Calculate Your Potential ROI with Advanced AI

Estimate the significant time savings and cost reductions your enterprise could achieve by implementing intelligent assistive AI systems, leveraging insights from physically situated collaboration datasets.
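
As a rough sketch, the arithmetic behind such an estimate might look like the following (every input below, including the 48-week working year, is an illustrative assumption, not a figure from the dataset):

```python
def projected_annual_savings(tasks_per_week: float,
                             minutes_saved_per_task: float,
                             hourly_cost: float,
                             weeks_per_year: int = 48) -> tuple[float, float]:
    """Hours reclaimed annually and the corresponding cost savings."""
    hours = tasks_per_week * minutes_saved_per_task / 60 * weeks_per_year
    return hours, hours * hourly_cost

if __name__ == "__main__":
    # Hypothetical: 200 assisted tasks/week, 5 minutes saved each, $55/hour
    hours, dollars = projected_annual_savings(200, 5, 55.0)
    print(f"Hours reclaimed annually: {hours:,.0f}")  # 800
    print(f"Annual cost savings: ${dollars:,.0f}")    # $44,000
```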


Your Path to AI-Powered Collaboration

A typical implementation timeline for integrating physically situated AI assistance into enterprise workflows.

Phase 1: Discovery & Strategy

Comprehensive analysis of current workflows, identification of key areas for AI augmentation, and development of a tailored implementation strategy leveraging insights from datasets like SIGMACOLLAB.

Phase 2: Pilot & Customization

Deployment of a pilot AI-assistive system in a controlled environment, customization based on specific task requirements, and initial user feedback integration.

Phase 3: Rollout & Optimization

Phased rollout across the enterprise, continuous monitoring of performance, and iterative optimization of AI models for maximum efficiency and user satisfaction.

Ready to Transform Your Operations?

Leverage the power of physically situated AI collaboration to enhance efficiency, reduce costs, and empower your workforce. Our experts are ready to guide you.

Book Your Free Consultation.