Enterprise AI Analysis: FRAGATA: Semantic Retrieval of HPC Support Tickets via Hybrid RAG over 20 Years of Request Tracker History

FRAGATA: Revolutionizing HPC Support with Semantic Retrieval and 20 Years of Data

Supercomputing centers accumulate vast amounts of knowledge in support tickets, but the built-in search of ticketing systems such as Request Tracker (RT) fails to unlock this operational memory. FRAGATA is a semantic retrieval system that draws on over two decades of RT history, turning unstructured ticket data into a searchable knowledge base for HPC support teams.

Executive Impact: Unleashing Institutional Knowledge for Operational Excellence

By overcoming the limitations of conventional keyword-based search, FRAGATA aims to make HPC support operations more efficient: past resolutions become easy to find, which helps avoid redundant effort and speeds up problem-solving.

Deep Analysis & Enterprise Applications

The Challenge with Traditional Support Systems

HPC centers manage complex, heterogeneous platforms. Support teams act as a critical interface, resolving diverse incidents. The Galician Supercomputing Center (CESGA) has used Request Tracker (RT) for over two decades, accumulating an invaluable operational memory. However, RT's built-in search (version 4.4.1) has severe limitations:

  • Does not index the full ticket body.
  • Is case-sensitive and does not tolerate typos.
  • Lacks morphological variant normalization.
  • Has no notion of semantic similarity.

These limitations lead to duplicated effort, loss of institutional knowledge, and increased mean resolution time, making past solutions effectively "invisible" to support staff.

Hybrid RAG: The FRAGATA Retrieval Paradigm

FRAGATA employs a hybrid Retrieval-Augmented Generation (RAG) paradigm to achieve high-quality semantic retrieval. Its core components include:

  • Dense Retrieval: Uses embeddings (vector representations of text meaning) generated by models like Sentence-BERT to find semantically similar documents.
  • Lexical Retrieval: Integrates classical BM25 for exact keyword matching, crucial for specific terminology.
  • Query Variants: Generates canonical, spell-corrected, intent-based, and translated (Spanish/English) variants of each query to maximize recall across the trilingual (Spanish, Galician, English) corpus.
  • Weighted Fusion (WRRF): Combines results from the dense and lexical channels using Weighted Reciprocal Rank Fusion, which relies only on rank positions and is therefore robust to differing score scales.
  • Query-Aware Reranking: A cross-encoder model (e.g., mmarco-mMiniLMv2) re-evaluates top candidates against various prompts, applying domain heuristics (boosts/penalties) for greater precision.

This multi-stage approach ensures robust and relevant retrieval, even with noisy, multilingual, and technically complex queries.
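The fusion step can be illustrated with a minimal sketch of weighted reciprocal rank fusion. This is not FRAGATA's code: the function name, the channel names, the equal weights, and the smoothing constant k=60 (a value commonly used for RRF) are all assumptions chosen for illustration. Each document scores w / (k + rank) in every channel where it appears, and the scores are summed.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document ids with weighted reciprocal rank fusion.

    rankings: dict mapping channel name -> list of doc ids, best first
    weights:  dict mapping channel name -> non-negative weight (default 1.0)
    k:        smoothing constant; 60 is an assumed, commonly used value
    """
    scores = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        w = weights.get(channel, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Only the rank matters, so raw BM25 and cosine scores
            # never need to be put on a common scale.
            scores[doc_id] += w / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical ticket ids returned by each channel.
rankings = {
    "dense": ["t3", "t1", "t2"],
    "bm25": ["t1", "t4", "t3"],
}
weights = {"dense": 1.0, "bm25": 1.0}
fused = weighted_rrf(rankings, weights)
fused_ids = [doc_id for doc_id, _ in fused]
```

With these inputs, the ticket that ranks well in both channels (t1) comes out ahead of tickets that appear in only one. Raising the weight of one channel shifts the balance toward exact keyword matches or toward semantic similarity.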

Robust Deployment & Incremental Ingestion for HPC

FRAGATA is designed for production with continuous availability and efficient resource utilization:

  • Hybrid Deployment: Frontend and API run on a virtual machine, while computationally expensive tasks (embedding generation, re-indexing) are offloaded to the FinisTerrae III supercomputer's NVIDIA T4 GPUs.
  • Incremental Ingestion: A weekly batch pipeline extracts new/modified RT tickets, normalizes them, and adds them to the knowledge base. This process uses a transactional watermark to ensure data consistency.
  • Atomic Promotion & Hot-Swap: New indices are built in staging and then atomically promoted, guaranteeing continuous service without downtime during re-indexing. If a build fails, the service gracefully continues with the previous engine.
  • Data Preparation Pipeline: SQL queries extract full ticket history. Messages are normalized (redundancy removal, noise filtering), then chunked into overlapping fragments suitable for embedding models.

This robust architecture ensures scalability, reliability, and efficient processing of large historical datasets.
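The hot-swap and watermark behaviour can be sketched as follows. This is a simplified in-process illustration, not the deployed mechanism: the class name, the string stand-ins for index objects, and the date-style watermarks are hypothetical. The candidate engine is built outside the lock, and the live engine and the watermark are replaced together only if the build succeeds.

```python
import threading


class EngineHolder:
    """Holds the live search engine and the ingestion watermark together."""

    def __init__(self, engine, watermark):
        self._lock = threading.Lock()
        self._engine = engine
        self._watermark = watermark

    def current(self):
        # Readers always see a consistent (engine, watermark) pair.
        with self._lock:
            return self._engine, self._watermark

    def rebuild(self, build_fn, new_watermark):
        """Build a new engine in staging and promote it only on success."""
        try:
            candidate = build_fn()  # slow work happens outside the lock
        except Exception:
            # Build failed: keep serving the previous engine, and leave the
            # watermark untouched so the same tickets are retried next run.
            return False
        with self._lock:
            self._engine = candidate
            self._watermark = new_watermark
        return True


def failing_build():
    raise RuntimeError("simulated build failure")


# Hypothetical values: strings stand in for real index objects.
holder = EngineHolder(engine="index-v1", watermark="2024-01-01")
ok_failed = holder.rebuild(failing_build, "2024-01-08")
state_after_failure = holder.current()
ok_success = holder.rebuild(lambda: "index-v2", "2024-01-08")
state_after_success = holder.current()
```

Advancing the watermark in the same critical section as the engine swap means a failed run never marks tickets as ingested when they are not yet searchable.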

Enterprise Process Flow: FRAGATA's Data Pipeline

Script for reading, formatting and compacting RT tickets
Conversation turn identification
Ticket splitting into chunks
Embedding generation per chunk for search
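The chunking step in this flow can be sketched as a sliding window over words. The window size and overlap below are illustrative assumptions, not the values FRAGATA uses, and a production pipeline would more likely count model tokens than whitespace-separated words.

```python
def chunk_words(text, size=200, overlap=50):
    """Split text into overlapping chunks of at most `size` words.

    Consecutive chunks share `overlap` words, so a sentence that straddles
    a boundary still appears intact in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Tiny demonstration with a 10-word message and a 4-word window.
message = " ".join(f"w{i}" for i in range(10))
demo_chunks = chunk_words(message, size=4, overlap=2)
```

Each chunk would then be embedded separately, so a long ticket thread contributes several vectors instead of one diluted representation.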

FRAGATA vs. RT Native Search Capabilities

Feature | RT Native Search (v4.4.1) | FRAGATA (Hybrid RAG)
Full Ticket Body Indexing | No | ✓ Yes
Semantic Search | No | ✓ Yes (Hybrid RAG)
Typo Tolerance | No (Case-sensitive) | ✓ Yes (Spell correction, embeddings)
Multilingual Support | No | ✓ Yes (Dictionary translation, multilingual embeddings)
Knowledge Reuse | Limited | ✓ Enhanced (Over 20 years of history)

CESGA Deployment: Real-World Impact

FRAGATA is currently deployed in CESGA's internal production environment, yielding significant qualitative improvements over RT's native search capabilities. Key benefits include:

  • Multilingual Queries: Successfully retrieves old tickets originally reported in English, even when queries are in Spanish or Galician, thanks to dictionary-based translation and multilingual embeddings.
  • Typo & Variant Tolerance: Resolves issues with morphological variants or typos in scientific application names that RT's strict lexical search cannot handle.
  • Intent-Based Retrieval: Effectively answers queries based on intent (e.g., "how to install X" or "error when running Y") where the underlying meaning is paramount, not just exact keywords.
  • Comprehensive Search: Allows combined searches by content, date range, or department, providing highly specific and relevant results.

This demonstrates FRAGATA's ability to turn decades of support history into an actionable, intelligent resource.
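The typo tolerance and dictionary-based translation described above can be sketched as a small query-variant generator. Everything here is a hypothetical stand-in: the vocabulary, the Spanish-to-English dictionary, and the similarity cutoff are invented for illustration, and fuzzy matching with Python's difflib is a swapped-in technique, not necessarily the spell corrector FRAGATA uses. Intent-based variants are omitted.

```python
import difflib

# Hypothetical domain vocabulary and Spanish-to-English dictionary.
VOCAB = ["gromacs", "gaussian", "slurm", "error", "install"]
ES_EN = {"instalar": "install", "fallo": "error"}


def canonical(query):
    """Lowercase the query and collapse whitespace."""
    return " ".join(query.lower().split())


def spell_correct(query, vocab=VOCAB, known=ES_EN, cutoff=0.75):
    """Replace unknown tokens with their closest vocabulary entry, if any."""
    corrected = []
    for token in query.split():
        if token in vocab or token in known:
            corrected.append(token)  # already a recognized word
            continue
        match = difflib.get_close_matches(token, vocab, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return " ".join(corrected)


def translate(query, dictionary=ES_EN):
    """Word-by-word dictionary translation; unknown words pass through."""
    return " ".join(dictionary.get(token, token) for token in query.split())


def query_variants(query):
    base = canonical(query)
    fixed = spell_correct(base)
    variants = [base, fixed, translate(fixed)]
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(variants))


variants = query_variants("Instalar  GROMAX")
```

Each variant would be sent to both retrieval channels, so a misspelled Spanish query can still reach tickets written in English about the correctly spelled application.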

Your Path to Enhanced Knowledge Retrieval

Our structured approach ensures a smooth transition and rapid deployment, maximizing the impact on your operational efficiency.

Phase 1: Data Integration & Normalization

Extract, clean, and prepare your historical ticket data, including RT and complementary sources. This critical step ensures high-quality inputs for the semantic retrieval engine.

Phase 2: Hybrid RAG Engine Deployment

Set up the FRAGATA architecture, including dense and lexical retrieval, reranking, and HPC offloading. We configure the system for optimal performance on your infrastructure.

Phase 3: Customization & Team Enablement

Tailor the system with domain heuristics and train your support staff for optimal usage and continuous improvement. We ensure your team can fully leverage FRAGATA's capabilities.
