Enterprise AI Analysis
Detecting GenAI assistance in programming assessments with over-uniqueness and sample matching
In engineering education, Generative Artificial Intelligence (GenAI) might be misused to complete assessments with limited understanding. In courses that allow the use of GenAI, students might also forget to acknowledge its assistance. There is a need to identify such assistance. We present an automated detector built on two mechanisms: over-uniqueness and sample matching. GenAI-assisted submissions are identified based on their uniqueness relative to other submissions and their similarity to a GenAI-generated sample. Unlike existing approaches, our detector requires neither training data nor dedicated rules for each programming/scripting language. Further, the method can be integrated into any existing similarity detector used to identify plagiarism. The detector covers five similarity measurements, two similarity modes, and eight programming/scripting languages. Our evaluation of four data sets with thousands of submissions shows that our detector is effective (71% MAP). However, many factors can affect its effectiveness, including submission length and student attempts to align GenAI-generated code with their own style. Combining both mechanisms does not result in higher effectiveness, yet it takes longer to process.
Executive Impact at a Glance
This research introduces a novel, practical GenAI detection method, designed to integrate seamlessly into existing academic integrity frameworks while offering robust performance across diverse programming contexts.
Deep Analysis & Enterprise Applications
The Rise of GenAI in Academia and the Need for Detection
Generative Artificial Intelligence (GenAI) is rapidly transforming academia, offering unprecedented ways to access information and complete tasks. While GenAI can be a powerful learning aid, helping students understand program flow and error messages and providing feedback on code, it also introduces significant challenges, particularly in programming assessments. Because GenAI can generate solutions so easily, students might complete assessments without genuine understanding or forget to acknowledge its assistance, leading to academic integrity issues.
Current GenAI detectors often fall short for programming tasks: they require extensive training data, rely on language-specific syntax rules, or lack integration with existing plagiarism detection systems. This creates a practical barrier for instructors who need to identify unacknowledged GenAI assistance efficiently. This research addresses the gap with a novel, automated GenAI detector that is both practical and effective.
Our Novel GenAI Detection Methodology
Our detector identifies GenAI-assisted submissions through two primary mechanisms: over-uniqueness and sample matching. GenAI-generated code often exhibits a unique style, distinct from typical student submissions, and produces similar solutions for common prompts. By analyzing these characteristics, we can effectively flag potentially assisted code. A key advantage is its independence from training data and language-specific rules, making it highly adaptable across various programming contexts.
The detector integrates with existing code similarity analysis tools such as SSTRANGE and supports five similarity measurements (Cosine, Jaccard, MinHash, Super-Bit, and RKRGST), each available in a standard and a sensitive mode; the sensitive mode additionally accounts for identifier names and constants. This range of options lets instructors balance detection accuracy against processing time.
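To make the two modes concrete, here is a minimal Python sketch that tokenizes code and compares two submissions with Cosine and Jaccard similarity. The tokenizer, the keyword list, and the `<ID>`/`<LITERAL>` placeholders are simplified assumptions for illustration; they do not reproduce SSTRANGE's actual preprocessing.

```python
import math
import re
from collections import Counter

# Rough tokenizer: identifiers/keywords, numbers, string literals, multi-char operators, other symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\"[^\"]*\"|'[^']*'|==|!=|<=|>=|\S")

# Illustrative (non-exhaustive) keyword list kept verbatim in standard mode.
KEYWORDS = {"def", "return", "if", "else", "for", "while", "import", "class",
            "public", "static", "void", "int", "new"}

def tokenize(code, sensitive):
    """Sensitive mode keeps identifier names and constants; standard mode generalizes them."""
    tokens = TOKEN_RE.findall(code)
    if sensitive:
        return tokens
    generalized = []
    for tok in tokens:
        if tok in KEYWORDS:
            generalized.append(tok)
        elif tok[0].isdigit() or tok[0] in "\"'":
            generalized.append("<LITERAL>")   # numbers and string literals
        elif tok[0].isalpha() or tok[0] == "_":
            generalized.append("<ID>")        # identifiers
        else:
            generalized.append(tok)           # operators and punctuation
    return generalized

def cosine_similarity(a, b):
    """Cosine similarity over token frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def jaccard_similarity(a, b):
    """Jaccard similarity over sets of distinct tokens."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0

# Two submissions that differ only in identifier names: identical in standard mode,
# noticeably less similar in sensitive mode.
code_a = "def total(xs):\n    return sum(xs)"
code_b = "def add_all(values):\n    return sum(values)"
for sensitive in (False, True):
    ta, tb = tokenize(code_a, sensitive), tokenize(code_b, sensitive)
    print(sensitive, round(cosine_similarity(ta, tb), 2), round(jaccard_similarity(ta, tb), 2))
```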
Enterprise Process Flow
The process begins with preprocessing submissions to normalize code and generalize identifiers. Next, similarity scores are calculated. These scores then feed into our over-uniqueness and sample matching algorithms. Finally, a detailed report highlights suspicious submissions, enabling educators to make informed decisions about academic integrity.
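One plausible way to turn those similarity scores into a ranked report is sketched below, reusing `tokenize` and `cosine_similarity` from the previous example. The use of maximum peer similarity for over-uniqueness, the max-of-both-scores ranking, and the `report` helper are illustrative assumptions, not the paper's exact algorithm.

```python
def similarity_matrix(token_lists):
    """Pairwise similarity between all submissions (any of the five measurements could be plugged in)."""
    n = len(token_lists)
    sims = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(token_lists[i], token_lists[j])
            sims[i][j] = sims[j][i] = s
    return sims

def over_uniqueness_scores(token_lists):
    """Over-uniqueness: a submission is suspicious when it resembles no peer,
    so score = 1 - (highest similarity to any other submission)."""
    sims = similarity_matrix(token_lists)
    return [1.0 - max(row) for row in sims]

def sample_matching_scores(token_lists, genai_samples):
    """Sample matching: a submission is suspicious when it closely resembles
    any GenAI-generated sample, so score = highest similarity to a sample."""
    return [max(cosine_similarity(t, s) for s in genai_samples) for t in token_lists]

def report(names, codes, genai_codes, sensitive=True, top_k=10):
    """Rank submissions so the most likely GenAI-assisted ones appear first."""
    tokens = [tokenize(c, sensitive) for c in codes]
    samples = [tokenize(c, sensitive) for c in genai_codes]
    uniqueness = over_uniqueness_scores(tokens)
    matching = sample_matching_scores(tokens, samples)
    # Illustrative combination: a submission is flagged if either mechanism scores it highly.
    ranked = sorted(zip(names, uniqueness, matching),
                    key=lambda row: max(row[1], row[2]), reverse=True)
    return ranked[:top_k]
```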
Robust Evaluation & Performance Metrics
Our detector was evaluated across four diverse Python and Java data sets, encompassing thousands of student submissions and GenAI-assisted examples. Effectiveness was assessed with Mean Average Precision (MAP), which focuses on the ranked position of identified GenAI-assisted submissions; efficiency was measured by processing time.
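For context, MAP rewards a detector that ranks genuinely GenAI-assisted submissions near the top of its report. The snippet below computes it using the standard information-retrieval definition; the function names and example data are illustrative and not taken from the paper.

```python
def average_precision(ranked_ids, genai_ids):
    """Average precision for one assessment: precision is sampled at the rank of
    each correctly identified GenAI-assisted submission."""
    relevant = set(genai_ids)
    hits, precision_sum = 0, 0.0
    for rank, sub_id in enumerate(ranked_ids, start=1):
        if sub_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over several assessments; each run is (ranked_ids, genai_ids)."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)

# Two GenAI-assisted submissions ranked 1st and 3rd: AP = (1/1 + 2/3) / 2 ≈ 0.83
print(average_precision(["s07", "s02", "s09", "s04"], ["s07", "s09"]))
```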
The over-uniqueness mechanism showed strong performance, especially on assessments expecting longer solutions (Exam dataset: 82% average MAP). The sensitive mode, which accounts for identifier names and constants, often improved detection, particularly for distinct GenAI styles. However, its effectiveness was significantly reduced when GenAI-generated code was explicitly aligned to student styles (Weekly Align dataset: 26% average MAP).
The sample matching mechanism proved highly effective, achieving an 80% average MAP on the Weekly dataset. Its strength lies in identifying code segments similar to known GenAI samples, with sensitive mode often yielding statistically significant improvements. This mechanism is particularly strong when students are less fluent in disguising GenAI output.
Combining both mechanisms yielded an overall average of 72% MAP. While sometimes outperforming individual mechanisms, the combination did not consistently result in higher effectiveness than sample matching alone, suggesting that not all GenAI-assisted submissions are simultaneously unique and similar to a GenAI sample. Efficiency analysis consistently showed MinHash and Super-Bit as the fastest measurements due to their locality-sensitive hashing and binning mechanisms, while RKRGST was the slowest due to its quadratic complexity.
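The speed advantage of MinHash (and similarly Super-Bit) comes from replacing exhaustive pairwise comparison with compact signatures and binning, so only candidate pairs that share a bin are compared in full. The sketch below shows a textbook MinHash with banding over the tokenized submissions from the earlier examples; the shingle size, hash count, and band count are illustrative choices, and Super-Bit, an LSH scheme suited to cosine similarity, is omitted for brevity.

```python
import hashlib

def shingles(tokens, k=4):
    """k-token shingles (n-grams) from a tokenized submission."""
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def _hash(seed, value):
    """Deterministic seeded hash via MD5 (illustrative; real implementations use cheaper hashes)."""
    return int(hashlib.md5(f"{seed}:{value}".encode()).hexdigest(), 16)

def minhash_signature(shingle_set, num_hashes=64):
    """For each seeded hash function, keep the smallest hash over the set:
    a compact fingerprint whose positions tend to match for similar sets."""
    if not shingle_set:
        return [0] * num_hashes
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions approximates the true Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def lsh_bins(signature, bands=16):
    """Split the signature into bands and hash each band; submissions sharing any
    band hash fall into the same bin, so only those candidate pairs are compared in full."""
    rows = len(signature) // bands
    return {hashlib.md5(str(signature[b * rows:(b + 1) * rows]).encode()).hexdigest()
            for b in range(bands)}

# Candidate pairs are those whose bin sets intersect, e.g.:
# if lsh_bins(sig_a) & lsh_bins(sig_b): compare the pair with a full similarity measurement.
```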
Mean Average Precision (MAP) by similarity measurement, mode, and data set:

| Similarity Measurement | Mode | Weekly (%) | Weekly Alt (%) | Weekly Align (%) | Exam (%) |
|---|---|---|---|---|---|
| Cosine | Sensitive | 71 | 80 | 25 | 79 |
| Cosine | Standard | 66 | 73 | 27 | 79 |
| Jaccard | Sensitive | 80 | 88 | 22 | 100 |
| Jaccard | Standard | 76 | 82 | 25 | 95 |
| MinHash | Sensitive | 71 | 83 | 25 | 83 |
| MinHash | Standard | 66 | 70 | 26 | 79 |
| RKRGST | Sensitive | 63 | 74 | 25 | 80 |
| RKRGST | Standard | 56 | 66 | 27 | 71 |
Case Study: Exam Data Set Highlights
The Exam data set presented unique characteristics, featuring longer submissions due to four tasks per exam and a strict "no discussion" policy. In this context, our approach achieved its highest effectiveness, with an 82% average MAP. Notably, Jaccard in sensitive mode reached 100% MAP, demonstrating exceptional precision when GenAI-generated code had minimal external influence and clear stylistic differences. This outcome underscores the detector's power in controlled, high-stakes assessment environments where GenAI assistance is less disguised and solutions are more complex.
Limitations & Future Research Directions
While effective, our current GenAI detector has certain limitations. It was primarily evaluated on introductory programming courses and Python/Java submissions, suggesting a need for replication across diverse programming languages, course levels, and institutional settings. The chosen metrics (MAP, processing time) provide strong indicators, but exploring precision, recall, and ROC curves could offer a more nuanced understanding of performance trade-offs.
Future work will involve testing additional similarity measurements like Winnowing or local alignment, and comparing our detector against existing text-based and programming-specific GenAI detectors in controlled environments. Investigating the impact of submission length, content variation, assessment design, and the proportion of aligned content on detection performance is crucial. We also plan to develop a script to leverage results from popular plagiarism detectors (MOSS, JPlag, Sherlock) and explore additional mechanisms for identifying heavily disguised or aligned GenAI-assisted submissions, potentially by monitoring the creation process for anomalous behaviors. Enhancements to token weighting and the combination of over-uniqueness and sample matching mechanisms are also planned to maximize effectiveness.
Ultimately, this research serves as a stepping stone towards more robust and adaptive tools for maintaining academic integrity in the evolving landscape of AI-assisted learning.
Your AI Implementation Roadmap
A phased approach to integrating advanced AI solutions for academic integrity, ensuring a smooth transition and maximum impact.
Phase 1: Discovery & Strategy
Comprehensive assessment of your current academic integrity challenges, existing systems, and institutional goals. Define key performance indicators and tailor an AI strategy to your unique needs.
Phase 2: Solution Design & Customization
Design and customize the GenAI detection framework, integrating it with your learning management systems and existing plagiarism detectors. Develop custom rules and thresholds based on your assessment types.
Phase 3: Pilot Implementation & Testing
Roll out the solution in a pilot program with selected courses. Gather feedback, conduct rigorous testing, and fine-tune the system for optimal performance and user experience.
Phase 4: Full-Scale Deployment & Training
Deploy the AI integrity solution across your institution. Provide comprehensive training for instructors and administrators on system usage, report interpretation, and best practices.
Phase 5: Continuous Optimization & Support
Ongoing monitoring, performance analysis, and iterative improvements. Benefit from continuous updates, dedicated support, and adaptation to new GenAI models and academic policies.
Ready to Implement AI in Your Enterprise?
Our team specializes in leveraging advanced AI solutions to enhance academic integrity and operational efficiency. Book a free consultation to see how our expertise can benefit your institution.