Education / Assessment AI
ChatGPT and Gemini participated in the Korean College Scholastic Ability Test - Earth Science I
This study analyzes the performance of state-of-the-art LLMs (GPT-4o, Gemini 2.5 Flash, Gemini 2.5 Pro) on the 2025 Korean College Scholastic Ability Test (CSAT) Earth Science I section. It identifies key cognitive limitations in multimodal scientific reasoning, including 'Perception Errors,' 'Calculation-Conceptualization Discrepancy,' and 'Process Hallucination.' The findings suggest how to design 'AI-resistant questions' by exploiting these vulnerabilities to distinguish human competency from AI-generated responses.
Executive Impact: Diagnosing AI's Cognitive Gaps
Understanding AI's fundamental reasoning flaws is crucial for robust assessment design and educational integration. Our analysis reveals key performance metrics and areas of vulnerability.
Deep Analysis & Enterprise Applications
The following modules distill the specific findings of the research into enterprise-focused analyses.
The Perception-Cognition Gap in LLMs
The study reveals a significant 'Perception-Cognition Gap': LLMs struggle to interpret the symbolic meaning of schematic diagrams even when the visual data itself is correctly recognized. This is not merely a visual error but a deeper failure to connect visual information with the underlying scientific concepts. Key sub-categories include Visual Data Misreading (9 cases, 25.00%) and Schematic Misinterpretation (6.5 cases, 18.06%).
Conceptual Application Challenges
LLMs demonstrate 'Calculation-Conceptualization Discrepancy', successfully performing calculations but failing to apply the underlying scientific concepts to the results. This indicates a superficial understanding rather than deep conceptual integration. Sub-categories are Concept Misapplication (4.5 cases, 12.50%) and Calculation-Concept Discrepancy (1 case, 2.78%).
Flawed Reasoning and Process Hallucination
A critical vulnerability identified is the LLMs' tendency to skip complex reasoning steps and generate plausible but unfounded conclusions, termed 'Process Hallucination.' They also exhibit 'Flawed Reasoning' by making logical leaps or setting false premises. Sub-categories: Flawed Reasoning (7 cases, 19.44%), Spatio-temporal Failure (2 cases, 5.56%), Factual Hallucination (2 cases, 5.56%), and Process Hallucination (4 cases, 11.11%).
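As a quick consistency check, the sub-category counts listed across the three modules above can be tallied against their reported percentages. The counts and category names are the study's own; the interpretation that fractional counts (6.5, 4.5) represent errors attributed to two categories at once is our assumption:

```python
# Error-category counts as reported in the study.
# Fractional counts presumably split one error across two categories (assumption).
error_counts = {
    "Visual Data Misreading": 9,
    "Schematic Misinterpretation": 6.5,
    "Concept Misapplication": 4.5,
    "Calculation-Concept Discrepancy": 1,
    "Flawed Reasoning": 7,
    "Spatio-temporal Failure": 2,
    "Factual Hallucination": 2,
    "Process Hallucination": 4,
}

total = sum(error_counts.values())  # 36.0 weighted cases in total
for name, count in error_counts.items():
    print(f"{name}: {count} cases, {count / total:.2%}")
```

Every reported percentage resolves to a share of 36 weighted cases (e.g. 6.5/36 = 18.06%), which confirms the figures are drawn from a single error pool rather than per-model tallies.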
Model Performance Comparison
| Model | Full-Page Input (Accuracy) | Optimized Input (Accuracy) |
|---|---|---|
| Gemini 2.5 Flash | 8% | 20% |
| GPT-4o | 14% | 22% |
| Gemini 2.5 Pro | 28% | 68% |
| Human Examinee (Top) | N/A | 95%+ |
AI-Resistant Question Design: Leveraging LLM Weaknesses
By exploiting the identified vulnerabilities, educators can design questions that effectively distinguish genuine human understanding from AI-generated responses. For instance, items that require interpreting atypical schematic diagrams (targeting the Perception-Cognition Gap), or multi-step problems in which procedural calculations must be connected to their deeper scientific meaning (targeting the Calculation-Conceptualization Discrepancy), can serve as powerful AI-resistant assessments. Questions that demand explicit verification of visual data before reasoning, countering 'Process Hallucination,' are similarly effective.
Your Strategic Implementation Roadmap
Based on the research, we've outlined a phased approach to leverage AI's strengths while mitigating its weaknesses within your organization.
Phase 1: Vulnerability Assessment & Gap Analysis
Identify specific AI cognitive gaps within your enterprise data and processes, leveraging insights from CSAT-like reasoning failures.
Phase 2: AI-Resistant Design Prototyping
Develop and prototype AI-resistant assessment strategies or data validation mechanisms tailored to your unique operational challenges.
Phase 3: Human-AI Collaboration Frameworks
Establish protocols for human oversight and verification, building on the observed limitations of AI in deep reasoning and hallucination.
Phase 4: Continuous Monitoring & Refinement
Implement systems for ongoing evaluation of AI performance and adaptation of strategies to maintain assessment fairness and data integrity.
Ready to Transform Your AI Strategy?
Book a personalized consultation to discuss how these insights can be applied to your enterprise, ensuring robust and fair AI integration.