Enterprise AI Analysis

The development and evaluation of agricultural question-answering systems based on large language models

By Ayşe Eldem & Hüseyin Eldem

This study evaluates Large Language Models (LLMs) in agriculture by developing and testing a domain-specific question-answering system called AgriQAs. Using GPT-4o and Gemini-2.0-flash with several prompt strategies (Zero-Shot, Chain-of-Thought (CoT), Self-Consistency, Tree-of-Thought (ToT)) and Automatic Prompt Engineering (APE) optimization, the research assesses performance across agricultural topics and difficulty levels. The key finding is that structured prompting matters: GPT-4o with Self-Consistency reached the highest accuracy (95.3%), well above its Zero-Shot result (84.8%). The study highlights the potential of LLMs to serve as digital assistants for agricultural experts, supporting decision-making and sustainable practices, while also noting the need for careful model selection and ethical considerations.

Schedule Your Strategy Session

Executive Impact: LLMs in Agriculture

Integrating LLMs into agricultural QA systems can significantly boost accuracy and operational efficiency for professionals. Our analysis reveals key performance indicators:

95.3% GPT-4o Highest Accuracy

Achieved by GPT-4o with Self-Consistency prompting.

88.4% Gemini-2.0-flash Highest Accuracy

Achieved by Gemini-2.0-flash with Tree-of-Thought (ToT) prompting.

Lowest Error Rate (GPT-4o)

Recorded by GPT-4o with Self-Consistency, indicating high reliability.

10.5 Percentage-Point Accuracy Improvement (GPT-4o)

Difference between GPT-4o Self-Consistency (95.3%) and Zero-Shot (84.8%), highlighting prompt impact.

Deep Analysis & Enterprise Applications

The sections below present the specific findings from the research, organized as enterprise-focused modules.

AgriQAs System Workflow

The AgriQAs system employs a structured process from question input to result evaluation, incorporating advanced LLM prompting and optimization.

1. Create Database (Question-Answer Set)
2. Prompt Engineering (Zero-Shot, CoT, ToT, Self-Consistency)
3. Prompt Optimization with APE
4. Answer Generation (GPT-4o, Gemini-2.0-flash)
5. Evaluation & Statistical Analysis
6. Interpretation of Results
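The workflow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the prompt templates, the `evaluate` helper, the exact-match scoring rule, the toy question set, and the `fake_model` stand-in are all hypothetical. In practice `ask_model` would wrap a call to GPT-4o or Gemini-2.0-flash.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt templates; the study's actual prompts are not reproduced here.
PROMPT_TEMPLATES: Dict[str, str] = {
    "zero_shot": "Question: {question}\nAnswer:",
    "cot": "Question: {question}\nLet's think step by step, then give the final answer.",
}


def build_prompt(strategy: str, question: str) -> str:
    """Fill the template for the chosen prompt strategy."""
    return PROMPT_TEMPLATES[strategy].format(question=question)


def evaluate(
    qa_set: List[Tuple[str, str]],
    strategy: str,
    ask_model: Callable[[str], str],
) -> float:
    """Return the fraction of questions answered correctly.

    Scoring is a case-insensitive exact match, which is an assumption
    made for this sketch rather than the study's evaluation method.
    """
    correct = 0
    for question, expected in qa_set:
        answer = ask_model(build_prompt(strategy, question))
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    return correct / len(qa_set)


def fake_model(prompt: str) -> str:
    """Toy stand-in for an LLM call so the sketch runs offline."""
    return "nitrogen" if "leaf growth" in prompt else "unknown"


# Illustrative two-item question-answer set (not from the AgriQAs dataset).
qa_set = [
    ("Which nutrient mainly drives leaf growth?", "Nitrogen"),
    ("Which soil pH suits blueberries?", "acidic"),
]

accuracy = evaluate(qa_set, "zero_shot", fake_model)
print(f"Zero-shot accuracy on the toy set: {accuracy:.0%}")
```

Keeping the model behind a plain callable makes it easy to run the same question set against several models and prompt strategies, which is the comparison the study reports.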

LLM Performance by Prompt Strategy
Model	Self-Consistency	Tree-of-Thought (ToT)	Chain-of-Thought (CoT)	Zero-Shot
GPT-4o	95.3%	91.7%	87.1%	84.8%
Gemini-2.0-flash	83.7%	88.4%	81.5%	74.8%

Key findings: GPT-4o was most accurate and most consistent with Self-Consistency, while Gemini-2.0-flash peaked with ToT. Zero-Shot was the weakest strategy for both models, suggesting limited reasoning without explicit guidance and emphasizing the need for structured prompts.
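To make the effect of prompting concrete, the accuracy figures reported above can be compared directly. The snippet below is a small worked calculation: the numbers come from the table, while the dictionary layout and the helper names `best_strategy` and `gain_over_zero_shot` are illustrative choices.

```python
from typing import Dict, Tuple

# Accuracy (%) per model and prompt strategy, as reported in the table above.
accuracy_pct: Dict[str, Dict[str, float]] = {
    "GPT-4o": {
        "Self-Consistency": 95.3, "ToT": 91.7, "CoT": 87.1, "Zero-Shot": 84.8,
    },
    "Gemini-2.0-flash": {
        "Self-Consistency": 83.7, "ToT": 88.4, "CoT": 81.5, "Zero-Shot": 74.8,
    },
}


def best_strategy(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the strategy with the highest accuracy and its score."""
    name = max(scores, key=scores.get)
    return name, scores[name]


def gain_over_zero_shot(scores: Dict[str, float]) -> float:
    """Percentage-point gain of the best strategy over Zero-Shot."""
    _, best = best_strategy(scores)
    return round(best - scores["Zero-Shot"], 1)


for model, scores in accuracy_pct.items():
    name, score = best_strategy(scores)
    print(f"{model}: best = {name} ({score}%), "
          f"gain over Zero-Shot = {gain_over_zero_shot(scores)} points")
```

On these figures the gain is 10.5 percentage points for GPT-4o and 13.6 for Gemini-2.0-flash, so the weaker zero-shot model benefits more from structured prompting.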

AgriQAs: A Digital Assistant for Agricultural Experts

Problem: Agricultural experts often face challenges in quickly accessing accurate, context-specific information for complex decision-making, leading to potential inefficiencies and sub-optimal practices.

Solution: The AgriQAs system, powered by optimized LLMs (GPT-4o, Gemini-2.0-flash) and advanced prompting techniques, provides a user-friendly, intelligent decision support and consulting tool. It facilitates reliable information access, supports knowledge exchange, and supports precision agriculture practices. For example, GPT-4o with Self-Consistency reached 95.3% accuracy, indicating that it can deliver consistent and correct answers.

Impact: This system enhances workflow efficiency, reduces information access time, and promotes sustainable agricultural practices. It serves as an infrastructure for smart digital applications, ultimately improving agricultural productivity and sustainability.

"The AgriQAs system developed in this study can serve as a user-friendly, intelligent decision support and consulting tool that can easily be used by agricultural engineers and technicians."

4.6% Lowest Error Rate in Horticulture (GPT-4o Self-Consistency)

GPT-4o with Self-Consistency achieved the lowest error rate in horticulture, demonstrating strong reasoning capabilities even in a specific agricultural domain.

Error Rates by Difficulty & Category in Horticulture
LLM & Prompt	Performance Insights
GPT-4o Horticulture Performance	Self-Consistency showed lowest error rates (1.3% for Easy, 5.3% for Medium, 7.3% for Difficult). CoT had moderate error rates. Zero-Shot had higher error rates (12.0% for Easy, 12.6% for Medium, 20.6% for Difficult).
Gemini-2.0-flash Horticulture Performance	ToT showed lowest error rates (6.0% for Easy, 8.0% for Medium, 20.6% for Difficult). Self-Consistency and CoT performed moderately. Zero-Shot had highest error rates (22.6% for Easy, 24.0% for Medium, 28.6% for Difficult).
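The per-difficulty error rates above can be averaged to compare configurations on a single number. The sketch below assumes each difficulty level carries equal weight; the source does not state how many questions each level contains, so this is an assumption and the result is an unweighted mean.

```python
from statistics import mean
from typing import Dict, Tuple

# Horticulture error rates (%) by difficulty, as listed in the table above.
error_rates: Dict[Tuple[str, str], Dict[str, float]] = {
    ("GPT-4o", "Self-Consistency"): {"Easy": 1.3, "Medium": 5.3, "Difficult": 7.3},
    ("GPT-4o", "Zero-Shot"): {"Easy": 12.0, "Medium": 12.6, "Difficult": 20.6},
    ("Gemini-2.0-flash", "ToT"): {"Easy": 6.0, "Medium": 8.0, "Difficult": 20.6},
    ("Gemini-2.0-flash", "Zero-Shot"): {"Easy": 22.6, "Medium": 24.0, "Difficult": 28.6},
}


def mean_error(rates: Dict[str, float]) -> float:
    """Unweighted mean error rate across difficulty levels (assumes equal weights)."""
    return round(mean(rates.values()), 2)


for (model, strategy), rates in error_rates.items():
    print(f"{model} / {strategy}: mean error = {mean_error(rates)}%")
```

For GPT-4o with Self-Consistency the unweighted mean is about 4.63%, which is close to the 4.6% headline figure. That agreement suggests, but does not confirm, that the headline is an average across the three difficulty levels.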

88.4% Gemini-2.0-flash ToT Accuracy in Crop Production

Gemini-2.0-flash with Tree-of-Thought (ToT) achieved its highest accuracy in Crop Production, demonstrating its efficacy with structured reasoning in this domain.

LLM Strengths in Crop Production
LLM	Performance Highlights
GPT-4o in Crop Production	Self-Consistency: Consistently strong, very low error rates. ToT: High accuracy, particularly for complex questions. Lower overall error rates than Gemini-2.0-flash for all prompt methods except ToT.
Gemini-2.0-flash in Crop Production	ToT: Scored higher than GPT-4o ToT in Crop Production, although the difference was not statistically significant (p > 0.05). Zero-Shot: Notably high error rates, indicating poor performance without guided reasoning.

Advanced ROI Calculator: Agriculture AI Impact

Estimate the potential return on investment for integrating advanced LLM-based QA systems into your agricultural operations. Adjust parameters to see projected annual savings and reclaimed expert hours.

Inputs: agricultural focus; number of agricultural experts; average hours per week spent on information retrieval; average hourly rate of an expert ($).

Outputs: estimated annual cost savings; estimated annual hours reclaimed.

This calculator provides an estimate based on industry averages and the study's performance metrics. Actual ROI may vary depending on specific operational contexts and implementation details.
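The page does not publish the calculator's formula, so the following is only one plausible way to turn the listed inputs into the two outputs. The `time_saved_fraction` and `weeks_per_year` parameters, their default values, and the names `RoiInputs` and `estimate_roi` are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RoiInputs:
    experts: int                       # number of agricultural experts
    hours_per_week: float              # avg. hours/week on information retrieval
    hourly_rate: float                 # avg. hourly rate of an expert ($)
    time_saved_fraction: float = 0.3   # ASSUMED share of retrieval time saved
    weeks_per_year: int = 48           # ASSUMED working weeks per year


def estimate_roi(inputs: RoiInputs) -> Tuple[float, float]:
    """Return (annual hours reclaimed, annual cost savings in $)."""
    hours_reclaimed = (
        inputs.experts
        * inputs.hours_per_week
        * inputs.weeks_per_year
        * inputs.time_saved_fraction
    )
    cost_savings = hours_reclaimed * inputs.hourly_rate
    return hours_reclaimed, cost_savings


# Example with made-up inputs: 10 experts, 5 h/week, $40/h.
hours, savings = estimate_roi(RoiInputs(experts=10, hours_per_week=5, hourly_rate=40))
print(f"Hours reclaimed per year: {hours:.0f}")
print(f"Estimated annual savings: ${savings:,.0f}")
```

With these made-up inputs the estimate comes to roughly 720 hours and $28,800 per year. The result scales linearly with `time_saved_fraction`, so that assumption should be replaced with a value measured in a pilot before the numbers are used for planning.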

Your AI Implementation Roadmap for Agriculture

Deploying an LLM-powered agricultural QA system requires a strategic, phased approach to ensure optimal integration and expert adoption.

Phase 1: Needs Assessment & Data Curation

Identify specific agricultural domains, data sources (e.g., crop data, soil conditions, pest management), and expert information needs. Begin curating and structuring domain-specific datasets for LLM training and fine-tuning, similar to the AgriQAs dataset developed in this study.

Phase 2: LLM Selection & Prompt Engineering

Choose appropriate LLMs (e.g., GPT-4o, Gemini-2.0-flash) based on performance benchmarks and cost-efficiency. Develop and iteratively optimize prompt strategies (CoT, Self-Consistency, ToT) and leverage Automatic Prompt Engineering (APE) for domain-specific query handling. This phase is crucial for achieving high accuracy, as demonstrated by the 95.3% accuracy of Self-Consistency with GPT-4o.
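Self-Consistency, the strongest strategy for GPT-4o in this study, is commonly implemented by sampling several answers to the same prompt and keeping the most frequent one. The sketch below shows that majority-vote idea in general form; it is not the study's code. The `sample_answer` callable, the `normalize` rule, the default of five samples, and the canned answers in the demo are all assumptions.

```python
from collections import Counter
from typing import Callable, Iterator, List


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so equivalent answers match."""
    return " ".join(answer.lower().split())


def self_consistent_answer(
    sample_answer: Callable[[str], str],
    prompt: str,
    n_samples: int = 5,
) -> str:
    """Sample the model n_samples times and return the majority answer.

    sample_answer is a hypothetical wrapper around an LLM call made with
    a non-zero temperature, so that repeated calls can differ.
    """
    answers: List[str] = [normalize(sample_answer(prompt)) for _ in range(n_samples)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Demo with canned answers standing in for five sampled model outputs.
canned: Iterator[str] = iter(
    ["Nitrogen", "nitrogen ", "Potassium", "nitrogen", "Phosphorus"]
)


def fake_sampler(prompt: str) -> str:
    return next(canned)


result = self_consistent_answer(fake_sampler, "Which nutrient mainly drives leaf growth?")
print(result)
```

Each extra sample is an extra model call, so the accuracy gain from voting has to be weighed against cost and latency when choosing `n_samples`.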

Phase 3: Pilot Deployment & Expert Feedback

Deploy a pilot version of the QA system within a controlled environment, involving agricultural engineers and technicians. Gather feedback on accuracy, relevance, and user experience. Refine prompt strategies and knowledge bases based on real-world expert interactions, focusing on critical scenarios like disease diagnosis or pest control recommendations.

Phase 4: Scaled Integration & Continuous Improvement

Integrate the refined LLM-QA system into existing digital agricultural platforms and workflows. Establish a continuous improvement loop for model updates, new data incorporation, and adaptation to evolving agricultural practices and regional specifics. Explore integrations with IoT sensor data for real-time recommendations, ensuring the system remains current and effective.

Ready to Transform Agricultural Intelligence?

Unlock the full potential of AI for your agricultural operations. Our experts can help you design and implement a tailored LLM-powered QA system, just like AgriQAs, to drive efficiency, sustainability, and expert decision-making.

Contact Us

1 888 985 3025

Solutions@OwnYourAi.com
