Comparing Copilot AI Models for C# Bug Fixing: GPT-5, Gemini 2.5 Pro, and Others
Sea-Key3106 compares GPT-5, Gemini 2.5 Pro, and Claude Sonnet 4 in GitHub Copilot, plus O3 and O3 High in Windsurf, on a challenging C# bug, highlighting each model’s effectiveness.
Author: Sea-Key3106
In this post, I share my experience debugging a tricky C# delegate issue using various AI models through GitHub Copilot and related tools. The problem involved dynamic invocation of a delegate and an overlooked inheritance hierarchy check for one of the parameters.
The Problem
- Scenario: A bug caused by calling a delegate dynamically in C# without checking the inheritance hierarchy of one of its parameters.
- Goal: Have an AI assistant spot and fix the bug, ideally with robust, elegant code.
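The original code is not shown in the post, so the following is only a minimal sketch of one plausible shape for this kind of bug. It assumes the faulty code compared the argument's runtime type to the delegate's parameter type with exact equality. All names (Animal, Dog, DelegateDemo, CanInvokeExact, CanInvokeAssignable) are hypothetical.

```csharp
using System;

public class Animal { }
public class Dog : Animal { }

public static class DelegateDemo
{
    // Hypothetical buggy guard: exact type equality rejects a Dog argument
    // even though the handler accepts any Animal.
    public static bool CanInvokeExact(Delegate handler, object arg)
    {
        Type parameterType = handler.Method.GetParameters()[0].ParameterType;
        return arg.GetType() == parameterType;
    }

    // Inheritance-aware guard: true when the argument's runtime type is the
    // parameter type, derives from it, or implements it.
    // Null arguments are not handled in this sketch.
    public static bool CanInvokeAssignable(Delegate handler, object arg)
    {
        Type parameterType = handler.Method.GetParameters()[0].ParameterType;
        return parameterType.IsAssignableFrom(arg.GetType());
    }

    public static void Main()
    {
        Action<Animal> handler = a => Console.WriteLine($"Handling {a.GetType().Name}");
        object arg = new Dog();

        Console.WriteLine(CanInvokeExact(handler, arg));      // False
        Console.WriteLine(CanInvokeAssignable(handler, arg)); // True

        if (CanInvokeAssignable(handler, arg))
        {
            handler.DynamicInvoke(arg); // prints "Handling Dog"
        }
    }
}
```

With the exact-equality guard, a valid call is silently skipped for any derived type, which is the sort of behavior that is easy to overlook when reading the code.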
Models Tested
I used the same prompt and code file for each AI model:
- GPT-5 (in Copilot)
  - Failed to find the root cause. Made unrelated code changes.
- Gemini 2.5 Pro (in Copilot)
  - Successfully identified and fixed the root cause as well as a similar issue in the same file.
  - Downside: Kept editing the file for over 10 minutes, requiring manual intervention.
- Claude Sonnet 4 (in Copilot)
  - Successfully found and fixed both the main and a related bug.
  - Used explicit inheritance type-checking rather than IsAssignableFrom.
- O3 (in Windsurf)
  - Found and fixed the main bug but missed a similar related issue.
- O3 High (in Windsurf)
  - Detected and fixed both issues, merging similar conditions efficiently in an if clause.
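The post does not include the models' actual output, so this is only a rough sketch of how an explicit inheritance check can differ from Type.IsAssignableFrom. The class and method names (TypeChecks, DerivesFrom, IsCompatible) are made up for illustration and are not taken from any model's fix.

```csharp
using System;

public static class TypeChecks
{
    // Explicit approach: walk up the base-class chain by hand.
    // This does not see interfaces, only classes.
    public static bool DerivesFrom(Type candidate, Type expected)
    {
        for (var current = candidate; current != null; current = current.BaseType)
        {
            if (current == expected)
            {
                return true;
            }
        }
        return false;
    }

    // Built-in approach: covers the same type, base classes, and interfaces.
    public static bool IsCompatible(Type candidate, Type expected)
    {
        return expected.IsAssignableFrom(candidate);
    }

    public static void Main()
    {
        // Both approaches agree for plain class inheritance.
        Console.WriteLine(DerivesFrom(typeof(ArgumentNullException), typeof(ArgumentException)));  // True
        Console.WriteLine(IsCompatible(typeof(ArgumentNullException), typeof(ArgumentException))); // True

        // They disagree when the expected type is an interface.
        Console.WriteLine(DerivesFrom(typeof(string), typeof(IComparable)));  // False
        Console.WriteLine(IsCompatible(typeof(string), typeof(IComparable))); // True
    }
}
```

Whether the difference matters depends on whether the delegate's parameters are ever typed as interfaces. For class-only hierarchies the two checks give the same answer.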
Observations
- The bug required understanding C#’s inheritance checking, specifically around dynamic delegates.
- GPT-5 underperformed in this scenario, failing to address the relevant code section.
- Gemini 2.5 Pro and Claude Sonnet 4 were both successful, with minor differences in approach.
- O3 High produced the most comprehensive and robust fix.
- It’s unclear if the limitations observed are due to the AI model itself or Copilot’s implementation.
Takeaway
Choosing the right AI model in GitHub Copilot can significantly impact code quality and developer productivity, especially for nuanced issues like inheritance in C#. This single-case comparison highlights practical strengths and weaknesses to consider.
Further Exploration
I plan to test more with Claude Sonnet 4 in Copilot to see if its explicit approach to inheritance checking leads to consistently elegant solutions.
This post appeared first on “Reddit Github Copilot”.