When graduates of computing degree programs enter the software industry, they will most likely join teams working on legacy code bases developed by people other than themselves. In these so-called brownfield software development settings, generative artificial intelligence (GenAI) coding assistants like GitHub Copilot are rapidly transforming software development practices, yet the impact of GenAI on student programmers performing brownfield development tasks remains underexplored.
This paper investigates how GitHub Copilot influences undergraduate students' programming performance, behaviors, and understanding when completing brownfield programming tasks, in which they add new code to an unfamiliar code base. We conducted a controlled experiment in which 10 undergraduate computer science students completed isomorphic brownfield development tasks with and without Copilot in a legacy web application.
Using a mixed-methods approach combining performance analysis, behavioral analysis, and exit interviews, we found that students completed tasks 34.9% faster (p < 0.05) and made 50% more solution progress (p < 0.05) when using Copilot. Moreover, when using Copilot, students spent 10.63% less time manually writing code (p < 0.05) and 11.6% less time conducting web searches (p < 0.05), providing evidence of a fundamental shift in how they engaged in programming.
Students demonstrated significant improvements in both efficiency and correctness when using GitHub Copilot for brownfield programming tasks.
Figure 1a: Task completion times
Box plots showing 34.9% faster completion with Copilot
Figure 1b: Tests passed
Box plots showing 50% more solution progress with Copilot
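As an illustration of how a figure such as "34.9% faster" can be read, the sketch below computes the relative reduction in mean completion time between two conditions. It assumes that "faster" means a reduction in mean time; the source does not state the exact formula or statistical test used. The function name `percent_reduction` and all timing values are hypothetical and are not data from the study.

```python
from statistics import mean


def percent_reduction(baseline, treatment):
    """Relative reduction (in percent) of the treatment mean versus the baseline mean."""
    b = mean(baseline)
    t = mean(treatment)
    return 100 * (b - t) / b


# Hypothetical completion times in minutes (NOT the study's data).
no_copilot_minutes = [40, 50, 60]
copilot_minutes = [30, 35, 40]

reduction = percent_reduction(no_copilot_minutes, copilot_minutes)
print(f"Completion time reduced by {reduction:.1f}%")
```

The same function applies to any "lower is better" measure, such as time spent on web searches; a "higher is better" measure such as tests passed would need the sign reversed.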
Our behavioral analysis revealed fundamental shifts in how students engage in programming when using Copilot. The traditional read → understand → implement workflow was replaced by a GenAI-mediated prompt → view response → implement pattern.
Figure 2: Activity time distribution
Comparison of time spent on different programming activities with and without Copilot
Figure 3: Code writing methods
Breakdown showing shift from manual code entry to mixed methods with Copilot
Figure 4: Workflow transitions
Markov transition diagrams showing emergence of GenAI-mediated coding cycle
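A Markov transition diagram of this kind can be derived from a coded sequence of programming activities by counting how often each activity follows another. The sketch below is a minimal first-order estimate under that assumption; the activity labels, the example session, and the function name `transition_probabilities` are hypothetical and do not reproduce the study's coding scheme or results.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order transition probabilities from a coded activity sequence."""
    counts = defaultdict(Counter)
    # Count each observed (current activity -> next activity) pair.
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    # Normalize each row of counts so outgoing probabilities sum to 1.
    probs = {}
    for state, nexts in counts.items():
        total = sum(nexts.values())
        probs[state] = {n: c / total for n, c in nexts.items()}
    return probs


# Hypothetical coded session (NOT the study's data).
session = [
    "prompt", "view_response", "implement",
    "prompt", "view_response", "implement",
    "read", "prompt",
]

probs = transition_probabilities(session)
for state, nexts in probs.items():
    print(state, nexts)
```

Each outer key is a node in the diagram and each inner entry is a weighted edge, so a recurring cycle such as prompt, view response, implement shows up as a chain of high-probability transitions.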
To ensure a fair comparison, we designed two equivalent brownfield programming tasks of similar complexity and scope. Each task required participants to implement new functionality across three sequential subtasks with parallel structure. To control for potential task-specific effects, we counterbalanced the assignment of Copilot availability across the two tasks: half the participants used Copilot for the Add Distance feature and no Copilot for Add Picture, while the other half used no Copilot for Add Distance and Copilot for Add Picture. The task specifications shown below represent the no-Copilot version of the Add Distance feature and the Copilot version of the Add Picture feature.
Complexity Metric | Add Distance | Add Picture |
---|---|---|
Lines of code | 80 | 71 |
Program statements | 29 | 28 |
Variables | 4 | 4 |
Control structures | 3 | 3 |
Operations | 23 | 21 |
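The counterbalancing described above can be sketched as a simple alternating assignment, so that each pairing of feature and condition receives half of the participants. This is an illustrative sketch only: the participant IDs and the function name `counterbalance` are hypothetical, and the source does not say how participants were allocated to the two groups (a real study would typically randomize the order first).

```python
def counterbalance(participants):
    """Alternate which feature is paired with Copilot across participants."""
    assignments = {}
    for index, participant in enumerate(participants):
        if index % 2 == 0:
            # Group A: Copilot on Add Distance, no Copilot on Add Picture.
            assignments[participant] = {
                "Add Distance": "Copilot",
                "Add Picture": "No Copilot",
            }
        else:
            # Group B: no Copilot on Add Distance, Copilot on Add Picture.
            assignments[participant] = {
                "Add Distance": "No Copilot",
                "Add Picture": "Copilot",
            }
    return assignments


# Hypothetical participant IDs for a ten-person study.
participants = [f"P{n}" for n in range(1, 11)]
plan = counterbalance(participants)

for participant, conditions in plan.items():
    print(participant, conditions)
```

Because every participant experiences both conditions, each on a different feature, any difference in difficulty between Add Distance and Add Picture is spread evenly across the Copilot and No Copilot conditions.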
Higher-performing students used Copilot more selectively and strategically than lower performers, preferring granular inline suggestions over wholesale adoption of AI-generated code blocks.
Figure 5: Copilot interaction strategies
Comparison of higher and lower performers' code writing methods with Copilot
We conducted a within-subjects, mixed-methods experimental study with two conditions: No Copilot (control) and Copilot (experimental). Ten undergraduate computer science students completed isomorphic brownfield front-end web development tasks in a legacy web application consisting of 3,818 lines of code.
10 undergraduate CS students (3rd/4th year) with strong web development background but minimal GenAI experience
AWS Workspace with Visual Studio Code, GitHub Copilot extension, and legacy web application
Within-subjects experiment: No Copilot → Copilot conditions with counterbalanced task order
Mixed-methods: performance metrics, behavioral coding, Markov transition analysis, exit interviews
Our findings reveal a crucial tension: GenAI may promote brownfield programming efficiency at the cost of diminished learning. Students expressed concerns about not understanding how or why Copilot suggestions work, highlighting the need for computing educators to develop new pedagogical approaches.
While Copilot significantly enhanced programming efficiency, these productivity gains may come at the cost of a diminished understanding of the legacy code base.
@article{shihab2025copilot,
author = {Shihab, Md Istiak Hossain and Hundhausen, Christopher and Tariq, Ahsun and Haque, Summit and Qiao, Yunhan and Wise, Brian Mulanda},
title = {The Effects of GitHub Copilot on Computing Students' Programming Effectiveness, Efficiency, and Processes in Brownfield Programming Tasks},
journal = {ICER},
year = {2025},
}