AI: UC Berkeley study finds grade inflation after ChatGPT
A UC Berkeley analysis of more than 500,000 grades ties post‑November 2022 rises in A's to AI use on homework, not better learning.
TL;DR
- A UC Berkeley analysis of more than 500,000 grades ties post‑November 2022 rises in A's to AI use on homework, not better learning.
- The study found the share of A grades jumped by 13 percentage points, about 30 percent above the 2022 baseline, and average GPA rose by 0.12 points after ChatGPT's November 2022 release.
- It tracks grade trends across eight fall semesters (2018 through 2025) in 319 courses spanning 84 departments and measures each course's AI exposure from the fall 2022 syllabi.
A UC Berkeley study of more than 500,000 grades found that courses heavy on writing and coding experienced sharp grade increases after ChatGPT launched in November 2022, with the share of A grades up 13 percentage points and average GPA rising by 0.12 points.
What did the UC Berkeley study find?
The study found the share of A grades jumped by 13 percentage points, about 30 percent above the 2022 baseline, and average GPA rose by 0.12 points after ChatGPT's November 2022 release. It tracks grade trends across eight fall semesters (2018 through 2025) in 319 courses spanning 84 departments and measures each course's AI exposure from the fall 2022 syllabi.
Beyond mean changes, the grade distribution narrowed: A-minus and B-plus grades were increasingly bumped up to straight A's. The effect concentrates in courses with a high share of writing and coding assignments, the areas where generative AI performs best, according to the study.
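The two headline figures, a 13 percentage point rise and a roughly 30 percent increase over the baseline, can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the function name is hypothetical, and because "about 30 percent" is a rounded figure, the implied baseline is approximate and not a number reported by the study.

```python
def implied_baseline(point_rise: float, relative_rise: float) -> float:
    """Baseline share (in percent) implied by an absolute rise in
    percentage points and the same rise expressed as a fraction of the baseline."""
    return point_rise / relative_rise


# Assumption: "about 30 percent" is treated as exactly 0.30 for illustration.
baseline = implied_baseline(13.0, 0.30)
after = baseline + 13.0

print(f"Implied baseline share of A grades: {baseline:.1f}%")
print(f"Implied share after the rise: {after:.1f}%")
```

Under that assumption, the baseline share of A grades works out to a little over 43 percent, rising to a little over 56 percent. The distinction matters because a 13 point rise and a 13 percent rise are very different claims.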
How did researchers test whether AI replaced student work rather than improved learning?
They compared courses by how much homework counted toward the final grade and by assignment type set out in pre‑ChatGPT syllabi. In courses where homework accounted for more than the median share of the grade, A's rose by an extra 16 percentage points compared with otherwise similar courses below the median. In lower‑homework courses the effect was small and not statistically significant.
The study also ran a placebo test on oral presentation assignments, where AI is far less helpful, and found no grade changes there. The author, Igor Chirikov, summarizes the implication bluntly: the result is "difficult to reconcile with broad learning gains or sorting effects alone." Together these checks point to AI doing students' unsupervised work, not to widespread increases in student learning.
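The homework comparison described above resembles a difference-in-differences design: the change in A grades in high-homework courses minus the change in low-homework courses. The sketch below illustrates that logic only. Treating the comparison as a simple two-group, two-period difference is an assumption, the study's actual specification is not given here, and every number in `courses` is invented for illustration.

```python
from statistics import mean

# Hypothetical share of A grades (percent) per course. NOT data from the study.
courses = [
    {"homework": "high", "period": "pre", "a_share": 40.0},
    {"homework": "high", "period": "pre", "a_share": 44.0},
    {"homework": "high", "period": "post", "a_share": 52.0},
    {"homework": "high", "period": "post", "a_share": 56.0},
    {"homework": "low", "period": "pre", "a_share": 41.0},
    {"homework": "low", "period": "pre", "a_share": 43.0},
    {"homework": "low", "period": "post", "a_share": 43.0},
    {"homework": "low", "period": "post", "a_share": 45.0},
]


def group_mean(rows, homework, period):
    """Average A share for one homework group in one period."""
    return mean(
        r["a_share"]
        for r in rows
        if r["homework"] == homework and r["period"] == period
    )


def diff_in_diff(rows):
    """Change in the high-homework group minus change in the low-homework group."""
    high_change = group_mean(rows, "high", "post") - group_mean(rows, "high", "pre")
    low_change = group_mean(rows, "low", "post") - group_mean(rows, "low", "pre")
    return high_change - low_change


print(f"Extra rise in high-homework courses: {diff_in_diff(courses):.1f} points")
```

Subtracting the low-homework change removes whatever shifted grades in all courses over the same period, so what remains is the extra rise associated with heavier homework weighting.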
Why it matters
If college grades in writing and coding courses increasingly reflect AI‑generated output, employers and graduate programs risk making worse selection decisions because the grades no longer reliably signal students' skills. Chirikov warns of a feedback loop: if AI takes over skill‑building tasks during college, graduates could be weaker in the very areas where AI is strongest, which could accelerate automation and widen skill gaps in the labor market.
The study positions AI as a different driver of grade inflation than earlier explanations such as lenient grading or institutional incentives. Earlier drivers acted at the grading stage after work was submitted; generative AI changes how the work gets made before instructors see it.
What to watch
Watch whether institutions change assessment design: the study recommends rethinking exam formats and creating assignments that either limit AI use or explicitly fold it in, for example through documentation of the work process or follow‑up interactions that demonstrate understanding. Also look for policy shifts: the study notes that Norway recently banned AI tools from elementary schools in most cases and limited their use in secondary schools, a concrete contrast in how systems respond.
One further signal will be statements from education leaders. OpenAI CEO Sam Altman said that, three and a half years after ChatGPT's launch, the education system has barely responded, and he warned that critical thinking skills risk "significant atrophy," a public admission that underscores the stakes.
If universities adopt measures that change how homework is weighted, how assignments are structured, or how student work is authenticated, the way grades respond to those reforms will indicate whether the changes reflect a temporary distortion or a longer redefinition of what grades mean.
Written by The Brieftide · Source: The Decoder