OpenAI's New GPT-5.4 Model Outperforms Office Workers in Spreadsheet Tasks

By Reese Coleman · Saturday, March 7, 2026

Finn's Take · TL;DR

OpenAI's GPT-5.4 outperformed office workers 83% of the time on professional tasks, with individual claims 33% less likely to be false than GPT-5.2's and native computer-use capabilities.
The ChatGPT for Excel integration lets users describe spreadsheet needs in plain language, and GPT-5.4 scores 87% on spreadsheet modeling, up from 68% for GPT-5.2.
New financial data partnerships enable real-time access to market information, filings, and research, and the model uses fewer tokens for faster, cheaper responses.

AI Surpasses Human Performance in Professional Tasks

OpenAI has launched GPT-5.4, an artificial intelligence model that outperformed office workers 83% of the time on GDPval, an OpenAI benchmark measuring performance on real-world tasks across 44 occupations. The new model represents a significant step forward in workplace AI capabilities, particularly on tasks that have traditionally required human expertise and judgment.

The company designed GPT-5.4 to be less error-prone, more efficient, and better at workplace tasks like drafting documents, with individual claims 33% less likely to be false and full responses 18% less likely to contain errors compared to GPT-5.2. This enhanced reliability addresses one of the most persistent concerns about AI adoption in professional environments.

Most notably, GPT-5.4 introduces native, state-of-the-art computer-use capabilities, allowing the AI to read screenshots, move a cursor, and use a keyboard to operate a desktop environment without a separate tool layer. On computer operation benchmarks, GPT-5.4 scored 75.0%, against a human baseline of 72.4% on the same test.

Revolutionary Spreadsheet Integration Changes Office Work

The most immediate impact comes through ChatGPT for Excel, now in beta: an Excel add-in that brings ChatGPT directly into workbooks to help build and update models, run scenarios, and generate outputs based on cells and formulas. Professionals describe what they need in plain language, and ChatGPT creates or updates live Excel models directly in the workbook.

The financial performance improvements are particularly striking. On OpenAI's internal investment banking benchmark, which evaluates real-world workflows such as building a three-statement model with proper formatting and citations, performance improved from 43.7% with GPT-5 to 87.3% with GPT-5.4 Thinking. For spreadsheet modeling specifically, GPT-5.4 scores 87% compared with 68% for GPT-5.2.
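To put those figures side by side, here is a minimal Python sketch that computes the gain for each benchmark in percentage points and in relative terms. The scores are the ones reported above; the `gains` helper and the dictionary labels are illustrative names, not anything published by OpenAI.

```python
# Benchmark scores as reported in the article, in percent.
# Each entry is (earlier model score, GPT-5.4 score).
scores = {
    "investment banking benchmark": (43.7, 87.3),  # GPT-5 -> GPT-5.4 Thinking
    "spreadsheet modeling": (68.0, 87.0),          # GPT-5.2 -> GPT-5.4
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in percent)."""
    points = after - before
    relative = points / before * 100
    return round(points, 1), round(relative, 1)


for name, (before, after) in scores.items():
    points, relative = gains(before, after)
    print(f"{name}: +{points} points ({relative}% relative)")
```

The two benchmarks compare GPT-5.4 against different earlier models (GPT-5 and GPT-5.2), so the gains are not directly comparable with each other.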

The tool maintains transparency by explaining what it's doing as it works and linking answers to the exact cells it references and updates. Before making changes, ChatGPT asks for permission, so users can review each step and undo edits if needed, addressing concerns about AI making unauthorized modifications to critical business documents.

Enhanced Financial Data Integration

Beyond Excel integration, OpenAI has introduced financial data integrations directly in ChatGPT for FactSet, Dow Jones Factiva, LSEG, Daloopa, S&P Global, and more. These partnerships enable users to access filings, company data, research reports, and market information while generating outputs such as earnings summaries, valuation snapshots, or credit memos.

The system can now reason across workbooks, understand how sheets and formulas connect across the model, explain why outputs changed, trace and fix errors, and show how assumptions flow through a model. This capability proves especially valuable when analysts inherit complex models or need to understand existing templates quickly.

Implications for the Future of Work

GPT-5.4's gains extend beyond accuracy to efficiency. OpenAI reports that the model uses fewer reasoning tokens than GPT-5.2 to reach the same output quality, which can translate to faster responses and lower costs.

Early testing by industry professionals suggests transformative potential. Matt Shumer, founder and CEO of OthersideAI, tested GPT-5.4 for a week before the public launch and posted: "Even in standard mode, GPT-5.4 is better than previous models in Pro mode... Coding capabilities are ridiculous. It's essentially flawless."

As AI capabilities increasingly match or exceed human performance in knowledge work, organizations face both unprecedented opportunities for productivity gains and fundamental questions about workforce adaptation. The technology promises to free professionals from routine tasks while requiring new skills in AI collaboration and oversight.