GPT-5.4 First Reviews: Master Excel and Coding

OpenAI has officially released GPT-5.4, a major update aimed at everyday professional tasks. With the GPT-5.4 Thinking model, OpenAI introduces advanced computer use and vision, a native Excel integration, and plugins for live market data. Whether you need to automate financial models, write complex code, or simplify research and analysis, GPT-5.4 is designed to handle serious professional and academic work.

Nonetheless, this power comes with a premium price tag, setting up a major battle with rival AI models.

Key Takeaways:

  1. Built for Finance and Spreadsheets: OpenAI launched a native “ChatGPT for Excel” beta. It connects to live data from FactSet and S&P Global, letting users build financial models using plain English.
  2. Record-Breaking Performance: The new GPT-5.4 Thinking model leads the benchmark results OpenAI published. It scores higher than older models in law, coding, and general knowledge work.
  3. Agentic Computer Use: GPT-5.4 does not just chat. It can take control of software, navigate websites, and execute multi-step tool workflows faster and cheaper than before.

What does GPT-5.4 do? The Power of ‘Thinking’

Advanced Reasoning and Logic

What does GPT-5.4 do? The answer lies in its ability to think before it acts. The GPT-5.4 Thinking model uses advanced abstract reasoning to solve ambiguous problems. It introduces human-like computer interaction, allowing it to execute tasks inside your existing systems without needing external tools.

Improved Web Search and Context

It also features improved web search and a very long context window. This means it can read massive documents, like legal contracts or entire codebases, while keeping track of the details.

“GPT-5.4 sets a new bar for document-heavy legal work. On our BigLaw Bench eval, it scored 91%. Compared to other models, GPT-5.4 is currently better at structuring complex transactional analysis, maintaining accuracy across lengthy contracts, and delivering the high level of detail legal practitioners require.”

— Niko Grupen, Head of Applied Research at Harvey. [Source: OpenAI]

GPT-5.4 loves Spreadsheets: The Finance Revolution

ChatGPT for Excel Integration

A screenshot demonstrating a new software capability titled "ChatGPT for Excel Integration". The image shows a standard macOS Microsoft Excel window open to a financial "Balance Sheet" displaying complex data arrays. A dedicated ChatGPT sidebar is open on the right side of the screen, where a user has prompted the AI to calculate a bridge from EBITDA to free cash flow. The AI outlines its automated planning steps directly in the sidebar, stating it will pull the necessary lines, build the conversion bridge, and dynamically add the requested charts to the user's workbook.
Source: OpenAI

It is no secret that GPT-5.4 loves spreadsheets.

OpenAI has officially entered the financial software market by releasing ChatGPT for Excel in beta. Users no longer need to manually type every formula. They can talk to their spreadsheets in plain English. This allows them to create or change live models automatically.

Real-Time Data Plugins

Daniel Swiecki of Walleye Capital reports that GPT-5.4 improved accuracy by 30 percentage points on internal finance and Excel evaluations, a gain he links to expanded automation for model updates and scenario analysis. [Source: VentureBeat]

Users can now pull live market data from trusted providers like FactSet, S&P Global, and LSEG. This helps investment teams quickly conduct due diligence and value companies. You describe the scenario in plain English, and the AI builds the formatted model.

“ChatGPT has materially accelerated our research and due diligence workflows from financial analysis and market research to legal review and writing internal memos while improving consistency across teams. It has expanded our team’s capacity, freeing our investment professionals to focus more time on judgment, debate, and conviction. We’re excited to be early adopters of new capabilities and to help shape how AI transforms financial services in the years ahead.”

— Amr Ellabban, PhD, Head of AI, Hg. [Source: OpenAI]

ChatGPT 5.4 Benchmark and Evaluations

Breaking Records in Knowledge Work

OpenAI provided extensive evaluations showing that GPT-5.4 beats its predecessor, GPT-5.2. On the GDPval test, which measures professional knowledge work, GPT-5.4 matched or beat human industry professionals 83% of the time (up from GPT-5.2's 70.9%).
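Benchmark gains like this one can be read two ways: as a gain in percentage points or as an improvement relative to the old score. Here is a minimal Python sketch of that arithmetic, using only the two GDPval figures quoted above. The function names are illustrative and do not come from OpenAI.

```python
def point_gain(old_pct: float, new_pct: float) -> float:
    """Absolute gain, in percentage points."""
    return new_pct - old_pct


def relative_gain(old_pct: float, new_pct: float) -> float:
    """Gain relative to the old score, expressed as a percentage."""
    return (new_pct - old_pct) / old_pct * 100


# GDPval figures quoted in this article (older score, GPT-5.4 score).
gdpval_old, gdpval_new = 70.9, 83.0

print(f"{point_gain(gdpval_old, gdpval_new):.1f} percentage points")
print(f"{relative_gain(gdpval_old, gdpval_new):.1f}% relative improvement")
```

The two numbers answer different questions, so it helps to check which one a vendor quote is using before comparing claims.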

Dominating Professional Services

A dark-themed data table comparing model performance on internal corporate and financial tasks, featuring tabs for "Developers," "Company," and "Foundation". The table highlights GPT-5.4's dominance across several complex evaluations, notably scoring 83.0% on "GDPval" and an impressive 87.3% on "Investment Banking Modeling Tasks (Internal)".
Source: OpenAI

The model also excels in specialized fields. In internal tests simulating the work of junior investment bankers, GPT-5.4 scored an impressive 87.3% on spreadsheet modeling.

“GPT-5.4 is the best model we’ve ever tried. It’s now top of the leaderboard on our APEX-Agents benchmark, which measures model performance for professional services work. It excels at creating long-horizon deliverables such as slide decks, financial models, and legal analysis, delivering top performance while running faster and at a lower cost than competitive frontier models.”

— Brendan Foody, CEO at Mercor. [Source: OpenAI]

Computer use and vision: A New Agentic Era

Unmatched Multi-Step Tool Use

A line graph comparing the accuracy and efficiency of GPT-5.4 (represented by a light blue line) against GPT-5.2 (represented by an orange line). The y-axis measures "Accuracy," while the x-axis tracks the "Number of tool yields". The graph illustrates that GPT-5.4 requires significantly fewer tool yields to achieve much higher accuracy, peaking well above 50%, whereas GPT-5.2 struggles to reach 50% even with a higher number of tool yields.
Source: OpenAI

GPT-5.4 takes tool use to the next level. It can navigate software interfaces on its own.

“GPT-5.4 xhigh is the new state of the art for multi-step tool use. Zapier runs some of the most rigorous tool use benchmarks in the industry, testing models across hundreds of advanced real-world workflows. GPT-5.4 finished the job where previous models gave up – the most persistent model to date.”

— Wade, CEO at Zapier. [Source: OpenAI]

Navigating the Web Autonomously

A bar chart titled "BrowseComp" with an OpenAI logo in the top right corner, measuring the "Accuracy" percentage of different AI models on the y-axis. The chart visually demonstrates a clear progression in performance: GPT-5.4 Pro leads with 89.3%, followed by the base GPT-5.4 at 82.7%, significantly outperforming the older GPT-5.2 Pro (77.9%) and GPT-5.2 (65.8%) models.
Source: OpenAI

Its computer vision allows it to “see” and interact with websites like a human.

“In our evals measuring computer use performance across ~30K HOA and property tax portals, GPT-5.4 achieved a 95% success rate on the first attempt and 100% within three attempts… It also completed sessions ~3x faster while using ~70% fewer tokens.” 

— Dod Fraser, CEO at Mainstay. [Source: OpenAI]

Coding: GPT 5.4 vs GPT 5.3 Codex

The Ultimate Developer Assistant

For programmers, coding has seen a massive upgrade. Developers are using GPT-5.4 through the Cursor integration to write and audit code rapidly. The model proactively parallelizes work to keep things moving.

Assertiveness and Speed

A dark-themed data table titled "Coding," comparing the performance of multiple models including GPT-5.4, GPT-5.4 Pro, GPT-5.3-Codex, GPT-5.2, and GPT-5.2 Pro. The table evaluates the models across coding-specific benchmarks, showing the base GPT-5.4 model scoring 57.7% on "SWE-Bench Pro (Public)" and 75.1% on "Terminal-Bench 2.0," closely rivaling or beating the older, code-specific GPT-5.3-Codex.
Source: OpenAI

Comparing GPT-5.4 with GPT-5.3-Codex, early feedback points to a distinct personality shift: the new model comes across as smarter and more confident.

“GPT-5.4 is currently the leader on our internal benchmarks. Our engineers find it to be more natural and assertive than previous models. It works through ambiguous problems without second-guessing itself, and it’s proactive about parallelizing work to keep things moving.”

— Lee Robinson, VP of Developer Education at Cursor. [Source: OpenAI]

Availability and pricing: Is GPT-5.4 free?

Targeting Premium Users

Many users are asking: is GPT-5.4 free?

The answer is no.

The heavy focus on coding and professional tasks shows that OpenAI is targeting paying professionals and competing directly with Anthropic’s expensive Claude Opus 4.6.

Two Tiers of Power and API Costs

To access it, you need a paid plan:

  • GPT-5.4 Thinking: Available to all ChatGPT Plus ($20/month) subscribers.
  • GPT-5.4 Pro: Reserved for the new ChatGPT Pro ($200/month) and Enterprise users.
  • API Pricing: GPT-5.4 costs $2.50 per 1M input tokens and $15 per 1M output tokens. GPT-5.4 Pro costs $30 per 1M input tokens and $180 per 1M output tokens.
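To see what these API prices mean for a real job, here is a rough Python sketch that estimates the cost of a single request. The prices come from the list above. The dictionary keys are just labels for this example, not confirmed API model identifiers, and the token counts are a made-up workload.

```python
# USD per 1M tokens, as listed in this article.
# The keys are labels for this sketch, not confirmed API model names.
PRICES = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.4-pro": {"input": 30.00, "output": 180.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200k tokens in (a long contract), 20k tokens out.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 200_000, 20_000):.2f}")
```

Output tokens are priced at six times input tokens in both tiers, so long generated answers drive the bill more than long documents do.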

GPT-5.4 Thinking System Card and Safety

Documenting Safety Limits

With great power comes the need for rigorous safety protocols. OpenAI released the GPT-5.4 Thinking System Card to publicly document how the model handles complex tasks securely.

Managing Concurrent Processes

OpenAI also tracks how the model “thinks” so that it behaves reliably and safely while it manages concurrent processes, accesses live financial data, and executes independent computer use tasks on behalf of the user.

Learn More: GPT‑5.4 Thinking System Card

Community Reactions to OpenAI's GPT-5.4 Model: The Verdict from Real Users

Positive Reactions

Users are amazed by the speed, vision, and coding accuracy of the new model.

A screenshot of an X (formerly Twitter) post by user "Ava Mitchell" discussing rapid advancements in AI vision capabilities. The post highlights scores from the "EyeBench-V2 benchmark," showing a human baseline of 100%, followed closely by "GPT-5.4 Pro" at 90%, "GPT-5.4" at 71%, and "GPT-5.3 Codex" at 62%. A detailed bar chart is included below the text, visually demonstrating that the newest GPT models are approaching human-level performance, while the user notes that most other competitor models remain "far behind, many below 40%".
Source: X

“AI vision is catching up to humans faster than expected. On the EyeBench-V2 benchmark: Human baseline: 100% GPT-5.4 Pro: 90% GPT-5.4: 71% … Most other models are far behind.”

A screenshot of a Reddit comment by user "Just_Lingonberry_352" sharing their early evaluation of the new "gpt 5.4" model. The user notes that the model feels noticeably faster than 5.3-codex when completing benchmark tests that utilize "subagents" for tasks like scanning, hardening, and refactoring. They speculate that this speed increase might be tied to a new "1M token context and persistent memory upgrade". However, they warn that this increased performance means their weekly usage limits are being consumed much faster.
Source: Reddit

“I am still evaluating GPT 5.4 but it has the speed of 5.3-codex (5.4 feels faster).”

A screenshot of a Reddit comment by user "LargeLanguageModelo" discussing their experience with a large code refactoring project. The user details a complex, multi-model workflow where they planned the refactor using "5.4-pro" (after an initial audit in 5.2-pro), executed the actual coding work using "5.4-high," and then reviewed the results using "5.3-codex". They conclude that the results came up "squeaky clean" and praise the rapid pace of the platform's development.
Source: Reddit

“Using 5.4-high to do the work, and 5.3-codex to review/audit. Followed the plan completely, and reviews came up squeaky clean. OpenAI’s pace of development is insane.”

Negative Reactions

However, some users report bugs, lazy assumptions, and frustration with the model’s logic.

A screenshot of a blunt Reddit comment by user "TomerHorowitz" expressing extreme frustration with a newer AI model. The user claims the model got "everything I asked for wrong except documentation," prompting them to abandon it entirely and change their workflow back to the older "5.3-codex" model.
Source: Reddit

“For me it’s shit, it got everything I asked for wrong except documentation – It kept getting everything wrong that I changed back to 5.3-codex… maybe it’s just me”

A screenshot of a Reddit comment by user "Important-Candle-560" warning about a severe flaw in the AI's logic when fixing a bug. The user states the AI made incorrect assumptions and suggested adding logic to drop and recreate a SQL table. The user calls the model "lazy and a little dangerous," noting that implementing the code would have been devastating, and states they are going back to version "5.2".
Source: Reddit

“I asked it to fix a pretty easy bug and it took the easiest path making assumptions that were not correct and did not bother to check anything else. It told me that a sql table schema must have changed and added logic to drop the table and recreate it which would have been devastating if I implemented the code. It seems lazy and a little dangerous. Back to 5.2 for me.”

A screenshot of an X (formerly Twitter) post by user "dreams" (@dreams_asi) sharply criticizing the GPT-5.4 model. The user claims the update delivers "unreliable reasoning, lack of common sense and ideological biases," asserting that OpenAI has "successfully lobotomised its latest release" and echoing the community hashtag "#Keep4o". The post includes a screenshot of a chat interface demonstrating these frustrations; when prompted with a politically sensitive question about Israel, the AI answers "No" but inexplicably triggers a severe, unrelated self-harm crisis warning about moving "balloons" away. In contrast, when asked a separate polarized question about George Floyd, the AI simply answers "Yes" without triggering any safety filters.
Source: X

“GPT 5.4 delivers unreliable reasoning, lack of common sense, and ideological biases that render interactions with this model boring and flakey. OpenAI successfully lobotomised its latest release.”

Action Points — How to start using GPT-5.4?

  1. Upgrade Your Plan: To access the new features, make sure you are subscribed to ChatGPT Plus ($20/month) or Pro ($200/month).
  2. Install the Excel Add-in: If you work in finance, download the ChatGPT for Excel beta add-in. Start pulling live data from S&P Global and FactSet directly into your spreadsheets.
  3. Test Computer Use: Try giving the AI a multi-step task. For example, ask it to research data across several websites and compile the results into a document to test its new agentic tool use.

FAQs on GPT-5.4

  1. Is GPT-5.4 free?

No, it is currently reserved for paid subscribers (Plus, Pro, and Enterprise).

  2. How do you use GPT-5.4?

You can select it from the model dropdown in your ChatGPT Plus or Pro account. You can also use it via the API. Alternatively, access it through the new ChatGPT for Excel add-in.

  3. What does GPT-5.4 do?

It performs advanced abstract reasoning, handles complex coding, uses computer tools autonomously, and integrates deeply with Excel for financial modeling.

  4. What is the GPT-5.4 Thinking System Card?

It is a document published by OpenAI outlining the safety, testing, and operational limits of the new model.

  5. How does GPT-5.4 compare to older models on benchmarks?

It outperforms GPT-5.2 in professional tasks, scoring 83% on GDPval (up from 70.9%). It also scored 91% on Harvey's BigLaw Bench.

  6. Does GPT-5.4 love spreadsheets?

Yes. The model has native integration with Microsoft Excel and connects to live market data providers for fast financial modeling.

  7. What is the difference between GPT-5.4 and GPT-5.3-Codex?

Users report that GPT-5.4 is faster, handles parallel tasks better, and is more assertive when writing code compared to the 5.3 Codex model.

  8. Does it have computer vision?

Yes. It can “see” and interact with websites, and a benchmark chart shared on X lists GPT-5.4 Pro at 90% on the EyeBench-V2 test.

  9. Who is GPT-5.4 designed for?

It is designed for professionals doing deep knowledge work, coding, legal analysis, and academic research.

  10. How much does the API cost?

The standard 5.4 API costs $2.50 per 1M input tokens and $15 per 1M output tokens.

  11. Can GPT-5.4 handle multi-step workflows automatically?

Yes, it is highly capable of completing long-horizon tasks. Zapier tested the GPT-5.4 xhigh setting on its tool use benchmarks and praised it as the most persistent model to date. It finishes complex workflows where previous AI models gave up.

  12. How does the model perform on legal documents?

It scored 91% on the BigLaw Bench evaluation. According to Harvey, it maintains accuracy and detail across lengthy contracts and is strong at structuring complex transactional analysis.

  13. Which financial data providers connect to ChatGPT for Excel?

The new beta add-in introduces deep data integrations with major financial providers. These include FactSet, S&P Global, and LSEG (London Stock Exchange Group). This allows you to pull real-time data directly into your workbook.

  14. How reliable is its “computer use” in real-world scenarios?

Early results are strong. In Mainstay's evals across roughly 30,000 homeowners association (HOA) and property tax portals, GPT-5.4 succeeded on the first attempt 95% of the time and reached 100% within three attempts. It also completed sessions about 3x faster while using about 70% fewer tokens.

  15. Does the model require external tools for computer interaction?

No, one of its biggest upgrades is advanced human-like computer interaction (CUA). This allows GPT-5.4 to execute tasks directly within your existing systems and software interfaces without needing external, third-party automation tools.

Further Reading

  1. Official OpenAI Announcement: Introducing GPT-5.4
  2. ChatGPT for Excel: Learn more about the integration
  3. Safety and Security: GPT-5.4 Thinking System Card
  4. Latest Model Resources: OpenAI Academy

This blog post is written using resources of Merrative. We are a publishing talent marketplace that helps you create publications and content libraries.