15 AI Product Insights for Usable, Scalable AI Products [ICONIQ 2025 Report]

I came across this excellent report on the state of AI, researched specifically for product builders.

If you are building a new AI product, this report helps you avoid speculation by grounding decisions in patterns that are already working. It has been more than two years since ChatGPT launched, and the focus has shifted decisively to building AI that delivers tangible, measurable value.

Product leaders, engineers, and founders are no longer just asking “What can AI do?” but “How can AI solve a real business problem, and how do we build, ship, and scale it effectively?”

This playbook is grounded in data from ICONIQ’s comprehensive 2025 State of AI Report. It offers a blueprint for builders on the front lines.

You can access the report here; it’s free and very insightful – 2025 State of AI Report: The Builder’s Playbook

The report distills survey results from 300 executives and in-depth interviews with AI leaders into 15 actionable insights.

These insights span the entire product lifecycle, from foundational strategy and architecture to go-to-market, team building, and cost management. Together, they provide a data-driven guide for turning generative intelligence from a promising concept into a dependable, revenue-driving asset.

AI product creator checklist: 15 insights to keep in mind

I have prepared the table below as a quick reference to the key lessons for AI product builders:

| # | Key Insight | Actionable Takeaway for AI Product Builders |
|---|---|---|
| 1 | The future is agentic, not just conversational | Design AI to execute multi-step workflows, not just answer questions. |
| 2 | A multi-model strategy is now key to success | Architect a model-agnostic layer to route tasks to the best model for the job. |
| 3 | High-growth companies go deeper than APIs | Start with APIs for speed, but plan to fine-tune models on proprietary data to build a long-term moat. |
| 4 | Accuracy is king for customers, cost rules internally | Create separate decision frameworks for selecting external-facing vs. internal-use models. |
| 5 | RAG and fine-tuning are the workhorses of customization | Use RAG for knowledge injection and fine-tuning for behavioral modification; plan to use both. |
| 6 | The cloud is your factory; treat vendor management as a core competency | Assign a dedicated owner for AI cost and vendor management to de-risk the business. |
| 7 | Hallucinations and customer trust must be conquered | Design user experiences that expect model fallibility, with transparency and feedback loops. |
| 8 | Your pricing model is an existential choice | Find your product’s ‘value metric’ and build a hybrid pricing model around it. |
| 9 | AI isn’t a feature, it’s the roadmap | Audit your roadmap; if less than a third is AI-driven, you are underinvesting. |
| 10 | As you scale, monitoring and explainability become product features | Plan for trust and transparency features on your roadmap from the GA stage. |
| 11 | Dedicated AI leadership is a scaling imperative | If approaching $100M in revenue, start the search for a dedicated, cross-functional AI leader. |
| 12 | The AI/ML engineer is your most critical and constrained resource | Conduct a “talent-aware roadmap review” and rank initiatives based on hiring realities. |
| 13 | Your budget will shift dramatically from people to machines | Remodel your product’s P&L to account for variable compute costs that scale with usage. |
| 14 | API usage fees are the hardest cost to control | Implement real-time token monitoring and assign teams “cost-down” prompt engineering initiatives. |
| 15 | High-growth companies spend aggressively on the user experience | Invest in low-latency inference; in the AI era, performance is a core feature. |

3 insights on foundational product strategy for AI product builders

The most successful AI products are built on a clear and deliberate strategy. Early decisions are crucial: what product to build, which architectural patterns to adopt, and how deep to go down the technology stack. Together, these choices define a product’s competitive position and long-term trajectory.

Here are data-backed insights from the report that can guide these fundamental decisions:

The future is Agentic, not just conversational

The data reveals a clear consensus among the most forward-looking companies about what to build next.

An overwhelming 79% of AI-native companies report that they are building ‘agentic workflows’. This makes it the most common type of AI product being developed, far surpassing simpler applications.

[Chart: What type of AI products are you building?]

This marks a significant evolution. Early generative AI products were primarily conversational, focused on answering questions or generating content in a single turn. In contrast, an agentic workflow accomplishes multi-step goals autonomously, planning, using tools, and executing sequences to achieve specific outcomes.

AI agents can handle complex customer service issues or automate internal IT support tasks like password resets. They also manage financial processes and detect cybersecurity threats. Companies like Power Design, with its ‘HelpBot for internal IT support‘, show how agents can transform business processes.

Explore the case study here: How Power Design built the service desk of the year with AI

[Image: HelpBot by Power Design in action and its interface]

The market is moving beyond products that give an answer to products that deliver an outcome. A simple chatbot’s value is limited to the quality of its response. An agent’s value is tied to the successful completion of a complex task. This shift raises what counts as a competitive moat.

The underlying language models for conversation are becoming commoditized. The ability to reliably orchestrate agents to automate a valuable, end-to-end business process, however, is a much more defensible asset. It requires a different product mindset focused on process design, systems integration, and building user trust, not just prompt engineering.

Actionable Takeaway:

Stop thinking about the AI as a chatbot.

Instead, map a critical, multi-step user workflow. Then, design an AI agent that can execute it from start to finish. Make sure it requires minimal human intervention without compromising on accuracy.
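To make the shift concrete, here is a minimal sketch of an agentic loop in Python. The `call_llm` stub and the two tools are hypothetical stand-ins for a real model API and your business systems, not code from the report:

```python
# A minimal sketch of an agentic loop: the model chooses a tool, observes
# the result, and iterates until it reports an outcome. `call_llm` is a
# scripted stand-in for a real model API; the tools are hypothetical.
import json

_SCRIPT = [
    '{"tool": "reset_password", "arg": "user-42"}',
    '{"done": "Password reset completed and user notified."}',
]

def call_llm(prompt: str) -> str:
    return _SCRIPT.pop(0)   # a real agent would call your model provider here

TOOLS = {
    "reset_password": lambda user: f"password reset for {user}",
    "create_ticket": lambda summary: f"ticket created: {summary}",
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        action = json.loads(call_llm("\n".join(history)))
        if "done" in action:
            return action["done"]                        # an outcome, not just an answer
        result = TOOLS[action["tool"]](action["arg"])    # execute the chosen tool
        history.append(f"{action['tool']} -> {result}")  # feed the observation back
    return "escalated to a human"                        # bounded-autonomy fail-safe

print(run_agent("Reset the password for user-42"))
```

The loop structure, not the model call, is the point: the agent plans, acts, observes, and stops either at a completed outcome or a safe escalation.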

A multi-model strategy is now key to success

Across the industry, companies are now using an average of 2.8 different foundation models in their products.

OpenAI’s GPT models remain the most popular, used by 95% of “Full Stack” AI companies. Still, they are almost always part of a broader, multi-model strategy. This strategy includes providers like Anthropic, Google, Meta, and Mistral AI.

[Chart: Top Model Providers]

This is a deliberate architectural choice driven by practical needs. As one VP of Product at a billion-dollar company stated in the report by ICONIQ:

“We use different proprietary and 3rd party models because our customers have diverse needs. Specialized models allow us to better tailor the experiences for our customers and their use case… [and] offer our customers more flexible price points”.

This approach is exemplified by companies like Spotify, which uses a suite of different models for distinct tasks. It applies collaborative filtering to understand user similarity. Natural Language Processing (NLP) analyzes lyrics and web content for context. Audio analysis models deconstruct the features of a music track.

Source: Spotify Research

A multi-model architecture is not just a cost optimization; it is a strategic necessity for performance, resilience, and future-proofing.

No single model excels at every task.

One system may be superior at complex reasoning. Another may offer the best balance of speed and cost for simple tasks. A third may be best for creative generation.

A sophisticated product builder doesn’t just choose “a model.” They build an abstraction layer: a router that dynamically directs each task to the optimal model based on the requirements of the job. This makes the model layer an interchangeable component, prevents vendor lock-in, and creates a single point of control for both performance and cost.

The core intellectual property of the product is no longer the model itself. Instead, it is the logic that dictates which model to use and at what time.

This orchestration layer is the new defensible asset.

Actionable Takeaway:

Architect the product with a model-agnostic layer. Implement a router that can dynamically pick the best model for a given task. The choice is based on criteria like cost, latency, and required accuracy. Start by using a high-performance model for complex tasks and a cheaper, faster model for simple ones.
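To illustrate, here is a minimal sketch of such a routing layer. The model names, prices, and latency figures are invented for the example, and the `complete` lambdas stand in for real provider SDK calls:

```python
# A minimal sketch of a model-agnostic routing layer. All numbers are
# illustrative assumptions, not benchmarks from the report.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float      # USD, invented numbers
    avg_latency_ms: int
    quality_tier: int              # 1 = frontier, 3 = small/fast
    complete: Callable[[str], str]

MODELS = [
    Model("frontier-model", 0.0150, 1200, 1, lambda p: f"[frontier] {p}"),
    Model("mid-model",      0.0030,  500, 2, lambda p: f"[mid] {p}"),
    Model("small-model",    0.0004,  150, 3, lambda p: f"[small] {p}"),
]

def route(prompt: str, needs_reasoning: bool, latency_budget_ms: int) -> str:
    # Keep only models within the latency budget, require the frontier tier
    # for complex reasoning, then pick the cheapest eligible candidate.
    fast_enough = [m for m in MODELS if m.avg_latency_ms <= latency_budget_ms] or MODELS
    required_tier = 1 if needs_reasoning else 3
    eligible = [m for m in fast_enough if m.quality_tier <= required_tier] or fast_enough
    return min(eligible, key=lambda m: m.cost_per_1k_tokens).complete(prompt)

print(route("Classify this ticket", needs_reasoning=False, latency_budget_ms=600))
print(route("Plan a refund workflow", needs_reasoning=True, latency_budget_ms=2000))
```

Because the routing policy lives in one place, swapping a provider or renegotiating price becomes a one-line change rather than a rewrite.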

High-growth companies go deeper than APIs

The fastest-growing companies often move beyond simply calling third-party APIs.

The data by ICONIQ shows that high-growth companies are significantly more inclined to fine-tune existing foundation models (77% vs. 61% for other companies) or develop proprietary models from scratch (54% vs. 32%).

[Chart: How does your company use AI models?]

This trend is particularly pronounced among companies with over $100 million in revenue. As a business scales, the need for customization and control begins to outweigh the initial convenience of off-the-shelf APIs.

Stitch Fix provides a powerful example of this strategy in action.

[Image: Stitch Fix’s algorithms]
Source: Explore Stitch Fix’s algorithms

Here’s coverage by NVIDIA to learn more: Deep Learning Helps Stitch Fix Dress Customers

The company’s business model hinges on a personalization engine driven by multiple algorithms trained on its proprietary customer dataset. This enables precise clothing recommendations, inventory management, and apparel design that surpasses generic models.

Moving beyond generic APIs is a leading indicator of market leadership and product maturity.

Generic APIs, by their nature, produce generic results. To win in a crowded market, a product needs a unique and differentiated user experience.

Fine-tuning a model with specific domain knowledge and style enhances its capabilities beyond simple prompting. Investing in a proprietary model emphasizes the company’s unique data and model behavior as competitive advantages. It transitions the focus from merely using AI in applications to embedding AI at the center of business operations.

Actionable Takeaway

Develop a “Model Maturity Roadmap.” Start with third-party APIs to achieve speed-to-market. As the product gathers proprietary data and its core, high-value use case becomes clear, create a business case for investing in fine-tuning. This will build a durable competitive moat.

The AI product builder’s toolkit – models, infrastructure, and data

Building great AI products involves crucial tactical decisions. Product teams face daily challenges in selecting appropriate models. They also need to apply effective customization techniques and manage infrastructure efficiently.

Accuracy is king for customers, but cost rules internally

Product teams are making fundamentally different technology choices depending on who the end-user is.

Here’s what the ICONIQ report shares:

When selecting a foundation model for a customer-facing product, ‘Accuracy’ is the top consideration, ranked as a top-three factor by 74% of builders. However, when choosing a model for internal productivity tools, the priorities flip: ‘Cost’ becomes the number one consideration (74%), with ‘Accuracy’ falling to second place (72%).

[Chart: Top Considerations When Choosing a Foundational Model]

This bifurcation is entirely rational. An inaccurate or nonsensical response delivered to a paying customer can erode trust, damage the brand, and lead to churn.

For an internal tool, though, the calculus is different. An AI assistant that is 90% accurate and 95% cheaper than the top-performing model can still deliver enormous productivity gains. This makes the cost-benefit trade-off highly favorable. The data also shows that ‘Privacy’ becomes a significantly higher priority for internal use cases. These are environments where sensitive company data is often processed.

[Chart: Top Considerations When Choosing a Foundational Model for Internal Use Cases]

This distinction is leading to the emergence of two separate AI tech stacks within many companies, each optimized for a different primary metric:

  • Performance for external products
  • Efficiency for internal ones.

This split shapes procurement, architecture, and governance. The external product team may choose expensive managed APIs for accuracy, while the internal productivity team experiments with cost-effective open-source models. Product builders therefore need to master both high-performance and high-efficiency architectures.

Actionable Takeaway:

Create separate decision frameworks for selecting models for external versus internal applications. For customer-facing features, benchmark obsessively for accuracy. For internal tools, start with the most cost-effective model that meets a “good enough” performance threshold.
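One lightweight way to operationalize these two frameworks is a weighted scoring rubric per audience. In the sketch below, the weights, candidate models, and scores are purely illustrative assumptions, not figures from the ICONIQ report:

```python
# A sketch of two separate model-selection rubrics, one per audience.
CRITERIA_WEIGHTS = {
    "external": {"accuracy": 0.7, "latency": 0.2, "cost": 0.1},
    "internal": {"cost": 0.5, "accuracy": 0.3, "privacy": 0.2},
}

# Hypothetical candidates, each criterion scored 0-1 from your own benchmarks.
SCORECARDS = {
    "managed-frontier-api":   {"accuracy": 0.95, "latency": 0.6, "cost": 0.2, "privacy": 0.5},
    "self-hosted-open-model": {"accuracy": 0.80, "latency": 0.7, "cost": 0.9, "privacy": 0.95},
}

def pick_model(audience: str) -> str:
    weights = CRITERIA_WEIGHTS[audience]
    def score(model: str) -> float:
        return sum(w * SCORECARDS[model][c] for c, w in weights.items())
    return max(SCORECARDS, key=score)

print(pick_model("external"))  # the accuracy-weighted rubric favors the managed API
print(pick_model("internal"))  # the cost-weighted rubric favors the cheaper model
```

The same candidates can win different rubrics; what changes between audiences is the weighting, which is exactly the bifurcation the data describes.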

RAG and fine-tuning are the workhorses of customization

To move beyond generic model behavior, builders are overwhelmingly turning to two key techniques: Retrieval-Augmented Generation (RAG) and fine-tuning. They are the most common model adaptation approaches: RAG is used by 69% of companies, and fine-tuning by 68%. Adoption of both techniques has grown since last year.

[Chart: Model Training / Adaptation Techniques]

These techniques solve different problems.

RAG is a technique for grounding a model’s responses in a specific set of data, such as a company’s internal knowledge base or real-time product documentation. It retrieves relevant information and supplies it to the model as context, which improves factual accuracy and reduces the risk of hallucinations.


Fine-tuning, on the other hand, involves retraining a pre-existing model on a smaller dataset. This dataset is curated to teach the model a specific skill, style, or format. It alters the model’s fundamental behavior.

The debate over “RAG vs. Fine-tuning” is a false dichotomy. Mature AI teams understand that they are complementary tools.

RAG is for imparting knowledge, while fine-tuning is for teaching a skill.

Consider a customer support agent. It needs to know the latest product specifications and return policies—this is knowledge that changes frequently. RAG is the perfect tool for this; the knowledge base can be updated without retraining the model.

The agent also needs to be polite, empathetic, and follow a specific conversational structure—this is a skill. Fine-tuning is the right tool to instill this behavior.

The most advanced products thus use a hybrid approach: RAG injects real-time, factual context into the prompt, and a fine-tuned model processes that context and responds with the appropriate skill and style.

Actionable Takeaway:

Use RAG when the AI needs to access and reason about a large, dynamic body of proprietary information. Use fine-tuning when the goal is to change the fundamental behavior, style, or format of the model’s output. For complex applications, plan to use both.
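Here is a minimal sketch of that hybrid pattern, with a toy keyword retriever standing in for real vector search and a hypothetical fine-tuned model endpoint:

```python
# A minimal sketch of the hybrid pattern: RAG supplies fresh knowledge,
# a fine-tuned model supplies the learned behavior.
KNOWLEDGE_BASE = {
    "return policy": "Items may be returned within 30 days with a receipt.",
    "warranty": "All products carry a one-year limited warranty.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    # Toy retrieval: rank entries by keyword overlap with the query.
    ranked = sorted(
        KNOWLEDGE_BASE.items(),
        key=lambda kv: sum(word in query.lower() for word in kv[0].split()),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]

def call_fine_tuned_model(prompt: str) -> str:
    # Placeholder for a model fine-tuned on your support tone and format.
    return f"(on-brand, empathetic answer based on)\n{prompt}"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))                  # RAG: knowledge
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return call_fine_tuned_model(prompt)                     # fine-tune: skill

print(answer("What is your return policy?"))
```

Note the division of labor: updating the return policy only means editing the knowledge base, while changing the agent’s tone means retraining, never the other way around.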

The cloud is your factory; treat vendor management as a core competency

The vast majority of AI development is happening on fully managed, third-party infrastructure. Data shows that 68% of companies are operating entirely in the cloud, and 64% rely on external AI API providers.

In contrast, fewer than one in ten maintain fully on-premise infrastructure. This cloud-first approach is driven by a desire to avoid large upfront capital investments in hardware and to maximize speed-to-market.

[Chart: AI Infrastructure for Training and Inference]

Hyperscalers like AWS, Google, and Microsoft have become the default platforms. They offer managed services like Amazon Bedrock and Google Vertex AI. These services bundle model hosting, governance, and billing into a single package.

While this model offers convenience, it also creates a strategic dependency.

The dependence on a handful of cloud and model providers is overwhelming. As a result, skills in negotiation, cost management, and multi-cloud architecture are becoming as critical for AI teams as model development itself. A significant part of an AI product’s cost structure and performance is dictated by these vendors’ terms, and a price hike from an API provider can be devastating for a startup.

Similarly, a change in a cloud platform’s service-level agreement can also have severe impacts.

Vendor management has thus shifted from a back-office function to a critical business risk. AI leaders must evolve from purely technical roles into strategic vendor managers, focused on building a resilient, cost-effective AI business and maintaining the right provider relationships.

Actionable Takeaway

Assign a dedicated owner for ‘AI cost and vendor management.’ This role should be responsible for tracking API usage daily, negotiating volume discounts with providers, and building the business case for when to switch vendors or bring certain workloads in-house. This de-risks the business.

Hallucinations and customer trust must be conquered

Despite rapid advances in model capabilities, fundamental challenges of reliability persist.

[Chart: Challenges in Model Deployment]

‘Hallucinations’ (the tendency of models to generate false or nonsensical information) and ‘explainability and trust’ are the top two challenges builders face when deploying models, cited by 39% and 38% of respondents, respectively.

These concerns rank higher than compute cost, security, or even finding the right use cases.

The problem of trust is particularly acute for companies building applications for regulated industries like healthcare and finance. In these sectors, explainability is often a legal and compliance necessity.

This data suggests that waiting for a perfectly reliable model is a losing strategy. The models will likely always have a degree of unpredictability.

The battle against hallucinations is not a technical problem to be solved in a lab. Instead, it is a product design challenge to be managed in the wild.

The most successful products will be those that design user experiences that expect and mitigate the impact of model fallibility.

This puts product design at the center of the solution. The user interface itself becomes the primary tool for building trust.

  • How is AI-generated content presented to the user, and is it clearly labeled as such?
  • Are sources and citations provided to allow for fact-checking (a key benefit of RAG systems)?
  • Is there a simple, one-click mechanism for users to report inaccuracies and provide feedback?
  • Are there “guardrails” built into the system to enforce safety checks and prevent harmful outputs?

By designing for transparency and user control, product teams can build trust and manage the inherent risks of the technology.

Actionable Takeaway:

Treat trust as a core product feature. From day one, design a UI/UX that transparently communicates the AI’s confidence level, provides sources for its claims, and includes a simple, one-click feedback loop for users to report inaccuracies.
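As a sketch, the payload the backend hands to the UI might carry this trust metadata. The field names, the confidence threshold, and the citation URI are assumptions for illustration, not a standard schema:

```python
# A sketch of a response payload that treats trust as a product feature:
# the UI receives sources for fact-checking, a confidence signal, and a
# feedback hook.
from dataclasses import dataclass

@dataclass
class AIResponse:
    text: str
    sources: list[str]          # citations surfaced by the RAG layer
    confidence: float           # 0-1, shown to the user
    labeled_as_ai: bool = True  # always disclose AI-generated content

FEEDBACK_LOG: list[dict] = []

def report_inaccuracy(response: AIResponse, user_note: str) -> None:
    # One-click feedback loop: flagged outputs feed your eval pipeline.
    FEEDBACK_LOG.append({"text": response.text, "note": user_note})

resp = AIResponse(
    text="Returns are accepted within 30 days.",
    sources=["kb://policies/returns#v12"],  # hypothetical citation URI
    confidence=0.87,
)
if resp.confidence < 0.6:                   # guardrail: hedge low-confidence answers
    resp.text = "I'm not certain, please verify: " + resp.text
report_inaccuracy(resp, "Policy changed to 14 days last week.")
```

The feedback log is not an afterthought: flagged responses become the evaluation and fine-tuning data that reduce future hallucinations.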

From code to customer: Go-to-market and commercialization

AI’s unique capabilities and costs are changing how software is priced, packaged, and sold. Traditional SaaS models are facing challenges, and product plans are being updated to treat AI as the main source of value.

Your pricing model is an existential choice

How a company charges for its AI product is one of the most critical decisions it will make. Many AI-enabled SaaS companies now bundle AI features into premium subscription tiers (40%). Others offer them at no extra cost (33%). The data shows a clear trend towards more sophisticated models.

The most common approach overall is a “hybrid” model (38%), and 37% of companies plan to change their pricing in the next twelve months, moving towards consumption- and ROI-based structures.

[Chart: Primary Pricing Model (Including AI Products / Features and Software)]

The traditional, flat-fee subscription model is breaking under the strain of AI’s high, variable inference costs. As one VP of Product lamented in the ICONIQ report:

“Power users tend to use a lot resulting in negative margins… while users who aren’t using are at risk of churn”.

The solution is a hybrid model that combines a stable, recurring subscription fee with a variable, usage-based component.

Intercom exemplifies this strategy by keeping its traditional seat-based subscription while introducing an extra fee of $0.99 for each resolved customer service issue through its AI agent, aligning costs with the value provided.

[Image: Fin AI Agent pricing by Intercom]

The shift to hybrid pricing is forcing a radical re-evaluation of product value. To add a usage-based element, a company must first identify the atomic unit of value its AI delivers, and then measure it.

For Intercom, that unit is a “resolution.” For a coding assistant, it might be “lines of code accepted.” For a marketing tool, it could be “ad copy variations generated.”

This requires a level of product telemetry and customer understanding that goes far beyond traditional SaaS. The billing system is no longer just a financial tool. It is a core part of the product architecture. The ability to precisely define and meter a value metric becomes a key competitive advantage.

Actionable Takeaway

Stop selling “access to AI.”

Instead, find the single most valuable outcome the AI produces for a customer. Define this as the “value metric” (e.g., “reports generated,” “invoices processed,” “threats detected”). Build the pricing model around a hybrid of a platform subscription fee and a per-unit charge for that metric.
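To see the mechanics, here is a toy hybrid-billing calculation. Only the $0.99-per-resolution price mirrors the Intercom example above; the platform fee and inference cost are invented for illustration:

```python
# A toy hybrid-billing calculation: flat platform fee plus a metered charge
# on the value metric.
PLATFORM_FEE = 99.00         # monthly subscription, assumed
PRICE_PER_RESOLUTION = 0.99  # the metered value metric

def monthly_invoice(resolutions: int, inference_cost_per_resolution: float = 0.30) -> dict:
    usage_revenue = resolutions * PRICE_PER_RESOLUTION
    variable_cost = resolutions * inference_cost_per_resolution
    return {
        "revenue": round(PLATFORM_FEE + usage_revenue, 2),
        "variable_cost": round(variable_cost, 2),
        # Margin now grows with usage instead of collapsing under power users.
        "gross_margin": round(PLATFORM_FEE + usage_revenue - variable_cost, 2),
    }

print(monthly_invoice(resolutions=1_000))
# {'revenue': 1089.0, 'variable_cost': 300.0, 'gross_margin': 789.0}
```

Under a flat fee, that same power user would have generated $99 of revenue against $300 of inference cost; metering the value metric is what flips the sign.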

AI isn’t a feature, it’s the roadmap

The most successful companies are not just adding AI features; they are rebuilding their entire product strategy around AI. A telling gap has emerged in resource allocation. High-growth companies estimate that 43% of their product roadmap will focus on AI-driven features by the end of 2025. For all other companies, that figure is just 36%.

[Chart: What % of your product roadmap is focused on AI-driven features?]

This 7-percentage-point difference compounds into a massive delta in engineering hours, budget, and strategic focus over time. It is a powerful leading indicator of future market share.

This investment focuses on foundational platforms that generate a compounding advantage: multi-model architectures, fine-tuning on proprietary data, and robust evaluation systems. These interconnected projects create a flywheel: better evaluations yield better data, better data improves model tuning and the user experience, and a better experience attracts more users and more data.

Companies that treat AI as a core strategic pillar are building a fundamentally more powerful and intelligent system over time. Those that treat it as a series of bolt-on features risk falling into a vicious cycle. Their core product becomes less competitive. This makes it even harder to justify future AI investment. The roadmap allocation gap is a proxy for a widening capabilities gap that will be difficult for laggards to close.

Actionable Takeaway

Audit the product roadmap. If less than one-third of the planned initiatives for the next 12 months are fundamentally AI-driven, the product is likely underinvesting. It may be lagging behind its fastest-growing competitors. Use the 43% figure from high-growth companies as a benchmark to advocate for increased resource allocation.

As you scale, monitoring and explainability become product features

For early-stage products, simply “working” is often good enough. But as an AI product matures and scales, the definition of quality evolves. The data shows a clear correlation between product maturity and investment in transparency and monitoring. Only 4% of pre-launch products have advanced monitoring capabilities like drift detection. Yet, this figure jumps to 44% for products that are scaling.

[Chart: Approach to AI Performance Monitoring]

Similarly, the practice of providing customers with detailed model transparency reports starts at just 6% in pre-launch. This practice increases to 25% at scale.

[Charts: Strategy for AI Explainability and Transparency to Customers]

This shift is driven by the changing expectations of the customer base. Early adopters may be tolerant of a “black box” system, but enterprise customers, especially in regulated fields, will not be. Their procurement, legal, and security teams will ask tough questions: “How do you detect and mitigate model bias?” “Can you give an audit trail for this automated decision?” “What is your process for monitoring model performance degradation over time?”

Answering these questions requires treating monitoring and explainability not as internal engineering concerns, but as external-facing product capabilities. Features like a “Model Health Dashboard” or an “Explainability Report” become critical table stakes for selling into the enterprise market. Teams should start building these capabilities early. If they wait until reaching scale, they will find themselves unable to close large enterprise deals. They are not just nice-to-haves; they are essential components of an enterprise-grade product.

Actionable Takeaway

Add “Trust & Transparency Features” to the product roadmap.

Plan to introduce basic outcome explanations in the General Availability (GA) release, then build towards advanced monitoring and detailed reporting capabilities as the product targets larger, more sophisticated customers.
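As one example of where “advanced monitoring” can start, here is a simple drift check that compares live eval scores against a baseline window. The z-score approach and all the numbers are illustrative assumptions, not the report’s methodology:

```python
# A simple starting point for drift detection: flag when the live mean of a
# model health metric moves too far from its baseline distribution.
import statistics

def drift_alert(baseline: list[float], live: list[float], z_threshold: float = 3.0) -> bool:
    # Alert when the live mean is more than z_threshold baseline standard
    # deviations away from the baseline mean.
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline) or 1e-9
    return abs(statistics.mean(live) - mu) / sigma > z_threshold

# e.g. daily average quality scores from automated evals
baseline_scores = [0.91, 0.90, 0.92, 0.91, 0.89, 0.90, 0.91]
live_scores = [0.84, 0.82, 0.83]

if drift_alert(baseline_scores, live_scores):
    print("Drift detected: page the on-call, surface it on the model health dashboard")
```

The same signal that pages the on-call engineer can feed a customer-facing “Model Health Dashboard,” turning an internal check into the enterprise-grade transparency feature described above.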

The human engine – team, talent, and organization

Technology alone does not build great products. The success of any AI initiative relies on having the right people, a well-structured team, and effective leadership to navigate the unique challenges of this new paradigm.

Dedicated AI leadership is a scaling imperative

A clear organizational tipping point emerges as companies grow. Only 33% of companies with less than $100 million in revenue have dedicated AI/ML leadership. This includes roles like a Chief AI Officer or Head of ML. This figure jumps to between 48% and 61% for companies above that threshold.

[Chart: Dedicated AI/ML Leadership]

The $100 million revenue mark is where AI shifts from a series of siloed R&D projects to a centralized, strategic business function. In the early stages, AI development can often be managed within the existing engineering organization.

But as a company scales, the challenges become more than just technical. The AI budget becomes a significant line item requiring careful management. Go-to-market strategies for AI products become more complex. And issues of compliance, governance, and data privacy become critical.

These are not problems a single engineering team can solve. They require a leader who can work cross-functionally, interfacing with Finance, Legal, Product, and Sales to create a cohesive AI strategy. Hiring a dedicated AI leader signals that the organization recognizes AI as a core business function requiring executive oversight, not just a technical skill.

Actionable Takeaway: If the company is approaching $100 million in revenue and does not have a dedicated AI leader, start the hiring process now. The role should be scoped not just for technical leadership, but for cross-functional strategy, budget ownership, and governance.

The AI/ML engineer is your most critical and constrained resource

The data on hiring is stark. AI/ML engineers are the most in-demand role, with 88% of companies reporting they currently have them on staff. They are also the most difficult to hire, with an average lead time of 70 days. The primary constraint is not budget or competition but a simple lack of qualified candidates, cited by 60% of respondents as the main reason for slow hiring.

[Chart: AI-Specific Roles and Hiring Plan]

This talent bottleneck is arguably the single biggest constraint on the AI industry’s growth. For product builders, it has a direct and immediate impact on strategy: a brilliant roadmap that requires a team of senior ML engineers who can’t be hired is a fantasy. The feasibility of a project is no longer just a question of technology or budget; it is a question of talent availability.

This elevates talent acquisition and retention to a strategic level, and it forces product leaders to be more pragmatic. The classic “build vs. buy” decision is now heavily influenced by the “hire vs. can’t hire” reality. It may be more prudent to pursue a simpler project that the existing team can build, or that leverages managed third-party services, than a more ambitious project that stalls for months while critical roles stay unfilled.

Actionable Takeaway

Partner with HR and recruiting teams to create a specialized hiring pipeline for AI/ML engineers. At the same time, conduct a “talent-aware roadmap review.” For each major initiative, ask:

“Do we have the talent to execute this? Or can we realistically hire it in the next quarter?”

If the answer is no, prioritize initiatives that rely more heavily on managed services or existing team skills.

Managing costs and proving AI product value

Building with AI involves unique financial considerations: cost structures differ from traditional software, and demonstrating ROI can be hard. Understanding AI economics is as crucial as technological skill.

Your budget will shift dramatically from people to machines

The financial lifecycle of an AI product is fundamentally different from that of traditional SaaS. As an AI product matures from pre-launch to scaling, the budget allocation undergoes a dramatic transformation. In the pre-launch phase, AI talent (salaries and hiring) is the largest expense, accounting for 57% of the budget. By the time the product is scaling, this figure drops to just 36%. Conversely, the running costs of the product grow significantly. AI infrastructure and cloud costs increase from 13% to 22%. The variable cost of inference surges from 4% to 10% of the total budget.

[Chart: Budget Allocation for Building AI Products]

This pattern breaks traditional SaaS financial models, which typically assume a relatively low and stable cost of goods sold (COGS). For an AI product, COGS is dominated by variable compute costs that are directly tied to user engagement. High engagement is the goal of any product, yet it can paradoxically lead to lower margins or even losses if the pricing model is not structured correctly. The financial model must be rebuilt to reflect this new reality.

Product leaders must work closely with their finance counterparts to create a new type of profit and loss (P&L) model, one that accurately forecasts these variable costs and ensures the product’s unit economics are viable at scale.

Actionable Takeaway

Re-model the product’s P&L.

Instead of a fixed COGS, model inference and infrastructure costs as a variable expense that scales directly with a key usage metric (e.g., API calls per active user). Use this model to determine a sustainable pricing structure before scaling.
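A toy version of such a P&L, with invented unit costs, shows how margin moves with engagement rather than headcount:

```python
# A toy P&L where COGS varies with usage rather than staying fixed. Every
# unit cost and price below is an invented assumption for illustration.
def monthly_pnl(active_users: int,
                api_calls_per_user: int = 200,
                cost_per_call: float = 0.01,    # inference + API fees, assumed
                fixed_infra: float = 20_000.0,  # hosting, assumed
                price_per_user: float = 30.0) -> dict:
    revenue = active_users * price_per_user
    variable_cogs = active_users * api_calls_per_user * cost_per_call
    gross_profit = revenue - variable_cogs - fixed_infra
    return {
        "revenue": revenue,
        "variable_cogs": variable_cogs,
        "gross_margin_pct": round(100 * gross_profit / revenue, 1),
    }

# Same user count, different engagement: margin compresses by 20 points.
print(monthly_pnl(active_users=10_000))                          # ~86.7% margin
print(monthly_pnl(active_users=10_000, api_calls_per_user=800))  # ~66.7% margin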

API usage fees are the hardest cost to control

When builders were asked which infrastructure costs are the most challenging to control, the answer was unequivocal. “API usage fees” was the top response, cited by 70% of respondents. This was significantly higher than other major costs like inference (49%) or model training (47%).

[Chart: Infrastructure Costs for Building AI Products]

The unpredictability of API fees stems from two sources: user behavior and model verbosity.

Training costs are large but predictable—a team decides when to run a training job.

Infrastructure costs can be forecast from user growth. API fees, by contrast, depend on both the number of API calls and the number of tokens processed per call. The number of calls is driven by user behavior, which can be spiky and unpredictable. The number of tokens depends on the user’s input length and on the model’s response length, which can vary wildly from one interaction to the next.

This means that cost control is not just an engineering optimization problem. It is also a product design and prompt engineering challenge.

A poorly designed feature can encourage long, rambling user inputs. Additionally, a poorly engineered prompt may elicit unnecessarily verbose responses from the model. Both can silently drive up costs.

Managing API fees requires a three-pronged approach.

  • The first aspect is engineering, which involves implementing caching and rate limiting.
  • The second aspect is product design, focusing on creating interfaces that guide users toward concise inputs.
  • The third aspect is prompt engineering, which entails crafting prompts that instruct the model to be brief without sacrificing quality.

Actionable Takeaway

Implement a real-time dashboard to track API token consumption per user and per feature. Set budget alerts to catch unexpected spikes. Assign the product and engineering teams specific “cost-down” initiatives. These initiatives include implementing response caching for common queries and refining prompts to reduce output length.
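A minimal sketch of that kind of tracking: per-user, per-feature token counters with a daily budget alert. The blended price and the budget figure are assumptions:

```python
# A sketch of real-time token tracking per user and feature with a daily
# budget alert. Prices and thresholds are invented assumptions.
from collections import defaultdict

PRICE_PER_1K_TOKENS = 0.002  # assumed blended API price, USD
DAILY_BUDGET_USD = 500.0

usage: defaultdict[tuple, int] = defaultdict(int)  # (user, feature) -> tokens today

def record_call(user: str, feature: str, prompt_tokens: int, completion_tokens: int) -> None:
    usage[(user, feature)] += prompt_tokens + completion_tokens
    spend = sum(usage.values()) / 1_000 * PRICE_PER_1K_TOKENS
    if spend > DAILY_BUDGET_USD:
        print(f"ALERT: daily API spend ${spend:,.2f} over budget")  # wire to pager/Slack

record_call("user-42", "summarize", prompt_tokens=1_200, completion_tokens=4_800)

# Cost-down candidates: the features burning the most tokens today.
for (user, feature), tokens in sorted(usage.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{feature} / {user}: {tokens:,} tokens")
```

The per-feature breakdown is what makes the “cost-down” initiatives actionable: it points prompt-engineering effort at the features that actually drive the bill.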

High-growth companies spend aggressively on the user experience

A final look at the spending patterns of high-growth companies reveals a key strategic priority.

At the General Availability and scaling stages, these companies allocate significantly more spend to the infrastructure that directly impacts the user experience. Their median monthly spend on inference can be up to double that of their peers ($2.3 million vs. $1.6 million at scale), and they consistently outspend on data storage and processing.

[Chart: Deployment Costs Across Inference, Data Storage, and Processing]

This higher spending is not a sign of inefficiency. It is a deliberate strategic investment in a superior, faster, and more reliable user experience. This extra budget is likely being used to deploy more powerful (and more expensive) models. It is also used to provision more GPU capacity to reduce latency and user wait times. Additionally, it helps run more complex data pipelines to deliver more personalized and relevant results.

These are all investments that translate directly into a better end-user experience. In the world of AI, an answer that appears in one second feels magical. An answer that takes ten seconds feels broken. A recommendation that is perfectly personalized feels intelligent; one that is generic feels useless.

The spending gap is a proxy for an “experience gap.”

High-growth companies have correctly identified that in the AI era, performance is the feature. They are willing to invest in the expensive infrastructure needed to deliver that performance because they know it is a primary driver of user adoption, retention, and, ultimately, growth.

Actionable Takeaway

When budgeting for infrastructure, do not just solve for the average case; solve for the optimal user experience. Benchmark the response latency of the product against best-in-class competitors. If it is significantly slower, consider investing in more expensive, lower-latency inference solutions. This is a key driver of growth.
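A simple place to start is a latency harness that measures median and p95 response times against a stated budget. The inference call below is a hypothetical stub; swap in your real model-serving path:

```python
# A minimal latency harness against a stated p95 budget.
import statistics
import time

def call_inference(prompt: str) -> str:
    time.sleep(0.12)  # stand-in for a real model call
    return "response"

def benchmark(n: int = 20, p95_budget_ms: float = 1_000.0) -> None:
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call_inference("benchmark prompt")
        latencies.append((time.perf_counter() - start) * 1_000)
    latencies.sort()
    p95 = latencies[max(0, int(0.95 * n) - 1)]
    print(f"median {statistics.median(latencies):.0f} ms, p95 {p95:.0f} ms")
    if p95 > p95_budget_ms:
        print("Over budget: consider faster models, caching, or more GPU capacity")

benchmark()
```

Tracking p95 rather than the average matters because the slowest interactions, not the typical ones, are what make a product feel broken.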

Learn more about the latest in AI and tools developed with AppliedAI Tools

The era of AI exploration has given way to the age of execution. This playbook’s core message is simple: shift your focus from conversational features to agentic, outcome-driven workflows.

Master the new unit economics of AI. Align your pricing with its variable costs. Build user trust through transparent design.

Your most durable advantage will not come from accessing a specific model. Instead, it lies in having the operational rigor to build a scalable, profitable, and reliable AI-native business.


