Ted Chai
San Francisco, CA
Conversation Data
August 2025

We have the unique privilege at Recall.ai of being an index of the greater conversational AI ecosystem, by virtue of serving as the underlying infrastructure for most end-user applications. As a usage-based business, our revenue growth from $2m ARR at the beginning of 2024 to over $15m ARR barely a year later reflects the broader explosion in adoption of these tools.

From that vantage, one trend stands out: the questions customers ask of their call data have evolved from “What was said?” to “What next steps should we take?” Meeting minutes ingested per customer are soaring, but the real shift is qualitative—usage spikes whenever a product goes beyond transcription to drive a concrete follow-up (create a task, adjust a price, trigger a care plan). The data itself is just the control surface; the follow-up it triggers is the real product.

The hard part is no longer capturing words but turning them straight into action. Products that treat conversation as an operating system—detecting intent, deciding, and doing—will swallow entire workflows and redefine enterprise software.

The Cambrian Explosion

When the conversational AI industry first emerged ten years ago, there were only two viable use cases: general-purpose notetaking and sales coaching. A small handful of startups, namely Otter.ai, Fireflies, and Chorus, achieved commercial success. Today, there are thousands of startups generating meaningful revenue across verticals. This is the result of two concurrent forces: the emergence of devtools that make building conversational AI products easier, and increasingly performant AI models that make conversation data vastly more useful.

Before GPT-3, AI assistants could produce little more than vague meeting notes from call transcripts. Today, an LLM can summarize with high fidelity, extract relevant insights, and update systems of record with that data. Conversation intelligence tools can now actively automate existing workflows across sales, medicine, and recruiting—graduating from a marginal productivity tool to a massive time saver.

At the same time, it’s become enormously easier to build such software. Just a few years ago, Chorus often advertised its technological superiority backed by 14 patents, while Otter retained a staff of machine learning PhDs to improve its AI summarization quality. Today, performance has been democratized at each level of the stack: data can be reliably ingested with Recall, accurately transcribed with Deepgram, and faithfully summarized by LLMs. In core functionality, there is no longer a technological gap between the products of the indie hacker and the publicly traded company.

The accessibility of such devtools has effectively transformed the CapEx of building out each part of the stack into the OpEx of paying vendors per unit of usage. Where it previously took an engineering team a year to build a conversational AI tool from scratch, a fieldable MVP can now be launched in a couple of months, at lower infrastructure cost than building in-house.

This flips the economics of product development on its head. You no longer need a billion-dollar TAM to justify shipping a conversational AI product: when the build-out is rented instead of owned, a five-person team can profitably build for a 5,000-seat niche—be it oncology coordinators, FP&A analysts, or even one company’s internal support desk. Paradoxically, the smaller the niche, the bigger the ROI, because specialized tooling doesn’t just take notes; it wipes out hours of function-specific grind work, making each seat dramatically more valuable.

The Last Frontier of Context

A fundamental transformation is currently reshaping the nature of what a conversational AI product is. Until now, the core focus has been building a better notetaker—slightly tailored by use case, but ultimately just a more accurate transcription and summarization tool.

Today’s conversational AI tools produce templated reports: regardless of what is said in a meeting, the deliverable is a set of meeting notes in the form factor the product’s designers prescribed. One tool might use the transcript to score sales calls and another might automatically push the notes to an ATS, but the end product does not meaningfully alter the end user’s workflows. These tools have delivered massive value by improving memory and automating general notetaking, but none have transformed the nature of a profession.

The new generation of conversational AI tools goes beyond the read-only form factor and automates complex existing workflows. In medicine, Abridge is automating the practice of highly structured medical note-taking, integrating deeply with EHRs and taking over much of a medical scribe’s work. In product management, ClickUp uses meeting data to automatically create tasks, assign follow-ups, and update tickets. This is not just taking custom notes; it is automating the practice of data entry entirely. As LLMs improve, it is now possible to transform unstructured transcripts into structured data and perform systemized post-meeting work at the same level of quality as a human. Dropping the friction of data entry to near zero doesn’t just save humans time; it captures data that otherwise would never have been recorded.
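To make the transcript-to-structured-data step concrete, here is a minimal Python sketch. The schema, its field names, and the call_llm stand-in are illustrative assumptions rather than any particular product’s API; the point is only that the output is typed data a downstream system can consume, not prose.

```python
# Minimal sketch: turning an unstructured transcript into typed records.
# `call_llm` is a stand-in for whatever chat-completion client you use;
# the schema and field names are illustrative, not prescriptive.
from typing import Literal
from pydantic import BaseModel

class ActionItem(BaseModel):
    owner: str                                     # who committed to the follow-up
    description: str                               # what they committed to do
    due: str | None = None                         # ISO date, if one was mentioned
    system: Literal["crm", "ticketing", "email"]   # where it should be written

class MeetingExtract(BaseModel):
    summary: str
    action_items: list[ActionItem]

PROMPT = """Extract a summary and every concrete follow-up from this transcript.
Return JSON matching this schema: {schema}

Transcript:
{transcript}"""

def extract(transcript: str, call_llm) -> MeetingExtract:
    """Ask the model for JSON, then validate it into typed objects."""
    raw = call_llm(PROMPT.format(schema=MeetingExtract.model_json_schema(),
                                 transcript=transcript))
    return MeetingExtract.model_validate_json(raw)  # raises if malformed
```

Validation failures can be retried or routed to a human, which is what keeps near-zero-friction data entry trustworthy.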

We’re seeing a flywheel effect between the quantity of conversations recorded and the quality of what can be done with that data. With more conversations recorded, the tool gains greater context on the business and can automate workflows with higher quality. Because the tool is more useful, more conversations get recorded. Through this process, it becomes socially acceptable, expected even, to record all conversations. It started with all customer support calls being recorded, then sales, then interviews, with every other vertical to follow.

As we near the totality of work conversations being documented, AI will have complete context over an entire business. Slack, docs, and email are already programmatically accessible—conversations are the last piece of the puzzle. But what happens in a world of such total context?

An emerging trend we’re beginning to see is proactively actionable software: software that unlocks actions and insights the user would not otherwise have known to ask for. For example, Lapel gives every account the feeling of a dedicated account manager—monitoring signals like recent sales calls, support tickets, and usage trends to recommend timely, personalized actions that enable CSMs to drive better customer relationships. In incident management, Datadog automatically surfaces relevant context from previous incidents to better inform the engineers responding to the present issue. The software of tomorrow will not just respond to queries; it will proactively catalyze solutions.

As the number of meetings recorded each year races towards the totality of all work conversations, a few categories of products may capture outsized value. In particular, I’m excited about:

Systems of Record

Conversation data is only transformative when it’s aggregated with the rest of business context into a single source of truth. The race is therefore to decide whose ledger, incumbent or upstart, stores that truth, and how a challenger can pry open a foothold without triggering a costly rip-and-replace. That contest turns on a few core design principles that separate true systems of record from mere transcription logs:

• Raw transcripts aren’t actionable; structured data is. The fastest path from conversation to workflow is to give developers domain-specific building blocks—like an OncologySchema()—that turn messy speech into clean, typed outputs (e.g., ICD-10 codes); a sketch of what such a building block might look like follows this list.

• Always-on ingestion = perfect recall. With ingestion cost dropping substantially year after year, systems can afford to log all conversations, not just milestone calls. The result is a comprehensive longitudinal record no human note-taker could match.

• Compounding defensibility. Each incremental transcript enriches the knowledge graph, tightening feedback loops for model fine-tuning and making the dataset uniquely irreplicable. Whoever owns the record owns the downstream ecosystem of analytics and workflow automations.
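To ground the first bullet, here is a purely illustrative Python sketch of what such a domain-specific building block could look like. The OncologyEncounter fields and the simplified ICD-10 shape check are my own assumptions, not any vendor’s actual schema.

```python
# Hypothetical sketch in the spirit of the "OncologySchema()" idea above.
# Field names are illustrative; the regex is a simplified shape check,
# not a lookup against a licensed ICD-10 code set.
import re
from pydantic import BaseModel, field_validator

ICD10_SHAPE = re.compile(r"^[A-Z]\d{2}(\.[A-Z0-9]{1,4})?$")

class OncologyEncounter(BaseModel):
    patient_ref: str                 # opaque reference, never raw PHI
    diagnoses: list[str]             # ICD-10 codes extracted from the visit
    staging: str | None = None
    follow_up_weeks: int | None = None

    @field_validator("diagnoses")
    @classmethod
    def codes_look_valid(cls, codes: list[str]) -> list[str]:
        bad = [c for c in codes if not ICD10_SHAPE.match(c)]
        if bad:
            raise ValueError(f"not ICD-10 shaped: {bad}")
        return codes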

Action Engines

Structured speech is just potential energy; the real value comes when software converts it into autonomous, compounding action. We’re moving from systems that report problems to engines that resolve them. In this world:

• Trigger → Plan → Act loops become self-reinforcing flywheels. Each executed intervention (e.g., a renewal-risk email, an ICD-10 correction) produces a success/failure label that feeds the engine’s next policy update. Over time, the model’s “playbook” doesn’t just automate known workflows—it discovers new micro-tactics (change email tone at 7 PM local, escalate after the third unreturned call) that no playbook author ever wrote, creating an edge that rivals cannot scrape from UI screenshots. A minimal sketch of this loop follows the list.

• Latent context: catching patterns humans would miss, and acting before the problem is obvious. LLMs can now cross-reference subtle signals—like stressed language in a transcript, a packed calendar, or a slowdown in engineering output—to spot problems before they show up in a dashboard. It’s one thing for a chatbot to say, “This account might churn.” It’s another for software to suggest repricing the contract or escalating a touchpoint while usage is still strong, and to act on it faster than any human could.

• Explainability is a moat. In regulated industries, automation only wins if it can explain itself. Buyers want a clear audit trail showing why each decision was made. Engines that log their reasoning, cite the source data, and show what they would have done differently in another scenario (“I chose Path B because RiskScore > 0.7”) don’t just satisfy compliance; they generate proprietary training data with every action.
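As a rough illustration of the Trigger → Plan → Act shape from the first bullet, here is a hedged Python sketch. Every name in it (Trigger, Action, run_loop, and the callables it takes) is hypothetical; the point is only that each executed action produces an outcome label that feeds the next planning pass.

```python
# Hedged sketch of a Trigger -> Plan -> Act loop. All names are hypothetical.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Trigger:
    account_id: str
    signal: str            # e.g. "renewal_risk", "stressed_language"
    evidence: list[str]    # transcript snippets / metrics that fired the trigger

@dataclass
class Action:
    kind: str              # e.g. "send_email", "escalate", "reprice"
    payload: dict
    rationale: str         # logged for auditability, e.g. "RiskScore > 0.7"

def run_loop(detect: Callable[[], list[Trigger]],
             plan: Callable[[Trigger], Action],
             act: Callable[[Action], bool],
             record_outcome: Callable[[Trigger, Action, bool], None]) -> None:
    """One pass of the flywheel: detect signals, plan, execute, label."""
    for trigger in detect():
        action = plan(trigger)                        # model chooses an intervention
        succeeded = act(action)                       # execute against external systems
        record_outcome(trigger, action, succeeded)    # label feeds the next policy update
```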

Real-world beachheads are already hinting at the shape of this future—DevRev’s auto-triage, Datadog’s instant incident briefs—but these are still manual-override first drafts. The fully matured Action Engine will look less like a dashboard and more like an always-on agent that silently patches revenue leaks, plugs compliance gaps, and pages you only when judgment, not data entry, is required.

Enabling Infrastructure

Ingestion and transcription are no longer the primary technical hurdle, but the path from raw audio to production-ready product still hides landmines. A massive opportunity exists for infrastructure providers to turn these invisible months of yak-shaving into a single API call. Three unlocks stand out:

• Plug-and-Play Compliance: Conversation data is inherently sensitive—full of PHI, PII, and trade secrets. Teams can’t deploy without clearing security hurdles. A drop-in proxy that handles redaction, consent, and immutable audit logs out of the box turns months of compliance overhead into a default setting.

• Continuous Quality & Cost Optimizer: Shipping v1 is straightforward—but accuracy drifts, costs spike, and vendor performance varies. The hard part is managing quality over time while keeping margins in check. A smart router that dynamically splits traffic across model providers, benchmarks performance (latency, hallucinations, etc.), and fine-tunes from user corrections lets teams maintain reliability without building a full MLOps stack.

• Unified Ingest: Right now, every niche AI tool wants to own the meeting data it touches: sales notetakers keep it in their enablement platform, PM tools in their ticket system. That leads to a fractured reality where the same conversation has to be “re-heard” by a dozen different bots, each siloing its own slice of context. The more logical future is one capture layer that records once and pipes structured outputs everywhere else, the way Segment did for event data. This product wouldn’t just summarize a call; it would map the relevant pieces into each downstream system in the right schema: a promised follow-up becomes a Gmail draft, a feature request hits Linear, a revenue forecast change flows into the planning model (sketched below).
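To make the fan-out idea concrete, here is an illustrative Python sketch of a capture layer routing structured outputs to downstream systems. The event kinds and print-based sinks are placeholders standing in for real integrations (email drafts, issue tracker, planning model).

```python
# Illustrative "record once, pipe everywhere" fan-out. The router, sink names,
# and record fields are hypothetical; real integrations would go through each
# vendor's API and schema.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ConversationEvent:
    kind: str       # "follow_up" | "feature_request" | "forecast_change"
    content: dict   # structured payload extracted from the transcript

# One capture layer, many downstream sinks keyed by event kind.
SINKS: dict[str, Callable[[dict], None]] = {
    "follow_up":       lambda c: print(f"[email draft] {c}"),
    "feature_request": lambda c: print(f"[issue tracker] {c}"),
    "forecast_change": lambda c: print(f"[planning model] {c}"),
}

def fan_out(events: list[ConversationEvent]) -> None:
    """Route each structured output to the system where it belongs."""
    for event in events:
        sink = SINKS.get(event.kind)
        if sink:
            sink(event.content)
```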

The next era of enterprise software won’t be built around buttons and forms—it will be built around full business context. As speech becomes fully captured, structured, and executable, the line between talking and doing collapses. The future of conversation data isn’t AI that listens; it’s software that acts.

LinkedIn
ted@recall.ai