Gartner surveyed 413 martech leaders last year. 89% expected significant business benefit from AI agents. Only 45% said their existing vendors actually delivered those results.
That gap isn't random. Having talked to thousands of GTM leaders who tested AI and got burned, I've noticed the failures almost always trace back to the same four mistakes, and they all happen before deployment. They pick the wrong tool. They expect magic on day one. They skip stakeholder alignment. Or they set targets that were never realistic.
The fix isn't better AI; it's asking better questions before you buy.
The obvious evaluation (and why it doesn't work)
The way most teams evaluate AI for GTM is to run a trial, look at the output, and compare it against what their team does manually. Feed it some accounts, let it send some emails, grade the quality. Pick the one with the best demo.
This seems reasonable and fair. But it misses almost everything that matters.
It doesn't test whether your expectations are realistic for your stage. It doesn't surface the stakeholder misalignment that will kill the project at the three-month review. It doesn't tell you whether your motion is even ready for AI to accelerate. And it doesn't help you figure out if you're solving a task when you should be solving a system.
I talk to teams all the time who ran flawless pilots with their vendors and still failed in production because they never asked the questions underneath the trial. Here are the four I always start with (even with our own customers).
Question 1: What does "good" actually look like for us?
This is where most teams get it wrong first. The answer depends entirely on where you are.
If you're a small company standing up outbound or inbound for the first time, AI can deliver what looks like a 10x result. But be honest about what's driving that: you're going from zero to one. Any structured program, even a fully manual one, would produce dramatic gains. AI just gets you there faster.
If you're a mature org with a working GTM motion, 10x isn't on the table and anyone who promises it is selling you a fantasy. What is realistic is 20-30% improvement through better process, better coverage, and better speed. That's still enormous at scale, and it's worth investing in seriously. But it requires you to walk in with the right number in your head.
The teams that succeed set targets based on their actual starting point, not based on a vendor's case study from a company in a completely different stage.
Question 2: How long should this actually take?
If you already have a working GTM motion (messaging that converts, lists that perform, a process that reps follow), you can see results from AI in about two weeks. That's roughly how long it takes to set up infrastructure like inboxes, sequences, and templates, and let the system start running.
If your motion is broken, or if you're building one from scratch, AI won't fix that. It will just execute the wrong playbook faster and at greater scale. You still need to find what works, and that testing phase takes time, usually up to three months.
This is the thing most people don't want to hear: AI accelerates a working motion. It does not create one. The teams that accept this and commit to the iteration phase are the ones that win. The ones that expect week-one miracles churn after 90 days and blame the tool.
Question 3: Does everyone involved actually agree on what success means?
I've seen this kill more AI deployments than bad technology ever has.
Here's a version of it I've watched play out multiple times: Sales owns the AI project. They deploy an AI SDR, it starts delivering leads in two weeks, manual work drops. Success, right?
Then the three-month review happens. The executive sponsor wanted a strategic transformation, not a single use case. Marketing is upset about email messaging they didn't approve. RevOps sees it as yet another tool bolted onto a bloated stack.
Everyone had a different definition of success. Nobody wrote them down. The project gets killed even though it was working.
Before you buy anything, get every stakeholder in one room: sales, marketing, RevOps, your executive sponsor. Write down what "success" means for each of them. Refer back to it at every check-in. This sounds like basic project management because it is. But almost nobody does it, and it's the single most common reason AI programs get pulled.
Question 4: Are we solving a task or improving a system?
Look at your answers to questions one through three. I'd bet none of them say "our goal is to send a better email."
If you're trying to grow pipeline 25%, you need more than better outbound copy. You need to find the right buyers, engage them at the right moment, qualify them before they go cold, and hand them to a rep with full context. That's a system, not a task.
When the pieces are siloed (separate inbound tool, separate outbound tool, separate qualification flow) they break in ways that are hard to diagnose. A prospect talks to your chatbot, then gets an outbound email with zero context. Leads get routed wrong. Performance drops and nobody can tell which part of the workflow failed.
When the system works together, it improves together. Every touchpoint enriches the next. Reps show up to calls with full context. Follow-ups reference the demo. You can trace a straight line from first touch to pipeline to revenue.
Most GTM teams don't need another point tool. They need the motion to work as one connected thing.
The starting point, not the finish line
If you answer these four questions honestly, you'll end up with something like this:
"We're a mature org targeting 20-30% pipeline growth. We have a working motion, so we expect early results in 2-3 weeks. Our exec sponsor wants pipeline growth, RevOps wants fewer tools, sales wants less manual work, and marketing wants faster lead response. We need a system, not a point solution."
This isn't a strategy; it's more of a sketch. But it's the right sketch: one that sets realistic expectations and gives you a framework to evaluate whether any AI tool is actually going to work for your team.
The worst thing you can do is buy something based on a demo, set targets based on a vendor's best case study, and skip the alignment work. That's how the 55% of disappointed leaders in the Gartner study got there.
If you want to talk through what this looks like for your org specifically, we're happy to.