FORTUNE 50 ENTERPRISE
Hitting ambitious deadlines
for a Fortune 50 enterprise
We guided a Fortune 50 company through shipping their first two AI experiences for a flagship product, from zero to serving millions of users in four months.

THE CHALLENGE
Unprecedented personalization
The director of product was between a rock and a hard place. He oversaw a flagship product at
a Fortune 50 company, and leadership had made a big ask: ship a set of AI experiences that
would deliver their content to users with an unprecedented degree of personalization.
The catch? They wanted it in four months, in time for a big PR event.
He had a crack team, but they were experts in conventional software, not AI. They would have to
upskill on new technology and build the experiences from scratch, all while navigating a tight
deadline and the company's quality assurance process for AI experiences.

MAXIMIZING ROI
Teach a man
to fish
After consulting with the team, we proposed engaging as scouts: guiding product discussions,
defining AI architecture, and creating prototypes.
We would also build evals to measure the quality of the systems' outputs, a step that is essential
for AI systems but often overlooked by teams who've only shipped conventional software.
The client team, with our guidance, would build the final production-ready version.
This minimized compliance barriers, which was critical given our ambitious timeline, and increased
ROI by giving the client's team the hands-on experience they will need for future AI work.

SPEED AND QUALITY
Getting it right
the first time
We started by prototyping the first experience. This helped us dial in data flow, create
alignment among a dozen stakeholders, and decompose the problem into bite-sized, AI-friendly
pieces.
Prototype in hand, the team started the production implementation while we pivoted to evals.
We produced a human- and machine-gradable rubric to quantify quality and catch the 16 failure
modes we identified, then reviewed it with subject matter experts and made two minor additions.
Because the team built on our work, the system's first iteration scored an average of 84%, well
above the 75% threshold we had agreed was needed to ship. Before going live, we raised scores to 95%.
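As a rough illustration of how a rubric like this can be made machine-gradable, here is a minimal
Python sketch. The check names, the sample outputs, and the equal weighting of checks are all
assumptions made for illustration, not the rubric used on this project; only the 75% ship
threshold comes from the engagement.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable

SHIP_THRESHOLD = 0.75  # the pre-agreed bar described above


@dataclass(frozen=True)
class Check:
    """One rubric item: passes when an output avoids a given failure mode."""
    name: str
    passes: Callable[[str], bool]


# Hypothetical checks; a real rubric would cover every identified failure mode.
RUBRIC = [
    Check("non_empty", lambda out: bool(out.strip())),
    Check("within_length", lambda out: len(out) <= 500),
    Check("no_placeholder_text", lambda out: "TODO" not in out),
    Check("ends_with_full_sentence", lambda out: out.strip().endswith((".", "!", "?"))),
]


def score_output(output: str, rubric: list[Check] = RUBRIC) -> float:
    """Fraction of rubric checks that a single output passes (equal weights)."""
    return mean(1.0 if check.passes(output) else 0.0 for check in rubric)


def evaluate(outputs: list[str]) -> tuple[float, bool]:
    """Average score across outputs, and whether it clears the ship threshold."""
    average = mean(score_output(o) for o in outputs)
    return average, average >= SHIP_THRESHOLD


if __name__ == "__main__":
    # Made-up sample outputs, purely to exercise the scoring functions.
    samples = [
        "Here are three articles matched to your reading history.",
        "TODO: fill in recommendation",
    ]
    average, ready = evaluate(samples)
    print(f"average score: {average:.0%}, ready to ship: {ready}")
```

Because each check is a named pass/fail item, the same list could also be handed to human graders
as a checklist, which is one way to keep human and automated scores comparable.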

TARGETED SOLUTIONS
Bounce rate: from
70% to 11%
With things in good shape for the first experience, we picked up support for the second.
Since the team had a working first iteration, we started with evaluation.
One major concern was user bounce rate: initial estimates placed it at 70% due to low-quality
outputs. After two weeks of working with us, the team had brought it down to 11%.
To drive these improvements, we identified two problem categories and recommended specific
remediation steps for each. Solutions ranged from passing more context to the underlying search
system to improving generated responses through better task decomposition.

EXCEPTIONAL RESULTS
Delivering a
dream outcome
In the end, the director of product and his team hit the ambitious deadline that leadership
had set, and did so in a way that's making a genuine impact for users.
We're still collecting hard numbers, but we expect to speed up user workflows by 3-5x while
dramatically increasing the value they receive from our client's content.
All on top of building the team's AI capabilities and supporting key strategic initiatives for
our client.

SAME OR DIFFERENT
Timely feedback
We believe frank reflection is key to driving business value. For this project, our main
takeaway is the importance of providing feedback early and often when team members are
facing challenges.
Early in the project, we had a team member whose performance was dipping. By the time our team
leads caught it, the dip had gone on long enough that we needed to replace that team member to
meet our responsibilities to the client.
While we think it was the right call for this project, going forward it's important to us to
catch issues early and give our team members the time and support they need to improve.
