Build in public — CallBuddy AI

From idea to live product.
Every step, documented.

This is our ongoing build log for CallBuddy AI — a 10-stage case study series covering every phase of the MVP, from problem discovery through launch pricing. No polish, no hindsight. Each entry is written as the stage happens. We publish the wins and the setbacks because that’s the only way this is useful to anyone reading it.

Stage 5 of 10 — core engine in progress (42%)
01
Recognizing the problem — Completed

Why small service businesses bleed revenue from missed calls — and why the problem is harder to see than it sounds.


The situation

The starting point wasn’t a pitch deck. It was a pattern. Across conversations with small business owners — salons, med spas, independent clinics — the same frustration kept surfacing: “We miss calls constantly. By the time we call back, they’ve already booked somewhere else.”

The problem is deceptively simple. A potential client calls during a busy hour, no one picks up, they don’t leave a voicemail, and the business has no idea who called or how to recover them. The revenue walks out the door silently — and never shows up in any report.

62%
of calls to small businesses go unanswered
$75B+
estimated annual revenue lost to missed calls across US small businesses
< 5 min
window to recover a caller before they move on

Key insight

The problem isn’t that businesses don’t care — it’s that they have no system. When you’re the only person at the front desk and you’re in the middle of a service, answering the phone isn’t an option. The cost shows up as “leads that never came in” rather than a line item anyone tracks.

What we learned

Missed calls are a top-of-funnel problem that almost no one measures. Most owners we spoke with couldn’t tell us how many calls they missed last week. The data simply doesn’t exist for them. That means any solution has to be self-evidently valuable from day one — not something that requires setting up tracking to prove its worth.

Up next

Stage 02 — Moving from anecdotal frustration to validated market demand. We needed to talk to real business owners, systematically.

02
Validating the market — Completed

20+ conversations with small business owners. What they told us changed the scope of what we planned to build.


The approach

Before writing a single line of code, we spent three weeks talking to business owners across three verticals: med spas, hair salons, and independent wellness practices. The question wasn’t just “does this bother you?” — it was “what do you do right now when you miss a call?” Stated pain is easy to find. Actual behavior reveals the real problem.

22
business owners interviewed across three verticals
18/22
had no formal missed call follow-up system
3–4 hrs
average time before a callback was even attempted

What we heard

The most common follow-up strategy was a sticky note and a mental note to call back later. A few relied on voicemail but admitted they rarely checked it same-day. Only three had tried an automated tool — all three described it as “too complicated” or “not built for what we do.”

One salon owner said something that stayed with us: “By the time I see the missed call, they’ve already booked down the street.” That single sentence became the product brief.

Key insight

Speed is the entire product. Not features, not dashboards. The window to recover a missed caller is under five minutes — and the only way to hit that window consistently is full automation. Any solution that requires the owner to take an action is functionally the same as no solution.

What we learned

The market is real, the pain is acute, and existing tools are either built for enterprise teams or too generic to feel relevant to a one-location service business. There’s a clear gap between “enterprise missed call management” and nothing. That’s where we’re building.

Up next

Stage 03 — The problem was validated. Now we had to decide exactly what to build first — and more importantly, what to cut.

03
Defining the MVP — Completed

Forcing ourselves to answer: what is the single thing this product has to do well before anything else matters?


The constraint we gave ourselves

The temptation after validation interviews is to build everything people asked for — two-way SMS, AI voice responses, CRM syncing, multi-location dashboards, no-show prediction. All reasonable. All would have killed the MVP.

We gave the team one rule: the product has to do one thing exceptionally well before it does anything else. Everything else is version two.

The one thing

Text back a missed caller within 60 seconds and give them a direct path to book an appointment. That’s it. If we can do that reliably, we have a product worth paying for. If we can’t, nothing else matters.

In scope for V1

Missed call detection via telephony webhook. Automated SMS reply within 60 seconds. Business name personalization. A direct booking link in the SMS. A simple dashboard showing missed call history and delivery status.

Explicitly out of scope

AI voice calls. Two-way SMS conversation. CRM integrations. Multi-location support. No-show reminders. Custom SMS flows. Any feature requiring more than 15 minutes of onboarding to configure.

What we learned

Writing down what you’re not building — and sharing that list publicly — is as strategically valuable as the feature list itself. It keeps the team focused and gives potential customers clarity upfront. Scope creep doesn’t just slow development; it muddies positioning before you’ve had a single real sales conversation.

Up next

Stage 04 — With a defined scope, the next decision was infrastructure. Every stack choice at this stage has a longer tail than it looks.

04
Choosing the tech stack — Completed

Every infrastructure decision at this stage has downstream consequences. Here’s what we chose and exactly why.


The philosophy

Don’t build what you can buy. At the MVP stage, the goal is to validate the business model — not demonstrate engineering ambition. Every custom-built component is technical debt before you have revenue. We looked for the minimum number of systems that could deliver a reliable end-to-end experience without inventing infrastructure that already exists.

The decisions

Telephony + SMS: Twilio. Industry standard, webhook-based call event handling, programmable SMS with delivery receipts. Cost per message is predictable and scales linearly.

Booking integration (V1): Universal link approach. During onboarding, each business provides their existing booking URL. We embed it in every outgoing SMS. Imperfect, but it ships without three months of integration work.

Dashboard: Next.js on Vercel. Fast to build, easy to iterate, clean server-side rendering for the missed call feed.

Database: Supabase (PostgreSQL). Row-level security for multi-tenant data isolation, built-in auth, and real-time subscriptions for the live dashboard feed.

Key insight

The most consequential decision wasn’t which tool to pick — it was choosing existing infrastructure over custom telephony entirely. Building a custom SIP stack gives more control and lower per-unit cost at scale. But three extra months of development to save fractions of a cent per SMS is the wrong trade before you have a single paying customer.

What we learned

Stack decisions feel permanent but aren’t. Pick what moves fastest now, note what you’d rebuild first with more runway, and move. We’d rebuild the booking integration first — the universal link approach will hit a ceiling faster than anything else in the product.

Up next

Stage 05 — Building the core loop. Call in → no answer → SMS out within 60 seconds. Sounds simple. Isn’t.

05
Building the core engine — In progress

The loop: call comes in → no answer → SMS fires within 60 seconds. Getting it to work reliably is a different problem than getting it to work at all.


The core loop

On paper, three steps: (1) detect a missed call via Twilio’s status callback webhook, (2) look up the account registered to that phone number, (3) fire an SMS with a personalized message and booking link. In practice, each step has failure modes that have to be anticipated before any real business relies on it at volume.
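The three steps above can be sketched as a small handler. This is an illustrative sketch, not our production code: the `ACCOUNTS` lookup, the phone numbers, and the injected `send_sms` callback are hypothetical stand-ins, and the only Twilio-specific detail assumed is that the status callback reports `CallStatus` values like `no-answer` and `busy` for unanswered calls.

```python
# Sketch of the three-step loop. Account store and SMS sender are
# hypothetical stand-ins; only the CallStatus values come from Twilio.
MISSED_STATUSES = {"no-answer", "busy", "failed"}

# Hypothetical registry keyed by the business's tracked phone number.
ACCOUNTS = {
    "+15550100": {"name": "Glow Med Spa", "booking_url": "https://example.com/book"},
}

def handle_status_callback(params, send_sms):
    """(1) Detect a missed call from the status-callback payload,
    (2) look up the account registered to the dialed number,
    (3) fire the personalized SMS with the booking link."""
    if params.get("CallStatus") not in MISSED_STATUSES:
        return None  # answered or still in progress; nothing to do
    account = ACCOUNTS.get(params.get("To"))
    if account is None:
        return None  # unknown number; in production, log and alert
    body = (f"Hi! You just called {account['name']} and we missed you. "
            f"Book online here: {account['booking_url']}")
    return send_sms(to=params["From"], body=body)
```

In production the `send_sms` callback would wrap the Twilio messages API; injecting it keeps the detection logic testable without touching the network.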

Challenges we’ve run into

Webhook reliability: Twilio webhooks can fail — network blips, server timeouts, deployment restarts. We built a retry queue with exponential backoff so a failed delivery attempt doesn’t silently become a missed SMS. Any event that fails is retried up to three times before logging as an error.

Variable no-answer windows: Different phone systems have different ring durations before routing to voicemail or dropping the call. We had to account for this variability so every missed call scenario produces the same outcome: an SMS within 60 seconds.

Carrier filtering: SMS from new, low-volume numbers increasingly gets flagged as spam. We’re working through 10DLC campaign registration to properly categorize outbound messages and improve deliverability before beta launch.

94%
SMS delivery rate in current test environment
< 45s
average time from missed call to SMS delivery
3
active edge cases in the retry queue backlog

What surprised us

The hardest part of building a “simple” automation isn’t the automation itself — it’s the infrastructure surrounding it. Logging, retry logic, error alerting, and webhook idempotency are invisible to the customer. But they’re the difference between a product that works and a product that sometimes works. We’re treating reliability as a core feature from the start.

Up next

Stage 06 — The SMS fires. But where does it send people? The booking integration is where the actual revenue recovery closes.

06
The booking integration — Upcoming

The SMS drives recovery. The booking link is where it closes. Here’s how we’re handling a fragmented scheduling landscape.

+

The challenge

Our target clients use a wide range of scheduling tools — Jane App, Mindbody, Acuity, Square Appointments, and several industry-specific platforms. Building native integrations before launch would push the MVP back by months and add maintenance overhead before we have a single paying customer.

V1 solution: universal link

During onboarding, each business provides their existing booking URL. We store it and embed it in every SMS. The missed caller lands directly on the business’s current booking page — no new tool, no migration, no friction. The tradeoff: the experience is only as good as the client’s existing booking setup.

V2 roadmap

Native integrations with the top scheduling platforms across our target verticals. This lets us pre-populate appointment type, pass caller info, and eventually attribute confirmed bookings back to a specific SMS — closing the ROI loop that business owners will eventually ask for.

What we’re watching

The key metric at this stage is SMS link click rate, not booking completion. If recipients get the message but don’t click, the friction lives in the copy — not the booking page. We need to isolate where the drop happens before building deeper integrations that might not address the real issue.

Up next

Stage 07 — Before any real business touches this, we break it ourselves. Systematic testing across carriers, phone types, and edge cases.

07
Internal testing & QA — Upcoming

The goal isn’t zero bugs. It’s finding the bugs that would damage trust with a real business owner on day one.


Testing priorities

We’ll run the core loop across multiple carrier networks and phone system types (VoIP, mobile, traditional landline), and against different call scenarios — immediate hang-up, caller disconnects before voicemail, call forwarded to a second number. Each needs to produce the same output: an SMS within 60 seconds.

What we’re measuring

SMS delivery rate by carrier. Time from missed call event to message delivery. False positive rate — SMS triggered when a call was actually answered. Dashboard accuracy: does what the business owner sees match what actually happened in the call log?

The standard we’re holding to

98%+ SMS delivery rate across all tested carriers. Sub-60-second time-to-send in every tested scenario. Zero false positives — no SMS sent when a call was answered. If we can’t clear these numbers in controlled testing, no real business gets access.

Up next

Stage 08 — Once internal testing clears, three real businesses get access. No charge, complete transparency, honest feedback required.

08
Beta pilot — Upcoming

Three founding businesses. No charge. Thirty days. We document everything — including what breaks.


The pilot structure

We’re onboarding three businesses — one from each target vertical (med spa, salon, and an independent wellness practice). Free access for 30 days in exchange for honest weekly feedback, permission to document results in this series, and a 30-minute debrief at close. No NDA, no pressure. If it doesn’t work, we want to know exactly why.

What we’re asking them to do

Flag any instance where the SMS felt wrong — wrong timing, wrong tone, wrong situation. Note any callers they’re reasonably certain were recovered because of the automated reply. We’re not asking them to run a controlled experiment. We’re asking them to tell us what feels off. Their qualitative read matters more at this stage than precise attribution data.

The real signal we’re watching for

Do business owners feel confident the system is working without logging in to check? That’s the core product promise — set it and trust it. If they’re checking the dashboard every day to verify delivery, we have a trust problem. No amount of new features solves a reliability perception issue.

Up next

Stage 09 — After 30 days, we read the data. What actually happened vs. what we expected, and what it means for the product going forward.

09
Reading the data — Upcoming

Thirty days of real usage. What the numbers say, what they miss, and what changes as a result.


What we plan to measure

Recovery rate: the percentage of missed callers who respond to the automated SMS. Booking conversion: the percentage who click through to the booking link. Response rate by time-of-day. And the signal that matters most: would each beta business pay for this when the free period ends?

What we expect to find

Our hypothesis is that 20–35% of missed callers will respond positively to the automated text. We expect time-of-day to have a meaningful effect on response rates. And we expect at least one failure mode we didn’t catch in internal testing — something about real-world usage that controlled environments can’t replicate.

Why we’re publishing this publicly

If the numbers are strong, publishing them builds the case for every future customer. If the numbers disappoint, publishing what we found and what we changed is more credible than hiding it. Founders who pretend everything is working tend to build products that eventually stop working.

Up next

Stage 10 — Once we understand the value being delivered, we can price with confidence rather than guesswork.

10
Pricing the product — Upcoming

Pricing anchored to value delivered — not to what feels comfortable to charge. Here’s the framework we’ll use.

+

The value anchor

The average appointment value across our three target verticals runs between $75 and $200 depending on service type. If a business recovers one missed booking per week they otherwise would have lost, the annual value is $3,900 to $10,400. A product priced at $99–$149/month roughly pays for itself with a single recovered booking each month. That’s the conversation we plan to have with every prospect — not “here’s what it costs” but “here’s what one missed booking you used to lose is worth.”
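The arithmetic behind the anchor, written out (figures come from the paragraph above; one recovered booking per week is the stated assumption):

```python
# Value-anchor math from the text. One recovered booking per week
# is the working assumption, not a measured beta result.
LOW, HIGH = 75, 200          # average appointment value ($)
BOOKINGS_PER_YEAR = 52       # one recovered booking per week

annual_low = LOW * BOOKINGS_PER_YEAR    # $3,900
annual_high = HIGH * BOOKINGS_PER_YEAR  # $10,400

# A $99-149/month subscription costs $1,188-1,788/year, so even the
# low-end recovery scenario returns a multiple of the subscription.
sub_low, sub_high = 99 * 12, 149 * 12
print(annual_low, annual_high, sub_low, sub_high)
```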

$75–200
average appointment value across target verticals
1 booking
recovered per month = product pays for itself
$99–149
provisional monthly price range going into beta review

What we’re not doing

We’re not pricing cost-plus. We’re not finding what a competitor charges and going slightly lower. Both approaches produce pricing that falls apart the moment a prospect asks why they should pay it. We’re pricing based on documented return — and using beta data to validate or adjust before public launch.

Provisional model

Flat monthly subscription, no per-SMS fees visible to the client. We absorb messaging costs into the margin. One price per tier, simple enough to explain in a 30-second cold message. That’s the real test of whether your pricing is clear.

The signal we’re watching from beta

The most important data point won’t be product feedback — it’ll be whether founding businesses would have paid for this if the free period ended tomorrow. And if yes, how much. That answer, not our internal modeling, is where the launch price gets set.