
My AI Was Right, but My Guests Asked Me Anyway.

Tags: AI trust, wedding planning, AI hallucination, AI citations, fact-checking, AI wedding concierge
A wedding concierge answer with a small 'Fact-checked' badge underneath it

The week of our wedding in Mauritius, I kept getting the same text. Some version of: "I asked the aiDo AI and it gave me an answer, but I wanted to double-check with you."

Sometimes aiDo was wrong, and that was obviously a problem.

But the overwhelming majority of the time it was correct, and somehow that was still a problem: guests weren't believing aiDo even when it was right.

The whole point of the concierge was to take the where-is-the-Haldi, what-do-I-wear, what-time-do-I-arrive, is-there-a-pharmacy questions off my plate during the one week I had no spare minutes. If a guest asks the AI and then asks me anyway, I haven't saved myself a question. I've added a step to it. The thing became something people consulted before they came to me, not instead of me.

I spent a lot of time and energy making sure aiDo didn't hallucinate. It pulled from real data. It even labeled the moments when a note came straight from me, with a little "The groom says:" tag so people knew I'd written the words myself.

That week taught me that trust in an AI is really two separate problems that look like one.

The first one is obvious: make sure the AI isn't making stuff up (we don't want Aunt Betty going to the wrong location for the reception). The second one is getting people to believe it when it's correct. Solve only the first and you get my wedding: a careful, accurate assistant nobody quite offloads to. Solve only the second and you get something worse, a confident voice people act on whether or not it's right.

Screenshots of guests texting the groom to confirm answers the concierge had already given correctly

The confident liar

I'd built a feature that read a guest's flight ticket, worked out when they'd land, and saved the arrival time so we could arrange airport pickups. I knew an AI reading a PDF could slip, so I added a confirmation step. The AI would show its work and ask the guest to okay it.

A few of them just said "yep! looks good!" without reading it. So the AI cheerfully saved arrival times that were wrong, and we didn't catch it until a couple of sharp-eyed guests flagged their own. We ended up keeping a separate spreadsheet as the real source of truth, which I hand-copied onto paper for my wife's father so he could run the pickups.

One wrong arrival time, rubber-stamped by someone who trusted the screen, and suddenly there's a parallel spreadsheet and a groom doing data entry at midnight. An AI people blindly believe is a huge liability with a knock-on effect. After that slip, the trust didn't come back for the correct answers either. One bad data point poisons the whole well.

That being said, users shouldn't have to do their own internal audits every time the agent responds. It should be easy to decide whether or not the agent is telling the truth.

Half one: making it say true things

Most of the engineering inside aiDo goes into making the model's confident voice actually earn the confidence. The short version: the concierge answers from real records and curated notes that it collects using tools. Those tools are hooked into the same database the invitation cards pull from. If I've written it down correctly, aiDo is going to get the correct information.

Ask it for the nearest pharmacy and it doesn't reach for a plausible-sounding name. It queries Google Places and reports only what comes back, with the real address and a working map link, because a guest might genuinely walk there for medication.

For the things a map can't answer (the cash-in-an-envelope custom, whether the cliffside villa works for someone with a walker, the real arrival time versus the one printed on the invitation), there's a curated answer key. The couple writes those notes once, and the AI quotes them instead of guessing. (More on that in Introducing aiDo Documents.)

Then there's a short list of rules aiDo can't talk its way around. It can never claim an action it has no tool for, like "I let the kitchen know" when there is no kitchen tool. It can never state a business, address, or phone number a search didn't return. It's never the one who decides whether your uninvited plus-one is welcome. And it can never harm a human.
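Two of those rules are mechanical enough to check in code after the model has written its answer. What follows is a minimal sketch of that idea, not aiDo's actual implementation: the tool names (`notify_kitchen`, `create_booking`, `search_places`), the claim patterns, and the phone numbers are all made up for illustration.

```python
import re

# Hypothetical mapping: phrases that claim a side effect -> the tool that
# would have to exist for the claim to be true.
ACTION_CLAIMS = {
    r"\bI (?:let|told|notified) the kitchen\b": "notify_kitchen",
    r"\bI (?:booked|reserved)\b": "create_booking",
}

# Loose phone-number pattern: optional +, then digits with common separators.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def digits(text: str) -> str:
    """Keep only the digits so '+230 555 0199' and '2305550199' compare equal."""
    return re.sub(r"\D", "", text)


def rule_violations(answer: str, available_tools: set[str],
                    tool_results: list[str]) -> list[str]:
    """Return a human-readable list of rule violations found in the answer."""
    violations = []

    # Rule: never claim an action there is no tool for.
    for pattern, tool in ACTION_CLAIMS.items():
        if re.search(pattern, answer, re.IGNORECASE) and tool not in available_tools:
            violations.append(f"claims an action with no '{tool}' tool")

    # Rule: never state a phone number a search didn't return.
    known = {digits(m) for result in tool_results for m in PHONE_RE.findall(result)}
    for number in PHONE_RE.findall(answer):
        if digits(number) not in known:
            violations.append(f"phone number not returned by any search: {number}")

    return violations


bad = "I let the kitchen know. You can call the pharmacy on +230 555 0100."
print(rule_violations(bad, {"search_places"}, ["Pharmacy listing: +230 555 0199"]))
```

The example answer trips both checks: there is no kitchen tool, and the number it quotes is not the one the search returned. A pattern list like this only catches phrasings someone thought of in advance, so it is a backstop behind the prompt-level rules, not a substitute for them.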

I don't even let aiDo do its own timezone math anymore. Every time or date the model sees has already been converted to the local hour, and the raw timestamp is stripped out before it reaches the model, so it can't fumble the arithmetic. It quotes a finished string and nothing else.
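Here is a rough sketch of that pre-conversion step, under stated assumptions: the field names (`starts_at_utc`, `starts_at_local`), the event, and the date are invented for the example, and the venue timezone is hard-coded as a fixed UTC+4 offset because Mauritius does not currently observe daylight saving time. A real system would more likely use `zoneinfo.ZoneInfo("Indian/Mauritius")`.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+4 offset for the venue (Mauritius, no DST).
VENUE_TZ = timezone(timedelta(hours=4), "MUT")


def prepare_for_model(record: dict, tz: timezone = VENUE_TZ) -> dict:
    """Return a copy of the record where the raw timestamp is replaced by a
    finished local-time string, so the model never sees anything to convert."""
    raw = datetime.fromisoformat(record["starts_at_utc"])
    if raw.tzinfo is None:
        # A naive timestamp would force a guess; fail loudly instead.
        raise ValueError("refusing to guess the timezone of a naive timestamp")

    local = raw.astimezone(tz)

    # Drop the raw field entirely; only the formatted string goes to the model.
    safe = {k: v for k, v in record.items() if k != "starts_at_utc"}
    safe["starts_at_local"] = local.strftime("%d %b %Y, %H:%M") + " (venue local time)"
    return safe


event = {"name": "Haldi", "starts_at_utc": "2025-03-14T10:30:00+00:00"}
print(prepare_for_model(event))
# {'name': 'Haldi', 'starts_at_local': '14 Mar 2025, 14:30 (venue local time)'}
```

The design choice is that the conversion happens once, in ordinary code, and the result is a string. The model can copy it but has no raw number left to do arithmetic on.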

This is the half most people mean when they say "make the AI accurate." It's necessary, and at my wedding it was mostly working. But it also did almost nothing to help users know when they could trust it was working.

This led to a line in my own post-mortem that I didn't expect to write: the problem wound up not being hallucinations. The AI was right far more often than it was wrong, but people still texted me.

Half two: making people believe it

My first instinct was to assert trust. I added little "pulling from the knowledge base" labels. I added the "The groom says:" tags. I thought this was really clever. Whenever the agent sourced answers from a document, if it saw "The groom says" it would quote that in the answer so users knew I was the one who specifically added that note.

These failed in various ways, some more surprising than others. For example, many users thought the "The groom says:" tag was the AI hallucinating, when in fact I had gone in and written all those curated notes myself! That shows just how difficult a problem this is to solve. Even when you tell users you wrote something, they may not believe it.

The upshot of all of this is that telling the user the AI is right doesn't work, because maybe the AI hallucinated that it was right. You have to show your work on every claim and you have to be willing to say "I'm not sure."

And one more thing for developers: you will feel like this is not important. As developers, we know 100% perfect is almost impossible, so we settle for something like a 95% success rate. But with AI answers, it NEEDS to work 100% of the time, or else the whole thing is broken.

So based on everything I learned, I've added two new features to aiDo.

Receipts. Every concrete fact in an answer carries a small tappable marker. The venue, a phone number, an arrival time. Tap it and you see exactly where it came from (the event record, the document, the Google listing), with the real snippet sitting right there. The point isn't to be believed on faith. You can open the source yourself. And the AI doesn't get to choose what looks well-sourced: the matching is deterministic, done by the system after the answer is written, checking which real records actually appear in the text. A claim with no receipt is, by definition, ungrounded.

An answer with a tappable citation marker that opens the exact source it came from
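The deterministic matching behind Receipts can be sketched in a few lines. This is an illustration under assumptions, not the production matcher: the `SourceFact` shape, the normalization rule, and the sample venue, time, and phone number are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFact:
    source: str   # e.g. "event record", "document", "Google listing"
    value: str    # the exact string the record or tool returned
    snippet: str  # what a guest would see after tapping the marker


def normalize(text: str) -> str:
    """Case-fold and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.casefold().split())


def attach_receipts(answer: str, facts: list[SourceFact]) -> list[tuple[str, SourceFact]]:
    """Run after the answer is written. The model has no say here: a fact gets
    a receipt only if its real value literally appears in the answer text."""
    haystack = normalize(answer)
    return [
        (fact.value, fact)
        for fact in facts
        if fact.value.strip() and normalize(fact.value) in haystack
    ]


facts = [
    SourceFact("event record", "Villa Soleil", "Haldi venue: Villa Soleil"),
    SourceFact("event record", "14:30", "Haldi starts 14:30 (venue local time)"),
    SourceFact("Google listing", "+230 555 0199", "Pharmacy phone: +230 555 0199"),
]
answer = "The Haldi is at Villa Soleil and starts at 14:30."
for value, fact in attach_receipts(answer, facts):
    print(f"{value!r} <- {fact.source}: {fact.snippet}")
```

Only the venue and the time get receipts here, because the phone number never appears in the answer. Note what this sketch does not do: it finds which known facts show up in the text, but spotting a concrete claim that has no receipt at all needs a separate step that identifies claims in the first place.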

Second Opinion. After the concierge writes an answer, a separate AI reads it back and re-checks every factual claim against the same evidence the first one had: the tool results, the documents, the couple's own notes. Then it stamps the message. Fact-checked. Or, when something doesn't hold up: Couldn't confirm 2 details. Check these with the couple. Tap the badge and it walks you through it claim by claim, what it confirmed, what it couldn't, and why.

The name is the whole pitch. My guests were already getting a second opinion, from me, over text, at the rehearsal dinner. So we moved the second opinion inside the product, where it happens in two seconds instead of becoming another thing on my plate.

The Second Opinion badge under an answer, expanded to show which claims were confirmed and which could not be
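In outline, the stamping logic might look something like the sketch below. Everything named here is an assumption made for illustration: the `ClaimCheck` shape, the `Verifier` signature, and especially `substring_verifier`, which is a trivial stand-in for the separate AI that does the real re-checking. How claims get extracted from the answer is also left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClaimCheck:
    claim: str
    confirmed: bool
    reason: str


# Hypothetical signature: (claim, evidence) -> (confirmed, reason).
# In the real feature this role is played by a second model.
Verifier = Callable[[str, list[str]], tuple[bool, str]]


def second_opinion(claims: list[str], evidence: list[str],
                   verify: Verifier) -> tuple[str, list[ClaimCheck]]:
    """Re-check every claim against the same evidence, then pick the badge text."""
    checks = [ClaimCheck(claim, *verify(claim, evidence)) for claim in claims]
    unconfirmed = sum(1 for check in checks if not check.confirmed)

    if unconfirmed == 0:
        badge = "Fact-checked"
    else:
        noun = "detail" if unconfirmed == 1 else "details"
        badge = f"Couldn't confirm {unconfirmed} {noun}. Check these with the couple."
    return badge, checks


def substring_verifier(claim: str, evidence: list[str]) -> tuple[bool, str]:
    """Stand-in verifier for the example: confirmed only if the claim text
    appears verbatim in some piece of evidence."""
    for item in evidence:
        if claim.casefold() in item.casefold():
            return True, f"found in: {item}"
    return False, "no matching evidence"


badge, checks = second_opinion(
    claims=["Villa Soleil", "14:30", "shuttle leaves at 13:00"],
    evidence=["Haldi at Villa Soleil, 14:30 (venue local time)"],
    verify=substring_verifier,
)
print(badge)
for check in checks:
    print(f"- {check.claim}: {'confirmed' if check.confirmed else 'not confirmed'} ({check.reason})")
```

The per-claim list is what the expanded badge would walk through. Keeping the badge wording in plain code, outside either model, means the stamp can only say "Fact-checked" when every individual check passed.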

The part I care about most is the part that admits failure. The badge is allowed to say "I couldn't confirm this." An assistant that flags its own shaky claim reads as more honest than one that's smooth and certain about everything. It's also the same mechanism that catches a real hallucination before anyone acts on it. That flight-time slip that started all of this is exactly the kind of claim Second Opinion now stops and questions instead of waving through.

You need both halves

They fail in opposite directions, which is why one without the other gets you nowhere.

Truth without trust is the double-check tax: correct answers that still route back to a human, so the assistant makes work instead of absorbing it. Trust without truth is a guest confidently walking to a pharmacy that was never there. Only together do you get the thing I actually wanted that week. Something people hand their questions to and don't think twice about.

Most AI is a black box. An answer arrives with no seams, and you either take it or you don't. We're building the opposite, a glass box: you can see where every word came from, and you can watch it check its own work before you ever read it.

The bar for a wedding concierge was never "usually right." It's "right, and you can tell." Nobody should have to text the groom to be sure.

— Jon, founder of aiDo

aiDo is the AI wedding planner and guest concierge I built for our own wedding, and now yours. See it at ai-do.io.