
AI should not gamble: the lesson for B2B SaaS

Written by Ynze Sipkema

During my college days, I had a set strategy on exams. Some of the questions I knew for sure. The rest? I guessed. With a 25% chance, you could get a point.

That worked fine in a lecture hall. But in the world of enterprise AI and B2B SaaS, guessing is disastrous. Yet Large Language Models (LLMs) do just that: when they don’t know the answer, they give one anyway. In science, this is called a hallucination: a seemingly convincing but factually wrong result.

For an exam, maybe smart. For an AI system making decisions in legal, financial or operational processes, it is unacceptable.

Why LLMs hallucinate

I recently read research by OpenAI in which they explain this phenomenon. The comparison to an exam is apt: LLMs function like a student who does not receive punishment for wrong answers.

The training process rewards certainty, not honesty. “I don’t know” is punished. Result: models learn to prefer to always say something rather than honestly show their uncertainty.

This explains why benchmark scores are often impressive. Models score high on binary pass/fail benchmarks such as SWE-bench, but do so partly by guessing. Just like I did on exams back in the day. Interesting for scientists; risky for companies.
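The incentive is easy to quantify. As a minimal sketch with illustrative numbers (four answer options, a model with no real knowledge; not the grading of any specific benchmark), compare the expected score for guessing versus abstaining under two grading schemes:

```python
# Expected score per question for a model that guesses vs. one that abstains.
# Illustrative numbers: 4 answer options, model has no idea which is right.

def expected_score(p_correct: float, reward: float, penalty: float) -> float:
    """Expected points for answering: reward if right, penalty if wrong."""
    return p_correct * reward + (1 - p_correct) * penalty

# Binary grading (1 point if right, 0 if wrong) - the common benchmark setup.
guess_binary = expected_score(0.25, reward=1.0, penalty=0.0)      # 0.25
abstain_binary = 0.0                                              # "I don't know" scores 0

# Negative marking (-1/3 per wrong answer) removes the edge of random guessing.
guess_penalized = expected_score(0.25, reward=1.0, penalty=-1/3)  # 0.0
```

Under binary grading, guessing strictly beats abstaining (0.25 > 0), which is exactly the incentive described above; once wrong answers cost points, guessing at random no longer pays.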

The price of AI hallucinations in B2B SaaS

For a consumer asking for a nice recipe, a wrong answer is harmless. For an insurer, lawyer or accountant, it is different. A misanalyzed claim can lead to erroneous payouts. A legal AI agent that incorrectly summarizes a document can harm a client’s litigation position. A miscalculation in accounting can cause tax risks.

In B2B SaaS, it all comes down to reliable AI. An AI that gambles undermines client trust. And in B2B SaaS, without trust there is no adoption.

Our approach at Blinqx: reliable AI from the start

At Blinqx, we have taken this problem seriously from the beginning. Within our Qore/AI platform, we build reliability into our models. That means:

  • Honestly stating when something is not certain. Our models can explicitly report back that they do not have enough certainty.
  • Fallback mechanisms. If the knowledge is lacking, an additional check can be requested, such as via retrieval or human validation.
  • Central guard rails. Security mechanisms are built into Qore/AI by default, keeping agents we build predictable and controllable.
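As a sketch of how such a fallback chain might be wired up (the function names, threshold, and routing labels here are hypothetical illustrations, not the actual Qore/AI implementation), a low-confidence answer is routed to retrieval or a human reviewer instead of being returned as-is:

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float  # model's self-reported certainty, 0.0-1.0

def retrieve_and_verify(claim: str):
    """Stub: a real system would check the claim against a knowledge base
    and return a grounded answer, or None if nothing confirms it."""
    return None  # pretend retrieval could not confirm the claim

def answer_with_fallback(answer: Answer, threshold: float = 0.8):
    """Return the answer only when confident; otherwise fall back."""
    if answer.confidence >= threshold:
        return ("model", answer.text)
    # Fallback 1: try to ground the answer in retrieved documents.
    grounded = retrieve_and_verify(answer.text)
    if grounded is not None:
        return ("retrieval", grounded)
    # Fallback 2: honestly report uncertainty and queue for human validation.
    return ("human_review", "I don't know - escalated for human validation")
```

With this shape, a confident answer passes straight through, while an uncertain one ends up escalated rather than presented to the user as fact.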

These principles make AI usable and scalable in our industries, where erroneous outputs can have a direct impact on users.

OpenAI and the need for reliable AI

OpenAI’s recent publication on reducing AI hallucinations shows that this is not a nice-to-have, but a necessary step in the evolution of agentic AI.

By rewarding models differently – not only on correctness, but also on honesty – the frequency of hallucinations can be drastically reduced. This confirms the approach we already take at Blinqx: better an AI that says “I don’t know” than one that returns something wrong with conviction, simply because it prefers to give an answer.

From slot machine to digital colleague

The transition from generative AI to agentic AI makes this issue even more urgent. Agents don’t just work reactively; they make decisions and execute actions independently. If such an agent guesses, the impact can be much greater than one wrong answer: entire processes can be derailed.

That’s why I see it as a fundamental responsibility of any CAIO or CTO: don’t build AI that allows gambling. Build reliable AI that knows what it knows, and honestly identifies what it doesn’t know. Only then can your agent live up to the role of trusted digital colleague.

My own gambling strategy worked fine in college classrooms. But for B2B SaaS, one thing is clear: gambling is not an option in our customers’ practices.

Check out the latest Blinqx developments around AI here.


Frequently asked questions about hallucination by LLMs

What do we mean by “AI guessing”?

Many Large Language Models (LLMs) give an answer anyway, even if they are not sure. This is called a hallucination: a seemingly convincing but factually wrong answer. For consumer applications, this can be harmless, but in enterprise SaaS it is risky.

Why do LLMs hallucinate in the first place?

LLMs’ training process rewards certainty, not honesty. “I don’t know” is punished. As a result, models learn to always say something, even when in doubt. Benchmark scores seem impressive, but often include “guesswork.”

How does Blinqx ensure reliable AI solutions?

Our Qore/AI platform prevents gambling by:

  • Honesty: AI indicates when it does not know something
  • Checks & fallback: additional validation via retrieval or experts
  • Guardrails: built-in safety mechanisms for predictable output
