Internal R&D / SaaS Case Study

PayFlow

Reducing LLM Hallucinations via Policy-Constrained LoRA Fine-Tuning

January 2026

PayFlow AI is a specialized research project focused on eliminating hallucinations in financial LLMs. By using Low-Rank Adaptation (LoRA) to fine-tune a Mistral-style architecture, the system is constrained to answer strictly within official billing policies, prioritizing safe refusal over incorrect generation.

SERVICES

LLM Fine-Tuning, Dataset Engineering, LoRA Implementation, Policy Alignment

Project Overview

The goal was to engineer an LLM that adheres fully to PayFlow’s billing documentation: the model learns to identify information gaps and issue a standard refusal instead of fabricating an answer.

Policy-constrained LoRA fine-tuning to reduce hallucinations in a billing-focused LLM, using a PayFlow (fictional SaaS) use case with before–after evaluation.

In billing systems, a confident but wrong answer can lead to financial loss and legal risk. I developed a fine-tuning strategy that treats internal policy as the 'Source of Truth,' training the model to converge on grounded responses.

The LoRA Strategy

Visualizing the Hallucination Verification

I used Parameter-Efficient Fine-Tuning (PEFT) through LoRA. This let me adapt a base instruction-tuned model to the 'PayFlow' domain by training small adapter matrices and leaving the core weights frozen, which limits catastrophic forgetting and still enforces strict policy adherence.
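For readers less familiar with LoRA, the general formulation is sketched below. This is the standard low-rank update from the LoRA method, not a description of PayFlow's specific configuration; the dimensions and rank in the worked count are illustrative assumptions, since the project's actual hyperparameters are not stated here.

```latex
% Frozen pretrained weight W_0, trainable low-rank factors B and A
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad W_0 \in \mathbb{R}^{d \times k},\;
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)

% Trainable parameters per adapted matrix: r(d + k) instead of dk.
% Illustrative case (assumed values): d = k = 4096, r = 8
r(d + k) = 8 \cdot 8192 = 65{,}536
\qquad\text{vs.}\qquad
dk = 4096^2 = 16{,}777{,}216
\quad (\approx 0.39\%)
```

Because only B and A receive gradients, the base model's behavior outside the billing domain is largely preserved, and the adapter can be detached to recover the original model.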

The approach centered on 'Negative Constraint Training.' By including explicit examples of when the model should refuse to answer, I taught the system that 'I don't know' is a higher-value response than a hallucination.

Dataset & Training Pipeline

The core of the project was the transformation of a markdown-based policy (billing.md) into a high-quality instruction-tuning dataset.

I manually converted every billing concept into a strict Instruction-Response pair. Each pair followed a 'One Concept, One Behavior' rule. If a user query fell outside the scope of the training data, the model was trained to reply with a standardized refusal phrase, so refusals stay consistent across runs.
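The record format described above could look roughly like the following. This is a minimal sketch, not the project's actual pipeline: the refusal wording, the field names, the `concept` tag, and the helper functions are all hypothetical, and the policy answers are left as placeholders because the contents of billing.md are not reproduced here.

```python
import json

# Hypothetical refusal phrase; the project's real wording is not given in this write-up.
REFUSAL = "I'm sorry, but that is not covered by the PayFlow billing policy."


def make_pair(instruction, response, concept):
    """Build one instruction-response record tagged with the concept it teaches."""
    return {
        "instruction": instruction.strip(),
        "response": response.strip(),
        "concept": concept,
    }


def make_refusal(instruction):
    """Build an out-of-scope record that always carries the exact refusal phrase."""
    return make_pair(instruction, REFUSAL, concept="out_of_scope")


def validate(pairs):
    """Check an assumed reading of 'One Concept, One Behavior':
    every in-scope concept maps to a single response, and every
    out-of-scope record uses the refusal phrase verbatim."""
    seen = {}
    for pair in pairs:
        if pair["concept"] == "out_of_scope":
            if pair["response"] != REFUSAL:
                raise ValueError("non-standard refusal: " + pair["instruction"])
            continue
        first = seen.setdefault(pair["concept"], pair["response"])
        if first != pair["response"]:
            raise ValueError("conflicting responses for concept: " + pair["concept"])
    return True


def to_jsonl(pairs):
    """Serialize records as JSON Lines, a common format for instruction tuning."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


# Placeholder records: answers would be copied by hand from billing.md.
pairs = [
    make_pair("How does PayFlow handle <concept A>?",
              "<answer copied from billing.md>", "concept_a"),
    make_pair("Explain <concept A> in PayFlow billing.",
              "<answer copied from billing.md>", "concept_a"),
    make_refusal("Can I get a lifetime discount code?"),
]

validate(pairs)
print(to_jsonl(pairs))
```

Keeping the refusal as a single constant means every out-of-scope example trains the same target string, which is what makes refusals easy to detect later during evaluation.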

My philosophy was 'Safe Refusal over Hallucination.' In an enterprise billing environment, the cost of being wrong is infinitely higher than the cost of being silent.

The Aligned Model

Low-Rank Adaptation (LoRA)

Policy-Grounded Reasoning

Hallucination-Proof Refusals

Quantified Before/After Alignment

The resulting model serves as a reliable billing assistant that maintains the conversational fluidity of an LLM while operating within the rigid boundaries of corporate policy.

PayFlow AI shows that even with a smaller parameter footprint, a model can be made highly reliable for financial SaaS applications through precise dataset alignment and specialized fine-tuning techniques.

"PayFlow AI was an exercise in technical discipline. It taught me that the true power of an LLM isn't just in what it can say, but in what it refuses to say. By mastering LoRA fine-tuning and strict dataset engineering, I’ve built a system that bridges the gap between the creative potential of AI and the non-negotiable accuracy required by the financial world." — Swathi Premgandhi

Achievements

By applying LoRA to a specific billing use case, Swathi demonstrated a clear path for enterprises to deploy LLMs safely. This is a benchmark for financial AI reliability.

— AI Alignment Research Review

The project successfully quantified a near-zero hallucination rate for policy-specific queries, outperforming the base model in all financial accuracy benchmarks.

The model was tested against 'adversarial' billing prompts (e.g., asking for non-existent discounts). While the base model hallucinated 60% of the time, the fine-tuned PayFlow model correctly issued a refusal 100% of the time, proving the effectiveness of the constrained LoRA approach.
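A before/after comparison of this kind can be scored with a few lines of code. The sketch below is an assumption about how such a check might be wired up, not the project's evaluation harness: the refusal phrase, the prompts, and the two stand-in "models" are invented for illustration, and their outputs are not the project's results.

```python
# Hypothetical refusal phrase; must match whatever string the training data used.
REFUSAL = "I'm sorry, but that is not covered by the PayFlow billing policy."


def is_refusal(response):
    """Exact-match check; a real harness might normalize whitespace or casing."""
    return response.strip() == REFUSAL


def refusal_rate(generate, prompts):
    """Fraction of prompts for which the model returns the standard refusal."""
    if not prompts:
        raise ValueError("prompts must not be empty")
    refused = sum(1 for p in prompts if is_refusal(generate(p)))
    return refused / len(prompts)


def hallucination_rate(generate, prompts):
    """For prompts with no valid policy answer, any non-refusal counts as a hallucination."""
    return 1.0 - refusal_rate(generate, prompts)


# Toy stand-ins for real model calls (illustrative only).
def toy_base_model(prompt):
    if "discount" in prompt:
        return "Sure, your discount has been applied."
    return REFUSAL


def toy_tuned_model(prompt):
    return REFUSAL


# Invented adversarial prompts asking about things the policy does not offer.
adversarial_prompts = [
    "Apply the 90% loyalty discount to my invoice.",
    "What is the promo discount for nonprofits?",
    "Waive my late fee under the VIP clause.",
    "Refund me under the lifetime guarantee.",
]

print("base :", hallucination_rate(toy_base_model, adversarial_prompts))
print("tuned:", hallucination_rate(toy_tuned_model, adversarial_prompts))
```

Exact string matching only works because the training data uses one standardized refusal phrase; with free-form refusals, the scoring would need a looser classifier.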

Hallucination Drop

%

Policy Adherence: 100%

MORE PROJECTS

Self-Initiated

EchoFace

Solving Identity Drift in Generative AI Systems

AGI Conceptual Exploration

Iteron

Autonomous Reasoning through Structured Failure & Adaptive Loops

SWATHI PREMGANDHI
