Building an AI Chatbot as a Product Interface, Not a Side Feature

Most AI chatbot integrations are implemented as a separate input box attached to an existing application. The user asks a question, the model returns text, and the rest of the product remains mostly unchanged.

In an analytics product, that approach is too limited. Users do not just need answers. They need the system to understand the current business context, translate intent into backend requests, execute those requests safely, and return results that connect back to the application.

The goal was not to build a generic assistant. The goal was to make natural language another way to operate the product: asking questions, drilling into numbers, comparing scenarios, and moving between summary views and detailed financial data without forcing users to know every filter, report, or data model behind the scenes.

The Core Problem

Financial analytics queries are rarely simple text questions. A user may ask about headcount, budget variance, department-level spend, historical trends, vendor concentration, plan-versus-actual differences, or changes between forecast versions.

Behind each question is a structured request involving dimensions, filters, time periods, metrics, grouping logic, authorization rules, and sometimes tenant-specific business definitions.

For example, a question like “Why did sales and marketing payroll go up last quarter?” is not one operation. The system may need to identify the relevant accounting periods, map “sales and marketing” to departments, identify payroll-related GL accounts, compare against a prior period or plan version, group the variance by employee or category, and return a result that can be drilled into.

The challenge was converting natural language into something the backend could execute reliably without letting the model invent business logic or financial numbers.

Why a Plain Chatbot Is Not Enough

A chatbot that only produces text creates several problems:

  • It cannot reliably query application data without a structured request.
  • It cannot enforce permissions, tenant boundaries, or product-specific validation rules by itself.
  • It cannot return rich outputs such as tables, charts, variance bridges, or drill-down paths if everything is treated as plain text.
  • It makes follow-up questions fragile because the model has to infer context instead of receiving explicit application state.

For business users, the useful answer is often not a paragraph. It may be a table showing the top contributors to a variance, a chart showing a trend, a breakdown by department, or a follow-up action such as “drill into this location” or “compare this against the approved budget.”

The Design Approach

At Precanto, I worked on integrating an LLM-powered conversational layer directly into the product workflow. The chatbot was not treated as a separate assistant. It was designed as a natural language interface over existing backend capabilities.

The architecture separated the system into two broad responsibilities: the LLM interpreted what the user wanted, while deterministic backend services executed the request, validated access, computed numbers, and prepared structured outputs for the frontend.

This made the chatbot useful without making it the source of truth. The model helped users express intent. The application still owned the data, permissions, calculations, and final rendering contract.

Structured Request Generation

The first step was translating user intent into a structured request that backend APIs could understand. Instead of asking the model to directly answer financial questions, the model produced a strict JSON representation of the user’s intent.

This request captured information such as:

  • Metric being requested
  • Accounting period or relative time range
  • Department, location, and GL filters
  • Plan version, scenario, or comparison baseline
  • Grouping and drill-down dimensions
  • Requested visualization or output format
  • Ambiguous terms that required backend resolution

For example, the model might translate “Show me why engineering spend is above plan this quarter” into a request that asks for an expense variance query filtered to engineering departments, the current quarter, actuals versus the selected plan version, grouped by GL hierarchy and optionally by person or location.
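To make the shape of such a request concrete, here is a minimal sketch of what the engineering example might look like as a typed object. The class name `AnalyticsRequest` and every field name and value are illustrative assumptions, not the actual product schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnalyticsRequest:
    """Hypothetical structured intent emitted by the model (all names are assumptions)."""
    metric: str                                   # what is being measured
    period: str                                   # relative range, resolved later by the backend
    filters: Dict[str, List[str]] = field(default_factory=dict)
    compare_to: Optional[str] = None              # baseline such as a plan version
    group_by: List[str] = field(default_factory=list)
    output: str = "table"                         # requested visualization
    unresolved_terms: List[str] = field(default_factory=list)


# "Show me why engineering spend is above plan this quarter"
request = AnalyticsRequest(
    metric="expense_variance",
    period="current_quarter",
    filters={"department": ["engineering"]},
    compare_to="selected_plan_version",
    group_by=["gl_hierarchy"],
    output="variance_table",
    unresolved_terms=["engineering"],  # informal name left for the backend to map
)

# Serialized form that would be handed to the backend API.
request_json = json.dumps(asdict(request), indent=2)
print(request_json)
```

Note that the object contains no numbers. It only describes what to compute, which is what lets the backend stay the source of truth.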

The backend then validated the request, resolved business terms against master data, applied authorization, and executed the query using normal application services.
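The validation step could look roughly like the sketch below, which rejects anything the model produced that is not an executable request. The allow-lists, the `InvalidRequest` exception, and the `parse_model_output` function are hypothetical names chosen for illustration; a real backend would draw its allowed metrics and dimensions from its own configuration.

```python
import json

# Hypothetical allow-lists; assumed here purely for illustration.
ALLOWED_METRICS = {"expense_variance", "headcount", "spend"}
ALLOWED_DIMENSIONS = {"department", "location", "gl_hierarchy", "person"}


class InvalidRequest(ValueError):
    """Raised when model output does not form an executable request."""


def parse_model_output(raw: str) -> dict:
    """Parse and validate the model's JSON before anything touches application data."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidRequest(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidRequest("request must be a JSON object")

    metric = data.get("metric")
    if not isinstance(metric, str) or metric not in ALLOWED_METRICS:
        raise InvalidRequest(f"unknown metric: {metric!r}")

    group_by = data.get("group_by", [])
    if not isinstance(group_by, list) or not all(isinstance(d, str) for d in group_by):
        raise InvalidRequest("group_by must be a list of strings")
    unknown = set(group_by) - ALLOWED_DIMENSIONS
    if unknown:
        raise InvalidRequest(f"unknown dimensions: {sorted(unknown)}")
    return data
```

Failing closed at this boundary means a malformed or invented field never reaches the query layer.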

Keeping the Backend in Control

A key design principle was that the LLM should interpret intent, not own business logic. The backend remained responsible for:

  • Tenant and user authorization
  • Data access and row-level filtering
  • Validation of requested dimensions and metrics
  • Financial aggregation and variance calculations
  • Mapping user language to configured business entities
  • Computation correctness and auditability

This separation made the system safer and easier to reason about. The LLM handled ambiguity and language, while deterministic backend services handled data and computation.
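As a small illustration of that split, the sketch below computes a variance entirely in backend code, with tenant and row-level filters applied before any aggregation. The `Row` type, the `ROWS` data, and the function name are assumptions, and the figures are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Row:
    tenant: str
    department: str
    actual: float
    plan: float


# Toy data standing in for the application's ledger; values are invented.
ROWS = [
    Row("acme", "engineering", 120.0, 100.0),
    Row("acme", "marketing", 80.0, 90.0),
    Row("globex", "engineering", 300.0, 310.0),
]


def variance_by_department(tenant: str, allowed_departments: Set[str]) -> Dict[str, float]:
    """Compute actual minus plan per department, restricted to what the caller may see."""
    result: Dict[str, float] = {}
    for row in ROWS:
        # Tenant and row-level filters are enforced here, never by the model.
        if row.tenant != tenant or row.department not in allowed_departments:
            continue
        result[row.department] = result.get(row.department, 0.0) + (row.actual - row.plan)
    return result


print(variance_by_department("acme", {"engineering"}))
```

The model's request can name a department, but whether the caller is allowed to see it, and what the number is, are decided only by this code path.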

The separation also made the system easier to test. Instead of testing whether a model could correctly answer every possible finance question, the team could test two narrower things: whether user prompts were converted into valid request objects, and whether those requests produced the expected backend results.
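One way to exercise the prompt-to-request half is a table of golden cases, sketched below. The `fake_interpret` function is a stand-in that returns canned JSON so the example runs without a model; the prompts, field names, and `run_golden` helper are all hypothetical.

```python
import json
from typing import Callable, List

# Golden cases: a prompt paired with the request object it is expected to produce.
GOLDEN = [
    ("headcount by location", {"metric": "headcount", "group_by": ["location"]}),
    ("engineering spend vs plan", {"metric": "expense_variance", "group_by": ["gl_hierarchy"]}),
]


def fake_interpret(prompt: str) -> str:
    """Stand-in for the model call; returns canned JSON so the check is deterministic."""
    canned = {prompt: expected for prompt, expected in GOLDEN}
    return json.dumps(canned[prompt])


def run_golden(interpret: Callable[[str], str]) -> List[str]:
    """Return the prompts whose interpreted request differs from the expected one."""
    failures = []
    for prompt, expected in GOLDEN:
        actual = json.loads(interpret(prompt))
        if actual != expected:
            failures.append(prompt)
    return failures


print(run_golden(fake_interpret))  # no mismatches for the canned interpreter
```

In practice the `interpret` argument would wrap the real model call, and the backend half would be covered separately by ordinary service tests that take request objects as input.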

Context-Aware Conversations

The conversational layer was designed to be available from different parts of the application, not just from a single chatbot screen. A user could start from a dashboard, a table, a chart, or a particular variance and ask a follow-up question from that context.

This meant the request sent to the model needed more than the user’s text. It also needed product state: selected filters, current organization, accounting period, plan version, visible chart or table, selected row, and sometimes the user’s previous analytical path.

With that context, a follow-up like “break this down by location” or “show me the employees behind this increase” could be interpreted correctly because “this” already had a concrete meaning inside the product.
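A minimal sketch of how that state might travel with the user's text is shown below. `ProductContext`, its fields, and `build_model_input` are assumed names for illustration only.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProductContext:
    """Hypothetical snapshot of what the user is looking at (field names are assumptions)."""
    organization: str
    accounting_period: str
    plan_version: str
    filters: Dict[str, List[str]] = field(default_factory=dict)
    visible_component: Optional[str] = None     # e.g. the chart or table on screen
    selected_row: Optional[Dict[str, str]] = None  # the concrete referent of "this"


def build_model_input(user_text: str, ctx: ProductContext) -> dict:
    """Bundle the user's words with explicit application state for the model."""
    return {"user_text": user_text, "context": asdict(ctx)}


ctx = ProductContext(
    organization="acme",
    accounting_period="2024-Q2",
    plan_version="approved_budget",
    filters={"department": ["engineering"]},
    visible_component="variance_table",
    selected_row={"gl_account": "payroll"},
)
payload = build_model_input("break this down by location", ctx)
print(payload["context"]["selected_row"])
```

Because the selected row is passed explicitly, the model does not have to guess what "this" refers to from earlier chat text.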

Rich Response Contract

The response layer was designed to support more than text. The backend and frontend exchanged structured response objects rather than treating every answer as a plain string.

A useful response could include:

  • Explanatory text
  • Tables with sortable and drillable rows
  • Charts showing trends or comparisons
  • Variance breakdowns
  • Suggested follow-up questions
  • Links back into relevant product screens

This response contract allowed the assistant to explain the result in natural language while still returning precise data structures that the frontend could render as product-native components.
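The sketch below shows one possible shape for such a contract: a list of typed blocks plus follow-ups and links. The block types, the `kind` discriminator, the link path, and the sample figures are all assumptions made for the example.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, List, Optional


@dataclass
class TextBlock:
    text: str
    kind: str = "text"


@dataclass
class TableBlock:
    columns: List[str]
    rows: List[List[Any]]
    drill_dimension: Optional[str] = None   # dimension a row click would drill into
    kind: str = "table"


@dataclass
class AssistantResponse:
    blocks: List[Any] = field(default_factory=list)
    follow_ups: List[str] = field(default_factory=list)
    links: List[str] = field(default_factory=list)


response = AssistantResponse(
    blocks=[
        TextBlock("Engineering spend is above plan, driven mainly by payroll."),
        TableBlock(
            columns=["GL category", "Variance"],
            rows=[["Payroll", 20.0], ["Software", 5.0]],  # made-up figures
            drill_dimension="location",
        ),
    ],
    follow_ups=["Compare this against the approved budget"],
    links=["/reports/variance"],  # hypothetical in-app route
)

# The frontend can switch on each block's "kind" to pick a native component.
payload = asdict(response)
print([block["kind"] for block in payload["blocks"]])
```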

Handling Ambiguity

Finance users often ask questions using business language rather than exact system labels. A department name may be informal. A GL category may be referenced by a phrase. A time period may be expressed as “last quarter,” “this year,” or “since the latest forecast.”

The system needed to distinguish between ambiguity the backend could resolve automatically and ambiguity that required clarification. If the phrase mapped cleanly to known master data, the backend could continue. If multiple matches were possible, the assistant could ask a focused follow-up instead of guessing silently.

This mattered because, in financial software, a confident but wrong interpretation is worse than a clarifying question. The system had to optimize for trust, not just fluency.
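The resolve-or-clarify decision can be sketched as a lookup that returns a status rather than a guess. The `DEPARTMENTS` alias table, the `Resolution` type, and the status strings are hypothetical; real master data would come from the tenant's configuration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical master data: canonical department name -> informal aliases.
DEPARTMENTS = {
    "Sales": {"sales"},
    "Marketing": {"marketing", "mktg"},
    "Sales Operations": {"sales ops", "sales"},
}


@dataclass
class Resolution:
    status: str                          # "resolved", "clarify", or "unknown"
    matches: List[str] = field(default_factory=list)


def resolve_department(term: str) -> Resolution:
    """Map a user's phrase to master data, refusing to pick among several candidates."""
    key = term.strip().lower()
    matches = sorted(
        name for name, aliases in DEPARTMENTS.items()
        if key == name.lower() or key in aliases
    )
    if len(matches) == 1:
        return Resolution("resolved", matches)
    if matches:
        return Resolution("clarify", matches)   # ask the user which one they meant
    return Resolution("unknown")


print(resolve_department("mktg"))
print(resolve_department("sales"))
```

Here "mktg" maps to a single department and resolves, while "sales" matches two entries in this toy table, so the assistant would ask a focused follow-up listing both.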

Designing for Iteration

Analytical workflows are iterative. Users rarely ask one question and stop. They compare, narrow, expand, drill down, change the time period, and ask why a number moved.

That shaped the design of conversation history. The system did not need to resend every previous message on every turn. It needed to preserve the pieces that mattered for analysis: the active request, selected filters, generated result references, and the user’s current drill path.

This kept follow-up questions useful without making the model responsible for remembering the entire session from raw chat text alone.
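A compact session state along these lines might look like the sketch below, where a follow-up is applied as a change to the active request rather than replayed from chat text. `AnalysisState`, `apply_follow_up`, and the result reference strings are assumed names for illustration.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class AnalysisState:
    """Hypothetical structured state kept in place of raw chat history."""
    active_request: dict
    drill_path: Tuple[str, ...] = ()
    result_refs: Tuple[str, ...] = ()


def apply_follow_up(state: AnalysisState, changes: dict, result_ref: str) -> AnalysisState:
    """Merge a follow-up's changes into the active request and extend the drill path."""
    merged = {**state.active_request, **changes}      # unchanged fields carry over
    step = tuple(changes.get("group_by", []))
    return replace(
        state,
        active_request=merged,
        drill_path=state.drill_path + step,
        result_refs=state.result_refs + (result_ref,),
    )


state = AnalysisState(
    active_request={"metric": "expense_variance", "period": "2024-Q2", "group_by": ["department"]},
    drill_path=("department",),
    result_refs=("result-1",),
)
# Follow-up: "break this down by location"
state = apply_follow_up(state, {"group_by": ["location"]}, "result-2")
print(state.active_request, state.drill_path)
```

The metric and period survive the follow-up untouched, and only the grouping changes.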

Key Design Principles

1. Use the LLM for Interpretation, Not Truth

The model should translate natural language into structured intent. It should not invent numbers, bypass permissions, or compute financial results independently.

2. Keep Outputs Structured

Structured requests and structured responses make the system testable, debuggable, and easier to integrate with existing product workflows.

3. Preserve Application Context

The best user experience comes when the assistant understands where the user is in the product and what data they are currently looking at.

4. Design for Follow-up Questions

Analytical workflows are iterative. The system needs to support follow-ups while preserving enough structured context to remain useful.

5. Make the System Auditable

In enterprise analytics, users need to trust how a result was produced. The system should make it possible to inspect the interpreted request, the filters applied, and the source data path behind the answer.
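One way to support that inspection is to store a record alongside each answer, as in the sketch below. The `build_audit_record` function and its field names are assumptions, as are the sample values.

```python
import json
from datetime import datetime, timezone


def build_audit_record(user_id: str, prompt: str, interpreted_request: dict,
                       applied_filters: dict, source_path: str) -> dict:
    """Capture how an answer was produced so it can be inspected later."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "prompt": prompt,                            # what the user typed
        "interpreted_request": interpreted_request,  # what the model produced
        "applied_filters": applied_filters,          # what the backend enforced
        "source_path": source_path,                  # where the numbers came from
    }


record = build_audit_record(
    user_id="user-123",
    prompt="Why is engineering spend above plan this quarter?",
    interpreted_request={"metric": "expense_variance", "group_by": ["gl_hierarchy"]},
    applied_filters={"tenant": "acme", "department": ["engineering"]},
    source_path="variance_service",  # hypothetical service name
)
print(json.dumps(record, indent=2))
```

Keeping the interpreted request next to the filters the backend actually applied makes it possible to see where a surprising answer diverged from the user's intent.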

What I Learned

AI becomes valuable in enterprise software when it is integrated into the workflow, not placed beside it. A chatbot is most useful when it becomes an interaction layer over real product capabilities.

The most important design decision is not which model to use. It is deciding where the model belongs in the system boundary.

In this architecture, the LLM is a translation layer between human intent and backend capability. That keeps the system powerful without giving up correctness, security, or control.

The deeper lesson was that conversational AI should not replace the product architecture. It should expose the architecture in a more natural way, while still relying on the same APIs, permissions, validations, and data contracts that make the product reliable.