
Case Study: Intelligent Knowledge Orchestration for a Leading Global Financial Institution

Domain: Generative AI · Knowledge Management · Banking Operations
Timeline: 8 months
Team: 6–8 specialists

Key Impact

Faster Knowledge Access: Document retrieval times reduced from hours to seconds, with average query response under 4 seconds across the full document corpus

Improved Accuracy: Generated summaries validated through random sampling audits and continuous offline evaluation against compliance-approved golden datasets

Reusable Architecture: A governed, extensible foundation now deployed across additional knowledge domains within the bank, including treasury, trade finance, and customer onboarding

Lower Compliance Risk: Eliminated the answer inconsistency that internal audit had flagged as a material control gap, reducing compliance and conduct risk

Challenge

A global financial institution with operations across multiple jurisdictions was struggling with a knowledge access problem that was quietly draining productivity across every line of business. Policy, compliance, product, and operational documents were spread across legacy intranet portals, SharePoint sites, shared drives, and team-specific wikis, each with its own taxonomy, permissions model, and update cadence.

Front-line bankers, relationship managers, and operations staff routinely needed authoritative answers to questions about lending policy, AML/KYC procedures, regulatory interpretation, or product eligibility. In practice they were forced to rely on memory, ping a senior colleague, or navigate four or five different systems to piece together an answer. Search inside those systems was almost universally keyword-based, returning long lists of partially relevant documents with no synthesis, no context, and no source ranking.

The consequences were measurable. Employees were losing 5–7 hours per week to information retrieval. Customer-facing turnaround times for non-standard enquiries stretched to days. Worse, the same question asked twice often produced two different answers, creating real compliance and conduct risk. The bank's internal audit team had flagged this inconsistency as a material control gap that needed remediation before the next regulatory review cycle.

Leadership had already evaluated several off-the-shelf enterprise search tools and at least one early-generation chatbot pilot. Both had failed: the search tools could not reason across documents, and the chatbot hallucinated answers in a regulated environment where being plausibly wrong is worse than being honestly unsure.

Solution

Get AI Ready designed and implemented a multi-agent Retrieval-Augmented Generation (RAG) platform on Databricks to unify knowledge access across the enterprise, purpose-built for regulated environments where every answer must be traceable to a source. The architecture used LangGraph-based orchestration to manage the entire query lifecycle: each incoming question was first classified by intent (policy lookup, compliance interpretation, product enquiry, advisory summary) and routed to a specialised retrieval agent tuned for that document type.

The retrieval layer combined dense vector search over fine-tuned financial embeddings with metadata filters drawn from Databricks Unity Catalog, ensuring that every retrieved chunk carried full lineage back to its source document, version, and owning team. A modular Vector Search adapter sat between the orchestration layer and the embedding store, allowing the bank to swap embedding models or storage backends without rewriting downstream logic. The embeddings themselves were fine-tuned on a curated corpus of the bank's own policy and regulatory language, materially improving recall for domain-specific terminology that generic foundation models routinely miss.

The synthesis layer adjusted its prompting strategy to the query intent: policy lookups used a strict extractive style that pulled verbatim passages with citations, while advisory summaries used a more flexible abstractive style with explicit confidence statements. Every generated response included inline source links, document versions, and a confidence score derived from the underlying retrieval scores.

Critically, automated evaluation pipelines using DeepEval faithfulness metrics were integrated with MLflow Evaluate. Every model output was scored for faithfulness (does the answer actually reflect the source?), relevance, and completeness against a golden dataset curated by the bank's compliance team.
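The classify-and-route step at the top of this pipeline can be sketched in plain Python. This is a minimal illustration of the pattern, not the production LangGraph code: the keyword rules stand in for the trained intent classifier, and the agent implementations are placeholders.

```python
# Illustrative sketch of intent classification and agent routing.
INTENTS = ("policy_lookup", "compliance_interpretation",
           "product_enquiry", "advisory_summary")

def classify_intent(question: str) -> str:
    """Toy rule-based classifier; production used a trained intent model."""
    q = question.lower()
    if any(k in q for k in ("aml", "kyc", "regulat")):
        return "compliance_interpretation"
    if "policy" in q:
        return "policy_lookup"
    if any(k in q for k in ("eligib", "product")):
        return "product_enquiry"
    return "advisory_summary"

def make_agent(corpus_name: str):
    """Build a placeholder retrieval agent bound to one document corpus."""
    def agent(question: str) -> list[dict]:
        # A real agent would run vector search over its corpus here.
        return [{"corpus": corpus_name, "question": question}]
    return agent

# One specialised retrieval agent per intent.
ROUTES = {intent: make_agent(intent) for intent in INTENTS}

def handle_query(question: str) -> dict:
    """Classify the question, route it, and return the agent's chunks."""
    intent = classify_intent(question)
    return {"intent": intent, "chunks": ROUTES[intent](question)}
```

In production this routing logic lived inside a LangGraph state graph, which also carried the downstream synthesis and evaluation steps.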
Outputs falling below threshold were flagged for human review before being shown to end users. This evaluation harness ran continuously in production, providing the audit trail the internal audit team had been asking for.
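As a sketch, the threshold gate and the retrieval-derived confidence score might look like the following. The threshold values, metric names, and function names here are illustrative, not the bank's actual settings.

```python
from statistics import mean

# Illustrative thresholds; the real values were agreed with compliance.
THRESHOLDS = {"faithfulness": 0.85, "relevance": 0.80, "completeness": 0.75}

def confidence_from_retrieval(scores: list[float]) -> float:
    """Collapse chunk-level retrieval scores into one response confidence."""
    return round(mean(scores), 3) if scores else 0.0

def gate_response(metrics: dict) -> str:
    """Release the answer, or flag it for human review before users see it."""
    failing = [name for name, floor in THRESHOLDS.items()
               if metrics.get(name, 0.0) < floor]
    return "human_review" if failing else "release"
```

Logging each gating decision alongside the metric scores is what turns this into an audit trail: every released answer carries the evidence that it passed.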
[Figure: RAG workflow architecture for the financial knowledge platform, showing data flow, governance, and evaluation processes]
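The modular Vector Search adapter in this architecture is essentially an interface contract between the orchestration layer and the embedding store. A minimal sketch, where the method names and the toy in-memory backend are illustrative rather than the production API:

```python
from abc import ABC, abstractmethod

class VectorStoreAdapter(ABC):
    """Uniform contract so embedding models or storage backends can be
    swapped without touching downstream orchestration logic."""

    @abstractmethod
    def upsert(self, doc_id: str, embedding: list[float], metadata: dict) -> None: ...

    @abstractmethod
    def search(self, embedding: list[float], k: int, filters: dict) -> list[dict]: ...

class InMemoryAdapter(VectorStoreAdapter):
    """Toy backend used only to demonstrate the adapter contract."""

    def __init__(self):
        self._rows: dict[str, tuple[list[float], dict]] = {}

    def upsert(self, doc_id, embedding, metadata):
        self._rows[doc_id] = (embedding, metadata)

    def search(self, embedding, k, filters):
        def sim(vec):  # dot-product similarity, kept simple for the sketch
            return sum(a * b for a, b in zip(embedding, vec))
        hits = [
            {"doc_id": doc_id, "score": sim(vec), **meta}
            for doc_id, (vec, meta) in self._rows.items()
            if all(meta.get(key) == val for key, val in filters.items())
        ]
        return sorted(hits, key=lambda h: h["score"], reverse=True)[:k]
```

Because every retrieved row carries its metadata (source document, version, owning team), the lineage requirement falls out of the same interface that serves the similarity search.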

Technology Stack

Databricks
LangGraph
MLflow Evaluate
Unity Catalog
Vector Search
DeepEval
Transformer APIs

Ready for Similar Results?

Let's discuss how we can help transform your organisation's data and AI capabilities.
