Oracle AI Developer Hub

Agentic RAG with
langchain-oracledb

Multi-agent retrieval augmented generation powered by Oracle AI Database, local LLMs via Ollama, and the A2A protocol

Oracle AI Vector Search Chain of Thought A2A Protocol Local LLM Inference

01 / 12

The Challenge

Traditional RAG
Falls Short

Simple vector similarity search retrieves text chunks — but it doesn't reason. Complex questions need planning, multi-step research, and synthesis across heterogeneous knowledge sources.

Single-hop retrieval misses multi-step reasoning chains
No awareness of document type — PDFs, code, and web pages are treated identically
Context window limits force lossy truncation
No audit trail or explainability of the retrieval process

Traditional RAG

        Query → Embed → Top-K → Concatenate → Generate

        ✗ No planning

        ✗ No reasoning

        ✗ No synthesis

        ✗ No source routing

Agentic RAG

        Query → Plan → Research → Reason → Synthesize

        ✓ Multi-step planning

        ✓ Deep reasoning

        ✓ Cross-source synthesis

        ✓ Intelligent routing

02 / 12

System Architecture

End-to-End Pipeline

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#f1f5f9','primaryBorderColor':'#C74634','lineColor':'#64748b','secondaryColor':'#1a2235','tertiaryColor':'#111827','background':'#111827','mainBkg':'#1a2235','nodeBorder':'#475569','clusterBkg':'#0d1117','clusterBorder':'#334155','titleColor':'#d4a853','edgeLabelBackground':'#111827','nodeTextColor':'#f1f5f9'}}}%%
graph LR
    subgraph INPUT["📄 Document Ingestion"]
        PDF["PDF
Docling"]
        WEB["Web
Trafilatura"]
        CODE["Code
Gitingest"]
    end

    subgraph ORACLE["🗄️ Oracle AI Database"]
        SPLIT["OracleTextSplitter
Normalize & Chunk"]
        EMBED["OracleEmbeddings
ALL_MINILM_L12_V2"]
        VS["OracleVS
Vector Store"]
        EVENTS["Event Logger
6 Event Tables"]
    end

    subgraph AGENTS["🤖 Agent Chain of Thought"]
        PLAN["Planner"]
        RES["Researcher"]
        REA["Reasoner"]
        SYN["Synthesizer"]
    end

    subgraph UI["💬 Interfaces"]
        GRAD["Gradio"]
        OWUI["Open WebUI"]
        API["FastAPI"]
        CLI["CLI"]
    end

    PDF --> SPLIT
    WEB --> SPLIT
    CODE --> SPLIT
    SPLIT --> EMBED
    EMBED --> VS
    VS --> RES
    PLAN --> RES
    RES --> REA
    REA --> SYN
    SYN --> GRAD
    SYN --> OWUI
    SYN --> API
    SYN --> CLI
    VS --> EVENTS

    style INPUT fill:#1a0a0a,stroke:#C74634,stroke-width:2px
    style ORACLE fill:#0d1117,stroke:#d4a853,stroke-width:2px
    style AGENTS fill:#0a1a2e,stroke:#3b82f6,stroke-width:2px
    style UI fill:#0a1a0a,stroke:#14b8a6,stroke-width:2px

03 / 12

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#f1f5f9','lineColor':'#64748b','secondaryColor':'#1a2235','tertiaryColor':'#111827','background':'#111827','mainBkg':'#1a2235','nodeTextColor':'#f1f5f9'}}}%%
graph TB
    LCO["langchain-oracledb"]
    LCO --> OVS["OracleVS
Vector Store"]
    LCO --> OEM["OracleEmbeddings
In-DB Embeddings"]
    LCO --> OTS["OracleTextSplitter
Server-side Chunking"]
    OVS --> SIM["Similarity Search"]
    OVS --> META["Metadata Filtering"]
    OEM --> MODEL["ALL_MINILM_L12_V2"]
    OTS --> NORM["Text Normalization"]

    style LCO fill:#C74634,stroke:#C74634,color:#fff
    style OVS fill:#1a2235,stroke:#d4a853
    style OEM fill:#1a2235,stroke:#d4a853
    style OTS fill:#1a2235,stroke:#d4a853

Core Integration

langchain-oracledb

Three LangChain components that bring Oracle AI Database's native vector operations into the RAG pipeline — embedding generation, text splitting, and similarity search all execute server-side.

# Configuration in config.yaml
ORACLE_EMBEDDINGS_PARAMS:
  provider: "database"
  model: "ALL_MINILM_L12_V2"

# In-database embedding — no external API calls
embeddings = OracleEmbeddings(
    conn=connection,
    params={"provider": "database",
            "model": "ALL_MINILM_L12_V2"}
)

# Server-side text normalization & chunking
splitter = OracleTextSplitter(
    conn=connection,
    params={"normalize": "all"}
)

04 / 12

Vector Storage

Four Specialized Collections

Each content type gets its own vector space — enabling targeted retrieval and collection-specific metadata enrichment.

PDF

PDFCOLLECTION

Docling extraction → Markdown export → Server-side chunking → Document ID tracking

Web

WEBCOLLECTION

Trafilatura extraction → Rich metadata (author, date, tags) → Special URL handling

Repository

REPOCOLLECTION

Gitingest extraction → Code structure preservation → File-level metadata

General

GENERALCOLLECTION

General knowledge base → Cross-domain queries → Fallback collection

05 / 12

Chain of Thought

Four Specialized
Agents

Instead of a monolithic LLM call, the system decomposes reasoning into four distinct roles — each agent focuses on what it does best.

Planner Decomposes query into 3-4 actionable steps

Researcher Queries vector collections per step, gathers context

Reasoner Applies logical analysis to each research finding

Synthesizer Combines all reasoning into a coherent final answer

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#f1f5f9','lineColor':'#64748b','secondaryColor':'#1a2235','tertiaryColor':'#111827','background':'#111827','mainBkg':'#1a2235','nodeTextColor':'#f1f5f9'}}}%%
graph TB
    Q["🔍 User Query"]
    P["🗺️ Planner
3-4 steps"]
    R1["📚 Researcher
Step 1"]
    R2["📚 Researcher
Step 2"]
    R3["📚 Researcher
Step 3"]
    L1["🧠 Reasoner
Analysis 1"]
    L2["🧠 Reasoner
Analysis 2"]
    L3["🧠 Reasoner
Analysis 3"]
    S["✨ Synthesizer
Final Answer"]

    Q --> P
    P --> R1
    P --> R2
    P --> R3
    R1 --> L1
    R2 --> L2
    R3 --> L3
    L1 --> S
    L2 --> S
    L3 --> S

    style Q fill:#C74634,stroke:#C74634,color:#fff
    style P fill:#1a2235,stroke:#C74634
    style R1 fill:#1a2235,stroke:#3b82f6
    style R2 fill:#1a2235,stroke:#3b82f6
    style R3 fill:#1a2235,stroke:#3b82f6
    style L1 fill:#1a2235,stroke:#a78bfa
    style L2 fill:#1a2235,stroke:#a78bfa
    style L3 fill:#1a2235,stroke:#a78bfa
    style S fill:#1a2235,stroke:#14b8a6

06 / 12

Interoperability

Agent-to-Agent
Protocol

JSON-RPC 2.0 based protocol enabling distributed agent deployment. Each agent can run on a separate server — scale independently, upgrade individually.

document.query agent.discover task.create reasoning.execute health.check

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#f1f5f9','lineColor':'#64748b','secondaryColor':'#1a2235','tertiaryColor':'#111827','background':'#111827','mainBkg':'#1a2235','nodeTextColor':'#f1f5f9'}}}%%
graph LR
    USER["User
Gradio / Open WebUI"]
    ORCH["A2A Orchestrator
localhost:8000"]
    PA["Planner Agent
server1:8001"]
    RA["Researcher Agent
server2:8002"]
    REA["Reasoner Agent
server3:8003"]
    SA["Synthesizer Agent
server4:8004"]
    DB[("Oracle AI
Database")]

    USER -->|"POST /a2a"| ORCH
    ORCH -->|"agent.query"| PA
    PA -->|"plan"| ORCH
    ORCH -->|"agent.query"| RA
    RA -->|"findings"| ORCH
    RA -.->|"query_pdf_collection"| DB
    ORCH -->|"agent.query"| REA
    REA -->|"analysis"| ORCH
    ORCH -->|"agent.query"| SA
    SA -->|"final answer"| ORCH
    ORCH -->|response| USER

    style USER fill:#14b8a6,stroke:#14b8a6,color:#fff
    style ORCH fill:#C74634,stroke:#C74634,color:#fff
    style PA fill:#1a2235,stroke:#d4a853
    style RA fill:#1a2235,stroke:#3b82f6
    style REA fill:#1a2235,stroke:#a78bfa
    style SA fill:#1a2235,stroke:#14b8a6
    style DB fill:#1a2235,stroke:#d4a853,stroke-width:2px

07 / 12

Ingestion

Document
Processing
Pipeline

Three specialized processors handle different content types, all converging into Oracle's server-side text splitting and embedding pipeline.

Input → Extract → Split → Embed → Store

PDF Docling + OracleTextSplitter

converter = DocumentConverter()
result = converter.convert(file_path)
text = result.document.export_to_markdown()
chunks = splitter.split_text(text)

Web Trafilatura + Metadata Extraction

downloaded = fetch_url(url)
text = extract(downloaded)
metadata = extract_metadata(downloaded)
# author, date, categories, tags

Code Gitingest + Structure Preservation

repo = gitingest(repo_url)
# Preserves file structure & paths
# Code-aware chunking boundaries

08 / 12

Reasoning Ensemble

9 Strategies × 2 Modes

Each strategy available in both standalone and RAG-augmented variants

Strategy	Approach	RAG
`standard`	Direct response	✓
`cot`	Step-by-step reasoning	✓
`tot`	Parallel path exploration	✓
`react`	Reasoning + Acting	✓
`self_reflection`	Iterative critique	✓
`consistency`	Multi-sample voting	✓
`decomposed`	Sub-problem breakdown	✓
`least_to_most`	Progressive complexity	✓
`recursive`	Recursive decomposition	✓

Strategy Complexity Spectrum

09 / 12

Implementation Detail

OraDBVectorStore

The bridge between the RAG agents and Oracle AI Database. Manages four collection instances, sanitizes metadata, and includes a monkeypatch for dual-format metadata handling.

class OraDBVectorStore:
    def __init__(self, persist_directory, embedding_function):
        # Initialize 4 OracleVS instances
        self.pdf_store = OracleVS(
            client=conn, table_name="PDFCOLLECTION",
            embedding_function=oracle_emb,
            distance_strategy=DistanceStrategy.COSINE
        )
        self.web_store  = OracleVS(...)  # WEBCOLLECTION
        self.repo_store = OracleVS(...)  # REPOCOLLECTION
        self.gen_store  = OracleVS(...)  # GENERALCOLLECTION

    def query_pdf_collection(self, query, n_results=3):
        results = self.pdf_store.similarity_search_with_score(
            query, k=n_results
        )
        return [{
            "text": doc.page_content,
            "similarity": 1/(1 + dist),
            "metadata": doc.metadata
        } for doc, dist in results]

4

Vector Store Instances

API Surface

add_pdf_chunks()
add_web_chunks()
add_repo_chunks()
query_*_collection()
get_collection_count()
delete_documents()

Similarity Score

        score = 1 / (1 + distance)
      

Euclidean distance → normalized 0–1 similarity

10 / 12

Deployment

Run
Anywhere

💻

Local

python run_app.py --gradio

🐳

Docker

docker run --gpus all agentic-rag

☸️

Kubernetes

cd k8s && ./deploy.sh

Four User Interfaces

Gradio

Model management, document processing tabs, standard & CoT chat, A2A testing

Open WebUI

ChatGPT-like experience with 18 reasoning "models", streaming, history

REST API

FastAPI with OpenAI-compatible /v1/chat/completions endpoint

CLI

Interactive agent_cli.py for direct PDF/web processing and chat

Open WebUI Functions

RAG Filter RAG Pipe (6 models) Document Sync

11 / 12

Summary

Why This Architecture Wins

Oracle AI

Native vector ops at database scale. In-DB embeddings — no external API calls. Full event audit trail across 6 tables.

Multi-Agent

4 specialized agents via A2A protocol. 9 reasoning strategies × 2 modes. Distributed deployment ready.

Privacy-First

100% local inference with Ollama. No data leaves your infrastructure. 4 interfaces — choose your workflow.

4

Vector Collections

18

Reasoning Models

12

A2A Methods

6

Event Tables

langchain-oracledb Ollama FastAPI Gradio Open WebUI

github.com/oracle-devrel/oracle-ai-developer-hub → apps/agentic_rag

12 / 12

Agentic RAG withlangchain-oracledb

Traditional RAGFalls Short