Oracle AI Developer Hub

Agentic RAG with
Oracle AI Database

Multi-agent retrieval augmented generation powered by langchain-oracledb, local LLMs via Ollama, and the Agent-to-Agent protocol

Oracle AI Vector Search Chain of Thought A2A Protocol Local LLM Inference

slide 01 / 12

The Challenge

Traditional RAG
Falls Short

Simple vector similarity retrieves text chunks—but it doesn't reason. Complex questions demand planning, multi-step research, and synthesis across heterogeneous knowledge.

Single-hop retrieval misses multi-step reasoning chains
No awareness of document type—PDFs, code, and web treated identically
Context window limits force lossy truncation
No audit trail or explainability of retrieval

Traditional RAG

Query → Embed → Top-K → Concat → Generate
× No planning
× No reasoning
× No synthesis
× No source routing

Agentic RAG

Query → Plan → Research → Reason → Synthesize
✓ Multi-step planning
✓ Deep reasoning
✓ Cross-source synthesis
✓ Intelligent routing

slide 02 / 12

System Architecture

End-to-End Pipeline

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#F8FAFC','primaryBorderColor':'#C74634','lineColor':'#475569','secondaryColor':'#0f1423','tertiaryColor':'#0a0d14','background':'#0a0d14','mainBkg':'#0f1423','nodeBorder':'#334155','clusterBkg':'#080b12','clusterBorder':'#1e293b','titleColor':'#D4A853','edgeLabelBackground':'#0a0d14','nodeTextColor':'#F8FAFC'}}}%%
graph LR
    subgraph INPUT["Document Ingestion"]
        PDF["PDF
Docling"]
        WEB["Web
Trafilatura"]
        CODE["Code
Gitingest"]
    end

    subgraph ORACLE["Oracle AI Database"]
        SPLIT["OracleTextSplitter
Normalize & Chunk"]
        EMBED["OracleEmbeddings
ALL_MINILM_L12_V2"]
        VS["OracleVS
Vector Store"]
        EVENTS["Event Logger
6 Event Tables"]
    end

    subgraph AGENTS["Agent Chain of Thought"]
        PLAN["Planner"]
        RES["Researcher"]
        REA["Reasoner"]
        SYN["Synthesizer"]
    end

    subgraph UI["Interfaces"]
        GRAD["Gradio"]
        OWUI["Open WebUI"]
        API["FastAPI"]
        CLI["CLI"]
    end

    PDF --> SPLIT
    WEB --> SPLIT
    CODE --> SPLIT
    SPLIT --> EMBED
    EMBED --> VS
    VS --> RES
    PLAN --> RES
    RES --> REA
    REA --> SYN
    SYN --> GRAD
    SYN --> OWUI
    SYN --> API
    SYN --> CLI
    VS --> EVENTS

    style INPUT fill:#1a0a0a,stroke:#C74634,stroke-width:2px
    style ORACLE fill:#0d1117,stroke:#D4A853,stroke-width:2px
    style AGENTS fill:#0a0f1e,stroke:#818CF8,stroke-width:2px
    style UI fill:#0a1a14,stroke:#2DD4BF,stroke-width:2px

slide 03 / 12

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#F8FAFC','lineColor':'#475569','secondaryColor':'#0f1423','tertiaryColor':'#0a0d14','background':'#0a0d14','mainBkg':'#0f1423','nodeTextColor':'#F8FAFC'}}}%%
graph TB
    LCO["langchain-oracledb"]
    LCO --> OVS["OracleVS
Vector Store"]
    LCO --> OEM["OracleEmbeddings
In-DB Embeddings"]
    LCO --> OTS["OracleTextSplitter
Server-side Chunking"]
    OVS --> SIM["Similarity Search"]
    OVS --> META["Metadata Filtering"]
    OEM --> MODEL["ALL_MINILM_L12_V2"]
    OTS --> NORM["Text Normalization"]

    style LCO fill:#C74634,stroke:#C74634,color:#fff
    style OVS fill:#0f1423,stroke:#D4A853
    style OEM fill:#0f1423,stroke:#D4A853
    style OTS fill:#0f1423,stroke:#D4A853

Core Integration

langchain-oracledb

Three LangChain components bringing Oracle AI Database's native vector operations into the RAG pipeline—embedding, splitting, and similarity search all execute server-side.

# In-database embedding — no external API calls
embeddings = OracleEmbeddings(
    conn=connection,
    params={"provider": "database",
            "model": "ALL_MINILM_L12_V2"}
)

# Server-side text normalization & chunking
splitter = OracleTextSplitter(
    conn=connection,
    params={"normalize": "all"}
)

slide 04 / 12

Vector Storage

Four Specialized Collections

Each document type gets a dedicated vector store with optimized chunking strategies

📄

PDF

PDFCOLLECTION

Docling markdown extraction with structure-aware chunking

🌐

Web

WEBCOLLECTION

Trafilatura clean text extraction from web pages

💻

Repository

REPOCOLLECTION

Gitingest code ingestion preserving file structure

📚

General

GENERALCOLLECTION

Catch-all store for mixed or unclassified content

slide 05 / 12

Chain of Thought

Four Specialized
Agents

Instead of a monolithic LLM call, the system decomposes reasoning into four distinct roles—each focuses on what it does best.

Planner Decomposes query into 3-4 actionable steps

Researcher Queries vector collections per step, gathers context

Reasoner Applies logical analysis to each research finding

Synthesizer Combines all reasoning into a coherent final answer

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#F8FAFC','lineColor':'#475569','secondaryColor':'#0f1423','tertiaryColor':'#0a0d14','background':'#0a0d14','mainBkg':'#0f1423','nodeTextColor':'#F8FAFC'}}}%%
graph TB
    Q["User Query"]
    P["Planner
3-4 steps"]
    R1["Researcher
Step 1"]
    R2["Researcher
Step 2"]
    R3["Researcher
Step 3"]
    L1["Reasoner
Analysis 1"]
    L2["Reasoner
Analysis 2"]
    L3["Reasoner
Analysis 3"]
    S["Synthesizer
Final Answer"]

    Q --> P
    P --> R1
    P --> R2
    P --> R3
    R1 --> L1
    R2 --> L2
    R3 --> L3
    L1 --> S
    L2 --> S
    L3 --> S

    style Q fill:#C74634,stroke:#C74634,color:#fff
    style P fill:#0f1423,stroke:#C74634
    style R1 fill:#0f1423,stroke:#38BDF8
    style R2 fill:#0f1423,stroke:#38BDF8
    style R3 fill:#0f1423,stroke:#38BDF8
    style L1 fill:#0f1423,stroke:#818CF8
    style L2 fill:#0f1423,stroke:#818CF8
    style L3 fill:#0f1423,stroke:#818CF8
    style S fill:#0f1423,stroke:#2DD4BF

slide 06 / 12

Reasoning Ensemble

9 Strategies × 2 Modes

Each strategy available in both standalone and RAG-augmented variants—18 total reasoning models.

Strategy	Approach	RAG
`standard`	Direct response	✓
`cot`	Step-by-step reasoning	✓
`tot`	Parallel path exploration	✓
`react`	Reasoning + Acting	✓
`self_reflection`	Iterative critique	✓
`consistency`	Multi-sample voting	✓
`decomposed`	Sub-problem breakdown	✓
`least_to_most`	Progressive complexity	✓
`recursive`	Recursive decomposition	✓

Reasoning Depth Spectrum

slide 07 / 12

Ingestion

Document Processing Pipeline

Three specialized processors handle different content types, all converging into Oracle's server-side splitting and embedding pipeline.

Input → Extract → Split → Embed → Store

PDF Docling

converter = DocumentConverter()
result = converter.convert(path)
text = result.document.export_to_markdown()
chunks = splitter.split_text(text)

Web Trafilatura

text = trafilatura.extract(
    trafilatura.fetch_url(url),
    include_tables=True,
    include_links=True
)

Code Gitingest

repo = gitingest(repo_url)
# Preserves file structure
# Code-aware chunking
# Path metadata retained

slide 08 / 12

Interoperability

Agent-to-Agent Protocol

document.query agent.discover task.create reasoning.execute health.check

%%{init: {'theme':'dark','themeVariables':{'primaryColor':'#C74634','primaryTextColor':'#F8FAFC','lineColor':'#475569','secondaryColor':'#0f1423','tertiaryColor':'#0a0d14','background':'#0a0d14','mainBkg':'#0f1423','nodeTextColor':'#F8FAFC'}}}%%
graph LR
    USER["User
Gradio / Open WebUI"]
    ORCH["A2A Orchestrator
localhost:8000"]
    PA["Planner Agent
server1:8001"]
    RA["Researcher Agent
server2:8002"]
    REA["Reasoner Agent
server3:8003"]
    SA["Synthesizer Agent
server4:8004"]
    DB[("Oracle AI
Database")]

    USER -->|"POST /a2a"| ORCH
    ORCH -->|"agent.query"| PA
    PA -->|"plan"| ORCH
    ORCH -->|"agent.query"| RA
    RA -->|"findings"| ORCH
    RA -.->|"query collections"| DB
    ORCH -->|"agent.query"| REA
    REA -->|"analysis"| ORCH
    ORCH -->|"agent.query"| SA
    SA -->|"final answer"| ORCH
    ORCH -->|response| USER

    style USER fill:#2DD4BF,stroke:#2DD4BF,color:#0a0d14
    style ORCH fill:#C74634,stroke:#C74634,color:#fff
    style PA fill:#0f1423,stroke:#D4A853
    style RA fill:#0f1423,stroke:#38BDF8
    style REA fill:#0f1423,stroke:#818CF8
    style SA fill:#0f1423,stroke:#2DD4BF
    style DB fill:#0f1423,stroke:#D4A853,stroke-width:2px

JSON-RPC 2.0 based — each agent runs on a separate server, scales independently, upgrades individually

slide 09 / 12

Deployment

Run
Anywhere

💻

Local

python run_app.py --gradio

🐳

Docker

docker run --gpus all agentic-rag

☸️

Kubernetes

cd k8s && ./deploy.sh

Four User Interfaces

Gradio

Model management, document processing, standard & CoT chat, A2A testing

Open WebUI

ChatGPT-like experience with 18 reasoning "models", streaming, history

REST API

FastAPI with OpenAI-compatible /v1/chat/completions endpoint

CLI

Interactive agent_cli.py for direct PDF/web processing and chat

Open WebUI Functions

RAG Filter RAG Pipe (6 models) Document Sync

slide 10 / 12

Metrics

By the Numbers

4

Vector Collections

18

Reasoning Models

12

A2A Methods

6

Event Tables

Oracle AI

Native vector ops at database scale. In-DB embeddings—zero external API calls. Full event audit trail.

Multi-Agent

4 specialized agents via A2A. 9 reasoning strategies × 2 modes. Distributed deployment ready.

Privacy-First

100% local inference with Ollama. No data leaves your infrastructure. 4 interfaces—your choice.

slide 11 / 12

Build Smarter RAG with
Oracle AI Database

Enterprise vector search, multi-agent reasoning, and complete privacy—all in a single open-source stack.

langchain-oracledb Ollama FastAPI Gradio Open WebUI

github.com/oracle-devrel/oracle-ai-developer-hub → apps/agentic_rag

slide 12 / 12

Agentic RAG with Oracle AI Database

Traditional RAGFalls Short