LANGCHAIN + ORACLE DATABASE

langchain-oracledb

The integration layer bridging LangChain agents to Oracle AI Vector Search

Oracle AI Developer Hub · 2026

01

The Bridge

Four components, one seamless integration

At a Glance

langchain-oracledb by the Numbers

4
Core Components
VectorStore, Embeddings, TextSplitter, DocLoader
384
Embedding Dimensions
ALL_MINILM_L12_V2 via ONNX
3
Distance Strategies
Cosine, Euclidean, Dot Product
3
Search Modes
Keyword, Semantic, Hybrid

Architecture

LangChain ↔ langchain-oracledb ↔ Oracle Database

graph LR
  subgraph LangChain["LangChain Framework"]
    Chain["Chains / Agents"]
    Retriever["Retriever"]
    LLM["LLM Provider"]
  end

  subgraph LCO["langchain-oracledb"]
    OVS["OracleVS"]
    OEmb["OracleEmbeddings"]
    OTS["OracleTextSplitter"]
    ODL["OracleDocLoader"]
    HYB["HybridSearchRetriever"]
  end

  subgraph OracleDB["Oracle Database"]
    VT["VECTOR Columns"]
    IDX["Vector Indexes"]
    ONNX["ONNX Runtime"]
    CTX["Oracle Text"]
  end

  Chain -->|queries| Retriever
  Retriever -->|similarity_search| OVS
  Retriever -->|hybrid_search| HYB
  Chain -->|embed| OEmb
  ODL -->|load| OTS
  OTS -->|chunks| OEmb
  OEmb -->|vectors| OVS
  OVS -->|SQL| VT
  OVS -->|create_index| IDX
  OEmb -->|in-DB| ONNX
  HYB -->|keyword + semantic| CTX

  classDef lc fill:#0891b222,stroke:#0891b2,stroke-width:2px
  classDef bridge fill:#C7463422,stroke:#C74634,stroke-width:2px
  classDef db fill:#d4a73a22,stroke:#d4a73a,stroke-width:2px

  class Chain,Retriever,LLM lc
  class OVS,OEmb,OTS,ODL,HYB bridge
  class VT,IDX,ONNX,CTX db
      

Components

Four Classes, Complete Coverage

  • OracleVS — Vector store with native VECTOR type, similarity search, and index management
  • OracleEmbeddings — In-database embedding generation via ONNX models, no external API calls
  • OracleTextSplitter — Server-side text splitting by chars, words, or sentences with normalization
  • OracleDocLoader — Load documents directly from Oracle Database tables
[Diagram: LangChain ↔ langchain-oracledb (OracleVS, OracleEmbeddings, OracleTextSplitter, OracleDocLoader) ↔ Oracle Database]

Embeddings

In-Database Embedding Generation

embeddings.py
from langchain_oracledb.embeddings import OracleEmbeddings

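# Load an ONNX model into Oracle Database.
# Arguments are positional: connection, Oracle directory object,
# ONNX file name, in-database model name.
# "DEMO_PY_DIR" and the .onnx file name are placeholders for your setup.
OracleEmbeddings.load_onnx_model(
    connection,
    "DEMO_PY_DIR",
    "all_MiniLM_L12_v2.onnx",
    "doc_model",
)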

# Create embeddings instance (runs in-DB)
embeddings = OracleEmbeddings(
    conn=connection,
    params={
        "provider": "database",
        "model": "doc_model"
    }
)

vectors = embeddings.embed_documents(["Hello world"])
# Returns one 384-dimensional vector per input text,
# e.g. [[0.023, -0.114, ...]] (values illustrative)
02

Vector Storage & Search

Native VECTOR type, three distance strategies

Vector Store

OracleVS — Store, Index, Search

vector_store.py
from langchain_oracledb.vectorstores import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy

# Option A: attach a vector store to a table over an open connection
vs = OracleVS(
    client=connection,
    embedding_function=embeddings,
    table_name="DOCUMENTS",
    distance_strategy=DistanceStrategy.COSINE
)

# Option B: embed and store documents in one call
# (docs is a list of LangChain Document objects prepared earlier)
vs = OracleVS.from_documents(
    docs, embeddings, client=connection,
    table_name="DOCUMENTS",
    distance_strategy=DistanceStrategy.COSINE
)

# Search: returns top-k similar documents
results = vs.similarity_search("vector indexing", k=5)

Comparison

Distance Strategies

Strategy | Enum | Best For | Normalized | Range
Cosine | COSINE | Semantic similarity, text matching | Yes | [0, 2]
Euclidean | EUCLIDEAN_DISTANCE | Absolute distance, clustering | No | [0, ∞)
Dot Product | DOT_PRODUCT | Magnitude-aware ranking | No | (-∞, ∞)
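
The ranges in the table can be checked with a few lines of plain Python. This is a minimal sketch of the three metrics themselves, not of how Oracle Database computes them; the helper names and sample vectors are illustrative.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, 2 for opposite."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / norm

def euclidean_distance(a, b):
    """Straight-line distance: never negative, unbounded above."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 0.0]
b = [2.0, 0.0]    # same direction as a, twice the magnitude
c = [-1.0, 0.0]   # opposite direction to a

print(cosine_distance(a, b))     # 0.0 (magnitude is ignored)
print(cosine_distance(a, c))     # 2.0 (opposite direction)
print(euclidean_distance(a, b))  # 1.0 (magnitude matters)
print(dot(a, b), dot(a, c))      # 2.0 -1.0
```

Cosine ignores vector length, which is why it suits text similarity; Euclidean and dot product both react to magnitude.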
03

Text Processing

Server-side splitting and document loading

OracleTextSplitter

Server-Side Chunking

  • Split by chars, words, or sentences
  • Configurable max size per split mode
  • Built-in text normalization (all, nfkc, nfd)
  • Runs inside Oracle DB — no data movement
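
To make the split modes concrete, here is a local, pure-Python approximation of the behavior listed above. It is only a sketch: the real splitting runs inside the database, and the function name, parameter names, and sentence regex below are assumptions for illustration. The "all" normalization mode is omitted.

```python
import re
import unicodedata

def split_text(text, by="words", max_size=5, normalize="nfkc"):
    """Illustrative splitter: chunk text by chars, words, or sentences."""
    # Optional Unicode normalization before splitting.
    if normalize == "nfkc":
        text = unicodedata.normalize("NFKC", text)
    elif normalize == "nfd":
        text = unicodedata.normalize("NFD", text)

    if by == "chars":
        units, joiner = list(text), ""
    elif by == "words":
        units, joiner = text.split(), " "
    elif by == "sentences":
        # Naive sentence boundary: whitespace after ., ! or ?
        units = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        joiner = " "
    else:
        raise ValueError(f"unknown split mode: {by}")

    # Group consecutive units into chunks of at most max_size units.
    return [joiner.join(units[i:i + max_size])
            for i in range(0, len(units), max_size)]

sample = "Oracle stores vectors. LangChain queries them. Results come back ranked."
print(split_text(sample, by="sentences", max_size=2))
print(split_text(sample, by="words", max_size=4))
```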

OracleDocLoader

Direct DB Loading

  • Load documents from Oracle tables
  • Preserves metadata columns as Document.metadata
  • SQL-based filtering at source
  • Integrates with LangChain DocumentLoader interface

Data Pipeline

Document Ingestion Flow

01
Load
OracleDocLoader
reads from DB tables
02
Split
OracleTextSplitter
chunks by sentence
03
Embed
OracleEmbeddings
ONNX in-DB vectors
04
Store
OracleVS
VECTOR columns + index
05
Search
similarity_search
or hybrid retrieval

All steps execute inside Oracle Database — zero data movement between services
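
The five stages can be traced end to end with an in-memory stand-in. This sketch uses no Oracle or LangChain code: the bag-of-words "embedding", the list-based store, and all function names are assumptions chosen to show how the stages hand data to each other.

```python
import math
from collections import Counter

def load(rows):
    """Stage 1: turn (id, text) rows into documents with metadata."""
    return [{"text": text, "metadata": {"id": row_id}} for row_id, text in rows]

def split(docs):
    """Stage 2: one chunk per sentence, carrying the parent metadata."""
    chunks = []
    for doc in docs:
        for sentence in doc["text"].split(". "):
            sentence = sentence.strip().rstrip(".")
            if sentence:
                chunks.append({"text": sentence, "metadata": doc["metadata"]})
    return chunks

def embed(text):
    """Stage 3: toy bag-of-words vector (stand-in for the ONNX model)."""
    return Counter(text.lower().split())

def store(chunks):
    """Stage 4: keep (vector, chunk) pairs (stand-in for a VECTOR column)."""
    return [(embed(chunk["text"]), chunk) for chunk in chunks]

def cosine_similarity(u, v):
    dot = sum(u[t] * v[t] for t in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def search(index, query, k=1):
    """Stage 5: rank stored chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine_similarity(q, pair[0]),
                    reverse=True)
    return [chunk for _, chunk in ranked[:k]]

rows = [
    (1, "Refunds are issued within 30 days. Shipping is free over 50 dollars."),
    (2, "Vector indexes speed up similarity search."),
]
index = store(split(load(rows)))
top = search(index, "refunds within 30 days")
print(top[0]["text"], top[0]["metadata"])
```

In the library, each stand-in maps to one class: OracleDocLoader, OracleTextSplitter, OracleEmbeddings, and OracleVS.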

04

Hybrid Search

Keyword + semantic in a single query

Hybrid Architecture

OracleHybridSearchRetriever

graph TD
  Q["User Query"] --> HSR["HybridSearchRetriever"]
  HSR --> KW["Keyword Search"]
  HSR --> SEM["Semantic Search"]
  KW --> CTX["Oracle Text Index"]
  SEM --> VEC["Vector Index"]
  CTX --> SCORE["Score Fusion"]
  VEC --> SCORE
  SCORE --> RES["Ranked Results"]
  RES --> META["Combined Scores"]

  classDef query fill:#0891b222,stroke:#0891b2,stroke-width:2px
  classDef search fill:#C7463422,stroke:#C74634,stroke-width:2px
  classDef index fill:#d4a73a22,stroke:#d4a73a,stroke-width:2px
  classDef result fill:#34d39922,stroke:#34d399,stroke-width:2px

  class Q query
  class HSR,KW,SEM search
  class CTX,VEC index
  class SCORE,RES,META result
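
The "Score Fusion" box merges the keyword and semantic result lists into one ranking. The deck does not say which fusion method is used, so the sketch below shows reciprocal rank fusion (RRF), one common choice, purely as an illustration; the document IDs and the constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    the scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

keyword_hits = ["doc_refund", "doc_returns", "doc_shipping"]
semantic_hits = ["doc_refund", "doc_warranty", "doc_returns"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

A document that ranks well in both lists rises to the top, while one that appears in only a single list still gets partial credit.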
      

Hybrid Search

Keyword + Semantic in One Call

hybrid_search.py
from langchain_oracledb.retrievers.hybrid_search import (
    OracleHybridSearchRetriever,
    OracleVectorizerPreference, create_hybrid_index
)

# Create vectorizer preference and hybrid index
pref = OracleVectorizerPreference.create_preference(
    vector_store=vs, preference_name="PREF_DOCS"
)
create_hybrid_index(connection, "IDX_HYBRID", pref)

# Build retriever with score breakdown
retriever = OracleHybridSearchRetriever(
    vector_store=vs,
    idx_name="IDX_HYBRID",
    search_mode="hybrid",  # "hybrid" | "keyword" | "semantic"
    k=5, return_scores=True
)

docs = retriever.invoke("refund policy")

Agentic RAG

Powering Multi-Agent Retrieval

  • OraDBVectorStore wraps OracleVS into 4 collection types: PDF, Web, Repo, General
  • ResearchAgent calls similarity_search per step of the reasoning plan
  • 9 reasoning strategies run in parallel, each backed by RAG context from OracleVS
  • A2A protocol routes queries across distributed agents, all sharing the same vector store
[Diagram: Planner, Research, Reason, Synth, OracleVS]

Collections

OraDBVectorStore Collections

Collection | Table Name | Source | Ingest Method
PDF | PDF_COLLECTION | Uploaded PDF documents | add_pdf_chunks()
Web | WEB_COLLECTION | Scraped web pages | add_web_chunks()
Repository | REPO_COLLECTION | Git repository content | add_repo_chunks()
General | GEN_COLLECTION | General knowledge base | add_general_chunks()
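
One way a wrapper could route chunks to per-collection tables is sketched below. This is not the OraDBVectorStore implementation: CollectionRouter, store_factory, and the list-backed stores are hypothetical, and only the table and method names come from the table above.

```python
class CollectionRouter:
    """Hypothetical wrapper: one store per collection, keyed by table name."""

    TABLES = {
        "pdf": "PDF_COLLECTION",
        "web": "WEB_COLLECTION",
        "repo": "REPO_COLLECTION",
        "general": "GEN_COLLECTION",
    }

    def __init__(self, store_factory):
        # store_factory(table_name) returns a store; here, a plain list.
        self.stores = {
            table: store_factory(table) for table in self.TABLES.values()
        }

    def add_chunks(self, collection, chunks):
        """Append chunks to the store behind the named collection."""
        table = self.TABLES[collection]
        self.stores[table].extend(chunks)
        return table

    # Thin per-collection helpers, mirroring two of the ingest methods.
    def add_pdf_chunks(self, chunks):
        return self.add_chunks("pdf", chunks)

    def add_web_chunks(self, chunks):
        return self.add_chunks("web", chunks)


router = CollectionRouter(store_factory=lambda table: [])
print(router.add_pdf_chunks(["page 1 text", "page 2 text"]))  # PDF_COLLECTION
print(len(router.stores["PDF_COLLECTION"]))                   # 2
```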
The best vector database is the one you already trust with your data.
— Oracle AI Vector Search philosophy

langchain-oracledb

One Library. Full Pipeline.
Zero Data Movement.

pip install langchain-oracledb

Oracle AI Developer Hub · github.com/jasperan