Agentic RAG — Project Recap Dashboard

02

Architecture Map

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a2040', 'primaryTextColor': '#c8d6e5', 'primaryBorderColor': '#4fc3f7', 'lineColor': '#4fc3f7', 'secondaryColor': '#161b33', 'tertiaryColor': '#111529', 'edgeLabelBackground': '#0c0f1d', 'clusterBkg': '#111529', 'clusterBorder': '#1e2a4a', 'nodeTextColor': '#c8d6e5' }}}%% graph TB subgraph INPUT["INPUT LAYER"] direction LR PDF["PDF Documents
Docling"] WEB["Web Pages
Trafilatura"] CODE["Code Repos
Gitingest"] end subgraph PROCESSING["PROCESSING PIPELINE"] direction LR SPLIT["OracleTextSplitter
Recursive chunking"] EMBED["OracleEmbeddings
all-MiniLM-L6-v2"] VS["OracleVS
Vector store"] end subgraph STORAGE["ORACLE AI DATABASE"] direction LR C1["PDFCOLLECTION"] C2["CODECOLLECTION"] C3["WEBCOLLECTION"] C4["LOCALCOLLECTION"] ET["6 Event Tables
API | MODEL | DOC
INGEST | QUERY | A2A"] end subgraph INTELLIGENCE["INTELLIGENCE LAYER"] AGENT["LocalRAGAgent
Core orchestrator"] FACTORY["Agent Factory"] PLAN["Planner Agent"] RES["Researcher Agent"] REAS["Reasoner Agent"] SYN["Synthesizer Agent"] ENSEMBLE["RAGReasoningEnsemble
9 strategies x 2 modes"] end subgraph OUTPUT["OUTPUT INTERFACES"] direction LR GR["Gradio UI
:7860"] OW["Open WebUI
18 models"] FA["FastAPI
OpenAI-compat"] CLI["CLI
Terminal"] A2A["A2A Handler
JSON-RPC 2.0"] end PDF --> SPLIT WEB --> SPLIT CODE --> SPLIT SPLIT --> EMBED --> VS VS --> C1 VS --> C2 VS --> C3 VS --> C4 AGENT --> FACTORY FACTORY --> PLAN FACTORY --> RES FACTORY --> REAS FACTORY --> SYN AGENT --> ENSEMBLE C1 & C2 & C3 & C4 --> AGENT AGENT --> ET AGENT --> GR AGENT --> OW AGENT --> FA AGENT --> CLI AGENT --> A2A style INPUT fill:#111529,stroke:#4fc3f7,stroke-width:1px style PROCESSING fill:#111529,stroke:#7c4dff,stroke-width:1px style STORAGE fill:#111529,stroke:#ef5350,stroke-width:1px style INTELLIGENCE fill:#111529,stroke:#66bb6a,stroke-width:1px style OUTPUT fill:#111529,stroke:#ffa726,stroke-width:1px

03

Dependency Graph

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a2040', 'primaryTextColor': '#c8d6e5', 'primaryBorderColor': '#4fc3f7', 'lineColor': '#253358', 'secondaryColor': '#161b33', 'tertiaryColor': '#111529', 'nodeTextColor': '#c8d6e5' }}}%% graph LR APP(("Agentic
RAG")) subgraph CORE["CORE"] LC["langchain-oracledb"] ODB["oracledb"] AR["agent-reasoning"] OL["ollama"] end subgraph INGEST["INGESTION"] DOC["docling"] TRA["trafilatura"] GIT["gitingest"] end subgraph IFACE["INTERFACES"] FAPI["fastapi"] GRAD["gradio"] OWUI["open-webui"] end subgraph FALLBACK["FALLBACKS"] CHR["chromadb"] ST["sentence-transformers"] end APP --> LC & ODB & AR & OL APP --> DOC & TRA & GIT APP --> FAPI & GRAD & OWUI APP -.-> CHR & ST style APP fill:#4fc3f7,stroke:#4fc3f7,color:#0c0f1d,font-weight:bold style CORE fill:#111529,stroke:#4fc3f7,stroke-width:1px style INGEST fill:#111529,stroke:#66bb6a,stroke-width:1px style IFACE fill:#111529,stroke:#7c4dff,stroke-width:1px style FALLBACK fill:#111529,stroke:#ffa726,stroke-width:1px,stroke-dasharray:5 5

LC

langchain-oracledb

Oracle AI Vector Search integration for LangChain. Provides OracleVS, OracleEmbeddings, OracleTextSplitter.

core database

DB

oracledb

Python driver for Oracle Database. Thick/thin client for connection pooling and session management.

database

AR

agent-reasoning

9 reasoning strategies library: CoT, ToT, ReAct, MCTS, R1, Beam Search, Self-Consistency, PRM, Meta-Reasoning.

core

OL

ollama

Local LLM inference server. Default model: gemma3:270m. No cloud dependency.

core

DL

docling

PDF document processing with structure extraction. Handles tables, headers, and metadata.

ingestion

TF

trafilatura

Web content extraction. Cleans HTML to readable text with metadata preservation.

ingestion

GI

gitingest

Code repository ingestion. Processes Git repos into document chunks for RAG indexing.

ingestion

FA

fastapi

High-performance API framework. Serves OpenAI-compatible endpoints and A2A protocol.

interface

GR

gradio

Interactive web UI with tabbed interface for querying, uploading documents, and monitoring.

interface

WU

open-webui

Full-featured chat UI. Connects via OpenAI-compatible API exposing 18 reasoning models.

interface

CR

chromadb

Fallback vector store when Oracle DB is unavailable. Same interface abstraction.

fallback

ST

sentence-transformers

Fallback embedding model. Used when OracleEmbeddings is not configured.

fallback

04

API Surface

Method	Endpoint	Description	Protocol
POST	/upload/pdf	Upload and process PDF documents via Docling. Chunks, embeds, and stores in PDFCOLLECTION.	REST
POST	/query	Execute a RAG query with optional reasoning strategy and collection targeting.	REST
POST	/a2a	Agent-to-Agent protocol endpoint. Accepts JSON-RPC 2.0 messages for inter-agent communication.	A2A / JSON-RPC
GET	/agent_card	Returns the agent capability card describing skills, supported methods, and metadata.	A2A
POST	/v1/chat/completions	OpenAI-compatible chat completions. Routes to reasoning models via model name.	OpenAI
GET	/v1/models	Lists all 18 available reasoning models. Each maps to a strategy + mode combination.	OpenAI
GET	/events/statistics	Aggregated event monitoring. Counts, average response times, error rates across all tables.	REST
GET	/events/{type}	Event details filtered by type: all, a2a, api, model, document, query.	REST
POST	/sync/embeddings	Synchronize document embeddings from Open WebUI into Oracle Vector Store.	REST

05

Event Logging System

Six Oracle Database tables provide full observability across every layer of the system. Every operation is tracked with timestamps, durations, and contextual metadata.

API_EVENTS

Tracks all HTTP endpoint invocations. Captures request method, path, status code, response time, and client metadata.

endpoint method status_code response_ms client_ip timestamp

MODEL_EVENTS

Records every LLM invocation. Stores model name, prompt, response, token counts, and generation latency.

model_name prompt response tokens_in tokens_out latency_ms

DOCUMENT_EVENTS

Logs document processing operations: uploads, parsing, chunking. Tracks source type, chunk count, and processing time.

doc_type filename chunk_count collection process_ms

INGEST_EVENTS

Detailed ingestion tracking per chunk. Records embedding generation, vector storage, and metadata association.

chunk_id collection embed_ms store_ms metadata

QUERY_EVENTS

Captures user queries end-to-end. Stores the query text, retrieved context, reasoning strategy used, and final response.

query_text strategy context_docs response total_ms

A2A_EVENTS

Agent-to-Agent protocol logging. Records JSON-RPC method calls, task IDs, agent identifiers, and message payloads.

rpc_method task_id agent_id payload status

06

Deployment Matrix

Local

Direct Python execution. Fastest iteration cycle. Requires Oracle DB and Ollama running locally.

# Default: Gradio UI on :7860
$ python run_app.py

# Specific interface modes
$ python run_app.py --gradio
$ python run_app.py --openwebui
$ python run_app.py --api-only

Docker

Containerized deployment with GPU passthrough. Uses host networking for Oracle DB and Ollama access.

# Build with host network (pip)
$ docker build --network=host \
-t agentic-rag .

# Run with GPU support
$ docker run -d --gpus all \
--network=host agentic-rag

Kubernetes

Production-grade orchestration with HuggingFace token injection for gated model access.

# Deploy full stack
$ cd k8s
$ ./deploy.sh

# With HF token for gated models
$ ./deploy.sh \
--hf-token TOKEN

07

Open WebUI Integration

Three custom Open WebUI functions bridge the chat interface with Oracle AI Database. Together they enable transparent RAG augmentation across every conversation.

oracle_rag_filter.py

Inlet/outlet filter that intercepts every message, queries Oracle DB for relevant context, and injects it into the prompt before it reaches the LLM.

Transparent context injection

Works with any model selection

Inlet: augment prompt / Outlet: log response

oracle_rag_pipe.py

Manifold pipe that creates 6 dedicated Oracle RAG model entries in the model selector. Each maps to a different reasoning strategy.

6 virtual models in model picker

Strategy-specific behavior per model

Streaming response support

oracle_document_sync.py

Manual sync action. Triggers bulk document synchronization from Open WebUI's knowledge base into Oracle Vector Store.

On-demand sync trigger

Deduplication by document hash

Progress reporting via UI

Auto-Persistence Features

Attached webpages are automatically chunked and stored in WEBCOLLECTION with full source metadata.
Uploaded URLs are processed through WebProcessor (Trafilatura) and persisted to WEBCOLLECTION with title, URL, extraction date, and content hash.

08

Notable Code Patterns

Metadata Monkeypatch

Dual-format metadata handling for LangChain compatibility. Document metadata can arrive as either a Python dict or a JSON-encoded string depending on the source. The monkeypatch normalizes both formats transparently.

# Normalize metadata: dict or JSON string

def _get_metadata(doc):

    meta = doc.metadata

    if isinstance(meta, str):

        try:

            meta = json.loads(meta)

        except json.JSONDecodeError:

            meta = {"raw": meta}

    return meta

Vector Store Abstraction

ChromaDB and Oracle share the same interface contract. A factory function returns the appropriate implementation based on configuration, allowing seamless fallback without changing any query code.

# Unified vector store interface

def get_vector_store(config):

    if config.use_oracle:

        return OracleVS(

            client=config.connection,

            embedding=config.embeddings,

            table_name=config.collection

        )

    return Chroma(collection_name=config.collection)

Agent Factory Pattern

Chain-of-Thought agent creation via factory. Each agent type (Planner, Researcher, Reasoner, Synthesizer) is instantiated with a role-specific system prompt and tool configuration, then composed into a pipeline.

# Factory creates role-specific agents

class AgentFactory:

    def create(self, role, llm, tools):

        prompt = self.prompts[role]

        agent = create_react_agent(

            llm=llm,

            tools=tools,

            prompt=prompt

        )

        return AgentExecutor(agent=agent)

Similarity Score Normalization

Oracle Vector Search returns raw distance values. The normalization formula converts distance to a 0-1 similarity score where 1.0 means identical. This enables consistent threshold-based filtering across different distance metrics.

# Distance to similarity conversion

# score = 1 / (1 + distance)

# distance=0 -> score=1.0 (identical)

# distance=inf -> score~0.0 (unrelated)

def normalize_score(distance):

    return 1.0 / (1.0 + distance)