# onnx2oracle

> Load any ONNX sentence-transformer embedding model into an Oracle AI Database 23ai or 26ai instance. The authoritative manual is library-agnostic: it works with raw Python and SQL, no CLI dependency required.

## The manual

The complete, library-agnostic instructions for a coding agent live at:

https://jasperan.github.io/onnx2oracle/for-agents.html

That page walks through downloading a core ONNX model, wrapping the HuggingFace tokenizer with `onnxruntime-extensions`, aligning opsets, merging the graphs, adding pooling and L2 normalization, loading via `DBMS_VECTOR.LOAD_ONNX_MODEL`, and verifying with `VECTOR_EMBEDDING`. Everything is copy-pastable Python + SQL.

## What you're building

A single self-contained ONNX graph whose input is a raw string and whose output is a normalized float32 vector. Oracle loads the graph through `DBMS_VECTOR.LOAD_ONNX_MODEL` and queries it with `VECTOR_EMBEDDING(MODEL USING :text AS DATA)`. The tokenizer lives inside the graph, so no Python is needed on the DB host at inference time.

## Prerequisites

- Oracle AI Database 23ai or 26ai (the `DBMS_VECTOR` package and `VECTOR_EMBEDDING` require it)
- DB user privileges: `CREATE MINING MODEL`, `EXECUTE ON DBMS_VECTOR`
- Python 3.10+
- `pip install onnx onnxruntime onnxruntime-extensions transformers huggingface-hub oracledb`
- ~2 GB free RAM during graph augmentation

## Tokenizer compatibility

WordPiece tokenizers (BERT, MiniLM, MPNet, BGE) work. SentencePiece and Unigram tokenizers (XLM-R, LLaMA, T5, e5) don't: `gen_processing_models` produces ragged outputs that Oracle's BertTokenizer op can't consume. For multilingual use cases, pick a WordPiece alternative such as `microsoft/Multilingual-MiniLM-L12-H384`.

## Curated presets

- sentence-transformers/all-MiniLM-L6-v2: 384 dims, ~90 MB, mean pooling
- sentence-transformers/all-MiniLM-L12-v2: 384 dims, ~130 MB, mean pooling
- sentence-transformers/all-mpnet-base-v2: 768 dims, ~420 MB, mean pooling
- BAAI/bge-small-en-v1.5: 384 dims, ~130 MB, CLS pooling
- nomic-ai/nomic-embed-text-v1: 768 dims, ~540 MB, mean pooling

Refresh the real-Oracle pass/fail evidence with `scripts/check_model_compatibility.py --all-presets`.

## Query once loaded

```
SELECT VECTOR_EMBEDDING(ALL_MINILM_L6_V2 USING 'hello world' AS DATA) AS v
FROM dual;
```

Wire it into a table:

```
CREATE TABLE docs (
  id        NUMBER PRIMARY KEY,
  content   CLOB,
  embedding VECTOR(384, FLOAT32)
);

INSERT INTO docs (id, content, embedding)
VALUES (1, :text, VECTOR_EMBEDDING(ALL_MINILM_L6_V2 USING :text AS DATA));
```

The declared dimension must match the model: 384 for MiniLM and BGE-small, 768 for MPNet and Nomic.

## Local Oracle 26ai Free container

Start the bundled container and wait for SQL readiness:

```
onnx2oracle docker up --wait --wait-timeout 900
```

Check database readiness before loading:

```
onnx2oracle preflight --target local
```

The bundled compose file uses `container-registry.oracle.com/database/free:latest` by default.

- Set `ORACLE_IMAGE=container-registry.oracle.com/database/free:latest-lite` to try the smaller Oracle Free image without editing the YAML.
- Set `ORACLE_PWD` before first startup to change the local SYSTEM password.
- Set `ORACLE_PORT=1524` (or any free port) if 1521 is already occupied.

`--target local` resolves to `system/${ORACLE_PWD:-onnx2oracle}@localhost:${ORACLE_PORT:-1521}/FREEPDB1`.
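If you'd rather check readiness without the CLI, a minimal python-oracledb probe against the same defaults might look like the sketch below. The credentials, port, and service name are the local-container defaults described above; adjust them for your environment.

```
# Minimal readiness probe for the local container, mirroring what
# `--target local` resolves to. Assumes the default ORACLE_PWD / ORACLE_PORT;
# override the environment variables if you changed them at startup.
import os
import oracledb

conn = oracledb.connect(
    user="system",
    password=os.getenv("ORACLE_PWD", "onnx2oracle"),
    dsn=f"localhost:{os.getenv('ORACLE_PORT', '1521')}/FREEPDB1",
)
with conn.cursor() as cur:
    cur.execute("SELECT banner_full FROM v$version")
    print(cur.fetchone()[0])  # expect an Oracle Database 23ai / 26ai banner
conn.close()
```

This is the same connection the CLI would use for `--target local`.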
To capture end-to-end evidence against a real database, run from a git checkout:

```
scripts/run_real_db_integration.sh
```

That script starts Oracle, records the DB banner and container image id, loads `all-MiniLM-L6-v2`, verifies `VECTOR_EMBEDDING`, and runs `pytest tests/test_loader_integration.py --run-integration -v -s`.

## Security

`VECTOR_EMBEDDING` takes the model name as a SQL token, not a bind variable. Validate any model-name string against `^[A-Z_][A-Z0-9_]{0,127}$` before interpolating it into SQL. All other values go through bind variables.

## Troubleshooting

- ORA-00942: Oracle 23ai or 26ai is required, and the user needs dictionary access.
- ORA-20000: opset mismatch (rebuild with onnx>=1.16, opset 18), or the metadata input tensor name doesn't match the graph (it should be `pre_text` when using `prefix1="pre_"`).
- ORA-01031: grant `EXECUTE ON DBMS_VECTOR` and `CREATE MINING MODEL` as SYSDBA.
- BertTokenizer errors: the tokenizer is SentencePiece. Swap to a WordPiece model.
- Missing ONNX sidecar data: the builder inspects `onnx/model.onnx` for external-data references and downloads those files from the repo's `onnx/` folder before loading the graph.
- Cold-start slowness: Oracle compiles the ONNX graph on the first call per session. Expect 100 to 500 ms once, then under 20 ms.

## Docs

- [The manual for agents](https://jasperan.github.io/onnx2oracle/for-agents.html)
- [Full guide](https://jasperan.github.io/onnx2oracle/)
- [Quickstart](https://jasperan.github.io/onnx2oracle/guide/01-quickstart.html)
- [How it works](https://jasperan.github.io/onnx2oracle/guide/02-how-it-works.html)
- [Model matrix](https://jasperan.github.io/onnx2oracle/reference/model-matrix.html)
- [GitHub](https://github.com/jasperan/onnx2oracle)

## Options

Two ways to use this project:

1. Write the code yourself, following [the manual](https://jasperan.github.io/onnx2oracle/for-agents.html). This works with any model architecture and lets you customize pooling, tokenizer, and pipeline stages.
2. Install the packaged CLI:

   ```
   pip install onnx2oracle
   onnx2oracle load all-MiniLM-L6-v2 --target local
   ```

The CLI wraps the manual's pipeline into a single command. Pick option 1 for customization, option 2 for the shortest path. A sketch of option 1's core query step follows below.
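For option 1, the query step from Python can apply the Security section's validation rule before interpolating the model name. This is a hedged sketch, not the project's own code: the connection details and the `ALL_MINILM_L6_V2` model name are assumptions to adapt to your deployment.

```
# Sketch of option 1's query step from Python: validate the model name
# (VECTOR_EMBEDDING takes it as a SQL token, not a bind variable), then
# embed the text through a bind variable. Connection details are assumptions.
import re
import oracledb

MODEL_NAME_RE = re.compile(r"^[A-Z_][A-Z0-9_]{0,127}$")

def embed(conn, model_name, text):
    if not MODEL_NAME_RE.match(model_name):
        raise ValueError(f"unsafe model name: {model_name!r}")
    # Safe to interpolate after validation; the text itself stays a bind variable.
    sql = f"SELECT VECTOR_EMBEDDING({model_name} USING :text AS DATA) FROM dual"
    with conn.cursor() as cur:
        cur.execute(sql, text=text)
        (vec,) = cur.fetchone()
        return vec  # recent python-oracledb returns VECTOR values as array.array

conn = oracledb.connect(user="system", password="onnx2oracle",
                        dsn="localhost:1521/FREEPDB1")
print(len(embed(conn, "ALL_MINILM_L6_V2", "hello world")))  # 384 for MiniLM-L6
conn.close()
```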