onnx2oracle
Reference R3

Model matrix

The full set of registered presets. Click any column header to sort — useful when you're picking by size, dimension, or pooling strategy. Refresh real-DB evidence with scripts/check_model_compatibility.py --all-presets.

Preset               HuggingFace repo                         Dimensions  Size (FP32)  Pooling  Oracle name
all-MiniLM-L6-v2     sentence-transformers/all-MiniLM-L6-v2   384         ~90 MB       mean     ALL_MINILM_L6_V2
all-MiniLM-L12-v2    sentence-transformers/all-MiniLM-L12-v2  384         ~130 MB      mean     ALL_MINILM_L12_V2
all-mpnet-base-v2    sentence-transformers/all-mpnet-base-v2  768         ~420 MB      mean     ALL_MPNET_BASE_V2
bge-small-en-v1.5    BAAI/bge-small-en-v1.5                   384         ~130 MB      cls      BGE_SMALL_EN_V1_5
nomic-embed-text-v1  nomic-ai/nomic-embed-text-v1             768         ~540 MB      mean     NOMIC_EMBED_TEXT_V1

A note on sizes

The FP32 size is the on-disk footprint of the augmented ONNX — encoder plus tokenizer ops plus pooling plus L2 norm. The encoder alone is about 85% of that. These are the models as downloaded, before any Oracle-side compression or tablespace overhead.

Dimensions and storage

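The output dimension drives disk usage for the vector column. Assuming 4-byte FLOAT32 components, a 384-dimension vector takes roughly 1.5 KB per row and a 768-dimension vector roughly 3 KB, before any per-row overhead.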
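
As a rough illustration of how dimension maps to storage, the sketch below assumes each component is stored as a 4-byte FLOAT32 and ignores row headers and block overhead. The helper names vector_bytes and column_megabytes are hypothetical and not part of onnx2oracle.

```python
# Back-of-the-envelope storage estimate for an embedding column.
# Assumption: FLOAT32 storage, i.e. 4 bytes per dimension; per-row and
# tablespace overhead are deliberately left out.
BYTES_PER_FLOAT32 = 4


def vector_bytes(dimensions: int) -> int:
    """Raw payload size of one vector, in bytes."""
    return dimensions * BYTES_PER_FLOAT32


def column_megabytes(dimensions: int, rows: int) -> float:
    """Raw payload size of the whole column, in decimal megabytes."""
    return vector_bytes(dimensions) * rows / 1_000_000


if __name__ == "__main__":
    # The two output dimensions that appear in the preset table.
    for dims in (384, 768):
        print(f"{dims} dims: {vector_bytes(dims)} bytes/row, "
              f"{column_megabytes(dims, 1_000_000):.0f} MB per million rows")
```

Under these assumptions, moving from a 384-dimension preset to a 768-dimension one doubles the raw column size for the same row count.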

Oracle's HNSW index doubles that rough figure once built. Cosine distance and dot product rank identically on L2-normalized vectors — all listed presets normalize, so use whichever VECTOR_DISTANCE metric reads cleaner in your SQL.
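
The ranking equivalence can be checked outside the database. The following is a minimal pure-Python sketch using made-up three-dimensional vectors, not real model output; it sorts the same documents by cosine distance and by dot product and compares the resulting order.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Illustrative vectors only (hypothetical data).
query = l2_normalize([0.2, 0.9, -0.4])
docs = [l2_normalize(v) for v in (
    [1.0, 0.0, 0.0],
    [0.1, 1.0, -0.5],
    [-0.3, 0.2, 0.9],
    [0.5, 0.5, 0.5],
)]

# Cosine: ascending distance. Dot product: descending similarity,
# expressed here by negating the value so both sorts are ascending.
by_cosine = sorted(range(len(docs)), key=lambda i: cosine_distance(query, docs[i]))
by_dot = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))

print(by_cosine == by_dot)
```

On unit-length vectors the cosine distance reduces to one minus the dot product, so the two orderings can differ only when scores are tied to within floating-point rounding.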