Custom models
The six presets cover most English and multilingual cases. When yours isn't one of them, --from-huggingface loads any sentence encoder — you just need to know its pooling strategy and output dimension.
What you need to know about the model
Before calling the loader, read the HuggingFace model card for four values:
- HF repo id — e.g. `BAAI/bge-large-en-v1.5`.
- Pooling — `mean` (most models) or `cls` (most BGE models, BERT-classification finetunes).
- Embedding dimension — usually in the model card sidebar; 384, 512, 768, 1024 are common.
- Max sequence length — 512 unless the card says otherwise.
If the card is ambiguous, grep its config.json for hidden_size (= dimension) and its modules.json for the pooling module name.
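If you'd rather script this than eyeball the card, the sketch below pulls both files through huggingface_hub and prints the relevant fields. It assumes the repo follows the usual sentence-transformers layout (a `1_Pooling/config.json` alongside the encoder config); `repo` is just the example from below.

```python
import json

from huggingface_hub import hf_hub_download

repo = "BAAI/bge-large-en-v1.5"

# hidden_size is the embedding dimension for BERT-style encoders;
# max_position_embeddings bounds the usable sequence length
cfg = json.load(open(hf_hub_download(repo, "config.json")))
print(cfg["hidden_size"], cfg.get("max_position_embeddings"))

# sentence-transformers repos declare the pooling strategy here
pool = json.load(open(hf_hub_download(repo, "1_Pooling/config.json")))
print({k: v for k, v in pool.items() if k.startswith("pooling_mode")})
```

For the BGE family you should see `pooling_mode_cls_token: true`, matching the `cls` choice in the next example.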
Example 1 — a larger BGE variant
BAAI/bge-large-en-v1.5 uses CLS pooling and emits 1024-d vectors.
$ onnx2oracle load --from-huggingface BAAI/bge-large-en-v1.5 \
--pooling cls \
--dims 1024 \
--max-length 512 \
--name BGE_LARGE_EN_V1_5 \
--normalize
You'll see the same pipeline output as the presets — download, wrap tokenizer, add pooling, add L2 norm, load. The --normalize flag is on by default; pass --no-normalize only if you specifically want un-normalized output (rare).
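To sanity-check the load end to end, Oracle's VECTOR_EMBEDDING operator runs the whole pipeline from SQL. A minimal sketch with python-oracledb; the credentials and DSN are placeholders, and it assumes python-oracledb 2.2+ so VECTOR results fetch natively:

```python
import oracledb

# Placeholder credentials and DSN -- substitute your own.
conn = oracledb.connect(user="admin", password="...", dsn="localhost/FREEPDB1")
cur = conn.cursor()

# VECTOR_EMBEDDING runs the ONNX pipeline server-side, tokenizer included
cur.execute(
    "SELECT VECTOR_EMBEDDING(BGE_LARGE_EN_V1_5 USING 'hello world' AS data) FROM dual"
)
vec = cur.fetchone()[0]
print(len(vec))  # expect 1024 for this model
```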
Example 2 — a mean-pooled domain model
pritamdeka/S-BioBert-snli-multinli-stsb — BioBERT finetuned for sentence similarity. Mean pooling, 768-d.
$ onnx2oracle load --from-huggingface pritamdeka/S-BioBert-snli-multinli-stsb \
--pooling mean \
--dims 768 \
--name S_BIOBERT_STSB
Oracle mining-model names are case-insensitive identifiers — stick to uppercase with underscores to avoid quoting headaches in SQL.
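To confirm what actually got stored, list your mining models from the data dictionary; connection details are placeholders as before:

```python
import oracledb

conn = oracledb.connect(user="admin", password="...", dsn="localhost/FREEPDB1")

# Unquoted Oracle identifiers fold to uppercase, so the model lands as S_BIOBERT_STSB
for (name,) in conn.cursor().execute("SELECT model_name FROM user_mining_models"):
    print(name)
```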
Example 3 — targeting ADB
Same --dsn rules as the presets. The custom flags compose cleanly:
$ onnx2oracle load --from-huggingface intfloat/e5-large-v2 \
--pooling mean --dims 1024 --max-length 512 \
--name E5_LARGE_V2 \
--dsn 'admin/YourStrongPass@(description=...high...)'
Gated models
Some HF repos are gated — the download fails unless the account behind your token has accepted the license. Set HF_TOKEN before calling load:
$ export HF_TOKEN=hf_***
$ onnx2oracle load --from-huggingface nomic-ai/nomic-embed-text-v1.5 \
--pooling mean --dims 768 --name NOMIC_V1_5
See #hf-download-fails if downloads fail with a 401.
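To check access before kicking off a full load, you can probe the repo with huggingface_hub; a small sketch:

```python
import os

from huggingface_hub import model_info
from huggingface_hub.utils import GatedRepoError

try:
    info = model_info("nomic-ai/nomic-embed-text-v1.5", token=os.environ.get("HF_TOKEN"))
    print("access OK:", info.id)
except GatedRepoError:
    print("gated: accept the license on the model page, then retry")
```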
What the flags actually do
| Flag | Effect |
|---|---|
| --pooling mean | Wraps the encoder output in ReduceMean across the sequence axis, masked by the attention mask. |
| --pooling cls | Wraps the encoder output in Gather(axis=1, indices=[0]) — takes the CLS token's hidden state. |
| --dims N | Asserts the final output shape. If the graph's actual output shape disagrees, the loader errors before calling Oracle — cheap to catch locally. |
| --max-length N | Upper bound on tokens per input. Longer inputs get truncated. Default 512. |
| --normalize / --no-normalize | Append the L2 norm subgraph (or don't). Leave it on unless you're feeding the vectors to a non-cosine distance that needs raw magnitudes. |
| --name NAME | The Oracle mining-model identifier. Required for custom loads — no preset to derive a default from. |
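If the pooling rows feel abstract, here is the same math in plain numpy. This illustrates the node semantics the table describes, not the loader's actual graph-surgery code:

```python
import numpy as np

# token_embeddings: (batch, seq, hidden) encoder output;
# attention_mask: (batch, seq), 1 = real token, 0 = padding
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 8, 1024)).astype(np.float32)
attention_mask = np.array([[1] * 8, [1] * 5 + [0] * 3], dtype=np.float32)

# --pooling mean: masked ReduceMean across the sequence axis
mask = attention_mask[:, :, None]
mean_pooled = (token_embeddings * mask).sum(axis=1) / mask.sum(axis=1)

# --pooling cls: Gather the first ([CLS]) token's hidden state
cls_pooled = token_embeddings[:, 0, :]

# --normalize: L2 norm, after which cosine similarity is just a dot product
normed = mean_pooled / np.linalg.norm(mean_pooled, axis=1, keepdims=True)
print(mean_pooled.shape, cls_pooled.shape, np.linalg.norm(normed, axis=1))
```

The L2 step is why --normalize defaults to on: with unit-length vectors, cosine similarity reduces to an inner product.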
When a load fails on Oracle's side
Custom models are the most common source of ORA-20000. Two causes dominate:
- The exported ONNX opset is newer than Oracle's runtime supports. Rebuild with `onnx>=1.16` and opset 18. The loader already does this, but if you're supplying your own ONNX bytes out of band, check them (see the inspection sketch below).
- The metadata JSON's input-tensor name doesn't match the graph. The built-in pipeline always names the input `pre_text`; make sure your custom graph does too.

Full writeup at #ora-20000.
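Both causes are cheap to check locally before the bytes ever reach Oracle. A sketch using the onnx package, with model.onnx standing in for your exported file:

```python
import onnx

model = onnx.load("model.onnx")  # placeholder path

# Cause 1: an opset newer than Oracle's runtime supports
print("opsets:", {imp.domain or "ai.onnx": imp.version for imp in model.opset_import})

# Cause 2: an input-tensor name out of sync with the metadata JSON
print("inputs:", [i.name for i in model.graph.input])  # expect ['pre_text']
print("outputs:", [
    (o.name, [d.dim_value or d.dim_param for d in o.type.tensor_type.shape.dim])
    for o in model.graph.output
])
```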