Custom models
The six presets cover most English and multilingual cases. When yours isn't one of them, --from-huggingface loads any sentence encoder — you just need to know its pooling strategy and output dimension.
What you need to know about the model
Before calling the loader, read the HuggingFace model card for four values:
- HF repo id — e.g. `BAAI/bge-large-en-v1.5`.
- Pooling — `mean` (most models) or `cls` (most BGE models, BERT-classification finetunes).
- Embedding dimension — usually in the model card sidebar; 384, 512, 768, 1024 are common.
- Max sequence length — 512 unless the card says otherwise.
If the card is ambiguous, grep its config.json for hidden_size (= dimension) and its modules.json for the pooling module name.
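If you'd rather script this than eyeball the card, the sketch below pulls both files through huggingface_hub and prints the relevant fields. It assumes the repo follows the usual sentence-transformers layout (a `1_Pooling/config.json` alongside the encoder config); `repo` is just the example from below.

```python
import json

from huggingface_hub import hf_hub_download

repo = "BAAI/bge-large-en-v1.5"

# hidden_size is the embedding dimension for BERT-style encoders;
# max_position_embeddings bounds the usable sequence length
cfg = json.load(open(hf_hub_download(repo, "config.json")))
print(cfg["hidden_size"], cfg.get("max_position_embeddings"))

# sentence-transformers repos declare the pooling strategy here
pool = json.load(open(hf_hub_download(repo, "1_Pooling/config.json")))
print({k: v for k, v in pool.items() if k.startswith("pooling_mode")})
```

For the BGE family you should see `pooling_mode_cls_token: true`, matching the `cls` choice in the next example.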
Example 1 — a larger BGE variant
BAAI/bge-large-en-v1.5 uses CLS pooling and emits 1024-d vectors.
$ onnx2oracle load --from-huggingface BAAI/bge-large-en-v1.5 \
--pooling cls \
--dims 1024 \
--max-length 512 \
--name BGE_LARGE_EN_V1_5 \
--normalize
You'll see the same pipeline output as the presets — download, wrap tokenizer, add pooling, add L2 norm, load. The --normalize flag is on by default; pass --no-normalize only if you specifically want un-normalized output (rare).
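To sanity-check the load end to end, Oracle's VECTOR_EMBEDDING operator runs the whole pipeline from SQL. A minimal sketch with python-oracledb; the credentials and DSN are placeholders, and it assumes python-oracledb 2.2+ so VECTOR results fetch natively:

```python
import oracledb

# Placeholder credentials and DSN -- substitute your own.
conn = oracledb.connect(user="admin", password="...", dsn="localhost/FREEPDB1")
cur = conn.cursor()

# VECTOR_EMBEDDING runs the ONNX pipeline server-side, tokenizer included
cur.execute(
    "SELECT VECTOR_EMBEDDING(BGE_LARGE_EN_V1_5 USING 'hello world' AS data) FROM dual"
)
vec = cur.fetchone()[0]
print(len(vec))  # expect 1024 for this model
```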
Example 2 — a mean-pooled domain model
pritamdeka/S-BioBert-snli-multinli-stsb — BioBERT finetuned for sentence similarity. Mean pooling, 768-d.
$ onnx2oracle load --from-huggingface pritamdeka/S-BioBert-snli-multinli-stsb \
--pooling mean \
--dims 768 \
--name S_BIOBERT_STSB
Oracle mining-model names are case-insensitive identifiers — stick to uppercase with underscores to avoid quoting headaches in SQL.
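To confirm what actually got stored, list your mining models from the data dictionary; connection details are placeholders as before:

```python
import oracledb

conn = oracledb.connect(user="admin", password="...", dsn="localhost/FREEPDB1")

# Unquoted Oracle identifiers fold to uppercase, so the model lands as S_BIOBERT_STSB
for (name,) in conn.cursor().execute("SELECT model_name FROM user_mining_models"):
    print(name)
```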
Example 3 — targeting ADB
Same --dsn rules as the presets. The custom flags compose cleanly:
$ onnx2oracle load --from-huggingface intfloat/e5-large-v2 \
--pooling mean --dims 1024 --max-length 512 \
--name E5_LARGE_V2 \
--dsn 'admin/YourStrongPass@(description=...high...)'
Gated models
Some HF repos are gated — the download fails unless the account behind your token has accepted the license. Set HF_TOKEN before calling load:
$ export HF_TOKEN=hf_***
$ onnx2oracle load --from-huggingface nomic-ai/nomic-embed-text-v1.5 \
--pooling mean --dims 768 --name NOMIC_V1_5
See #hf-download-fails if downloads fail with a 401.
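To check access before kicking off a full load, you can probe the repo with huggingface_hub; a small sketch:

```python
import os

from huggingface_hub import model_info
from huggingface_hub.utils import GatedRepoError

try:
    info = model_info("nomic-ai/nomic-embed-text-v1.5", token=os.environ.get("HF_TOKEN"))
    print("access OK:", info.id)
except GatedRepoError:
    print("gated: accept the license on the model page, then retry")
```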
What the flags actually do
| Flag | Effect |
|---|---|
| --pooling mean | Wraps the encoder output in ReduceMean across the sequence axis, masked by the attention mask. |
| --pooling cls | Wraps the encoder output in Gather(axis=1, indices=[0]) — takes the CLS token's hidden state. |
| --dims N | Asserts the final output shape. If the graph's actual output shape disagrees, the loader errors before calling Oracle — cheap to catch locally. |
| --max-length N | Upper bound on tokens per input. Longer inputs get truncated. Default 512. |
| --normalize / --no-normalize | Append the L2 norm subgraph (or don't). Leave it on unless you're feeding the vectors to a non-cosine distance that needs raw magnitudes. |
| --name NAME | The Oracle mining-model identifier. Required for custom loads — no preset to derive a default from. |
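If the pooling rows feel abstract, here is the same math in plain numpy. This illustrates the node semantics the table describes, not the loader's actual graph-surgery code:

```python
import numpy as np

# token_embeddings: (batch, seq, hidden) encoder output;
# attention_mask: (batch, seq), 1 = real token, 0 = padding
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 8, 1024)).astype(np.float32)
attention_mask = np.array([[1] * 8, [1] * 5 + [0] * 3], dtype=np.float32)

# --pooling mean: masked ReduceMean across the sequence axis
mask = attention_mask[:, :, None]
mean_pooled = (token_embeddings * mask).sum(axis=1) / mask.sum(axis=1)

# --pooling cls: Gather the first ([CLS]) token's hidden state
cls_pooled = token_embeddings[:, 0, :]

# --normalize: L2 norm, after which cosine similarity is just a dot product
normed = mean_pooled / np.linalg.norm(mean_pooled, axis=1, keepdims=True)
print(mean_pooled.shape, cls_pooled.shape, np.linalg.norm(normed, axis=1))
```

The L2 step is why --normalize defaults to on: with unit-length vectors, cosine similarity reduces to an inner product.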
When a load fails on Oracle's side
Custom models are the most common source of ORA-20000. Two causes dominate:
- The exported ONNX opset is newer than Oracle's runtime supports. Rebuild with `onnx>=1.16` and opset 18. The loader already does this, but if you're supplying your own ONNX bytes out of band, check them (see the inspection sketch below).
- The metadata JSON's input-tensor name doesn't match the graph. The built-in pipeline always names the input `pre_text`; make sure your custom graph does too.

Full writeup at #ora-20000.
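Both causes are cheap to check locally before the bytes ever reach Oracle. A sketch using the onnx package, with model.onnx standing in for your exported file:

```python
import onnx

model = onnx.load("model.onnx")  # placeholder path

# Cause 1: an opset newer than Oracle's runtime supports
print("opsets:", {imp.domain or "ai.onnx": imp.version for imp in model.opset_import})

# Cause 2: an input-tensor name out of sync with the metadata JSON
print("inputs:", [i.name for i in model.graph.input])  # expect ['pre_text']
print("outputs:", [
    (o.name, [d.dim_value or d.dim_param for d in o.type.tensor_type.shape.dim])
    for o in model.graph.output
])
```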