AI Providers

SANE supports five provider back-ends. The LLM provider (for tags, links, summaries) and the embedding provider (for semantic search) can be configured independently.

OpenAI · recommended

The default provider. Offers the widest model selection and the most reliable embedding quality. gpt-4o-mini is cost-effective for most vaults; text-embedding-3-small balances speed and quality for semantic search.

Add your key at Settings → SANE → OpenAI API Key. Keys are stored in Obsidian's secure secret storage.

LLM models: gpt-4o-mini (recommended), gpt-4o, gpt-4-turbo
Embedding models: text-embedding-3-small (recommended), text-embedding-3-large

Google AI (Gemini) · cloud

Uses the Gemini family of models for generation and Google's embedding API for semantic search. A good alternative for users already in the Google ecosystem.

Get a key at aistudio.google.com and paste it into Settings → SANE → Google API Key.

LLM models: gemini-1.5-flash (recommended), gemini-1.5-pro
Embedding models: embedding-001 (recommended)

Grok (X.AI) · cloud

Access X.AI's Grok models via the X.AI API. Compatible with the OpenAI SDK, so setup is straightforward. Note: Grok does not provide its own embedding model — pair it with OpenAI or Google embeddings.

Add your X.AI key under Settings → SANE → Grok API Key. Set embeddingProvider to openai or google separately.

LLM models: grok-beta, grok-2
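
Because Grok has no embedding model, a configuration that selects it for embeddings cannot work. The sketch below shows one way such a check could look. The setting names aiProvider and embeddingProvider come from this page; the "grok" and "azure" identifiers, the ProviderSettings type, and the checkEmbeddingProvider helper are hypothetical and are not taken from SANE's source.

```typescript
// "openai", "google", and "local" appear on this page;
// "grok" and "azure" are assumed spellings of the other identifiers.
type Provider = "openai" | "google" | "grok" | "azure" | "local";

interface ProviderSettings {
  aiProvider: Provider;        // LLM used for tags, links, summaries
  embeddingProvider: Provider; // used for semantic search
}

// Hypothetical helper: returns an error message, or null if the
// embedding provider can actually produce embeddings.
function checkEmbeddingProvider(s: ProviderSettings): string | null {
  if (s.embeddingProvider === "grok") {
    return "Grok has no embedding model; set embeddingProvider to openai or google.";
  }
  return null;
}

const grokSettings: ProviderSettings = { aiProvider: "grok", embeddingProvider: "openai" };
console.log(checkEmbeddingProvider(grokSettings) ?? "ok");
```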

Azure OpenAI · enterprise

Use your own Azure OpenAI deployment. Suited to organizations with data residency or compliance requirements.

You need two things: the endpoint URL of your deployment and an API key. Enter both under Settings → SANE → Azure.

Example endpoint
azureEndpoint: https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT
Note: The model name in SANE settings must match the deployment name in Azure, not the underlying model name.
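
For orientation, here is a sketch of how a client might turn that endpoint into a chat completions request. The /chat/completions path, the api-version query parameter, and the api-key header follow the Azure OpenAI REST API. The buildAzureChatRequest helper and the default api-version value are assumptions for illustration; how SANE builds its requests internally is not documented here.

```typescript
interface AzureChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) a chat completions request.
function buildAzureChatRequest(
  azureEndpoint: string, // https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT
  apiKey: string,
  prompt: string,
  apiVersion: string = "2024-02-01" // assumed; use a version your resource supports
): AzureChatRequest {
  const base = azureEndpoint.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/chat/completions?api-version=${encodeURIComponent(apiVersion)}`,
    // Azure authenticates with an "api-key" header rather than a Bearer token.
    headers: { "api-key": apiKey, "Content-Type": "application/json" },
    // No "model" field: the deployment named in the URL selects the model.
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
  };
}
```

Since the deployment name is part of the URL, this is also why the model name in the settings has to match the deployment rather than the underlying model.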

Local LLM (Ollama / vLLM / llama.cpp) · offline · free

Run a local model server with an OpenAI-compatible API — Ollama, vLLM, or llama.cpp all work. Your note content never leaves your machine.

Set aiProvider to local and point localEndpoint at your server. The default assumes Ollama running on localhost:11434.

Start Ollama with a model
$ ollama run llama3
$ ollama pull nomic-embed-text

Set llmModel to your pulled model name (e.g. llama3) and embeddingModel to your embedding model (e.g. nomic-embed-text). No API key is required for local endpoints.
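
To illustrate what "OpenAI-compatible" means in practice, this sketch builds the two requests a client would send to a local server. The /v1/chat/completions and /v1/embeddings paths are the OpenAI-compatible routes that Ollama exposes. The helper names, and the assumption that localEndpoint holds the server's base URL without the /v1 suffix, are illustrative and not SANE's implementation.

```typescript
interface LocalRequest {
  url: string;
  body: Record<string, unknown>;
}

// Assumed default: Ollama listening on localhost:11434.
const DEFAULT_LOCAL_ENDPOINT = "http://localhost:11434";

// Hypothetical helper: request for generating tags, links, or summaries.
function localChatRequest(
  llmModel: string,
  prompt: string,
  localEndpoint: string = DEFAULT_LOCAL_ENDPOINT
): LocalRequest {
  return {
    url: `${localEndpoint}/v1/chat/completions`,
    body: { model: llmModel, messages: [{ role: "user", content: prompt }] },
  };
}

// Hypothetical helper: request for embedding a note for semantic search.
function localEmbeddingRequest(
  embeddingModel: string,
  text: string,
  localEndpoint: string = DEFAULT_LOCAL_ENDPOINT
): LocalRequest {
  return {
    url: `${localEndpoint}/v1/embeddings`,
    body: { model: embeddingModel, input: text },
  };
}

console.log(localChatRequest("llama3", "Suggest three tags for this note.").url);
console.log(localEmbeddingRequest("nomic-embed-text", "Some note text").url);
```

Each request would be sent as a JSON POST. No Authorization header appears because local endpoints need no API key.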

Performance tip: A 7B model is usually sufficient for generating tags and summaries. On modest hardware, larger models may respond too slowly for the delayed or scheduled trigger to tolerate.

Mixing LLM and embedding providers

The aiProvider and embeddingProvider settings are fully independent. Common combinations:

LLM: Grok · Embeddings: OpenAI

Use Grok's conversational quality for generation while leveraging OpenAI's well-established embedding API for semantic search.

LLM: Local · Embeddings: Local

Fully offline. Run both an LLM and an embedding model locally — perfect for private vaults with no internet dependency.

LLM: OpenAI · Embeddings: Local

Reduce embedding costs by computing them locally while using a cloud LLM for the richer generation tasks.

LLM: Google · Embeddings: Google

Single-provider simplicity. One API key, one billing account, consistent performance across the pipeline.
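
As a rough illustration, the four combinations above can be written as settings objects, with a small helper that works out which API keys each one needs. Only aiProvider, embeddingProvider, and the values openai, google, and local come from this page; the "grok" and "azure" identifiers, the MixedSettings type, and the requiredKeys helper are hypothetical.

```typescript
// "grok" and "azure" are assumed spellings; the others appear on this page.
type ProviderId = "openai" | "google" | "grok" | "azure" | "local";

interface MixedSettings {
  aiProvider: ProviderId;
  embeddingProvider: ProviderId;
}

const combinations: Record<string, MixedSettings> = {
  grokWithOpenAI: { aiProvider: "grok", embeddingProvider: "openai" },
  fullyLocal: { aiProvider: "local", embeddingProvider: "local" },
  openAIWithLocal: { aiProvider: "openai", embeddingProvider: "local" },
  googleOnly: { aiProvider: "google", embeddingProvider: "google" },
};

// Local endpoints need no API key; every other provider needs one,
// counted once even when it serves both roles.
function requiredKeys(s: MixedSettings): ProviderId[] {
  const keys: ProviderId[] = [];
  for (const p of [s.aiProvider, s.embeddingProvider]) {
    if (p !== "local" && keys.indexOf(p) === -1) keys.push(p);
  }
  return keys;
}

for (const name of Object.keys(combinations)) {
  console.log(name, requiredKeys(combinations[name]));
}
```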