settings
interpretation provider
model
haiku is faster; sonnet gives richer interpretations for ambiguous embeddings
api key
works in browser
api key
model
CORS blocked — needs proxy
api key
model
full model ID from openrouter.ai/models
site url (optional)
app name (optional)
works in browser
resource name
the name in {name}.openai.azure.com
deployment name
api version
api key
requires Azure CORS config
base url
llama.cpp: http://localhost:8080  |  Ollama: http://localhost:11434
api key (optional)
model
depends on server

embedding explorer

all-MiniLM-L6-v2 · 384 dims · in-browser inference
vocab
⬡ loading model…
downloading model weights…
encode
text → 384-dim float vector
text → vec

Type or paste any text and click "generate embedding" to convert it into a 384-dimensional vector. The vector is automatically saved to the inventory below, making it available to all tools. Try the sample button for ideas.

input text
quick decode
paste a vector → nearest concepts
vec → text

Paste a raw embedding (384-element JSON array) from an external source to decode it. If you're working with vectors from the encode panel, use the vocab decode tab below instead — it pulls directly from the inventory.
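Decoding a pasted vector starts with parsing and validating the JSON array. A minimal sketch of that check (the helper name `parse_embedding` is illustrative, not the app's actual API):

```python
import json

def parse_embedding(raw: str, dims: int = 384) -> list[float]:
    """Parse a pasted JSON array into a float vector, rejecting wrong shapes."""
    vec = json.loads(raw)
    if not isinstance(vec, list) or len(vec) != dims:
        raise ValueError(f"expected a {dims}-element JSON array")
    return [float(x) for x in vec]

# a well-formed 384-element input parses cleanly
sample = "[" + ", ".join(["0.05"] * 384) + "]"
print(len(parse_embedding(sample)))  # 384
```

Anything that isn't a flat 384-element numeric array is rejected up front, which keeps downstream similarity math from failing silently.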

embedding (json array)
vector inventory
encode text above to add vectors

Compares your vector against pre-embedded concepts to find the closest matches. Use the vocab size selector (S/M/L/XL) above to control how many reference concepts are used — larger vocabularies provide finer-grained decoding. The ranked list shows where your embedding sits in semantic space. An LLM then reads the neighbors and interprets what they collectively mean.
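The ranking described above is plain cosine similarity against each pre-embedded concept. A minimal sketch, with toy 3-dimensional vectors standing in for real 384-dimensional embeddings (`nearest_concepts` and the sample vocabulary are illustrative, not the app's internals):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_concepts(query, vocab, k=5):
    """Rank vocabulary concepts by similarity to the query vector."""
    scored = [(term, cosine(query, vec)) for term, vec in vocab.items()]
    return sorted(scored, key=lambda t: -t[1])[:k]

vocab = {
    "immune system": [1.0, 0.2, 0.0],
    "army":          [0.1, 1.0, 0.0],
    "painting":      [0.0, 0.1, 1.0],
}
print(nearest_concepts([0.9, 0.3, 0.1], vocab, k=2))
```

With a larger vocabulary (the S/M/L/XL selector), the same computation simply ranks more reference points, which is why bigger vocabularies decode more finely.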

try this
  1. Encode "The immune system fights off infections" above
  2. Select it here and decode; notice how the top terms cluster around biology/medicine
  3. Now encode "The army defends the country from invasion": similar structure, different domain. Compare the neighbor lists to see how the model separates "biological defense" from "military defense"

Look for score gaps in the results. A tight cluster of high-scoring terms followed by a sharp drop means the embedding clearly represents that concept. An even spread of moderate scores suggests the text blends multiple themes.
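One simple way to spot the "sharp drop" is to find the largest difference between consecutive ranked scores. A sketch of that heuristic (not part of the app itself):

```python
def largest_gap(scores):
    """Return the index after which the biggest drop occurs in a ranked score list."""
    drops = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    return max(range(len(drops)), key=drops.__getitem__)

# tight high-scoring cluster, then a sharp drop: the concept is clear
print(largest_gap([0.82, 0.80, 0.79, 0.41, 0.38]))  # 2
```

A gap index near the top of the list signals a crisp concept; a gap buried deep in an even spread suggests the text blends several themes.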

select a vector to decode

Vector arithmetic lets you manipulate meaning algebraically. Subtraction removes a concept; addition adds one. This model is optimized for sentences, not single words — so sentence-level analogies work best. Single-word arithmetic (the famous king - man + woman) tends to produce scattered results because individual words create diffuse embeddings that blend multiple senses.
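Under the hood, A - B + C and avg(A, B) are element-wise arithmetic followed by renormalization to unit length. A minimal sketch with toy 3-dimensional vectors (function names are illustrative):

```python
import math

def _normalize(vec):
    """Scale a vector to unit length so cosine comparisons stay meaningful."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def analogy(a, b, c):
    """A - B + C, element-wise, then renormalized."""
    return _normalize([x - y + z for x, y, z in zip(a, b, c)])

def average(a, b):
    """Midpoint blend of two vectors, renormalized."""
    return _normalize([(x + y) / 2 for x, y in zip(a, b)])

# toy stand-ins for three encoded sentences
king, man, woman = [0.9, 0.1, 0.3], [0.1, 0.1, 0.3], [0.1, 0.9, 0.3]
print(analogy(king, man, woman))
```

Renormalizing matters: subtraction and addition change the vector's length, and downstream cosine ranking assumes comparable magnitudes.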

try this
  1. Encode: "The king ruled the country", "The man walked home", and "The woman walked home"
  2. Use A - B + C mode: set A="The king ruled…", B="The man walked…", C="The woman walked…"
  3. The result should cluster around political rule / authority: the model preserved the "ruling" axis while swapping the subject

More experiments to try:

  • "The chef prepared a French meal" - "France is in Europe" + "Japan is in Asia": does the cuisine shift?
  • Use avg(A, B) to blend two concepts: what's halfway between "I am overjoyed" and "I am devastated"?
  • Try single-word "king" - "man" + "woman" to see how it doesn't work with this model; compare the noisy results to the sentence version above

Not every analogy works. When it fails, that tells you something too — the model doesn't encode that particular relationship as a clean linear direction. The structure of what works and what doesn't reveals the geometry the model has learned.

operation
vector a
vector b
+
vector c

Shows what semantic dimension separates two embeddings. For each vocabulary concept, it computes how much closer that concept is to vector A versus vector B. The top movers in each direction reveal what distinguishes the two inputs.
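The "pull toward A versus B" measure is the difference of two cosine similarities, computed per vocabulary concept. A sketch with toy 2-dimensional vectors (the `differential` helper is illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def differential(vec_a, vec_b, vocab):
    """For each concept: how much closer it sits to A than to B (positive pulls toward A)."""
    deltas = [(term, cosine(v, vec_a) - cosine(v, vec_b)) for term, v in vocab.items()]
    return sorted(deltas, key=lambda t: -t[1])

vocab = {
    "experiment": [1.0, 0.0],
    "painting":   [0.0, 1.0],
}
a, b = [0.9, 0.2], [0.2, 0.9]  # toy stand-ins for two encoded sentences
for term, delta in differential(a, b, vocab):
    print(term, round(delta, 3))
```

The most positive deltas are the "top movers" toward A; the most negative, toward B.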

try this
  1. Encode "The scientist conducted the experiment carefully" and "The artist painted the canvas passionately"
  2. Compare them; you'll see terms like "experiment", "measurement", "data" pull toward A, while "painting", "creativity", "beauty" pull toward B
  3. Now try two sentences that seem similar but differ subtly: "I am happy" vs "I am excited". The differential reveals what the model thinks distinguishes these emotions.

The cosine similarity score at the top tells you the overall relationship. Two near-synonyms might score 0.8+; two texts from different domains might be 0.1. The word lists show where they differ, not just how much.

vector a
vector b

Projects the 384-dimensional embedding space down to 2D or 3D. Each dot is a vocabulary term or one of your encoded vectors. Nearby dots have similar embeddings. Clusters show how the model organizes meaning.

try this
  1. Start with PCA (instant) to see the coarse layout, then switch to UMAP for clearer clusters
  2. Encode 5-6 texts from different domains — the projection updates automatically to show where your vectors land relative to the vocabulary clusters
  3. Switch to 3D for a richer view of the embedding structure. Click and drag to rotate, scroll to zoom.

PCA finds the directions of maximum variance: fast and deterministic, but it tends to produce overlapping clouds. UMAP preserves local neighborhood structure, producing tighter, more distinct clusters at the cost of a few seconds of computation. Both methods lose information when squashing 384 dimensions down to 2 or 3, so use the other tools for precise similarity measurements.
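The PCA step can be sketched in a few lines: center the vectors, take the covariance matrix's top eigenvectors, and project onto them. A minimal version with numpy and random stand-in embeddings (not the app's actual implementation, which runs in the browser):

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components (directions of max variance)."""
    Xc = X - X.mean(axis=0)                   # center the data
    cov = Xc.T @ Xc / (len(X) - 1)            # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    top = eigvecs[:, ::-1][:, :n_components]  # keep the largest-variance directions
    return Xc @ top

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 384))  # 50 fake 384-dim embeddings
Y = pca_project(X, 2)
print(Y.shape)  # (50, 2)
```

This is why PCA is instant and deterministic: it is a single eigendecomposition, with no iterative optimization like UMAP's.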

encode some vectors first
vocabulary
encoded
arithmetic
search