Document Embedding Processor #
Generates vector embeddings for a document’s text chunks using an embedding model to enable semantic search and retrieval.
Requirements #
A configured embedding model provider is required. Set model_provider and
model in the processor config, or configure a default embedding model in the
application settings.
The processor always produces vectors with a fixed dimension of 1024, which
is the only dimension supported by Coco semantic search. For OpenAI-compatible
providers the processor requests 1024 dimensions from the API automatically.
For other providers, such as Ollama, the configured model itself must output
1024-dimensional vectors; otherwise the document is passed through without
embeddings and its chunks will not appear in semantic search results.
Configuration #
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
message_field | string | No | messages | Pipeline context key for the input messages |
output_queue | object | No | null | Queue to push processed documents to |
model_provider | string | No | (app default) | Embedding model provider ID |
model | string | No | (app default) | Embedding model name |
Example #
- document_embedding:
model_provider: openai
model: text-embedding-3-small
output_queue:
name: "documents_embedded"