Skip to content

Embedding Model Configuration

DeepSearcher supports various embedding models to convert text into vector representations for semantic search.

📝 Basic Configuration

config.set_provider_config("embedding", "(EmbeddingModelName)", "(Arguments dict)")

📋 Available Embedding Providers

Provider Description Key Features
OpenAIEmbedding OpenAI's text embedding models High quality, production-ready
MilvusEmbedding Built-in embedding models via Pymilvus Multiple model options
VoyageEmbedding VoyageAI embedding models Specialized for search
BedrockEmbedding Amazon Bedrock embedding AWS integration
GeminiEmbedding Google's Gemini embedding High performance
GLMEmbedding ChatGLM embeddings Chinese language support
OllamaEmbedding Local embedding with Ollama Self-hosted option
PPIOEmbedding PPIO cloud embedding Scalable solution
SiliconflowEmbedding Siliconflow's models Enterprise support
VolcengineEmbedding Volcengine embedding High throughput
NovitaEmbedding Novita AI embedding Cost-effective

🔍 Provider Examples

OpenAI Embedding

config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "text-embedding-3-small"})
Requires OPENAI_API_KEY environment variable

Milvus Built-in Embedding

config.set_provider_config("embedding", "MilvusEmbedding", {"model": "BAAI/bge-base-en-v1.5"})

config.set_provider_config("embedding", "MilvusEmbedding", {"model": "jina-embeddings-v3"})
For Jina's embedding model, requires JINAAI_API_KEY environment variable

VoyageAI Embedding

config.set_provider_config("embedding", "VoyageEmbedding", {"model": "voyage-3"})
Requires VOYAGE_API_KEY environment variable and pip install voyageai

📚 Additional Providers

Amazon Bedrock

config.set_provider_config("embedding", "BedrockEmbedding", {"model": "amazon.titan-embed-text-v2:0"})
Requires AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables and pip install boto3

Novita AI

config.set_provider_config("embedding", "NovitaEmbedding", {"model": "baai/bge-m3"})
Requires NOVITA_API_KEY environment variable

Siliconflow

config.set_provider_config("embedding", "SiliconflowEmbedding", {"model": "BAAI/bge-m3"})
Requires SILICONFLOW_API_KEY environment variable

Volcengine

config.set_provider_config("embedding", "VolcengineEmbedding", {"model": "doubao-embedding-text-240515"})
Requires VOLCENGINE_API_KEY environment variable

GLM

config.set_provider_config("embedding", "GLMEmbedding", {"model": "embedding-3"})
Requires GLM_API_KEY environment variable and pip install zhipuai

Google Gemini

config.set_provider_config("embedding", "GeminiEmbedding", {"model": "text-embedding-004"})
Requires GEMINI_API_KEY environment variable and pip install google-genai

Ollama

config.set_provider_config("embedding", "OllamaEmbedding", {"model": "bge-m3"})
Requires local Ollama installation and pip install ollama

PPIO

config.set_provider_config("embedding", "PPIOEmbedding", {"model": "baai/bge-m3"})
Requires PPIO_API_KEY environment variable