Module Support

DeepSearcher supports various integration modules including embedding models, large language models, document loaders and vector databases.

📊 Overview

| Module Type | Count | Description |
|---|---|---|
| Embedding Models | 7+ | Text vectorization tools |
| Large Language Models | 11+ | Query processing and text generation |
| Document Loaders | 5+ | Parse and process documents in various formats |
| Vector Databases | 2+ | Store and retrieve vector data |

🔢 Embedding Models

DeepSearcher supports the following embedding models, which convert text into vector representations for semantic search.

| Provider | Required Environment Variables | Features |
|---|---|---|
| Open-source models | None | Locally runnable open-source models |
| OpenAI | OPENAI_API_KEY | High-quality embeddings, easy to use |
| VoyageAI | VOYAGE_API_KEY | Embeddings optimized for retrieval |
| Amazon Bedrock | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY | AWS integration, enterprise-grade |
| FastEmbed | None | Fast lightweight embeddings |
| PPIO | PPIO_API_KEY | Flexible cloud embeddings |
| Novita AI | NOVITA_API_KEY | Rich model selection |

🧠 Large Language Models

DeepSearcher supports the following large language models (LLMs), which process queries and generate responses.

| Provider | Required Environment Variables | Features |
|---|---|---|
| OpenAI | OPENAI_API_KEY | GPT model family |
| DeepSeek | DEEPSEEK_API_KEY | Powerful reasoning capabilities |
| XAI Grok | XAI_API_KEY | Real-time knowledge and humor |
| Anthropic Claude | ANTHROPIC_API_KEY | Excellent long-context understanding |
| SiliconFlow | SILICONFLOW_API_KEY | Enterprise inference service |
| PPIO | PPIO_API_KEY | Diverse model support |
| TogetherAI | TOGETHER_API_KEY | Wide range of open-source models |
| Google Gemini | GEMINI_API_KEY | Google's multimodal models |
| SambaNova | SAMBANOVA_API_KEY | High-performance AI platform |
| Ollama | None | Local LLM deployment |
| Novita AI | NOVITA_API_KEY | Diverse AI services |
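
Because most providers above need an API key in the environment, it can help to check for it before starting a run. The following is a minimal sketch of such a preflight check, not part of DeepSearcher itself: the variable names are copied from the LLM table, while `LLM_REQUIRED_ENV`, `missing_env_vars`, and the provider labels used as dictionary keys are hypothetical names chosen for illustration.

```python
import os

# Required environment variables per LLM provider, copied from the table above.
# The keys are illustrative labels, not identifiers used by DeepSearcher.
LLM_REQUIRED_ENV = {
    "OpenAI": ["OPENAI_API_KEY"],
    "DeepSeek": ["DEEPSEEK_API_KEY"],
    "XAI Grok": ["XAI_API_KEY"],
    "Anthropic Claude": ["ANTHROPIC_API_KEY"],
    "SiliconFlow": ["SILICONFLOW_API_KEY"],
    "PPIO": ["PPIO_API_KEY"],
    "TogetherAI": ["TOGETHER_API_KEY"],
    "Google Gemini": ["GEMINI_API_KEY"],
    "SambaNova": ["SAMBANOVA_API_KEY"],
    "Ollama": [],  # runs locally, so no key is listed
    "Novita AI": ["NOVITA_API_KEY"],
}


def missing_env_vars(provider, required=LLM_REQUIRED_ENV, environ=None):
    """Return the required variables for `provider` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required[provider] if not env.get(name)]


if __name__ == "__main__":
    # Report every provider whose credentials are not yet configured.
    for provider in LLM_REQUIRED_ENV:
        missing = missing_env_vars(provider)
        if missing:
            print(f"{provider}: missing {', '.join(missing)}")
```

The same function works for the embedding providers if a second dictionary built from the embedding table is passed as `required`.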

📄 Document Loaders

DeepSearcher can load and process documents from local files and from the web.

Local File Loaders

| Loader | Supported Formats | Required Environment Variables |
|---|---|---|
| Built-in Loader | PDF, TXT, MD | None |
| Unstructured | Multiple document formats | UNSTRUCTURED_API_KEY, UNSTRUCTURED_URL (optional) |

Web Crawlers

| Crawler | Description | Required Environment Variables/Setup |
|---|---|---|
| FireCrawl | Crawler designed for AI applications | FIRECRAWL_API_KEY |
| Jina Reader | High-accuracy web content extraction | JINA_API_TOKEN |
| Crawl4AI | Browser automation crawler | Run crawl4ai-setup for first-time use |
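
The loader requirements mix a required key, an optional variable, and a one-time setup command, so the sketch below shows one way to read them. It assumes only the variable and command names listed in the tables. `unstructured_settings` and `usable_crawlers` are hypothetical helpers, not DeepSearcher functions, and finding `crawl4ai-setup` on the `PATH` only suggests that Crawl4AI is installed, not that the setup step has already been run.

```python
import os
import shutil


def unstructured_settings(environ=None):
    """Read the Unstructured loader settings; the URL is optional."""
    env = os.environ if environ is None else environ
    api_key = env.get("UNSTRUCTURED_API_KEY")
    if not api_key:
        raise RuntimeError("UNSTRUCTURED_API_KEY is not set")
    # An unset or empty UNSTRUCTURED_URL is reported as None.
    return {"api_key": api_key, "url": env.get("UNSTRUCTURED_URL") or None}


def usable_crawlers(environ=None, which=shutil.which):
    """List the crawlers whose documented prerequisites appear to be met."""
    env = os.environ if environ is None else environ
    usable = []
    if env.get("FIRECRAWL_API_KEY"):
        usable.append("FireCrawl")
    if env.get("JINA_API_TOKEN"):
        usable.append("Jina Reader")
    # Assumption: the setup command being on PATH means the package is installed.
    if which("crawl4ai-setup"):
        usable.append("Crawl4AI")
    return usable


if __name__ == "__main__":
    print("Usable crawlers:", usable_crawlers())
```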

💾 Vector Database Support

DeepSearcher supports the following vector databases for efficient storage and retrieval of embeddings.

| Database | Description | Features |
|---|---|---|
| Milvus | Open-source vector database | High-performance, scalable |
| Zilliz Cloud | Managed Milvus service | Fully managed, maintenance-free |
| Qdrant | Vector similarity search engine | Simple, efficient |