Vector Database Comparison: Qdrant vs Milvus vs Weaviate vs Chroma

Original article by the 灵阙 (Lingque) teaching and research team

Level: Advanced | About 8 min read | Updated 2026-02-28

Architecture, indexing strategies, performance, and production selection criteria for four major vector databases | 2026-02

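1. Why You Need a Vector Database

One of the core paradigms of LLM applications is RAG (Retrieval Augmented Generation). The retrieval step converts text, images, or audio into high-dimensional vectors and then runs approximate nearest neighbor (ANN) search over them. Traditional databases such as PostgreSQL (with pgvector) can also store vectors, but at billion-vector scale, with millisecond latency targets and hybrid filtering requirements, a dedicated vector database is still the better choice.

This article compares four representative products: Qdrant, Milvus, Weaviate, and Chroma.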
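
To make the retrieval step concrete, here is a minimal sketch of exact (brute-force) nearest neighbor search, the baseline that ANN indexes approximate. The three-dimensional vectors and document IDs are made up for illustration; real embedding models emit hundreds of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query: list[float], vectors: dict[str, list[float]], k: int):
    """Score every stored vector against the query: O(N * d) work per query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy corpus (illustrative values only)
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
top = exact_top_k([1.0, 0.0, 0.0], corpus, k=2)
top_ids = [doc_id for doc_id, _ in top]
print(top_ids)  # ['doc_a', 'doc_b']
```

The cost grows linearly with the number of stored vectors, which is why dedicated engines build indexes such as HNSW that trade a small amount of recall for much less work per query.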

2. Architecture Comparison

2.1 Basic Information

Dimension	Qdrant	Milvus	Weaviate	Chroma
Language	Rust	Go + C++	Go	Python + Rust
License	Apache 2.0	Apache 2.0	BSD-3	Apache 2.0
First release	2021	2019	2019	2022
Cloud service	Qdrant Cloud	Zilliz Cloud	Weaviate Cloud	Chroma Cloud (beta)
Storage engine	Custom (segment-based)	RocksDB + custom	LSM + HNSW	SQLite + custom (DuckDB before v0.4)
Distributed	Native sharding	Native sharding	Native sharding	Mainly single-node

2.2 Architecture Diagrams

Qdrant Architecture (Rust-native, segment-based)
+------------------------------------------+
|            Qdrant Node                   |
|  +-------+  +-------+  +-------+         |
|  | Shard |  | Shard |  | Shard |         |
|  |   0   |  |   1   |  |   2   |         |
|  +-------+  +-------+  +-------+         |
|      |          |          |             |
|  +---------+---------+---------+         |
|  | Segment | Segment | Segment |         |
|  | (HNSW)  | (HNSW)  | (mmap)  |         |
|  +---------+---------+---------+         |
|      |                                   |
|  +------------------+                    |
|  | Payload Index    |   <- Filterable    |
|  | (keyword, int,   |      metadata      |
|  |  geo, datetime)  |                    |
|  +------------------+                    |
+------------------------------------------+

Milvus Architecture (Distributed, cloud-native)
+------------------------------------------+
|  Proxy -> Query Coord -> Data Coord      |
|              |                |          |
|         Query Nodes      Data Nodes      |
|              |                |          |
|         +--------+      +--------+       |
|         | Segment|      | Segment|       |
|         | (IVF/  |      | (binlog|       |
|         |  HNSW) |      |  store)|       |
|         +--------+      +--------+       |
|              |                |          |
|         Object Storage (S3 / MinIO)      |
|         Message Queue (Pulsar / Kafka)   |
+------------------------------------------+

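2.3 Key Design Differences

Design decision	Qdrant	Milvus	Weaviate	Chroma
Storage-compute separation	No (local)	Yes (S3)	No (local)	No (local)
Message queue	No	Yes (Pulsar)	No	No
Memory management	mmap + on-disk	In-memory + persistence	mmap	Mostly in-memory
Multi-tenancy	Collection-level	Database-level	Tenant-level	Native
Quantization	Scalar/Binary/Product	IVF_SQ8/PQ	PQ + BQ	No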

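3. Indexing and Retrieval Strategies

3.1 Supported Index Types

Index type	Qdrant	Milvus	Weaviate	Chroma
HNSW	Default	Supported	Default	Default
IVF_FLAT	No	Supported	No	No
IVF_SQ8	No	Supported	No	No
IVF_PQ	No	Supported	No	No
DISKANN	No	Supported	No	No
Flat (brute force)	Supported	Supported	Supported	Supported
Scalar Quantization	Supported	Supported	No	No
Binary Quantization	Supported	No	Supported	No
Product Quantization	Supported	Supported	Supported	No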
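
To illustrate what the quantization rows trade off, here is a simplified sketch of scalar quantization: each float32 component (4 bytes) is mapped to a one-byte integer code, cutting vector memory roughly fourfold at the cost of a bounded rounding error. The per-vector min/max scheme below is an assumption chosen for brevity; production engines typically derive bounds per segment or from quantiles, and their exact schemes differ.

```python
def quantize(vector: list[float]) -> tuple[list[int], float, float]:
    """Map each float to an integer code in [0, 255] using the vector's own min/max."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale

def dequantize(codes: list[int], lo: float, scale: float) -> list[float]:
    """Reconstruct approximate floats from the one-byte codes."""
    return [lo + c * scale for c in codes]

vec = [-0.5, -0.25, 0.0, 0.25, 0.5]
codes, lo, scale = quantize(vec)
restored = dequantize(codes, lo, scale)

# The reconstruction error per component is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, restored))
print(codes[0], codes[-1], max_error <= scale / 2 + 1e-9)
```

Binary and product quantization push the same idea further (one bit per component, or one code per sub-vector), with correspondingly larger recall loss that engines usually offset by rescoring candidates against the original vectors.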

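3.2 Hybrid Search (Vector + Filtering)

Hybrid search is the most common requirement in production: results must be semantically similar and also satisfy metadata conditions.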

# Qdrant: Pre-filtering with payload index (most efficient)
from qdrant_client import QdrantClient
from qdrant_client.models import DatetimeRange, FieldCondition, Filter, MatchValue

client = QdrantClient(url="http://localhost:6333")

results = client.query_points(
    collection_name="documents",
    query=[0.1, 0.2, ...],  # embedding vector
    query_filter=Filter(
        must=[
            # Exact match on a keyword payload field
            FieldCondition(
                key="category",
                match=MatchValue(value="finance"),
            ),
            # Timestamp bounds need DatetimeRange; Range only accepts numbers
            FieldCondition(
                key="created_at",
                range=DatetimeRange(gte="2025-01-01T00:00:00Z"),
            ),
        ]
    ),
    limit=10,
    with_payload=True,
)

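# Milvus: Boolean expression filtering
from pymilvus import Collection, connections

# Collection() needs an active connection; host and port are assumed local defaults
connections.connect(alias="default", host="localhost", port="19530")

collection = Collection("documents")
collection.load()

results = collection.search(
    data=[[0.1, 0.2, ...]],  # embedding vector
    anns_field="embedding",
    # nprobe applies to IVF indexes; an HNSW index takes {"ef": ...} instead
    param={"metric_type": "COSINE", "params": {"nprobe": 10}},
    limit=10,
    expr='category == "finance" and created_at >= "2025-01-01"',
    output_fields=["title", "content", "category"],
)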

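3.3 Filtering Performance Comparison

Filter scenario	Qdrant	Milvus	Weaviate	Chroma
No filter (pure vector)	Fast	Fast	Fast	Fast
Low selectivity (90% match)	Fast	Fast	Fast	Medium
Medium selectivity (50% match)	Fast	Medium	Medium	Slow
High selectivity (1% match)	Fast (payload index)	Medium (scan)	Medium	Very slow
Multi-condition combination	Fast	Medium	Medium	Slow

Qdrant's advantage in filtered search comes from its payload index design: conditions are applied before (or during) the vector search rather than to the results afterwards.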
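
Why the order of filtering matters can be shown with a small sketch. Plain lists stand in for the index here, and the data set, field names, and function names are illustrative assumptions, not any engine's internals. With a highly selective filter, ranking first and filtering afterwards can return fewer than k results (or none), while restricting the candidates first always fills the result set when enough matches exist.

```python
def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# 100 toy points; only ids 49 and 99 carry category "finance" (2% match rate).
points = [
    {
        "id": i,
        "vector": [1.0 - i * 0.01, i * 0.01],
        "category": "finance" if i % 50 == 49 else "other",
    }
    for i in range(100)
]
query = [1.0, 0.0]

def is_finance(point: dict) -> bool:
    return point["category"] == "finance"

def post_filter_search(query, points, k, predicate):
    """Rank everything, keep the top k, then drop hits that fail the predicate."""
    ranked = sorted(points, key=lambda p: dot(query, p["vector"]), reverse=True)
    return [p["id"] for p in ranked[:k] if predicate(p)]

def pre_filter_search(query, points, k, predicate):
    """Restrict candidates to matching points first, then rank only those."""
    candidates = [p for p in points if predicate(p)]
    ranked = sorted(candidates, key=lambda p: dot(query, p["vector"]), reverse=True)
    return [p["id"] for p in ranked[:k]]

post_hits = post_filter_search(query, points, 2, is_finance)
pre_hits = pre_filter_search(query, points, 2, is_finance)
print(post_hits)  # []
print(pre_hits)   # [49, 99]
```

Engines that post-filter usually compensate by over-fetching (requesting many more than k candidates), which is where the latency penalty at high selectivity comes from.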

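4. Performance Benchmarks

4.1 ANN-Benchmarks Summary

Based on public benchmarks (ann-benchmarks.com and vendor-published figures), using 1 million 768-dimensional vectors as the baseline:

Metric	Qdrant	Milvus	Weaviate	Chroma
Index build time	45s	60s	55s	30s
Query latency (p50)	2ms	3ms	4ms	5ms
Query latency (p99)	8ms	12ms	15ms	25ms
QPS (single node)	5000	4000	3000	1500
Recall@10	0.98	0.97	0.97	0.96
Memory usage	3.2GB	4.1GB	3.8GB	5.5GB
Disk usage	2.8GB	3.5GB	3.2GB	4.0GB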
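
For readers unfamiliar with the Recall@10 row: it is the fraction of the true 10 nearest neighbors (found by exact search) that the approximate index actually returned, averaged over many queries. A minimal sketch for a single query, with made-up result IDs:

```python
def recall_at_k(approx_ids: list[int], exact_ids: list[int], k: int) -> float:
    """Fraction of the true top-k neighbors that the approximate search returned."""
    truth = set(exact_ids[:k])
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)

# Illustrative IDs: the approximate result misses one true neighbor (11 vs 12).
exact = [3, 17, 42, 8, 99, 5, 61, 23, 70, 11]
approx = [3, 17, 42, 8, 99, 5, 61, 23, 70, 12]

recall = recall_at_k(approx, exact, k=10)
print(recall)  # 0.9
```

A benchmark figure such as 0.98 therefore means that, on average, about one true neighbor in fifty is missed. Latency and QPS numbers are only comparable between engines at similar recall.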

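4.2 Large-Scale Scenario (100 Million Vectors)

Metric	Qdrant	Milvus	Weaviate	Chroma
Supported	Yes (sharding)	Yes (distributed)	Yes (sharding)	Difficult
Recommended node count	3-5	5-10	3-5	N/A
Query latency	10-20ms	15-30ms	15-25ms	N/A
Horizontal scaling	Linear	Linear	Linear	Limited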
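
The node counts above can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes float32 storage (4 bytes per component) and counts raw vectors only, ignoring index structures, payloads, and replicas, so real deployments need headroom on top of these figures.

```python
GB = 10**9

def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for the raw vectors alone (no index, payload, or replicas)."""
    return num_vectors * dim * bytes_per_value

one_million = raw_vector_bytes(1_000_000, 768)
hundred_million = raw_vector_bytes(100_000_000, 768)
hundred_million_int8 = raw_vector_bytes(100_000_000, 768, bytes_per_value=1)

print(one_million / GB)             # 3.072
print(hundred_million / GB)         # 307.2
print(hundred_million / (5 * GB))   # 61.44 per node across 5 nodes
print(hundred_million_int8 / GB)    # 76.8 with one-byte scalar quantization
```

At 100 million vectors the raw data alone no longer fits in the memory of a typical single machine, which is why sharding, mmap or on-disk storage, and quantization become the deciding features at this scale.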

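5. Developer Experience

5.1 SDK and API Design

Dimension	Qdrant	Milvus	Weaviate	Chroma
REST API	Yes	Yes	Yes (plus GraphQL)	Yes
gRPC	Yes	Yes	Yes	No
Python SDK	Excellent	Good	Good	Excellent
TypeScript SDK	Good	Good	Good	Excellent
Rust SDK	Native	Community	Community	No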
Local mode	Yes (:memory:)	Yes (Milvus Lite)	Experimental (Embedded)	Yes (native)
Docker one-command start	Yes	Complex (multiple components)	Yes	Yes

5.2 Quick Start Comparison

# Chroma: Simplest getting started (no server needed)
import chromadb

client = chromadb.Client()  # In-memory, zero config
collection = client.create_collection("docs")
collection.add(
    ids=["doc1", "doc2"],
    documents=["Hello world", "Vector databases are great"],
    metadatas=[{"source": "test"}, {"source": "article"}],
)
results = collection.query(query_texts=["greeting"], n_results=1)
# Chroma auto-embeds with default model


# Qdrant: Also simple, but more explicit
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

client = QdrantClient(":memory:")  # In-memory mode
client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)
client.upsert(
    collection_name="docs",
    points=[
        PointStruct(id=1, vector=[0.1]*384, payload={"text": "hello"}),
        PointStruct(id=2, vector=[0.2]*384, payload={"text": "world"}),
    ],
)

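6. Cloud Services and Operations

6.1 Managed Service Comparison

Dimension	Qdrant Cloud	Zilliz Cloud	Weaviate Cloud	Chroma Cloud
Status	GA	GA	GA	Beta
Free tier	1GB, permanent	Limited quota	Sandbox	Limited quota
Starting price	~$25/month	~$65/month	~$25/month	TBD
Serverless	Yes	Yes	No	Yes
Private cloud	Yes (BYOC)	Yes (BYOC)	Yes	No
SLA	99.9%	99.9%	99.9%	N/A
Regions	AWS/GCP/Azure	AWS/GCP	AWS/GCP	AWS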

6.2 Self-Hosting Complexity

Self-Hosting Complexity (1=simple, 5=complex)

                    Qdrant  Milvus  Weaviate  Chroma
Docker single node:   1       3        1         1
Kubernetes HA:        2       4        2         3
Backup/Restore:       2       3        2         1
Monitoring:           2       3        2         2
Upgrades:             2       3        2         1
Configuration:        2       4        2         1
---------------------------------------------------
Total:              11/30   20/30    11/30     9/30

Legend: Milvus has the highest operational complexity
        due to multiple components (etcd, MinIO, Pulsar)

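7. Ecosystem Integration

7.1 Framework Integration Comparison

Framework/Tool	Qdrant	Milvus	Weaviate	Chroma
LangChain	Native	Native	Native	Native
LlamaIndex	Native	Native	Native	Native
Haystack	Native	Native	Native	Community
Dify	Built-in	Built-in	Built-in	Built-in
AutoGen	Supported	Supported	Supported	Supported
CrewAI	Supported	Supported	Supported	Default
Spring AI	Native	Native	Native	No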

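7.2 Embedding Integration

Embedding method	Qdrant	Milvus	Weaviate	Chroma
Built-in embedding	No	No	Yes (modular)	Yes (default)
FastEmbed	Officially maintained	No	No	No
OpenAI integration	SDK support	SDK support	Native module	Native support
Local models	User-managed	User-managed	Module support	Function injection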

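8. Selection Decisions

8.1 Recommendations by Scenario

Scenario	First choice	Rationale
Rapid prototype / PoC	Chroma	Zero config, Python-native, up and running in 5 minutes
Production RAG (medium scale)	Qdrant	Strong performance, simple operations, strong filtering
Billion-scale enterprise search	Milvus/Zilliz	Natively distributed, rich set of index types
Multi-tenant SaaS	Weaviate	Native multi-tenancy, built-in embedding
Embedded / Edge	Qdrant	Rust-native, low memory, supports :memory:
Existing K8s infrastructure	Milvus	Cloud-native design, well suited to K8s orchestration
Data privacy first	Qdrant/self-hosted	BYOC support, data stays within national borders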

8.2 Overall Scores

Dimension (weight)	Qdrant	Milvus	Weaviate	Chroma
Performance (25%)	9.5	8.5	8.0	6.5
Ease of use (20%)	9.0	6.0	8.0	9.5
Scalability (20%)	8.5	9.5	8.0	5.0
Ecosystem integration (15%)	9.0	9.0	8.5	8.5
Operations cost (10%)	9.0	5.0	8.0	9.0
Cloud service (10%)	8.5	8.5	8.0	5.0
Weighted total	9.0	7.9	8.1	7.2
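
The weighted totals follow directly from the per-dimension scores and weights, so they are easy to recompute, and to re-run with weights that reflect your own priorities. A minimal sketch using the numbers from the table above:

```python
# Weights, in order: performance, ease of use, scalability, ecosystem, operations, cloud
WEIGHTS = [0.25, 0.20, 0.20, 0.15, 0.10, 0.10]

SCORES = {
    "Qdrant":   [9.5, 9.0, 8.5, 9.0, 9.0, 8.5],
    "Milvus":   [8.5, 6.0, 9.5, 9.0, 5.0, 8.5],
    "Weaviate": [8.0, 8.0, 8.0, 8.5, 8.0, 8.0],
    "Chroma":   [6.5, 9.5, 5.0, 8.5, 9.0, 5.0],
}

def weighted_total(scores: list[float], weights: list[float]) -> float:
    """Sum of score * weight over all dimensions."""
    return sum(s * w for s, w in zip(scores, weights))

totals = {name: round(weighted_total(s, WEIGHTS), 1) for name, s in SCORES.items()}
for name, total in totals.items():
    print(f"{name}: {total}")
# Qdrant: 9.0
# Milvus: 7.9
# Weaviate: 8.1
# Chroma: 7.2
```

A team that weights scalability at 40% instead of 20%, for example, would see Milvus move up considerably, so treat the ranking as a function of the weights rather than a fixed verdict.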

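9. Migration and Interoperability

If you may need to migrate between vector databases, abstract a unified interface at the application layer: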

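from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any

from qdrant_client import AsyncQdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct

@dataclass
class SearchResult:
    id: str
    score: float
    payload: dict[str, Any]

class VectorStore(ABC):
    @abstractmethod
    async def upsert(self, collection: str, id: str,
                     vector: list[float], payload: dict) -> None: ...

    @abstractmethod
    async def search(self, collection: str, vector: list[float],
                     limit: int = 10, filters: dict | None = None
                     ) -> list[SearchResult]: ...

    @abstractmethod
    async def delete(self, collection: str, ids: list[str]) -> None: ...

class QdrantStore(VectorStore):
    def __init__(self, client: AsyncQdrantClient) -> None:
        # The async client keeps these coroutines from blocking the event loop
        self.client = client

    @staticmethod
    def _build_filter(filters: dict) -> Filter:
        # Map generic {"field": value} pairs to exact-match conditions (AND)
        return Filter(must=[
            FieldCondition(key=key, match=MatchValue(value=value))
            for key, value in filters.items()
        ])

    async def upsert(self, collection, id, vector, payload):
        # Qdrant point IDs must be unsigned integers or UUID strings
        await self.client.upsert(collection_name=collection, points=[
            PointStruct(id=id, vector=vector, payload=payload)
        ])

    async def search(self, collection, vector, limit=10, filters=None):
        # Convert generic filters to Qdrant filter format
        qfilter = self._build_filter(filters) if filters else None
        results = await self.client.query_points(
            collection_name=collection,
            query=vector,
            query_filter=qfilter,
            limit=limit,
        )
        return [SearchResult(str(r.id), r.score, r.payload or {})
                for r in results.points]

    async def delete(self, collection, ids):
        # A plain list of point IDs is accepted as the selector
        await self.client.delete(collection_name=collection, points_selector=ids)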

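With this abstraction, switching the underlying vector database only requires implementing a new VectorStore subclass; business code stays unchanged.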

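10. Summary

Choosing a vector database comes down to matching your engineering stage and operational capacity. Chroma suits the prototyping phase, Qdrant offers the best balance for production, Milvus fits large-scale deployments that already run on K8s, and Weaviate stands out for multi-tenancy and built-in embedding.

Do not prepare for "billion-scale" prematurely: most RAG applications run comfortably below the million-vector mark. Start with the option that gives you the best developer experience, and migrate only when you actually hit a bottleneck.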

Maurice | maurice_wen@proton.me