Open Source · Cloud-Native · Geo-Friendly

The WAL Engine
Built for the Cloud Era

A cloud-native Write-Ahead Log storage engine that leverages object storage for ultra-low cost, high throughput, and geo-distributed durability.

Peak Throughput (S3): 750 MB/s
Latency (Local): 1.8 ms
Backend Efficiency: 60-80%
Cloud Providers: 6+

Engineered for the Cloud,
Designed for Scale

Every component of Woodpecker is built from the ground up for cloud-native infrastructure, delivering the reliability and performance your systems demand.

Cloud-Native WAL

Uses cloud object storage (S3, GCS, Azure, OSS) as the durable layer, decoupling compute from storage so that capacity scales with the object store instead of requiring up-front provisioning.

Geo-Distributed Friendly

Designed for multi-region deployments with cross-geo replication support. Leverages the cloud provider's built-in geo-redundancy for global data durability.

Ultra-High Throughput

Achieves 60-80% of the maximum throughput of the storage backend: up to 750 MB/s on S3 and 450 MB/s on local storage, using optimized batching and async writes.
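
The effect of batching can be illustrated with a minimal sketch. The `batcher` type below is hypothetical and is not Woodpecker's implementation or API; it only shows how buffering small appends until a size threshold turns many writes into a few backend requests.

```go
package main

import "fmt"

// batcher is a hypothetical helper, not part of Woodpecker's API.
// It buffers payloads and hands them to flush as one batch once
// the buffered size reaches maxBytes.
type batcher struct {
	maxBytes int
	pending  [][]byte
	size     int
	flush    func(batch [][]byte) // stands in for one backend write
}

// Append buffers p and flushes when the size threshold is reached.
func (b *batcher) Append(p []byte) {
	b.pending = append(b.pending, p)
	b.size += len(p)
	if b.size >= b.maxBytes {
		b.Flush()
	}
}

// Flush writes any buffered payloads as a single batch.
func (b *batcher) Flush() {
	if len(b.pending) == 0 {
		return
	}
	b.flush(b.pending)
	b.pending, b.size = nil, 0
}

// countFlushes appends n payloads of payloadSize bytes and returns
// how many backend writes the batcher issued in total.
func countFlushes(n, payloadSize, maxBytes int) int {
	flushes := 0
	b := &batcher{maxBytes: maxBytes, flush: func([][]byte) { flushes++ }}
	for i := 0; i < n; i++ {
		b.Append(make([]byte, payloadSize))
	}
	b.Flush() // write out the final partial batch, if any
	return flushes
}

func main() {
	// Ten 100-byte appends with a 400-byte threshold need 3 backend writes.
	fmt.Println(countFlushes(10, 100, 400))
}
```

A real engine would likely also flush on a timer so that a slow trickle of appends is not held indefinitely; that part is left out here.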

Flexible Deployment

Deploy as a standalone service with a dedicated LogStore cluster, or embed directly as a library in your application. Choose what fits your architecture.

Innovative Quorum Protocol

Novel quorum-based replication with configurable ensemble size, write quorum, and ack quorum. Supports affinity-aware node selection and AZ-aware placement.
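
As a rough illustration of how the three parameters relate, the sketch below models them in Go. The type, field names, and the ordering constraint (ack quorum at most write quorum, write quorum at most ensemble size) are assumptions borrowed from how such parameters are commonly defined in quorum-replicated logs; they are not taken from Woodpecker's code.

```go
package main

import (
	"errors"
	"fmt"
)

// QuorumConfig is a hypothetical type for illustration only.
type QuorumConfig struct {
	EnsembleSize int // nodes selected to store a log segment
	WriteQuorum  int // nodes each entry is sent to
	AckQuorum    int // acknowledgements required before a write succeeds
}

// Validate checks the assumed ordering constraint between the parameters.
func (c QuorumConfig) Validate() error {
	if c.AckQuorum < 1 || c.AckQuorum > c.WriteQuorum || c.WriteQuorum > c.EnsembleSize {
		return errors.New("require 1 <= AckQuorum <= WriteQuorum <= EnsembleSize")
	}
	return nil
}

// Committed reports whether enough nodes have acknowledged an entry.
func (c QuorumConfig) Committed(acks int) bool {
	return acks >= c.AckQuorum
}

func main() {
	cfg := QuorumConfig{EnsembleSize: 3, WriteQuorum: 3, AckQuorum: 2}
	fmt.Println("valid:", cfg.Validate() == nil)
	fmt.Println("committed after 1 ack:", cfg.Committed(1))
	fmt.Println("committed after 2 acks:", cfg.Committed(2))
}
```

With an ack quorum smaller than the write quorum, a write can complete while one replica is still slow or unreachable, which is the usual reason for keeping the two values separate.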

Lightweight & Extensible

Minimal dependencies: only etcd, used for metadata. Pluggable storage backends, Prometheus metrics, OpenTelemetry tracing, and a condition-write fence mechanism are built in.

Two Modes, One Engine

Choose the deployment model that matches your performance and operational requirements.

[Architecture diagram: Embedded Client, etcd, Cloud Object Storage]

Embedded Mode

A lightweight library integrated directly into your application with minimal operational overhead. Requires only etcd for coordination.

  • Zero additional services — embed directly in your Go application
  • Condition-write fence mechanism prevents split-brain scenarios
  • Multi-cloud object storage support: AWS, GCP, Azure, Aliyun, Tencent
  • Auto-fallback to distributed locks when condition writes are unavailable
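
The fencing idea can be sketched with an in-memory stand-in for an object store that supports create-if-absent writes. Everything below (`memStore`, `PutIfAbsent`, `blockKey`, and the key layout) is hypothetical and only illustrates the general technique: a new writer claims the next object key, so a stale writer's conditional write to that key fails instead of silently overwriting data.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrExists is returned when a conditional write finds the key already taken.
var ErrExists = errors.New("object already exists")

// memStore is a hypothetical in-memory stand-in for an object store
// that supports conditional (create-if-absent) writes.
type memStore struct {
	mu      sync.Mutex
	objects map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{objects: map[string][]byte{}}
}

// PutIfAbsent stores data under key only if the key does not exist yet.
func (s *memStore) PutIfAbsent(key string, data []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.objects[key]; ok {
		return ErrExists
	}
	s.objects[key] = data
	return nil
}

// blockKey names the object holding block n of a log (assumed layout).
func blockKey(logName string, n int) string {
	return fmt.Sprintf("%s/%d", logName, n)
}

func main() {
	store := newMemStore()

	// The original writer appends blocks 0 and 1.
	_ = store.PutIfAbsent(blockKey("my-wal", 0), []byte("a"))
	_ = store.PutIfAbsent(blockKey("my-wal", 1), []byte("b"))

	// A new writer takes over and fences the log by claiming block 2.
	_ = store.PutIfAbsent(blockKey("my-wal", 2), []byte("FENCE"))

	// The stale writer's next append is rejected rather than applied.
	err := store.PutIfAbsent(blockKey("my-wal", 2), []byte("c"))
	fmt.Println("stale write rejected:", errors.Is(err, ErrExists))
}
```

When the backend offers no conditional write, the same exclusion has to come from somewhere else, which is where a distributed lock would take over.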

[Architecture diagram: Client, etcd, LogStore cluster (three nodes), Cloud Object Storage]

Service Mode

A dedicated LogStore cluster acts as a high-performance caching layer between clients and object storage, maximizing throughput and minimizing latency.

  • Dedicated LogStore cluster with gRPC communication
  • Data prefetching and read/write caching for lowest latency
  • Gossip-based cluster membership with auto-discovery
  • Quorum-based replication with AZ-aware node placement
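
One simple way to picture AZ-aware placement is round-robin selection across availability zones. The `Node` type and `selectEnsemble` function below are hypothetical and show only one possible strategy, not the one Woodpecker uses.

```go
package main

import "fmt"

// Node is a hypothetical description of a LogStore node.
type Node struct {
	ID string
	AZ string // availability zone the node runs in
}

// selectEnsemble picks up to n nodes, taking one node per zone in turn
// so that replicas are spread across as many zones as possible.
func selectEnsemble(nodes []Node, n int) []Node {
	// Group nodes by zone, remembering the order zones were first seen.
	var azOrder []string
	byAZ := map[string][]Node{}
	for _, nd := range nodes {
		if _, ok := byAZ[nd.AZ]; !ok {
			azOrder = append(azOrder, nd.AZ)
		}
		byAZ[nd.AZ] = append(byAZ[nd.AZ], nd)
	}

	var picked []Node
	for len(picked) < n {
		progressed := false
		for _, az := range azOrder {
			if len(picked) == n {
				break
			}
			if q := byAZ[az]; len(q) > 0 {
				picked = append(picked, q[0])
				byAZ[az] = q[1:]
				progressed = true
			}
		}
		if !progressed {
			break // fewer than n nodes are available
		}
	}
	return picked
}

func main() {
	nodes := []Node{
		{ID: "a1", AZ: "az-a"},
		{ID: "a2", AZ: "az-a"},
		{ID: "b1", AZ: "az-b"},
		{ID: "c1", AZ: "az-c"},
	}
	for _, nd := range selectEnsemble(nodes, 3) {
		fmt.Println(nd.ID, nd.AZ)
	}
}
```

With three zones and an ensemble of three, each replica lands in a different zone, so losing one zone leaves two replicas reachable.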

Numbers That Speak

Single-node, single-client, single-log-stream benchmark results compared to industry standards.

System     Throughput   Latency   Backend     Backend max   Efficiency
Kafka      130 MB/s     58 ms     Local       750 MB/s      17%
Pulsar     107 MB/s     35 ms     Local       750 MB/s      14%
WP MinIO   71 MB/s      184 ms    MinIO       110 MB/s      65%
WP Local   450 MB/s     1.8 ms    Local       750 MB/s      60%
WP S3      750 MB/s     166 ms    Amazon S3   1.1 GB/s      68%

Woodpecker consistently achieves 60-80% of the maximum possible throughput for each storage backend, an exceptional efficiency level for middleware. Benchmarks were conducted on AWS EC2 with standard instance types.
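
The efficiency column appears to be measured throughput divided by the backend's maximum throughput. The sketch below recomputes it from the published figures; the type and function names are hypothetical, and 1.1 GB/s is assumed to mean 1100 MB/s.

```go
package main

import (
	"fmt"
	"math"
)

// result holds one benchmark row; the values used in main are
// copied from the benchmark figures in this section.
type result struct {
	System         string
	ThroughputMBps float64
	BackendMaxMBps float64
}

// efficiencyPercent returns throughput as a whole-number percentage
// of the backend's maximum throughput.
func efficiencyPercent(throughputMBps, backendMaxMBps float64) int {
	return int(math.Round(throughputMBps / backendMaxMBps * 100))
}

func main() {
	rows := []result{
		{"Kafka", 130, 750},
		{"Pulsar", 107, 750},
		{"WP MinIO", 71, 110},
		{"WP Local", 450, 750},
		{"WP S3", 750, 1100}, // 1.1 GB/s taken as 1100 MB/s
	}
	for _, r := range rows {
		fmt.Printf("%-9s %d%%\n", r.System, efficiencyPercent(r.ThroughputMBps, r.BackendMaxMBps))
	}
}
```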

Built for Critical Workloads

Initially built for the Milvus vector database and Zilliz Cloud, with strong potential across diverse cloud workloads.

Distributed Database WAL

Ensure write-ahead logging with strict ordering and persistence for distributed databases. Seamlessly replaces on-prem WAL solutions.

Streaming & Event Sourcing

Provide a durable, ordered event log for stream processing frameworks with cloud-native scalability and cost efficiency.

Consensus Protocol Logs

Serve as a persistent log backend for distributed consensus algorithms like Raft and Paxos, with built-in quorum support.

Transaction Logs

Store ordered, durable logs for financial or critical business applications with strict consistency guarantees and multi-cloud support.

Simple API,
Powerful Engine

Get started with just a few lines of Go code. Woodpecker's clean API abstracts away the complexity of distributed log storage.

  1. Create a client: connect via embedded mode or a remote service
  2. Open a log: create or open named log streams
  3. Write & Read: sync or async writes with ordered reads

main.go
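// Error handling is omitted for brevity; ctx, start, and process are
// assumed to be defined elsewhere.

// Create an embedded client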
cfg, _ := config.NewConfiguration()
client, _ := woodpecker.NewEmbedClientFromConfig(ctx, cfg)

// Create and open a log
client.CreateLog(ctx, "my-wal")
logHandle, _ := client.OpenLog(ctx, "my-wal")

// Open a writer
writer, _ := logHandle.OpenLogWriter(ctx)

// Async write with properties
result := writer.WriteAsync(ctx,
    &log.WriteMessage{
        Payload: []byte("hello world"),
        Properties: map[string]string{
            "key": "value",
        },
    },
)
writeResult := <-result

// Open a reader and iterate
reader, _ := logHandle.OpenLogReader(ctx, start)
for {
    msg, err := reader.ReadNext(ctx)
    if err != nil { break }
    process(msg.Payload)
}

Multi-Cloud, Multi-Backend

Seamlessly integrates with your existing cloud infrastructure and monitoring stack.

Amazon S3
Native S3 API support
Google GCS
Cloud Storage compatible
Azure Blob
Azure Storage integration
Aliyun OSS
China cloud native
Tencent COS
China cloud support
MinIO
Self-hosted S3 compatible
Prometheus
Metrics & monitoring
OpenTelemetry
Distributed tracing

Ready to Get Started?

Deploy Woodpecker in minutes and start building cloud-native log infrastructure.

Read the Docs · Star on GitHub