Woodpecker Docs

Welcome to Woodpecker

A cloud-native Write-Ahead Log storage engine that leverages object storage for low-cost, high-throughput, and reliable logging.

What is Woodpecker?

Woodpecker is a cloud-native WAL (Write-Ahead Log) storage implementation designed to fully utilize cloud-native infrastructure for scalability and durability. Unlike traditional on-premises WAL solutions or custom-built distributed logging systems, Woodpecker uses cloud object storage as its durable storage layer.

Woodpecker is currently under active development. It was initially built for the Milvus vector database and Zilliz Cloud, but is broadly applicable across cloud workloads.

Key Features

  • Cloud-Native WAL — Uses cloud object storage (S3, GCS, Azure, OSS) as the durable layer for scalability and cost-effectiveness
  • High-Throughput Writes — Optimized write strategies achieving 60-80% of maximum backend throughput
  • Efficient Log Reads — Memory management and prefetching strategies for optimized sequential access
  • Ordered Durability — Strict sequential ordering guarantees for log persistence
  • Flexible Deployment — Standalone service mode or embedded library mode
  • Multi-Cloud Support — AWS, GCP, Azure, Aliyun, Tencent Cloud, MinIO
  • Innovative Quorum Protocol — Configurable ensemble/write/ack quorum with AZ-aware placement
  • Condition Write Fence — Prevents split-brain with built-in fence mechanism

Core Components

Component         Description
Client            Read/write protocol layer for interacting with Woodpecker
EmbeddedClient    Client with built-in LogStore for embedded deployments
LogStore          Handles high-speed log writes, batching, and cloud storage uploads
ObjectStorage     Cloud object storage backend (S3, GCS, OSS, etc.)
MetadataProvider  etcd-backed metadata and coordination service
QuorumDiscovery   Quorum-based node selection and replication management

Design Benefits

  • Lightweight deployment — Easy integration with minimal dependencies
  • Decoupled compute & storage — Reduced operational complexity
  • Auto-scaling storage — Eliminates capacity planning overhead
  • Reduced local disk dependency — Ideal for cloud-native workloads
  • One-write, multiple-reads — Data-embedded metadata design enables concurrent reads without writer synchronization

Quick Start

Get Woodpecker up and running in minutes. Choose between embedded mode (library) or service mode (dedicated cluster).

Install Client Library

Add Woodpecker as a Go dependency to your project:

bash
go get github.com/zilliztech/woodpecker@latest

Choose Your Mode

Woodpecker supports two deployment modes with different clients. Pick the one that fits your architecture.

Embedded mode runs the LogStore in-process. No external LogStore service needed — just etcd and object storage. Best for single-instance deployments and applications needing local WAL integration.

Prerequisites

Only an etcd instance is required. Object storage (MinIO/S3) is configured via woodpecker.yaml.

bash
# Start etcd (for metadata coordination)
docker run -d --name etcd -p 2379:2379 \
  quay.io/coreos/etcd:v3.5.0 /usr/local/bin/etcd \
  --advertise-client-urls http://0.0.0.0:2379 \
  --listen-client-urls http://0.0.0.0:2379

# Start MinIO (for object storage)
docker run -d --name minio -p 9000:9000 -p 9001:9001 \
  minio/minio server /data --console-address ":9001"

Create Client

Use NewEmbedClientFromConfig — it auto-creates etcd and storage connections, and starts an in-process LogStore.

main.go
cfg, _ := config.NewConfiguration()
client, _ := woodpecker.NewEmbedClientFromConfig(ctx, cfg)
defer client.Close(ctx)

Write Data

main.go
// Create and open a log
client.CreateLog(ctx, "my-wal")
logHandle, _ := client.OpenLog(ctx, "my-wal")

// Open a writer (acquires distributed lock)
writer, _ := logHandle.OpenLogWriter(ctx)

// Synchronous write
result := writer.Write(ctx,
    &log.WriteMessage{
        Payload:    []byte("hello world"),
        Properties: map[string]string{"key": "value"},
    },
)
// result.LogMessageId = {SegmentId, EntryId}

// Asynchronous write (higher throughput)
ch := writer.WriteAsync(ctx,
    &log.WriteMessage{
        Payload: []byte("async hello"),
    },
)
r := <-ch // wait for result

Read Data

main.go
// Open a reader from the earliest position
start := log.EarliestLogMessageID()
reader, _ := logHandle.OpenLogReader(ctx, &start, "my-reader")
defer reader.Close(ctx)

// Iterate through all messages
for {
    msg, err := reader.ReadNext(ctx)
    if err != nil { break }
    fmt.Printf("ID: {%d,%d} Payload: %s\n",
        msg.Id.SegmentId, msg.Id.EntryId,
        string(msg.Payload))
}

Service mode uses a dedicated LogStore cluster as a caching layer for higher throughput and lower latency. It requires running LogStore server(s) with gossip-based cluster membership, and uses staged storage (a local disk + object storage hybrid).

Deploy LogStore Cluster

Service mode requires a running LogStore cluster. Choose one of the following approaches:

Option A: Using Docker Compose (Recommended)

Use the provided deployments/docker-compose.yaml template to pull the latest image and start a complete cluster:

bash
# Clone the repository for deployment templates
git clone https://github.com/zilliztech/woodpecker.git
cd woodpecker/deployments

# Start the complete cluster (etcd + MinIO + Jaeger + 3 nodes)
docker compose up -d

# Check cluster status
./deploy.sh status

Option B: Build from Source

Build Woodpecker binary and Docker image locally, then deploy:

bash
# 1. Build binary and Docker image
git clone https://github.com/zilliztech/woodpecker.git
cd woodpecker/deployments
./deploy.sh build

# 2. Start the cluster using locally built image
./deploy.sh up

# 3. Check cluster status
./deploy.sh status

Both options deploy 3 Woodpecker nodes (gRPC ports 18080–18082) with gossip-based discovery (ports 17946–17948), plus etcd, MinIO, and Jaeger tracing. See deployments/README.md for full configuration details.

Create Client

Use NewClient with an external etcd connection — the client connects to the LogStore cluster via gRPC.

main.go
cfg, _ := config.NewConfiguration()

// Connect to etcd
etcdCli, _ := clientv3.New(clientv3.Config{
    Endpoints: []string{"localhost:2379"},
})

// Create service-mode client
client, _ := woodpecker.NewClient(ctx, cfg, etcdCli, false)
defer client.Close(ctx)

Write & Read Data

Once the client is created, the write and read API is identical to embedded mode:

main.go
// Same API as embedded mode after client creation
client.CreateLog(ctx, "my-wal")
logHandle, _ := client.OpenLog(ctx, "my-wal")

// Write
writer, _ := logHandle.OpenLogWriter(ctx)
result := writer.Write(ctx, &log.WriteMessage{
    Payload: []byte("hello from service mode"),
})

// Read
start := log.EarliestLogMessageID()
reader, _ := logHandle.OpenLogReader(ctx, &start, "reader-1")
msg, _ := reader.ReadNext(ctx)
fmt.Println(string(msg.Payload))

Mode Comparison

Aspect        Embedded Mode                            Service Mode
Client        NewEmbedClientFromConfig()               NewClient(ctx, cfg, etcdCli, managed)
LogStore      In-process (automatic)                   Dedicated cluster (gRPC)
Storage       Direct to object storage                 Staged: local disk + object storage
Dependencies  etcd + object storage                    etcd + object storage + LogStore servers
Latency       No network overhead                      Lower via caching layer
Throughput    Good                                     Higher (prefetching + batching)
Best for      Single instance, testing, embedded apps  Multi-instance, high-throughput production

For high-throughput use cases, prefer WriteAsync with batch writes. This allows Woodpecker to optimize batching and flush operations.
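
The batch pattern can be sketched with plain channels. Note that writeAsync below is a stand-in for writer.WriteAsync, not the actual Woodpecker API — it only mimics the "return a channel now, deliver the result later" shape:

```go
package main

import "fmt"

// result mimics the shape of an asynchronous write acknowledgment.
type result struct {
	seq int
	err error
}

// writeAsync is a stand-in for writer.WriteAsync: it returns immediately
// with a channel that later delivers the write result.
func writeAsync(seq int, payload []byte) <-chan result {
	ch := make(chan result, 1)
	go func() {
		// A real writer would buffer the payload and flush in batches;
		// here we simply acknowledge.
		_ = payload
		ch <- result{seq: seq}
	}()
	return ch
}

func main() {
	// Issue a batch of writes without waiting on each one...
	pending := make([]<-chan result, 0, 10)
	for i := 0; i < 10; i++ {
		pending = append(pending, writeAsync(i, []byte(fmt.Sprintf("msg-%d", i))))
	}
	// ...then collect all acknowledgments at the end of the batch.
	for _, ch := range pending {
		if r := <-ch; r.err != nil {
			panic(r.err)
		}
	}
	fmt.Println("batch of 10 acknowledged")
}
```

Because acknowledgments are collected only after the whole batch is enqueued, the writer is free to coalesce many messages into a single flush.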

Configuration

Comprehensive guide to configuring Woodpecker with all available options and their defaults.

Full Configuration Example

Woodpecker uses YAML configuration. Create a woodpecker.yaml file:

woodpecker.yaml
woodpecker:
  meta:
    type: etcd
    prefix: woodpecker
  client:
    segmentAppend:
      queueSize: 100            # max queued append requests
      maxRetries: 2             # max retries per append
    segmentRollingPolicy:
      maxSize: 100000000        # 100MB per segment
      maxInterval: 800          # seconds between rolls
      maxBlocks: 1000           # max blocks per segment
    auditor:
      maxInterval: 5            # seconds between audits
    sessionMonitor:
      checkInterval: 3          # seconds
      maxFailures: 5            # consecutive fails before invalid
    # quorum: (service mode only, see Quorum section)
  logstore:
    segmentSyncPolicy:
      maxInterval: 1000         # ms between syncs
      maxIntervalForLocalStorage: 5  # ms for local backend
      maxEntries: 2000          # max entries per buffer
      maxBytes: 100000000       # 100MB buffer limit
      maxFlushRetries: 3
      retryInterval: 2000       # ms between retries
      maxFlushSize: 16000000    # 16MB per block flush
      maxFlushThreads: 8
    segmentCompactionPolicy:
      maxBytes: 32000000       # 32MB merged block size
      maxParallelUploads: 4
      maxParallelReads: 8
    segmentReadPolicy:
      maxBatchSize: 16000000   # 16MB read batch
      maxFetchThreads: 32
    retentionPolicy:
      ttl: 259200             # 72 hours for truncated segments
    fencePolicy:
      conditionWrite: "auto"   # auto | enable | disable
    grpc:
      serverMaxSendSize: 536870912   # 512MB
      serverMaxRecvSize: 268435456   # 256MB
      clientMaxSendSize: 268435456   # 256MB
      clientMaxRecvSize: 536870912   # 512MB
    processorCleanupPolicy:
      cleanupInterval: 60     # seconds
      maxIdleTime: 300        # seconds
      shutdownTimeout: 15     # seconds
  storage:
    type: default             # default(=minio), minio, local, service
    rootPath: /var/lib/woodpecker

Client Configuration

Parameter                         Description                                       Default
segmentAppend.queueSize           Maximum queued segment append requests            100
segmentAppend.maxRetries          Maximum retries for append operations             2
segmentRollingPolicy.maxSize      Maximum segment size in bytes                     100MB
segmentRollingPolicy.maxInterval  Maximum interval between segment rolls (seconds)  800
segmentRollingPolicy.maxBlocks    Maximum blocks per segment                        1000
auditor.maxInterval               Audit interval in seconds                         5
sessionMonitor.checkInterval      Session health check interval (seconds)           3
sessionMonitor.maxFailures        Consecutive failures before session invalid       5

LogStore Configuration

Parameter                                     Description                             Default
segmentSyncPolicy.maxInterval                 Max sync interval (milliseconds)        1000
segmentSyncPolicy.maxIntervalForLocalStorage  Sync interval for local storage (ms)    5
segmentSyncPolicy.maxEntries                  Max entries in write buffer             2000
segmentSyncPolicy.maxBytes                    Max write buffer size                   100MB
segmentSyncPolicy.maxFlushRetries             Max sync retries                        3
segmentSyncPolicy.retryInterval               Retry interval (ms)                     2000
segmentSyncPolicy.maxFlushSize                Max block flush size                    16MB
segmentSyncPolicy.maxFlushThreads             Concurrent flush threads                8
segmentCompactionPolicy.maxBytes              Max merged block size after compaction  32MB
segmentCompactionPolicy.maxParallelUploads    Parallel compaction upload threads      4
segmentCompactionPolicy.maxParallelReads      Parallel compaction read threads        8
segmentReadPolicy.maxBatchSize                Max read batch size                     16MB
segmentReadPolicy.maxFetchThreads             Concurrent fetch threads                32
retentionPolicy.ttl                           TTL for truncated segments (seconds)    259200 (72h)
processorCleanupPolicy.cleanupInterval        Idle processor scan interval (seconds)  60
processorCleanupPolicy.maxIdleTime            Idle time before cleanup (seconds)      300

Storage Configuration

Parameter         Description                                               Default
storage.type      Storage backend: default (=minio), minio, local, service  default
storage.rootPath  Root path for log data files                              /var/lib/woodpecker

etcd Configuration

woodpecker.yaml
etcd:
  endpoints: [localhost:2379]
  rootPath: by-dev
  metaSubPath: meta
  kvSubPath: kv
  requestTimeout: 10000        # ms
  use:
    embed: false              # enable embedded etcd
  data:
    dir: default.etcd         # embedded etcd data dir
  ssl:
    enabled: false
    tlsCert: /path/to/cert.pem
    tlsKey: /path/to/key.pem
    tlsCACert: /path/to/ca.pem
  auth:
    enabled: false
    userName: ""
    password: ""

MinIO / S3 Configuration

woodpecker.yaml
minio:
  address: localhost
  port: 9000
  accessKeyID: minioadmin
  secretAccessKey: minioadmin
  useSSL: false
  bucketName: a-bucket
  createBucket: false
  rootPath: files
  useIAM: false
  cloudProvider: aws           # aws, gcp, aliyun, gcpnative
  region: ""
  useVirtualHost: false
  requestTimeoutMs: 10000
  listObjectsMaxKeys: 0       # 0 = unlimited

Parameter               Description                                  Default
minio.address           MinIO/S3 server address                      localhost
minio.port              MinIO/S3 server port                         9000
minio.accessKeyID       Access key for authentication                minioadmin
minio.secretAccessKey   Secret key for authentication                minioadmin
minio.bucketName        Storage bucket name                          a-bucket
minio.createBucket      Auto-create bucket if not exists             false
minio.useIAM            Use IAM role authentication                  false
minio.cloudProvider     Cloud provider: aws, gcp, aliyun, gcpnative  aws
minio.region            Storage region                               (empty)
minio.useVirtualHost    Virtual host style addressing                false
minio.requestTimeoutMs  Request timeout in milliseconds              10000

Quorum Configuration (Service Mode)

Required when using storage.type: service with a LogStore cluster.

woodpecker.yaml
woodpecker:
  client:
    quorum:
      quorumBufferPools:
        - name: "region-1"
          seeds: ["logstore1:7946", "logstore2:7946"]
      quorumSelectStrategy:
        affinityMode: "soft"   # soft | hard
        replicas: 3             # 3 or 5
        strategy: "random"      # see table below
        customPlacement:       # only for strategy: custom
          - name: "replica-1"
            region: "region-1"
            az: "az-1"
            resourceGroup: "rg-1"

Strategy Value       Description
random               Random node selection (default)
single-az-single-rg  All nodes in same AZ and resource group
single-az-multi-rg   Same AZ, different resource groups
multi-az-single-rg   Multiple AZs, same resource group
multi-az-multi-rg    Multiple AZs and resource groups
cross-region         Nodes across different regions (requires 2+ buffer pools)
random-group         Pre-partitioned random groups
custom               Explicit placement rules via customPlacement

Fence Policy

Value    Behavior
auto     Auto-detect condition write support; fall back to distributed locks (default)
enable   Require condition write; panic if unavailable
disable  Force distributed locks; skip detection

gRPC Configuration

Parameter               Description                      Default
grpc.serverMaxSendSize  Server max send message size     512MB
grpc.serverMaxRecvSize  Server max receive message size  256MB
grpc.clientMaxSendSize  Client max send message size     256MB
grpc.clientMaxRecvSize  Client max receive message size  512MB

System Architecture

Understanding Woodpecker's two deployment modes and core design.

Embedded Mode

In embedded mode, Woodpecker operates as a lightweight library integrated directly into your application. It requires only etcd for metadata storage and coordination, keeping dependencies minimal.

EmbeddedClient
  ├──► etcd (metadata)
  └──► Cloud Object Storage (MinIO / S3 / Aliyun / GCS / Azure)

Key characteristics of embedded mode:

  • Zero additional services required beyond etcd
  • Client directly writes to object storage
  • Condition-write fence mechanism prevents split-brain
  • Best suited for applications needing simple WAL integration

Service Mode

In service mode, WAL read/write operations and caching logic are decoupled into a dedicated LogStore cluster service. This acts as a high-performance caching layer between clients and object storage.

Client
  ├──► etcd (metadata)
  └──► LogStore cluster (3 nodes) ──► Cloud Object Storage (MinIO / S3 / Aliyun / GCS / Azure)

Key characteristics of service mode:

  • Dedicated LogStore cluster with gRPC communication
  • Data prefetching and read/write caching for lower latency
  • Gossip-based cluster membership and auto-discovery
  • Quorum-based replication with AZ-aware node placement
  • Best suited for high-throughput, latency-sensitive workloads

Core Components

Component         Role
Client            Read/write protocol layer connecting applications to Woodpecker
EmbeddedClient    Combined client + LogStore for embedded deployments
LogStore          High-speed log writes, batching, caching, and cloud storage uploads
ObjectStorage     Durable storage backend using S3-compatible APIs
MetadataProvider  etcd-backed metadata storage, coordination, and distributed locking
QuorumDiscovery   Service-mode quorum-based node selection and replication management via gossip

Data Flow

Write Path

  1. Application calls Write() or WriteAsync() on a LogWriter
  2. Writer validates the session lock is still valid (distributed lock via etcd)
  3. Gets or creates a writable segment via rolling policy (maxSize=100MB, maxInterval=800s, maxBlocks=1000)
  4. Serializes message with MarshalMessage() (payload + properties)
  5. Calls segment.AppendAsync() — data is buffered
  6. Buffer is flushed to storage when sync policy triggers (maxInterval=1000ms, maxEntries=2000, maxBytes=100MB)
  7. Segment metadata updated in etcd with revision-based optimistic locking
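
Steps 5–6 (buffering until the sync policy triggers) can be sketched as a size/count-triggered buffer. The types and thresholds below are illustrative stand-ins, not Woodpecker's implementation, and the time-based trigger (maxInterval) is omitted for brevity:

```go
package main

import "fmt"

// syncPolicy mirrors the count/size thresholds from segmentSyncPolicy.
type syncPolicy struct {
	maxEntries int
	maxBytes   int
}

// buffer accumulates appended entries until the policy says to flush.
type buffer struct {
	policy  syncPolicy
	entries [][]byte
	bytes   int
}

// append buffers one entry and reports whether a flush should be triggered.
func (b *buffer) append(entry []byte) bool {
	b.entries = append(b.entries, entry)
	b.bytes += len(entry)
	return len(b.entries) >= b.policy.maxEntries || b.bytes >= b.policy.maxBytes
}

// flush drains the buffer and returns the number of entries flushed.
func (b *buffer) flush() int {
	n := len(b.entries)
	b.entries, b.bytes = nil, 0
	return n
}

func main() {
	b := &buffer{policy: syncPolicy{maxEntries: 3, maxBytes: 1 << 20}}
	for i := 0; i < 7; i++ {
		if b.append([]byte("entry")) {
			fmt.Printf("flushing %d entries\n", b.flush())
		}
	}
	fmt.Printf("%d entries still buffered\n", len(b.entries))
}
```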

Read Path

  1. Application opens a LogReader with starting position and reader name (OpenLogReader(ctx, from, readerName))
  2. Reader loads entries in batches via ReadBatchAdv() (batch limit: 200 entries)
  3. Yields one entry per ReadNext() call from local cache
  4. On segment EOF, auto-advances to the next segment seamlessly
  5. Reader position updated in etcd every 30 seconds
  6. For active (still-writing) segments, polls with 200ms wait interval when no new data is available

Segment Lifecycle

Segments transition through the following states:

State Machine
Active → Completed → Sealed → Truncated → (Deleted)

Active     - Currently accepting writes
Completed  - All data flushed, no more writes
Sealed     - Metadata finalized, immutable
Truncated  - Marked for cleanup
Deleted    - Storage data removed
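
The linear lifecycle above can be modeled as a small transition table. The Go types here are illustrative, not Woodpecker's internal representation:

```go
package main

import "fmt"

// Segment states, in lifecycle order.
type segState int

const (
	active segState = iota
	completed
	sealed
	truncated
	deleted
)

// validNext encodes the linear lifecycle: each state has exactly one successor.
var validNext = map[segState]segState{
	active:    completed,
	completed: sealed,
	sealed:    truncated,
	truncated: deleted,
}

// advance checks a requested transition, rejecting skips and backward moves.
func advance(from, to segState) error {
	if next, ok := validNext[from]; !ok || next != to {
		return fmt.Errorf("illegal transition %d -> %d", from, to)
	}
	return nil
}

func main() {
	if err := advance(active, completed); err != nil {
		panic(err)
	}
	// Skipping a state is rejected:
	if err := advance(active, sealed); err != nil {
		fmt.Println("rejected:", err)
	}
}
```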

Quorum Protocol

Woodpecker's innovative quorum-based replication for service mode deployments.

Overview

In service mode, Woodpecker uses a configurable quorum protocol for data replication across LogStore nodes. This ensures data durability while allowing flexible tradeoffs between consistency, availability, and performance.

Quorum Parameters

The quorum is configured via the replicas setting, which determines the ensemble size (E). Write quorum (WQ) and ack quorum (AQ) are derived automatically:

Parameter      Symbol  Formula        E=3  E=5
Ensemble Size  E       = replicas     3    5
Write Quorum   WQ      = E            3    5
Ack Quorum     AQ      = (E / 2) + 1  2    3

Data is written to all E nodes (WQ = E), but only AQ acknowledgments are required before the write is considered durable. This allows writes to succeed even if some nodes are slow, while ensuring majority-based durability.
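
The derivation is simple integer arithmetic:

```go
package main

import "fmt"

// Quorum sizes derived from the ensemble size E, per the table above:
// WQ = E, and AQ = floor(E/2) + 1 (a strict majority).
func writeQuorum(e int) int { return e }
func ackQuorum(e int) int   { return e/2 + 1 }

func main() {
	for _, e := range []int{3, 5} {
		fmt.Printf("E=%d  WQ=%d  AQ=%d\n", e, writeQuorum(e), ackQuorum(e))
	}
	// Prints: E=3  WQ=3  AQ=2
	//         E=5  WQ=5  AQ=3
}
```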

Buffer Pools

Buffer pools are logical groupings of LogStore nodes. Each pool is defined by a name and seed addresses used for gossip-based node discovery. Pools enable regional grouping of nodes for locality-aware placement.

woodpecker.yaml
woodpecker:
  client:
    quorum:
      quorumBufferPools:
        - name: "region-1"
          seeds: ["logstore1:7946", "logstore2:7946"]
        - name: "region-2"
          seeds: ["logstore3:7946", "logstore4:7946"]
      quorumSelectStrategy:
        affinityMode: "soft"     # soft | hard
        replicas: 3               # ensemble size (3 or 5)
        strategy: "random"        # see strategies table below
        customPlacement:         # only for strategy: custom
          - name: "replica-1"
            region: "region-1"
            az: "az-1"
            resourceGroup: "rg-1"

Node Selection Strategies

Woodpecker supports 8 node selection strategies for quorum placement:

Strategy              Config Value         Description
Random                random               Random node selection from all available nodes (default)
Single AZ, Single RG  single-az-single-rg  All nodes in the same availability zone and resource group
Single AZ, Multi RG   single-az-multi-rg   Nodes in the same AZ but spread across different resource groups
Multi AZ, Single RG   multi-az-single-rg   Nodes across multiple AZs within the same resource group
Multi AZ, Multi RG    multi-az-multi-rg    Nodes spread across multiple AZs and resource groups (highest fault tolerance)
Cross Region          cross-region         Nodes across different regions (requires 2+ buffer pools)
Random Group          random-group         Pre-partitioned random groups for balanced distribution
Custom                custom               Explicit placement rules via customPlacement configuration

Affinity Modes

Mode  Behavior
soft  Prefer nodes matching the strategy constraints, but fall back to any available node if not enough matching nodes are found
hard  Strictly enforce strategy constraints; fail if not enough matching nodes are available

Cluster Membership

Service mode uses HashiCorp memberlist (gossip protocol) for cluster membership management:

  • Nodes discover each other via seed addresses configured in buffer pools
  • Health status is propagated through gossip protocol
  • Failed nodes are automatically detected and excluded from quorum selection
  • New nodes can join the cluster dynamically by connecting to any seed
  • Each node exposes a bind port (gossip) and a service port (gRPC)

Object Storage

Cloud-native object storage as the durable storage layer for Woodpecker's WAL data.

Overview

Object storage serves as Woodpecker's primary durable storage layer. All WAL data is ultimately persisted to S3-compatible object storage, providing virtually unlimited capacity, high durability (11 nines), and cross-region replication capabilities.

ObjectStorage Interface

Woodpecker abstracts storage operations through a unified ObjectStorage interface:

Go Interface
// Signatures are abbreviated in the docs; the fenceId and walkFn types
// shown here are illustrative.
type ObjectStorage interface {
    GetObject(ctx context.Context, bucket, key string, offset, size int64) (*Object, error)
    PutObject(ctx context.Context, bucket, key string, reader io.Reader, size int64) error
    PutObjectIfNoneMatch(ctx context.Context, bucket, key string, reader io.Reader, size int64) error
    PutFencedObject(ctx context.Context, bucket, key string, reader io.Reader, size int64, fenceId int64) error
    StatObject(ctx context.Context, bucket, key string) (*ObjectAttr, error)
    WalkWithObjects(ctx context.Context, bucket, prefix string, recursive bool, walkFn ObjectWalkFunc) error
    RemoveObject(ctx context.Context, bucket, key string) error
}

Supported Backends

Woodpecker supports multiple cloud providers via the minio.cloudProvider configuration:

Provider     Config Value  Condition Write            Notes
Amazon S3    aws           Supported (If-None-Match)  Default provider
Google GCS   gcp           Supported                  S3-compatible mode
Aliyun OSS   aliyun        Supported (wrapped)        Uses forbid-overwrite feature
Tencent COS  tencent       Supported (wrapped)        Uses forbid-overwrite feature
Azure Blob   azure         Supported                  Dedicated implementation
MinIO        aws           Supported                  Uses S3 API

Configuration

Object storage is configured via the storage and minio sections:

woodpecker.yaml
woodpecker:
  storage:
    type: default              # default(=minio), minio, local, service
    rootPath: /var/lib/woodpecker

minio:
  address: localhost
  port: 9000
  accessKeyID: minioadmin
  secretAccessKey: minioadmin
  useSSL: false
  bucketName: a-bucket
  createBucket: false
  rootPath: files
  useIAM: false
  cloudProvider: aws            # aws, gcp, aliyun, gcpnative
  region: ""
  useVirtualHost: false
  requestTimeoutMs: 10000
  listObjectsMaxKeys: 0        # 0 = unlimited

Condition Write Support

Condition write (PutObjectIfNoneMatch) is used as a fence mechanism to prevent split-brain in embedded mode. It ensures only one writer can create a given object key.

  • AWS/MinIO/GCS/Azure — Uses If-None-Match: * header for conditional PUT
  • Aliyun OSS / Tencent COS — Uses the forbid-overwrite feature (wrapped implementation)
  • Auto-detection — Woodpecker probes condition write support at startup (configurable via fencePolicy.conditionWrite)
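
The fence property can be demonstrated end to end against a toy in-memory store that honors If-None-Match: * semantics; the real backends implement the same behavior, but this sketch is not Woodpecker code:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// putIfNoneMatch issues a conditional PUT: it only succeeds if the key
// does not already exist — the fence property.
func putIfNoneMatch(url, body string) (int, error) {
	req, err := http.NewRequest(http.MethodPut, url, strings.NewReader(body))
	if err != nil {
		return 0, err
	}
	req.Header.Set("If-None-Match", "*")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.StatusCode, nil
}

// newFakeStore runs a toy object store honoring If-None-Match: *,
// standing in for S3/MinIO/GCS/Azure.
func newFakeStore() *httptest.Server {
	var mu sync.Mutex
	objects := map[string]bool{}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		if r.Header.Get("If-None-Match") == "*" && objects[r.URL.Path] {
			w.WriteHeader(http.StatusPreconditionFailed) // 412: another writer won
			return
		}
		objects[r.URL.Path] = true
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	store := newFakeStore()
	defer store.Close()
	key := store.URL + "/wal/segment-0/fence"
	first, _ := putIfNoneMatch(key, "writer-A")
	second, _ := putIfNoneMatch(key, "writer-B")
	fmt.Println(first, second) // the first writer wins; the second gets 412
}
```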

Performance Best Practices

Write Optimization

  • The default maxFlushSize is 16MB — increase for larger batch uploads to reduce API call overhead
  • Default maxFlushThreads is 8 — increase for higher parallelism on fast networks
  • Use WriteAsync with batching for maximum throughput
  • Compaction merges small blocks into 32MB merged blocks (configurable via segmentCompactionPolicy.maxBytes)

Read Optimization

  • Read batch size is 16MB by default (segmentReadPolicy.maxBatchSize)
  • Up to 32 concurrent fetch threads for parallel reads (segmentReadPolicy.maxFetchThreads)
  • Range reads via GetObject with offset/size parameters for partial fetches

Data Durability

  • CRC32 checksums for data integrity verification on every block
  • Automatic retry on transient failures with configurable backoff (up to maxFlushRetries = 3, interval = 2s)
  • Fragment-based storage for incremental durability — data is durable as soon as each block is flushed
  • Segment compaction merges small fragments into larger objects for efficient storage and reads

Local File System Storage

High-performance disk-based log storage using memory-mapped I/O.

Overview

The local file system storage implements disk-based log storage using mmap (memory mapping) for efficient read and write performance. It maintains the Fragment concept for consistency with the object storage implementation.

File Layout

FragmentFile Layout
[Header (4K)] + [Data Area →] + [...Free Space...] + [← Index Area] + [Footer]

Header (4K)       Magic string and version information
Data Area         Actual log entries, growing forward from 4K
Free Space        Unused space between data and index
Index Area        Entry offsets, growing backward from file end
Footer            Metadata information

Data Entry Format:
[Payload Size (4B)] + [CRC32 (4B)] + [Actual Data (variable)]
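
A round trip through the entry format can be sketched as follows. The byte order and the CRC polynomial (IEEE) are assumptions for illustration; the docs above do not specify them:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// encodeEntry frames a payload per the layout above: 4-byte size,
// 4-byte CRC32 of the payload, then the payload itself.
func encodeEntry(payload []byte) []byte {
	buf := make([]byte, 8+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(buf[4:8], crc32.ChecksumIEEE(payload))
	copy(buf[8:], payload)
	return buf
}

// decodeEntry validates the size and checksum and returns the payload.
func decodeEntry(buf []byte) ([]byte, error) {
	if len(buf) < 8 {
		return nil, fmt.Errorf("entry too short: %d bytes", len(buf))
	}
	size := binary.LittleEndian.Uint32(buf[0:4])
	sum := binary.LittleEndian.Uint32(buf[4:8])
	end := 8 + int(size)
	if end > len(buf) {
		return nil, fmt.Errorf("declared size %d exceeds buffer", size)
	}
	payload := buf[8:end]
	if crc32.ChecksumIEEE(payload) != sum {
		return nil, fmt.Errorf("CRC mismatch: entry is corrupt")
	}
	return payload, nil
}

func main() {
	entry := encodeEntry([]byte("hello wal"))
	payload, err := decodeEntry(entry)
	if err != nil {
		panic(err)
	}
	fmt.Printf("round-tripped %d bytes: %s\n", len(payload), payload)
}
```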

Directory Structure

File Tree
/basePath/
  ├── log_[logID]/
  │   ├── fragment_[startOffset1]
  │   ├── fragment_[startOffset2]
  │   └── ...
  └── log_[logID2]/
      └── ...

Key Design Decisions

Why mmap?

  • Zero-copy read/write operations
  • Random access support
  • OS-managed page caching
  • Memory-level read/write performance

Why Fragments?

Although the local file system supports append operations, maintaining fragments provides:

  • Architecture consistency with object storage backend
  • File size limitation support
  • Segmented management and cleanup
  • Simplified concurrency control
  • Better system recovery capability

Performance Characteristics

Aspect  Optimization
Write   Zero-copy via mmap, batch writes, async support, pre-allocated space
Read    Index-based positioning, random reads, OS page cache, range reads
Memory  On-demand loading via mmap, controlled fragment sizes, memory limits

Reliability

  • CRC32 checksum for data validation
  • Magic string verification in file header
  • Read-write locks for concurrent access control
  • Atomic operations for counter integrity
  • File locks to prevent concurrent writes

Staged Storage (Service Mode)

Two-tier hybrid storage combining local disk speed with object storage durability for service mode deployments.

Overview

Staged storage is the storage backend used in service mode (storage.type: service). It implements a two-tier hybrid architecture: active segments write to local disk for low-latency writes, then compaction uploads merged blocks to object storage for durable, long-term persistence.

Staged storage is automatically used when storage.type is set to service. No additional configuration is needed beyond configuring the MinIO/S3 backend for the object storage tier.

How It Works

Data Flow
Write Path:
  Client → LogStore Server → Local Disk (fast, low-latency)

Compaction:
  Local Disk → Merge Blocks → Object Storage (durable)

Read Path:
  Active segments  → Read from Local Disk
  Compacted data   → Read from Object Storage

Core Components

ComponentRole
StagedSegmentImplManages a segment's lifecycle across both tiers — coordinates writes to local disk and compaction uploads to object storage
StagedFileWriterHandles buffered writes to local disk files with block-based batching and CRC32 integrity checks
StagedFileReaderAdvReads entries from either local disk (active segments) or object storage (compacted segments) transparently

Configuration

Enable staged storage by setting storage.type: service:

woodpecker.yaml
woodpecker:
  storage:
    type: service              # enables staged storage
    rootPath: /var/lib/woodpecker # local staging directory

# Object storage tier (for compacted data)
minio:
  address: localhost
  port: 9000
  accessKeyID: minioadmin
  secretAccessKey: minioadmin
  bucketName: a-bucket
  rootPath: files

Benefits

  • Low write latency — Writes go to local disk first, avoiding object storage API overhead
  • High durability — Compaction ensures all data reaches object storage for long-term persistence
  • Efficient reads — Active data served from local disk; compacted data served from object storage with prefetching
  • Automatic lifecycle — Compaction runs automatically based on configured policies
  • Reduced API costs — Merging blocks before upload reduces the number of object storage API calls

Storage Backend Comparison

Aspect         Object Storage (MinIO/S3)  Local File System    Staged Storage
Config Value   default / minio            local                service
Mode           Embedded                   Embedded (dev/test)  Service
Write Latency  Higher (network)           Lowest (local I/O)   Low (local disk)
Durability     High (cloud-replicated)    Node-level only      High (compacted to cloud)
Capacity       Unlimited                  Disk-limited         Unlimited (after compaction)
Best For       Production embedded        Development/testing  Production service mode

Condition Write & Fence

Split-brain prevention using condition write as a fence mechanism.

Overview

In embedded mode, Woodpecker uses condition write as a fence mechanism to prevent split-brain scenarios where a stale primary node resumes writing after being partitioned or crashed. This ensures data consistency and prevents concurrent write conflicts.

Supported Storage Backends

Backend            Support  Mechanism
MinIO              Full     If-None-Match header
AWS S3             Full     Conditional PUT operations
Azure Blob         Full     Conditional writes
GCP Cloud Storage  Full     Conditional operations
Aliyun OSS         Wrapped  Forbid overwrite feature
Tencent COS        Wrapped  Forbid overwrite feature

Detection Modes

Auto Mode (Default)

Attempts to detect condition write support up to 30 times with exponential backoff. If any detection succeeds, it's enabled cluster-wide. If all fail, falls back to distributed locks.

Cluster consensus is achieved via CAS (Compare-And-Swap) in etcd. The first successful detection determines the cluster-wide setting.
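
The first-detection-wins behavior can be illustrated locally with an atomic compare-and-swap. In Woodpecker the CAS is a transaction against an etcd key; this in-process sketch only reproduces the winner-takes-all semantics:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Possible cluster-wide settings for condition write support.
const (
	unset    int32 = 0
	enabled  int32 = 1
	disabled int32 = 2
)

// clusterSetting stands in for the etcd key holding the cluster-wide result.
var clusterSetting int32 = unset

// publish attempts the CAS: only the first detector to finish wins,
// and every later attempt observes the winner's value.
func publish(result int32) bool {
	return atomic.CompareAndSwapInt32(&clusterSetting, unset, result)
}

func main() {
	var wg sync.WaitGroup
	var winners int32
	// Several nodes finish detection concurrently; exactly one CAS succeeds.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if publish(enabled) {
				atomic.AddInt32(&winners, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Printf("winners=%d setting=%d\n", winners, atomic.LoadInt32(&clusterSetting))
	// Prints: winners=1 setting=1
}
```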

Enable Mode

Strict requirement: detection must succeed within 10 retries or the system panics. Use when you're certain your backend supports condition write.

Disable Mode

Skips detection entirely and uses distributed locks. Use when you know your backend doesn't support condition write.

Distributed Lock Fallback

When condition write is unavailable, Woodpecker uses etcd-based distributed locks:

Property     Detail
Lock Type    etcd concurrency.Mutex with concurrency.Session
Session TTL  10 seconds (auto-renewed via keep-alive)
Lock Key     /woodpecker/service/lock/{logName}
Acquisition  Non-blocking TryLock

Failover Behavior

  1. Session Expiration — etcd session expires after TTL (10s) if keep-alive fails
  2. Lock Release — Distributed lock is automatically released
  3. New Primary — New node acquires the lock
  4. Writer Switch — Applications reopen writer (up to 10s wait)

Performance Comparison

Aspect            Condition Write              Distributed Lock
Lock overhead     None                         Per-writer acquisition
Failover latency  Immediate                    Up to 10 seconds
Concurrency       Multiple attempts, one wins  Single writer at a time

Configuration

woodpecker.yaml
woodpecker:
  logstore:
    fencePolicy:
      conditionWrite: "auto"   # "auto" | "enable" | "disable"

Monitoring & Metrics

Built-in Prometheus metrics and OpenTelemetry tracing for production observability.

Prometheus Metrics

Woodpecker exposes comprehensive Prometheus metrics for monitoring system health and performance. Each node exposes metrics on a dedicated port (default 9091–9094 for nodes 1–4).

Key Metrics

  • Client Append (woodpecker_client_append_latency, _requests_total, _bytes) — Write QPS, latency P50/P99, message size distribution
  • Client Read (woodpecker_client_read_requests_total, _reader_bytes_read) — Read QPS, throughput, reader operation latency
  • Server LogStore (woodpecker_server_logstore_active_logs, _active_segments) — Resource usage, segment lifecycle, operation success rate
  • Server File/Buffer (woodpecker_server_file_flush_latency, _compaction_latency) — Persistence latency, compaction performance, buffer queuing
  • Object Storage (woodpecker_server_object_storage_operations_total, _operation_latency) — S3 call frequency, latency P50/P99, bandwidth usage
  • System Resources (woodpecker_server_system_cpu_usage, _memory_usage_ratio) — Node load, memory pressure, I/O bottleneck detection
  • gRPC (grpc_server_handled_total, _handling_seconds) — RPC QPS, success rate, error code distribution

OpenTelemetry Tracing

Woodpecker integrates with OpenTelemetry for distributed tracing. The deployment includes Jaeger (port 16686) for trace visualization. gRPC interceptors automatically propagate trace context in service mode.

Setting Up Monitoring Stack

Use the provided Docker Compose overlay to add Prometheus and Grafana to the cluster:

bash
# Start cluster with Prometheus + Grafana monitoring
cd deployments
docker compose -f docker-compose.yaml \
  -f ../tests/docker/monitor/docker-compose.monitor.yaml \
  -p woodpecker-monitor up -d

Access the monitoring services:

Service        URL                     Credentials
Prometheus     http://localhost:9090   (none)
Grafana        http://localhost:3000   Anonymous admin (no login)
Jaeger         http://localhost:16686  (none)
MinIO Console  http://localhost:9001   minioadmin / minioadmin

Grafana Dashboards

Pre-built Grafana dashboard templates are available in tests/docker/monitor/grafana/templates/. These provide ready-to-use visualizations for write/read throughput, latency percentiles (P50/P95/P99), segment state distribution, storage backend performance, and error rates.

Run Monitoring Tests

Automated E2E metric verification tests are provided:

bash
# One-click: build, deploy, test, and cleanup
cd tests/docker/monitor
./run_monitor_tests.sh

# Keep cluster running after tests for manual inspection
./run_monitor_tests.sh --no-cleanup

The monitoring test suite also includes a rolling restart latency profile test that measures write/read latency across node restarts — useful as a baseline for evaluating rolling upgrade quality.