Woodpecker Docs

Welcome to Woodpecker

A cloud-native Write-Ahead Log storage engine that leverages object storage for low-cost, high-throughput, and reliable logging.

What is Woodpecker?

Woodpecker is a cloud-native WAL (Write-Ahead Log) storage implementation designed to fully utilize cloud-native infrastructure for scalability and durability. Unlike traditional on-premises WAL solutions or custom-built distributed logging systems, Woodpecker uses cloud object storage as its durable storage layer.

Woodpecker is currently under active development. It was initially built for the Milvus vector database and Zilliz Cloud, but is broadly applicable across cloud workloads.

Key Features

  • Cloud-Native WAL — Uses cloud object storage (S3, GCS, Azure, OSS) as the durable layer for scalability and cost-effectiveness
  • High-Throughput Writes — Optimized write strategies achieving 60-80% of maximum backend throughput
  • Efficient Log Reads — Memory management and prefetching strategies for optimized sequential access
  • Ordered Durability — Strict sequential ordering guarantees for log persistence
  • Flexible Deployment — Standalone service mode or embedded library mode
  • Multi-Cloud Support — AWS, GCP, Azure, Aliyun, Tencent Cloud, MinIO
  • Innovative Quorum Protocol — Configurable ensemble/write/ack quorum with AZ-aware placement
  • Condition Write Fence — Prevents split-brain with built-in fence mechanism

Core Components

Component         Description
Client            Read/write protocol layer for interacting with Woodpecker
EmbeddedClient    Client with built-in LogStore for embedded deployments
LogStore          Handles high-speed log writes, batching, and cloud storage uploads
ObjectStorage     Cloud object storage backend (S3, GCS, OSS, etc.)
MetadataProvider  etcd-backed metadata and coordination service
QuorumDiscovery   Quorum-based node selection and replication management

Design Benefits

  • Lightweight deployment — Easy integration with minimal dependencies
  • Decoupled compute & storage — Reduced operational complexity
  • Auto-scaling storage — Eliminates capacity planning overhead
  • Reduced local disk dependency — Ideal for cloud-native workloads
  • One-write, multiple-reads — Data-embedded metadata design enables concurrent reads without writer synchronization

Quick Start

Get Woodpecker up and running in minutes. Choose between embedded mode (library) or service mode (dedicated cluster).

Install Client Library

Add Woodpecker as a Go dependency to your project:

bash
go get github.com/zilliztech/woodpecker@latest

Choose Your Mode

Woodpecker supports two deployment modes with different clients. Pick the one that fits your architecture.

Embedded mode runs the LogStore in-process. No external LogStore service needed — just etcd and object storage. Best for single-instance deployments and applications needing local WAL integration.

Prerequisites

Only an etcd instance is required. Object storage (MinIO/S3) is configured via woodpecker.yaml.

bash
# Start etcd (for metadata coordination)
docker run -d --name etcd -p 2379:2379 \
  quay.io/coreos/etcd:v3.5.0 /usr/local/bin/etcd \
  --advertise-client-urls http://0.0.0.0:2379 \
  --listen-client-urls http://0.0.0.0:2379

# Start MinIO (for object storage)
docker run -d --name minio -p 9000:9000 -p 9001:9001 \
  minio/minio server /data --console-address ":9001"

Create Client

Use NewEmbedClientFromConfig — it auto-creates etcd and storage connections, and starts an in-process LogStore.

main.go
cfg, _ := config.NewConfiguration()
client, _ := woodpecker.NewEmbedClientFromConfig(ctx, cfg)
defer client.Close(ctx)

Write Data

main.go
// Create and open a log
client.CreateLog(ctx, "my-wal")
logHandle, _ := client.OpenLog(ctx, "my-wal")

// Open a writer (acquires distributed lock)
writer, _ := logHandle.OpenLogWriter(ctx)

// Synchronous write
result := writer.Write(ctx,
    &log.WriteMessage{
        Payload:    []byte("hello world"),
        Properties: map[string]string{"key": "value"},
    },
)
// result.LogMessageId = {SegmentId, EntryId}

// Asynchronous write (higher throughput)
ch := writer.WriteAsync(ctx,
    &log.WriteMessage{
        Payload: []byte("async hello"),
    },
)
r := <-ch // wait for result

Read Data

main.go
// Open a reader from the earliest position
start := log.EarliestLogMessageID()
reader, _ := logHandle.OpenLogReader(ctx, &start, "my-reader")
defer reader.Close(ctx)

// Iterate through all messages
for {
    msg, err := reader.ReadNext(ctx)
    if err != nil { break }
    fmt.Printf("ID: {%d,%d} Payload: %s\n",
        msg.Id.SegmentId, msg.Id.EntryId,
        string(msg.Payload))
}

Service mode uses a dedicated LogStore cluster as a caching layer for higher throughput and lower latency. It requires running LogStore server(s) with gossip-based cluster membership, and uses staged storage (a local disk + object storage hybrid).

Deploy LogStore Cluster

Service mode requires a running LogStore cluster. Choose one of the following approaches:

Option A: Using Docker Compose (Recommended)

Use the provided deployments/docker-compose.yaml template to pull the latest image and start a complete cluster:

bash
# Clone the repository for deployment templates
git clone https://github.com/zilliztech/woodpecker.git
cd woodpecker/deployments

# Start the complete cluster (etcd + MinIO + Jaeger + 3 nodes)
docker compose up -d

# Check cluster status
./deploy.sh status

Option B: Build from Source

Build Woodpecker binary and Docker image locally, then deploy:

bash
# 1. Build binary and Docker image
git clone https://github.com/zilliztech/woodpecker.git
cd woodpecker/deployments
./deploy.sh build

# 2. Start the cluster using locally built image
./deploy.sh up

# 3. Check cluster status
./deploy.sh status

Both options deploy 3 Woodpecker nodes (gRPC ports 18080–18082) with gossip-based discovery (ports 17946–17948), plus etcd, MinIO, and Jaeger tracing. See deployments/README.md for full configuration details.

Create Client

Use NewClient with an external etcd connection — the client connects to the LogStore cluster via gRPC.

main.go
cfg, _ := config.NewConfiguration()

// Connect to etcd
etcdCli, _ := clientv3.New(clientv3.Config{
    Endpoints: []string{"localhost:2379"},
})

// Create service-mode client
client, _ := woodpecker.NewClient(ctx, cfg, etcdCli, false)
defer client.Close(ctx)

Write & Read Data

Once the client is created, the write and read API is identical to embedded mode:

main.go
// Same API as embedded mode after client creation
client.CreateLog(ctx, "my-wal")
logHandle, _ := client.OpenLog(ctx, "my-wal")

// Write
writer, _ := logHandle.OpenLogWriter(ctx)
result := writer.Write(ctx, &log.WriteMessage{
    Payload: []byte("hello from service mode"),
})

// Read
start := log.EarliestLogMessageID()
reader, _ := logHandle.OpenLogReader(ctx, &start, "reader-1")
msg, _ := reader.ReadNext(ctx)
fmt.Println(string(msg.Payload))

Mode Comparison

Aspect        Embedded Mode                            Service Mode
Client        NewEmbedClientFromConfig()               NewClient(ctx, cfg, etcdCli, managed)
LogStore      In-process (automatic)                   Dedicated cluster (gRPC)
Storage       Direct to object storage                 Staged: local disk + object storage
Dependencies  etcd + object storage                    etcd + object storage + LogStore servers
Latency       No network overhead                      Lower via caching layer
Throughput    Good                                     Higher (prefetching + batching)
Best for      Single instance, testing, embedded apps  Multi-instance, high-throughput production

For high-throughput use cases, prefer WriteAsync with batch writes. This allows Woodpecker to optimize batching and flush operations.
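
The batch pattern can be sketched with plain channels. Note that writeAsync below is a stand-in for writer.WriteAsync, not the actual Woodpecker API — it only mimics the "return a channel now, deliver the result later" shape:

```go
package main

import "fmt"

// result mimics the shape of an asynchronous write acknowledgment.
type result struct {
	seq int
	err error
}

// writeAsync is a stand-in for writer.WriteAsync: it returns immediately
// with a channel that later delivers the write result.
func writeAsync(seq int, payload []byte) <-chan result {
	ch := make(chan result, 1)
	go func() {
		// A real writer would buffer the payload and flush in batches;
		// here we simply acknowledge.
		_ = payload
		ch <- result{seq: seq}
	}()
	return ch
}

func main() {
	// Issue a batch of writes without waiting on each one...
	pending := make([]<-chan result, 0, 10)
	for i := 0; i < 10; i++ {
		pending = append(pending, writeAsync(i, []byte(fmt.Sprintf("msg-%d", i))))
	}
	// ...then collect all acknowledgments at the end of the batch.
	for _, ch := range pending {
		if r := <-ch; r.err != nil {
			panic(r.err)
		}
	}
	fmt.Println("batch of 10 acknowledged")
}
```

Because acknowledgments are collected only after the whole batch is enqueued, the writer is free to coalesce many messages into a single flush.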

Configuration

Comprehensive guide to configuring Woodpecker with all available options and their defaults.

Full Configuration Example

Woodpecker uses YAML configuration. Create a woodpecker.yaml file:

woodpecker.yaml
woodpecker:
  meta:
    type: etcd
    prefix: woodpecker
  client:
    segmentAppend:
      queueSize: 100            # max queued append requests
      maxRetries: 2             # max retries per append
    segmentRollingPolicy:
      maxSize: 100000000        # 100MB per segment
      maxInterval: 800          # seconds between rolls
      maxBlocks: 1000           # max blocks per segment
    auditor:
      maxInterval: 5            # seconds between audits
    sessionMonitor:
      checkInterval: 3          # seconds
      maxFailures: 5            # consecutive fails before invalid
    # quorum: (service mode only, see Quorum section)
  logstore:
    segmentSyncPolicy:
      maxInterval: 1000         # ms between syncs
      maxIntervalForLocalStorage: 5  # ms for local backend
      maxEntries: 2000          # max entries per buffer
      maxBytes: 100000000       # 100MB buffer limit
      maxFlushRetries: 3
      retryInterval: 2000       # ms between retries
      maxFlushSize: 16000000    # 16MB per block flush
      maxFlushThreads: 8
    segmentCompactionPolicy:
      maxBytes: 32000000       # 32MB merged block size
      maxParallelUploads: 4
      maxParallelReads: 8
    segmentReadPolicy:
      maxBatchSize: 16000000   # 16MB read batch
      maxFetchThreads: 32
    retentionPolicy:
      ttl: 259200             # 72 hours for truncated segments
    fencePolicy:
      conditionWrite: "auto"   # auto | enable | disable
    grpc:
      serverMaxSendSize: 536870912   # 512MB
      serverMaxRecvSize: 268435456   # 256MB
      clientMaxSendSize: 268435456   # 256MB
      clientMaxRecvSize: 536870912   # 512MB
    processorCleanupPolicy:
      cleanupInterval: 60     # seconds
      maxIdleTime: 300        # seconds
      shutdownTimeout: 15     # seconds
  storage:
    type: default             # default(=minio), minio, local, service
    rootPath: /var/lib/woodpecker

Client Configuration

Parameter                         Description                                       Default
segmentAppend.queueSize           Maximum queued segment append requests            100
segmentAppend.maxRetries          Maximum retries for append operations             2
segmentRollingPolicy.maxSize      Maximum segment size in bytes                     100MB
segmentRollingPolicy.maxInterval  Maximum interval between segment rolls (seconds)  800
segmentRollingPolicy.maxBlocks    Maximum blocks per segment                        1000
auditor.maxInterval               Audit interval in seconds                         5
sessionMonitor.checkInterval      Session health check interval (seconds)           3
sessionMonitor.maxFailures        Consecutive failures before session invalid       5

LogStore Configuration

Parameter                                     Description                             Default
segmentSyncPolicy.maxInterval                 Max sync interval (milliseconds)        1000
segmentSyncPolicy.maxIntervalForLocalStorage  Sync interval for local storage (ms)    5
segmentSyncPolicy.maxEntries                  Max entries in write buffer             2000
segmentSyncPolicy.maxBytes                    Max write buffer size                   100MB
segmentSyncPolicy.maxFlushRetries             Max sync retries                        3
segmentSyncPolicy.retryInterval               Retry interval (ms)                     2000
segmentSyncPolicy.maxFlushSize                Max block flush size                    16MB
segmentSyncPolicy.maxFlushThreads             Concurrent flush threads                8
segmentCompactionPolicy.maxBytes              Max merged block size after compaction  32MB
segmentCompactionPolicy.maxParallelUploads    Parallel compaction upload threads      4
segmentCompactionPolicy.maxParallelReads      Parallel compaction read threads        8
segmentReadPolicy.maxBatchSize                Max read batch size                     16MB
segmentReadPolicy.maxFetchThreads             Concurrent fetch threads                32
retentionPolicy.ttl                           TTL for truncated segments (seconds)    259200 (72h)
processorCleanupPolicy.cleanupInterval        Idle processor scan interval (seconds)  60
processorCleanupPolicy.maxIdleTime            Idle time before cleanup (seconds)      300

Storage Configuration

Parameter         Description                                               Default
storage.type      Storage backend: default (=minio), minio, local, service  default
storage.rootPath  Root path for log data files                              /var/lib/woodpecker

etcd Configuration

woodpecker.yaml
etcd:
  endpoints: [localhost:2379]
  rootPath: by-dev
  metaSubPath: meta
  kvSubPath: kv
  requestTimeout: 10000        # ms
  use:
    embed: false              # enable embedded etcd
  data:
    dir: default.etcd         # embedded etcd data dir
  ssl:
    enabled: false
    tlsCert: /path/to/cert.pem
    tlsKey: /path/to/key.pem
    tlsCACert: /path/to/ca.pem
  auth:
    enabled: false
    userName: ""
    password: ""

MinIO / S3 Configuration

woodpecker.yaml
minio:
  address: localhost
  port: 9000
  accessKeyID: minioadmin
  secretAccessKey: minioadmin
  useSSL: false
  bucketName: a-bucket
  createBucket: false
  rootPath: files
  useIAM: false
  cloudProvider: aws           # aws, gcp, aliyun, gcpnative
  region: ""
  useVirtualHost: false
  requestTimeoutMs: 10000
  listObjectsMaxKeys: 0       # 0 = unlimited

Parameter               Description                                  Default
minio.address           MinIO/S3 server address                      localhost
minio.port              MinIO/S3 server port                         9000
minio.accessKeyID       Access key for authentication                minioadmin
minio.secretAccessKey   Secret key for authentication                minioadmin
minio.bucketName        Storage bucket name                          a-bucket
minio.createBucket      Auto-create bucket if not exists             false
minio.useIAM            Use IAM role authentication                  false
minio.cloudProvider     Cloud provider: aws, gcp, aliyun, gcpnative  aws
minio.region            Storage region                               (empty)
minio.useVirtualHost    Virtual host style addressing                false
minio.requestTimeoutMs  Request timeout in milliseconds              10000

Quorum Configuration (Service Mode)

Required when using storage.type: service with a LogStore cluster.

woodpecker.yaml
woodpecker:
  client:
    quorum:
      quorumBufferPools:
        - name: "region-1"
          seeds: ["logstore1:7946", "logstore2:7946"]
      quorumSelectStrategy:
        affinityMode: "soft"   # soft | hard
        replicas: 3             # 3 or 5
        strategy: "random"      # see table below
        customPlacement:       # only for strategy: custom
          - name: "replica-1"
            region: "region-1"
            az: "az-1"
            resourceGroup: "rg-1"

Strategy Value       Description
random               Random node selection (default)
single-az-single-rg  All nodes in same AZ and resource group
single-az-multi-rg   Same AZ, different resource groups
multi-az-single-rg   Multiple AZs, same resource group
multi-az-multi-rg    Multiple AZs and resource groups
cross-region         Nodes across different regions (requires 2+ buffer pools)
random-group         Pre-partitioned random groups
custom               Explicit placement rules via customPlacement

Fence Policy

Value    Behavior
auto     Auto-detect condition write support; fall back to distributed locks (default)
enable   Require condition write; panic if unavailable
disable  Force distributed locks; skip detection

gRPC Configuration

Parameter               Description                      Default
grpc.serverMaxSendSize  Server max send message size     512MB
grpc.serverMaxRecvSize  Server max receive message size  256MB
grpc.clientMaxSendSize  Client max send message size     256MB
grpc.clientMaxRecvSize  Client max receive message size  512MB

System Architecture

Understanding Woodpecker's two deployment modes and core design.

Embedded Mode

In embedded mode, Woodpecker operates as a lightweight library integrated directly into your application. It requires only etcd for metadata storage and coordination, keeping dependencies minimal.

EmbeddedClient
  ├──► etcd (metadata)
  └──► Cloud Object Storage (MinIO / S3 / Aliyun / GCS / Azure)

Key characteristics of embedded mode:

  • Zero additional services required beyond etcd
  • Client directly writes to object storage
  • Condition-write fence mechanism prevents split-brain
  • Best suited for applications needing simple WAL integration

Service Mode

In service mode, WAL read/write operations and caching logic are decoupled into a dedicated LogStore cluster service. This acts as a high-performance caching layer between clients and object storage.

Client
  ├──► etcd (metadata)
  └──► LogStore cluster (3 nodes) ──► Cloud Object Storage (MinIO / S3 / Aliyun / GCS / Azure)

Key characteristics of service mode:

  • Dedicated LogStore cluster with gRPC communication
  • Data prefetching and read/write caching for lower latency
  • Gossip-based cluster membership and auto-discovery
  • Quorum-based replication with AZ-aware node placement
  • Best suited for high-throughput, latency-sensitive workloads

Core Components

Component         Role
Client            Read/write protocol layer connecting applications to Woodpecker
EmbeddedClient    Combined client + LogStore for embedded deployments
LogStore          High-speed log writes, batching, caching, and cloud storage uploads
ObjectStorage     Durable storage backend using S3-compatible APIs
MetadataProvider  etcd-backed metadata storage, coordination, and distributed locking
QuorumDiscovery   Service-mode quorum-based node selection and replication management via gossip

Data Flow

Write Path

  1. Application calls Write() or WriteAsync() on a LogWriter
  2. Writer validates the session lock is still valid (distributed lock via etcd)
  3. Gets or creates a writable segment via rolling policy (maxSize=100MB, maxInterval=800s, maxBlocks=1000)
  4. Serializes message with MarshalMessage() (payload + properties)
  5. Calls segment.AppendAsync() — data is buffered
  6. Buffer is flushed to storage when sync policy triggers (maxInterval=1000ms, maxEntries=2000, maxBytes=100MB)
  7. Segment metadata updated in etcd with revision-based optimistic locking
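
Steps 5–6 (buffering until the sync policy triggers) can be sketched as a size/count-triggered buffer. The types and thresholds below are illustrative stand-ins, not Woodpecker's implementation, and the time-based trigger (maxInterval) is omitted for brevity:

```go
package main

import "fmt"

// syncPolicy mirrors the count/size thresholds from segmentSyncPolicy.
type syncPolicy struct {
	maxEntries int
	maxBytes   int
}

// buffer accumulates appended entries until the policy says to flush.
type buffer struct {
	policy  syncPolicy
	entries [][]byte
	bytes   int
}

// append buffers one entry and reports whether a flush should be triggered.
func (b *buffer) append(entry []byte) bool {
	b.entries = append(b.entries, entry)
	b.bytes += len(entry)
	return len(b.entries) >= b.policy.maxEntries || b.bytes >= b.policy.maxBytes
}

// flush drains the buffer and returns the number of entries flushed.
func (b *buffer) flush() int {
	n := len(b.entries)
	b.entries, b.bytes = nil, 0
	return n
}

func main() {
	b := &buffer{policy: syncPolicy{maxEntries: 3, maxBytes: 1 << 20}}
	for i := 0; i < 7; i++ {
		if b.append([]byte("entry")) {
			fmt.Printf("flushing %d entries\n", b.flush())
		}
	}
	fmt.Printf("%d entries still buffered\n", len(b.entries))
}
```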

Read Path

  1. Application opens a LogReader with starting position and reader name (OpenLogReader(ctx, from, readerName))
  2. Reader loads entries in batches via ReadBatchAdv() (batch limit: 200 entries)
  3. Yields one entry per ReadNext() call from local cache
  4. On segment EOF, auto-advances to the next segment seamlessly
  5. Reader position updated in etcd every 30 seconds
  6. For active (still-writing) segments, polls with 200ms wait interval when no new data is available

Segment Lifecycle

Segments transition through the following states:

State Machine
Active → Completed → Sealed → Truncated → (Deleted)

Active     - Currently accepting writes
Completed  - All data flushed, no more writes
Sealed     - Metadata finalized, immutable
Truncated  - Marked for cleanup
Deleted    - Storage data removed
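
The linear lifecycle above can be modeled as a small transition table. The Go types here are illustrative, not Woodpecker's internal representation:

```go
package main

import "fmt"

// Segment states, in lifecycle order.
type segState int

const (
	active segState = iota
	completed
	sealed
	truncated
	deleted
)

// validNext encodes the linear lifecycle: each state has exactly one successor.
var validNext = map[segState]segState{
	active:    completed,
	completed: sealed,
	sealed:    truncated,
	truncated: deleted,
}

// advance checks a requested transition, rejecting skips and backward moves.
func advance(from, to segState) error {
	if next, ok := validNext[from]; !ok || next != to {
		return fmt.Errorf("illegal transition %d -> %d", from, to)
	}
	return nil
}

func main() {
	if err := advance(active, completed); err != nil {
		panic(err)
	}
	// Skipping a state is rejected:
	if err := advance(active, sealed); err != nil {
		fmt.Println("rejected:", err)
	}
}
```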

Quorum Protocol

Woodpecker's innovative quorum-based replication for service mode deployments.

Overview

In service mode, Woodpecker uses a configurable quorum protocol for data replication across LogStore nodes. This ensures data durability while allowing flexible tradeoffs between consistency, availability, and performance.

Quorum Parameters

The quorum is configured via the replicas setting, which determines the ensemble size (E). Write quorum (WQ) and ack quorum (AQ) are derived automatically:

Parameter      Symbol  Formula        E=3  E=5
Ensemble Size  E       = replicas     3    5
Write Quorum   WQ      = E            3    5
Ack Quorum     AQ      = (E / 2) + 1  2    3

Data is written to all E nodes (WQ = E), but only AQ acknowledgments are required before the write is considered durable. This allows writes to succeed even if some nodes are slow, while ensuring majority-based durability.
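
The derivation is simple integer arithmetic:

```go
package main

import "fmt"

// Quorum sizes derived from the ensemble size E, per the table above:
// WQ = E, and AQ = floor(E/2) + 1 (a strict majority).
func writeQuorum(e int) int { return e }
func ackQuorum(e int) int   { return e/2 + 1 }

func main() {
	for _, e := range []int{3, 5} {
		fmt.Printf("E=%d  WQ=%d  AQ=%d\n", e, writeQuorum(e), ackQuorum(e))
	}
	// Prints: E=3  WQ=3  AQ=2
	//         E=5  WQ=5  AQ=3
}
```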

Buffer Pools

Buffer pools are logical groupings of LogStore nodes. Each pool is defined by a name and seed addresses used for gossip-based node discovery. Pools enable regional grouping of nodes for locality-aware placement.

woodpecker.yaml
woodpecker:
  client:
    quorum:
      quorumBufferPools:
        - name: "region-1"
          seeds: ["logstore1:7946", "logstore2:7946"]
        - name: "region-2"
          seeds: ["logstore3:7946", "logstore4:7946"]
      quorumSelectStrategy:
        affinityMode: "soft"     # soft | hard
        replicas: 3               # ensemble size (3 or 5)
        strategy: "random"        # see strategies table below
        customPlacement:         # only for strategy: custom
          - name: "replica-1"
            region: "region-1"
            az: "az-1"
            resourceGroup: "rg-1"

Node Selection Strategies

Woodpecker supports 8 node selection strategies for quorum placement:

Strategy              Config Value         Description
Random                random               Random node selection from all available nodes (default)
Single AZ, Single RG  single-az-single-rg  All nodes in the same availability zone and resource group
Single AZ, Multi RG   single-az-multi-rg   Nodes in the same AZ but spread across different resource groups
Multi AZ, Single RG   multi-az-single-rg   Nodes across multiple AZs within the same resource group
Multi AZ, Multi RG    multi-az-multi-rg    Nodes spread across multiple AZs and resource groups (highest fault tolerance)
Cross Region          cross-region         Nodes across different regions (requires 2+ buffer pools)
Random Group          random-group         Pre-partitioned random groups for balanced distribution
Custom                custom               Explicit placement rules via customPlacement configuration

Affinity Modes

Mode  Behavior
soft  Prefer nodes matching the strategy constraints, but fall back to any available node if not enough matching nodes are found
hard  Strictly enforce strategy constraints; fail if not enough matching nodes are available

Cluster Membership

Service mode uses HashiCorp memberlist (gossip protocol) for cluster membership management:

  • Nodes discover each other via seed addresses configured in buffer pools
  • Health status is propagated through gossip protocol
  • Failed nodes are automatically detected and excluded from quorum selection
  • New nodes can join the cluster dynamically by connecting to any seed
  • Each node exposes a bind port (gossip) and a service port (gRPC)

Object Storage

Cloud-native object storage as the durable storage layer for Woodpecker's WAL data.

Overview

Object storage serves as Woodpecker's primary durable storage layer. All WAL data is ultimately persisted to S3-compatible object storage, providing virtually unlimited capacity, high durability (11 nines), and cross-region replication capabilities.

ObjectStorage Interface

Woodpecker abstracts storage operations through a unified ObjectStorage interface:

Go Interface
// Signatures are abbreviated in the docs; the fenceId and walkFn types
// shown here are illustrative.
type ObjectStorage interface {
    GetObject(ctx context.Context, bucket, key string, offset, size int64) (*Object, error)
    PutObject(ctx context.Context, bucket, key string, reader io.Reader, size int64) error
    PutObjectIfNoneMatch(ctx context.Context, bucket, key string, reader io.Reader, size int64) error
    PutFencedObject(ctx context.Context, bucket, key string, reader io.Reader, size int64, fenceId int64) error
    StatObject(ctx context.Context, bucket, key string) (*ObjectAttr, error)
    WalkWithObjects(ctx context.Context, bucket, prefix string, recursive bool, walkFn ObjectWalkFunc) error
    RemoveObject(ctx context.Context, bucket, key string) error
}

Supported Backends

Woodpecker supports multiple cloud providers via the minio.cloudProvider configuration:

Provider     Config Value  Condition Write            Notes
Amazon S3    aws           Supported (If-None-Match)  Default provider
Google GCS   gcp           Supported                  S3-compatible mode
Aliyun OSS   aliyun        Supported (wrapped)        Uses forbid-overwrite feature
Tencent COS  tencent       Supported (wrapped)        Uses forbid-overwrite feature
Azure Blob   azure         Supported                  Dedicated implementation
MinIO        aws           Supported                  Uses S3 API

Configuration

Object storage is configured via the storage and minio sections:

woodpecker.yaml
woodpecker:
  storage:
    type: default              # default(=minio), minio, local, service
    rootPath: /var/lib/woodpecker

minio:
  address: localhost
  port: 9000
  accessKeyID: minioadmin
  secretAccessKey: minioadmin
  useSSL: false
  bucketName: a-bucket
  createBucket: false
  rootPath: files
  useIAM: false
  cloudProvider: aws            # aws, gcp, aliyun, gcpnative
  region: ""
  useVirtualHost: false
  requestTimeoutMs: 10000
  listObjectsMaxKeys: 0        # 0 = unlimited

Condition Write Support

Condition write (PutObjectIfNoneMatch) is used as a fence mechanism to prevent split-brain in embedded mode. It ensures only one writer can create a given object key.

  • AWS/MinIO/GCS/Azure — Uses If-None-Match: * header for conditional PUT
  • Aliyun OSS / Tencent COS — Uses the forbid-overwrite feature (wrapped implementation)
  • Auto-detection — Woodpecker probes condition write support at startup (configurable via fencePolicy.conditionWrite)
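
The fence property can be demonstrated end to end against a toy in-memory store that honors If-None-Match: * semantics; the real backends implement the same behavior, but this sketch is not Woodpecker code:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// putIfNoneMatch issues a conditional PUT: it only succeeds if the key
// does not already exist — the fence property.
func putIfNoneMatch(url, body string) (int, error) {
	req, err := http.NewRequest(http.MethodPut, url, strings.NewReader(body))
	if err != nil {
		return 0, err
	}
	req.Header.Set("If-None-Match", "*")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.StatusCode, nil
}

// newFakeStore runs a toy object store honoring If-None-Match: *,
// standing in for S3/MinIO/GCS/Azure.
func newFakeStore() *httptest.Server {
	var mu sync.Mutex
	objects := map[string]bool{}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		if r.Header.Get("If-None-Match") == "*" && objects[r.URL.Path] {
			w.WriteHeader(http.StatusPreconditionFailed) // 412: another writer won
			return
		}
		objects[r.URL.Path] = true
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	store := newFakeStore()
	defer store.Close()
	key := store.URL + "/wal/segment-0/fence"
	first, _ := putIfNoneMatch(key, "writer-A")
	second, _ := putIfNoneMatch(key, "writer-B")
	fmt.Println(first, second) // the first writer wins; the second gets 412
}
```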

Performance Best Practices

Write Optimization

  • The default maxFlushSize is 16MB — increase for larger batch uploads to reduce API call overhead
  • Default maxFlushThreads is 8 — increase for higher parallelism on fast networks
  • Use WriteAsync with batching for maximum throughput
  • Compaction merges small blocks into 32MB merged blocks (configurable via segmentCompactionPolicy.maxBytes)

Read Optimization

  • Read batch size is 16MB by default (segmentReadPolicy.maxBatchSize)
  • Up to 32 concurrent fetch threads for parallel reads (segmentReadPolicy.maxFetchThreads)
  • Range reads via GetObject with offset/size parameters for partial fetches

Data Durability

  • CRC32 checksums for data integrity verification on every block
  • Automatic retry on transient failures with configurable backoff (up to maxFlushRetries = 3, interval = 2s)
  • Fragment-based storage for incremental durability — data is durable as soon as each block is flushed
  • Segment compaction merges small fragments into larger objects for efficient storage and reads

Local File System Storage

High-performance disk-based log storage using memory-mapped I/O.

Overview

The local file system storage implements disk-based log storage using mmap (memory mapping) for efficient read and write performance. It maintains the Fragment concept for consistency with the object storage implementation.

File Layout

FragmentFile Layout
[Header (4K)] + [Data Area →] + [...Free Space...] + [← Index Area] + [Footer]

Header (4K)       Magic string and version information
Data Area         Actual log entries, growing forward from 4K
Free Space        Unused space between data and index
Index Area        Entry offsets, growing backward from file end
Footer            Metadata information

Data Entry Format:
[Payload Size (4B)] + [CRC32 (4B)] + [Actual Data (variable)]
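
A round trip through the entry format can be sketched as follows. The byte order and the CRC polynomial (IEEE) are assumptions for illustration; the docs above do not specify them:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// encodeEntry frames a payload per the layout above: 4-byte size,
// 4-byte CRC32 of the payload, then the payload itself.
func encodeEntry(payload []byte) []byte {
	buf := make([]byte, 8+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(buf[4:8], crc32.ChecksumIEEE(payload))
	copy(buf[8:], payload)
	return buf
}

// decodeEntry validates the size and checksum and returns the payload.
func decodeEntry(buf []byte) ([]byte, error) {
	if len(buf) < 8 {
		return nil, fmt.Errorf("entry too short: %d bytes", len(buf))
	}
	size := binary.LittleEndian.Uint32(buf[0:4])
	sum := binary.LittleEndian.Uint32(buf[4:8])
	end := 8 + int(size)
	if end > len(buf) {
		return nil, fmt.Errorf("declared size %d exceeds buffer", size)
	}
	payload := buf[8:end]
	if crc32.ChecksumIEEE(payload) != sum {
		return nil, fmt.Errorf("CRC mismatch: entry is corrupt")
	}
	return payload, nil
}

func main() {
	entry := encodeEntry([]byte("hello wal"))
	payload, err := decodeEntry(entry)
	if err != nil {
		panic(err)
	}
	fmt.Printf("round-tripped %d bytes: %s\n", len(payload), payload)
}
```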

Directory Structure

File Tree
/basePath/
  ├── log_[logID]/
  │   ├── fragment_[startOffset1]
  │   ├── fragment_[startOffset2]
  │   └── ...
  └── log_[logID2]/
      └── ...

Key Design Decisions

Why mmap?

  • Zero-copy read/write operations
  • Random access support
  • OS-managed page caching
  • Memory-level read/write performance

Why Fragments?

Although the local file system supports append operations, maintaining fragments provides:

  • Architecture consistency with object storage backend
  • File size limitation support
  • Segmented management and cleanup
  • Simplified concurrency control
  • Better system recovery capability

Performance Characteristics

Aspect  Optimization
Write   Zero-copy via mmap, batch writes, async support, pre-allocated space
Read    Index-based positioning, random reads, OS page cache, range reads
Memory  On-demand loading via mmap, controlled fragment sizes, memory limits

Reliability

  • CRC32 checksum for data validation
  • Magic string verification in file header
  • Read-write locks for concurrent access control
  • Atomic operations for counter integrity
  • File locks to prevent concurrent writes

Staged Storage (Service Mode)

Two-tier hybrid storage combining local disk speed with object storage durability for service mode deployments.

Overview

Staged storage is the storage backend used in service mode (storage.type: service). It implements a two-tier hybrid architecture: active segments write to local disk for low-latency writes, then compaction uploads merged blocks to object storage for durable, long-term persistence.

Staged storage is automatically used when storage.type is set to service. No additional configuration is needed beyond configuring the MinIO/S3 backend for the object storage tier.

How It Works

Data Flow
Write Path:
  Client → LogStore Server → Local Disk (fast, low-latency)

Compaction:
  Local Disk → Merge Blocks → Object Storage (durable)

Read Path:
  Active segments  → Read from Local Disk
  Compacted data   → Read from Object Storage

Core Components

ComponentRole
StagedSegmentImplManages a segment's lifecycle across both tiers — coordinates writes to local disk and compaction uploads to object storage
StagedFileWriterHandles buffered writes to local disk files with block-based batching and CRC32 integrity checks
StagedFileReaderAdvReads entries from either local disk (active segments) or object storage (compacted segments) transparently

Configuration

Enable staged storage by setting storage.type: service:

woodpecker.yaml
woodpecker:
  storage:
    type: service              # enables staged storage
    rootPath: /var/lib/woodpecker # local staging directory

# Object storage tier (for compacted data)
minio:
  address: localhost
  port: 9000
  accessKeyID: minioadmin
  secretAccessKey: minioadmin
  bucketName: a-bucket
  rootPath: files

Benefits

  • Low write latency — Writes go to local disk first, avoiding object storage API overhead
  • High durability — Compaction ensures all data reaches object storage for long-term persistence
  • Efficient reads — Active data served from local disk; compacted data served from object storage with prefetching
  • Automatic lifecycle — Compaction runs automatically based on configured policies
  • Reduced API costs — Merging blocks before upload reduces the number of object storage API calls

Storage Backend Comparison

Aspect         Object Storage (MinIO/S3)  Local File System    Staged Storage
Config Value   default / minio            local                service
Mode           Embedded                   Embedded (dev/test)  Service
Write Latency  Higher (network)           Lowest (local I/O)   Low (local disk)
Durability     High (cloud-replicated)    Node-level only      High (compacted to cloud)
Capacity       Unlimited                  Disk-limited         Unlimited (after compaction)
Best For       Production embedded        Development/testing  Production service mode

Condition Write & Fence

Split-brain prevention using condition write as a fence mechanism.

Overview

In embedded mode, Woodpecker uses condition write as a fence mechanism to prevent split-brain scenarios where a stale primary node resumes writing after being partitioned or crashed. This ensures data consistency and prevents concurrent write conflicts.

Supported Storage Backends

Backend            Support  Mechanism
MinIO              Full     If-None-Match header
AWS S3             Full     Conditional PUT operations
Azure Blob         Full     Conditional writes
GCP Cloud Storage  Full     Conditional operations
Aliyun OSS         Wrapped  Forbid overwrite feature
Tencent COS        Wrapped  Forbid overwrite feature

Detection Modes

Auto Mode (Default)

Attempts to detect condition write support up to 30 times with exponential backoff. If any detection succeeds, it's enabled cluster-wide. If all fail, falls back to distributed locks.

Cluster consensus is achieved via CAS (Compare-And-Swap) in etcd. The first successful detection determines the cluster-wide setting.
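
The first-detection-wins behavior can be illustrated locally with an atomic compare-and-swap. In Woodpecker the CAS is a transaction against an etcd key; this in-process sketch only reproduces the winner-takes-all semantics:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Possible cluster-wide settings for condition write support.
const (
	unset    int32 = 0
	enabled  int32 = 1
	disabled int32 = 2
)

// clusterSetting stands in for the etcd key holding the cluster-wide result.
var clusterSetting int32 = unset

// publish attempts the CAS: only the first detector to finish wins,
// and every later attempt observes the winner's value.
func publish(result int32) bool {
	return atomic.CompareAndSwapInt32(&clusterSetting, unset, result)
}

func main() {
	var wg sync.WaitGroup
	var winners int32
	// Several nodes finish detection concurrently; exactly one CAS succeeds.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if publish(enabled) {
				atomic.AddInt32(&winners, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Printf("winners=%d setting=%d\n", winners, atomic.LoadInt32(&clusterSetting))
	// Prints: winners=1 setting=1
}
```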

Enable Mode

Strict requirement: detection must succeed within 10 retries or the system panics. Use when you're certain your backend supports condition write.

Disable Mode

Skips detection entirely and uses distributed locks. Use when you know your backend doesn't support condition write.

Distributed Lock Fallback

When condition write is unavailable, Woodpecker uses etcd-based distributed locks:

Property     Detail
Lock Type    etcd concurrency.Mutex with concurrency.Session
Session TTL  10 seconds (auto-renewed via keep-alive)
Lock Key     /woodpecker/service/lock/{logName}
Acquisition  Non-blocking TryLock

Failover Behavior

  1. Session Expiration — etcd session expires after TTL (10s) if keep-alive fails
  2. Lock Release — Distributed lock is automatically released
  3. New Primary — New node acquires the lock
  4. Writer Switch — Applications reopen writer (up to 10s wait)

Performance Comparison

Aspect            Condition Write              Distributed Lock
Lock overhead     None                         Per-writer acquisition
Failover latency  Immediate                    Up to 10 seconds
Concurrency       Multiple attempts, one wins  Single writer at a time

Configuration

woodpecker.yaml
woodpecker:
  logstore:
    fencePolicy:
      conditionWrite: "auto"   # "auto" | "enable" | "disable"

Monitoring & Metrics

Built-in Prometheus metrics and OpenTelemetry tracing for production observability.

Prometheus Metrics

Woodpecker exposes comprehensive Prometheus metrics for monitoring system health and performance. Each node exposes metrics on a dedicated port (default 9091–9094 for nodes 1–4).

Key Metrics

  • Client Append (woodpecker_client_append_latency, _requests_total, _bytes) — Write QPS, latency P50/P99, message size distribution
  • Client Read (woodpecker_client_read_requests_total, _reader_bytes_read) — Read QPS, throughput, reader operation latency
  • Server LogStore (woodpecker_server_logstore_active_logs, _active_segments) — Resource usage, segment lifecycle, operation success rate
  • Server File/Buffer (woodpecker_server_file_flush_latency, _compaction_latency) — Persistence latency, compaction performance, buffer queuing
  • Object Storage (woodpecker_server_object_storage_operations_total, _operation_latency) — S3 call frequency, latency P50/P99, bandwidth usage
  • System Resources (woodpecker_server_system_cpu_usage, _memory_usage_ratio) — Node load, memory pressure, I/O bottleneck detection
  • gRPC (grpc_server_handled_total, _handling_seconds) — RPC QPS, success rate, error code distribution

OpenTelemetry Tracing

Woodpecker integrates with OpenTelemetry for distributed tracing. The deployment includes Jaeger (port 16686) for trace visualization. gRPC interceptors automatically propagate trace context in service mode.

Setting Up Monitoring Stack

Use the provided Docker Compose overlay to add Prometheus and Grafana to the cluster:

bash
# Start cluster with Prometheus + Grafana monitoring
cd deployments
docker compose -f docker-compose.yaml \
  -f ../tests/docker/monitor/docker-compose.monitor.yaml \
  -p woodpecker-monitor up -d

Access the monitoring services:

Service        URL                     Credentials
Prometheus     http://localhost:9090   (none)
Grafana        http://localhost:3000   Anonymous admin (no login)
Jaeger         http://localhost:16686  (none)
MinIO Console  http://localhost:9001   minioadmin / minioadmin

Grafana Dashboards

Pre-built Grafana dashboard templates are available in tests/docker/monitor/grafana/templates/. These provide ready-to-use visualizations for write/read throughput, latency percentiles (P50/P95/P99), segment state distribution, storage backend performance, and error rates.

Run Monitoring Tests

Automated E2E metric verification tests are provided:

bash
# One-click: build, deploy, test, and cleanup
cd tests/docker/monitor
./run_monitor_tests.sh

# Keep cluster running after tests for manual inspection
./run_monitor_tests.sh --no-cleanup

The monitoring test suite also includes a rolling restart latency profile test that measures write/read latency across node restarts — useful as a baseline for evaluating rolling upgrade quality.