Woodpecker

A High-Performance Cloud-Native WAL Storage Implementation

[Architecture diagram. Client Layer: Client, Writer, Reader. Server Layer: Log Store, Processor, Manager. Storage Layer: Object, File, Cache. Log Flow: Segment 1 to Segment 3, LogFile 1 to LogFile 6.]

Key Features

Woodpecker provides cloud-native write-ahead logging (WAL) designed for high performance and reliability.

Cloud-Native WAL

Uses cloud object storage as the durable storage layer, ensuring scalability and cost-effectiveness.

High-Throughput Writes

Optimized for cloud storage with specialized write strategies to maximize sequential write throughput.

Efficient Log Reads

Utilizes memory management and prefetching strategies to optimize sequential log access.

Ordered Durability

Guarantees strict sequential ordering for log persistence.

Flexible Deployment

Can be deployed as a standalone service or integrated as an embedded library in your application.

Resilient & Fault-Tolerant

Relies on the reliability features of the underlying cloud storage to provide strong durability guarantees.

System Architecture

Woodpecker is designed with a modular architecture that separates concerns into distinct layers, enabling efficient data flow and management.

The system consists of three main layers: Client Layer, Server Layer, and Storage Layer, each with specific responsibilities and components.

Core Components

The building blocks of the Woodpecker system

Client Layer

  • Client
  • Log Writer/Reader
  • Segment Handler

Server Layer

  • Log Store
  • Segment Processor
  • LogFile Writer/Reader

Storage Layer

  • Object Storage
  • Local File System
  • Cache System

Core Concepts

Understanding the fundamental data model of Woodpecker

Key Workflows

How data flows through the Woodpecker system

Write Flow

  1. The client calls AppendAsync to append data to the log
  2. The data is first written to an in-memory buffer
  3. When specific conditions are met, a Sync operation is triggered
  4. Sync persists the buffered data to the storage system as log file fragments
  5. The client receives an acknowledgment of the successful write
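
The write steps above can be sketched in a few lines of Python. This is an illustrative model only, not Woodpecker's API: the names InMemoryObjectStore, LogWriter, append_async, and max_buffer_bytes are hypothetical, the buffer-size threshold is just one assumed example of a sync condition, and the length-prefixed fragment layout is invented for the sketch.

```python
from concurrent.futures import Future
from typing import Dict, List, Tuple


class InMemoryObjectStore:
    """Hypothetical stand-in for an object storage backend."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class LogWriter:
    """Buffers appended entries and persists them as fragments on sync."""

    def __init__(self, store: InMemoryObjectStore, max_buffer_bytes: int = 64) -> None:
        self.store = store
        self.max_buffer_bytes = max_buffer_bytes  # assumed sync condition
        self.buffer: List[Tuple[int, bytes, Future]] = []
        self.buffered_bytes = 0
        self.next_entry_id = 0
        self.next_fragment_id = 0

    def append_async(self, payload: bytes) -> Future:
        """Steps 1-2: assign an entry id and place the payload in the buffer."""
        future: Future = Future()
        self.buffer.append((self.next_entry_id, payload, future))
        self.next_entry_id += 1
        self.buffered_bytes += len(payload)
        # Step 3: a size threshold is one possible trigger for a sync.
        if self.buffered_bytes >= self.max_buffer_bytes:
            self.sync()
        return future

    def sync(self) -> None:
        """Steps 4-5: persist the buffer as one fragment, then acknowledge."""
        if not self.buffer:
            return
        pending, self.buffer = self.buffer, []
        self.buffered_bytes = 0
        # Length-prefix each entry so the fragment can be split again later.
        body = b"".join(len(p).to_bytes(4, "big") + p for _, p, _ in pending)
        key = f"fragment-{self.next_fragment_id:06d}"
        self.store.put(key, body)
        self.next_fragment_id += 1
        # Acknowledge in entry order, and only after the fragment is stored.
        for entry_id, _, future in pending:
            future.set_result(entry_id)
```

In this sketch the future returned by append_async stays pending while the entry is only in memory and resolves to the entry id once the fragment containing it has been stored, so an acknowledgment always implies persistence. Entries are acknowledged in the order they were appended.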

Read Flow

  1. The client creates a Reader for a specific range
  2. The Reader locates the appropriate segment through the segment manager
  3. The LogFile is retrieved from the segment, and data is read from its fragments
  4. Data may be served from the cache or fetched from object storage
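
The read steps above can be modeled with a small Python sketch. All names here (Segment, SegmentManager, FragmentStore, CachedFragments, read_range) are hypothetical and chosen for illustration. The sketch also makes simplifying assumptions: entry ids are contiguous integers, the LogFile level is collapsed into the segment, and the cache is a plain read-through dictionary with no eviction.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class Segment:
    first_entry_id: int            # id of the first entry in the segment
    fragment_keys: List[str]       # storage keys, in entry order
    fragment_first_ids: List[int]  # first entry id held by each fragment


class SegmentManager:
    """Locates the segment that contains a given entry id."""

    def __init__(self, segments: List[Segment]) -> None:
        self.segments = sorted(segments, key=lambda s: s.first_entry_id)
        self._starts = [s.first_entry_id for s in self.segments]

    def locate(self, entry_id: int) -> int:
        pos = bisect_right(self._starts, entry_id) - 1
        if pos < 0:
            raise KeyError(f"entry {entry_id} precedes the first segment")
        return pos


class FragmentStore:
    """Hypothetical object storage stand-in that counts fetches."""

    def __init__(self, objects: Dict[str, List[bytes]]) -> None:
        self.objects = objects
        self.fetches = 0

    def get(self, key: str) -> List[bytes]:
        self.fetches += 1
        return self.objects[key]


class CachedFragments:
    """Read-through cache placed in front of the fragment store."""

    def __init__(self, store: FragmentStore) -> None:
        self.store = store
        self.cache: Dict[str, List[bytes]] = {}

    def get(self, key: str) -> List[bytes]:
        if key not in self.cache:
            self.cache[key] = self.store.get(key)  # cache miss: fetch once
        return self.cache[key]


def read_range(manager: SegmentManager, fragments: CachedFragments,
               start: int, end: int) -> Iterator[bytes]:
    """Yield the payloads of entries with ids in [start, end), in order."""
    if start >= end:
        return
    for segment in manager.segments[manager.locate(start):]:
        if segment.first_entry_id >= end:
            return
        # Skip fragments that lie entirely before the requested range.
        first = max(bisect_right(segment.fragment_first_ids, start) - 1, 0)
        for key, first_id in zip(segment.fragment_keys[first:],
                                 segment.fragment_first_ids[first:]):
            if first_id >= end:
                return
            for offset, payload in enumerate(fragments.get(key)):
                if start <= first_id + offset < end:
                    yield payload
```

Reading the same range twice through one CachedFragments instance fetches each fragment from the store only once; the second pass is served from the cache. Fragments that fall entirely outside the requested range are never fetched.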

Configuration & Tuning

Key parameters for optimizing Woodpecker performance

Future Development

Planned improvements and enhancements

Short-term Goals

  • Improve test coverage and system stability
  • Optimize memory usage and reduce resource consumption
  • Enhance monitoring and diagnostic capabilities

Mid-term Goals

  • Support more storage backends (beyond MinIO)
  • Implement more advanced caching strategies
  • Provide richer client interfaces and tools

Long-term Vision

  • Distributed deployment support
  • Advanced data analysis and query capabilities
  • Integration with more ecosystems

Reference Documentation

Detailed technical documentation for developers