Skip to content

ZFS Guide

ZFS (Zettabyte File System) represents a revolutionary approach to storage management, combining a filesystem with a volume manager. Unlike traditional filesystems, ZFS provides end-to-end data integrity verification, automatic repair capabilities, efficient snapshots, and native RAID support—all while eliminating the complexity of separate volume management.

Core Concepts

Storage Pools (zpools) Storage pools form the foundation of ZFS storage management. Think of a pool as a dynamic storage container that:

  • Combines multiple physical storage devices into a unified storage resource
  • Manages physical storage allocation automatically
  • Provides built-in redundancy through RAID-like configurations
  • Allows all filesystems in the pool to share space dynamically
  • Eliminates the need for traditional volume management
  • Handles device failures and repairs automatically

Datasets Datasets are the fundamental building blocks for organizing data in ZFS. They offer:

  • Flexible organization through filesystem hierarchies
  • Independent property management (compression, quotas, etc.)
  • Dynamic space sharing from the pool without pre-allocation
  • Four main types:
    1. Filesystems: Standard filesystems for general data storage
    2. Volumes (zvols): Block devices for applications needing raw devices
    3. Snapshots: Point-in-time copies of filesystems or volumes
    4. Clones: Writable copies of snapshots

Snapshots Snapshots provide powerful data protection and recovery capabilities:

  • Create instantaneous, read-only copies of filesystems or volumes
  • Initially consume no additional space
  • Store only changed blocks from the active dataset
  • Enable quick recovery from mistakes or unwanted changes
  • Support efficient backups and rollbacks
  • Accessible through the hidden .zfs/snapshot directory

Basic Pool Operations

Creating Storage Pools Storage pools can be configured in various RAID-like configurations to balance redundancy, performance, and storage capacity:

# Simple pool with single disk (not recommended for important data)
zpool create tank /dev/sda

# Mirrored pool (RAID-1) - Best for 2-drive configurations
zpool create tank mirror /dev/sda /dev/sdb

# RAID-Z1 pool (Single parity, similar to RAID-5) - Minimum 3 drives
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc

# RAID-Z2 pool (Double parity, similar to RAID-6) - Minimum 4 drives
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Create pool with specific sector size alignment (recommended for modern drives)
zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb

Pool Management Regular pool maintenance and monitoring are essential for data integrity:

# Check pool status and health
zpool status tank

# List all pools and their basic information
zpool list

# Start pool scrub (verifies all data against checksums)
zpool scrub tank

# Import/export pools (for moving between systems)
zpool export tank
zpool import tank

# Destroy pool (permanent and irreversible)
zpool destroy tank

Dataset Operations

Creating and Managing Filesystems Datasets provide flexible data organization with inheritable properties:

# Create new filesystem
zfs create tank/data

# Create nested filesystem
zfs create tank/data/documents

# Set custom mount point
zfs set mountpoint=/mnt/data tank/data

# Enable compression (recommended for most use cases)
zfs set compression=lz4 tank/data

# Set quota to limit space usage
zfs set quota=100G tank/data

Dataset Destruction and Management Safe dataset removal requires understanding the implications:

# Remove dataset and all descendants
zfs destroy -r tank/data

# Dry run - show what would be destroyed
zfs destroy -r -n tank/data

# Remove specific snapshot
zfs destroy tank/data@snapshot1

# Remove all snapshots of a dataset
zfs destroy tank/data@%

# Remove snapshots matching pattern
zfs destroy tank/data@backup-2023*

Snapshot Management

Snapshots are crucial for data protection and provide a safety net for system administration:

# Create snapshot with timestamp
zfs snapshot tank/data@$(date +%Y-%m-%d-%H%M)

# Create recursive snapshot of dataset and all descendants
zfs snapshot -r tank@backup-2025-01-03

# List all snapshots
zfs list -t snapshot

# Rollback to snapshot (destroys newer snapshots)
zfs rollback tank/data@backup-2025-01-03

# Create writable clone from snapshot
zfs clone tank/data@backup-2025-01-03 tank/data_clone

Advanced Features

Data Transfer and Backup ZFS provides powerful tools for data replication and backup:

# Full backup: Send snapshot to file
zfs send tank/data@backup-2025-01-03 > backup.zfs

# Remote backup: Send to another system
zfs send tank/data@backup-2025-01-03 | ssh remote_host zfs recv backup/data

# Incremental backup: Send only changes between snapshots
zfs send -i tank/data@old tank/data@new | ssh backup_host zfs recv tank/backup

# Send with properties and resume support
zfs send -p tank/data@backup | gzip > backup.gz
zfs send -Rv tank/data@backup

Pool Device Management ZFS allows dynamic modification of storage pools:

# Add read cache device (L2ARC)
zpool add tank cache /dev/sdd

# Add write log device (ZIL)
zpool add tank log /dev/sde

# Remove device (only for some device types)
zpool remove tank /dev/sdd

# Take device offline for maintenance
zpool offline tank /dev/sda
zpool online tank /dev/sda

Dataset Properties ZFS provides extensive property management:

# View all properties
zfs get all tank/data

# Set multiple properties at once
zfs set quota=100G compression=lz4 tank/data

# Inherit property from parent dataset
zfs inherit compression tank/data

# Set property recursively on all descendants
zfs set compression=lz4 -r tank

# View specific properties
zfs get compression,compressratio,used,available tank/data

Performance Monitoring Regular monitoring helps maintain optimal performance:

# Monitor IO statistics in real-time
zpool iostat -v tank 5

# Check compression efficiency
zfs get compressratio tank/data

# Monitor space usage
zfs list -o name,used,available,referenced tank

# View detailed pool performance
zpool iostat -v tank 1

Advanced Maintenance

Health Monitoring Regular health checks are crucial for data integrity:

# Detailed pool health status
zpool status -v tank

# View command history
zpool history tank

# Check for errors across all pools
zpool status -x

# Clear transient device errors
zpool clear tank

Data Recovery ZFS provides various recovery options for different failure scenarios:

# Force import potentially damaged pool
zpool import -f tank

# Import pool with different name
zpool import oldtank newtank

# Recovery mode import (may lose recent transactions)
zpool import -F tank

# Replace failed drive
zpool replace tank /dev/sda /dev/sdc

Performance Tuning Optimize ZFS for specific workloads:

# Optimize for databases
zfs set recordsize=8K tank/database
zfs set logbias=throughput tank/database
zfs set primarycache=metadata tank/database

# Enable deduplication (requires significant RAM)
zfs set dedup=on tank/data

# Configure cache behavior
zfs set primarycache=all tank/data
zfs set secondarycache=all tank/data

Best Practices

Pool Design - Use ashift=12 for modern drives (4K sectors) - Choose appropriate RAID levels: - Mirror (RAID-1): Best for 2 drives, highest performance - RAID-Z1: Minimum 3 drives, single drive failure tolerance - RAID-Z2: Minimum 4 drives, two drive failure tolerance - Maintain 10-20% free space for optimal performance - Use whole disks instead of partitions - Consider separate log devices for sync-heavy workloads

Dataset Organization - Create separate datasets for different types of data - Use consistent naming conventions:

tank/
  ├── home/
  │   └── users/
  ├── vm/
  │   └── images/
  ├── backup/
  └── shared/
- Set appropriate properties per dataset type: - Databases: smaller recordsize, no compression - Virtual machines: larger recordsize, compression - User data: default recordsize, compression enabled

Backup Strategy - Implement 3-2-1 backup rule: - 3 copies of data - 2 different media types - 1 off-site copy - Regular snapshot schedule:

# Hourly snapshots retained for 24 hours
# Daily snapshots retained for 30 days
# Weekly snapshots retained for 6 months
- Test backup restoration regularly - Use incremental replication for efficiency

Maintenance Schedule - Daily Tasks: - Monitor pool status - Check for errors - Verify backup completion - Weekly Tasks: - Run scrub operations - Review snapshot usage - Check compression ratios - Monthly Tasks: - Review storage utilization - Clean up old snapshots - Verify backup integrity - Quarterly Tasks: - Test disaster recovery procedures - Review performance metrics - Update documentation

Prerequisites and Requirements

Hardware Considerations - Memory Requirements: - Minimum: 2GB RAM for basic usage - Recommended: 1GB RAM per 1TB storage for general use - Deduplication: 5GB RAM per 1TB of deduplicated data - Storage Devices: - Enterprise-grade drives recommended for production - SSDs for ZIL (write cache) and L2ARC (read cache) - ECC RAM highly recommended for data integrity - Network (for replication): - Gigabit or faster network for efficient backups - Dedicated network for replication traffic recommended

Software Requirements - Operating System: - 64-bit OS required - Kernel with ZFS support - Compatible ZFS version across systems for replication - Packages:

# Ubuntu/Debian
apt install zfsutils-linux

# RHEL/CentOS
dnf install zfs

# FreeBSD
ZFS included by default

Common Issues and Solutions

Pool Problems

Degraded Pool

# Check detailed status
zpool status -x

# Identify failed device
zpool status tank

# Replace failed device
zpool replace tank /dev/sda /dev/sdc

# Clear errors after resolution
zpool clear tank

Import Issues

# Scan for importable pools
zpool import

# Force import potentially damaged pool
zpool import -f tank

# Import pool with missing device
zpool import -m tank

Space Management

Finding Space Usage

# List large filesystems
zfs list -o name,used,referenced -s used

# Find snapshot space usage
zfs list -t snapshot -o name,used,referenced

# Show space usage by type
zfs list -o name,used,usedbysnapshots,usedbydataset

Recovering Space

# Remove old snapshots
zfs destroy tank/data@old-snapshot

# Destroy all snapshots of a dataset
zfs destroy tank/data@%

# Set compression to reduce space usage
zfs set compression=lz4 tank

Performance Issues

Slow Performance

# Check IO statistics
zpool iostat -v tank 1

# Monitor cache hit ratio
arc_summary

# Check device latency
zpool iostat -w tank

Fragmentation

# Check fragmentation level
zpool list -v tank

# Scrub pool to optimize
zpool scrub tank

Advanced Troubleshooting

Data Recovery Scenarios

Corrupted Data

# Check for checksum errors
zpool status -v tank

# Attempt automatic repair
zpool scrub tank

# Clear transient errors
zpool clear tank

Snapshot Recovery

# List all snapshots including hidden ones
zfs list -t snapshot -r tank

# Mount snapshot for file recovery
mkdir /mnt/recover
mount -t zfs tank/data@snapshot /mnt/recover

System Integration

Boot Environment Management

# Create boot environment snapshot
zfs snapshot rpool/ROOT/default@before_upgrade

# Clone boot environment
zfs clone rpool/ROOT/default@before_upgrade rpool/ROOT/new_be

Performance Monitoring Integration

# Monitor ZFS performance metrics
zpool iostat -v 5

# Archive performance data
zpool iostat -v 5 | tee /var/log/zfs/perf.log

Remember that ZFS is a powerful but complex system. Always: - Test changes in a non-production environment first - Maintain current backups before major operations - Document all configuration changes - Monitor system resources regularly - Keep ZFS software updated - Verify data integrity periodically

This comprehensive guide covers most common ZFS operations and maintenance tasks. However, specific environments may require additional tuning and configuration.