Skip to content

Snapshot Testing with Xarray/Zarr

OCR uses syrupy for snapshot testing with custom support for xarray DataArrays and Datasets stored in zarr format.

Overview

The XarraySnapshotExtension and GeoDataFrameSnapshotExtension allow you to:

  • Create and compare snapshots of xarray DataArrays and Datasets and GeoPandas dataframes
  • Store snapshots in S3 (default) or locally
  • Handle large datasets efficiently using zarr format and geoparquet for geodataframes.

Basic Usage

S3 Snapshots (Default)

By default, snapshots are stored in S3 at s3://carbonplan-ocr/integration-tests/snapshots/. This is configured in tests/conftest.py and can be overridden if needed.

import numpy as np
import pytest
import xarray as xr
import geopandas as gpd
from ocr.testing import XarraySnapshotExtension, GeoDataFrameSnapshotExtension


@pytest.fixture
def xarray_snapshot(snapshot):
    return snapshot.use_extension(XarraySnapshotExtension)

@pytest.fixture
def geodataframe_snapshot(snapshot):
    return snapshot.use_extension(GeoDataFrameSnapshotExtension)


def test_my_data(xarray_snapshot):
    data = xr.DataArray(
        np.array([[1, 2, 3], [4, 5, 6]]),
        dims=('x', 'y'),
    )
    assert xarray_snapshot == data


def test_geodataframe_snapshot_extension(geodataframe_snapshot) -> None:
    """Test snapshot extension for GeoDataFrames."""
    gdf = gpd.GeoDataFrame(
        {
            'name': ['Point A', 'Point B', 'Point C'],
            'value': [10.5, 20.3, 30.7],
            'category': ['type1', 'type2', 'type1'],
        },
        geometry=[Point(-120.0, 35.0), Point(-119.5, 35.5), Point(-119.0, 36.0)],
        crs='EPSG:4326',
    )
    assert geodataframe_snapshot == gdf

Run tests with:

# Create/update snapshots in S3
pixi run pytest tests/test_snapshot.py --snapshot-update

# Verify snapshots against S3
pixi run pytest tests/test_snapshot.py

Local Storage (Override)

If you prefer to store snapshots locally during development, you can override the default:

# Create/update local snapshots
SNAPSHOT_STORAGE_PATH=tests/__snapshots__ pixi run pytest tests/test_snapshot.py --snapshot-update

# Verify local snapshots
SNAPSHOT_STORAGE_PATH=tests/__snapshots__ pixi run pytest tests/test_snapshot.py

Configuration

The snapshot storage location is controlled by the SNAPSHOT_STORAGE_PATH environment variable:

  • Default: s3://carbonplan-ocr/integration-tests/snapshots/ (set in tests/conftest.py)
  • Override: Set SNAPSHOT_STORAGE_PATH to any local path or S3 URI

Examples

# Use default S3 storage (no env var needed)
pixi run tests-integration

# Override with local storage
SNAPSHOT_STORAGE_PATH=tests/__snapshots__ pixi run tests-integration

# Override with different S3 bucket
SNAPSHOT_STORAGE_PATH=s3://my-bucket/snapshots pixi run tests-integration

Snapshot Organization

Snapshots are named based on the test file and function:

{test_file}_{test_function}.zarr

Example:

  • Test file: tests/test_fire_risk.py
  • Test function: test_wind_adjustment
  • Snapshot name: test_fire_risk_test_wind_adjustment.zarr

Managing S3 Snapshots

List snapshots

aws s3 ls s3://carbonplan-ocr/integration-tests/snapshots/ --recursive

Delete old snapshots

# Delete a specific snapshot
aws s3 rm s3://carbonplan-ocr/integration-tests/snapshots/test_snapshot_test_example.zarr --recursive

# Delete all snapshots (careful!)
aws s3 rm s3://carbonplan-ocr/integration-tests/snapshots/ --recursive

Example: Complete Test Setup

# tests/test_fire_risk.py
import numpy as np
import pytest
import xarray as xr
from ocr.testing import XarraySnapshotExtension


@pytest.fixture
def xarray_snapshot(snapshot):
    return snapshot.use_extension(XarraySnapshotExtension)

@pytest.mark.integration # mark this test as integration
def test_fire_risk_calculation(xarray_snapshot):
    # Your data processing
    result = xr.Dataset({
        'fire_risk': xr.DataArray(
            np.random.rand(100, 100),
            dims=('lat', 'lon'),
            coords={
                'lat': np.linspace(30, 40, 100),
                'lon': np.linspace(-120, -110, 100),
            }
        )
    })

    # Snapshot assertion
    assert xarray_snapshot == result

Run with:

# Default S3 storage
pixi run pytest tests/test_fire_risk.py --snapshot-update
pixi run pytest tests/test_fire_risk.py

# Local storage for development
SNAPSHOT_STORAGE_PATH=tests/__snapshots__ pixi run pytest tests/test_fire_risk.py --snapshot-update
SNAPSHOT_STORAGE_PATH=tests/__snapshots__ pixi run pytest tests/test_fire_risk.py

# Custom S3 bucket
SNAPSHOT_STORAGE_PATH=s3://my-bucket/snapshots pixi run pytest tests/test_fire_risk.py --snapshot-update
SNAPSHOT_STORAGE_PATH=s3://my-bucket/snapshots pixi run pytest tests/test_fire_risk.py