Evaluation
This section contains evaluation and benchmarking analyses of our fire risk estimates. These notebooks compare our results against historical fire data and other datasets, and provide detailed statistical assessments.
Overview
Our evaluation approach includes multiple independent analyses to validate the quality and reliability of our fire risk estimates:
- Comparison with historical fire data - Benchmarking against 70+ years of actual burn perimeters
- Cross-dataset validation - Comparing with established fire risk datasets
- Regional deep-dives - Detailed analysis of specific geographic areas and historical events
- Methodological validation - Statistical assessment of our scoring and classification approaches
Evaluation Notebooks
Benchmarking
Comparison of our burn probability estimates against historical U.S. fire perimeter data. This analysis adapts methods from Moran et al. (2025) to benchmark our model-derived burn probabilities. It examines all pixels together and, separately, the "non-burnable" areas where we extended estimates beyond the original Riley et al. (2025) coverage.
Key analyses:
- Distribution of burn probability in historically burned vs. unburned areas
- Performance assessment in areas classified as "non-burnable" in the original coverage that we designated as burnable
- Statistical comparison with 70+ years of fire history
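As a rough illustration of the first analysis in the list above, the sketch below splits per-pixel burn probabilities by whether a historical fire perimeter covers the pixel and summarizes each group. The function name `summarize_by_burn_history`, the flat-list inputs, and the sample values are all assumptions made for this example; the notebook itself may compute different statistics.

```python
from statistics import mean, median


def summarize_by_burn_history(burn_prob, burned_mask):
    """Summarize modeled burn probability for burned vs. unburned pixels.

    burn_prob:   flat list of modeled burn probabilities (hypothetical input).
    burned_mask: flat list of booleans, True where a historical fire
                 perimeter covers the pixel (hypothetical input).
    """
    if len(burn_prob) != len(burned_mask):
        raise ValueError("burn_prob and burned_mask must have the same length")

    burned = [p for p, b in zip(burn_prob, burned_mask) if b]
    unburned = [p for p, b in zip(burn_prob, burned_mask) if not b]
    if not burned or not unburned:
        raise ValueError("need at least one burned and one unburned pixel")

    def describe(values):
        return {"n": len(values), "mean": mean(values), "median": median(values)}

    return {"burned": describe(burned), "unburned": describe(unburned)}


# Illustrative values only, not results from the analysis.
probs = [0.01, 0.02, 0.03, 0.001, 0.002, 0.003]
mask = [True, True, True, False, False, False]
summary = summarize_by_burn_history(probs, mask)
print(summary["burned"], summary["unburned"])
```

If the estimates carry useful spatial signal, the burned group would be expected to show a higher central tendency than the unburned group.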
California Comparison
Detailed comparison of our risk estimates with two authoritative datasets covering California: the national Wildfire Risk to Communities (WRC) project and CAL FIRE's California Fire Hazard Severity Zones.
Key analyses:
- Census tract-level concordance analysis using Kendall's Tau
- Spatial patterns of agreement and disagreement
- Regional variation in performance metrics
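For readers unfamiliar with Kendall's Tau, the sketch below computes the tau-b variant (which corrects for ties) over paired tract-level values using only the standard library. The function name `kendall_tau_b` and the sample scores are hypothetical, and the pairwise loop is written for clarity rather than speed; for real tract counts, `scipy.stats.kendalltau` computes tau-b by default and is the more practical choice.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from every count
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1

    untied = concordant + discordant
    denominator = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one input is constant")
    return (concordant - discordant) / denominator


# Hypothetical tract-level values: our estimate vs. a reference dataset.
our_tract_risk = [1, 2, 3, 4, 5]
reference_tract_risk = [1, 3, 2, 5, 4]
print(kendall_tau_b(our_tract_risk, reference_tract_risk))
```

A value near 1 means the two datasets rank tracts in nearly the same order, a value near -1 means the rankings are reversed, and a value near 0 means little rank agreement.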
Key features:
- Bin prevalence that decreases monotonically as risk score increases
- Distribution-based bin design using building-level data
- Comparison with other scoring approaches
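One way to picture a distribution-based bin design is sketched below: bin edges are taken from quantiles of building-level risk values so that each successive bin holds a smaller share of buildings. The target shares, function names, and sample data here are hypothetical and chosen only to make the idea concrete; they are not the project's actual bins.

```python
from bisect import bisect_left
from collections import Counter


def quantile_edges(values, target_shares):
    """Return upper edges for all but the last bin.

    Bin i is sized to hold roughly target_shares[i] of the values.
    """
    if abs(sum(target_shares) - 1.0) > 1e-9:
        raise ValueError("target_shares must sum to 1")
    ordered = sorted(values)
    n = len(ordered)
    edges = []
    cumulative = 0.0
    for share in target_shares[:-1]:
        cumulative += share
        index = min(n - 1, max(0, round(cumulative * n) - 1))
        edges.append(ordered[index])
    return edges


def assign_bins(values, edges):
    """Map each value to a bin index; a value equal to an edge stays in the lower bin."""
    return [bisect_left(edges, v) for v in values]


def bin_prevalence(bins, n_bins):
    """Share of items falling in each bin, ordered from lowest to highest score."""
    counts = Counter(bins)
    return [counts.get(i, 0) / len(bins) for i in range(n_bins)]


def is_strictly_descending(shares):
    """True when every bin holds a smaller share than the bin before it."""
    return all(a > b for a, b in zip(shares, shares[1:]))


# Hypothetical building-level risk values and target shares.
building_risk = list(range(1, 11))
shares = [0.4, 0.3, 0.2, 0.1]
edges = quantile_edges(building_risk, shares)
prevalence = bin_prevalence(assign_bins(building_risk, edges), len(shares))
print(edges, prevalence, is_strictly_descending(prevalence))
```

Heavy ties in real building-level data can make realized shares drift from the targets, so the monotonicity check is worth running on the assigned bins rather than assuming it from the design.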
Comparing Risk Rasters
Visual comparison of our 30 m resolution risk rasters with those from the Wildfire Risk to Communities project. This notebook highlights regions where the datasets differ and explains the underlying causes, including the effects of wind modeling and development patterns.
Key analyses:
- Areas of low correlation between datasets
- Regions with high and low bias
- Historical fire locations (Eaton Fire, Marshall Fire, Camp Fire)
- "Wind effect" vs. "development effect" attribution
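To make "low correlation" and "bias" concrete, the sketch below computes a Pearson correlation and a mean difference for each square block of two rasters. It assumes both rasters are already aligned on the same grid and held as nested lists; the function names, the block size, and the sample values are hypothetical, and the notebook may locate these regions in a different way.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation, or None when either input has zero variance."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


def block_stats(ours, reference, block):
    """Correlation and mean bias (ours minus reference) per block x block window."""
    rows, cols = len(ours), len(ours[0])
    stats = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [
                (r, c)
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            a = [ours[r][c] for r, c in cells]
            b = [reference[r][c] for r, c in cells]
            bias = sum(x - y for x, y in zip(a, b)) / len(a)
            stats.append({"row": r0, "col": c0, "corr": pearson(a, b), "bias": bias})
    return stats


# Tiny illustrative rasters, not real data.
ours = [[1, 2, 4, 3], [3, 4, 2, 1]]
reference = [[2, 4, 1, 2], [6, 8, 3, 4]]
for entry in block_stats(ours, reference, block=2):
    print(entry)
```

Blocks with low or negative correlation mark places where the two datasets rank pixels differently, while blocks with large positive or negative bias mark places where one dataset is systematically higher.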
Comparing Risk at Buildings
Building-level comparison of risk estimates, examining how our approach differs from other datasets when evaluated at the scale of individual structures.
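Evaluating a raster at individual structures amounts to looking up the pixel that contains each building location. The sketch below does this for a north-up grid with square pixels; the function name, the origin convention (top-left corner), and the coordinates are assumptions for illustration, and both the raster and the points are assumed to share one projected coordinate system.

```python
from math import floor


def sample_at_points(raster, origin_x, origin_y, pixel_size, points):
    """Return the raster value under each (x, y) point, or None if outside the grid.

    raster:     nested list of rows, first row at the top (north-up).
    origin_x/y: coordinates of the raster's top-left corner.
    pixel_size: edge length of a square pixel, in the same units as the points.
    """
    rows, cols = len(raster), len(raster[0])
    values = []
    for x, y in points:
        col = floor((x - origin_x) / pixel_size)
        row = floor((origin_y - y) / pixel_size)  # y decreases going down rows
        if 0 <= row < rows and 0 <= col < cols:
            values.append(raster[row][col])
        else:
            values.append(None)
    return values


# A 2 x 2 grid of 30 m pixels with hypothetical building coordinates.
grid = [[1, 2], [3, 4]]
buildings = [(15, 45), (45, 15), (100, 10)]
print(sample_at_points(grid, origin_x=0, origin_y=60, pixel_size=30, points=buildings))
```

Sampling two datasets at the same building locations yields paired values that can then be compared directly, structure by structure.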
Key Findings
Our evaluation demonstrates that:
- Historical concordance: Areas with higher burn probability in our estimates correspond well with areas that have historically burned, validating the spatial distribution of risk
- Regional performance: Our estimates show strong concordance with established California fire hazard maps, with performance comparable to or exceeding other national datasets
- Wind modeling impact: Our incorporation of wind effects produces meaningful differences in developed areas, particularly in regions prone to wind-driven fires
- Scoring validity: Our categorical scoring system effectively captures the full range of risk while maintaining interpretability across different risk levels
References
- Finney, M.A., et al. (2011). A simulation of probabilistic wildfire risk components for the continental United States. Stochastic Environmental Research and Risk Assessment.
- Moran, C.J., et al. (2025). Benchmarking burn probability maps in California using historical fire perimeters. Scientific Reports.
- Riley, K.L., et al. (2025). Wildfire Risk to Communities methodology and data products.