Generative AI for insurers: catastrophe models and hallucinations
Firms including Fathom, Verisk and Moody's RMS use diffusion models to create synthetic disasters.
TL;DR
- Firms including Fathom, Verisk and Moody's RMS use diffusion models to create synthetic disasters.
- As of June 25, 2026, the industry is testing tools that can generate tens of thousands of plausible scenarios and refine coarse simulated grids to a higher resolution for loss modeling.
- Fathom, a subsidiary of reinsurer Swiss Re, trained its diffusion tool on "roughly 1,000 years of existing climate simulations," then had it produce far more scenarios for a projected 2030 climate.
Insurers are using diffusion-based generative AI to produce thousands of synthetic weather events for future climates, a move intended to sharpen catastrophe risk estimates even where historical data is thin. As of June 25, 2026, the industry is testing tools that can generate tens of thousands of plausible scenarios and refine coarse simulated grids to a higher resolution for loss modeling.
How are insurers using generative AI for catastrophe models?
Diffusion models are being trained on existing climate simulations and then used to synthesize far more events than the original simulations contain, producing tens of thousands of plausible weather events to feed risk models. Fathom, a subsidiary of reinsurer Swiss Re, trained its diffusion tool on "roughly 1,000 years of existing climate simulations," then had it produce far more scenarios for a projected 2030 climate. A second, image-sharpening model refines the initial 100 × 100 kilometer resolution down to 10 × 10 kilometers, which Fathom says is fine enough to capture precipitation patterns.
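To make the resolution step concrete: going from 100 km cells to 10 km cells means each coarse cell is replaced by 10 × 10 = 100 fine cells. The sketch below is not Fathom's model. It uses plain nearest-neighbor copying as a stand-in so the grid arithmetic is visible; a learned sharpening model would instead fill those fine cells with plausible detail. The function name `upsample_nearest` and the sample values are hypothetical.

```python
def upsample_nearest(coarse, factor=10):
    """Copy each coarse cell into a factor x factor block of fine cells.

    A naive baseline only: a learned sharpening model would add
    spatial detail instead of repeating the coarse value.
    """
    fine = []
    for row in coarse:
        # Stretch the row horizontally: each value appears `factor` times.
        fine_row = [value for value in row for _ in range(factor)]
        # Stretch vertically: repeat the stretched row `factor` times.
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


# Hypothetical rainfall values on a 2 x 2 grid of 100 km cells.
coarse = [[1.0, 2.0], [3.0, 4.0]]
fine = upsample_nearest(coarse, factor=10)

# Number of 10 km cells that fit inside one 100 km cell.
cells_per_coarse_cell = (100 // 10) ** 2

print(len(coarse), "x", len(coarse[0]), "->", len(fine), "x", len(fine[0]))
print("fine cells per coarse cell:", cells_per_coarse_cell)
```

The hundredfold increase in cell count is why the sharpening step matters: all of the detail inside each coarse cell has to come from the model rather than from the original simulation.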
Competitors are taking related approaches. Verisk uses generative AI to model extreme wind and rain together rather than sequentially, a change its research chief Jay Guin says captures spatial variability more precisely than traditional machine learning. Moody's RMS applies AI to analyze satellite imagery after wildfires and hurricanes and to estimate insured losses. Firas Saleh, who leads Moody's flood and wildfire modeling for North America, says the technology is especially valuable for tail-risk events, rare catastrophes with almost no historical data.
What are the technical and commercial limits?
Generative AI can produce realistic-looking scenarios, but those scenarios can violate physical laws and produce implausible cases, creating a clear danger for downstream risk calculations. "You can hallucinate some absolute slop," warns Fathom's scientific director Oliver Wing. Those hallucinations may look convincing to human reviewers while misrepresenting the physics of storms, floods or fire spread.
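One way to picture a first line of defense is a basic sanity filter that runs on generated scenarios before they reach a loss model. The sketch below is an illustration only, not any vendor's validation pipeline. The function names and the `max_rain_mm` cap are hypothetical placeholders, and the cap is not a real physical limit. Checks this simple catch only crude violations such as negative or non-finite rainfall, not the subtle, convincing errors Wing describes.

```python
import math


def is_physically_plausible(rain_mm, max_rain_mm=2000.0):
    """Return True if every cell is finite, non-negative and under the cap.

    `max_rain_mm` is an assumed placeholder value, not a real limit.
    """
    for row in rain_mm:
        for value in row:
            if not math.isfinite(value) or value < 0.0 or value > max_rain_mm:
                return False
    return True


def filter_scenarios(scenarios, max_rain_mm=2000.0):
    """Keep only the scenarios that pass the basic plausibility check."""
    return [s for s in scenarios if is_physically_plausible(s, max_rain_mm)]


# Three hypothetical generated rainfall grids (mm per event).
scenarios = [
    [[12.0, 30.5], [0.0, 88.0]],          # passes
    [[12.0, -4.0], [0.0, 88.0]],          # negative rainfall
    [[12.0, float("nan")], [0.0, 88.0]],  # non-finite value
]

kept = filter_scenarios(scenarios)
print(f"kept {len(kept)} of {len(scenarios)} scenarios")
```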
Better models could also force hard commercial choices. More precise modeling might reveal higher potential losses in regions insurers have historically undercovered, such as parts of Bangladesh or Brazil, where major modelers have previously skipped work because of low asset values. Financial Times coverage cited in the source article notes that insurers "will generally purchase the model that allows them to do more business - that produces a lower loss estimate." One modeler added, "Underwriters just want to write more business." Those incentives can bias which models are adopted, independent of their scientific merit.
Why it matters
The shift to generative methods expands the effective sample of extreme events and can fill gaps where historical records are sparse, improving coverage of tail risks. At the same time, the combination of model hallucinations and commercial incentives risks producing loss estimates that are either dangerously optimistic or inconsistent across carriers. Swiss Re reported that natural disasters caused $220 billion in damage in 2025, of which only $107 billion was insured; more accurate modeling could change how that gap is priced or covered.
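The size of that gap follows directly from the two Swiss Re figures quoted above. A short worked calculation, using only those two numbers (the variable names are illustrative):

```python
# Swiss Re figures for 2025, in billions of US dollars.
total_damage_bn = 220.0
insured_bn = 107.0

# The protection gap is the damage that was not insured.
uninsured_bn = total_damage_bn - insured_bn
insured_share = insured_bn / total_damage_bn
uninsured_share = 1.0 - insured_share

print(f"Uninsured: ${uninsured_bn:.0f} billion ({uninsured_share:.1%} of damage)")
# prints: Uninsured: $113 billion (51.4% of damage)
```

By these figures, slightly more than half of the 2025 damage was uninsured.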
What to watch
Watch whether major insurers and reinsurers adopt models that produce materially different capital or premium requirements, and whether regulators or rating agencies require validation steps for AI-generated scenarios. Also track whether vendors publish error rates or physically grounded constraints for their generative pipelines, and whether the industry converges on standards for preventing physics-defying hallucinations.
Generative AI is expanding what catastrophe modeling can simulate, but the technology brings both new resolution and new failure modes. The coming year should show whether the industry prioritizes improved scientific fidelity or models that align with underwriters' business incentives.
Written by The Brieftide · Source: The Decoder