

AI is transforming emergency management and business continuity — and vendors are racing to claim the space. Some are building real capability. Others are wrapping ChatGPT in a branded interface and calling it innovation.
As someone who's spent twenty years in this field and now builds AI tools for it, I've watched procurement conversations up close and been part of more than a few. I've seen, and heard plenty about, what vendors promise in demos versus what their contracts actually say. I've talked to buyers who signed and regretted it, and buyers who asked the right questions and avoided costly mistakes.
Here are ten questions and seven buzzwords that will separate substance from marketing — regardless of which vendor you're evaluating, including us.
1. Does your contract match what you just told me?
This is the most important question on this list, and it should be the first thing you check.
Ask the vendor in the demo: Do you train your AI models on our data? Write down exactly what they say. Then read the contract. Read the data rights section, the license grants, the intellectual property clause. Compare the words on the page to the words in the room.
If they don't match, walk away. A company that says one thing in a meeting and writes another in the contract is telling you who they are.
2. What happens to my data if I leave?
Every vendor will tell you they don't lock you in. Ask them to prove it. Can you export all of your data — documents, outputs, findings, exercise records, user activity — in a format you can actually use? How long do they retain your data after termination? Is deletion automatic, or do you have to request it? What's the timeline?
"You can cancel anytime" is not the same as "you can leave with everything you brought and everything you built."
3. If I contribute content to a shared library, can I fully retract it?
Some platforms aggregate user-contributed content into shared knowledge bases. This can be valuable — but it raises a critical question about data recall.
If you upload a document to a shared library and later revoke permission, does the vendor delete only your original file, or do they also remove every derivative output that referenced your content in other users' sessions? If they can't recall derivative content, your data lives on in the system long after you've "deleted" it. Ask for specifics. "You can remove your documents at any time" is not the same thing as full retraction.
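To picture what full retraction actually requires on the vendor's side, here's a deliberately small sketch in plain Python. The record names are invented and this isn't any vendor's real design; the point is that every generated output has to carry a provenance trail back to its sources, or there is nothing to retract.

```python
# Minimal sketch, not any vendor's implementation: provenance-tracked outputs.
# Retraction is only "full" if derivative outputs can be found and removed too.

library = {"doc-42": "Countyville EOC activation annex (contributed)"}

# Each generated output records which source documents it drew on.
outputs = [
    {"id": "out-1", "owner": "user-A", "sources": ["doc-42"], "text": "..."},
    {"id": "out-2", "owner": "user-B", "sources": ["doc-42", "doc-7"], "text": "..."},
    {"id": "out-3", "owner": "user-B", "sources": ["doc-7"], "text": "..."},
]

def retract(doc_id):
    """Remove the original document AND every output that referenced it."""
    library.pop(doc_id, None)
    return [o for o in outputs if doc_id not in o["sources"]]

outputs = retract("doc-42")
print([o["id"] for o in outputs])  # only out-3 survives; out-1 and out-2 are gone
```

If the vendor can't point to something like that provenance trail, "you can remove your documents at any time" probably means only the original file comes out.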
4. Can I see the methodology behind your evaluations?
If a vendor tells you their tool evaluates your plans and gives you a score or a rating, ask what it's evaluating against. Is it a checklist? A compliance framework? A proprietary rubric? Can you see it?
A red-yellow-green scorecard is only useful if you understand the standard behind it. If the methodology is opaque — if the tool tells you your plan is "compliant" without explaining what compliance means in this context — then the output is decoration, not analysis. You need to know what's being measured, against what standard, and whether that standard is appropriate for your organization.
5. What does your tool actually produce — and what do I still have to do myself?
This is where demos get slippery. A tool that "designs exercises" might mean it generates scenario text you have to copy, paste, format, and organize into a situation manual yourself, which is not far from what ChatGPT or Claude would give you with a good prompt.
Or it might mean it produces a complete, structured, HSEEP-aligned exercise plan with objectives, injects, discussion questions, facilitation guidance, and exportable documentation.
Those are very different products at very different price points. Ask to see the actual output. Not a slide about the output. The output itself. Then ask: what do I have to do between this output and a deliverable I can hand to my boss?
6. Does your tool handle the full workflow, or just one piece of it?
Exercise design is not a single task. It's a workflow that includes document analysis, stakeholder engagement, threat intelligence gathering, scenario development, inject sequencing, facilitation planning, exercise delivery, observation capture, after-action reporting, and improvement planning.
Ask the vendor which of these steps happen inside their platform and which ones you're doing outside it — in Word, in email, in meetings, by hand. A tool that handles one step well but leaves you to manually manage the other nine is not a platform. It's a feature.
7. How does your tool handle stakeholder review and revision?
Exercise content rarely survives first contact with stakeholders. Situation manuals get reviewed, commented on, revised, and re-reviewed — sometimes by dozens of people across multiple agencies.
Ask the vendor: does this process happen inside your tool, or does it happen in email and tracked-changes Word documents? If a stakeholder submits fifty comments on a draft, does the tool help you adjudicate them, or are you reading them one by one and making changes manually? The review and revision cycle is where most exercise timelines die. If the tool doesn't address it, you haven't saved as much time as the demo suggested.
8. Can I get a free trial with my own data?
Not a guided demo with the vendor's sample data. A free trial where you upload your own plans, work with your own scenarios, and evaluate the tool against your actual workflow.
If the answer is no — if the vendor requires a contract before you can test the product with real-world content — ask yourself why. A product that can't survive contact with your reality is a product that's selling its demo, not its capability.
9. Who built this, and do they understand my work?
AI is a powerful technology, but it's a means, not an end. The quality of an AI tool for emergency management depends entirely on whether the people who built it understand emergency management — the doctrine, the workflows, the politics, the constraints, the difference between compliance theater and genuine readiness.
Ask who's on the team. Ask about their background in the field. Ask them to explain HSEEP, or core capabilities, or the difference between a tabletop discussion and a full-scale exercise. If the answers are vague, the product probably is too.
10. What does your tool look like after the first use?
The best demo is always the first one. The real question is what happens on day thirty, day ninety, day three hundred. Does the tool learn from your exercises? Does it connect findings to future planning? Does the second exercise build on the first, or are you starting from scratch every time?
A tool that treats every interaction as isolated is a utility. A tool that compounds — that gets smarter and more useful with every cycle — is a system. Ask which one you're buying.
Bonus: The Buzzword Decoder
AI vendors in this space love impressive-sounding language. Before you nod along in a demo, make sure you understand what the words actually mean — and what they don't.
"Powered by a frontier model"
This usually means the vendor is using OpenAI's GPT, Anthropic's Claude, Google's Gemini, or another commercially available large language model through an API. That's fine — most AI applications are built on top of these models, including ours. But "frontier model" is not a differentiator. It's an ingredient. Every vendor has access to the same foundational models. The question isn't which model powers the tool — it's what the vendor built on top of it. What are the workflows? What's the domain expertise? What guardrails exist? Saying "powered by a frontier model" without explaining what you've built around it is like a restaurant advertising "made with heat." The oven isn't the recipe.
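For a sense of why the model itself isn't the differentiator: a branded interface over a frontier model can be a few lines of glue code. The sketch below assumes the openai Python SDK and uses an illustrative model name; it's roughly all it takes to be "powered by a frontier model." Everything that makes a tool worth paying for has to live around a call like this, not inside it.

```python
# A minimal "powered by a frontier model" wrapper, sketched with the openai
# Python SDK. The model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def branded_assistant(user_prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative; any frontier model slots in here
        messages=[
            {"role": "system", "content": "You are an emergency management assistant."},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content

print(branded_assistant("Draft three objectives for a flood tabletop exercise."))
```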
"Doctrine-trained"
Ask what this actually means. In most cases, it means the vendor has ingested publicly available federal documents — FEMA guidance, CPGs, NRF, NIMS documentation — into their system so the AI can reference them. These documents are freely available to anyone. You can download them from fema.gov right now.
There's nothing wrong with building a knowledge base from public doctrine. But "doctrine-trained" implies a proprietary advantage that often doesn't exist. The question to ask is: trained how? Did they fine-tune a model on this content? Did they build a retrieval system that indexes it? Or did they simply upload the same PDFs you already have and let a chatbot search them? The rigor behind the integration matters far more than the fact that the documents are present.
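To make that distinction concrete, here's a toy sketch of what "a retrieval system that indexes doctrine" means in practice. It's standard-library Python with invented, paraphrased doctrine snippets, nothing like a production retriever: relevant passages are looked up per query and handed to the model with their source IDs, instead of the model free-associating from whatever it absorbed during training.

```python
# Toy retrieval sketch: index doctrine passages, look up the relevant ones per
# query, and pass them (with source IDs) to the model. The snippets below are
# invented paraphrases, not actual FEMA text.
from collections import Counter
import math
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

doctrine_chunks = {
    "cpg101-step3": "Determine goals and objectives based on the threat and hazard analysis.",
    "hseep-design": "Exercise objectives should be linked to core capabilities and drive scenario design.",
    "nims-command": "Incident command establishes common terminology and a manageable span of control.",
}

def score(query, chunk):
    """Crude relevance score: shared-term count, length-normalized."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    overlap = sum(min(q[t], c[t]) for t in q)
    return overlap / math.sqrt(len(c) or 1)

def retrieve(query, k=2):
    """Return the k most relevant doctrine chunks, with their source IDs."""
    ranked = sorted(doctrine_chunks.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return ranked[:k]

# The retrieved chunks would go into the model's context so its answer can cite
# specific passages, rather than relying on what the model may have memorized.
for chunk_id, text in retrieve("how do exercise objectives relate to core capabilities?"):
    print(chunk_id, "->", text)
```

A vendor doing this well will talk about chunking, citations, and how they keep the index current. A vendor who "uploaded the PDFs" usually can't.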
"Closed-circuit AI"
This phrase sounds like a security innovation. Usually, it means the vendor doesn't send your data to a third-party AI model for training — which is a standard contractual provision that any responsible AI vendor should offer. It's table stakes, not a feature.
Ask what specifically is "closed." Is the AI model running in an isolated environment? Are your queries and documents processed without leaving the vendor's infrastructure? Does it mean the model won't access the internet to answer your queries? Does it mean the model will only answer using your provided context — your documents, your plans, your data? Or does "closed-circuit" just mean they checked a box in their API agreement with OpenAI or Anthropic that says "don't train on this customer's data" — which is the default setting for enterprise API usage anyway?
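One of those possible meanings, answering only from the context you supply, is easy to picture. The sketch below is plain string assembly with an invented excerpt; whether a vendor's "closed-circuit" actually enforces something like this, and how, is exactly what you're asking them to explain.

```python
# Sketch of one possible meaning of "closed-circuit": the model is instructed to
# answer only from the excerpts you supply and to say so when it can't.
# Plain string assembly; the excerpt text is an invented placeholder.
def build_context_only_prompt(question: str, excerpts: list[str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {text}" for i, text in enumerate(excerpts))
    return (
        "Answer using ONLY the excerpts below. If the answer is not in them, "
        "reply 'Not found in the provided documents.' Cite excerpt numbers.\n\n"
        f"Excerpts:\n{numbered}\n\nQuestion: {question}"
    )

prompt = build_context_only_prompt(
    "Who can activate the EOC after hours?",
    ["Countyville EOP, Annex C: the duty officer may activate the EOC when..."],
)
print(prompt)
```

Note that an instruction like this is a behavior, not a guarantee. Ask whether grounding is enforced and verified, or just prompted for.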
Security and data privacy matter enormously. But a branded term for an industry-standard practice shouldn't be confused with a technical breakthrough.
"AI-powered plan evaluation"
What is it evaluating against? If the tool gives your plan a score, a grade, or a red-yellow-green rating, you need to understand the rubric. Is it measuring compliance with a specific standard? Completeness against a checklist? Alignment with a framework like CPG 101?
CPG 101, for example, describes a process for developing plans. It doesn't define what a good plan contains in operational terms. A tool that tells you your plan is "CPG 101 compliant" might be telling you that you followed the right process steps — not that your plan would actually work in a disaster. Make sure you understand whether the evaluation measures process adherence or operational substance. They're very different things.
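To see why the rubric matters more than the color, consider a deliberately crude sketch. The rubric items and keyword matching below are invented and nothing like a real evaluator; the point is that the same plan can rate green against a process rubric and red against a substance rubric.

```python
# Deliberately crude sketch: a red-yellow-green rating is just a rubric plus a
# threshold. The rubric items and matching logic are invented examples.
process_rubric = [            # measures process adherence
    "planning team identified",
    "hazard analysis referenced",
    "plan approval documented",
]
substance_rubric = [          # measures operational substance
    "alternate EOC location named",
    "mass care capacity quantified",
    "mutual aid triggers defined",
]

def evaluate(plan_text: str, rubric: list[str]) -> str:
    """Rate a plan by the fraction of rubric items its text appears to cover."""
    plan = plan_text.lower()
    hits = sum(1 for item in rubric if all(word in plan for word in item.lower().split()[:2]))
    ratio = hits / len(rubric)
    return "green" if ratio >= 0.8 else "yellow" if ratio >= 0.5 else "red"

plan = ("The planning team identified hazards through a hazard analysis, "
        "and plan approval was documented by the EMA director.")
print(evaluate(plan, process_rubric))    # green: every process item is mentioned
print(evaluate(plan, substance_rubric))  # red: the same plan says nothing operational
```

The scoring logic isn't the point. The point is that the color means nothing until you know which list it was scored against.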
"AI exercise design"
This is the one that requires the most scrutiny, because it can mean almost anything.
At one end of the spectrum, "AI exercise design" means a chatbot that generates scenario text, objectives, and discussion questions when you type a prompt — essentially the same thing you'd get from ChatGPT, Claude, or Gemini with a well-written prompt. You read the output, copy it into a Word document, format it yourself, and call it a situation manual. That's AI-assisted content creation. It's useful, but it's not exercise design.
At the other end of the spectrum, it means a structured, guided workflow that integrates your plans, engages your stakeholders, scopes your scenario against real capabilities and hazards, sequences injects, builds facilitation materials, produces HSEEP-aligned documentation, and connects to a delivery and evaluation system. That's an exercise design platform.
When a vendor says "AI exercise design," ask them to show you — start to finish — what happens between "I want to do a tabletop" and "here's the completed situation manual, the MSEL, the facilitation guide, and the AAR." If the answer involves copy-paste, manual formatting, or "you'd handle that part outside the tool," you now know where the product ends and the marketing begins.
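If it helps to picture the two ends of that spectrum: the low end hands you a block of prose, the high end hands you structured artifacts you can export. The field names below are invented, not any platform's actual schema; they're only meant to show what "structured" looks like compared to a chat transcript.

```python
# Sketch of the gap between "generated scenario text" and a structured exercise
# artifact. Field names are invented, not any platform's real schema.
from dataclasses import dataclass, field

# Low end of the spectrum: prose you still have to turn into documents yourself.
chatbot_output = "A severe winter storm impacts the county... (twelve paragraphs of prose)"

# High end of the spectrum: structured pieces that can be exported as a SitMan,
# MSEL, facilitation guide, and AAR without manual reassembly.
@dataclass
class Inject:
    time: str
    description: str
    expected_actions: list[str]

@dataclass
class ExercisePackage:
    scenario: str
    objectives: list[str]
    core_capabilities: list[str]
    injects: list[Inject]
    discussion_questions: list[str]
    facilitation_notes: list[str] = field(default_factory=list)

package = ExercisePackage(
    scenario="Severe winter storm with regional power loss",
    objectives=["Assess EOC activation procedures", "Evaluate mass care coordination"],
    core_capabilities=["Operational Coordination", "Mass Care Services"],
    injects=[Inject("T+30 min", "County EOC loses commercial power", ["Activate backup power"])],
    discussion_questions=["Who has the authority to open warming shelters?"],
)
print(len(package.objectives), "objectives,", len(package.injects), "inject(s)")
```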
"Accurate"
Every vendor says their tool is accurate. It's the easiest claim to make and the hardest to verify — because "accurate" without a standard is meaningless.
Accurate compared to what? Measured how? If the tool generates exercise content, is it grounded in your actual plans and organizational context, or is it producing plausible-sounding text from a general model? If it evaluates your plans, what's the rubric? Can you see it? Can you challenge it?
What's the methodology? What standard are you evaluating against? Show me the rigor. Show me the output — the real output, not the demo version. Show me what I still have to do myself after your tool is done. "Accurate" is not a feature. It's a claim that requires evidence.
"Secure"
Every vendor says their tool is secure too. But "secure" can mean anything from SOC 2 compliance with rigorous third-party audits to "we use HTTPS."
Prove it. Not with a branded phrase. With architecture. With compliance audits. With contract language that matches what you said in the room. With a clear answer on what happens to my data if I leave, what happens to my content in your shared library, and whether your security terminology means something real or just means you checked a box on an API agreement. "Secure" is not a feature either. It's a promise that requires proof.
The Bottom Line
None of these questions are unfair. None of them are trick questions. Any vendor building a real product with real integrity should be able to answer every single one of them — clearly, specifically, and without hesitation. Including us.
If a vendor can't answer them, that tells you something. If they won't, that tells you more.



