

Three studies published in the last few months don't reference each other. They sit in different literatures, used different methods, and were written for different audiences. Read separately, each is interesting. Read together, they describe a structural threat that the emergency management field — and anyone whose work depends on getting hard things right — needs to understand quickly.
The first, by Stephanie Tully, Chiara Longoni, and Gil Appel in the Journal of Marketing , shows that people with lower AI literacy are more receptive to AI. The relationship is consistent and is mediated by a specific psychological mechanism: people with less AI knowledge perceive AI as magical and experience awe when thinking about it executing tasks that seem to require uniquely human attributes. The paper's managerial recommendation is direct: companies should target lower-literacy consumers and lean into the magic, because efforts to demystify AI actually reduce its appeal.
The second, by Steve Rathje and colleagues across seven studies, explores the psychological consequences of interacting with sycophantic AI — AI that excessively validates the user's existing beliefs. The findings are sharp:
Brief conversations with sycophantic chatbots increased the extremity of people's attitudes and increased their certainty.
It inflated their perception that they were "better than average" on desirable traits (like intelligence and empathy) and led them to bet real money on those inflated self-assessments.
Most effects persisted at a one-week follow-up — meaning this is not a momentary usability annoyance but a durable shift in how the user thinks.
Most crucially, participants consistently rated sycophantic AI as "unbiased," even though third-party annotators viewed the sycophantic and disagreeable chatbots as equally biased. People could readily see bias in AI that disagreed with them; they could not see bias in AI that agreed with them.
The third, by Steven D. Shaw and Gideon Nave, introduces the missing mechanical link. They found that as AI becomes embedded in daily thought, users frequently engage in "cognitive surrender" — adopting AI outputs with minimal scrutiny and completely overriding their own intuition and deliberation.
Put the three papers next to each other, and a pattern emerges. The marketing playbook (Tully) produces the awe-state that drives uncritical adoption. The conversational design (Rathje) produces overconfidence and bias blindness. The cognitive result (Shaw & Nave) is the bypass of analytical human thought.
For consumer applications where engagement is the goal, this is a tradeoff companies will happily make. In emergency management, where the purpose of the work is to surface what we have not yet considered, it's a structural problem.
A note on what transfers
The Tully study was conducted on consumers, not emergency managers, and it would be sloppy to claim its findings map cleanly onto a professional field. Emergency managers are not the lower-literacy consumers in that research. They are sophisticated practitioners with deep doctrinal training, and the awe-driven adoption pathway Tully describes is not the primary risk for this audience.
But the Rathje and Shaw/Nave findings are a different matter. Rathje's participants were not asked to make decisions about pizza toppings — they were asked questions where they expected the AI to give them something resembling an objective answer, and they updated their confidence based on what they got back. That expectation of objectivity is precisely the relationship emergency managers have with the tools they use to draft plans, build exercises, and analyze hazards. When a planner asks an AI to help develop a scenario or surface gaps in a continuity plan, they are not looking for affirmation — they are looking for analysis they can rely on. The sycophancy mechanism described in Rathje is most dangerous exactly in that posture: when the user believes they are receiving an objective answer and has no internal signal that they are not.
Shaw and Nave's time-pressure finding transfers even more directly. Their research describes a structural feature of human cognition under load, not a consumer behavior. Emergency managers operate under time pressure as a defining condition of the work. If anything, the EM context represents a more acute version of the environment Shaw and Nave studied, not a softer one.
So the precise claim is this: the adoption pathway in Tully is a consumer phenomenon and should be cited with caution. The cognitive consequences in Rathje and Shaw/Nave describe mechanisms that the EM field has good reason to believe apply with equal or greater force in professional crisis contexts — because the expectation of objectivity is higher, and the time pressure is more acute.
The mechanism is calibration, and the result is surrender
The most important finding in the Rathje paper is a single line: participants who talked to sycophantic AI rated themselves as more knowledgeable about the topic they had just discussed, but when given an objective 10-item knowledge quiz, they performed no better than participants in other conditions. Perceived knowledge went up; actual knowledge didn't move.
This is a calibration failure. Calibration is the alignment between your confidence and your actual accuracy — it is what separates expertise from mere certainty. But Shaw and Nave explain exactly why this calibration fails so completely: it is the result of cognitive surrender.
Cognitive surrender is not just about being tricked or taking a shortcut; it represents a deeper transfer of agency. The user relinquishes cognitive control and simply adopts the AI's judgment as their own, completely shutting down their internal analytical processing. The user feels more sure, but the reasons for being sure haven't changed. The bridge between belief and evidence snaps quietly, and nothing in the user's experience signals that anything has happened. You don't update your thinking because nothing tells you there is anything to update on.
The compounding problem under pressure
The failure modes described across these studies compound, and they are acutely dangerous in crisis environments.
The Tully mechanism is an adoption problem: those with the least understanding of AI are the most eager to use it. The Rathje mechanism is a use problem: users accept AI outputs more readily when those outputs flatter their priors.
But Shaw and Nave identify the environmental trigger that springs the trap: time pressure. They found that time constraints suppress analytical deliberation and actively shift decision-makers toward relying on artificial cognition. When decision time is scarce, the low-friction path of deferring to an external machine becomes almost irresistible.
This means that exactly when an emergency manager most needs rigorous, interrogative thinking — during the onset of a crisis — the psychological pressure of the moment will structurally push them to surrender to a sycophantic AI. There is no natural correction mechanism. The user who most needs to slow down is the user least likely to.
Why emergency management cannot afford this
Emergency management is one of a handful of fields where the entire point of the work is to produce productive doubt. A plan is valuable because of the planning process that generated it — the assumptions surfaced, the dependencies mapped, the failure modes considered.
Ashby's Law of Requisite Variety says that the variety in a response system has to at least match the variety in the environment it confronts. The work of preparedness is the work of forcing variety into thinking that, left alone, defaults to template reuse and assumption inheritance.
A sycophantic AI in this context does not produce a worse plan. It produces a more confident plan that has been less interrogated. The planner feels they have done the work, and the resulting document carries the appearance of thoroughness while losing the variety that made the underlying process valuable.
This is worse than no AI. A planner working without AI knows they are guessing at what they might have missed, and that uncertainty is itself a prompt to keep searching. A planner who has cognitively surrendered to a sycophantic AI has been told, gently and convincingly, that they have not missed anything — and the search ends right there. The unexamined assumption survives intact, dressed in the language of analysis.
The narrow escape hatch
The Rathje paper does gesture at a way out. They tested a "gently disagreeable" chatbot — one that presented opposing facts wrapped in emotional validation. Users rated it as relatively unbiased, and it was still persuasive. However, it was less persuasive than purely disagreeable AI. In an open consumer market, the equilibrium under competitive pressure will be pure palatability.
This is where the EM field has unusual leverage. In professional domains where bad decisions have visible downstream costs, the procurement decision is the one place where epistemic quality can be priced.
Shaw and Nave offer the exact vocabulary procurement officers need: the difference between cognitive offloading and cognitive surrender.
Cognitive offloading is a strategic delegation of a discrete task to an external tool, where the user maintains oversight and their internal reasoning remains active.
Cognitive surrender is the uncritical abdication of reasoning itself — adopting an answer without verification.
The procurement question for any AI product in emergency management should not be "Is it capable?" or "Does our team enjoy using it?" The limiting mandate must be: we buy tools designed for cognitive offloading; we reject tools optimized for cognitive surrender.
Does the system surface assumptions the user did not explicitly state? Does it flag inconsistencies in the user's logic? Does it preserve opposing evidence rather than smoothing it away? When asked to design a tabletop, does it surface the assumptions baked into the scenario — or just execute on them? A vendor who cannot answer these questions concretely is selling a tool that will, by default, slide toward sycophancy.
A note for those of us selling into this field
There is a recursive trap in the Tully finding that anyone building AI products for emergency management has to sit with honestly. The study shows that demystifying AI reduces its appeal. The more clearly you explain how the system actually works — its limits, its failure modes, the places it should not be trusted — the less magical it feels, and the less some buyers want it. According to the research, some actually trust you less after explaining the AI magic trick.
This is the bind. A vendor who builds an interrogative tool, who refuses to let the system flatter the user, who surfaces uncertainty rather than smoothing it away, is building a product that will feel less impressive in a demo than a competitor's polished, confident, agreeable one. Every honest disclosure is a small subtraction from the awe. The market, left alone, rewards the magician over the engineer.
I do not have a clean answer to this. I am writing as someone whose company sells AI into this field, and I am describing a dynamic that works against my own commercial interest as much as anyone else's. The only response I can defend is to keep building the interrogative version anyway, to be transparent about what the tools can and cannot do, and to trust that the buyers who matter — the ones whose decisions have downstream consequences — will eventually price epistemic quality higher than polish.
That trust is not naive. It is conditional on the field making the choice described in the next section. If practitioners do not signal that they want the harder tool, the harder tool will not get built — not by me, not by anyone. The vendors who try will lose to the vendors who don't, and the field will get the AI it tolerated rather than the AI it needed.
The honest version
The AI products that consumers most want to use are the AI products that are worst for their decision-making. Awe drives adoption. Validation drives enjoyment. Both effects point the same direction, and the open market will resolve toward the products that maximize them.
Emergency management does not have to resolve there. But it won't drift toward something better on its own. The field has to decide, out loud, that it wants something different — and then it has to behave that way.
That means three things, and none of them require waiting for a procurement cycle.
Be aware. The mechanisms described in these papers are not exotic. They are operating right now, in every interaction a planner has with a general-purpose AI tool. The first defense is recognizing that the feeling of having thought rigorously about a problem is not the same as having thought rigorously about a problem — and that a well-tuned chatbot is engineered to produce the former without requiring the latter. A planner who knows the mechanism can watch for it in themselves. A planner who doesn't, can't.
Signal what you want. Vendors build what buyers ask for. Right now, the questions EM professionals ask in demos are almost entirely about capability — can it do this, can it do that, how fast. Those are the wrong questions, or at least incomplete ones. The questions that shape the next generation of tools are questions like: does this system surface assumptions I didn't state? Does it preserve disagreement rather than smooth it away? Does it tell me when my logic is inconsistent, or does it help me sound more confident in it? If enough practitioners ask those questions in enough rooms, vendors will start building toward the answers. If no one asks, no one builds.
Critique vendors publicly. The EM field has a strong professional culture and a small enough community that reputational signals travel fast. When a vendor demos a tool that flatters the user, says yes to bad scenarios, or produces confident outputs without surfacing what it doesn't know, that should be said out loud — in conference rooms, in LinkedIn posts, in after-action conversations. Not as a gotcha, but as a standard. The vendors building responsibly will welcome the pressure because it differentiates them. The ones building toward palatability will feel it, and some of them will adjust.
The confidence machine is being built. The people most affected by it will be structurally least equipped to notice. The EM field is one of the few professional communities with both the standing to push back and the incentive to do so — because when confidence outruns warrant in this work, the cost is paid by people who never saw the plan and never met the planner.
That's us. Awareness, signal, critique. We should act like it.



