AI-Generated Phishing: How Attackers Use ChatGPT
For two decades, the dependable advice given to employees about spotting phishing started with the same line: look for typos and bad grammar. That advice is dead. Large language models - ChatGPT, Gemini, Claude, the open-weight models running on attackers' own hardware - have reduced the cost of writing a grammatically perfect, contextually plausible phishing email to roughly zero. The same tools that let your sales team write better emails let attackers do exactly the same thing, at industrial scale, in any language, in the writing style of any target you can find online.
This post is the operator's view of what's actually happening, why it matters, and what changes in defensive training and simulation. It's not alarmist; the threat is real but the response is concrete and within reach.
What LLMs gave attackers
Three things attackers used to lack and now have in abundance:
- Grammatical and stylistic perfection. The historical "tell" of phishing - broken English, awkward phrasing, missing articles - is gone for any attacker willing to paste their lure through a free LLM. The gap that used to separate cheap commodity phishing from polished spear-phishing has narrowed to almost nothing.
- Polymorphic content at zero marginal cost. A single lure can be regenerated in a thousand variations, each different enough to evade content-based spam filtering and template-matching defenses. Reuse penalties that used to slow campaigns down don't apply anymore.
- Persona impersonation from public material. Feed an LLM a CEO's last earnings call transcript, three LinkedIn posts and a press release, and it outputs an email in their voice, on a topic they actually care about, plausible enough that even close colleagues read past the obvious checks.
None of that requires an exotic threat actor. The barrier to entry is a free account on any major LLM service or a copy of an open-weight model running on a laptop. Public reporting from CISA, the Anti-Phishing Working Group (APWG), the Verizon DBIR, the FBI Internet Crime Complaint Center (IC3), and ongoing coverage at Krebs on Security all document the routine adoption of LLM workflows by both organized criminal groups and lone-wolf attackers.
The new shape of a phishing email
What does an LLM-assisted phishing email actually look like in 2026?
- The grammar is correct. The tone is appropriate to the recipient's industry and seniority.
- The pretext references real public events: a recent acquisition, a product launch, an industry conference the recipient attended last month.
- Salutations match the recipient's preferred form of address. The signature block matches the impersonated sender's style - ranks, abbreviations, even punctuation habits.
- Where a human attacker would have made one version of the email, the LLM produced 200, each slightly different, none matching a static IOC the email gateway has seen before.
- The call-to-action remains the same - a link, a callback, an attachment, a wire transfer instruction - but the social-engineering wrapper is much harder to spot.
The lure categories haven't changed; see our 2026 examples roundup for the categories that still dominate. What changed is the polish. The Microsoft 365 password-expiry email of 2026 reads like an internal memo, not a Russian-to-English translation.
What still works for users
The recognition advice has to evolve away from grammar pattern-matching toward structural and behavioral cues. The cues that still work:
- Sender domain inspection. No amount of LLM polish changes the underlying SMTP infrastructure. A spoofed Microsoft notification still has to come from somewhere, and that somewhere is rarely microsoft.com.
- URL preview before click. Hover-to-inspect, or long-press on mobile. The lookalike domain is still the lookalike domain even when the email body is flawless.
- Out-of-band verification for high-impact requests. Any request involving credentials, money or sensitive data should be confirmed through a separate channel - a known phone number, a hallway conversation, the corporate chat. An attacker can't impersonate two channels at once unless they've compromised both, which materially raises their cost.
- Lure structure recognition. Urgency, scarcity, mismatched action ("click here to verify" for an internal HR matter), unusual request from a known contact. The structure of the manipulation is unchanged; it's the surface polish that moved.
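The first two cues can even be automated as a crude pre-screen. A minimal sketch in Python, stdlib only; the TRUSTED_DOMAINS allowlist and the 0.8 similarity threshold are illustrative assumptions, not a recommendation:

```python
from difflib import SequenceMatcher
from email.utils import parseaddr

# Hypothetical allowlist of domains the organization actually trusts.
TRUSTED_DOMAINS = {"microsoft.com", "example-corp.com"}

def sender_domain(from_header: str) -> str:
    """Extract the domain from a From: header value."""
    _, addr = parseaddr(from_header)
    return addr.rpartition("@")[2].lower()

def closest_trusted(domain: str, trusted) -> tuple:
    """Return the most similar trusted domain and a 0-1 similarity score."""
    best = max(trusted, key=lambda t: SequenceMatcher(None, domain, t).ratio())
    return best, SequenceMatcher(None, domain, best).ratio()

def check_sender(from_header: str, trusted=TRUSTED_DOMAINS, threshold=0.8) -> str:
    """Flag senders whose domain is close to, but not exactly, a trusted domain."""
    domain = sender_domain(from_header)
    if domain in trusted:
        return "trusted"
    best, score = closest_trusted(domain, trusted)
    if score >= threshold:
        # e.g. rnicrosoft.com scores high against microsoft.com
        return f"suspicious lookalike of {best}"
    return "unknown"
```

A real gateway would also validate SPF, DKIM and DMARC alignment, which this sketch deliberately omits; the point is that domain-level checks survive LLM polish because they never look at the prose.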
The five-red-flags piece - see phishing email red flags - is the employee-facing version of this. It's deliberately structured to age well as the surface polish improves.
What changes for the simulation program
If your simulation library still consists of templates with quirky grammar errors, it's underprepared for what attackers actually send. Three concrete program changes:
- Add AI-generated templates to the library. Bait & Phish's aiMail capability generates polymorphic variants of base templates so each user receives a unique-but-equivalent lure. This mirrors the polymorphism attackers now have for free.
- Move templates up the difficulty ladder. Templates that scored as "hard" three years ago are "regular" today; what was "regular" is now "easy." The difficulty levels guide explains the calibration. Recalibrate yearly.
- Test executive-impersonation explicitly. The combination of LLM persona-mimicry and BEC pretexts is the highest-loss scenario in 2026. Run hard-tier executive-impersonation campaigns against finance and legal teams as a regular part of the program.
The four ways attackers actually use LLMs in their workflow
It's worth grounding the abstract "AI-generated phishing" phrase in the concrete attacker workflow. Public reporting and intercepted toolkits document four distinct uses:
- Translation and grammar polish. The historical commodity-phishing operation was non-English-native. LLMs eliminate that signal across every target language. A campaign against Brazilian Portuguese speakers reads as fluently as a campaign against U.S. English speakers, with no additional human cost.
- A/B-style content variation. Generate 200 variants of a single base lure, send 25 each to different segments of the target list, measure which performs best, scale up the winner. This is normal marketing practice; attackers have adopted the same loop.
- Persona impersonation from public source material. Feed the model an executive's earnings-call transcript or a series of public posts; ask for an email in their voice on a plausible topic. The output is a spear-phishing draft that close colleagues read past the obvious checks.
- Reply-chain insertion. Hijack a real email thread, then use the LLM to compose a continuation message that matches the prior tone and topic. Often the single highest-yield use of LLMs in attacker workflows because the trust signal is borrowed from the legitimate prior thread.
AI-vs-AI on the defense side
Several email security vendors have shipped LLM-based scoring of inbound mail in 2026 - looking for executive impersonation patterns, manipulative language and out-of-character requests relative to a sender's history. These tools are useful but they're not a replacement for trained users; they're a layer that catches what the user might miss, with the user catching what the model misses. The two together perform measurably better than either alone, which is the consistent finding across NIST guidance and CISA advisories on layered defense.
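To make the layering idea concrete, here is a deliberately crude rule-based scorer - nothing like the trained models vendors actually ship - that shows how independent weak signals combine into a disposition. Every keyword list, weight and threshold below is invented for the example:

```python
import re

# Illustrative signals only; production systems use trained models, not keyword lists.
URGENCY = re.compile(r"\b(urgent|immediately|within 24 hours|account suspended)\b", re.I)
HIGH_IMPACT = re.compile(r"\b(wire transfer|gift card|password|credentials|invoice)\b", re.I)

def score_message(body: str, sender_domain: str, known_sender: bool) -> int:
    """Crude layered score: each independent signal adds risk."""
    score = 0
    if URGENCY.search(body):
        score += 2  # manipulative urgency language
    if HIGH_IMPACT.search(body):
        score += 2  # credential/money call-to-action
    if not known_sender:
        score += 1  # no prior history with this sender
    if sender_domain.count("-") > 1 or any(c.isdigit() for c in sender_domain):
        score += 1  # domains shaped like secure-login-365.com
    return score

def disposition(score: int) -> str:
    """Map the combined score to an action; thresholds are arbitrary here."""
    return "quarantine" if score >= 4 else "flag" if score >= 2 else "deliver"
```

Note that no single signal is decisive: an AI-polished lure defeats the language checks, but not the sender-history and domain checks, which is exactly why the layered approach outperforms any one filter - and why the human layer still matters for what slips through.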
The implication for a security awareness program is that simulation has to test the human layer in conditions that look like the production threat - that is, with AI-polished lures - rather than against the static, broken-grammar baseline of five years ago.
What this looks like in policy
Three policy changes worth pushing through in 2026:
- Out-of-band callback for any wire transfer or credential change. No emailed link is sufficient evidence on its own. Codify it.
- Mandatory simulation coverage for executives. Carve-outs are the highest-risk pattern; cyber insurers are starting to ask about them by name. See our cyber insurer phishing questions guide.
- Annual training-content refresh. Awareness content ages fast; schedule a yearly review that retires dated recognition cues in favor of verification habits and lure-structure recognition.
What changes for the security awareness curriculum
If your training modules still highlight broken English as the leading recognition cue, the curriculum is dated. Three concrete updates:
- Replace grammar-error examples with structural-cue examples. Show employees a clean, well-written phishing email next to a clean, well-written legitimate email and ask them to identify the structural differences (sender domain mismatch, mismatched action, unusual request). The exercise teaches what still works.
- Introduce the verification habit explicitly. Out-of-band callback for any high-impact request is the cheapest reliable defense, and it's not a default behavior in most workforces. Train it, role-play it, simulate it.
- Acknowledge the AI threat directly. Employees who have heard "look for typos" their entire careers need to be told, plainly, that the rule has changed. Don't bury the update; lead with it.
The five red flags piece - see phishing email red flags - was deliberately written to be the structural-cue replacement for the grammar-error training era.
Where Bait & Phish fits
The Bait & Phish platform was built with AI-assisted simulation in mind: polymorphic template generation through aiMail, a simulated phishing library that covers AI-polished lures across all five intent categories, and auto-assigned remediation training that emphasizes verification habits over copy-and-grammar pattern matching. If you want to test how your team holds up against a modern AI-generated lure, start a 25-user free trial and run a hard-difficulty campaign this week. For full-organization rollouts, pricing is on the pricing page; for help mapping the program to a specific compliance requirement, contact us. The threat is real, the response is concrete - the gap is mostly between programs that have updated and programs that haven't. About us covers our methodology in more depth.
See also: Phishing Trends 2026 - annual roundup covering AiTM commoditization, AI-generated lure quality, collaboration-tool phishing, ransomware dwell-time compression and other patterns that defined the year.

