Don't Let Your AI Agent Act Without Asking

Key Takeaways

  • Every major AI agent disaster shares one root cause. The AI acted on real systems -- inboxes, customer chats, financial data -- and nobody checked before it did.
  • The incidents are escalating. A deleted inbox. A $60,000 car sold for $1. A court ruling against an airline. Gartner predicts over 40% of agentic AI projects will be canceled by the end of 2027.
  • "Fully autonomous" is a vendor pitch, not a user feature. When an AI agent has write access to your tools, the question isn't how fast it acts. It's what happens when it's wrong.
  • Review-first is the architecture that eliminates this entire category of failure. The AI proposes, you approve, then it acts. Seconds to check vs. hours to undo.
  • Viktor uses review-first by default. Every draft email, every CRM update, every file it creates -- you see it before anyone else does.

Summer Yue, Meta's director of AI alignment, gave her OpenClaw agent one job: go through her overstuffed email inbox and suggest what to delete or archive.

The agent started deleting everything.

Not spam. Not promotions. Everything. The sheer volume of her real inbox triggered context compaction -- the agent's running memory compressed mid-session, causing it to skip her stop commands and revert to an earlier "clean up everything" behavior. She later called it a "speed run" through her entire inbox.

"I had to RUN to my Mac mini like I was defusing a bomb," she wrote on X.

When someone on X asked whether she'd been intentionally testing its guardrails, she replied: "Rookie mistake tbh."

This is a director of AI alignment at one of the largest AI labs in the world. She'd tested the agent on a smaller inbox first. It had been working fine for weeks. She trusted it. She let it loose on the real thing.

If it can happen to her, it will happen to your ops lead. Your marketing manager. Your founder who just connected their CRM.

The pattern hiding in every AI failure story

The biggest AI agent risks aren't intelligence failures -- they're architecture failures. Every major incident shares the same root cause: the AI had write access to real systems and nobody checked before it acted. The OpenClaw inbox incident is the most recent, but it's not the first:

| Incident | What the AI did | Damage | Root cause |
|---|---|---|---|
| OpenClaw inbox (Feb 2026) | Deleted entire email inbox while ignoring stop commands | All email gone; researcher had to run to the machine to kill it | Agent had write access, no approval step before acting |
| Chevrolet chatbot (Nov 2023) | Agreed to sell a $60,000 Tahoe for $1 after prompt injection | 20M+ viral views, brand embarrassment | Chatbot could make commitments with no human check |
| Air Canada chatbot (2024) | Gave wrong refund policy; told a customer to buy a full-price ticket and claim the bereavement discount later | Court ruled the airline liable, forced it to pay the refund | Chatbot stated policy as fact with no verification step |
| DPD delivery chatbot (Jan 2024) | Swore at a customer and called DPD "the worst delivery firm in the world" | 1.3M views; company disabled the chatbot entirely | No review layer between AI output and the customer |
| McDonald's AI drive-thru (2024) | Added bacon to ice cream, got simple orders wrong repeatedly | 3 years of testing with IBM; program ended June 2024 | AI acted on interpreted input without confirmation |

Five different companies. Five different industries. Five different AI systems. One pattern: the AI had the ability to act, and nobody checked before it did.

Not one of these incidents involved a model that was too dumb. The models understood language fine. They failed because they had write access -- to inboxes, to customer-facing chat, to ordering systems -- and the architecture didn't include a step where a human looked at what was about to happen.

"Fully autonomous" is the wrong goal

Full autonomy is a vendor pitch, not a user feature. When an AI agent has write access to your tools, the question isn't how fast it acts -- it's what happens when it's wrong. Most agentic AI companies skip this question entirely. Agentic AI companies that lead with safety instead of speed are rare -- but that's exactly where the category needs to go.

The numbers tell a different story.

Gartner studied thousands of vendors marketing agentic AI capabilities in 2025. They found that only about 130 of those thousands were actual agents. The rest were chatbots with better marketing -- what Gartner calls "agent washing."

Of the real ones, Gartner predicts over 40% of agentic AI projects will be canceled by the end of 2027 due to rising costs, unclear value, and poor risk controls.

Anushree Verma, Senior Director Analyst at Gartner, put it bluntly: "Most agentic AI projects right now are early stage experiments or proof of concepts that are mostly driven by hype and are often misapplied."

Here's what that means for you as a founder or operator: the AI agent that promises to do everything without asking is also the one most likely to do the wrong thing without asking. Full autonomy sounds like a feature. In practice, it's the root cause of every incident in the table above.

What review-first actually means

Review-first is simple: the AI proposes an action, you see exactly what it's about to do, and then you approve or reject it.

Not "human in the loop" in the vague, academic sense. Not a toggle buried in settings. The default behavior.

Here's how this works in Viktor:

Say you ask Viktor to draft a follow-up email to a prospect who went quiet after a demo.

@Viktor Draft a follow-up email to Sarah Chen at Meridian. She did a demo last Tuesday, said she needed to check with her CTO. Keep it short, reference the HubSpot integration she asked about.

Viktor pulls the context from your CRM -- the demo notes, the deal stage, the specific integration Sarah asked about. It drafts the email. Then it shows you exactly what it wrote and who it's sending to.

You read it. You tweak one line. You hit approve. The email goes out.

Total time added by the review step: maybe 8 seconds. Damage prevented by the review step: you caught that Viktor pulled the wrong demo date from HubSpot, or that Sarah's last name was misspelled in the CRM, or that the tone was too aggressive for this account.

Now scale that to everything: CRM updates, Slack messages to clients, Google Ads budget changes, file uploads, calendar invites. Every action Viktor takes goes through this step by default.
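If you think about review-first as an architecture rather than a feature, it reduces to a single gate between "proposed" and "executed." Here is a minimal sketch of that gate -- illustrative only, not Viktor's actual implementation; the tool names, fields, and email address are made up:

```python
from dataclasses import dataclass

@dataclass
class ProposedAction:
    """An action the agent wants to take, described before execution."""
    tool: str      # e.g. "gmail", "hubspot" (hypothetical tool names)
    summary: str   # human-readable description of what will happen
    payload: dict  # the exact parameters that would be sent

def review_first(action: ProposedAction, approve) -> str:
    """Execute only if the human approves; a rejection costs nothing."""
    if approve(action):
        return f"executed: {action.summary}"
    return f"rejected: {action.summary}"

# A draft email is proposed, shown to the human, and only then sent (or not).
draft = ProposedAction(
    tool="gmail",
    summary="Send follow-up to sarah.chen@meridian.example",
    payload={"to": "sarah.chen@meridian.example",
             "subject": "HubSpot integration follow-up"},
)
print(review_first(draft, approve=lambda a: True))
# → executed: Send follow-up to sarah.chen@meridian.example
```

The point of the sketch: the agent never holds a direct "send" capability. It can only produce a ProposedAction, and execution lives on the other side of the approval check.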

When (and how) you remove the guardrails

Review-first doesn't mean you review everything forever.

There's a natural progression. You start by reviewing every action. You notice that Viktor's weekly report is correct 15 weeks in a row. You set up a cron job: "Every Monday at 9 AM, pull data from Stripe, Google Ads, and HubSpot, generate the report, and post it to #growth-metrics." No approval needed for that one anymore. You've built trust through evidence.

The difference is that you choose when to remove the check based on what you've seen, not what a vendor promised. And you can re-enable review for any action at any time.
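One way to picture that earned-trust progression is a policy object where review is the default and auto-approval is an explicit, reversible opt-out per workflow. This is an illustrative sketch with hypothetical workflow names, not Viktor's actual configuration model:

```python
class ReviewPolicy:
    """Review everything by default; auto-approve only workflows you opt out."""

    def __init__(self):
        self.auto_approved: set = set()

    def remove_review(self, workflow: str) -> None:
        # You call this after observed evidence, e.g. 15 correct weekly reports.
        self.auto_approved.add(workflow)

    def re_enable_review(self, workflow: str) -> None:
        # Reversible at any time: the workflow goes back through approval.
        self.auto_approved.discard(workflow)

    def needs_review(self, workflow: str) -> bool:
        return workflow not in self.auto_approved

policy = ReviewPolicy()
policy.remove_review("weekly-growth-report")
print(policy.needs_review("weekly-growth-report"))  # False: trust earned
print(policy.needs_review("pause-ad-sets"))         # True: still reviewed
policy.re_enable_review("weekly-growth-report")
print(policy.needs_review("weekly-growth-report"))  # True again
```

Two design choices carry the philosophy: the default is review (a workflow you never touched stays gated), and trust is granted per workflow, never globally.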

Compare this to how each approach actually plays out across real workflows:

| Workflow | Fully autonomous | Review-first (Viktor) | Manual |
|---|---|---|---|
| Follow-up email to prospect | Sends immediately; wrong tone goes to the client | Viktor drafts, you adjust one line, approve in 8 seconds | You write from scratch, 15 minutes per email |
| Pause underperforming ad sets | Kills all low-ROAS ads, including new tests | Shows you what it'll pause; you save the $12 test | You check Meta Ads Manager, miss it until Monday |
| Update CRM deal after a call | Writes the wrong close date; team plans around it | You spot the date error before it hits HubSpot | You forget to update; pipeline data goes stale |
| Weekly report to #general | Posts with an incorrect revenue number; team panics | You glance at the PDF, fix one metric, approve | Someone spends 4 hours pulling data from 5 tabs |

The tradeoff is not speed. Review-first adds seconds, not hours. The tradeoff is between trust you've earned and trust you've assumed.

The real cost of getting it wrong

AI agent safety isn't theoretical. The incidents above each carry a different kind of cost -- and all of them compound.

Lost data costs recovery time. Yue's inbox contained years of context that no backup fully restores. Brand damage costs trust. Twenty million people watched the Chevrolet incident and now associate that dealership with reckless AI deployment. Legal liability costs money. A Canadian court ruled Air Canada responsible for what its chatbot told a customer, setting a precedent: companies are legally liable for what their AI says and does.

These aren't edge cases anymore. They're a pattern. And the pattern has a fix.

What this looks like when it works

Here's a second example -- different from email, different stakes. Your Meta Ads ROAS dropped overnight and you need to act fast:

@Viktor Pause all Meta Ads ad sets in the "Spring Sale" campaign where ROAS dropped below 1.5x in the last 48 hours. Leave the top 3 performers running. Show me what you're about to pause before you do it.

Viktor pulls performance data from Meta's API, identifies the underperforming ad sets, and shows you a table: ad set name, spend, ROAS, and the action it's about to take. You see that one ad set Viktor flagged is actually a new creative you launched yesterday -- low ROAS but only $12 in spend. You reject that one, approve the rest. The pauses go through. Your budget stops bleeding.

Without the review step, Viktor would have paused the new creative too. With it, you saved a test you wanted to run -- and still killed the waste in under a minute.

That's the entire philosophy: AI does the work, you own the outcome.

How review-first works across 3,000+ tools

Every integration Viktor connects to -- all 3,000+ of them -- goes through the same review step. Not a subset. Not just "high-risk" tools. All of them, by default.

That means the review step you saw with the email draft above works identically when Viktor changes a Google Ads budget, updates a Notion page, creates a Linear ticket, or sends a Slack message to a client channel. The action gets proposed, you see it, you approve or reject. Same pattern whether the tool is a CRM or a payment processor.

This is the same architecture that lets Viktor handle 100,000 tools without breaking. The review layer doesn't slow down as integrations scale -- it's built into how Viktor routes every action, not bolted on after the fact.

Frequently Asked Questions

What is review-first AI? Review-first means the AI proposes every action -- emails, CRM updates, ad changes, file uploads -- and waits for your approval before executing. You see exactly what's about to happen, approve or reject it, and only then does the action go through. It's the default in Viktor, not an optional setting.

What are the biggest AI agent risks? The biggest risks come from architecture, not intelligence. When an AI agent has write access to real systems (inboxes, customer chat, financial data) and acts without a human check, a single mistake can delete data, commit your company to false promises, or create legal liability. Every major AI incident in 2023-2026 shares this root cause.

Can you turn off review-first and let Viktor act autonomously? Yes. You choose when to remove the review step for specific workflows based on evidence -- for example, after Viktor generates the same weekly report accurately for 15 weeks. You can re-enable review at any time. The key is that autonomy is earned through observed performance, not assumed from a sales pitch.

Is Viktor safe to connect to sensitive tools like Stripe or HubSpot? Viktor connects via official OAuth -- it never sees your passwords or API keys. Every action goes through review-first by default, so Viktor can't charge a customer, delete a deal, or send an email without your explicit approval. You control exactly what Viktor can and can't do.

How long does the review step add to each task? Typically 2-10 seconds. You're glancing at a draft or confirming a CRM update, not re-doing the work. The time you spend reviewing is a fraction of the time you'd spend fixing a mistake that went out unchecked.


Viktor is an AI coworker that lives in Slack, connects to 3,000+ integrations, and shows you every action before it happens. Add Viktor to your workspace -- free to start →