What Makes an AI Company 'Agentic'? Three Tests Most Fail.
Key Takeaways
- "Agentic AI" has become the most diluted label in enterprise software. Every chatbot wrapper, PDF summarizer, and search tool calls itself agentic now. The word has lost all signal.
- Three concrete tests separate real agentic AI companies from the noise. Can the AI read AND write to your tools? Does it remember context across sessions? Can it act proactively without a prompt?
- Most products fail at test one. They can pull data from your CRM but can't update a single deal field. Read-only is not agentic.
- Persistent memory is the quiet differentiator. Brief an AI on Monday, come back Wednesday, and reference the conversation. If it asks "can you provide more context?" it's a session-based chatbot with a better label.
- Proactive behavior is the real dividing line. An agentic AI coworker notices a deal went quiet in HubSpot and flags it before you ask. A chatbot waits for you to type.
Your Head of Revenue spent three weeks last quarter evaluating AI tools for the sales team. Every one of them called itself an agentic AI company. She connected each to HubSpot and ran the same test: pull a list of stale deals, then update one.
Most could pull the list. Not one could update the deal. They showed her the data, then told her to go change it herself in HubSpot.
Three weeks of demos and trials. Zero tools that could do what their landing pages described.
Half of "agentic AI" is agent washing
This is the state of the market in 2026. Gartner predicts that 40% of enterprise applications will include task-specific AI agents by the end of this year, up from less than 5% in 2025. But the label has outrun the reality by a wide margin. A PDF summarizer and a tool that manages your Google Ads budget now share the same buzzword on their homepage. CFOs are asking whether productivity gains justify the costs, or whether vendors are guilty of what analysts now call "agent washing" -- applying the agentic label to products that don't meet the criteria.
Deloitte's 2026 Tech Trends research describes the pattern: many agentic AI implementations are failing because organizations layer AI onto processes designed for human workers without rethinking how the work should actually be done. The problem often starts at the buying stage. The marketing said "agentic." The demo looked promising. But the product couldn't actually do the job.
If you're a founder or operator evaluating agentic AI companies for your team, you need a filter that takes 15 minutes, not 15 demos. Here are three tests. Most products that use the agentic label fail at least one.
Test 1: Can it read AND write to your tools?
The first and most revealing test for any agentic AI company is simple: can the product take action inside your tools, or can it only pull data from them?
Most products stop at read-only. They connect to your CRM through an API, pull deal data, and display it in a chat interface. That's a dashboard with a text box. Useful, sure. But not agentic.
The difference shows up the moment you ask the AI to change something.
Read-only: you ask for deals closing this month. The AI pulls from HubSpot and shows a table. Helpful. Now you notice a deal with the wrong close date. You switch to HubSpot, find the record, fix the field yourself.
Read and write: you ask the same question, get the same table, spot the same error. Then you say:
@Viktor Push the Keystone deal close date to April 15 and add a note that their CTO wants a second demo of the API integration.
Viktor drafts the HubSpot update, shows you exactly what will change (close date, note text, deal ID), and waits for your approval. You confirm. HubSpot is updated. You never left Slack.
That second interaction is what agentic means in practice. The AI didn't just retrieve information. It took an action in a real system on your behalf, with your sign-off.
Run this test with any product you're evaluating: connect it to HubSpot, Stripe, Google Ads, Linear, or whatever your team uses daily. Ask it to pull data. Then ask it to change something. If it returns instructions on how to make the change yourself, it failed. Viktor connects to 3,000+ integrations with full read and write access. When you ask it to update a deal, pause an ad set in Meta, or create a ticket in Linear, it drafts the change, shows you a preview, and executes only after you approve.
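If you're curious what that write step looks like under the hood, here's a minimal Python sketch of the Keystone update, assuming a HubSpot private-app token. The deal ID is made up, and this illustrates the API mechanics, not Viktor's internals:

```python
import os
import requests

# Hypothetical deal ID for the Keystone example.
DEAL_ID = "9876543210"
BASE = "https://api.hubapi.com/crm/v3/objects/deals"
HEADERS = {
    "Authorization": f"Bearer {os.environ['HUBSPOT_TOKEN']}",
    "Content-Type": "application/json",
}

# Read: the part nearly every "agentic" product can do.
deal = requests.get(f"{BASE}/{DEAL_ID}", headers=HEADERS).json()
print(deal["properties"].get("closedate"))

# Write: the part most can't. One PATCH updates the live record.
payload = {"properties": {"closedate": "2026-04-15"}}
resp = requests.patch(f"{BASE}/{DEAL_ID}", headers=HEADERS, json=payload)
resp.raise_for_status()
```

Two calls, one capability gap. A product that can only make the first call is a reporting layer, whatever the homepage says.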
Test 2: Does it remember what happened last Tuesday?
Brief your AI tool on Monday. Come back Wednesday and reference that conversation. If it has no idea what you're talking about, it's stateless -- every interaction starts from zero. This is the test that exposes the widest gap between "agentic" marketing and actual product capability.
Here's why persistent memory matters. Say you're coordinating a product launch. On Monday, you tell the AI: "We're launching the Pro plan next Tuesday. Pricing is $49 per month. Landing page is live at /pricing-pro. The waitlist has 2,400 users."
A stateless tool forgets all of that by your next session. On Tuesday morning, you say "draft the launch email for the waitlist." It doesn't know what launch, what waitlist, or what pricing. You re-explain everything. At that point, you're doing the coordination yourself and using the AI as a text editor that can't remember its own conversations.
Viktor retains that context:
@Viktor Remember the Pro plan launch we discussed Monday? Draft the announcement email for the 2,400 waitlist users. Same pricing and landing page we agreed on.
Viktor already knows it's $49 per month, already has the landing page URL, already knows the audience size. It drafts the email with the correct details without you repeating a word.
This compounds over weeks and months. Viktor maintains persistent memory across every conversation. Ask it to prepare your weekly report, and it remembers what format you preferred last time, which metrics you flagged as important three weeks ago, and that you asked it to stop including the vanity metrics your board doesn't track. That accumulated context is what turns a tool you use into a coworker that knows how your team operates.
The test takes two minutes: brief the AI on something specific. Wait 48 hours. Come back and reference it without re-explaining. If it picks up where you left off, it passes. If it asks "can you provide more context?" you have a session-based chatbot in an agentic costume.
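None of this requires exotic engineering, which is what makes its absence so telling. Here's a toy sketch of cross-session memory in Python -- a SQLite store keyed by topic, purely illustrative and nothing like Viktor's actual architecture:

```python
import json
import sqlite3

class Memory:
    """Tiny cross-session memory: facts survive restarts because
    they live on disk, not in a chat context window."""

    def __init__(self, path="memory.db"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS facts (topic TEXT, fact TEXT)"
        )

    def remember(self, topic, **facts):
        self.db.execute(
            "INSERT INTO facts VALUES (?, ?)", (topic, json.dumps(facts))
        )
        self.db.commit()

    def recall(self, topic):
        rows = self.db.execute(
            "SELECT fact FROM facts WHERE topic = ?", (topic,)
        ).fetchall()
        merged = {}
        for (fact,) in rows:
            merged.update(json.loads(fact))
        return merged

# Monday's briefing session:
memory = Memory()
memory.remember("pro-plan-launch", price="$49/mo",
                landing_page="/pricing-pro", waitlist=2400)

# Wednesday, a brand-new session -- the context is still there:
launch = Memory().recall("pro-plan-launch")
print(launch)  # {'price': '$49/mo', 'landing_page': '/pricing-pro', 'waitlist': 2400}
```

A stateless tool is the same code with the table wiped between runs: every Wednesday starts from an empty database.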
Test 3: Can it do something without being asked?
This is the test that separates real agents from pretenders. Almost nobody passes it.
A chatbot waits for input. You type a question, it responds. That's the interaction model for the vast majority of products using the "agentic" label today. You have to remember to ask the right question at the right time. If you forget, nothing happens.
A genuinely agentic coworker notices things and acts on them. Not because you typed a prompt right now, but because it's watching your tools and surfacing patterns that need attention.
Concrete example: you have 180 deals in your HubSpot pipeline. A $40K deal in the Negotiation stage hasn't had any email activity in 12 days. The deal owner is buried closing three other accounts and hasn't noticed. A chatbot does nothing until someone remembers to ask. Viktor posts in your Slack channel on its own: "Heads up: the Keystone deal ($40K, Negotiation stage) has gone quiet. Last email was March 14. Want me to draft a check-in for the deal owner?"
You didn't ask. You didn't open HubSpot. The AI noticed because you set up a standing instruction:
@Viktor Check our HubSpot pipeline every morning at 8 AM. If any deal over $25K hasn't had activity in 10+ days, post a summary in #sales with the deal owner and last touchpoint.
That pattern works across every tool. An AI coworker monitoring your Google Ads campaigns doesn't wait for you to check the dashboard on Monday. When cost-per-click on the "Enterprise Demo" campaign jumps from $4.20 to $5.90 overnight, it messages you: "CPC spiked 40% on your Enterprise Demo campaign since yesterday. Three ad groups are driving the increase. Want me to pull the breakdown?" You weren't watching. You didn't type a prompt. The AI caught it because monitoring is built into how it works.
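Mechanically, both of those standing watches are the same thing: a scheduled job that queries an API, applies a threshold, and posts when a condition trips. A rough Python sketch of the 8 AM HubSpot check, assuming HubSpot's deal search endpoint and a Slack incoming webhook (the notes_last_updated property stands in for last activity, and the webhook URL comes from your Slack config):

```python
import os
import time
import requests

SEARCH_URL = "https://api.hubapi.com/crm/v3/objects/deals/search"
HEADERS = {"Authorization": f"Bearer {os.environ['HUBSPOT_TOKEN']}"}
SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK"]  # incoming webhook for #sales

def stale_big_deals():
    """Deals over $25K with no logged activity in 10+ days."""
    cutoff_ms = int((time.time() - 10 * 86400) * 1000)
    query = {
        "filterGroups": [{"filters": [
            {"propertyName": "amount", "operator": "GT", "value": "25000"},
            {"propertyName": "notes_last_updated", "operator": "LT",
             "value": str(cutoff_ms)},
        ]}],
        "properties": ["dealname", "amount", "hubspot_owner_id",
                       "notes_last_updated"],
    }
    resp = requests.post(SEARCH_URL, headers=HEADERS, json=query)
    resp.raise_for_status()
    return resp.json()["results"]

def post_summary(deals):
    lines = [
        f"- {d['properties']['dealname']} (${d['properties']['amount']}), "
        f"owner {d['properties']['hubspot_owner_id']}, "
        f"last touch {d['properties']['notes_last_updated']}"
        for d in deals
    ]
    requests.post(SLACK_WEBHOOK,
                  json={"text": "Stale deals over $25K:\n" + "\n".join(lines)})

# Schedule for 8 AM daily, e.g. cron: 0 8 * * * python stale_deals.py
if deals := stale_big_deals():
    post_summary(deals)
```

The point of a real agentic product is that you never write or host this job -- you type the instruction in Slack and the vendor's infrastructure runs it. But if a vendor can't do even this much, the proactive claim is empty.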
Here's the test: ask the vendor whether their product can run tasks on a schedule, monitor your tools for specific conditions, and notify you when something changes without you sitting in the chat window. If the answer involves "you would just ask it when you need that," it's reactive software wearing a proactive label.
How these tests look side by side
All three tests compound. An AI that can read and write, remember context, and act proactively operates in a fundamentally different way from one that can't -- and the gap between what vendors promise and what they deliver becomes obvious when you lay the workflows side by side. Here's the same set of tasks, handled by a typical chatbot labeled "agentic" versus an AI coworker that actually passes all three:
| Workflow | Chatbot labeled "agentic" | AI coworker that passes all three tests |
|---|---|---|
| Deal goes stale in HubSpot | Does nothing until you remember to ask "show me stale deals" | Posts in #sales at 8 AM: "$40K Keystone deal quiet for 12 days. Draft a check-in?" |
| Google Ads CPC spikes overnight | You discover it Monday morning in the dashboard | Messages you within hours: "CPC up 40% on Enterprise Demo. Three ad groups driving it." |
| Prep a board deck with Q1 numbers | Pulls metrics only if you specify each one from scratch every time | Remembers your Q4 format, pulls MRR from Stripe and pipeline from HubSpot, includes the metrics each board member tracks |
| Update a deal stage after a call | Shows you the deal record, tells you how to update it in HubSpot | Drafts the CRM update (stage change, notes, next step). You approve in Slack |
| Weekly team report from Stripe + HubSpot | Generates a report when asked, forgets the format by next week | Posts a formatted PDF to #growth-metrics every Monday at 9 AM, no prompt needed |
| New customer signs up in Stripe | You find out whenever someone checks the dashboard | Notifies #revenue in real time: "New signup: Acme Corp, Pro plan, $2,400/yr" |
Everything in the middle column describes most products wearing the "agentic" label today. The right column is what the label should mean.
What an agentic AI company actually builds
You've run the three tests. Now look at what the company that passed them actually had to build.
Real tool access, not a read-only layer. The AI connects to your tools through official authorization, the same way you'd grant access to any SaaS app. It can pull data and push changes: update a HubSpot deal, adjust a Google Ads budget, create a Linear ticket, send an email through Gmail. Every write action goes through a review-first step where you see what will happen before anything fires. That combination of write access plus human review is the foundation (there's a sketch of the pattern after this list). Write access without review is reckless. Read access without write is a reporting tool.
A memory layer that persists across sessions. The AI retains context from previous conversations, learns your preferences, and applies that knowledge to future work. It remembers your board deck format, your preferred metrics, your alert thresholds, and the fact that you stopped including CAC in the weekly report because your CEO finds it misleading at your current stage. This isn't chat history you scroll through. It's working knowledge that shapes how the AI operates every day.
Proactive infrastructure, not just reactive chat. Scheduled tasks, monitoring rules, threshold alerts, and standing instructions that run without a human typing a prompt. This is the engineering investment that separates a real agent company from a chat wrapper: infrastructure that keeps working between conversations, executing multi-step workflows on your behalf.
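The review-first piece deserves a closer look, because it's the pattern that makes write access safe. A minimal Python sketch of the idea -- stage the change, show the diff, execute only on approval. The client.patch call is a stand-in for whatever real API the write targets, not an actual library call:

```python
from dataclasses import dataclass

@dataclass
class ProposedWrite:
    """A staged change: nothing touches the live system until approved."""
    system: str
    record_id: str
    changes: dict
    approved: bool = False

    def preview(self) -> str:
        # What the user sees in Slack before anything fires.
        lines = [f"  {field}: -> {value!r}"
                 for field, value in self.changes.items()]
        return f"[{self.system} {self.record_id}]\n" + "\n".join(lines)

    def approve(self):
        self.approved = True

    def execute(self, client):
        if not self.approved:
            raise PermissionError("Write blocked: no human approval yet.")
        client.patch(self.record_id, self.changes)  # the real API call

# The agent drafts, the human signs off, only then does anything fire.
write = ProposedWrite("hubspot-deal", "9876543210",
                      {"closedate": "2026-04-15", "notes": "CTO wants demo"})
print(write.preview())    # surfaced in Slack for sign-off
write.approve()           # the user's "Approve" click
# write.execute(hubspot)  # now, and only now, the PATCH goes out
```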
Viktor is built on all three. It lives in Slack and Microsoft Teams, connects to 3,000+ integrations with read and write access, maintains persistent memory across every interaction, and runs proactive workflows on schedules and triggers you define.
Frequently Asked Questions
What is an agentic AI company? An agentic AI company builds products where the AI takes real actions inside your business tools, not just answers questions about them. Three capabilities define the category: read and write access to tools like HubSpot, Stripe, and Google Ads through official APIs; persistent memory that carries context across sessions; and proactive behavior including scheduled tasks and threshold alerts that run without waiting for a prompt. Most companies using the "agentic" label today meet one of these criteria at best.
How do I tell if an AI tool is truly agentic or just a chatbot? Run three tests in 15 minutes during your trial. First, connect it to a tool you use daily and ask it to change something, not just read data. If it gives you instructions instead of making the change, it fails. Second, brief it on a project, wait two days, and reference that conversation without re-explaining the details. Third, ask whether it can run a task on a schedule or alert you when a metric changes without you in the chat. Any failure means the "agentic" label is marketing, not architecture.
What does "agent washing" mean? Agent washing is when a company markets its product as agentic AI when it's actually a chatbot, a search tool, or a read-only integration with a new label. The term mirrors "greenwashing," applying a popular category label without meeting the actual criteria. Industry analysts have identified this as a widespread pattern, noting that many products marketed as agents are standard chatbots or basic automation with better copy.
What is the difference between an AI chatbot and an agentic AI coworker? A chatbot responds to questions within a single session. It might search the web, summarize a document, or draft text, but it can't log into your CRM and update a deal, can't remember what you discussed last week, and can't spot a problem before you ask. An agentic AI coworker operates inside your tools with real read and write access, carries persistent memory across sessions, and acts on schedules and triggers you configure. The gap is between answering questions about your business and doing work inside it.
Is Viktor an agentic AI company? Viktor passes all three tests. It connects to 3,000+ integrations with full read and write access through official APIs. It maintains persistent memory across conversations, retaining your preferences, formats, and context over time. It supports proactive workflows: scheduled reports, threshold alerts, and standing instructions that run without a prompt. Every write action goes through review-first by default, so you see exactly what will happen and approve before it executes.
Why are so many agentic AI projects failing? Most projects fail because the tool was never truly agentic. Companies buy products labeled "agentic" that turn out to be chatbots or read-only integrations with better marketing. When the promised results don't materialize, the blame lands on "AI isn't ready" rather than "this product was never agentic in the first place." Deloitte's 2026 Tech Trends research confirms the broader pattern: organizations layer AI onto processes designed for human workers without rethinking how the work should be done. Running the three tests above during a trial catches the mismatch before you invest weeks in evaluation.
Viktor is an agentic AI coworker that lives in Slack, connects to 3,000+ integrations with real read and write access, and does the work -- not just the talking. Add Viktor to your workspace -- free to start →
Social Snippets
LinkedIn #1 (Kris voice):
Every AI tool launched in the last 6 months calls itself "agentic."
Here are 3 tests I run in the first 15 minutes of any trial:
- Connect it to HubSpot. Ask it to update a deal, not just pull data. Most can't.
- Brief it on a project. Come back 48 hours later. Reference the conversation without explaining again. If it says "can you provide more context?" it's a chatbot.
- Ask if it can flag a stale deal or alert you when ad spend spikes on its own. If the answer is "you would just ask when you need that," chatbot.
Most fail test one.
Wrote up the full framework: [link]
LinkedIn #2 (brand voice):
"Agentic AI" is on every landing page in 2026.
The problem: a PDF summarizer and a tool that manages your Google Ads campaigns now share the same label.
3-test framework to tell the real ones from the noise: → Can it write to your tools, or just read? → Does it remember last week's conversation? → Can it notice a problem before you ask about it?
Takes 15 minutes during a free trial. Saves weeks of evaluation.
Full post: [link]
X/Twitter:
"Agentic AI" is on every landing page this year.
3 tests, 15 min total:
- Ask it to UPDATE a HubSpot deal, not just read
- Come back in 48 hrs, reference Monday's convo
- Ask if it can alert you when a metric spikes without you typing
Most fail #1. [link]