    What is an AI agent?

    An AI agent is software that uses a language model to plan and take multi-step actions toward a goal, calling tools (APIs, databases, other systems) along the way. The minimal pattern: a model + a set of tools + a control loop. The model decides which tool to call next based on what it has seen so far.

    Updated · 2026-05-04 · 6 min read

    An agent vs. a chatbot

    Chatbots respond and wait. Agents take action. A chatbot answering "where is my order?" reads from a knowledge base and replies with a generic delivery-time line. An agent handling the same query queries the orders API, checks the shipping system, identifies a delay, drafts a refund offer, posts it to the ticket queue, and emails the customer.

    Same input — fundamentally different system. The agent's output is not just text; it's text plus a sequence of side-effecting actions.

    The minimal pattern

    Strip every popular framework — LangGraph, AutoGPT, CrewAI, Pydantic AI — back to the essentials, and you find three components running in a loop:

    1. A model. Usually a frontier-tier LLM (Claude, GPT, Gemini) for tool-using reasoning.
    2. A set of tools. Typed function definitions the model can invoke. Each tool has a name, description, JSON-schema input, and a runtime that returns a result.
    3. A control loop. Call the model with the conversation history; if the model requests a tool, run it and append the result; loop until the model produces a final answer or hits a termination condition.

    Production agents add evaluation, guardrails, observability, and human-in-the-loop checkpoints around that core loop.
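    The core loop described above can be sketched in a few lines of Python. This is a minimal, provider-agnostic sketch: `call_model`, the `get_order` tool, and the hard-coded replies are hypothetical stand-ins, not any vendor's actual API.

```python
import json

# Hypothetical tool registry: name -> description, JSON-schema input, runtime.
TOOLS = {
    "get_order": {
        "description": "Look up an order by id.",
        "input_schema": {"type": "object", "properties": {"order_id": {"type": "string"}}},
        "run": lambda args: {"order_id": args["order_id"], "status": "delayed"},
    },
}

def call_model(messages, tools):
    """Stand-in for a real LLM call. A real implementation would send
    `messages` plus the tool definitions to a model API and get back
    either a tool request or a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "name": "get_order", "args": {"order_id": "A-123"}}
    return {"type": "final", "text": "Your order is delayed; a refund offer has been drafted."}

def run_agent(user_message, max_steps=10):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):           # termination condition: step cap
        reply = call_model(messages, TOOLS)
        if reply["type"] == "final":     # model produced an answer: stop looping
            return reply["text"]
        tool = TOOLS[reply["name"]]      # model requested a tool: run it,
        result = tool["run"](reply["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})  # append result, loop
    raise RuntimeError("agent hit step limit without finishing")
```

    Everything a production framework adds sits around this loop; the model, tools, and loop itself stay recognizably the same.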

    Real examples

    Customer-support deflection agent

    Reads the ticket, queries orders/shipping/billing APIs, drafts a reply, and either posts directly (low stakes) or queues for human review (refunds, escalations). Typical AISD outcome: 25–40% auto-resolution.

    Document processing pipeline

    Inbound contracts/claims/invoices → OCR → schema-validated extraction → routing. Exceptions go to a human queue. 30–50% reduction in human review time is typical.

    Sales-research agent

    For each lead: gather a recent news angle, identify the buyer persona, draft a personalized opener, write back to the CRM with a confidence score. 2–4× research throughput per SDR.

    When to use an agent (and when not to)

    Use an agent when the task involves multi-step reasoning + side-effecting actions — looking up data, deciding what to do with it, then doing it. Use a simpler pattern (a single LLM call, RAG over a knowledge base, a deterministic workflow) when the task is one-shot.

    Agents add cost (more tokens per task, more failure modes) and complexity (eval harness, observability, prompt-injection defense). They earn that cost when the workflow has high volume and structured outputs — and don't when the workflow is rare or the output is hard to validate.

    Frequently asked

    Common questions.

    • What is an AI agent?

      An AI agent is software that uses a language model to plan and take multi-step actions toward a goal, calling tools (APIs, databases, other systems) along the way. The minimal pattern: a model + a set of tools + a control loop. Unlike a chatbot — which responds and waits — an agent acts, observes the result, and decides what to do next, often across dozens of steps.

    • What's the difference between an AI agent and a chatbot?

      A chatbot turns user input into a response and stops. An agent turns user input into a plan, executes that plan by calling tools, observes the results, and revises until the goal is met or it asks for help. A chatbot answering 'what's my order status' reads from a knowledge base. An agent handling the same query queries the orders API, checks the shipping system, identifies a delay, drafts a refund request, posts it to the ticket queue, and emails the customer.

    • What's the difference between agentic AI and generative AI?

      Generative AI is a capability: producing text, images, code, audio. Agentic AI is an architectural pattern that uses generative AI to drive autonomous, multi-step action with tools. All agentic AI uses generative AI under the hood; not all generative AI is agentic. A summarization endpoint is generative but not agentic. A customer-support agent that reads tickets, looks up orders, and posts replies is both. The agentic pattern is what unlocks measurable business outcomes.

    • How long does it take to build a production AI agent?

      Working prototype: 2 weeks. Production-grade agent (with eval harness, guardrails, observability, and a runbook): 6–10 weeks. The prototype-to-production gap is where most projects fail — the prototype handles the happy path; production has to handle the long tail.

    • Where do AI agents fail in production?

      Four predictable failure modes. Tool errors: an API the agent calls is down or returns unexpected data and the agent doesn't recover gracefully. Prompt injection: user-controlled text reaches the agent and overrides its instructions. Cost spirals: an agent that loops without termination conditions burns inference budget. Distribution shift: input patterns change after launch and the agent's prompts no longer match reality. Mitigations: strict tool-call schemas, prompt-injection test suites in CI, cost caps, and weekly eval re-runs.
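  Two of those mitigations, strict tool-call schemas and cost caps, fit directly into the control loop. A sketch under stated assumptions: the schema checker is a deliberately minimal illustration (a production agent would use a full JSON Schema validator), and token accounting is simplified to a running counter.

```python
def validate_args(args, schema):
    """Minimal schema check: required keys present, values of the right
    JSON type. Rejecting malformed tool calls before execution is the
    first line of defense against tool errors and injected instructions."""
    types = {"string": str, "number": (int, float), "boolean": bool}
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            return False
    return all(
        isinstance(args[k], types[props[k]["type"]])
        for k in args if k in props
    )

class CostCap:
    """Hard budget on tokens spent in one agent run; raising here is what
    turns a would-be cost spiral into a visible, recoverable failure."""
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.spent = 0

    def charge(self, tokens):
        self.spent += tokens
        if self.spent > self.max_tokens:
            raise RuntimeError(f"cost cap exceeded: {self.spent} > {self.max_tokens}")
```

  The loop calls `validate_args` before every tool execution and `charge` after every model call; either check failing ends the run cleanly instead of letting it drift.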

    • How do you evaluate AI agent performance?

      Three layers of measurement. Offline: a golden test set of 50–500 representative inputs scored automatically (model-graded) and by humans on a sample. Run on every PR. Online: per-call metrics — latency, cost, tool-call success rate, schema-validation pass rate, downstream business outcome. Human-in-loop: weekly review of escalated and low-confidence cases, fed back into the test set.
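  The offline layer can start as a small script. A sketch, assuming a golden set of (input, expected) pairs; the `demo_agent` and the exact-substring `scorer` are stand-ins for illustration, where a real harness would call the agent and often use a model-graded rubric.

```python
def evaluate(agent, golden_set, scorer):
    """Run the agent over a golden test set and report pass rate plus
    the failing cases (the cases worth reading on every PR)."""
    results = []
    for case in golden_set:
        output = agent(case["input"])
        results.append({
            "input": case["input"],
            "output": output,
            "passed": scorer(output, case["expected"]),
        })
    passed = sum(r["passed"] for r in results)
    return {
        "pass_rate": passed / len(results),
        "failures": [r for r in results if not r["passed"]],
    }

golden_set = [
    {"input": "where is order A-123?", "expected": "delayed"},
    {"input": "where is order B-456?", "expected": "delivered"},
]

# Trivial stand-in agent and exact-substring scorer for the demo.
demo_agent = lambda text: "delayed" if "A-123" in text else "delivered"
exact = lambda out, expected: expected in out
```

  The weekly human-reviewed escalations feed new entries into `golden_set`, which is how the offline layer keeps tracking the input distribution the agent actually sees.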

    • Should I use n8n, LangGraph, or build from scratch?

      It depends on workflow shape and team. n8n wins when the agent is mostly orchestrating SaaS tools and the control flow is straightforward — deploys faster, easier for non-engineers to maintain. LangGraph wins when the agent has complex branching, multi-agent coordination, or needs tight Python integration with custom code. From scratch wins for simple, high-volume agents where every layer of abstraction is overhead.

    • What's the simplest way to start building an AI agent?

      Pick one workflow with high volume, structured input/output, and a clear success metric. Map the tools the agent will need (each is an integration). Build the eval harness on day 1 — 50–500 representative inputs scored automatically. Pick orchestration based on workload, not novelty (single-loop ReAct or plan-and-execute beats multi-agent for most cases). Ship a prototype in 2 weeks, productionize in 6–10. Read /learn/what-is-agentic-ai for the full primer.

    Ready to build?

    From article to production agent in 6 weeks.

    A 30-minute discovery call gets you to a fixed-price proposal — or an honest 'AISD isn't the right fit' if it isn't.