What is an AI agent?
An AI agent is software that uses a language model to plan and take multi-step actions toward a goal, calling tools (APIs, databases, other systems) along the way. The minimal pattern: a model + a set of tools + a control loop. Unlike a chatbot — which responds and waits — an agent acts, observes the result, and decides what to do next, often across dozens of steps.
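The model + tools + control loop pattern can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `fake_model`, `get_order`, `TOOLS`, and `run_agent` are hypothetical names, and the model is a hard-coded stand-in so the example runs without any API.

```python
from typing import Callable


def get_order(order_id: str) -> dict:
    """Hypothetical tool: a real agent would call an orders API here."""
    return {"order_id": order_id, "status": "delayed"}


# The tool set: plain functions keyed by the name the model uses to call them.
TOOLS: dict[str, Callable[..., dict]] = {"get_order": get_order}


def fake_model(goal: str, history: list[dict]) -> dict:
    """Stand-in for a language model call (assumption: the model returns
    either a tool call or a final answer as a dict)."""
    if not history:
        return {"tool": "get_order", "args": {"order_id": "A123"}}
    last = history[-1]["result"]
    return {"final": f"Order {last['order_id']} is {last['status']}."}


def run_agent(goal: str, model=fake_model, max_steps: int = 10) -> str:
    """Control loop: decide, act, observe, repeat until done or out of steps."""
    history: list[dict] = []
    for _ in range(max_steps):
        decision = model(goal, history)           # decide the next action
        if "final" in decision:
            return decision["final"]              # goal met
        tool = TOOLS[decision["tool"]]
        result = tool(**decision["args"])         # act
        history.append({"call": decision, "result": result})  # observe
    return "escalate: step limit reached"         # ask for help


print(run_agent("What's my order status?"))
```

The `max_steps` limit is the simplest termination condition; without one, a model that never returns a final answer would loop indefinitely.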
What's the difference between an AI agent and a chatbot?
A chatbot turns user input into a response and stops. An agent turns user input into a plan, executes that plan by calling tools, observes the results, and revises until the goal is met or it asks for help. Take the question 'what's my order status'. A chatbot answers it by reading from a knowledge base. An agent handling the same question calls the orders API, checks the shipping system, identifies a delay, drafts a refund request, posts it to the ticket queue, and emails the customer.
What's the difference between agentic AI and generative AI?
Generative AI is a capability: producing text, images, code, audio. Agentic AI is an architectural pattern that uses generative AI to drive autonomous, multi-step action with tools. All agentic AI uses generative AI under the hood; not all generative AI is agentic. A summarization endpoint is generative but not agentic. A customer-support agent that reads tickets, looks up orders, and posts replies is both. The agentic pattern is what connects model output to actions in real systems, which is where business outcomes such as resolved tickets can be measured.
How long does it take to build a production AI agent?
Working prototype: 2 weeks. Production-grade agent (with eval harness, guardrails, observability, and a runbook): 6–10 weeks. The prototype-to-production gap is where most projects fail — the prototype handles the happy path; production has to handle the long tail.
What does it cost to build an AI agent?
A production AI agent at AISD typically costs $40,000–$150,000 depending on complexity. Drivers: number of integrated systems, evaluation rigor required, compliance overhead, and ongoing operational scope. Prototypes alone are cheaper ($10k–$25k) but rarely worth it without a path to production.
Where do AI agents fail in production?
Four predictable failure modes. Tool errors: an API the agent calls is down or returns unexpected data and the agent doesn't recover gracefully. Prompt injection: user-controlled text reaches the agent and overrides its instructions. Cost spirals: an agent that loops without termination conditions burns inference budget. Distribution shift: input patterns change after launch and the agent's prompts no longer match reality. Matching mitigations, in the same order: strict tool-call schemas, prompt-injection test suites in CI, cost caps, and weekly eval re-runs.
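Two of these mitigations, strict tool-call schemas and cost caps, are small enough to sketch directly. The following is one possible shape under stated assumptions: `TOOL_SCHEMAS`, `validate_tool_call`, and `Budget` are hypothetical names, the schema format is a simplified stand-in for something like JSON Schema, and the per-step cost is assumed to be supplied by the caller.

```python
from dataclasses import dataclass

# Hypothetical schemas: tool name -> required argument names and types.
TOOL_SCHEMAS = {
    "get_order": {"order_id": str},
    "issue_refund": {"order_id": str, "amount_cents": int},
}


class ToolCallError(ValueError):
    """Raised when a proposed tool call does not match its schema."""


def validate_tool_call(name: str, args: dict) -> None:
    """Reject unknown tools, missing or extra arguments, and wrong types."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        raise ToolCallError(f"unknown tool: {name}")
    if set(args) != set(schema):
        raise ToolCallError(
            f"{name}: expected args {sorted(schema)}, got {sorted(args)}"
        )
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise ToolCallError(f"{name}.{key}: expected {expected.__name__}")


class BudgetExceeded(RuntimeError):
    """Raised when the agent exceeds its step or cost cap."""


@dataclass
class Budget:
    max_steps: int
    max_cost_usd: float
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record one step and its cost; stop the run once a cap is exceeded."""
        self.steps += 1
        self.cost_usd += cost_usd
        if self.steps > self.max_steps or self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(
                f"stopped after {self.steps} steps, ${self.cost_usd:.2f}"
            )
```

In a control loop, `validate_tool_call` would run before every tool invocation and `budget.charge(...)` after every model call, so a malformed call or a runaway loop fails fast instead of failing silently.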
How do you evaluate AI agent performance?
Three layers of measurement. Offline: a golden test set of 50–500 representative inputs, scored automatically (model-graded) and by humans on a sample, and run on every PR. Online: per-call metrics (latency, cost, tool-call success rate, schema-validation pass rate) plus the downstream business outcome. Human-in-the-loop: weekly review of escalated and low-confidence cases, fed back into the test set.
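The offline layer can be sketched as a small harness that a CI job runs on every PR. This is an illustration only: `GOLDEN_SET`, `score_case`, `run_offline_eval`, and `echo_agent` are hypothetical names, the three cases are made up, and a substring check stands in for the model-graded scoring described above.

```python
from typing import Callable

# Hypothetical golden set: each case pairs an input with a checkable expectation.
GOLDEN_SET = [
    {"input": "Where is order A123?", "must_contain": "A123"},
    {"input": "Cancel order B456", "must_contain": "B456"},
    {"input": "Refund order C789", "must_contain": "refund"},
]


def score_case(output: str, case: dict) -> bool:
    """Simplest possible grader: case-insensitive substring check.
    A model-graded scorer would replace this function."""
    return case["must_contain"].lower() in output.lower()


def run_offline_eval(
    agent: Callable[[str], str], cases: list[dict], threshold: float
) -> dict:
    """Run every case through the agent and compare the pass rate to a threshold."""
    if not cases:
        raise ValueError("golden set is empty")
    failures = []
    for case in cases:
        output = agent(case["input"])
        if not score_case(output, case):
            failures.append(case["input"])
    pass_rate = 1 - len(failures) / len(cases)
    return {
        "pass_rate": pass_rate,
        "failures": failures,
        "passed": pass_rate >= threshold,
    }


def echo_agent(text: str) -> str:
    """Stand-in agent that echoes the request; swap in the real agent here."""
    return f"Handling request: {text}"


report = run_offline_eval(echo_agent, GOLDEN_SET, threshold=0.9)
print(report)
```

A CI job could fail the build whenever `report["passed"]` is false, and cases surfaced by the weekly human review would be appended to the golden set.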
Should I use n8n, LangGraph, or build from scratch?
It depends on workflow shape and team. n8n wins when the agent is mostly orchestrating SaaS tools and the control flow is straightforward — deploys faster, easier for non-engineers to maintain. LangGraph wins when the agent has complex branching, multi-agent coordination, or needs tight Python integration with custom code. From scratch wins for simple, high-volume agents where every layer of abstraction is overhead.