How to choose between workflows and agents, implement each pattern with OpenAI, and glue everything together with LangGraph, LangChain, LlamaIndex, and vector databases like Qdrant/Chroma.
TL;DR (what you’ll get)
- A crisp, workflows vs. agents comparison you can use in design reviews. (Source: Anthropic)
- Anthropic’s taxonomy of common agentic patterns (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer, and fully autonomous agents) with “when to use” guidance. (Source: Anthropic)
- OpenAI-first code (Python) for each flow, with optional LangGraph/LangChain/LlamaIndex wiring. (Source: platform.openai.com)
- Minimal Qdrant and Chroma examples for retrieval.
- A complete end-to-end agentic recipe for a customer-support style app (routing → retrieve → act → review → approve).
1) Workflows vs. Agents: What’s the difference?
Anthropic’s definition is the cleanest I’ve seen:
- Workflows: LLMs + tools orchestrated via predefined code paths (predictable).
- Agents: LLMs dynamically direct their own process and tool usage, looping with feedback until done (flexible). (Source: Anthropic)
Quick comparison:
| Dimension | Workflows | Agents |
|---|---|---|
| Control flow | Fixed / code-driven | Model-driven (plans/tools in a loop) |
| Predictability | High | Lower (but more adaptive) |
| Best for | Well-defined tasks, SLAs | Open-ended tasks, variable steps |
| Cost/latency | Lower, bounded | Higher, variable (looping/tool calls) |
| Guardrails | Straightforward | Essential (limits, checks, sandbox) |
| Typical framework | DAGs / graphs | Graphs with loops + memory / state |
Rule of thumb: start simple; only add agentic complexity when it measurably improves outcomes. (Source: Anthropic)
2) The building block: An augmented LLM
Before choosing a pattern, design an LLM with:
- Retrieval (vector search over your docs/data),
- Tools/actions (APIs to take real-world steps),
- Memory/state (short- or long-term).
Anthropic calls this the augmented LLM; everything else composes on top.
Mermaid diagram – augmented LLM context:
(DIAGRAM HERE)
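To make the composition concrete, here is a minimal structural sketch of an augmented LLM: retrieval, a tool registry, and a toy in-memory history wired around a single model call. All names (`AugmentedLLM`, `retrieve`, `llm_call`) are illustrative, and `llm_call` is a stub you would replace with a real OpenAI call.

```python
# Structural sketch of an "augmented LLM": retrieval + tools + memory
# around one model call. llm_call is a stub; swap in a real API call.
from typing import Callable, Dict, List

class AugmentedLLM:
    def __init__(self, llm_call: Callable[[str], str],
                 retrieve: Callable[[str], List[str]],
                 tools: Dict[str, Callable]):
        self.llm_call = llm_call      # the model itself
        self.retrieve = retrieve      # vector search over docs/data
        self.tools = tools            # named actions the model may request
        self.memory: List[str] = []   # short-term state across turns

    def run(self, user_input: str) -> str:
        context = "\n".join(self.retrieve(user_input))
        history = "\n".join(self.memory[-5:])  # last few turns only
        prompt = f"History:\n{history}\nContext:\n{context}\nUser: {user_input}"
        answer = self.llm_call(prompt)
        self.memory.append(f"User: {user_input}\nAssistant: {answer}")
        return answer

# Stubs for demonstration
agent = AugmentedLLM(
    llm_call=lambda p: f"(answer based on {len(p)} chars of prompt)",
    retrieve=lambda q: ["doc snippet about " + q],
    tools={"noop": lambda: None},
)
print(agent.run("refund policy"))
```

Every pattern below is some arrangement of this unit: workflows fix the order in code; agents let the model decide it.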
3) Anthropic’s patterns (and how to code them)
Below I summarize Anthropic’s workflows and agents (with when-to-use) and show OpenAI-first Python snippets. Patterns are not mutually exclusive. Combine as needed.
3.1 Prompt Chaining (Workflow)
When: Break a task into clear, fixed steps (e.g., outline → draft → QA).
```python
# Prompt chaining with OpenAI (Python)
from openai import OpenAI

client = OpenAI()

def gen_outline(topic):
    return client.responses.create(
        model="gpt-4.1-mini",
        input=f"Create a bullet outline for a blog on: {topic}"
    ).output_text

def improve_outline(outline):
    return client.responses.create(
        model="gpt-4.1-mini",
        input=f"Critique & improve this outline, keep bullets:\n{outline}"
    ).output_text

def write_article(outline):
    return client.responses.create(
        model="gpt-4.1",
        input=f"Write a detailed article strictly following this outline:\n{outline}"
    ).output_text

outline = gen_outline("Agentic AI patterns")
outline = improve_outline(outline)
article = write_article(outline)
print(article)
```
OpenAI’s Responses API and Agents SDK are the recommended interfaces in 2025 for text + tool calling; the SDK gives you nice utilities for tools & tracing.
3.2 Routing (Workflow)
When: Inputs fall into distinct categories handled by different prompts/tools/models (e.g., billing vs. tech support; easy vs. hard routed to small vs. large model).
```python
# Simple router → then task-specific prompts
from openai import OpenAI

client = OpenAI()

CATEGORIES = ["billing", "tech_support", "general"]

def route_query(q):
    r = client.responses.create(
        model="gpt-4.1-mini",
        input=f"Classify into {CATEGORIES}. Only output a single label.\n\nQuery: {q}"
    ).output_text.strip().lower()
    return r if r in CATEGORIES else "general"

def handle(q):
    c = route_query(q)
    prompt = {
        "billing": "You are a billing assistant. Answer precisely.",
        "tech_support": "You are a technical support assistant. Be diagnostic.",
        "general": "Be concise and helpful."
    }[c]
    return client.responses.create(
        model="gpt-4.1-mini",
        input=f"{prompt}\nUser: {q}"
    ).output_text

print(handle("Refund my last invoice, please"))
```
3.3 Parallelization (Workflow)
When: Subtasks are independent (speed via parallel) or you want voting across diverse prompts.
```python
# Parallel sectioning / voting with asyncio
import asyncio
from openai import AsyncOpenAI

aclient = AsyncOpenAI()

async def summarize(section):
    r = await aclient.responses.create(
        model="gpt-4.1-mini",
        input=f"Summarize:\n{section}"
    )
    return r.output_text

async def main(sections):
    results = await asyncio.gather(*[summarize(s) for s in sections])
    # Voting/aggregation:
    final = "\n".join(results)
    return final

summary = asyncio.run(main(["part A...", "part B...", "part C..."]))
print(summary)
```
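The snippet above simply joins results; for the voting variant you need an aggregation step. A minimal sketch, with toy labels standing in for the outputs of differently-phrased model prompts:

```python
# One way to aggregate parallel "voting" runs: majority vote over labels.
from collections import Counter

def majority_vote(votes):
    """Return the most common answer and its share of the votes."""
    counts = Counter(v.strip().lower() for v in votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)

votes = ["SAFE", "safe", "unsafe"]  # e.g., three guardrail prompts
label, share = majority_vote(votes)
print(label, share)  # "safe" wins with 2/3 agreement
```

The share is a cheap confidence signal: route low-agreement cases to a larger model or a human instead of answering.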
3.4 Orchestrator-Workers (Workflow)
When: Complex tasks with unknown subtasks; central LLM plans & delegates to worker LLMs/tools, then synthesizes.
This is where LangGraph shines: it’s a low-level orchestration framework for stateful, long-running agents/workflows—with cycles, branches, persistence, and human-in-the-loop. (https://www.langchain.com/langgraph)
```python
# Orchestrator-workers in LangGraph (minimal sketch)
from typing import TypedDict, List
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

class State(TypedDict):
    task: str
    plan: List[str]
    results: List[str]

def plan_node(state: State):
    # (call OpenAI to create a plan as bullet steps)
    # ... set state["plan"] = [...]
    return state

def worker_node(state: State):
    # (execute one step of the plan: search, code, call a tool, etc.)
    # ... append to state["results"]
    return state

def decide_next(state: State):
    return END if len(state["results"]) >= len(state["plan"]) else "worker"

graph = StateGraph(State)
graph.add_node("plan", plan_node)
graph.add_node("worker", worker_node)
graph.set_entry_point("plan")
graph.add_edge("plan", "worker")
graph.add_conditional_edges("worker", decide_next, {END: END, "worker": "worker"})
app = graph.compile(checkpointer=MemorySaver())

# run (initialize plan/results so decide_next has something to compare)
out = app.invoke({"task": "Implement feature X", "plan": [], "results": []},
                 config={"configurable": {"thread_id": "t1"}})
```
LangChain explicitly recommends building new agents on LangGraph.
3.5 Evaluator-Optimizer (Workflow)
When: Iterative improvement with clear evaluation criteria (e.g., code review, style guide conformance).
````python
# Generator + Critic loop
from openai import OpenAI

client = OpenAI()

def generate(spec):
    return client.responses.create(
        model="gpt-4.1",
        input=f"Write code per spec:\n{spec}"
    ).output_text

def critique(code):
    return client.responses.create(
        model="gpt-4.1-mini",
        input=f"Critique this code; bullet list of fixes:\n```py\n{code}\n```"
    ).output_text

def apply_fixes(code, feedback):
    return client.responses.create(
        model="gpt-4.1",
        input=f"Improve the code using this feedback:\n{feedback}\n\nCode:\n```py\n{code}\n```"
    ).output_text

code = generate("CLI that greets a name")
for _ in range(2):
    fb = critique(code)
    code = apply_fixes(code, fb)
print(code)
````
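The loop above runs a fixed two rounds; with clear evaluation criteria you can instead stop when the evaluator accepts. A minimal sketch of that control flow, with stub `generate`/`critique`/`apply_fixes` callables standing in for the OpenAI calls (the "PASS" convention is an assumption, not an API feature):

```python
# Evaluator-optimizer with an explicit stopping criterion instead of a
# fixed iteration count. The critic returns "PASS" when criteria are met.
def refine(generate, critique, apply_fixes, spec, max_rounds=5):
    artifact = generate(spec)
    for _ in range(max_rounds):          # hard cap keeps the loop bounded
        feedback = critique(artifact)
        if feedback.strip().upper().startswith("PASS"):
            break                        # evaluator is satisfied
        artifact = apply_fixes(artifact, feedback)
    return artifact

# Toy stubs: the critic passes once the artifact mentions error handling.
out = refine(
    generate=lambda spec: f"code for {spec}",
    critique=lambda a: "PASS" if "error handling" in a else "add error handling",
    apply_fixes=lambda a, fb: a + " + error handling",
    spec="CLI greeter",
)
print(out)  # code for CLI greeter + error handling
```

Keeping both the acceptance check and the round cap matters: the check ends the loop early, the cap bounds cost when the critic never accepts.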
3.6 Agents (Autonomous)
When: Open-ended, multi-step tasks where you can’t predefine the path; the model plans, uses tools, and loops—with limits & checkpoints.
```python
# Minimal tool-using agent loop with OpenAI Responses (function tools)
import json
from openai import OpenAI

client = OpenAI()

# The Responses API uses a flat tool schema (name/parameters at the top level)
tools = [{
    "type": "function",
    "name": "lookup_order",
    "description": "Return order status by id",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

def lookup_order(order_id: str):
    # replace with real DB/API call
    return {"order_id": order_id, "status": "shipped"}

messages = [
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "Where is order 12345?"},
]

for _ in range(5):  # safety cap
    resp = client.responses.create(
        model="gpt-4.1",
        input=messages,
        tools=tools,
        tool_choice="auto",
    )
    tool_calls = [o for o in resp.output if o.type == "function_call"]
    if not tool_calls:
        print(resp.output_text)  # final answer
        break
    for call in tool_calls:
        if call.name == "lookup_order":
            result = lookup_order(**json.loads(call.arguments))
            messages.append(call)  # echo the model's call back
            messages.append({
                "type": "function_call_output",
                "call_id": call.call_id,
                "output": json.dumps(result),
            })
```
For a higher-level wrapper, check the OpenAI Agents SDK (function tools, Pydantic validation, built-in tracing).
4) Retrieval: Qdrant & Chroma (vector DBs)
Agentic systems usually need fast, filtered retrieval. Two strong options:
Qdrant (production-grade; filters, payloads, hybrid search)
Insert & search (Python client)
```python
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

client = QdrantClient(host="localhost", port=6333)

if not client.collection_exists("docs"):
    client.create_collection(
        collection_name="docs",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )

points = [
    PointStruct(id=1, vector=[0.1] * 1536, payload={"title": "Policy A", "type": "policy"}),
    PointStruct(id=2, vector=[0.2] * 1536, payload={"title": "Runbook X", "type": "runbook"}),
]
client.upsert(collection_name="docs", points=points)

# Search
hits = client.search(collection_name="docs", query_vector=[0.1] * 1536, limit=3)
print(hits)
```
Qdrant supports payloads, rich filters, hybrid sparse/dense search (e.g., BM42), and async APIs; it integrates smoothly with LlamaIndex.
Chroma (fast local dev; simple API)
Add & query
```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()
collection = client.create_collection(name="docs")

# (for custom embedders, see Chroma embedding functions)
collection.add(ids=["1", "2"],
               documents=["Policy A text ...", "Runbook X text ..."])

res = collection.query(query_texts=["refund policy"], n_results=3)
print(res)
```
Chroma is great for local prototyping and small apps; Qdrant is excellent when you need filters, performance, and scale.
5) Frameworks that help (and when)
- LangGraph: State-machine graphs for long-running, stateful agents; cycles, persistent checkpoints, and human-in-the-loop moderation. Also ships a Platform/Studio for deployment, visualization, and debugging.
- LangChain: Building blocks for prompts, retrievers, tools; now points to LangGraph for agents.
- LlamaIndex: Quick routes to agents, query engines, and vector store integrations (Qdrant, others).
Minimal LangGraph “router → retrieve → answer → reflect”
```python
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, END
from openai import OpenAI

client = OpenAI()

class S(TypedDict):
    question: str
    route: Literal["billing", "tech", "general"]
    context: str
    answer: str

def route_node(s: S):
    r = client.responses.create(
        model="gpt-4.1-mini",
        input=f"Label as one of [billing, tech, general]. Only label.\n\n{s['question']}")
    s["route"] = r.output_text.strip().lower()
    return s

def retrieve_node(s: S):
    # pretend retrieval (plug in Qdrant/Chroma here)
    s["context"] = "doc snippets ..."
    return s

def answer_node(s: S):
    msg = f"Context:\n{s['context']}\n\nQ: {s['question']}\nA:"
    s["answer"] = client.responses.create(model="gpt-4.1", input=msg).output_text
    return s

def reflect_node(s: S):
    s["answer"] = client.responses.create(
        model="gpt-4.1-mini",
        input=f"Critique & improve the answer for correctness and tone.\n\n{s['answer']}"
    ).output_text
    return s

g = StateGraph(S)
g.add_node("route", route_node)
g.add_node("retrieve", retrieve_node)
g.add_node("answer", answer_node)
g.add_node("reflect", reflect_node)
g.set_entry_point("route")
g.add_edge("route", "retrieve")
g.add_edge("retrieve", "answer")
g.add_edge("answer", "reflect")
g.add_edge("reflect", END)
app = g.compile()

print(app.invoke({"question": "How do I get a refund?"})["answer"])
```
6) Anthropic’s classification (at a glance)
Anthropic frames workflows as predictable compositions (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer) and agents as autonomous loops with tools + checkpoints. The “when to use” bullets above are lifted from their guidance; the full write-up is gold for teams deciding where to spend complexity. (Source: Anthropic)
7) End-to-end example: a Support Agent (hybrid)
Goal: Resolve customer queries with routing → retrieval → tool use (refund) → evaluation → optional human approval.
Code (condensed; swap your real search + refund API):
```python
import json
from openai import OpenAI
from qdrant_client import QdrantClient

client = OpenAI()
qdrant = QdrantClient(host="localhost", port=6333)

def retrieve(query: str, k: int = 5):
    # ... embed query; qdrant.search(...); return top docs as text
    return "retrieved docs ..."

def refund(order_id: str):
    # call real system
    return {"order_id": order_id, "status": "refunded"}

# Responses API flat tool schema
tools = [{
    "type": "function",
    "name": "refund",
    "description": "Issue a refund by order ID",
    "parameters": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}]

def route(q):
    r = client.responses.create(
        model="gpt-4.1-mini",
        input=f"Classify as billing, tech, general. Only label.\n\n{q}")
    return r.output_text.strip().lower()

def agent(question):
    bucket = route(question)
    context = retrieve(question)
    messages = [{"role": "user",
                 "content": f"You can answer or call refund(order_id) if needed.\n"
                            f"Context:\n{context}\n\nQ: {question}"}]
    for _ in range(4):  # safety cap
        r = client.responses.create(model="gpt-4.1", input=messages,
                                    tools=tools, tool_choice="auto")
        tool_calls = [o for o in r.output if o.type == "function_call"]
        if not tool_calls:
            # Final answer (optionally send through an evaluator/critic as earlier)
            return r.output_text
        for call in tool_calls:
            if call.name == "refund":
                result = refund(**json.loads(call.arguments))
                messages.append(call)  # feed the tool call + result back
                messages.append({"type": "function_call_output",
                                 "call_id": call.call_id,
                                 "output": json.dumps(result)})

print(agent("Please refund order 12345; it arrived damaged."))
```
This blends routing, retrieval, tool use, and (optionally) an evaluator loop—a pragmatic hybrid most teams ship first.
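The "optional human approval" step can be a thin gate between the model's proposed tool call and its execution. A sketch under assumed names (`RISKY_TOOLS`, `execute_with_approval`, and the injected `approve` callback are all illustrative; in practice the approver might be a CLI prompt, a Slack message, or a ticket queue):

```python
# Human-approval gate: risky tool calls (e.g., refunds) execute only if
# an approver callback signs off; everything else passes straight through.
RISKY_TOOLS = {"refund"}

def execute_with_approval(tool_name, tool_fn, args, approve):
    if tool_name in RISKY_TOOLS and not approve(tool_name, args):
        return {"status": "rejected", "reason": "human approval denied"}
    return tool_fn(**args)

# Toy run with an auto-deny approver:
result = execute_with_approval(
    "refund",
    lambda order_id: {"order_id": order_id, "status": "refunded"},
    {"order_id": "12345"},
    approve=lambda name, args: False,
)
print(result)  # {'status': 'rejected', 'reason': 'human approval denied'}
```

Returning a structured rejection (rather than raising) lets the agent loop relay "a human declined this refund" back to the user instead of crashing.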
8) Operational must-haves (production)
- Guardrails & limits: max iterations, timeouts, spend caps, tool allow-lists, sandboxed actions, and human checkpoints. (Source: Anthropic)
- Observability: traces, token/cost, success metrics. LangGraph Studio/Platform + LangSmith are designed for this. (Source: LangChain Docs)
- Evaluate regularly: task success rate, time-to-resolution, human fallback rate.
- Prefer simpler patterns when they perform similarly—it’s cheaper and easier to debug.
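The iteration, timeout, and spend caps from the first bullet can live in one generic wrapper around the agent's step function. A sketch with assumed names (`guarded_loop`, `BudgetExceeded`); here `step` reports its own token usage, which in practice you would read from the API response's usage field:

```python
# Agent-loop guardrails: iteration cap, wall-clock timeout, and a token
# budget, applied around an arbitrary step() function.
import time

class BudgetExceeded(Exception):
    pass

def guarded_loop(step, max_iters=10, max_seconds=60.0, max_tokens=50_000):
    start, spent = time.monotonic(), 0
    for i in range(max_iters):
        if time.monotonic() - start > max_seconds:
            raise BudgetExceeded("timeout")
        done, tokens = step(i)           # step reports its token usage
        spent += tokens
        if spent > max_tokens:
            raise BudgetExceeded("token budget")
        if done:
            return i + 1                 # iterations used
    raise BudgetExceeded("iteration cap")

# Toy step: finishes on the third iteration, 1000 tokens each.
iters = guarded_loop(lambda i: (i == 2, 1000))
print(iters)  # 3
```

Raising a distinct exception type makes it easy to route budget violations to a fallback path (apologize, escalate to a human) rather than silently returning a half-finished answer.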
9) LlamaIndex agent snippets (alternative stack)
LlamaIndex provides quick starts for agents and vector stores (Qdrant, etc.).
```python
# Tiny LlamaIndex agent (function-agent style; API may vary by version)
import asyncio
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI as LIOpenAI

llm = LIOpenAI(model="gpt-4.1-mini")

def tool_search(query: str) -> str:
    """Search internal docs for a query."""  # docstring becomes the tool description
    # real search here
    return "doc snippet..."

# FunctionAgent wraps plain callables as tools; it runs asynchronously.
agent = FunctionAgent(tools=[tool_search], llm=llm)

async def main():
    resp = await agent.run("Find me our refund policy and summarize it.")
    print(resp)

asyncio.run(main())
```
(Check your installed version’s API; LlamaIndex evolves quickly.)
10) Choosing your pattern (decision cheatsheet)
- Single-shot question answering with docs? → Augmented LLM + retrieval (no agent).
- Predictable steps? → Workflow (prompt chaining or routing).
- Unknown steps / tool calls / retries? → Agent with limits + checkpoints.
- Needs scale and control (retries, reflection, approvals)? → LangGraph graph with loops, memory, and human-in-the-loop.
References & further reading
- Anthropic – Building Effective Agents (definitions, patterns, when-to-use). (Source: Anthropic)
- OpenAI – Agents SDK & Developer quickstart (responses, tools, agents). (Source: platform.openai.com)
- LangGraph – Concepts, reference, and why to use it for modern agents. (Source: langchain-ai.github.io)
- LangChain – Agents overview (and migration note to LangGraph). (Source: python.langchain.com)
- LlamaIndex – Agents and Qdrant integrations. (Source: LlamaIndex)
- Qdrant – Python client examples (create/upsert/search) & docs. (Source: python-client.qdrant.tech)
- Chroma – Getting started & query docs. (Source: Chroma Docs)