How to choose between workflows and agents, implement each pattern with OpenAI, and glue everything together with LangGraph, LangChain, LlamaIndex, and vector databases like Qdrant/Chroma.

TL;DR (what you’ll get)

1) Workflows vs. Agents: What’s the difference?

Anthropic’s definition is the cleanest I’ve seen: workflows are systems where LLMs and tools are orchestrated through predefined code paths, while agents are systems where LLMs dynamically direct their own processes and tool usage.

Quick comparison:

| Dimension | Workflows | Agents |
| --- | --- | --- |
| Control flow | Fixed / code-driven | Model-driven (plans/tools in a loop) |
| Predictability | High | Lower (but more adaptive) |
| Best for | Well-defined tasks, SLAs | Open-ended tasks, variable steps |
| Cost/latency | Lower, bounded | Higher, variable (looping/tool calls) |
| Guardrails | Straightforward | Essential (limits, checks, sandbox) |
| Typical framework | DAGs / graphs | Graphs with loops + memory / state |

Rule of thumb: start simple; only add agentic complexity when it measurably improves outcomes. (Source: Anthropic)

2) The building block: An augmented LLM

Before choosing a pattern, design an augmented LLM: a model equipped with retrieval, tools, and memory.

Mermaid diagram – augmented LLM context:

(DIAGRAM HERE)
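The idea can be sketched as a thin wrapper around any text-in/text-out model. This is a minimal sketch, not a specific API: the `llm`, `retrieve`, and `tools` callables are illustrative stand-ins you would replace with real model, vector-store, and tool calls.

```python
from typing import Callable, Dict, List

class AugmentedLLM:
    """Sketch: wrap a text-in/text-out model with retrieval, tools, and memory."""
    def __init__(self, llm: Callable[[str], str],
                 retrieve: Callable[[str], str],
                 tools: Dict[str, Callable[..., str]]):
        self.llm = llm
        self.retrieve = retrieve
        self.tools = tools
        self.memory: List[str] = []            # naive conversation memory

    def ask(self, question: str) -> str:
        context = self.retrieve(question)       # retrieval augmentation
        history = "\n".join(self.memory[-5:])   # short-term memory window
        prompt = f"Context:\n{context}\n\nHistory:\n{history}\n\nQ: {question}"
        answer = self.llm(prompt)
        self.memory.append(f"Q: {question}\nA: {answer}")
        return answer

# usage with stubbed components
bot = AugmentedLLM(
    llm=lambda p: f"answer based on {len(p)} chars of prompt",
    retrieve=lambda q: "relevant snippet",
    tools={"lookup": lambda oid: f"order {oid}: shipped"},
)
print(bot.ask("Where is my order?"))
```

The point is separation of concerns: the pattern you pick later changes how this wrapper is called, not what it contains.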

3) Anthropic’s patterns (and how to code them)

Below I summarize Anthropic’s workflows and agents (with when-to-use) and show OpenAI-first Python snippets. Patterns are not mutually exclusive. Combine as needed.

3.1 Prompt Chaining (Workflow)

When: Break a task into clear, fixed steps (e.g., outline → draft → QA)

# Prompt chaining with OpenAI (Python)
from openai import OpenAI
client = OpenAI()

def gen_outline(topic):
    return client.responses.create(
        model="gpt-4.1-mini",
        input=f"Create a bullet outline for a blog on: {topic}"
    ).output_text

def improve_outline(outline):
    return client.responses.create(
        model="gpt-4.1-mini",
        input=f"Critique & improve this outline, keep bullets:\n{outline}"
    ).output_text

def write_article(outline):
    return client.responses.create(
        model="gpt-4.1",
        input=f"Write a detailed article strictly following this outline:\n{outline}"
    ).output_text

outline = gen_outline("Agentic AI patterns")
outline = improve_outline(outline)
article = write_article(outline)
print(article)

OpenAI’s Responses API and Agents SDK are the recommended interfaces in 2025 for text + tool calling; the SDK gives you nice utilities for tools & tracing.

3.2 Routing (Workflow)

When: Inputs fall into distinct categories handled by different prompts/tools/models (e.g., billing vs. tech support; easy vs. hard routed to small vs. large model).

# Simple router → then task-specific prompts
from openai import OpenAI
client = OpenAI()

CATEGORIES = ["billing", "tech_support", "general"]

def route_query(q):
    r = client.responses.create(
        model="gpt-4.1-mini",
        input=f"Classify into {CATEGORIES}. Only output a single label.\n\nQuery: {q}"
    ).output_text.strip().lower()
    return r if r in CATEGORIES else "general"

def handle(q):
    c = route_query(q)
    prompt = {
        "billing": "You are a billing assistant. Answer precisely.",
        "tech_support": "You are a technical support assistant. Be diagnostic.",
        "general": "Be concise and helpful."
    }[c]
    return client.responses.create(model="gpt-4.1-mini",
                                   input=f"{prompt}\nUser: {q}").output_text

print(handle("Refund my last invoice, please"))

3.3 Parallelization (Workflow)

When: Subtasks are independent (speed via parallel) or you want voting across diverse prompts.

# Parallel sectioning / voting with asyncio
import asyncio
from openai import AsyncOpenAI
aclient = AsyncOpenAI()

async def summarize(section):
    r = await aclient.responses.create(model="gpt-4.1-mini",
                                       input=f"Summarize:\n{section}")
    return r.output_text

async def main(sections):
    results = await asyncio.gather(*[summarize(s) for s in sections])
    # Voting/aggregation:
    final = "\n".join(results)
    return final

summary = asyncio.run(main(["part A...", "part B...", "part C..."]))
print(summary)
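For the voting variant, instead of concatenating results you pick the most common answer across the parallel samples. A minimal aggregator sketch (the normalization and tie-break rule here are assumptions, not a library feature):

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most common answer; ties break by first occurrence."""
    normalized = [a.strip().lower() for a in answers]
    counts = Counter(normalized)
    winner, _ = max(counts.items(),
                    key=lambda kv: (kv[1], -normalized.index(kv[0])))
    return winner

print(majority_vote(["Yes", "yes", "No"]))  # → yes
```

Voting only helps when the parallel prompts are genuinely diverse; identical prompts at temperature 0 will just agree with themselves.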

3.4 Orchestrator-Workers (Workflow)

When: Complex tasks with unknown subtasks; central LLM plans & delegates to worker LLMs/tools, then synthesizes.

This is where LangGraph shines: it’s a low-level orchestration framework for stateful, long-running agents/workflows—with cycles, branches, persistence, and human-in-the-loop. (https://www.langchain.com/langgraph)

# Orchestrator-workers in LangGraph (minimal sketch)
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver
from typing import TypedDict, List

class State(TypedDict):
    task: str
    plan: List[str]
    results: List[str]

def plan_node(state: State):
    # (call OpenAI to create a plan as bullet steps)
    # ... set state["plan"] = [...]
    return state

def worker_node(state: State):
    # (iterate one step of the plan: search, code, call a tool, etc.)
    # ... append to state["results"]
    return state

def decide_next(state: State):
    return END if len(state["results"]) >= len(state["plan"]) else "worker"

graph = StateGraph(State)
graph.add_node("plan", plan_node)
graph.add_node("worker", worker_node)
graph.set_entry_point("plan")
graph.add_edge("plan", "worker")
graph.add_conditional_edges("worker", decide_next, {END: END, "worker":"worker"})
app = graph.compile(checkpointer=MemorySaver())

# run (seed plan/results so the first nodes read valid state)
out = app.invoke({"task": "Implement feature X", "plan": [], "results": []},
                 config={"configurable": {"thread_id": "t1"}})

LangChain explicitly recommends building new agents on LangGraph.

3.5 Evaluator-Optimizer (Workflow)

When: Iterative improvement with clear evaluation criteria (e.g., code review, style guide conformance).

# Generator + Critic loop
from openai import OpenAI
client = OpenAI()

def generate(spec):
    return client.responses.create(model="gpt-4.1",
                                   input=f"Write code per spec:\n{spec}").output_text

def critique(code):
    return client.responses.create(model="gpt-4.1-mini",
                                   input=f"Critique this code; bullet list of fixes:\n```py\n{code}\n```").output_text

def apply_fixes(code, feedback):
    return client.responses.create(model="gpt-4.1",
                                   input=f"Improve the code using this feedback:\n{feedback}\n\nCode:\n```py\n{code}\n```").output_text

code = generate("CLI that greets a name")
for _ in range(2):
    fb = critique(code)
    code = apply_fixes(code, fb)
print(code)
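In practice you usually want the loop to stop early once the critic is satisfied, rather than always running a fixed number of rounds. A minimal sketch with injectable critic/fixer callables; the `APPROVED` sentinel is a convention you would put in the critic's prompt, not an OpenAI feature:

```python
def refine(draft, critic, fixer, max_rounds=3, approve_token="APPROVED"):
    """Run critique→fix until the critic approves or the round budget runs out."""
    for _ in range(max_rounds):
        feedback = critic(draft)
        if approve_token in feedback:    # critic emits APPROVED when satisfied
            break
        draft = fixer(draft, feedback)
    return draft

# usage with stubbed critic/fixer: approves once the draft mentions "docstring"
result = refine(
    "def greet(n): print(n)",
    critic=lambda d: "APPROVED" if "docstring" in d else "add a docstring",
    fixer=lambda d, fb: d + "  # docstring added",
)
print(result)
```

The round budget doubles as a cost guardrail: without it, a never-satisfied critic loops forever.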

3.6 Agents (Autonomous)

When: Open-ended, multi-step tasks where you can’t predefine the path; the model plans, uses tools, and loops—with limits & checkpoints.

# Minimal tool-using agent loop with OpenAI Responses (function tools)
import json

from openai import OpenAI
client = OpenAI()

# Responses API function tools are flat (no nested "function" key)
tools = [{
  "type": "function",
  "name": "lookup_order",
  "description": "Return order status by id",
  "parameters": {
    "type": "object",
    "properties": {"order_id": {"type": "string"}},
    "required": ["order_id"]
  }
}]

def lookup_order(order_id: str):
    # replace with real DB/API call
    return {"order_id": order_id, "status": "shipped"}

def step(messages):
    return client.responses.create(
        model="gpt-4.1",
        input=messages,
        tools=tools,
        tool_choice="auto"
    )

messages = [{"role": "system", "content": "You are a helpful agent."},
            {"role": "user", "content": "Where is order 12345?"}]

for _ in range(5):  # safety cap
    resp = step(messages)
    calls = [o for o in resp.output if getattr(o, "type", "") == "function_call"]
    if not calls:
        print(resp.output_text)  # final answer
        break
    for call in calls:
        if call.name == "lookup_order":
            result = lookup_order(**json.loads(call.arguments))  # arguments arrive as a JSON string
            messages += [call,  # echo the tool-call item back
                         {"type": "function_call_output",
                          "call_id": call.call_id,
                          "output": json.dumps(result)}]

For a higher-level wrapper, check the OpenAI Agents SDK (function tools, Pydantic validation, built-in tracing).
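The safety cap above generalizes into an explicit run budget that limits both loop iterations and tool calls and fails loudly when exceeded. A minimal sketch; the class and method names are illustrative, not from any SDK:

```python
class RunBudget:
    """Guardrail: cap agent loop iterations and tool calls."""
    def __init__(self, max_steps=10, max_tool_calls=5):
        self.max_steps, self.max_tool_calls = max_steps, max_tool_calls
        self.steps = self.tool_calls = 0

    def step(self):
        self.steps += 1
        if self.steps > self.max_steps:
            raise RuntimeError("agent exceeded step budget")

    def tool_call(self):
        self.tool_calls += 1
        if self.tool_calls > self.max_tool_calls:
            raise RuntimeError("agent exceeded tool-call budget")

# usage: a runaway loop is stopped after max_steps iterations
budget = RunBudget(max_steps=3)
try:
    while True:
        budget.step()     # call at the top of every agent iteration
except RuntimeError as e:
    print(e)
```

Raising (rather than silently breaking) makes budget exhaustion visible in logs and traces, which is where you want runaway agents to surface.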

4) Retrieval: Qdrant & Chroma (vector DBs)

Agentic systems usually need fast, filtered retrieval. Two strong options:

Qdrant (production-grade; filters, payloads, hybrid search)

Insert & search (Python client)

from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

client = QdrantClient(host="localhost", port=6333)

if not client.collection_exists("docs"):
    client.create_collection(
        collection_name="docs",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )

points = [
    PointStruct(id=1, vector=[0.1]*1536, payload={"title":"Policy A", "type":"policy"}),
    PointStruct(id=2, vector=[0.2]*1536, payload={"title":"Runbook X", "type":"runbook"}),
]
client.upsert(collection_name="docs", points=points)

# Search
hits = client.search(collection_name="docs", query_vector=[0.1]*1536, limit=3)
print(hits)

Qdrant supports payloads, rich filters, hybrid sparse/dense search (e.g., BM42), and async APIs; it integrates smoothly with LlamaIndex.

Chroma (fast local dev; simple API)

Add & query

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()
collection = client.create_collection(name="docs")

# (for custom embedders, see Chroma embedding functions)
collection.add(ids=["1","2"],
               documents=["Policy A text ...", "Runbook X text ..."])

res = collection.query(query_texts=["refund policy"],
                       n_results=3)
print(res)

Chroma is great for local prototyping and small apps; Qdrant is excellent when you need filters, performance, and scale.
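Under the hood, both stores rank vectors by similarity. A dependency-free sketch of cosine-similarity top-k search clarifies what `search`/`query` are doing (toy 2-D vectors stand in for real embeddings):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query, corpus, k=2):
    """corpus: {doc_id: vector}; return doc ids ranked by cosine similarity."""
    scored = sorted(corpus.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = {"policy_a": [1.0, 0.0], "runbook_x": [0.0, 1.0], "faq": [0.7, 0.7]}
print(top_k([0.9, 0.1], corpus))  # → ['policy_a', 'faq']
```

Real vector DBs add what this sketch lacks: approximate-nearest-neighbor indexes for scale, payload filters, and persistence.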

5) Frameworks that help (and when)

Minimal LangGraph “router → retrieve → answer → reflect”

from langgraph.graph import StateGraph, END
from typing import TypedDict, Literal
from openai import OpenAI
client = OpenAI()

class S(TypedDict):
    question: str
    route: Literal["billing","tech","general"]
    context: str
    answer: str

def route_node(s: S):
    label = client.responses.create(model="gpt-4.1-mini",
        input=f"Label as one of [billing, tech, general]. Only label.\n\n{s['question']}"
    ).output_text.strip().lower()
    s["route"] = label if label in ("billing", "tech", "general") else "general"  # fall back on unexpected labels
    return s

def retrieve_node(s:S):
    # pretend retrieval (plug in Qdrant/Chroma here)
    s["context"] = "doc snippets ..."
    return s

def answer_node(s:S):
    msg = f"Context:\n{s['context']}\n\nQ: {s['question']}\nA:"
    s["answer"] = client.responses.create(model="gpt-4.1", input=msg).output_text
    return s

def reflect_node(s: S):
    improved = client.responses.create(model="gpt-4.1-mini",
        input=f"Rewrite this answer, improving correctness and tone. Output only the revised answer.\n\n{s['answer']}").output_text
    s["answer"] = improved
    return s

g = StateGraph(S)
g.add_node("route", route_node)
g.add_node("retrieve", retrieve_node)
g.add_node("answer", answer_node)
g.add_node("reflect", reflect_node)
g.set_entry_point("route")
g.add_edge("route","retrieve")
g.add_edge("retrieve","answer")
g.add_edge("answer","reflect")
g.add_edge("reflect", END)
app = g.compile()

print(app.invoke({"question":"How do I get a refund?"})["answer"])

6) Anthropic’s classification (at a glance)

Anthropic frames workflows as predictable compositions (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer) and agents as autonomous loops with tools + checkpoints. The “when to use” bullets above are lifted from their guidance; the full write-up is gold for teams deciding where to spend complexity. (Source: Anthropic)

7) End-to-end example: a Support Agent (hybrid)

Goal: Resolve customer queries with routing → retrieval → tool use (refund) → evaluation → optional human approval.

Code (condensed; swap your real search + refund API):

from openai import OpenAI
from qdrant_client import QdrantClient
client = OpenAI()
qdrant = QdrantClient(host="localhost", port=6333)

def retrieve(query: str, k: int = 5):
    # ... embed query; qdrant.search(...) return top docs as text
    return "retrieved docs ..."

def refund(order_id: str):
    # call real system
    return {"order_id": order_id, "status": "refunded"}

# Responses API function tools are flat (no nested "function" key)
tools = [{
  "type": "function",
  "name": "refund",
  "description": "Issue a refund by order ID",
  "parameters": {"type": "object",
                 "properties": {"order_id": {"type": "string"}},
                 "required": ["order_id"]}
}]

def route(q):
    r = client.responses.create(model="gpt-4.1-mini",
        input=f"Classify as billing, tech, general. Only label.\n\n{q}")
    return r.output_text.strip().lower()

def decide_action(question, context):
    r = client.responses.create(model="gpt-4.1",
        input=f"You can answer or call refund(order_id) if needed.\nContext:\n{context}\n\nQ:{question}",
        tools=tools, tool_choice="auto")
    return r

def agent(question):
    import json  # tool-call arguments arrive as JSON strings
    bucket = route(question)  # routing label (could select prompt/model per bucket)
    context = retrieve(question)
    for _ in range(4):  # safety cap
        r = decide_action(question, context)
        calls = [o for o in r.output if getattr(o, "type", "") == "function_call"]
        if calls and calls[0].name == "refund":
            res = refund(**json.loads(calls[0].arguments))
            # feed tool result back
            question += f"\nTool result: {res}"
            continue
        # Final answer (optionally send through evaluator/critic like earlier)
        return r.output_text

print(agent("Please refund order 12345; it arrived damaged."))

This blends routing, retrieval, tool use, and (optionally) an evaluator loop—a pragmatic hybrid most teams ship first.

8) Operational must-haves (production)
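At minimum, wrap every model and tool call in a bounded retry with exponential backoff. A minimal sketch; the delay values are illustrative and `sleep` is injectable so the logic is testable without waiting:

```python
import time

def with_retries(call, max_attempts=3, base_delay=0.5, sleep=None):
    """Retry `call` with exponential backoff; re-raise after the attempt budget."""
    sleep = sleep or time.sleep
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))   # 0.5s, 1s, 2s, ...

# usage: a flaky call that succeeds on the third try
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(with_retries(flaky, sleep=lambda s: None))  # → ok
```

Pair this with timeouts, structured logging/tracing of every step, and per-run cost caps; retries alone don't make an agent production-safe.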

9) LlamaIndex agent snippets (alternative stack)

LlamaIndex provides quick starts for agents and vector stores (Qdrant, etc.).

# Tiny LlamaIndex agent (function-agent style; API may vary by version)
import asyncio
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI as LIOpenAI

llm = LIOpenAI(model="gpt-4.1-mini")

def tool_search(query: str) -> str:
    """Search internal docs for a query."""
    # real search here
    return "doc snippet..."

# plain callables are wrapped into tools automatically
agent = FunctionAgent(tools=[tool_search], llm=llm)

async def main():
    print(await agent.run("Find me our refund policy and summarize it."))

asyncio.run(main())

(Check your installed version’s API; LlamaIndex evolves quickly.)

10) Choosing your pattern (decision cheatsheet)

  1. Single-shot question answering with docs? → Augmented LLM + retrieval (no agent).
  2. Predictable steps? → Workflow (prompt chaining or routing).
  3. Unknown steps / tool calls / retries? → Agent with limits + checkpoints.
  4. Needs scale and control (retries, reflection, approvals)? → LangGraph graph with loops, memory, and human-in-the-loop.
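The cheatsheet above can be encoded as a tiny triage function for design reviews; the flag names are made up for illustration:

```python
def choose_pattern(predictable_steps: bool,
                   needs_tools_or_retries: bool,
                   needs_scale_and_control: bool) -> str:
    """Map the decision cheatsheet onto a recommendation (most demanding need wins)."""
    if needs_scale_and_control:
        return "LangGraph graph with loops, memory, human-in-the-loop"
    if needs_tools_or_retries:
        return "agent with limits + checkpoints"
    if predictable_steps:
        return "workflow (prompt chaining or routing)"
    return "augmented LLM + retrieval"

print(choose_pattern(False, False, False))  # → augmented LLM + retrieval
```

The branch order encodes the rule of thumb from section 1: reach for the heavier machinery only when a lighter option can't meet the requirement.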

References & further reading
