
Build a Multi-Agent Research Pipeline with CrewAI and Ollama

Assemble a three-agent CrewAI crew backed by a locally running Llama 3.1 model to autonomously produce structured, cited research reports — no OpenAI key required.

Mariana Souza

Senior Editor · Jun 26, 2026 · 8 min read


What You'll Build

A three-agent CrewAI pipeline that takes a research topic and produces a formatted Markdown report with sourced findings. A Researcher gathers facts via web search, an Analyst synthesizes them, and a Writer produces the final document. All inference runs locally through Ollama.

Prerequisites

Python 3.10, 3.11, or 3.12 (3.9 is not supported)
A recent Ollama release installed (0.1.x builds predate Llama 3.1 and may not load it)
At least 8 GB of free RAM; 16 GB is comfortable for llama3.1:8b
macOS or Linux. On Windows, use WSL2.
A virtual environment tool (venv, conda, etc.)

Step 1: Get Ollama Running with Llama 3.1

If Ollama isn't already running as a background service, start it first:

# Only needed on Linux or if you didn't install the macOS .dmg app
ollama serve &

On macOS with the app installed, Ollama starts at login automatically. Then pull the model:

# Downloads ~4.7 GB (Q4 quantization)
ollama pull llama3.1:8b

Confirm it's listening before continuing:

curl http://localhost:11434/api/tags

You should get JSON listing your local models. If you get Connection refused, Ollama isn't running yet.
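
The same check can be scripted so the pipeline fails fast when Ollama is down or the model has not been pulled. The following is a minimal sketch using only the standard library; the helper names (model_names, has_model, fetch_tags) are hypothetical, and it assumes the /api/tags response carries a "models" list whose entries have a "name" field.

```python
from __future__ import annotations

import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # same endpoint as the curl check


def model_names(tags_payload: dict) -> list[str]:
    """Return the model names listed in an /api/tags response body."""
    return [entry.get("name", "") for entry in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """True if `wanted` (e.g. 'llama3.1:8b') appears in the payload."""
    return wanted in model_names(tags_payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict | None:
    """Fetch the tag list, or return None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts;
        # ValueError covers a body that is not valid JSON.
        return None


if __name__ == "__main__":
    payload = fetch_tags()
    if payload is None:
        print("Ollama is not reachable; start it with `ollama serve`.")
    elif has_model(payload, "llama3.1:8b"):
        print("llama3.1:8b is available.")
    else:
        print("Ollama is up, but llama3.1:8b is missing; run `ollama pull llama3.1:8b`.")
```

Keeping the parsing in pure functions means the check can be exercised against a saved payload without a running server.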

Step 2: Install Python Dependencies

python3 -m venv .venv
source .venv/bin/activate

pip install "crewai>=0.28.0" "langchain-ollama>=0.1.0" "langchain-community>=0.2.0" duckduckgo-search

langchain-ollama is the standalone adapter split from langchain-community in LangChain 0.2.x. Prefer it over the older langchain_community.chat_models.ChatOllama import path. CrewAI requires Pydantic v2, so verify with pip show pydantic if you're in a shared environment.
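
If you would rather check the Pydantic version from Python than read pip show output, something along these lines should work. It is a small sketch built on importlib.metadata; the function names are hypothetical and not part of CrewAI or Pydantic.

```python
from __future__ import annotations

from importlib import metadata


def major_version(version: str) -> int:
    """Extract the leading integer from a version string such as '2.7.1'."""
    return int(version.split(".", 1)[0])


def installed_major(package: str) -> int | None:
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_major("pydantic")
    if found is None:
        print("pydantic is not installed in this environment.")
    elif found < 2:
        print(f"pydantic {found}.x found; CrewAI needs v2. Use a fresh virtual environment.")
    else:
        print(f"pydantic {found}.x found; OK.")
```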

Step 3: Configure the LLM and Search Tool

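Create research_pipeline.py. The LLM is a single object that every agent shares. Recent CrewAI releases also ship their own LLM class, which talks to Ollama through LiteLLM; if your installed version rejects the LangChain object, LLM(model="ollama/llama3.1:8b", base_url="http://localhost:11434") imported from crewai is the native equivalent.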

from crewai import Agent, Task, Crew, Process
from langchain_ollama import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun

# base_url defaults to http://localhost:11434
# Only override if Ollama is on a different host or port
llm = ChatOllama(model="llama3.1:8b", temperature=0.1)
search_tool = DuckDuckGoSearchRun()

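A low temperature makes output more deterministic, which suits the researcher and analyst. All three agents share this llm object, so to get less-dry prose from the writer, create a second ChatOllama instance with temperature=0.3 and pass it to the writer only.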

Step 4: Define the Three Agents

researcher = Agent(
    role="Research Specialist",
    goal="Find accurate, recent information on the given topic and collect key facts with sources.",
    backstory=(
        "You are a meticulous researcher with a talent for locating credible sources "
        "and summarizing them without losing detail."
    ),
    llm=llm,
    tools=[search_tool],
    allow_delegation=False,
    verbose=True,
)

analyst = Agent(
    role="Data Analyst",
    goal="Identify patterns, gaps, and key insights from the research findings.",
    backstory=(
        "You specialize in transforming raw research into structured analysis, "
        "separating signal from noise."
    ),
    llm=llm,
    tools=[],
    allow_delegation=False,
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Produce a well-structured, cited research report suitable for a technical audience.",
    backstory=(
        "You write clear, authoritative reports. You cite sources precisely "
        "and never pad content with filler."
    ),
    llm=llm,
    tools=[],
    allow_delegation=False,
    verbose=True,
)

allow_delegation=False prevents agents from spontaneously reassigning work mid-run. In a sequential pipeline, where each task already has a fixed owner, the main benefit is predictability: every task is completed by the agent it was assigned to.

Step 5: Define Tasks with Explicit Context

research_task = Task(
    description=(
        "Search for recent developments, key players, and real-world use cases for: {topic}. "
        "Collect at least five distinct facts and note the source URL for each."
    ),
    expected_output=(
        "A bullet-point list of findings, each with a source URL. "
        "Minimum five items, no speculation."
    ),
    agent=researcher,
)

analysis_task = Task(
    description=(
        "Review the research findings and identify: (1) the three most significant trends, "
        "(2) any contradictions or gaps, and (3) practical implications."
    ),
    expected_output=(
        "A structured analysis in three labelled sections: Trends, Gaps, Implications. "
        "Each section contains 2-3 concise paragraphs."
    ),
    agent=analyst,
    context=[research_task],  # analyst receives researcher's full output
)

writing_task = Task(
    description=(
        "Write a research report on {topic} using the findings and analysis provided. "
        "Include: Executive Summary, Findings, Analysis, and Conclusion. "
        "Cite sources inline."
    ),
    expected_output=(
        "A formatted Markdown report with H2 headings for each section, "
        "inline citations, and a References section at the end."
    ),
    agent=writer,
    context=[research_task, analysis_task],
)

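The context list makes the wiring explicit. In a sequential process CrewAI already passes earlier task output forward by default, but naming the upstream tasks guarantees that the writer receives both the raw findings and the analysis, and it documents the data flow in the code.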
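
To picture what a context list controls, here is a toy model in plain Python. It is not CrewAI's actual prompt assembly; ToyTask and build_prompt are hypothetical names, and the only idea borrowed from the tutorial is that a task's prompt is its own description plus the outputs of the tasks listed in its context.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyTask:
    """Stand-in for a task: a description, upstream tasks, and an output."""
    name: str
    description: str
    context: list[ToyTask] = field(default_factory=list)
    output: str = ""


def build_prompt(task: ToyTask) -> str:
    """Combine a task's description with the outputs of its context tasks."""
    upstream = [f"[{dep.name}]\n{dep.output}" for dep in task.context]
    if not upstream:
        return task.description
    return task.description + "\n\nContext:\n" + "\n\n".join(upstream)


research = ToyTask("research", "Collect facts on the topic.")
analysis = ToyTask("analysis", "Find trends in the findings.", context=[research])
writing = ToyTask("writing", "Write the report.", context=[research, analysis])

# Pretend the first two tasks have already run.
research.output = "- Fact A (https://example.com/a)"
analysis.output = "Trend: A is growing."

# The writer's prompt now carries both upstream outputs.
print(build_prompt(writing))
```

Dropping research from the writer's context in this toy removes the raw findings from its prompt, which is the kind of selection the real context argument is meant to express.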

Step 6: Assemble the Crew and Run

crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

if __name__ == "__main__":
    result = crew.kickoff(inputs={"topic": "post-quantum cryptography standardization"})
    print("\n=== FINAL REPORT ===\n")
    print(result)

kickoff(inputs=...) interpolates {topic} into every task description at runtime. Change the topic string without touching the pipeline definition.
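
One way to change the topic without editing the file is to read it from the command line. The sketch below uses argparse from the standard library; build_inputs and DEFAULT_TOPIC are hypothetical names, and the returned dictionary has the same shape as the inputs argument used above.

```python
from __future__ import annotations

import argparse

DEFAULT_TOPIC = "post-quantum cryptography standardization"


def build_inputs(argv: list[str] | None = None) -> dict[str, str]:
    """Parse an optional topic argument and return a kickoff-style inputs dict."""
    parser = argparse.ArgumentParser(description="Run the research pipeline on a topic.")
    parser.add_argument(
        "topic",
        nargs="?",  # the topic is optional; fall back to the default
        default=DEFAULT_TOPIC,
        help="research topic to substitute for {topic}",
    )
    args = parser.parse_args(argv)
    return {"topic": args.topic.strip()}


# Intended use inside research_pipeline.py (assumes `crew` is defined as above):
#     result = crew.kickoff(inputs=build_inputs())
```

With that in place, python research_pipeline.py "edge AI inference" would run the same crew on a different topic.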

python research_pipeline.py

Expect 5-15 minutes depending on hardware. Each agent's reasoning steps scroll by as it works.

Verify It Works

A successful run prints each agent's inner monologue, then === FINAL REPORT === followed by a Markdown document with H2 sections and a References block. If you see Action: duckduckgo_search lines in the researcher's output, tool use is working correctly.

Quick check: search the report for at least one http URL in the References section. If there are none, the researcher completed without invoking the search tool — see the first troubleshooting item below.
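
The quick check can be automated with a few lines of standard-library Python. This is a rough sketch: reference_urls is a hypothetical helper, it assumes the report uses a Markdown heading named References, and the URL pattern is deliberately simple (trailing punctuation may be captured).

```python
from __future__ import annotations

import re

# Match http(s) URLs up to whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")

# Match a Markdown heading line such as '## References' at any heading level.
REFERENCES_HEADING = re.compile(
    r"^#{1,6}[ \t]*References[ \t]*$", re.MULTILINE | re.IGNORECASE
)


def reference_urls(report: str) -> list[str]:
    """Return URLs found after the last 'References' heading, or [] if none."""
    headings = list(REFERENCES_HEADING.finditer(report))
    if not headings:
        return []
    section = report[headings[-1].end():]
    return URL_PATTERN.findall(section)
```

Passing str(result) from the run to reference_urls and asserting that the list is non-empty would flag a run in which the researcher skipped the search tool.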

Troubleshooting

Agent loops without producing a Final Answer: Local models sometimes fail to emit the Final Answer: token in the ReAct format CrewAI expects. Add max_iter=5 to the offending agent to cap the number of reasoning iterations. If the problem persists, try mistral:7b. Different models handle ReAct prompting differently, and some work significantly better than others out of the box.

Researcher never calls the search tool: The model generated an answer from its weights rather than invoking the tool. Make the task description more directive by adding "You MUST use the search tool for every fact." This is a prompt engineering issue, not a code bug.

DuckDuckGo raises an exception or returns empty results: DuckDuckGo rate-limits aggressive scrapers. If you hit the limit repeatedly during testing, wait a couple of seconds between requests, for example with time.sleep(2). For production workloads, swap in SerperDevTool from crewai_tools (requires a free Serper API key), which is more reliable under repeated load.

Pydantic validation errors on Agent or Task construction: CrewAI requires Pydantic v2. Run pip show pydantic and confirm 2.x. If something in your environment pins v1, create a fresh virtual environment rather than trying to coerce compatibility.
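
For the DuckDuckGo rate-limit case, a fixed pause can be generalized into retries with growing delays. The following is a minimal sketch; call_with_backoff is a hypothetical helper, not part of CrewAI or LangChain, and it retries on any exception, which is broader than you may want in production.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on any exception with exponentially growing pauses."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))  # 2 s, then 4 s, then 8 s, ...
    raise ValueError("attempts must be at least 1")
```

When testing the tool by hand, a call such as call_with_backoff(lambda: search_tool.run("post-quantum cryptography")) would pause and retry instead of failing on the first throttled request. The sleep parameter exists so the delay can be replaced in tests.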

Next Steps

Add memory=True to Crew with a local embeddings model to give agents persistent memory across runs.
Swap Process.sequential for Process.hierarchical and pass manager_llm=llm to let CrewAI dynamically assign tasks.
Explore the crewai-tools package (pip install crewai-tools) for WebsiteSearchTool, PDFSearchTool, and FileWriterTool to write reports directly to disk.
Check memory use and CPU/GPU placement with ollama ps while the pipeline runs, and compare llama3.1:8b against mistral:7b on your hardware for the best speed/quality tradeoff.
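
Before reaching for FileWriterTool, the final report can also be written to disk with the standard library alone. This is a small sketch under stated assumptions: slugify and save_report are hypothetical helpers, the reports directory name is arbitrary, and the report is assumed to be available as a string (for example str(result)).

```python
from __future__ import annotations

import re
from pathlib import Path


def slugify(topic: str) -> str:
    """Turn a topic into a lowercase, hyphen-separated file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return slug or "report"  # fall back if the topic has no usable characters


def save_report(report: str, topic: str, directory: str = "reports") -> Path:
    """Write the Markdown report to <directory>/<slug>.md and return the path."""
    path = Path(directory) / f"{slugify(topic)}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(report, encoding="utf-8")
    return path
```

Calling save_report(str(result), "post-quantum cryptography standardization") after kickoff would leave the report at reports/post-quantum-cryptography-standardization.md.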

