Best of Agent Harnesses

Hand-curated, ranked list of 110 AI agent harnesses — the runtimes that close the loop between a stateless model and the outside world.

110 harnesses · 10 categories · MCP-ready · weekly-rescored

claude mcp add agent-harnesses -- uvx agent-harnesses-mcp

The agent-harness landscape: every project plotted by adoption surface against GitHub stars

Browse all 110

| Project | Tags | Category | Stars | Tier | OSS |
|---|---|---|---|---|---|
| OpenClaw | typescript, multi-agent | Personal agent runtimes | 380k | complex | ✅ |
| superpowers | memory, ide | Coding harness configs and SDKs | 235k | complex | ✅ |
| everything-claude-code | multi-agent | Coding harness configs and SDKs | 219k | complex | ✅ |
| Hermes | memory, python, provider-agnostic | Personal agent runtimes | 199k | slightly complex | ✅ |
| n8n | workflow, local, typescript | Frameworks | 193k | complex | ⚠️ Fair-code |
| AutoGPT | memory, evals, python | Frameworks | 185k | complex | ⚠️ Polyform-SU |
| opencode | mcp, provider-agnostic, cli, tui, typescript | Coding agent products (IDEs, CLIs, full suites) | 177k | slightly complex | ✅ |
| Anthropic Skills | | Coding harness configs and SDKs | 153k | mostly simple | ✅ |
| langflow | low-code, python | Frameworks | 150k | complex | ✅ |
| Dify | low-code, rag, python | Frameworks | 146k | complex | ⚠️ Fair-code |
| langchain | python | Frameworks | 140k | complex | ✅ |
| GStack | typescript | Coding harness configs and SDKs | 112k | slightly complex | ✅ |
| Gemini CLI | mcp, cli, typescript | Coding agent products (IDEs, CLIs, full suites) | 105k | slightly complex | ✅ |
| browser-use | mcp, browser, python | Frameworks | 99.9k | slightly complex | ✅ |
| Codex | sandbox, provider-agnostic, cli | Coding agent products (IDEs, CLIs, full suites) | 92.4k | slightly complex | ✅ |
| claude-mem | memory | Plugins, MCPs, CLI tools | 83.5k | slightly complex | ✅ |
| OpenHands | memory, browser, sandbox, python | Coding agent products (IDEs, CLIs, full suites) | 77.9k | complex | ⚠️ (multi-license) |
| Daytona | sandbox | Libraries and SDKs | 72.4k | slightly complex | ✅ |
| MetaGPT | multi-agent, python | Multi-agent and orchestration | 68.9k | complex | ✅ |
| get-shit-done | cli, python | Coding harness configs and SDKs | 64.4k | mostly simple | ✅ |
| Open Interpreter | cli, python | Coding agent products (IDEs, CLIs, full suites) | 64.1k | mostly simple | ✅ |
| Cline | ide, typescript | Coding agent products (IDEs, CLIs, full suites) | 63.6k | slightly complex | ✅ |
| autogen | multi-agent, python | Multi-agent and orchestration | 59.1k | complex | ✅ CC-BY |
| Mem0 | memory, python | Libraries and SDKs | 59k | slightly complex | ✅ |
| crewAI | python | Multi-agent and orchestration | 54.1k | complex | ✅ |
| Flowise | low-code, typescript | Frameworks | 53.9k | complex | ⚠️ Apache+CLA |
| LiteLLM | provider-agnostic, python | Libraries and SDKs | 51k | mostly simple | ✅ |
| llama-index | rag, python | Frameworks | 50.2k | complex | ✅ |
| goose | mcp, rust | Coding agent products (IDEs, CLIs, full suites) | 50k | slightly complex | ✅ |
| aider | mcp, cli, python | Plugins, MCPs, CLI tools | 46.5k | slightly complex | ✅ |
| agno | memory, evals, python | Frameworks | 40.8k | complex | ✅ |
| awesome-cursorrules | ide | Progressive disclosure harnesses | 40k | super simple | ✅ |
| langgraph | workflow, python | Frameworks | 35.3k | slightly complex | ✅ |
| Khoj | python | Personal agent runtimes | 35.2k | complex | ✅ |
| continue | ide, typescript | Plugins, MCPs, CLI tools | 34.2k | complex | ✅ |
| ChatDev | python | Multi-agent and orchestration | 33.5k | slightly complex | ✅ |
| github-mcp-server | mcp | Plugins, MCPs, CLI tools | 30.9k | slightly complex | ✅ |
| Composio | sandbox, tool-discovery, python, typescript | Libraries and SDKs | 28.9k | complex | ✅ |
| semantic-kernel | python | Frameworks | 28.2k | complex | ✅ |
| smolagents | sandbox, python | Libraries and SDKs | 27.9k | mostly simple | ✅ |
| gpt-researcher | multi-agent, python | Research and task-specific harnesses | 27.8k | complex | ✅ |
| openai-agents-python | python | Multi-agent and orchestration | 27.3k | mostly simple | ✅ |
| crush | memory, cli, tui | Coding agent products (IDEs, CLIs, full suites) | 25.5k | slightly complex | ⚠️ FSL-1.1-MIT |
| mastra | typed, typescript | Frameworks | 25.3k | slightly complex | ⚠️ Elastic-2.0 |
| vercel/ai | provider-agnostic, typescript | Libraries and SDKs | 25k | slightly complex | ✅ |
| deepagents | multi-agent, sandbox, python, typescript | Libraries and SDKs | 24.9k | slightly complex | ✅ |
| Roo Code | mcp, workflow, ide, typescript | Coding agent products (IDEs, CLIs, full suites) | 24.2k | slightly complex | ✅ |
| letta | memory, python | Frameworks | 23.4k | mostly simple | ✅ |
| MCP Python SDK | mcp, python | Plugins, MCPs, CLI tools | 23.4k | mostly simple | ✅ |
| agents.md | typescript | Progressive disclosure harnesses | 22.4k | super simple | ✅ |
| rasa | voice, python | Frameworks | 21.2k | complex | ✅ |
| Google ADK | evals, sandbox, python | Frameworks | 20.2k | complex | ✅ |
| SWE-agent | memory, evals, python | Coding harness configs and SDKs | 19.6k | slightly complex | ✅ |
| Eliza | memory, multi-agent, typescript | Personal agent runtimes | 18.6k | complex | ✅ |
| Agent Zero | memory, multi-agent, browser, sandbox, python | Personal agent runtimes | 18.1k | slightly complex | ❓ |
| pydantic-ai | mcp, typed, provider-agnostic, python | Libraries and SDKs | 17.9k | slightly complex | ✅ |
| Agent Lightning | evals, training, python | Evaluation and benchmarking harnesses | 17.3k | complex | ✅ |
| botpress | low-code, typescript | Frameworks | 14.7k | complex | ✅ |
| OpenHarness (HKUDS) | memory, multi-agent | Personal agent runtimes | 14k | complex | ✅ |
| MCP TypeScript SDK | mcp, typescript | Plugins, MCPs, CLI tools | 12.7k | mostly simple | ✅ |
| E2B | sandbox, python | Libraries and SDKs | 12.7k | slightly complex | ✅ |
| Microsoft Agent Framework | multi-agent, workflow, python | Multi-agent and orchestration | 11.5k | slightly complex | ✅ |
| MCP Inspector | mcp, typescript | Plugins, MCPs, CLI tools | 10.1k | super simple | ✅ |
| PraisonAI | multi-agent, python | Multi-agent and orchestration | 8.2k | mostly simple | ✅ |
| R2R | vision, rag, workflow, python | Frameworks | 7.9k | complex | ✅ |
| agent-squad | multi-agent | Frameworks | 7.7k | slightly complex | ✅ |
| Claude Agent SDK | mcp, memory, python, typescript | Coding harness configs and SDKs | 7.4k | complex | ✅ |
| MCP Registry | mcp | Plugins, MCPs, CLI tools | 6.9k | slightly complex | ✅ |
| strands-agents | mcp, multi-agent, typed, python | Libraries and SDKs | 6.2k | mostly simple | ✅ |
| SWE-bench | evals, sandbox, python | Evaluation and benchmarking harnesses | 5.2k | slightly complex | ✅ |
| Cloudflare Agents | memory, typescript | Libraries and SDKs | 5.1k | slightly complex | ✅ |
| AgentVerse | multi-agent, python | Frameworks | 5.1k | complex | ✅ |
| AgentBench | evals, sandbox, rag, workflow, python | Evaluation and benchmarking harnesses | 3.5k | complex | ✅ |
| Bee Agent Framework | mcp, multi-agent, python, typescript | Frameworks | 3.3k | complex | ✅ |
| openai-agents-js | multi-agent, voice, typescript | Libraries and SDKs | 3.3k | slightly complex | ✅ |
| inspect_ai | evals, sandbox, python | Evaluation and benchmarking harnesses | 2.2k | complex | ✅ |
| AgentStack | | Frameworks | 2.2k | slightly complex | ✅ |
| WebArena | python | Evaluation and benchmarking harnesses | 1.5k | complex | ✅ |
| Docker MCP Gateway | mcp, sandbox, cli | Plugins, MCPs, CLI tools | 1.5k | slightly complex | ✅ |
| AIlice | sandbox, python | Personal agent runtimes | 1.4k | slightly complex | ✅ |
| WebVoyager | evals, vision | Evaluation and benchmarking harnesses | 1.1k | slightly complex | ✅ |
| ARC-AGI-2 | | Evaluation and benchmarking harnesses | 715 | super simple | ✅ |
| SWE-Gym | evals, training, python | Evaluation and benchmarking harnesses | 694 | slightly complex | ✅ |
| swe-smith | training, python | Evaluation and benchmarking harnesses | 681 | slightly complex | ✅ |
| open-harness | mcp, multi-agent, typescript | Libraries and SDKs | 571 | slightly complex | ✅ |
| inspect_evals | evals, sandbox | Evaluation and benchmarking harnesses | 547 | slightly complex | ✅ |
| langgraph-bigtool | tool-discovery, python | Progressive disclosure harnesses | 542 | slightly complex | ✅ |
| RepoMaster | workflow, python | Coding harness configs and SDKs | 529 | slightly complex | ❓ |
| claw-code-agent | mcp, rust, python, typescript | Coding agent products (IDEs, CLIs, full suites) | 515 | slightly complex | ❓ |
| MCP-Zero | tool-discovery | Progressive disclosure harnesses | 489 | complex | ✅ |
| AgentSilex | python | Frameworks | 451 | super simple | ✅ |
| openagents | | Research and task-specific harnesses | 427 | complex | ✅ |
| arc-agi-benchmarking | evals, provider-agnostic, python | Evaluation and benchmarking harnesses | 350 | mostly simple | ✅ |
| AutoHarness | memory, multi-agent, provider-agnostic, python | Coding harness configs and SDKs | 326 | super simple | ✅ |
| AgentRL | training, python | Multi-agent and orchestration | 302 | complex | ✅ |
| SuperAgentX | multi-agent, python | Frameworks | 200 | mostly simple | ✅ |
| ToolGen | tool-discovery, python | Progressive disclosure harnesses | 180 | complex | ❓ |
| VitaBench | | Evaluation and benchmarking harnesses | 145 | complex | ✅ |
| AgencyBench | evals, sandbox, python | Evaluation and benchmarking harnesses | | complex | ✅ |
| spring-ai-tool-search-tool | tool-discovery | Progressive disclosure harnesses | | mostly simple | ✅ |
| letta-evals | memory, python | Evaluation and benchmarking harnesses | | mostly simple | ✅ |
| SUPER | sandbox, python | Evaluation and benchmarking harnesses | | slightly complex | ✅ |
| ToolRAG | mcp, tool-discovery | Progressive disclosure harnesses | | mostly simple | ✅ |
| puppeteer-real-browser-mcp | mcp, browser, typescript | Plugins, MCPs, CLI tools | | mostly simple | ❓ |
| TRAIL | | Evaluation and benchmarking harnesses | | mostly simple | ✅ |
| Community-curated agent lists | | Libraries and SDKs | | super simple | ❓ |
| Better-OpenCodeMCP | mcp, typescript | Plugins, MCPs, CLI tools | | mostly simple | ✅ |
| coderClaw | ide, typescript | Coding agent products (IDEs, CLIs, full suites) | | slightly complex | ❓ |
| pmstack | evals | Coding harness configs and SDKs | | super simple | ✅ |
| agentlog | memory, cli, python | Plugins, MCPs, CLI tools | | super simple | ✅ |

Pick by use case

I want a turnkey coding agent today — opencode, Cline, Codex, Gemini CLI, OpenHands
I want an always-on personal agent that lives in my chat apps — OpenClaw, Hermes, Khoj, Agent Zero, OpenHarness (HKUDS)
I want to extend Claude Code, Codex, or OpenCode with skills and slash commands — Anthropic Skills, everything-claude-code, superpowers, GStack, pmstack
I want to build my own coding harness from scratch — Claude Agent SDK, Google ADK, AutoHarness, SWE-agent, RepoMaster
I want a drop-in memory layer for agents — Mem0, claude-mem, agentlog, agno, letta
I want to plug hundreds to thousands of tools without context bloat — MCP-Zero, ToolGen, ToolRAG, langgraph-bigtool, spring-ai-tool-search-tool
I want multi-agent orchestration — openai-agents-python, crewAI, autogen, Microsoft Agent Framework, PraisonAI
I want a general LLM app framework — langgraph, langchain, llama-index, pydantic-ai, agno
I want low-code / visual workflows — langflow, Flowise, Dify, n8n
I want browser-using agents — browser-use, WebVoyager, puppeteer-real-browser-mcp
I want sandboxed code execution for agent-generated code — E2B, Daytona, smolagents, OpenHands
I want to evaluate or benchmark agents — SWE-bench, AgencyBench, inspect_ai, WebArena, ARC-AGI-2
I want a deep research / autonomous research agent — deepagents, gpt-researcher, openagents
I want a provider-agnostic LLM pipe (not a framework) — LiteLLM, vercel/ai

Decision guides

How to pick a harness — Six questions, in order. Each one eliminates most of the list; by the end you should be choosing between two or three projects, not 110. The...
Agent memory layers: Mem0 vs Letta vs claude-mem — "Add memory to my agent" hides three different products: a **memory API** you call from any agent (Mem0), an **agent runtime** where memory...
Multi-agent orchestration: OpenAI Agents SDK vs CrewAI vs AutoGen vs LangGraph — Four very different answers to "how should multiple agents coordinate?" — and the differences are architectural, not cosmetic. Picking wrong...
OpenClaw vs Hermes: the always-on personal-agent debate — The loudest harness argument of 2026. Both are open-source (MIT), self-hosted, always-on personal agents you talk to from chat apps — and th...
Terminal coding agents: opencode vs Codex vs Gemini CLI vs crush vs goose — The most-asked pick in this list: *"I want a turnkey coding agent in my terminal today."* These five are the open(ish)-source field. The clo...

For agents

This list is published machine-readable so coding and research agents can recommend harnesses directly:

harnesses.json — every project with tier, tags, axes, license, example, and use-case index.
llms.txt — the whole list in one agent-readable file.
harnesses.jsonld — schema.org Dataset + ItemList.
feed.json — JSON Feed of refreshes.
MCP server: uvx agent-harnesses-mcp — pick_harness, search_harnesses, get_harness, comparisons.
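
As a rough illustration of how an agent might consume a file like harnesses.json, the sketch below filters a few records by tag and by maximum tier. The record layout (`name`, `tier`, `tags`) and the helper `find_harnesses` are assumptions made for this example, not the published schema or an API of the MCP server; the three sample rows are copied from the table above.

```python
import json

# Hypothetical records: the field names are assumptions for illustration,
# not the actual harnesses.json schema.
SAMPLE_JSON = """
[
  {"name": "Mem0",  "tier": "slightly complex", "tags": ["memory", "python"]},
  {"name": "letta", "tier": "mostly simple",    "tags": ["memory", "python"]},
  {"name": "E2B",   "tier": "slightly complex", "tags": ["sandbox", "python"]}
]
"""

# Tiers as they appear in the list, ordered from simplest to most complex.
TIER_ORDER = ["super simple", "mostly simple", "slightly complex", "complex"]


def find_harnesses(records, tag, max_tier="complex"):
    """Return names of records carrying `tag` whose tier is at most `max_tier`."""
    limit = TIER_ORDER.index(max_tier)
    return [
        r["name"]
        for r in records
        if tag in r["tags"] and TIER_ORDER.index(r["tier"]) <= limit
    ]


records = json.loads(SAMPLE_JSON)
print(find_harnesses(records, "memory"))                            # ['Mem0', 'letta']
print(find_harnesses(records, "memory", max_tier="mostly simple"))  # ['letta']
```

Against the real file, the same filter would need to be adapted to whatever keys harnesses.json actually exposes.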
