Jun 15, 2026

Goose AI Agent Review 2026: Apache 2.0, Any LLM, and the Best Free Local Coding Agent?

By AICoderScope Team · 11 min read

gooseclineaiderclaude-codelocal-llmreviewollamamcp

TL;DR: Goose is a free, Apache 2.0 coding agent from Block that runs against any LLM — including fully local Ollama models — and orchestrates real work through MCP extensions. It is the most capable open-source agent you can run with zero API cost. The catch: small local models still choke on its heavy tool-calling, so the “$0 forever” pitch only holds if you have the VRAM for a 14B+ model.

	Goose	Cline	Claude Code
Best for	Full-cycle automation, local-first, CLI + Desktop	In-editor agent inside VS Code	Deep agentic runs on Claude
Price / Cost	Free (Apache 2.0); pay tokens or $0 local	Free; bring your own API key	Bundled with Claude Pro $20/Max $100–$200, or API
The catch	Small local models fail its tool calls	Lives only in your editor	Locked to Anthropic models

Honest take: If you want one agent that works in the terminal and a desktop app, runs on whatever model you point it at, and never sends a line of code off your machine — Goose is the one to install first. Reach for Claude Code only when you want maximum reasoning on Anthropic’s models and don’t mind the bill.

What Goose actually is

Goose is an on-machine AI agent built by Block (the company behind Square, Cash App, and TIDAL). It shipped as open source in January 2025 and has grown to roughly 49,000 GitHub stars. As of June 2026 it is no longer a Block-only project: in December 2025 the Linux Foundation launched the Agentic AI Foundation (AAIF), anchored by three donated projects — Anthropic’s Model Context Protocol (MCP), OpenAI’s AGENTS.md, and Block’s goose. Goose formally moved to the AAIF on April 7, 2026, which matters if you care about a tool outliving the company that wrote it: governance is now vendor-neutral.

The thing that separates Goose from a code-completion plugin is scope. It does not just suggest code in a side panel. It runs shell commands, edits files across your repo, executes code, runs your tests, and chains those steps into multi-step tasks. It ships as both a CLI and a Desktop app (macOS, Linux, Windows). The agent’s tool access comes through MCP — Goose was one of the earliest MCP adopters and exposes 70+ documented extensions plus anything in the broader MCP server registry, which crossed 3,000 entries in early 2026.

The current stable release as of this writing is v1.37.0 (June 3, 2026). That release alone tells you the project is moving fast: it added a hooks system for custom agent behavior, a /goal self-evaluation command, a /review local code-analysis command, subagent instructions, PreToolUse denial hooks, and a TUI diff viewer. The two releases before it (v1.36.0 on May 27, v1.35.0 on May 22) were similarly dense.

The part that matters: it runs on local models

This is why Goose belongs in any “local LLM + coding tool” shortlist. It is genuinely model-agnostic. The June 2026 build talks to Anthropic, OpenAI, Google, OpenRouter, Azure, AWS Bedrock, and — the one developers searching for privacy actually want — Ollama for fully local inference. v1.37.0 added even more: xAI SuperGrok, Alibaba Qwen via DashScope, Databricks AI Gateway, and a generic declarative path for any OpenAI-compatible endpoint. Point it at a model, and your code never leaves the box.

Here is the actual install and local setup, tested on June 15, 2026 with Goose CLI v1.37.0 and Ollama 0.22:

# 1. Install the Goose CLI
curl -fsSL https://github.com/block/goose/releases/download/stable/download_cli.sh | bash

# 2. Pull a model that can handle tool calling
ollama pull qwen3-coder:14b

# 3. Configure Goose to use it
goose configure
#   ┌  goose-configure
#   │
#   ◇  What would you like to configure?
#   │  Configure Providers
#   │
#   ◇  Which model provider should we use?
#   │  Ollama
#   │
#   ◇  Provider Ollama requires OLLAMA_HOST, please enter a value
#   │  http://localhost:11434
#   │
#   ◇  Model fetch complete
#   │
#   └  Configuration saved. You can now run `goose`.

If you skip the host prompt, Goose defaults to localhost:11434, so a standard local Ollama install just works. Start a session with goose and you are in an agent loop that can read your repo, write files, and run commands against your local model — no API key, no metered tokens, no network egress.

For a deeper dive on which quantized models fit which GPU, our Gemma 4 QAT local coding guide maps VRAM tiers to real coding performance, and runaihome.com’s best local AI models by VRAM covers the hardware side.

The problem nobody mentions in the demos

Goose is a heavy tool-calling agent. Every step — read this file, run that command, apply this diff — is a structured function call the model has to emit correctly. Frontier models do this in their sleep. Small local models often do not.

On the first real test — “add input validation to the three handlers in api/routes.py and run the tests” — qwen3-coder:14b handled it cleanly: it read the files, edited all three, and ran pytest. Dropping to a 7B model on the same task, the agent stalled. It returned malformed tool-call JSON, Goose retried, and it looped. This is the same class of failure documented across local-agent setups, and it is the single biggest gap between the marketing (“runs on any model”) and reality (“runs well on models that are actually good at tool use”).

Two fixes that worked:

Use a model trained for agentic tool use, not just code completion. A 14B coder-tuned model (Qwen3-Coder, Devstral) succeeds where a general 7B model fails. If you are on 8GB of VRAM, this is the constraint, not Goose itself.
Lean on v1.37’s hooks and /review. When a full autonomous run is too much for a local model, use Goose more surgically — /review for local analysis, smaller scoped goals — instead of one giant “build the feature” prompt.

If your hardware can’t run a 14B model at usable speed, the honest move is to use Goose with a cheap cloud model. DeepSeek or a Qwen API runs Goose’s tool calls reliably for a fraction of a cent per task — see our DeepSeek V4-Flash as a coding backend breakdown.

Goose vs Cline vs Aider vs Claude Code

All four are agents. They overlap less than the marketing suggests.

| | Goose | Cline | Aider | Claude Code | |---|---|---|---| | License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Proprietary | | Interface | CLI + Desktop app | VS Code / JetBrains extension | Terminal | Terminal | | Cost | Free; tokens or local | Free; tokens or local | Free; tokens or local | Claude Pro $20 / Max $100–$200 / API | | Local models | Yes (Ollama, OpenAI-compatible) | Yes (Ollama, LM Studio) | Yes (Ollama, OpenAI-compatible) | No | | MCP support | Deep — 70+ extensions | Yes | Limited | Yes | | Best fit | Full-cycle automation outside the editor | Devs who live in VS Code | Git-native, surgical edits | Best reasoning on Claude |

The decisions are clear, not “it depends”:

You live in VS Code all day. Use Cline. It is the same idea as Goose but rendered inside your editor with inline diffs, and it talks to local models too.
You want tight, git-aware edits and clean commits. Use Aider. It is the most disciplined of the four about diffs and commit messages, and it stays out of your way.
You want the strongest possible reasoning and will pay for it. Use Claude Code. Nothing here matches a top Claude model on a gnarly multi-file refactor — but you are locked to Anthropic and you pay per run.
You want one agent that works everywhere, on any model, including fully offline, for $0. Use Goose. It is the only one of the four that ships a polished desktop app and a CLI, and its MCP integration is the deepest.

Goose’s real competition in the “free, model-agnostic, terminal-first” lane is OpenCode. The difference: Goose adds a desktop GUI and a far larger extension catalog; OpenCode is leaner. If you want a GUI, Goose wins.

Where Goose breaks

No fence-sitting — here is where it actually struggles.

Small local models. Covered above. Below ~14B, expect tool-call failures and loops on multi-step tasks. The agent is only as reliable as the model’s function-calling.

The autonomy is real, so the blast radius is real. Goose runs shell commands and edits files. v1.37 added PreToolUse denial hooks precisely because letting an agent run arbitrary commands needs guardrails. Run it in a project directory with git committed, not on your home folder, and review diffs before you trust them. This is the same caution we give in when to trust the AI suggestion.

Setup friction vs a one-click IDE. Cursor or Copilot install and just work inside an editor you already know. Goose asks you to pick a provider, manage a model, and learn a CLI or a separate desktop app. The payoff is portability and zero lock-in; the cost is ten minutes of setup.

Extension sprawl. 70+ first-party extensions plus 3,000+ MCP servers is a feature and a trap. Most users need three or four. Installing everything slows the agent and bloats the context. Start minimal — see our best MCP servers for AI coding for which ones earn their slot.

Who should actually install it

Install Goose today if you are a developer who (a) wants real agentic automation — running tests, editing across files, executing code — and (b) either cares about keeping code on your own machine or just refuses to pay a subscription. With a 14B+ local model on a 16GB+ GPU, Goose gives you a genuinely capable agent for $0 in recurring cost. That combination did not exist this cleanly a year ago.

Skip it if you live entirely inside VS Code (Cline is the better-fitting twin) or if your only hardware is an 8GB laptop GPU and you want everything offline — at that tier, the model, not Goose, is the bottleneck, and you’ll fight tool-call failures. In that case, pay a few cents for a cloud model and let Goose drive it.

FAQ

Is Goose really free? Yes. The entire codebase is Apache 2.0. There is no Goose subscription. Your only cost is model inference — which is $0 if you run a local Ollama model, or pay-as-you-go if you point it at a cloud API.

Does Goose work fully offline? Yes, with a local provider. Configure Ollama (or any OpenAI-compatible local server) and no code or prompt leaves your machine. The quality ceiling is set by the local model you can run.

Goose vs Cline — which should I pick? Same core idea, different shell. Cline lives inside VS Code/JetBrains with inline diffs; Goose runs as a CLI and a standalone desktop app. Pick Cline if you never leave your editor; pick Goose if you want a terminal-first or GUI agent that isn’t tied to one IDE.

What model should I run locally with Goose? A coder-tuned model at 14B or larger that’s good at tool calling — Qwen3-Coder 14B or Devstral are reliable. Anything smaller than ~7B will frequently fail Goose’s structured tool calls. Match the quant to your VRAM.

Who owns Goose now? Block built it and donated it to the Agentic AI Foundation (AAIF) under the Linux Foundation in April 2026, alongside MCP and AGENTS.md. Development continues, but governance is now vendor-neutral.

Does Goose support MCP servers? Yes — it has one of the deepest MCP integrations in the ecosystem, with 70+ documented extensions and access to the 3,000+ servers in the MCP registry.

Sources

Last updated June 15, 2026. Pricing and features change frequently; verify current state before purchasing.

Was this article helpful?