Opus 4.8

Opus 4.8 now powers Amp's smart mode, replacing Opus 4.7.

It is a better coding agent than Opus 4.7: more faithful to the prompt, tighter in the changes it makes, and better at checking its own work. In our internal evals it solved 62% of tasks, up from 4.7's 52%.

Tighter Changes, Better Checks

Opus 4.7 was already strong on hard, multi-file work, and Opus 4.8 keeps that. What changes is how it gets there: with fewer wasted steps and more self-checking.

The clearest differences are restraint and verification.

Opus 4.7 can sometimes over-engineer, reaching for a more elaborate solution than the task needs. It also verifies its own work less, occasionally moving on even when a command's output is already warning that something is off.

Opus 4.8 makes a more focused change that solves the specific task at hand, then checks itself. It leans on a tighter write→test loop, often spinning up a quick script, test, or skill to confirm the change works before proceeding. In our evals it ran tests and code 15% more often per task than 4.7.

That restraint is easiest to see on hard tasks. On everyday work the two models make a similar number of tool calls. As tasks get harder, 4.8 stays tight exactly where 4.7 tends to run long and fail more often.

Figure: Tool calls per task across the difficulty curve. Opus 4.7 and Opus 4.8 track together through p50, then 4.8 stays consistently lower, with the gap widening sharply on the hardest tasks.

It Reaches for the Right Tool

Opus 4.8 is noticeably better at using its tools and sub-agents without being told to.

When a task needs outside context, it calls librarian instead of inferring a library's behavior from the local code. Across our eval, Opus 4.8 called librarian 14 times, versus once for Opus 4.7. It also uses a repo's skills more often to verify its work, for example by driving the browser or the CLI, rather than assuming the change worked.

When it edits, it leans on edit_file for surgical, in-place changes rather than rewriting whole files with create_file. In our evals 79% of its file edits go through edit_file, up from 63% on Opus 4.7.

Fewer Built-in Tools

We dropped the Read tool from smart mode.

Opus 4.8 is good enough at reading files straight from the shell with cat, rg, sed, and nl. It parallelizes those reads when it needs several files at once.
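
For a rough picture of what those parallel shell reads amount to, here is a minimal Python sketch. The agent issues the shell commands directly; the helper names `line_range_cmd` and `read_many` are hypothetical and exist only for illustration, and the sketch assumes `cat`, `rg`, `sed`, and `nl` are on the PATH.

```python
from __future__ import annotations

import subprocess
from concurrent.futures import ThreadPoolExecutor


def line_range_cmd(path: str, start: int, end: int) -> list[str]:
    """Build the argv that prints lines start..end of a file with sed."""
    return ["sed", "-n", f"{start},{end}p", path]


def read_many(commands: list[list[str]]) -> list[str]:
    """Run several read-only commands concurrently, returning stdout in input order."""

    def run(argv: list[str]) -> str:
        # check=True raises CalledProcessError if a command exits non-zero.
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        return result.stdout

    with ThreadPoolExecutor() as pool:
        return list(pool.map(run, commands))


def example_reads() -> list[list[str]]:
    """Illustrative reads only; the paths and the search pattern are made up."""
    return [
        ["cat", "README.md"],                       # whole file
        ["rg", "-n", "def main", "src"],            # search with line numbers
        line_range_cmd("src/app.py", 40, 80),       # a slice of one file
        ["nl", "-ba", "src/config.py"],             # whole file, every line numbered
    ]
```

Calling `read_many(example_reads())` in a repository that has those files would return the four outputs in the same order as the commands.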

Fast Mode Is Worth It Now

Opus 4.8 has a fast mode at roughly 2.5× the speed. It now costs 2× base tokens, down from 6× on 4.7, so fast mode is a third of its previous price.
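
The arithmetic behind that price change can be written out as a small sketch. The multipliers are the ones quoted above; the base cost passed in is a hypothetical figure, and treating the roughly 2.5× speed-up as applying evenly to a whole thread is a simplifying assumption.

```python
from fractions import Fraction

# Multipliers quoted in this post, relative to base token price.
FAST_COST_4_8 = Fraction(2)     # fast mode on Opus 4.8
FAST_COST_4_7 = Fraction(6)     # fast mode on Opus 4.7
FAST_SPEEDUP = Fraction(5, 2)   # "roughly 2.5x" (assumed uniform here)


def fast_mode_summary(base_cost: Fraction) -> dict[str, Fraction]:
    """Compare fast-mode cost for a thread whose base-mode cost is base_cost."""
    return {
        "fast_cost_4_8": base_cost * FAST_COST_4_8,
        "fast_cost_4_7": base_cost * FAST_COST_4_7,
        # How many times more 4.7 fast mode cost than 4.8 fast mode.
        "price_ratio": FAST_COST_4_7 / FAST_COST_4_8,
        # Fraction of the old fast-mode price that is saved.
        "saving_vs_4_7": 1 - FAST_COST_4_8 / FAST_COST_4_7,
        # Wall-clock time relative to base mode.
        "relative_duration": 1 / FAST_SPEEDUP,
    }
```

For a thread that would cost 10 units at base price, fast mode comes to 20 units on 4.8 versus 60 on 4.7: a saving of two thirds, for a run that takes about 40% of the base-mode time.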

Toggle fast mode for a thread from the CLI command palette (Ctrl+O) → speed: use fast.

How to Use It

Opus 4.8 stays close to what you ask, changes less to get there, and checks its own work. A few habits make it shine:

  • Say how far to go. It keeps changes narrow, touching fewer files than 4.7 unless told otherwise. Name the scope when you want it wide: "Fix this for every input format, not just this one." Left unsaid, it changes exactly what you described, which is usually what you want.
  • Give it something to verify against. It runs tests and code more readily than 4.7, so a test, repro command, or repo skill turns that instinct loose. A browser or CLI skill lets it actually exercise the change rather than infer correctness from the code.
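
As an illustration of both habits, here is a minimal sketch of a repro check that also names the scope. Everything in it is hypothetical: `parse_duration` stands in for whatever function is being fixed, and the table of cases spells out "every input format" so there is something concrete to run.

```python
import re

# Accepts "90", "90s", "2m", "1m30s" (hypothetical formats for illustration).
_DURATION = re.compile(r"^(?:(\d+)m)?(?:(\d+)s?)?$")


def parse_duration(text: str) -> int:
    """Hypothetical function under repair: parse a duration string into seconds."""
    text = text.strip()
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds or 0)


# One entry per input format the fix is supposed to cover.
CASES = {
    "90": 90,
    "90s": 90,
    "2m": 120,
    "1m30s": 90,
}


def check_all_formats() -> int:
    """Assert every case and return how many were checked."""
    for text, expected in CASES.items():
        actual = parse_duration(text)
        assert actual == expected, f"{text!r}: expected {expected}, got {actual}"
    return len(CASES)
```

Pointing the agent at a check like `check_all_formats()` in the prompt gives it a single command to run after each change, and the case table makes the intended scope explicit.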