I started using Agent Safehouse

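Claude Code gained --permission-mode=auto, which makes me feel a good deal safer, but in the end an LLM is still making the call, so it isn't perfect. Agent Safehouse uses macOS's sandbox-exec to restrict an agent's file access at the kernel level. Whatever the LLM decides, the OS enforces the block.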

brew install eugene1g/safehouse/agent-safehouse
safehouse -- claude --dangerously-skip-permissions

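sandbox-exec is the sandboxing mechanism built into macOS: the kernel denies access at the system call stage. The man page marks it DEPRECATED, but there is no replacement aimed at CLI tools, so it is still in working use (App Sandbox is meant for apps).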

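safehouse feeds sandbox-exec a deny-first policy, so everything outside the working directory is blocked (and of course ~/.ssh and ~/.aws are off limits). It is one layer of defense rather than an all-purpose security boundary, but the big win is being able to use --dangerously-skip-permissions with some peace of mind. Internally it assembles the policy in Sandbox Profile Language (SBPL), and safehouse --stdout lets you inspect the generated profile. The implementation is 99.8% shell script, so it is easy to read.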
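To give a feel for what a deny-first SBPL policy looks like, here is a minimal sketch. It is not what safehouse actually generates: emit_profile is a hypothetical helper I made up for illustration, and the handful of allow rules is far too small for a real agent to run under.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print a tiny deny-first SBPL profile that opens up a single working directory.
# Illustrative only: emit_profile is a hypothetical helper, and a usable profile
# needs many more allow rules than this. The path is interpolated as-is, so a
# directory name containing a double quote would break the profile.
emit_profile() {
  local workdir="$1"
  cat <<EOF
(version 1)
(deny default)
(allow process-exec)
(allow process-fork)
(allow file-read* (subpath "/usr") (subpath "/bin") (subpath "/System"))
(allow file-read* file-write* (subpath "${workdir}"))
EOF
}

emit_profile "$PWD"
```

On macOS, a profile string like this can be handed to sandbox-exec with -p (or saved to a file and passed with -f). Expect to add more rules before anything non-trivial runs under it, which is exactly the work safehouse does for you.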

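One nice touch: the official site hosts a prompt template for LLMs, so you can have the agent detect your environment and generate a custom profile. Having the sandboxed party write its own sandbox configuration is a slightly amusing idea.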

I wrote a wrapper called safe that bundles the options I use every time.

#!/usr/bin/env bash
set -euo pipefail

args=(
  --env-pass=CLAUDE_CONFIG_DIR,DD_SITE,EDITOR,GOPATH,GOBIN
  --add-dirs-ro="$HOME/.config"
  --add-dirs-ro="$HOME/.codex"
  --add-dirs-ro="$HOME/.gnupg"
  --add-dirs-ro="$HOME/blog"
  --add-dirs-ro="$HOME/local"
  --add-dirs="$HOME/.cache"
  --add-dirs=/private/tmp
  --enable=clipboard,cleanshot,1password
)

# Grant write access to worktree-related directories:
# - {git-root}-worktrees/  (where git-waku creates worktrees)
# - main repo .git/        (worktrees store index/refs inside the main .git)
git_root="$(git rev-parse --show-toplevel 2>/dev/null || true)"
if [[ -n "$git_root" ]]; then
  worktrees_dir="${git_root}-worktrees"
  [[ -d "$worktrees_dir" ]] && args+=(--add-dirs="$worktrees_dir")

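  git_common_dir="$(git rev-parse --git-common-dir 2>/dev/null || true)"
  # --git-common-dir can be relative (".git", or "../.git" from a subdirectory),
  # so resolve it to an absolute path before comparing.
  if [[ -n "$git_common_dir" ]]; then
    git_common_dir="$(cd "$git_common_dir" && pwd -P)"
  fi
  if [[ -n "$git_common_dir" && "$git_common_dir" != "$git_root/.git" ]]; then
    # Inside a worktree: grant write to the main repo so git can write index.lock etc.
    main_repo="$(dirname "$git_common_dir")"
    args+=(--add-dirs="$main_repo")
  fi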
fi

profile="${SAFEHOUSE_APPEND_PROFILE:-}"
[ -n "$profile" ] && [ -f "$profile" ] && args+=(--append-profile="$profile")

# Extract safehouse options (before --) from user args
safehouse_opts=()
cmd_args=()
saw_separator=0
for arg in "$@"; do
  if [[ $saw_separator -eq 1 ]]; then
    cmd_args+=("$arg")
  elif [[ "$arg" == "--" ]]; then
    saw_separator=1
  elif [[ "$arg" == --explain || "$arg" == --stdout || "$arg" == --enable=* || "$arg" == --workdir=* ]]; then
    safehouse_opts+=("$arg")
  else
    cmd_args+=("$arg")
  fi
done

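# Guard both arrays: older bash (e.g. 3.2) treats an empty array as unset under `set -u`.
exec safehouse "${args[@]}" ${safehouse_opts[@]+"${safehouse_opts[@]}"} -- ${cmd_args[@]+"${cmd_args[@]}"}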

Permissions can be set per directory: .config is read-only, .cache is writable, and so on. With --enable I allowed only what I need, such as clipboard and 1password.
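The argument handling is the part of the wrapper that is easiest to get wrong, so here is the same splitting logic pulled out into a standalone sketch. split_args is a name I made up for this example. It only prints which arguments would go to safehouse and which to the wrapped command, and runs nothing.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Split arguments the way the wrapper does: known safehouse flags seen before
# "--" go to the sandbox, everything else belongs to the wrapped command.
# split_args is a hypothetical helper for illustration; it only prints.
split_args() {
  local opts=() cmd=() saw_separator=0 arg
  for arg in "$@"; do
    if [[ $saw_separator -eq 1 ]]; then
      cmd+=("$arg")
    elif [[ "$arg" == "--" ]]; then
      saw_separator=1
    elif [[ "$arg" == --explain || "$arg" == --stdout || "$arg" == --enable=* || "$arg" == --workdir=* ]]; then
      opts+=("$arg")
    else
      cmd+=("$arg")
    fi
  done
  # The "-" default keeps `set -u` happy when an array is empty.
  echo "safehouse: ${opts[*]-}"
  echo "command:   ${cmd[*]-}"
}

split_args --explain claude --dangerously-skip-permissions
```

Anything after an explicit -- is always treated as part of the command, so `safe -- --explain ls` passes --explain through to the command instead of to safehouse.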

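Using git worktree takes a little extra care. When safehouse starts inside a worktree, the workdir is the worktree's path. But git writes index.lock and friends inside the main repo's .git/, which is out of scope. So I use git rev-parse --git-common-dir to locate the main repo and add write permission for it.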
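If you want to see the worktree behavior for yourself, the following sketch builds a throwaway repository with one linked worktree and resolves the main repo from inside it. main_repo_for_worktree is a hypothetical helper, not part of safehouse or git, and the temp directory layout is just for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the main repository root when the current directory is inside a linked
# worktree; print nothing in the main repo itself.
# (main_repo_for_worktree is a hypothetical helper for this demo.)
main_repo_for_worktree() {
  local git_dir common_dir
  # Both values can be relative, so resolve them to physical absolute paths.
  git_dir="$(cd "$(git rev-parse --git-dir)" && pwd -P)"
  common_dir="$(cd "$(git rev-parse --git-common-dir)" && pwd -P)"
  if [[ "$git_dir" != "$common_dir" ]]; then
    dirname "$common_dir"
  fi
}

# Throwaway repo with one linked worktree.
tmp="$(mktemp -d)"
git init -q "$tmp/main" 2>/dev/null
git -C "$tmp/main" -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m init
git -C "$tmp/main" worktree add -b demo "$tmp/wt" >/dev/null 2>&1

# Inside the worktree, the per-worktree git dir and the common dir differ.
(cd "$tmp/wt" && git rev-parse --git-dir && main_repo_for_worktree)
```

Inside the worktree, --git-dir points at a per-worktree directory under the main repo's .git/worktrees/, while --git-common-dir points at the main .git itself. That difference is what tells you extra write access is needed.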

I manage worktrees with git-waku. With safe set in the waku.command.agent git config key, git waku open --agent always launches Claude Code inside the sandbox.

[waku "command"]
    agent = safe claude --dangerously-skip-permissions
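The same entry can be written with the git config command instead of editing the file by hand. The sketch below writes to a scratch file (the cfg temp file is only for the demo) so it doesn't touch your real config. For actual use you would presumably drop -f "$cfg" and use --global or the repo-local config.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write the setting into a scratch config file instead of ~/.gitconfig.
cfg="$(mktemp)"
git config -f "$cfg" waku.command.agent "safe claude --dangerously-skip-permissions"

# Show the resulting file, then read the value back.
cat "$cfg"
git config -f "$cfg" --get waku.command.agent
```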

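Incidentally, I got to experience safehouse's power firsthand while writing this post. When I asked Claude Code to read the safe script, it was rejected with Operation not permitted. ~/.local/bin isn't on the allow list, so that is exactly what should happen. It's working properly.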

So stop handing AI agents full access and praying. Put them in a sandbox.

