What Happens When an AI Agent Manages Your Password Vault

TL;DR

Claude Code and the op CLI reorganized 690 credentials — four vaults, 390 items tagged, SSH agent configured — in one session.
This is AI-native work: the agent operated the vault; the human set direction and approved via Touch ID.
The CLI failed on 18 items saved via social auth, which carry a field of type UNKNOWN. It was a hard failure, not graceful degradation, and a real reliability blocker for team-scale use.
The bug was filed from the terminal via the GitHub CLI in the same session it was found.
If your password manager has a CLI, you already have everything needed to run this.

I’ve been a 1Password user for years. Not in a conscious, intentional way — more in the way you use a good chair: it became part of how I work and I stopped thinking about it.

That changed when I set up a new machine. I had to install 1Password, wire up the SSH agent, reconnect the CLI, re-authenticate everything. The process took longer than it should have because I’d never written down what I’d built. I’d only accumulated it. And somewhere in the middle of that setup, it hit me: I had 690 credentials in one flat vault — logins from jobs I’d left years ago sitting next to active API keys, personal bank accounts mixed with infrastructure credentials, demo user passwords alongside production secrets. The kind of accumulation that happens when a tool works well enough that you never stop to organize it.

I’d been meaning to clean it up for a long time. I never did, because the job is exactly the kind of work that’s too tedious to do manually and too important to skip: touch every item, make a judgment call, move it somewhere sensible, repeat 690 times.

Then I realized: with Claude Code and the op CLI, this was now actually possible. Not assisted — the agent could do it. So I handed it the keys.

What “AI-native” actually means here

Quick context on timing: 1Password launched its SSH agent and CLI 2.0 in March 2022. Git commit signing via the vault came six months later. These are mature, stable features — not betas. 1Password has since launched 1Password Unified Access, an enterprise product built around exactly this problem: AI tools and scripts running on developer endpoints with no visibility, credentials sitting in local files undetected, agents inheriting access without accountability. The enterprise solution adds endpoint discovery, credential scanning, runtime brokering, and unified audit logs.

I wasn’t aware of Unified Access when I started this project. I was trying to clean up a vault. But the workflow I fell into — agent operating the CLI, human setting direction and approving vault access via Touch ID — is the personal-account version of what they’re building for teams. The core pattern is the same. You don’t need the enterprise product to prove the model.

The 1Password CLI (op) exposes the full vault API: list items, read fields, edit metadata, move between vaults, create vaults, apply tags. Everything the UI can do, the CLI can do programmatically.

Claude Code — an AI coding CLI — can run shell commands, write scripts, read output, and iterate. The combination is straightforward: describe what you want, the agent writes op commands, executes them, reads the results, and adjusts.

No GUI. No clicking. No manually locating items and dragging them. The agent operates the vault the same way it would operate any other infrastructure — through the CLI, programmatically, at volume.
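A minimal sketch of that loop in Python, assuming the op binary is on the PATH and the 1Password app is unlocked. The helper names (op_argv, run_op, list_items) and the injectable runner are illustrative and not part of any 1Password library; --format json is the CLI flag for machine-readable output.

```python
import json
import subprocess


def op_argv(*args):
    """Build an argv list for the op CLI, asking for JSON output."""
    return ["op", *args, "--format", "json"]


def run_op(argv, runner=subprocess.run):
    """Run one op command and parse its JSON output (None if it printed nothing)."""
    result = runner(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout) if result.stdout.strip() else None


def list_items(vault, runner=subprocess.run):
    """Return the item summaries (id, title, ...) for one vault."""
    return run_op(op_argv("item", "list", "--vault", vault), runner)
```

The runner parameter is there so the read-and-adjust loop can be exercised with a stub before it ever touches a real vault.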

This is the definition of AI-native work: the AI isn’t assisting a human who operates the tool. The AI is operating the tool. The human sets direction and reviews outcomes.

What we built

Starting from one vault with 690 items and no organization, the agent produced:

Three kinds of vault with clear boundaries:

Vault	Purpose
`Personal`	Consumer accounts — the person, not any project
`[Project]`	One vault per active project — APIs, deployment credentials, dev tools
`Archive`	Dead accounts — old jobs, expired credentials, abandoned projects

The pattern applies whether you’re running one side project or five. Each project gets its own vault. When it’s time to hand something to a contractor, bring on a co-founder, or give an accountant access, the credential surface is already scoped. You’re not hunting through a flat list. The boundary was set when the project started.

The vault split matters beyond organization. Credentials accumulate without intention. The moment you separate by project, you also get a forcing function: every new credential has to go somewhere specific. That’s a better default than one flat vault where everything lives together until you can’t tell what’s active and what’s abandoned.

Tags applied across ~400 Personal items:

finance, health, travel, shopping, family, tech, social, learning, home, government, ai-tools, crypto

This took a Python script, a TSV export of all items, regex matching against titles, and a loop of op item edit --tags calls. 390 items tagged in a single background task. The kind of work that would have taken hours manually — if it ever got done at all.
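The script itself is not reproduced in this post, so the following is a reconstruction of its shape rather than the original. The tag patterns, the TSV column names (id, title), and the function names are all assumptions. It only plans the op item edit --tags commands; running them is left to the caller.

```python
import csv
import io
import re

# Hypothetical patterns: the real script's rules are not shown in this post.
TAG_PATTERNS = {
    "finance": re.compile(r"bank|paypal|brokerage", re.IGNORECASE),
    "travel": re.compile(r"airline|hotel|airbnb", re.IGNORECASE),
    "ai-tools": re.compile(r"openai|anthropic", re.IGNORECASE),
}


def tags_for(title):
    """Return every tag whose pattern matches the item title, sorted."""
    return sorted(tag for tag, pattern in TAG_PATTERNS.items() if pattern.search(title))


def plan_tag_commands(tsv_text):
    """Turn a TSV export (assumed columns: id, title) into op argv lists."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    commands = []
    for row in reader:
        tags = tags_for(row["title"])
        if tags:  # untagged items are skipped, not edited
            commands.append(["op", "item", "edit", row["id"], "--tags", ",".join(tags)])
    return commands
```

One caveat worth checking against the CLI reference before running anything like this: as far as I can tell, --tags sets the item’s tag list rather than appending to it, so items that already carry tags would need their existing tags merged in first.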

1Password SSH agent configured:

One addition to ~/.ssh/config:

Host *
  IdentityAgent "~/Library/Group Containers/2BUA8C4S2C.com.1password/t/agent.sock"

SSH keys no longer live as files. Every ssh operation routes through the vault. The agent also verified this was working:

ssh -T git@github.com
# Hi blueCycle! You've successfully authenticated...

The key from a previous machine, stored in 1Password, working on a new machine immediately. No key file to copy, no ssh-keygen to run, no ~/.ssh/authorized_keys to update.

One detail worth naming: Touch ID gates vault access once per unlock, not once per operation. When 1Password is unlocked, CLI commands operate within that session without re-prompting for each call. Re-lock behavior is configurable: there’s a setting for how many minutes of inactivity pass before the vault requires authentication again. The effect: you approve vault access once, and the agent operates within that window. The human checkpoint exists, and you set its terms. The agent doesn’t bypass the gate; it works within it.

Where it broke

The CLI failed on 18 items with this error:

[ERROR] unable to process line 1: failed to edit due to identity inconsistencies:
for UUID <template-uuid> found in the template was inconsistent with <item-uuid>

Root cause: items saved using “Sign in with Google” or “Sign in with Apple” via the browser extension contain a field with type UNKNOWN. The CLI’s field validator doesn’t recognize this type and hard-fails the entire operation — not just the unknown field, the whole item.

{
  "label": "sign in with",
  "type": "UNKNOWN"
}

The documented workaround: open the desktop app, locate the item, drag it to the target vault.

That workaround breaks the workflow in a specific way. The value of agent-driven automation is that it runs end-to-end without human intervention. A step that requires switching to a GUI, locating an item visually, and dragging it doesn’t just add friction; it reintroduces the exact human bottleneck the automation was built to eliminate.

For 18 items in a 690-item cleanup, it’s tolerable. For a team running this against a shared vault with hundreds of social-auth items, it’s a reliability blocker.
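Until the CLI handles these items, one way to keep a batch from hard-failing mid-run is to split off the affected items up front. The sketch below assumes each item has already been fetched as JSON (for example with op item get and --format json) and that its fields sit in a top-level "fields" list, as in the snippet above. The function names are illustrative.

```python
def has_unknown_field(item):
    """True if any field on the item has the type the CLI rejects."""
    return any(field.get("type") == "UNKNOWN" for field in item.get("fields", []))


def partition_items(items):
    """Split items into (safe for the CLI, needs the manual workaround)."""
    safe, manual = [], []
    for item in items:
        (manual if has_unknown_field(item) else safe).append(item)
    return safe, manual
```

The agent can then process the safe list unattended and hand back the manual list as a short to-do for the desktop app.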

We filed a bug report: github.com/1Password/shell-plugins/issues/600

The issue was written, formatted, and filed from the terminal via the GitHub CLI — no browser, no copy-paste, no context switching. The bug report about broken AI-native workflows was itself filed via an AI-native workflow. The irony is intentional.

The proposed fix: preserve UNKNOWN fields as-is rather than failing validation. The field has meaning — it records how the user originally authenticated. The CLI should not require a recognized type to process the rest of the item.
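For reference, the filing step can be sketched as follows. This is not the exact command from the session. The body layout and function names are assumptions, while gh issue create with --repo, --title, and --body is the GitHub CLI’s standard form.

```python
def build_issue_body(error_text, failed_count, total_count):
    """Assemble a plain-text issue body from the captured failure."""
    return "\n".join([
        "What happened",
        f"{failed_count} of {total_count} items failed to edit.",
        "",
        "Error output",
        error_text.strip(),
    ])


def issue_create_argv(repo, title, body):
    """Build the gh argv; the caller decides whether to execute it."""
    return ["gh", "issue", "create", "--repo", repo, "--title", title, "--body", body]
```

Keeping the argv as a list, rather than a shell string, avoids quoting problems when the error text contains brackets or angle brackets, as this one does.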

The pattern that generalizes

Password managers are credential stores. CLIs expose credential stores as APIs. AI agents can operate APIs. The combination means credential management — historically a human-only task because of its volume and judgment requirements — becomes something an agent can handle at scale, with the human setting policy and reviewing outcomes.

The judgment calls that still require a human:

Which vault does this belong in? (organizational policy)
Is this credential still active? (operational knowledge)
Should I archive or delete? (risk assessment)

The execution that doesn’t:

Move all old work logins to Archive
Tag everything matching these patterns
Apply this vault structure to these 690 items

The split is clean. Policy and judgment stay with the human. Volume and execution go to the agent.
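That split maps naturally onto data versus code. In the sketch below, the rules table is the human’s policy and the planner is the agent’s execution. The patterns, vault names, and function names are hypothetical; op item move with --destination-vault is the CLI’s move command.

```python
import re

# Policy: written and reviewed by the human. Patterns are placeholders.
MOVE_RULES = [
    (re.compile(r"old ?corp|former employer", re.IGNORECASE), "Archive"),
    (re.compile(r"aws|vercel", re.IGNORECASE), "Project"),
]


def destination_for(title, rules=MOVE_RULES):
    """Return the vault for the first matching rule, or None to leave the item alone."""
    for pattern, vault in rules:
        if pattern.search(title):
            return vault
    return None


def plan_moves(items, rules=MOVE_RULES):
    """Execution: expand the policy into one op argv per item that should move."""
    plan = []
    for item in items:
        vault = destination_for(item["title"], rules)
        if vault is not None:
            plan.append(["op", "item", "move", item["id"], "--destination-vault", vault])
    return plan
```

Producing a plan before executing anything also gives the human a natural review point: the list of moves can be read and approved before a single item changes vaults.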

This is the same pattern that makes AI-native work different from AI-assisted work. Assistance means the human operates the tool with AI suggestions. Native means the AI operates the tool with human direction. The vault cleanup took one session. Done manually, it never would have happened.

So what

1Password Unified Access is the enterprise version of a pattern that already works with a personal account and the existing op CLI. The enterprise product adds endpoint discovery, credential scanning across local files, runtime brokering for teams, and audit logs. Useful at scale. Not required to start.

What I found: the agent operated the vault, reorganized 690 credentials, and filed a bug report on GitHub, all within a single session, on a personal account, with tooling that shipped in 2022. The human set direction. The agent handled volume and execution.

If you have a password manager with CLI access and an AI coding environment, that combination is already more capable than either alone.

The UNKNOWN field type bug is a real limitation today. It will either get fixed — the issue is filed — or it won’t, and AI-native vault management will have an asterisk for accounts that use social auth. Watch the issue if you’re building on this.

The broader question: what other personal infrastructure has a CLI but no one has thought to run an agent against it? Password vaults are one answer. The category is larger than that.

This is the second post in an eight-post series on 1Password as infrastructure. The first covers what 1Password replaces in a builder setup. The third covers service accounts and runtime credential infrastructure. The fourth covers 1Password as the trust anchor in a live authentication chain. The fifth covers how the same pattern scales from solo builder to enterprise. The sixth covers op run — making any tool vault-transparent. The seventh applies the same discipline to a box that runs agents continuously. The eighth covers why that’s still the floor, not the ceiling.