refactor(browser): split installation and profile abstractions (#603)

* refactor(browser): split installation and profile abstractions

  A Chromium installation shares one master key across its profiles, but modeling each profile as its own Browser re-derived the key per profile. Browser now represents one installation holding its profiles and derives the key once; new types.Profile/ExtractResult/CountResult carry per-profile results.

* style: gofumpt safari_test.go

* test(chromium): rename shadowed loop var to path
2026-07-22 22:20:50 +02:00 · 2026-05-31 16:37:23 +08:00
parent d5dc81f1c0
commit b901f7dff0
28 changed files with 1359 additions and 1206 deletions
@@ -13,14 +13,14 @@ Key constraints:
 - **Go 1.20** — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (`log/slog`, `slices`, `maps`, `cmp`) must not be used.
 - **Supported engines**: Chromium (including Yandex and Opera variants) and Firefox.
 - **Supported platforms**: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service).
- **No root-level library API** — the CLI calls `browser.PickBrowsers()` directly; there is no importable `pkg/` surface.
+- **No root-level library API** — the CLI calls `browser.DiscoverBrowsersWithKeys()` directly; there is no importable `pkg/` surface.

 ## 2. Directory Structure

 ```
 HackBrowserData/
 ├── cmd/hack-browser-data/    # CLI entrypoint: cobra root, dump, list, version
-├── browser/                  # Browser interface, PickBrowsers(), platform browser lists
+├── browser/                  # Browser interface, DiscoverBrowsersWithKeys(), platform browser lists
 │   ├── chromium/             # Chromium engine: extraction, decryption, profile discovery
 │   └── firefox/              # Firefox engine: extraction, NSS key derivation
 ├── types/                    # Data model: Category enum, Entry structs, BrowserData
@@ -82,14 +82,14 @@ See `types/category.go` for the authoritative enum definition.
 There are two entry points, one for extraction and one for discovery:

 ```
-PickBrowsers(opts)                    // used by `dump` — ready to Extract
+DiscoverBrowsersWithKeys(opts)                    // used by `dump` — ready to Extract
  → pickFromConfigs(configs, opts)     // shared discovery core
      → platformBrowsers()             // build-tagged list for this OS
      → filter by name / profile path
      → newBrowsers(cfg)                // dispatch to chromium/firefox/safari.NewBrowsers
          → discoverProfiles()          // scan profile subdirectories
          → resolveSourcePaths()        // stat candidates, first match wins
-  → newPlatformInjector(opts)          // build-tagged: returns a func(Browser)
+  → newCredentialInjector(opts)          // build-tagged: returns a browserInjector
      → for each browser:               // closure captures retriever + keychain pw lazily
          inject(b)                     // type-assert retrieverSetter / keychainPasswordSetter

@@ -97,7 +97,7 @@ DiscoverBrowsers(opts)                 // used by `list` / `list --detail`
  → pickFromConfigs(configs, opts)     // same shared discovery core, NO injection
 ```

-`PickBrowsers` does discovery + decryption setup in one call; the returned
+`DiscoverBrowsersWithKeys` does discovery + decryption setup in one call; the returned
 browsers are ready for `b.Extract`. `DiscoverBrowsers` skips injection
 entirely, so list-style commands never trigger the macOS Keychain password
 prompt — they have no use for the credential. Both entry points share the
@@ -106,8 +106,8 @@ consistent.

 Key design decisions:

- **One KeyRetriever chain per process** — built lazily inside `newPlatformInjector` and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS.
- **Discovery is decoupled from injection** — `pickFromConfigs` is injection-free; `DiscoverBrowsers` stops after it, `PickBrowsers` continues into injection.
+- **One KeyRetriever chain per process** — built lazily inside `newCredentialInjector` and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS.
+- **Discovery is decoupled from injection** — `pickFromConfigs` is injection-free; `DiscoverBrowsers` stops after it, `DiscoverBrowsersWithKeys` continues into injection.
 - **Profile discovery differs by engine**: Chromium looks for `Preferences` files in subdirectories; Firefox accepts any subdirectory containing known source files.
 - **Flat layout fallback** — Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory.
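
The first two design decisions above can be illustrated with a minimal sketch. It is not the project's code: only the interface names `retrieverSetter` and `keychainPasswordSetter` come from the flow diagram, while their method sets, the `newInjector` helper, the string-typed retriever, and the `*Like` browser types are assumptions made for illustration. The sketch stays within Go 1.20.

```go
package main

import "sync"

// Browser is a stand-in for the project's Browser interface.
// Only a Name method is assumed here.
type Browser interface{ Name() string }

// The interface names appear in the discovery flow; the method
// signatures below are assumptions for this sketch.
type retrieverSetter interface{ SetRetriever(r string) }

type keychainPasswordSetter interface{ SetKeychainPassword(pw string) }

// chromiumLike accepts a key retriever (hypothetical type).
type chromiumLike struct{ name, retriever string }

func (c *chromiumLike) Name() string          { return c.name }
func (c *chromiumLike) SetRetriever(r string) { c.retriever = r }

// safariLike accepts the raw keychain password (hypothetical type).
type safariLike struct{ name, password string }

func (s *safariLike) Name() string                  { return s.name }
func (s *safariLike) SetKeychainPassword(pw string) { s.password = pw }

// firefoxLike needs neither capability (hypothetical type).
type firefoxLike struct{ name string }

func (f *firefoxLike) Name() string { return f.name }

// newInjector returns a closure that builds the shared retriever at most
// once, on first use, and hands it to every browser that can accept it.
// Browsers that satisfy neither interface are left untouched.
func newInjector(build func() string, password string) func(Browser) {
	var (
		once   sync.Once
		shared string
	)
	return func(b Browser) {
		if s, ok := b.(retrieverSetter); ok {
			once.Do(func() { shared = build() })
			s.SetRetriever(shared)
		}
		if s, ok := b.(keychainPasswordSetter); ok {
			s.SetKeychainPassword(password)
		}
	}
}
```

Because discovery never calls the injector, a list-style entry point built this way never runs `build`, which is how the keychain prompt is avoided in this sketch.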

@@ -36,7 +36,7 @@ The return value is the **ready-to-use decryption key** — either the raw AES k

 `ChainRetriever` wraps multiple retrievers and tries them in order. The first successful result wins. If all fail, errors from every retriever are combined into a single error.

-**Caching**: the retriever chain is created once per process inside `newPlatformInjector` (see `browser/browser_{darwin,linux,windows}.go`) and shared across every Chromium browser and every profile. macOS retrievers additionally use `sync.Once` internally, so multi-profile browsers only trigger one keychain prompt or memory dump.
+**Caching**: the retriever chain is created once per process inside `newCredentialInjector` (see `browser/browser_{darwin,linux,windows}.go`) and shared across every Chromium browser and every profile. macOS retrievers additionally use `sync.Once` internally, so multi-profile browsers only trigger one keychain prompt or memory dump.

 ## 3. macOS Key Retrieval

@@ -122,7 +122,7 @@ Windows populates two slots of the `keyretriever.Retrievers` struct — V10 (leg
 | V10 | `DPAPIRetriever` | `os_crypt.encrypted_key` | `CryptUnprotectData` (Crypt32.dll) |
 | V20 | `ABERetriever` | `os_crypt.app_bound_encrypted_key` | IElevator via reflective injection (see [RFC-010](010-chrome-abe-integration.md)) |

-`browser/browser_windows.go::newPlatformInjector` calls `keyretriever.DefaultRetrievers()` and wires the resulting struct through `Browser.SetKeyRetrievers(r)`. At extract time `keyretriever.NewMasterKeys` runs each slot independently — a failure on one tier does not prevent the other from succeeding, because mixed-tier Chrome profiles (upgraded from pre-127) need partial success to be useful.
+`browser/browser_windows.go::newCredentialInjector` calls `keyretriever.DefaultRetrievers()` and wires the resulting struct through `Browser.SetKeyRetrievers(r)`. At extract time `keyretriever.NewMasterKeys` runs each slot independently — a failure on one tier does not prevent the other from succeeding, because mixed-tier Chrome profiles (upgraded from pre-127) need partial success to be useful.

 **Why not a ChainRetriever?** `ChainRetriever` has first-success semantics: once ABE returns a key, DPAPI is never called. That semantics is wrong for orthogonal tiers — it was the root cause of issue #578, where upgraded profiles' v10-encrypted passwords silently failed because only the v20 key was retrieved. `NewMasterKeys` evaluates each tier independently and returns an `errors.Join` of per-tier failures; log severity is a caller-side decision. `browser/chromium::getMasterKeys` currently logs all tier errors uniformly at `Warnf` — the distinction between "partial" and "total" failure was judged low-value for a short-lived CLI where all warn lines are visible in the default output.
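
The difference between the two semantics can be sketched as follows. This is a simplified illustration, not the `keyretriever` package: the `retriever` function type, `firstSuccess`, `perTier`, `masterKeys`, and `errUnavailable` are hypothetical names, and real retrievers take arguments that are omitted here. `errors.Join` is available from Go 1.20.

```go
package main

import (
	"errors"
	"fmt"
)

// retriever is a simplified stand-in for a key retriever (assumed shape).
type retriever func() ([]byte, error)

// errUnavailable is a hypothetical sentinel used to simulate a failing tier.
var errUnavailable = errors.New("tier unavailable")

// firstSuccess mirrors chain semantics: it stops at the first retriever
// that succeeds and combines the errors only if every retriever fails.
func firstSuccess(rs ...retriever) ([]byte, error) {
	var errs []error
	for _, r := range rs {
		key, err := r()
		if err == nil {
			return key, nil
		}
		errs = append(errs, err)
	}
	return nil, errors.Join(errs...)
}

// masterKeys holds one key per tier; a nil slice means the tier failed
// or had no retriever configured.
type masterKeys struct{ V10, V20 []byte }

// perTier runs each tier independently, so a failure in one tier does not
// prevent the other from producing a key. Per-tier failures are joined.
func perTier(v10, v20 retriever) (masterKeys, error) {
	var (
		keys masterKeys
		errs []error
	)
	if v10 != nil {
		if k, err := v10(); err != nil {
			errs = append(errs, fmt.Errorf("v10: %w", err))
		} else {
			keys.V10 = k
		}
	}
	if v20 != nil {
		if k, err := v20(); err != nil {
			errs = append(errs, fmt.Errorf("v20: %w", err))
		} else {
			keys.V20 = k
		}
	}
	// errors.Join returns nil when there are no errors to join.
	return keys, errors.Join(errs...)
}
```

With `firstSuccess(abe, dpapi)`, a working ABE retriever means DPAPI is never consulted, so only the v20 key is returned. `perTier` returns both keys, or one key plus a joined error, which is the partial-success behavior that mixed-tier profiles need.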

@@ -214,7 +214,7 @@ Future contributors adding a new macOS browser that reads credentials from the K

 ### 7.3 Where the `--keychain-pw` Password Goes

-The macOS login password is resolved once at startup by `browser/browser_darwin.go::resolveKeychainPassword`, then delivered to both consumers from within a single platform-specific closure, `newPlatformInjector` (defined per platform in `browser/browser_{darwin,linux,windows}.go`). The closure captures both the retriever chain and the raw password, and applies whichever capability interface each Browser happens to satisfy:
+The macOS login password is resolved once at startup by `browser/browser_darwin.go::resolveKeychainPassword`, then delivered to both consumers from within a single platform-specific closure, `newCredentialInjector` (defined per platform in `browser/browser_{darwin,linux,windows}.go`). The closure captures both the retriever chain and the raw password, and applies whichever capability interface each Browser happens to satisfy:

 | Consumer | Capability interface | Defined in | Payload |
 |---|---|---|---|
@@ -28,7 +28,7 @@ The primary command. Extracts, decrypts, and writes browser data to files.
 | `--keychain-pw` | | | macOS keychain password |
 | `--zip` | | `false` | Compress output to zip |

-**Workflow**: PickBrowsers (filter by `-b`) → parseCategories (split `-c` on commas) → NewWriter (select formatter by `-f`) → Extract loop (each browser) → Write → optional CompressDir.
+**Workflow**: DiscoverBrowsersWithKeys (filter by `-b`) → parseCategories (split `-c` on commas) → NewWriter (select formatter by `-f`) → Extract loop (each browser) → Write → optional CompressDir.

 The nine recognized categories are: `password`, `cookie`, `bookmark`, `history`, `download`, `creditcard`, `extension`, `localstorage`, `sessionstorage`. The string `"all"` maps to all nine.
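
A minimal sketch of the category parsing described above, under stated assumptions: the real `parseCategories` returns the project's `Category` values, whereas this version returns plain strings, and the trimming, lower-casing, de-duplication, and error behavior for unknown or empty input are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// allCategories lists the nine names recognized by the dump command.
var allCategories = []string{
	"password", "cookie", "bookmark", "history", "download",
	"creditcard", "extension", "localstorage", "sessionstorage",
}

// parseCategories splits the -c value on commas. "all" expands to every
// category; unknown names are rejected (assumed behavior).
func parseCategories(s string) ([]string, error) {
	known := make(map[string]bool, len(allCategories))
	for _, c := range allCategories {
		known[c] = true
	}
	seen := make(map[string]bool)
	var out []string
	for _, part := range strings.Split(s, ",") {
		name := strings.ToLower(strings.TrimSpace(part))
		switch {
		case name == "":
			continue // skip empty segments such as a trailing comma
		case name == "all":
			// Return a copy so callers cannot mutate allCategories.
			return append([]string(nil), allCategories...), nil
		case !known[name]:
			return nil, fmt.Errorf("unknown category %q", name)
		case !seen[name]:
			seen[name] = true
			out = append(out, name)
		}
	}
	if len(out) == 0 {
		return nil, fmt.Errorf("no categories given")
	}
	return out, nil
}
```

For the command line shown in the data-flow example, `parseCategories("password,cookie")` would yield the two names in the order given.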

@@ -121,7 +121,7 @@ File permissions are restrictive: directories `0750`, files `0600` (data may con

 ```
 CLI: hack-browser-data dump -b chrome -c password,cookie -f csv -d results
-  → PickBrowsers(name="chrome")       → []Browser
+  → DiscoverBrowsersWithKeys(name="chrome")       → []Browser
  → parseCategories("password,cookie") → []Category
  → NewWriter("results", "csv")        → *Writer
  → for each browser: