refactor(browser): split installation and profile abstractions (#603)

* refactor(browser): split installation and profile abstractions

A Chromium installation shares one master key across its profiles, but
modeling each profile as its own Browser re-derived the key per profile.
Browser now represents one installation holding its profiles and derives
the key once; new types.Profile/ExtractResult/CountResult carry per-profile
results.

* style: gofumpt safari_test.go

* test(chromium): rename shadowed loop var to path
Roger
2026-05-31 16:37:23 +08:00
committed by GitHub
parent d5dc81f1c0
commit b901f7dff0
28 changed files with 1359 additions and 1206 deletions
+7 -7
@@ -13,14 +13,14 @@ Key constraints:
- **Go 1.20** — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (`log/slog`, `slices`, `maps`, `cmp`) must not be used.
- **Supported engines**: Chromium (including Yandex and Opera variants) and Firefox.
- **Supported platforms**: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service).
- **No root-level library API** — the CLI calls `browser.PickBrowsers()` directly; there is no importable `pkg/` surface.
- **No root-level library API** — the CLI calls `browser.DiscoverBrowsersWithKeys()` directly; there is no importable `pkg/` surface.
## 2. Directory Structure
```
HackBrowserData/
├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version
├── browser/ # Browser interface, PickBrowsers(), platform browser lists
├── browser/ # Browser interface, DiscoverBrowsersWithKeys(), platform browser lists
│ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery
│ └── firefox/ # Firefox engine: extraction, NSS key derivation
├── types/ # Data model: Category enum, Entry structs, BrowserData
@@ -82,14 +82,14 @@ See `types/category.go` for the authoritative enum definition.
There are two entry points, one for extraction and one for discovery:
```
PickBrowsers(opts) // used by `dump` — ready to Extract
DiscoverBrowsersWithKeys(opts) // used by `dump` — ready to Extract
→ pickFromConfigs(configs, opts) // shared discovery core
→ platformBrowsers() // build-tagged list for this OS
→ filter by name / profile path
→ newBrowsers(cfg) // dispatch to chromium/firefox/safari.NewBrowsers
→ discoverProfiles() // scan profile subdirectories
→ resolveSourcePaths() // stat candidates, first match wins
→ newPlatformInjector(opts) // build-tagged: returns a func(Browser)
→ newCredentialInjector(opts) // build-tagged: returns a browserInjector
→ for each browser: // closure captures retriever + keychain pw lazily
inject(b) // type-assert retrieverSetter / keychainPasswordSetter
@@ -97,7 +97,7 @@ DiscoverBrowsers(opts) // used by `list` / `list --detail`
→ pickFromConfigs(configs, opts) // same shared discovery core, NO injection
```
`PickBrowsers` does discovery + decryption setup in one call; the returned
`DiscoverBrowsersWithKeys` does discovery + decryption setup in one call; the returned
browsers are ready for `b.Extract`. `DiscoverBrowsers` skips injection
entirely, so list-style commands never trigger the macOS Keychain password
prompt — they have no use for the credential. Both entry points share the
@@ -106,8 +106,8 @@ consistent.
Key design decisions:
- **One KeyRetriever chain per process** — built lazily inside `newPlatformInjector` and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS.
- **Discovery is decoupled from injection** — `pickFromConfigs` is injection-free; `DiscoverBrowsers` stops after it, `PickBrowsers` continues into injection.
- **One KeyRetriever chain per process** — built lazily inside `newCredentialInjector` and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS.
- **Discovery is decoupled from injection** — `pickFromConfigs` is injection-free; `DiscoverBrowsers` stops after it, `DiscoverBrowsersWithKeys` continues into injection.
- **Profile discovery differs by engine**: Chromium looks for `Preferences` files in subdirectories; Firefox accepts any subdirectory containing known source files.
- **Flat layout fallback** — Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory.
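The first two design decisions above can be illustrated with a minimal sketch. It is not the project's code: `Browser`, `chromiumBrowser`, `firefoxBrowser`, `injector`, `discover`, and `discoverWithKeys` are hypothetical stand-ins, and the `SetRetriever` method and its string payload are assumptions. Only the idea is taken from the text: discovery never injects, and the retriever chain is built lazily, once, and only for browsers that satisfy the capability interface.

```go
package main

import (
	"fmt"
	"sync"
)

// Browser is a hypothetical stand-in for the project's browser interface.
type Browser interface{ Name() string }

// retrieverSetter mirrors the capability interface named in the call tree;
// the method name and the string payload are assumptions for illustration.
type retrieverSetter interface{ SetRetriever(chain string) }

type chromiumBrowser struct{ name, chain string }

func (c *chromiumBrowser) Name() string              { return c.name }
func (c *chromiumBrowser) SetRetriever(chain string) { c.chain = chain }

type firefoxBrowser struct{ name string }

func (f *firefoxBrowser) Name() string { return f.name }

// injector builds the retriever chain at most once and reuses it.
type injector struct {
	once   sync.Once
	chain  string
	builds int // counts how often the chain was constructed
}

func (in *injector) inject(b Browser) {
	rs, ok := b.(retrieverSetter)
	if !ok {
		return // this browser has no use for the credential
	}
	in.once.Do(func() {
		in.builds++
		in.chain = "shared-chain" // placeholder for the real retriever chain
	})
	rs.SetRetriever(in.chain)
}

// discover is injection-free, like the shared discovery core.
func discover() []Browser {
	return []Browser{
		&chromiumBrowser{name: "chrome"},
		&chromiumBrowser{name: "edge"},
		&firefoxBrowser{name: "firefox"},
	}
}

// discoverWithKeys continues from discovery into injection.
func discoverWithKeys(in *injector) []Browser {
	browsers := discover()
	for _, b := range browsers {
		in.inject(b)
	}
	return browsers
}

func main() {
	in := &injector{}
	for _, b := range discoverWithKeys(in) {
		fmt.Println(b.Name())
	}
	fmt.Println("chain builds:", in.builds)
}
```

Because the chain is created inside `sync.Once`, a list-style caller that only runs `discover` never constructs it at all.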
+3 -3
@@ -36,7 +36,7 @@ The return value is the **ready-to-use decryption key** — either the raw AES k
`ChainRetriever` wraps multiple retrievers and tries them in order. The first successful result wins. If all fail, errors from every retriever are combined into a single error.
**Caching**: the retriever chain is created once per process inside `newPlatformInjector` (see `browser/browser_{darwin,linux,windows}.go`) and shared across every Chromium browser and every profile. macOS retrievers additionally use `sync.Once` internally, so multi-profile browsers only trigger one keychain prompt or memory dump.
**Caching**: the retriever chain is created once per process inside `newCredentialInjector` (see `browser/browser_{darwin,linux,windows}.go`) and shared across every Chromium browser and every profile. macOS retrievers additionally use `sync.Once` internally, so multi-profile browsers only trigger one keychain prompt or memory dump.
## 3. macOS Key Retrieval
@@ -122,7 +122,7 @@ Windows populates two slots of the `keyretriever.Retrievers` struct — V10 (leg
| V10 | `DPAPIRetriever` | `os_crypt.encrypted_key` | `CryptUnprotectData` (Crypt32.dll) |
| V20 | `ABERetriever` | `os_crypt.app_bound_encrypted_key` | IElevator via reflective injection (see [RFC-010](010-chrome-abe-integration.md)) |
`browser/browser_windows.go::newPlatformInjector` calls `keyretriever.DefaultRetrievers()` and wires the resulting struct through `Browser.SetKeyRetrievers(r)`. At extract time `keyretriever.NewMasterKeys` runs each slot independently — a failure on one tier does not prevent the other from succeeding, because mixed-tier Chrome profiles (upgraded from pre-127) need partial success to be useful.
`browser/browser_windows.go::newCredentialInjector` calls `keyretriever.DefaultRetrievers()` and wires the resulting struct through `Browser.SetKeyRetrievers(r)`. At extract time `keyretriever.NewMasterKeys` runs each slot independently — a failure on one tier does not prevent the other from succeeding, because mixed-tier Chrome profiles (upgraded from pre-127) need partial success to be useful.
**Why not a ChainRetriever?** `ChainRetriever` has first-success semantics: once ABE returns a key, DPAPI is never called. That semantics is wrong for orthogonal tiers — it was the root cause of issue #578, where upgraded profiles' v10-encrypted passwords silently failed because only the v20 key was retrieved. `NewMasterKeys` evaluates each tier independently and returns an `errors.Join` of per-tier failures; log severity is a caller-side decision. `browser/chromium::getMasterKeys` currently logs all tier errors uniformly at `Warnf` — the distinction between "partial" and "total" failure was judged low-value for a short-lived CLI where all warn lines are visible in the default output.
@@ -214,7 +214,7 @@ Future contributors adding a new macOS browser that reads credentials from the K
### 7.3 Where the `--keychain-pw` Password Goes
The macOS login password is resolved once at startup by `browser/browser_darwin.go::resolveKeychainPassword`, then delivered to both consumers from within a single platform-specific closure, `newPlatformInjector` (defined per platform in `browser/browser_{darwin,linux,windows}.go`). The closure captures both the retriever chain and the raw password, and applies whichever capability interface each Browser happens to satisfy:
The macOS login password is resolved once at startup by `browser/browser_darwin.go::resolveKeychainPassword`, then delivered to both consumers from within a single platform-specific closure, `newCredentialInjector` (defined per platform in `browser/browser_{darwin,linux,windows}.go`). The closure captures both the retriever chain and the raw password, and applies whichever capability interface each Browser happens to satisfy:
| Consumer | Capability interface | Defined in | Payload |
|---|---|---|---|
+2 -2
@@ -28,7 +28,7 @@ The primary command. Extracts, decrypts, and writes browser data to files.
| `--keychain-pw` | | | macOS keychain password |
| `--zip` | | `false` | Compress output to zip |
**Workflow**: PickBrowsers (filter by `-b`) → parseCategories (split `-c` on commas) → NewWriter (select formatter by `-f`) → Extract loop (each browser) → Write → optional CompressDir.
**Workflow**: DiscoverBrowsersWithKeys (filter by `-b`) → parseCategories (split `-c` on commas) → NewWriter (select formatter by `-f`) → Extract loop (each browser) → Write → optional CompressDir.
The nine recognized categories are: `password`, `cookie`, `bookmark`, `history`, `download`, `creditcard`, `extension`, `localstorage`, `sessionstorage`. The string `"all"` maps to all nine.
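The category parsing step could be sketched as follows. This is an illustration rather than the CLI's implementation: the real `parseCategories` returns `[]Category`, whereas this hypothetical `parseCategoryNames` works on plain strings, and its handling of whitespace, duplicates, and unknown names is assumed. It avoids the `slices` package to stay within Go 1.20.

```go
package main

import (
	"fmt"
	"strings"
)

// allCategories lists the nine recognized category names from the text.
var allCategories = []string{
	"password", "cookie", "bookmark", "history", "download",
	"creditcard", "extension", "localstorage", "sessionstorage",
}

// parseCategoryNames splits the -c value on commas and validates each name.
// The string "all" expands to every category.
func parseCategoryNames(s string) ([]string, error) {
	known := make(map[string]bool, len(allCategories))
	for _, c := range allCategories {
		known[c] = true
	}
	seen := make(map[string]bool)
	var out []string
	for _, part := range strings.Split(s, ",") {
		name := strings.ToLower(strings.TrimSpace(part))
		switch {
		case name == "":
			continue // tolerate stray commas (assumed behaviour)
		case name == "all":
			return append([]string(nil), allCategories...), nil
		case !known[name]:
			return nil, fmt.Errorf("unknown category %q", name)
		case !seen[name]:
			seen[name] = true
			out = append(out, name)
		}
	}
	if len(out) == 0 {
		return nil, fmt.Errorf("no categories given")
	}
	return out, nil
}

func main() {
	cats, err := parseCategoryNames("password,cookie")
	fmt.Println(cats, err)
}
```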
@@ -121,7 +121,7 @@ File permissions are restrictive: directories `0750`, files `0600` (data may con
```
CLI: hack-browser-data dump -b chrome -c password,cookie -f csv -d results
PickBrowsers(name="chrome") → []Browser
DiscoverBrowsersWithKeys(name="chrome") → []Browser
→ parseCategories("password,cookie") → []Category
→ NewWriter("results", "csv") → *Writer
→ for each browser: