# RFC-001: Project Architecture & Data Model **Author**: moonD4rk **Status**: Living Document **Created**: 2026-04-05 ## 1. Project Positioning HackBrowserData is a CLI security research tool that extracts and decrypts browser data from Chromium-based browsers and Firefox across Windows, macOS, and Linux. Key constraints: - **Go 1.20** — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (`log/slog`, `slices`, `maps`, `cmp`) must not be used. - **Supported engines**: Chromium (including Yandex and Opera variants) and Firefox. - **Supported platforms**: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service). - **No root-level library API** — the CLI calls `browser.PickBrowsers()` directly; there is no importable `pkg/` surface. ## 2. Directory Structure ``` HackBrowserData/ ├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version ├── browser/ # Browser interface, PickBrowsers(), platform browser lists │ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery │ └── firefox/ # Firefox engine: extraction, NSS key derivation ├── types/ # Data model: Category enum, Entry structs, BrowserData ├── crypto/ # Encryption primitives, cipher version detection │ └── keyretriever/ # Platform-specific master key retrieval (Keychain/DPAPI/D-Bus) ├── filemanager/ # Temp file session, locked file handling (Windows) ├── output/ # Output Writer: CSV, JSON, CookieEditor formatters ├── log/ # Logging with level filtering └── utils/ # SQLite query helpers, file utilities ``` ## 3. Core Data Model ### 3.1 Category `Category` is an `int` enum representing 9 browser-agnostic data kinds: Password, Cookie, Bookmark, History, Download, CreditCard, Extension, LocalStorage, SessionStorage. Three categories are classified as **sensitive** (Password, Cookie, CreditCard) via `IsSensitive()`, enabling safe-by-default export scenarios. ### 3.2 Entry Types Each category has a corresponding Entry struct with `json` and `csv` struct tags. All structs are flat (no nesting) and use `time.Time` for timestamps. | Struct | Category | Key Fields | |--------|----------|------------| | `LoginEntry` | Password | URL, Username, Password, CreatedAt | | `CookieEntry` | Cookie | Host, Path, Name, Value, IsSecure, IsHTTPOnly, ExpireAt, CreatedAt | | `BookmarkEntry` | Bookmark | Name, URL, Folder, CreatedAt | | `HistoryEntry` | History | URL, Title, VisitCount, LastVisit | | `DownloadEntry` | Download | URL, TargetPath, TotalBytes, StartTime, EndTime | | `CreditCardEntry` | CreditCard | Name, Number, ExpMonth, ExpYear | | `ExtensionEntry` | Extension | Name, ID, Description, Version | | `StorageEntry` | LocalStorage, SessionStorage | URL, Key, Value | `StorageEntry` is shared by both LocalStorage and SessionStorage. ### 3.3 BrowserData Container `BrowserData` is the result container returned by `Extract()`. It holds typed slices — one per category. The container is populated field-by-field during extraction. The output layer uses `makeExtractor[T]()` generics to pull the correct slice for serialization. ## 4. Browser Interface & Registration ### 4.1 BrowserKind Each config declares an engine kind that determines source paths and extraction logic. Kinds fall into three engine families: - **Chromium** (`Chromium`, `ChromiumYandex`, `ChromiumOpera`) — the standard Chromium layout plus two variants that override file names or storage paths for Yandex and Opera forks. See RFC-003. - **Firefox** — NSS-based key derivation from `key4.db`, SQLite + JSON source files. See RFC-005. - **Safari** — macOS only, with direct Keychain-based credential extraction. See RFC-006 §7. See `types/category.go` for the authoritative enum definition. ### 4.2 BrowserConfig `BrowserConfig` is the declarative, platform-specific browser definition containing: Key (CLI matching; also the Windows ABE / winutil.Table identifier when WindowsABE is true), Name (display), Kind (engine), KeychainLabel (macOS Keychain / Linux D-Bus Secret Service label), WindowsABE (bool — enable Windows App-Bound Encryption v20 path), UserDataDir (data path). ### 4.3 Browser Selection Flow There are two entry points, one for extraction and one for discovery: ``` PickBrowsers(opts) // used by `dump` — ready to Extract → pickFromConfigs(configs, opts) // shared discovery core → platformBrowsers() // build-tagged list for this OS → filter by name / profile path → newBrowsers(cfg) // dispatch to chromium/firefox/safari.NewBrowsers → discoverProfiles() // scan profile subdirectories → resolveSourcePaths() // stat candidates, first match wins → newPlatformInjector(opts) // build-tagged: returns a func(Browser) → for each browser: // closure captures retriever + keychain pw lazily inject(b) // type-assert retrieverSetter / keychainPasswordSetter DiscoverBrowsers(opts) // used by `list` / `list --detail` → pickFromConfigs(configs, opts) // same shared discovery core, NO injection ``` `PickBrowsers` does discovery + decryption setup in one call; the returned browsers are ready for `b.Extract`. `DiscoverBrowsers` skips injection entirely, so list-style commands never trigger the macOS Keychain password prompt — they have no use for the credential. Both entry points share the same `pickFromConfigs` core, so filtering/profile-path/glob semantics stay consistent. Key design decisions: - **One KeyRetriever chain per process** — built lazily inside `newPlatformInjector` and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS. - **Discovery is decoupled from injection** — `pickFromConfigs` is injection-free; `DiscoverBrowsers` stops after it, `PickBrowsers` continues into injection. - **Profile discovery differs by engine**: Chromium looks for `Preferences` files in subdirectories; Firefox accepts any subdirectory containing known source files. - **Flat layout fallback** — Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory. ### 4.4 Platform Browser Lists Browser configs are defined per-platform via build tags in `platformBrowsers()` (`browser/browser_{darwin,linux,windows}.go`). The supported set groups by engine family: - **Chromium-based** — the largest family, covering mainstream browsers (Chrome, Edge, Brave, Vivaldi, Opera, Chromium) across all three platforms plus regional variants and forks. Windows carries the longest list because of China-region Chromium forks (360, QQ, Sogou, DC, …) and MSIX-packaged browsers with dynamic install paths (Arc, DuckDuckGo). - **Firefox** — all three platforms, via internal NSS key derivation (RFC-005). - **Safari** — macOS only, via direct Keychain `InternetPassword` extraction (RFC-006 §7). Adding a new browser is a config-only change in `platformBrowsers()`; this section does not need updates for new variants within an existing family. ## 5. Extract() Orchestration Both Chromium and Firefox engines follow the same extraction pattern: ``` Extract(categories) 1. NewSession() → create isolated temp directory 2. acquireFiles(session) → copy source files to temp dir (with dedup and WAL/SHM) 3. getMasterKey(session) → platform-specific key retrieval 4. for each category: extractCategory(data, cat, masterKey, path) 5. defer session.Cleanup() → remove temp directory ``` For details on file acquisition, see [RFC-008](008-file-acquisition-and-platform-quirks.md). For encryption details, see [RFC-003](003-chromium-encryption.md) (Chromium) and [RFC-005](005-firefox-encryption.md) (Firefox). For key retrieval, see [RFC-006](006-key-retrieval-mechanisms.md). ### 5.1 Collect-and-Continue Pattern The extraction loop maximizes data recovery. Each category is extracted independently — a failure in one does not affect others. Errors are handled at three levels: | Level | Trigger | Action | |-------|---------|--------| | **Session failure** | Temp dir cannot be created | Abort entirely, return error | | **Category failure** | Source file missing or extraction error | Skip category, continue to next | | **Record failure** | Single row decryption fails | Skip record, continue extraction | **Master key failure is non-fatal.** If the key cannot be retrieved, categories requiring decryption (passwords, cookies, credit cards) produce empty values, while non-encrypted categories (history, bookmarks, downloads) still succeed. ### 5.2 Custom Extractors The `categoryExtractor` interface allows browser-specific extraction logic. Yandex and Opera use custom extractors for passwords and extensions respectively, while all other categories fall through to the default Chromium implementation. ## 6. Dependency Constraints The module is pinned to `go 1.20` in `go.mod`. This is enforced by a CI lint check that fails if the directive changes. | Dependency | Version | Purpose | |-----------|---------|---------| | `modernc.org/sqlite` | v1.31.1 (pinned) | Pure-Go SQLite. v1.32+ requires Go 1.21 | | `github.com/syndtr/goleveldb` | v1.0.0 | LevelDB for Chromium localStorage/sessionStorage | | `github.com/tidwall/gjson` | v1.18.0 | JSON path queries | | `github.com/spf13/cobra` | v1.10.2 | CLI framework | | `github.com/moond4rk/keychainbreaker` | v0.2.5 | macOS keychain decryption | | `github.com/godbus/dbus/v5` | v5.2.2 | Linux D-Bus Secret Service | | `golang.org/x/sys` | v0.27.0 | Windows syscalls (DPAPI, DuplicateHandle) | ## Related RFCs | RFC | Topic | |-----|-------| | [RFC-002](002-chromium-data-storage.md) | Chromium data file locations and storage formats | | [RFC-003](003-chromium-encryption.md) | Chromium encryption mechanisms per platform | | [RFC-004](004-firefox-data-storage.md) | Firefox data file locations and storage formats | | [RFC-005](005-firefox-encryption.md) | Firefox NSS encryption and key derivation | | [RFC-006](006-key-retrieval-mechanisms.md) | Platform-specific master key retrieval | | [RFC-007](007-cli-and-output-design.md) | CLI commands and output formats | | [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition and platform quirks | | [RFC-009](009-windows-locked-file-bypass.md) | Windows locked file bypass technique |