RFC-001: Project Architecture & Data Model
Author: moonD4rk Status: Living Document Created: 2026-04-05
1. Project Positioning
HackBrowserData is a CLI security research tool that extracts and decrypts browser data from Chromium-based browsers and Firefox across Windows, macOS, and Linux.
Key constraints:
- Go 1.20: the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (log/slog, slices, maps, cmp) must not be used.
- Supported engines: Chromium (including Yandex and Opera variants) and Firefox.
- Supported platforms: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service).
- No root-level library API: the CLI calls browser.PickBrowsers() directly; there is no importable pkg/ surface.
2. Directory Structure
HackBrowserData/
├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version
├── browser/ # Browser interface, PickBrowsers(), platform browser lists
│ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery
│ └── firefox/ # Firefox engine: extraction, NSS key derivation
├── types/ # Data model: Category enum, Entry structs, BrowserData
├── crypto/ # Encryption primitives, cipher version detection
│ └── keyretriever/ # Platform-specific master key retrieval (Keychain/DPAPI/D-Bus)
├── filemanager/ # Temp file session, locked file handling (Windows)
├── output/ # Output Writer: CSV, JSON, CookieEditor formatters
├── log/ # Logging with level filtering
└── utils/ # SQLite query helpers, file utilities
3. Core Data Model
3.1 Category
Category is an int enum representing 9 browser-agnostic data kinds: Password, Cookie, Bookmark, History, Download, CreditCard, Extension, LocalStorage, SessionStorage.
Three categories are classified as sensitive (Password, Cookie, CreditCard) via IsSensitive(), enabling safe-by-default export scenarios.
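As a minimal sketch, the enum and its sensitivity check could look like the following. The constant order and the NonSensitive helper are assumptions made for illustration; the project source holds the authoritative definition.

```go
package main

// Category identifies one browser-agnostic kind of extracted data.
// The names follow the list above; the numeric order is an assumption.
type Category int

const (
	Password Category = iota
	Cookie
	Bookmark
	History
	Download
	CreditCard
	Extension
	LocalStorage
	SessionStorage
)

// IsSensitive reports whether the category holds credentials or payment data.
func (c Category) IsSensitive() bool {
	switch c {
	case Password, Cookie, CreditCard:
		return true
	default:
		return false
	}
}

// NonSensitive is a hypothetical helper that keeps only the categories
// that are safe to export by default.
func NonSensitive(cats []Category) []Category {
	var out []Category
	for _, c := range cats {
		if !c.IsSensitive() {
			out = append(out, c)
		}
	}
	return out
}
```

A caller that wants a safe default export could pass the full category list through NonSensitive and require an explicit opt-in for the rest.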
3.2 Entry Types
Each category has a corresponding Entry struct with json and csv struct tags. All structs are flat (no nesting) and use time.Time for timestamps.
| Struct | Category | Key Fields |
|---|---|---|
| LoginEntry | Password | URL, Username, Password, CreatedAt |
| CookieEntry | Cookie | Host, Path, Name, Value, IsSecure, IsHTTPOnly, ExpireAt, CreatedAt |
| BookmarkEntry | Bookmark | Name, URL, Folder, CreatedAt |
| HistoryEntry | History | URL, Title, VisitCount, LastVisit |
| DownloadEntry | Download | URL, TargetPath, TotalBytes, StartTime, EndTime |
| CreditCardEntry | CreditCard | Name, Number, ExpMonth, ExpYear |
| ExtensionEntry | Extension | Name, ID, Description, Version |
| StorageEntry | LocalStorage, SessionStorage | URL, Key, Value |
StorageEntry is shared by both LocalStorage and SessionStorage.
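To illustrate why flat structs with json and csv tags are convenient, here is a sketch of one entry type plus a hypothetical csvHeader helper that derives a header row from the tags. The field names come from the table above; the tag values and the helper itself are assumptions, not the project's actual code.

```go
package main

import (
	"reflect"
	"time"
)

// CookieEntry mirrors the field list in the table above.
// The snake_case tag values are assumptions for illustration.
type CookieEntry struct {
	Host       string    `json:"host" csv:"host"`
	Path       string    `json:"path" csv:"path"`
	Name       string    `json:"name" csv:"name"`
	Value      string    `json:"value" csv:"value"`
	IsSecure   bool      `json:"is_secure" csv:"is_secure"`
	IsHTTPOnly bool      `json:"is_http_only" csv:"is_http_only"`
	ExpireAt   time.Time `json:"expire_at" csv:"expire_at"`
	CreatedAt  time.Time `json:"created_at" csv:"created_at"`
}

// csvHeader (hypothetical) reads the csv tag of every top-level field.
// Because entries are flat, a single pass without recursion is enough.
// It expects a struct value or a pointer to a struct.
func csvHeader(entry any) []string {
	t := reflect.TypeOf(entry)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	header := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		if name := t.Field(i).Tag.Get("csv"); name != "" {
			header = append(header, name)
		}
	}
	return header
}
```

With nested structs the same helper would need recursion and a naming scheme for nested columns, which is one practical argument for keeping every entry flat.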
3.3 BrowserData Container
BrowserData is the result container returned by Extract(). It holds typed slices — one per category. The container is populated field-by-field during extraction. The output layer uses makeExtractor[T]() generics to pull the correct slice for serialization.
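The following sketch shows one way a generic makeExtractor[T] could bridge typed slices to a format-agnostic writer. Only the names BrowserData and makeExtractor come from this document; the signature, the reduced entry structs, and the field names Passwords and Histories are assumptions.

```go
package main

// Reduced entry types for the sketch.
type LoginEntry struct{ URL, Username, Password string }
type HistoryEntry struct {
	URL, Title string
	VisitCount int
}

// BrowserData holds one typed slice per category (two shown here).
type BrowserData struct {
	Passwords []LoginEntry
	Histories []HistoryEntry
}

// rowExtractor returns the rows of one category as untyped values,
// which is what a generic CSV or JSON writer would consume.
type rowExtractor func(*BrowserData) []any

// makeExtractor adapts a typed slice accessor into a rowExtractor.
// The signature is an assumption about how such a helper could look.
func makeExtractor[T any](pick func(*BrowserData) []T) rowExtractor {
	return func(d *BrowserData) []any {
		items := pick(d)
		rows := make([]any, len(items))
		for i, item := range items {
			rows[i] = item
		}
		return rows
	}
}
```

The type parameter is inferred from the accessor, so registering a category is one line, for example `makeExtractor(func(d *BrowserData) []LoginEntry { return d.Passwords })`. Generics have been available since Go 1.18, so this stays within the Go 1.20 constraint.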
4. Browser Interface & Registration
4.1 BrowserKind
Each config declares an engine kind that determines source paths and extraction logic. Kinds fall into three engine families:
- Chromium (Chromium, ChromiumYandex, ChromiumOpera): the standard Chromium layout plus two variants that override file names or storage paths for Yandex and Opera forks. See RFC-003.
- Firefox: NSS-based key derivation from key4.db, SQLite + JSON source files. See RFC-005.
- Safari: macOS only, with direct Keychain-based credential extraction. See RFC-006 §7.
See types/category.go for the authoritative enum definition.
4.2 BrowserConfig
BrowserConfig is the declarative, platform-specific browser definition containing: Key (CLI matching), Name (display), Kind (engine), Storage (keychain label), UserDataDir (data path).
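A declarative config of this shape might be sketched as below. The five field names are taken from the description above; the field types, the BrowserKind constants' order, the example paths, and the filterByKey helper are illustrative assumptions rather than the project's actual values.

```go
package main

import (
	"path/filepath"
	"strings"
)

// BrowserKind names the engine; the order of constants is an assumption.
type BrowserKind int

const (
	Chromium BrowserKind = iota
	ChromiumYandex
	ChromiumOpera
	Firefox
	Safari
)

// BrowserConfig is a pure data description of one browser on one platform.
type BrowserConfig struct {
	Key         string      // CLI matching
	Name        string      // display name
	Kind        BrowserKind // engine
	Storage     string      // keychain label
	UserDataDir string      // data path
}

// exampleConfigs returns illustrative macOS-style entries (assumed paths).
func exampleConfigs(home string) []BrowserConfig {
	support := filepath.Join(home, "Library", "Application Support")
	return []BrowserConfig{
		{Key: "chrome", Name: "Chrome", Kind: Chromium, Storage: "Chrome",
			UserDataDir: filepath.Join(support, "Google", "Chrome")},
		{Key: "firefox", Name: "Firefox", Kind: Firefox,
			UserDataDir: filepath.Join(support, "Firefox", "Profiles")},
	}
}

// filterByKey (hypothetical) matches the CLI name against Key,
// ignoring case. An empty name selects every config.
func filterByKey(configs []BrowserConfig, name string) []BrowserConfig {
	if name == "" {
		return configs
	}
	var out []BrowserConfig
	for _, c := range configs {
		if strings.EqualFold(c.Key, name) {
			out = append(out, c)
		}
	}
	return out
}
```

Because the config carries no behavior, adding a browser only means appending one more struct literal to the platform list.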
4.3 Browser Selection Flow
There are two entry points, one for extraction and one for discovery:
PickBrowsers(opts) // used by `dump` — ready to Extract
→ pickFromConfigs(configs, opts) // shared discovery core
→ platformBrowsers() // build-tagged list for this OS
→ filter by name / profile path
→ newBrowsers(cfg) // dispatch to chromium/firefox/safari.NewBrowsers
→ discoverProfiles() // scan profile subdirectories
→ resolveSourcePaths() // stat candidates, first match wins
→ newPlatformInjector(opts) // build-tagged: returns a func(Browser)
→ for each browser: // closure captures retriever + keychain pw lazily
inject(b) // type-assert retrieverSetter / keychainPasswordSetter
DiscoverBrowsers(opts) // used by `list` / `list --detail`
→ pickFromConfigs(configs, opts) // same shared discovery core, NO injection
PickBrowsers does discovery + decryption setup in one call; the returned
browsers are ready for b.Extract. DiscoverBrowsers skips injection
entirely, so list-style commands never trigger the macOS Keychain password
prompt — they have no use for the credential. Both entry points share the
same pickFromConfigs core, so filtering/profile-path/glob semantics stay
consistent.
Key design decisions:
- One KeyRetriever chain per process: built lazily inside newPlatformInjector and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS.
- Discovery is decoupled from injection: pickFromConfigs is injection-free; DiscoverBrowsers stops after it, PickBrowsers continues into injection.
- Profile discovery differs by engine: Chromium looks for Preferences files in subdirectories; Firefox accepts any subdirectory containing known source files.
- Flat layout fallback: Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory.
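The lazy, shared retriever and the type-asserted injection from the design decisions above can be sketched as follows. The name retrieverSetter comes from the selection flow; the method names SetRetriever and RetrieveKey, the reduced Browser interface, and the newInjector constructor are assumptions for illustration.

```go
package main

import "sync"

// Browser is reduced to a single method for this sketch.
type Browser interface{ Name() string }

// KeyRetriever fetches a master key; the method name is an assumption.
type KeyRetriever interface {
	RetrieveKey(storage string) ([]byte, error)
}

// retrieverSetter is implemented only by browsers that need a retriever.
type retrieverSetter interface{ SetRetriever(KeyRetriever) }

type chromiumBrowser struct {
	name      string
	retriever KeyRetriever
}

func (b *chromiumBrowser) Name() string                { return b.name }
func (b *chromiumBrowser) SetRetriever(r KeyRetriever) { b.retriever = r }

type firefoxBrowser struct{ name string }

func (b *firefoxBrowser) Name() string { return b.name }

// staticRetriever stands in for the real platform chain.
type staticRetriever struct{}

func (staticRetriever) RetrieveKey(string) ([]byte, error) { return []byte("key"), nil }

// newInjector returns a func(Browser) that builds the retriever at most
// once, and only when the first browser that needs it is injected.
func newInjector(build func() KeyRetriever) func(Browser) {
	var (
		once   sync.Once
		shared KeyRetriever
	)
	return func(b Browser) {
		setter, ok := b.(retrieverSetter)
		if !ok {
			return // this engine does not use a platform retriever
		}
		once.Do(func() { shared = build() })
		setter.SetRetriever(shared)
	}
}
```

Since nothing is built until a browser actually asserts to retrieverSetter, a run that only touches Firefox never constructs the chain, and a run over many Chromium profiles constructs it once.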
4.4 Platform Browser Lists
Browser configs are defined per-platform via build tags in platformBrowsers() (browser/browser_{darwin,linux,windows}.go). The supported set groups by engine family:
- Chromium-based: the largest family, covering mainstream browsers (Chrome, Edge, Brave, Vivaldi, Opera, Chromium) across all three platforms plus regional variants and forks. Windows carries the longest list because of China-region Chromium forks (360, QQ, Sogou, DC, …) and MSIX-packaged browsers with dynamic install paths (Arc, DuckDuckGo).
- Firefox: all three platforms, via internal NSS key derivation (RFC-005).
- Safari: macOS only, via direct Keychain InternetPassword extraction (RFC-006 §7).
Adding a new browser is a config-only change in platformBrowsers(); this section does not need updates for new variants within an existing family.
5. Extract() Orchestration
Both Chromium and Firefox engines follow the same extraction pattern:
Extract(categories)
1. NewSession() → create isolated temp directory
2. acquireFiles(session) → copy source files to temp dir (with dedup and WAL/SHM)
3. getMasterKey(session) → platform-specific key retrieval
4. for each category:
extractCategory(data, cat, masterKey, path)
5. defer session.Cleanup() → remove temp directory
For details on file acquisition, see RFC-008. For encryption details, see RFC-003 (Chromium) and RFC-005 (Firefox). For key retrieval, see RFC-006.
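The session lifecycle in steps 1, 2, and 5 can be sketched with the standard library alone. This is a simplified illustration: the Session type, its methods, and withSession are assumed names, and the sketch omits the dedup and WAL/SHM handling that RFC-008 covers.

```go
package main

import (
	"io"
	"os"
	"path/filepath"
)

// Session owns an isolated temp directory for one extraction run.
type Session struct{ Dir string }

// NewSession creates the temp directory (step 1).
func NewSession() (*Session, error) {
	dir, err := os.MkdirTemp("", "session-*")
	if err != nil {
		return nil, err
	}
	return &Session{Dir: dir}, nil
}

// Acquire copies src into the session directory and returns the copy's path (step 2).
func (s *Session) Acquire(src string) (string, error) {
	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()
	dst := filepath.Join(s.Dir, filepath.Base(src))
	out, err := os.Create(dst)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return "", err
	}
	return dst, out.Close()
}

// Cleanup removes the temp directory and everything in it (step 5).
func (s *Session) Cleanup() error { return os.RemoveAll(s.Dir) }

// withSession (hypothetical) runs fn on the copies that could be acquired.
// The deferred Cleanup is registered right after creation, so it runs even
// when fn returns an error.
func withSession(sources []string, fn func(copies []string) error) error {
	s, err := NewSession()
	if err != nil {
		return err // session failure aborts the whole run
	}
	defer s.Cleanup()
	var copies []string
	for _, src := range sources {
		p, err := s.Acquire(src)
		if err != nil {
			continue // a missing source only affects its own category
		}
		copies = append(copies, p)
	}
	return fn(copies)
}
```

Working on copies means the extraction never reads the live profile files directly after acquisition, and a single RemoveAll is enough to leave no residue behind.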
5.1 Collect-and-Continue Pattern
The extraction loop maximizes data recovery. Each category is extracted independently — a failure in one does not affect others. Errors are handled at three levels:
| Level | Trigger | Action |
|---|---|---|
| Session failure | Temp dir cannot be created | Abort entirely, return error |
| Category failure | Source file missing or extraction error | Skip category, continue to next |
| Record failure | Single row decryption fails | Skip record, continue extraction |
Master key failure is non-fatal. If the key cannot be retrieved, categories requiring decryption (passwords, cookies, credit cards) produce empty values, while non-encrypted categories (history, bookmarks, downloads) still succeed.
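A reduced sketch of the category and record levels of this pattern is shown below. All names (record, source, decryptRow, extractAll) are hypothetical, the "decryption" is a placeholder, and modeling a missing master key as every encrypted record being skipped is an assumption about how the empty values arise.

```go
package main

import (
	"errors"
	"fmt"
)

type Category string

// record is one row; Encrypted marks rows that need the master key.
type record struct {
	URL       string
	Encrypted bool
	Payload   string
}

// source maps a category to its rows. An absent key models a missing file.
type source map[Category][]record

// decryptRow is a placeholder for real decryption.
func decryptRow(key []byte, r record) (string, error) {
	if !r.Encrypted {
		return r.Payload, nil
	}
	if len(key) == 0 {
		return "", errors.New("no master key")
	}
	if r.Payload == "" {
		return "", errors.New("corrupt row")
	}
	return "plain:" + r.Payload, nil
}

// extractAll never stops early: it collects every error and keeps going.
func extractAll(cats []Category, src source, key []byte) (map[Category][]string, []error) {
	out := make(map[Category][]string)
	var errs []error
	for _, cat := range cats {
		rows, ok := src[cat]
		if !ok {
			errs = append(errs, fmt.Errorf("%s: source missing", cat))
			continue // category failure: skip to the next category
		}
		for _, r := range rows {
			v, err := decryptRow(key, r)
			if err != nil {
				errs = append(errs, fmt.Errorf("%s %s: %w", cat, r.URL, err))
				continue // record failure: skip this row only
			}
			out[cat] = append(out[cat], v)
		}
	}
	return out, errs
}
```

Returning the collected errors alongside the partial result lets the caller log what was skipped without losing the data that was recovered.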
5.2 Custom Extractors
The categoryExtractor interface allows browser-specific extraction logic. Yandex and Opera use custom extractors for passwords and extensions respectively, while all other categories fall through to the default Chromium implementation.
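The override-then-fallback dispatch could be sketched like this. Only the name categoryExtractor is taken from the text; the method signature, the engine type, the tagged helper, and the file names are assumptions chosen to keep the example small.

```go
package main

type Category int

const (
	Password Category = iota
	Extension
	History
)

// BrowserData is reduced to a log of what ran, for this sketch.
type BrowserData struct{ Log []string }

// categoryExtractor: the method signature is an assumption.
type categoryExtractor interface {
	extract(masterKey []byte, path string, data *BrowserData) error
}

// extractorFunc lets a plain function satisfy categoryExtractor.
type extractorFunc func(masterKey []byte, path string, data *BrowserData) error

func (f extractorFunc) extract(k []byte, p string, d *BrowserData) error {
	return f(k, p, d)
}

// tagged builds a stub extractor that records which implementation ran.
func tagged(label string) categoryExtractor {
	return extractorFunc(func(_ []byte, path string, d *BrowserData) error {
		d.Log = append(d.Log, label+":"+path)
		return nil
	})
}

// engine holds per-variant overrides keyed by category.
type engine struct {
	custom map[Category]categoryExtractor
}

// extractorFor prefers a variant override and otherwise falls through
// to the default implementation.
func (e engine) extractorFor(cat Category) categoryExtractor {
	if x, ok := e.custom[cat]; ok {
		return x
	}
	return tagged("default")
}
```

A variant then only registers the categories it handles differently, for example `engine{custom: map[Category]categoryExtractor{Password: tagged("yandex")}}`, and inherits everything else.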
6. Dependency Constraints
The module is pinned to go 1.20 in go.mod. This is enforced by a CI lint check that fails if the directive changes.
| Dependency | Version | Purpose |
|---|---|---|
| modernc.org/sqlite | v1.31.1 (pinned) | Pure-Go SQLite. v1.32+ requires Go 1.21 |
| github.com/syndtr/goleveldb | v1.0.0 | LevelDB for Chromium localStorage/sessionStorage |
| github.com/tidwall/gjson | v1.18.0 | JSON path queries |
| github.com/spf13/cobra | v1.10.2 | CLI framework |
| github.com/moond4rk/keychainbreaker | v0.2.5 | macOS keychain decryption |
| github.com/godbus/dbus/v5 | v5.2.2 | Linux D-Bus Secret Service |
| golang.org/x/sys | v0.27.0 | Windows syscalls (DPAPI, DuplicateHandle) |
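How the CI check on the go directive is implemented is not described here. As a hypothetical illustration, a check of that kind could be as small as the following; goDirective and checkGoPin are assumed names, and the parsing is deliberately simple rather than a full go.mod parser.

```go
package main

import (
	"bufio"
	"strings"
)

// goDirective returns the version named by the first "go" directive in
// the given go.mod content, or "" if none is found.
func goDirective(gomod string) string {
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i] // drop trailing comments
		}
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "go" {
			return fields[1]
		}
	}
	return ""
}

// checkGoPin reports whether go.mod still declares the pinned version.
func checkGoPin(gomod, want string) bool {
	return goDirective(gomod) == want
}
```

A CI step could read go.mod, call checkGoPin with "1.20", and fail the build when it returns false, which also catches dependency upgrades that bump the directive as a side effect.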
Related RFCs
| RFC | Topic |
|---|---|
| RFC-002 | Chromium data file locations and storage formats |
| RFC-003 | Chromium encryption mechanisms per platform |
| RFC-004 | Firefox data file locations and storage formats |
| RFC-005 | Firefox NSS encryption and key derivation |
| RFC-006 | Platform-specific master key retrieval |
| RFC-007 | CLI commands and output formats |
| RFC-008 | File acquisition and platform quirks |
| RFC-009 | Windows locked file bypass technique |