* docs: rewrite README, RFCs, and CONTRIBUTING * docs: fix Linux storage labels in RFC-006 (Opera/Vivaldi swapped)
8.5 KiB
RFC-001: Project Architecture & Data Model
Author: moonD4rk Status: Living Document Created: 2026-04-05
1. Project Positioning
HackBrowserData is a CLI security research tool that extracts and decrypts browser data from Chromium-based browsers and Firefox across Windows, macOS, and Linux.
Key constraints:
- Go 1.20 — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (
log/slog,slices,maps,cmp) must not be used. - Supported engines: Chromium (including Yandex and Opera variants) and Firefox.
- Supported platforms: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service).
- No root-level library API — the CLI calls
browser.PickBrowsers()directly; there is no importablepkg/surface.
2. Directory Structure
HackBrowserData/
├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version
├── browser/ # Browser interface, PickBrowsers(), platform browser lists
│ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery
│ └── firefox/ # Firefox engine: extraction, NSS key derivation
├── types/ # Data model: Category enum, Entry structs, BrowserData
├── crypto/ # Encryption primitives, cipher version detection
│ └── keyretriever/ # Platform-specific master key retrieval (Keychain/DPAPI/D-Bus)
├── filemanager/ # Temp file session, locked file handling (Windows)
├── output/ # Output Writer: CSV, JSON, CookieEditor formatters
├── log/ # Logging with level filtering
└── utils/ # SQLite query helpers, file utilities
3. Core Data Model
3.1 Category
Category is an int enum representing 9 browser-agnostic data kinds: Password, Cookie, Bookmark, History, Download, CreditCard, Extension, LocalStorage, SessionStorage.
Three categories are classified as sensitive (Password, Cookie, CreditCard) via IsSensitive(), enabling safe-by-default export scenarios.
3.2 Entry Types
Each category has a corresponding Entry struct with json and csv struct tags. All structs are flat (no nesting) and use time.Time for timestamps.
| Struct | Category | Key Fields |
|---|---|---|
LoginEntry |
Password | URL, Username, Password, CreatedAt |
CookieEntry |
Cookie | Host, Path, Name, Value, IsSecure, IsHTTPOnly, ExpireAt, CreatedAt |
BookmarkEntry |
Bookmark | Name, URL, Folder, CreatedAt |
HistoryEntry |
History | URL, Title, VisitCount, LastVisit |
DownloadEntry |
Download | URL, TargetPath, TotalBytes, StartTime, EndTime |
CreditCardEntry |
CreditCard | Name, Number, ExpMonth, ExpYear |
ExtensionEntry |
Extension | Name, ID, Description, Version |
StorageEntry |
LocalStorage, SessionStorage | URL, Key, Value |
StorageEntry is shared by both LocalStorage and SessionStorage.
3.3 BrowserData Container
BrowserData is the result container returned by Extract(). It holds typed slices — one per category. The container is populated field-by-field during extraction. The output layer uses makeExtractor[T]() generics to pull the correct slice for serialization.
4. Browser Interface & Registration
4.1 BrowserKind
Four engine kinds determine source paths and extractors:
| Kind | Description |
|---|---|
Chromium |
Standard Chromium layout |
ChromiumYandex |
Yandex variant: different file names and SQL queries |
ChromiumOpera |
Opera variant: different extension key, Roaming path on Windows |
Firefox |
Firefox: NSS encryption, SQLite + JSON files |
4.2 BrowserConfig
BrowserConfig is the declarative, platform-specific browser definition containing: Key (CLI matching), Name (display), Kind (engine), Storage (keychain label), UserDataDir (data path).
4.3 PickBrowsers() Flow
PickBrowsers(opts)
→ platformBrowsers() // build-tagged: returns []BrowserConfig for this OS
→ pickFromConfigs(configs, opts) // filter by name, apply profile-path/keychain overrides
→ newBrowsers(cfg) // dispatch by Kind to chromium.NewBrowsers or firefox.NewBrowsers
→ discoverProfiles() // scan for profile subdirectories
→ resolveSourcePaths() // stat each candidate path, first match wins
Key design decisions:
- One KeyRetriever per browser — created once and shared across all profiles to prevent repeated keychain prompts on macOS.
- Profile discovery differs by engine: Chromium looks for
Preferencesfiles in subdirectories; Firefox accepts any subdirectory containing known source files. - Flat layout fallback — Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory.
4.4 Platform Browser Lists
Browser configs are defined per-platform via build tags:
- macOS — 12 browsers (Chrome, Edge, Chromium, Chrome Beta, Opera, OperaGX, Vivaldi, CocCoc, Brave, Yandex, Arc, Firefox)
- Windows — 16 browsers (all macOS minus Arc, plus 360 Speed, 360 Speed X, QQ, DC, Sogou)
- Linux — 8 browsers (Chrome, Edge, Chromium, Chrome Beta, Opera, Vivaldi, Brave, Firefox)
5. Extract() Orchestration
Both Chromium and Firefox engines follow the same extraction pattern:
Extract(categories)
1. NewSession() → create isolated temp directory
2. acquireFiles(session) → copy source files to temp dir (with dedup and WAL/SHM)
3. getMasterKey(session) → platform-specific key retrieval
4. for each category:
extractCategory(data, cat, masterKey, path)
5. defer session.Cleanup() → remove temp directory
For details on file acquisition, see RFC-008. For encryption details, see RFC-003 (Chromium) and RFC-005 (Firefox). For key retrieval, see RFC-006.
5.1 Collect-and-Continue Pattern
The extraction loop maximizes data recovery. Each category is extracted independently — a failure in one does not affect others. Errors are handled at three levels:
| Level | Trigger | Action |
|---|---|---|
| Session failure | Temp dir cannot be created | Abort entirely, return error |
| Category failure | Source file missing or extraction error | Skip category, continue to next |
| Record failure | Single row decryption fails | Skip record, continue extraction |
Master key failure is non-fatal. If the key cannot be retrieved, categories requiring decryption (passwords, cookies, credit cards) produce empty values, while non-encrypted categories (history, bookmarks, downloads) still succeed.
5.2 Custom Extractors
The categoryExtractor interface allows browser-specific extraction logic. Yandex and Opera use custom extractors for passwords and extensions respectively, while all other categories fall through to the default Chromium implementation.
6. Dependency Constraints
The module is pinned to go 1.20 in go.mod. This is enforced by a CI lint check that fails if the directive changes.
| Dependency | Version | Purpose |
|---|---|---|
modernc.org/sqlite |
v1.31.1 (pinned) | Pure-Go SQLite. v1.32+ requires Go 1.21 |
github.com/syndtr/goleveldb |
v1.0.0 | LevelDB for Chromium localStorage/sessionStorage |
github.com/tidwall/gjson |
v1.18.0 | JSON path queries |
github.com/spf13/cobra |
v1.10.2 | CLI framework |
github.com/moond4rk/keychainbreaker |
v0.2.5 | macOS keychain decryption |
github.com/godbus/dbus/v5 |
v5.2.2 | Linux D-Bus Secret Service |
golang.org/x/sys |
v0.27.0 | Windows syscalls (DPAPI, DuplicateHandle) |
Related RFCs
| RFC | Topic |
|---|---|
| RFC-002 | Chromium data file locations and storage formats |
| RFC-003 | Chromium encryption mechanisms per platform |
| RFC-004 | Firefox data file locations and storage formats |
| RFC-005 | Firefox NSS encryption and key derivation |
| RFC-006 | Platform-specific master key retrieval |
| RFC-007 | CLI commands and output formats |
| RFC-008 | File acquisition and platform quirks |
| RFC-009 | Windows locked file bypass technique |