mirror of
https://github.com/moonD4rk/HackBrowserData.git
synced 2026-05-19 18:58:03 +02:00
182 lines
10 KiB
Markdown
182 lines
10 KiB
Markdown
# RFC-001: Project Architecture & Data Model
|
|
|
|
**Author**: moonD4rk
|
|
**Status**: Living Document
|
|
**Created**: 2026-04-05
|
|
|
|
## 1. Project Positioning
|
|
|
|
HackBrowserData is a CLI security research tool that extracts and decrypts browser data from Chromium-based browsers and Firefox across Windows, macOS, and Linux.
|
|
|
|
Key constraints:
|
|
|
|
- **Go 1.20** — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (`log/slog`, `slices`, `maps`, `cmp`) must not be used.
|
|
- **Supported engines**: Chromium (including Yandex and Opera variants) and Firefox.
|
|
- **Supported platforms**: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service).
|
|
- **No root-level library API** — the CLI calls `browser.PickBrowsers()` directly; there is no importable `pkg/` surface.
|
|
|
|
## 2. Directory Structure
|
|
|
|
```
|
|
HackBrowserData/
|
|
├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version
|
|
├── browser/ # Browser interface, PickBrowsers(), platform browser lists
|
|
│ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery
|
|
│ └── firefox/ # Firefox engine: extraction, NSS key derivation
|
|
├── types/ # Data model: Category enum, Entry structs, BrowserData
|
|
├── crypto/ # Encryption primitives, cipher version detection
|
|
│ └── keyretriever/ # Platform-specific master key retrieval (Keychain/DPAPI/D-Bus)
|
|
├── filemanager/ # Temp file session, locked file handling (Windows)
|
|
├── output/ # Output Writer: CSV, JSON, CookieEditor formatters
|
|
├── log/ # Logging with level filtering
|
|
└── utils/ # SQLite query helpers, file utilities
|
|
```
|
|
|
|
## 3. Core Data Model
|
|
|
|
### 3.1 Category
|
|
|
|
`Category` is an `int` enum representing 9 browser-agnostic data kinds: Password, Cookie, Bookmark, History, Download, CreditCard, Extension, LocalStorage, SessionStorage.
|
|
|
|
Three categories are classified as **sensitive** (Password, Cookie, CreditCard) via `IsSensitive()`, enabling safe-by-default export scenarios.
|
|
|
|
### 3.2 Entry Types
|
|
|
|
Each category has a corresponding Entry struct with `json` and `csv` struct tags. All structs are flat (no nesting) and use `time.Time` for timestamps.
|
|
|
|
| Struct | Category | Key Fields |
|
|
|--------|----------|------------|
|
|
| `LoginEntry` | Password | URL, Username, Password, CreatedAt |
|
|
| `CookieEntry` | Cookie | Host, Path, Name, Value, IsSecure, IsHTTPOnly, ExpireAt, CreatedAt |
|
|
| `BookmarkEntry` | Bookmark | Name, URL, Folder, CreatedAt |
|
|
| `HistoryEntry` | History | URL, Title, VisitCount, LastVisit |
|
|
| `DownloadEntry` | Download | URL, TargetPath, TotalBytes, StartTime, EndTime |
|
|
| `CreditCardEntry` | CreditCard | Name, Number, ExpMonth, ExpYear |
|
|
| `ExtensionEntry` | Extension | Name, ID, Description, Version |
|
|
| `StorageEntry` | LocalStorage, SessionStorage | URL, Key, Value |
|
|
|
|
`StorageEntry` is shared by both LocalStorage and SessionStorage.
|
|
|
|
### 3.3 BrowserData Container
|
|
|
|
`BrowserData` is the result container returned by `Extract()`. It holds typed slices — one per category. The container is populated field-by-field during extraction. The output layer uses `makeExtractor[T]()` generics to pull the correct slice for serialization.
|
|
|
|
## 4. Browser Interface & Registration
|
|
|
|
### 4.1 BrowserKind
|
|
|
|
Each config declares an engine kind that determines source paths and extraction logic. Kinds fall into three engine families:
|
|
|
|
- **Chromium** (`Chromium`, `ChromiumYandex`, `ChromiumOpera`) — the standard Chromium layout plus two variants that override file names or storage paths for Yandex and Opera forks. See RFC-003.
|
|
- **Firefox** — NSS-based key derivation from `key4.db`, SQLite + JSON source files. See RFC-005.
|
|
- **Safari** — macOS only, with direct Keychain-based credential extraction. See RFC-006 §7.
|
|
|
|
See `types/category.go` for the authoritative enum definition.
|
|
|
|
### 4.2 BrowserConfig
|
|
|
|
`BrowserConfig` is the declarative, platform-specific browser definition containing: Key (CLI matching; also the Windows ABE / winutil.Table identifier when WindowsABE is true), Name (display), Kind (engine), KeychainLabel (macOS Keychain / Linux D-Bus Secret Service label), WindowsABE (bool — enable Windows App-Bound Encryption v20 path), UserDataDir (data path).
|
|
|
|
### 4.3 Browser Selection Flow
|
|
|
|
There are two entry points, one for extraction and one for discovery:
|
|
|
|
```
|
|
PickBrowsers(opts) // used by `dump` — ready to Extract
|
|
→ pickFromConfigs(configs, opts) // shared discovery core
|
|
→ platformBrowsers() // build-tagged list for this OS
|
|
→ filter by name / profile path
|
|
→ newBrowsers(cfg) // dispatch to chromium/firefox/safari.NewBrowsers
|
|
→ discoverProfiles() // scan profile subdirectories
|
|
→ resolveSourcePaths() // stat candidates, first match wins
|
|
→ newPlatformInjector(opts) // build-tagged: returns a func(Browser)
|
|
→ for each browser: // closure captures retriever + keychain pw lazily
|
|
inject(b) // type-assert retrieverSetter / keychainPasswordSetter
|
|
|
|
DiscoverBrowsers(opts) // used by `list` / `list --detail`
|
|
→ pickFromConfigs(configs, opts) // same shared discovery core, NO injection
|
|
```
|
|
|
|
`PickBrowsers` does discovery + decryption setup in one call; the returned
|
|
browsers are ready for `b.Extract`. `DiscoverBrowsers` skips injection
|
|
entirely, so list-style commands never trigger the macOS Keychain password
|
|
prompt — they have no use for the credential. Both entry points share the
|
|
same `pickFromConfigs` core, so filtering/profile-path/glob semantics stay
|
|
consistent.
|
|
|
|
Key design decisions:
|
|
|
|
- **One KeyRetriever chain per process** — built lazily inside `newPlatformInjector` and reused across every Chromium browser and every profile to prevent repeated keychain prompts on macOS.
|
|
- **Discovery is decoupled from injection** — `pickFromConfigs` is injection-free; `DiscoverBrowsers` stops after it, `PickBrowsers` continues into injection.
|
|
- **Profile discovery differs by engine**: Chromium looks for `Preferences` files in subdirectories; Firefox accepts any subdirectory containing known source files.
|
|
- **Flat layout fallback** — Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory.
|
|
|
|
### 4.4 Platform Browser Lists
|
|
|
|
Browser configs are defined per-platform via build tags in `platformBrowsers()` (`browser/browser_{darwin,linux,windows}.go`). The supported set groups by engine family:
|
|
|
|
- **Chromium-based** — the largest family, covering mainstream browsers (Chrome, Edge, Brave, Vivaldi, Opera, Chromium) across all three platforms plus regional variants and forks. Windows carries the longest list because of China-region Chromium forks (360, QQ, Sogou, DC, …) and MSIX-packaged browsers with dynamic install paths (Arc, DuckDuckGo).
|
|
- **Firefox** — all three platforms, via internal NSS key derivation (RFC-005).
|
|
- **Safari** — macOS only, via direct Keychain `InternetPassword` extraction (RFC-006 §7).
|
|
|
|
Adding a new browser is a config-only change in `platformBrowsers()`; this section does not need updates for new variants within an existing family.
|
|
|
|
## 5. Extract() Orchestration
|
|
|
|
Both Chromium and Firefox engines follow the same extraction pattern:
|
|
|
|
```
|
|
Extract(categories)
|
|
1. NewSession() → create isolated temp directory
|
|
2. acquireFiles(session) → copy source files to temp dir (with dedup and WAL/SHM)
|
|
3. getMasterKey(session) → platform-specific key retrieval
|
|
4. for each category:
|
|
extractCategory(data, cat, masterKey, path)
|
|
5. defer session.Cleanup() → remove temp directory
|
|
```
|
|
|
|
For details on file acquisition, see [RFC-008](008-file-acquisition-and-platform-quirks.md). For encryption details, see [RFC-003](003-chromium-encryption.md) (Chromium) and [RFC-005](005-firefox-encryption.md) (Firefox). For key retrieval, see [RFC-006](006-key-retrieval-mechanisms.md).
|
|
|
|
### 5.1 Collect-and-Continue Pattern
|
|
|
|
The extraction loop maximizes data recovery. Each category is extracted independently — a failure in one does not affect others. Errors are handled at three levels:
|
|
|
|
| Level | Trigger | Action |
|
|
|-------|---------|--------|
|
|
| **Session failure** | Temp dir cannot be created | Abort entirely, return error |
|
|
| **Category failure** | Source file missing or extraction error | Skip category, continue to next |
|
|
| **Record failure** | Single row decryption fails | Skip record, continue extraction |
|
|
|
|
**Master key failure is non-fatal.** If the key cannot be retrieved, categories requiring decryption (passwords, cookies, credit cards) produce empty values, while non-encrypted categories (history, bookmarks, downloads) still succeed.
|
|
|
|
### 5.2 Custom Extractors
|
|
|
|
The `categoryExtractor` interface allows browser-specific extraction logic. Yandex and Opera use custom extractors for passwords and extensions respectively, while all other categories fall through to the default Chromium implementation.
|
|
|
|
## 6. Dependency Constraints
|
|
|
|
The module is pinned to `go 1.20` in `go.mod`. This is enforced by a CI lint check that fails if the directive changes.
|
|
|
|
| Dependency | Version | Purpose |
|
|
|-----------|---------|---------|
|
|
| `modernc.org/sqlite` | v1.31.1 (pinned) | Pure-Go SQLite. v1.32+ requires Go 1.21 |
|
|
| `github.com/syndtr/goleveldb` | v1.0.0 | LevelDB for Chromium localStorage/sessionStorage |
|
|
| `github.com/tidwall/gjson` | v1.18.0 | JSON path queries |
|
|
| `github.com/spf13/cobra` | v1.10.2 | CLI framework |
|
|
| `github.com/moond4rk/keychainbreaker` | v0.2.5 | macOS keychain decryption |
|
|
| `github.com/godbus/dbus/v5` | v5.2.2 | Linux D-Bus Secret Service |
|
|
| `golang.org/x/sys` | v0.27.0 | Windows syscalls (DPAPI, DuplicateHandle) |
|
|
|
|
## Related RFCs
|
|
|
|
| RFC | Topic |
|
|
|-----|-------|
|
|
| [RFC-002](002-chromium-data-storage.md) | Chromium data file locations and storage formats |
|
|
| [RFC-003](003-chromium-encryption.md) | Chromium encryption mechanisms per platform |
|
|
| [RFC-004](004-firefox-data-storage.md) | Firefox data file locations and storage formats |
|
|
| [RFC-005](005-firefox-encryption.md) | Firefox NSS encryption and key derivation |
|
|
| [RFC-006](006-key-retrieval-mechanisms.md) | Platform-specific master key retrieval |
|
|
| [RFC-007](007-cli-and-output-design.md) | CLI commands and output formats |
|
|
| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition and platform quirks |
|
|
| [RFC-009](009-windows-locked-file-bypass.md) | Windows locked file bypass technique |
|