diff --git a/README.md b/README.md index 56d9003..af3f9eb 100644 --- a/README.md +++ b/README.md @@ -8,6 +8,8 @@ `HackBrowserData` is a command-line tool for decrypting and exporting browser data (passwords, history, cookies, bookmarks, credit cards, download history, localStorage, sessionStorage and extensions) from the browser. It supports the most popular Chromium-based browsers and Firefox on Windows, macOS and Linux, plus Safari on macOS. +It can also decrypt data **across machines and operating systems**: export the master keys on the origin host, then decrypt a copy of the data offline on any other host — even for a browser that the analyst host's OS cannot run (see [Cross-host decryption](#cross-host-decryption)). + > Disclaimer: This tool is only intended for security research. Users are responsible for all legal and related liabilities resulting from the use of this tool. The original author does not assume any legal responsibility. ## Supported Data Categories @@ -42,19 +44,21 @@ | Vivaldi | ✅ | ✅ | ✅ | | Yandex | ✅ | ✅ | - | | CocCoc | ✅² | ✅ | - | -| Arc | - | ✅ | - | -| DuckDuckGo | ✅ | - | - | -| QQ | ✅ | - | - | -| 360 ChromeX | ✅ | - | - | -| 360 Chrome | ✅ | - | - | -| DC Browser | ✅ | - | - | -| Sogou Explorer | ✅ | - | - | +| Arc | ✅ | ✅ | - | +| DuckDuckGo³ | ✅ | - | - | +| QQ³ | ✅ | - | - | +| 360 ChromeX³ | ✅ | - | - | +| 360 Chrome³ | ✅ | - | - | +| DC Browser³ | ✅ | - | - | +| Sogou Explorer³| ✅ | - | - | | Firefox | ✅ | ✅ | ✅ | | Safari¹ | - | ✅ | - | > ¹ Safari requires Full Disk Access; enable it in System Settings → Privacy & Security → Full Disk Access if extraction returns empty results. > > ² On Windows, decrypting Chromium 127+ cookies (Chrome / Chrome Beta / Edge / Brave / CocCoc) requires the App-Bound Encryption payload built via `make build-windows` — see [Building from source](#building-from-source) below. 
+> +> ³ These browsers ship only on Windows, but their data is **decryptable on any OS**: pull the files with `archive`, export the keys with `dumpkeys`, then decrypt on macOS or Linux with `restore` — see [Cross-host decryption](#cross-host-decryption). ## Getting Started @@ -117,16 +121,19 @@ Usage: hack-browser-data [command] Available Commands: + archive Pack decryption-relevant profile files into a zip for cross-host restore dump Extract and decrypt browser data (default command) + dumpkeys Export Chromium master keys as JSON for cross-host decryption help Help about any command list List detected browsers and profiles + restore Decrypt copied profile data using exported master keys version Print version information Flags: -b, --browser string target browser: all|chrome|firefox|edge|... (default "all") -c, --category string data categories (comma-separated): all|password,cookie,... (default "all") -d, --dir string output directory (default "results") - -f, --format string output format: csv|json|cookie-editor (default "csv") + -f, --format string output format: csv|json|cookie-editor (default "json") -h, --help help for hack-browser-data --keychain-pw string macOS keychain password -p, --profile-path string custom profile dir path, get with chrome://version @@ -144,12 +151,88 @@ Running `hack-browser-data` without a subcommand defaults to `dump`. |------------------|-------|-----------|--------------------------------------------------------------------------------------------------------------------------------------------| | `--browser` | `-b` | `all` | Target browser (all\|chrome\|firefox\|edge\|...) 
| | `--category` | `-c` | `all` | Data categories, comma-separated (all\|password\|cookie\|bookmark\|history\|download\|creditcard\|extension\|localstorage\|sessionstorage) | -| `--format` | `-f` | `csv` | Output format (csv\|json\|cookie-editor) | +| `--format` | `-f` | `json` | Output format (csv\|json\|cookie-editor) | | `--dir` | `-d` | `results` | Output directory | | `--profile-path` | `-p` | | Custom profile dir path, get with chrome://version | | `--keychain-pw` | | | macOS keychain password | | `--zip` | | `false` | Compress output to zip | +> `--format cookie-editor` writes **only cookies**, as a JSON array matching the Cookie-Editor browser extension's import format; non-cookie categories are skipped. + +### Cross-host decryption + +Decrypt browser data on an **analyst host** that was collected on a different **origin host** — including a browser whose engine the analyst's OS cannot even install (e.g. decrypt Sogou or QQ Browser data on macOS). Nothing platform-bound (DPAPI, macOS Keychain, Chrome App-Bound Encryption) has to leave the origin: the master keys are exported once, and decryption then runs entirely offline from a copy of the data. + +The workflow uses three commands and two transportable artifacts: + +| Step | Host | Command | Produces | +|------|------|---------|----------| +| 1 | origin | `dumpkeys` | `keys.json` — portable master keys | +| 2 | origin | `archive` | `browser-data.zip` — only the files needed to decrypt | +| 3 | analyst | `restore` | decrypted output (csv / json / cookie-editor) | + +```bash +# On the origin host (any OS) — export the keys and pack the data +hack-browser-data dumpkeys -o keys.json +hack-browser-data archive -o browser-data.zip + +# Copy keys.json + browser-data.zip to the analyst host, then decrypt offline +hack-browser-data restore --keys keys.json --data-zip browser-data.zip +``` + +> `keys.json` contains plaintext master keys — treat it as a secret. 
`dumpkeys -o` writes it with `0600` permissions; prefer streaming it over a secure channel instead of leaving it on disk. + +#### `dumpkeys` - Export master keys for cross-host decryption + +Derives each Chromium installation's master keys on the origin host and writes them as JSON (Firefox / Safari have no portable key and are skipped). Defaults to stdout so it can be piped over SSH. + +| Flag | Short | Default | Description | +|-----------------|-------|----------|-------------------------------------------------| +| `--browser` | `-b` | `all` | Target browser (all\|chrome\|edge\|...) | +| `--output` | `-o` | *stdout* | Output file (written `0600`); stdout if omitted | +| `--keychain-pw` | | | macOS keychain password | + +#### `archive` - Pack decryption-relevant files for transport + +Collects only the files a restore actually needs (cookies, login data, history, …) through the same locked-file bypass used for extraction, so live SQLite files are read safely on Windows. The zip is laid out as `/`, so one archive can carry several browsers and restore stays unambiguous. Entry names are always forward-slash, so a Windows-produced archive restores on macOS / Linux. + +| Flag | Short | Default | Description | +|--------------|-------|--------------------|-----------------------------------------| +| `--browser` | `-b` | `all` | Target browser (all\|chrome\|edge\|...) | +| `--category` | `-c` | `all` | Data categories, comma-separated | +| `--output` | `-o` | `browser-data.zip` | Output archive path | + +#### `restore` - Decrypt copied data with exported keys + +Rebuilds each Chromium engine straight from `keys.json` and decrypts the supplied data — it never consults the analyst's local browser table, so **the browsers you can restore are exactly the vaults in your `keys.json`**. Supply the data one of two ways (exactly one is required): + +- `--data-zip` — a zip produced by `archive`; extracted to a temp dir and removed afterward. +- `--data-dir` — a directory. 
Either the `archive` layout (`/...`, several browsers at once), or one browser's hand-copied `User Data` root, which is unambiguous only for a single browser — so pair it with `-b`. + +`-b` is an **optional filter** over the dump's vaults, not a required selector. + +| Flag | Short | Default | Description | +|--------------|-------|------------|------------------------------------------------------------| +| `--keys` | | *required* | Keys file from `dumpkeys` (use `-` for stdin) | +| `--data-zip` | | | Zip from `archive` (mutually exclusive with `--data-dir`) | +| `--data-dir` | | | Copied data dir (mutually exclusive with `--data-zip`) | +| `--browser` | `-b` | | Restore only this browser; must match a vault in `--keys` | +| `--category` | `-c` | `all` | Data categories, comma-separated | +| `--format` | `-f` | `json` | Output format (csv\|json\|cookie-editor) | +| `--dir` | `-d` | `results` | Output directory | +| `--zip` | | `false` | Compress output to zip | + +#### Cross-host examples + +```bash +# Stream keys over SSH (no keys.json on disk), data copied separately +ssh origin "hack-browser-data dumpkeys" | \ + hack-browser-data restore --keys - --data-zip browser-data.zip + +# Restore one browser from a hand-copied User Data folder (no archive) +hack-browser-data restore --keys keys.json --data-dir ./chrome-userdata -b chrome +``` + ### `list` - List detected browsers and profiles | Flag | Default | Description | @@ -177,8 +260,8 @@ hack-browser-data # Extract specific browser and categories hack-browser-data dump -b chrome -c password,cookie -# Export in JSON format to a custom directory -hack-browser-data dump -b chrome -f json -d output +# Export in CSV format to a custom directory (JSON is the default) +hack-browser-data dump -b chrome -f csv -d output # Export cookies in CookieEditor format hack-browser-data dump -f cookie-editor diff --git a/browser/chromium/extract_creditcard.go b/browser/chromium/extract_creditcard.go index 62bb1ea..af9ba2a 100644 --- 
a/browser/chromium/extract_creditcard.go +++ b/browser/chromium/extract_creditcard.go @@ -61,7 +61,7 @@ func extractCreditCards(masterKeys masterkey.MasterKeys, path string) ([]types.C return cards, nil } -// extractYandexCreditCards reads the records table (not Chromium's credit_cards). AAD = guid. See RFC-012 §4. +// extractYandexCreditCards reads the records table (not Chromium's credit_cards). AAD = guid. func extractYandexCreditCards(masterKeys masterkey.MasterKeys, path string) ([]types.CreditCardEntry, error) { dataKey, err := loadYandexDataKey(path, masterKeys.V10) if err != nil { diff --git a/browser/chromium/extract_password.go b/browser/chromium/extract_password.go index 4e40658..159c417 100644 --- a/browser/chromium/extract_password.go +++ b/browser/chromium/extract_password.go @@ -51,7 +51,7 @@ func extractPasswordsWithQuery(masterKeys masterkey.MasterKeys, path, query stri return logins, nil } -// extractYandexPasswords walks Ya Passman Data; protocol in RFC-012 §4. +// extractYandexPasswords walks Ya Passman Data. // Note: URL column is origin_url — it's what the per-row AAD is computed over (not action_url). func extractYandexPasswords(masterKeys masterkey.MasterKeys, path string) ([]types.LoginEntry, error) { dataKey, err := loadYandexDataKey(path, masterKeys.V10) diff --git a/browser/chromium/yandex.go b/browser/chromium/yandex.go index d2eb16b..2938c8d 100644 --- a/browser/chromium/yandex.go +++ b/browser/chromium/yandex.go @@ -73,10 +73,10 @@ func yandexCardAAD(guid string, keyID []byte) []byte { return out } -// errYandexMasterPasswordSet: caller warns + skips; RSA-OAEP unseal is deferred (RFC-012 §6). +// errYandexMasterPasswordSet: caller warns + skips; RSA-OAEP unseal is deferred. var errYandexMasterPasswordSet = errors.New("yandex: profile protected by master password, skipping") -// loadYandexDataKey honors the master-password gate and returns the per-DB data key. See RFC-012 §4.2. 
+// loadYandexDataKey honors the master-password gate and returns the per-DB data key. func loadYandexDataKey(dbPath string, masterKey []byte) ([]byte, error) { if len(masterKey) == 0 { return nil, fmt.Errorf("yandex: master key not available") diff --git a/browser/firefox/masterkey.go b/browser/firefox/masterkey.go index 7489215..5bae231 100644 --- a/browser/firefox/masterkey.go +++ b/browser/firefox/masterkey.go @@ -50,13 +50,11 @@ func readKey4DB(path string) (*key4DB, error) { var record key4DB - // Read metaData table const metaQuery = `SELECT item1, item2 FROM metaData WHERE id = 'password'` if err := db.QueryRow(metaQuery).Scan(&record.globalSalt, &record.passwordCheck); err != nil { return nil, fmt.Errorf("query metaData: %w", err) } - // Read nssPrivate table const nssQuery = `SELECT a11, a102 FROM nssPrivate` rows, err := db.Query(nssQuery) if err != nil { diff --git a/browser/firefox/profile.go b/browser/firefox/profile.go index 40cdf18..ad36293 100644 --- a/browser/firefox/profile.go +++ b/browser/firefox/profile.go @@ -55,7 +55,6 @@ func (p *profile) extract(categories []types.Category) *types.BrowserData { return data } -// count counts entries per category without decryption. func (p *profile) count(categories []types.Category) map[types.Category]int { session, err := filemanager.NewSession() if err != nil { @@ -76,7 +75,6 @@ func (p *profile) count(categories []types.Category) map[types.Category]int { return counts } -// acquireFiles copies source files to the session temp directory. func (p *profile) acquireFiles(session *filemanager.Session, categories []types.Category) map[types.Category]string { tempPaths := make(map[types.Category]string) for _, cat := range categories { @@ -114,7 +112,6 @@ func (p *profile) getMasterKey(session *filemanager.Session, tempPaths map[types return retrieveMasterKey(key4Dst, loginsPath) } -// extractCategory calls the appropriate extract function for a category. 
func (p *profile) extractCategory(data *types.BrowserData, cat types.Category, masterKey []byte, path string) { var err error switch cat { @@ -140,7 +137,6 @@ func (p *profile) extractCategory(data *types.BrowserData, cat types.Category, m } } -// countCategory calls the appropriate count function for a category. func (p *profile) countCategory(cat types.Category, path string) int { var count int var err error diff --git a/browser/safari/profiles_test.go b/browser/safari/profiles_test.go index 6cc3c06..9eadabe 100644 --- a/browser/safari/profiles_test.go +++ b/browser/safari/profiles_test.go @@ -72,7 +72,7 @@ func TestDiscoverSafariProfiles_OrphanUUIDWithoutDBEntry(t *testing.T) { // Profile directory with a History.db exists on disk but is absent from // SafariTabs.db. When the DB is readable and doesn't mention it, we trust // the DB — the orphan stays hidden because production filters profiles - // with no resolvable data in NewBrowsers anyway. Here we assert discovery + // with no resolvable data in NewBrowser anyway. Here we assert discovery // returns only what the DB declares. const dbUUID = "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE" const orphanUUID = "11111111-2222-3333-4444-555555555555" @@ -182,7 +182,7 @@ func TestDiscoverSafariProfiles_DefaultProfileSentinelIgnored(t *testing.T) { func TestDiscoverSafariProfiles_EmptyProfileDirectoryFiltersOutInNewBrowsers(t *testing.T) { // Matches the real 4E2D8DD0 orphan on the author's Mac: a profile dir // listed in neither SafariTabs.db nor containing any extractable data. - // Discovery without the DB surfaces it; NewBrowsers then drops it when + // Discovery without the DB surfaces it; NewBrowser then drops it when // resolveSourcePaths yields zero matches. 
const uuid = "4E2D8DD0-A7D2-4684-939A-898B7675C700" library := t.TempDir() diff --git a/browser/safari/safari.go b/browser/safari/safari.go index 3e7a852..6c6645c 100644 --- a/browser/safari/safari.go +++ b/browser/safari/safari.go @@ -107,10 +107,9 @@ func resolveSourcePaths(sources map[types.Category][]sourcePath) map[types.Categ // Offset from the Core Data epoch (2001-01-01 UTC) to the Unix epoch. const coreDataEpochOffset = 978307200 -// maxCoreDataSeconds is the largest CFAbsoluteTime that still lands inside -// time.Time.MarshalJSON's [1, 9999] year window. Also bounds the float → -// int64 conversion below; Go's spec makes out-of-range conversions return -// an implementation-dependent int64, which could silently corrupt results. +// maxCoreDataSeconds guards against CFAbsoluteTime values that would exceed +// time.Time.MarshalJSON's year-9999 ceiling, and bounds the float→int64 +// conversion below (Go spec: out-of-range result is implementation-dependent). const maxCoreDataSeconds = 252423993600 // coredataTimestamp converts Core Data seconds (CFAbsoluteTime) to UTC. 
diff --git a/browser/safari/safari_test.go b/browser/safari/safari_test.go index 4cc0789..4ae34ee 100644 --- a/browser/safari/safari_test.go +++ b/browser/safari/safari_test.go @@ -76,7 +76,7 @@ func TestNewBrowsers(t *testing.T) { } // --------------------------------------------------------------------------- -// NewBrowsers — multi-profile (macOS 14+ named profiles) +// NewBrowser — multi-profile (macOS 14+ named profiles) // --------------------------------------------------------------------------- func TestNewBrowsers_MultiProfile(t *testing.T) { diff --git a/crypto/asn1pbe.go b/crypto/asn1pbe.go index 3fddc23..4dbd55c 100644 --- a/crypto/asn1pbe.go +++ b/crypto/asn1pbe.go @@ -98,7 +98,7 @@ func (n privateKeyPBE) deriveKeyAndIV(globalSalt []byte) ([]byte, []byte) { return dk[:24], dk[len(dk)-8:] } -// MetaPBE Struct +// passwordCheckPBE Struct // // SEQUENCE (2 elem) // OBJECT IDENTIFIER diff --git a/crypto/crypto.go b/crypto/crypto.go index 1d87976..c2a747d 100644 --- a/crypto/crypto.go +++ b/crypto/crypto.go @@ -137,7 +137,6 @@ func AESGCMDecryptBlob(key, blob, aad []byte) ([]byte, error) { return aead.Open(nil, blob[:gcmNonceSize], blob[gcmNonceSize:], aad) } -// cbcEncrypt adds PKCS5 padding and encrypts plaintext in CBC mode. func cbcEncrypt(block cipher.Block, iv, plaintext []byte) ([]byte, error) { if len(iv) != block.BlockSize() { return nil, errInvalidIVLength @@ -149,7 +148,6 @@ func cbcEncrypt(block cipher.Block, iv, plaintext []byte) ([]byte, error) { return dst, nil } -// cbcDecrypt decrypts ciphertext in CBC mode and removes PKCS5 padding. func cbcDecrypt(block cipher.Block, iv, ciphertext []byte) ([]byte, error) { bs := block.BlockSize() if len(iv) != bs { @@ -172,8 +170,7 @@ func cbcDecrypt(block cipher.Block, iv, ciphertext []byte) ([]byte, error) { return dst, nil } -// paddingZero pads src with zero bytes to the given length. -// Returns src unchanged if already long enough; otherwise returns a new slice. 
+// paddingZero returns src unchanged if already long enough; otherwise a zero-padded new slice. func paddingZero(src []byte, length int) []byte { if len(src) >= length { return src @@ -195,7 +192,6 @@ func pkcs5Padding(src []byte, blockSize int) []byte { return dst } -// pkcs5UnPadding removes PKCS5/PKCS7 padding from src. func pkcs5UnPadding(src []byte, blockSize int) ([]byte, error) { length := len(src) if length == 0 { diff --git a/crypto/pbkdf2.go b/crypto/pbkdf2.go index 56ad1b9..8882b19 100644 --- a/crypto/pbkdf2.go +++ b/crypto/pbkdf2.go @@ -14,7 +14,7 @@ import ( // can get a derived key for e.g. AES-256 (which needs a 32-byte key) by // doing: // -// dk := pbkdf2.Key([]byte("some password"), salt, 4096, 32, sha1.New) +// dk := PBKDF2Key([]byte("some password"), salt, 4096, 32, sha1.New) // // Remember to get a good random salt. At least 8 bytes is recommended by the // RFC. diff --git a/crypto/version.go b/crypto/version.go index 085c273..5fb28f2 100644 --- a/crypto/version.go +++ b/crypto/version.go @@ -23,7 +23,6 @@ const ( // CipherDPAPI is pre-Chrome 80 raw DPAPI encryption (no version prefix). CipherDPAPI CipherVersion = "dpapi" - // versionPrefixLen is the byte length of the version prefix ("v10", "v20"). versionPrefixLen = 3 ) @@ -47,8 +46,6 @@ func DetectVersion(ciphertext []byte) CipherVersion { } } -// stripPrefix removes the version prefix (e.g. "v10") from ciphertext. -// Returns the ciphertext unchanged if no known prefix is found. 
func stripPrefix(ciphertext []byte) []byte { ver := DetectVersion(ciphertext) if ver == CipherV10 || ver == CipherV11 || ver == CipherV12 || ver == CipherV20 { diff --git a/crypto/windows/abe_native/bootstrap_layout.h b/crypto/windows/abe_native/bootstrap_layout.h index f2cc427..8a960ab 100644 --- a/crypto/windows/abe_native/bootstrap_layout.h +++ b/crypto/windows/abe_native/bootstrap_layout.h @@ -5,7 +5,8 @@ #include // BootstrapScratch describes the IPC contract between the C payload running -// inside chrome.exe and the Go injector in our own process. It squats inside +// inside the target browser process (chrome.exe, msedge.exe, brave.exe, etc.) +// and the Go injector in our own process. It squats inside // the target DLL's PE DOS header region. Windows' PE loader ignores the DOS // stub at 0x40..0x77, and we also borrow a few reserved bytes between 0x28 // and 0x3B inside IMAGE_DOS_HEADER. The e_lfanew at 0x3C..0x3F MUST be left diff --git a/crypto/yandex.go b/crypto/yandex.go index 608a60e..0ee3ddb 100644 --- a/crypto/yandex.go +++ b/crypto/yandex.go @@ -22,7 +22,7 @@ var ( errYandexKeyTooShort = errors.New("yandex: decrypted intermediate key shorter than 32 bytes") ) -// DecryptYandexIntermediateKey unwraps the per-DB data key from meta.local_encryptor_data. See RFC-012 §4.2. +// DecryptYandexIntermediateKey unwraps the per-DB data key from meta.local_encryptor_data. func DecryptYandexIntermediateKey(masterKey, blob []byte) ([]byte, error) { idx := bytes.Index(blob, localEncryptorPrefix) if idx < 0 { diff --git a/filemanager/copy_other.go b/filemanager/copy_other.go index 8266e0a..ad7eae9 100644 --- a/filemanager/copy_other.go +++ b/filemanager/copy_other.go @@ -6,7 +6,7 @@ import "fmt" // copyLocked is not supported on non-Windows platforms and always returns an error. // File locking is primarily a Windows issue where Chrome holds exclusive -// locks on Cookie files via SQLite WAL mode. +// locks on Cookie files via PRAGMA locking_mode=EXCLUSIVE. 
func copyLocked(_, _ string) error { return fmt.Errorf("locked file copy not supported on this platform") } diff --git a/masterkey/gcoredump_darwin.go b/masterkey/gcoredump_darwin.go index ef7e2f3..1ad21c2 100644 --- a/masterkey/gcoredump_darwin.go +++ b/masterkey/gcoredump_darwin.go @@ -102,7 +102,6 @@ func DecryptKeychainRecords() ([]keychainbreaker.GenericPassword, error) { return nil, fmt.Errorf("read keychain: %w", err) } - // try each candidate key against the keychain for _, candidate := range candidates { kc, err := keychainbreaker.Open(keychainbreaker.WithBytes(keychainBuf)) if err != nil { @@ -157,7 +156,6 @@ func scanMasterKeyCandidates(corePath string, regions []addressRange) ([]string, if ptr < region.start || ptr > region.end { continue } - // read 24 bytes at the pointer offset offset := ptr - vaddr if offset+0x18 > uint64(len(data)) { continue diff --git a/rfcs/001-project-architecture.md b/rfcs/001-project-architecture.md index 165fc61..a3ba433 100644 --- a/rfcs/001-project-architecture.md +++ b/rfcs/001-project-architecture.md @@ -11,7 +11,7 @@ HackBrowserData is a CLI security research tool that extracts and decrypts brows Key constraints: - **Go 1.20** — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (`log/slog`, `slices`, `maps`, `cmp`) must not be used. -- **Supported engines**: Chromium (including Yandex and Opera variants) and Firefox. +- **Supported engines**: Chromium (including Yandex and Opera variants), Firefox, and Safari. - **Supported platforms**: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service). - **No root-level library API** — the CLI calls `browser.DiscoverBrowsersWithKeys()` directly; there is no importable `pkg/` surface. 
@@ -19,10 +19,11 @@ Key constraints: ``` HackBrowserData/ -├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version +├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, dumpkeys, archive, restore, list, version ├── browser/ # Browser interface, DiscoverBrowsersWithKeys(), platform browser lists │ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery -│ └── firefox/ # Firefox engine: extraction, NSS key derivation +│ ├── firefox/ # Firefox engine: extraction, NSS key derivation +│ └── safari/ # Safari engine: Keychain, Bookmark, History, Downloads (macOS only) ├── types/ # Data model: Category enum, Entry structs, BrowserData ├── crypto/ # Encryption primitives, cipher version detection ├── masterkey/ # Platform-specific master key retrieval (Keychain/DPAPI/D-Bus) @@ -59,7 +60,7 @@ Each category has a corresponding Entry struct with `json` and `csv` struct tags ### 3.3 BrowserData Container -`BrowserData` is the result container returned by `Extract()`. It holds typed slices — one per category. The container is populated field-by-field during extraction. The output layer uses `makeExtractor[T]()` generics to pull the correct slice for serialization. +`BrowserData` is the per-profile data container holding typed slices — one per category, populated field-by-field during extraction. `Extract()` returns `[]ExtractResult`, where each element pairs a `Profile` identity with a `*BrowserData`. The output layer uses `makeExtractor[T]()` generics to pull the correct slice for serialization. ## 4. 
Browser Interface & Registration @@ -91,7 +92,7 @@ DiscoverBrowsersWithKeys(opts) // used by `dump` — ready to → resolveSourcePaths() // stat candidates, first match wins → newCredentialInjector(opts) // build-tagged: returns a browserInjector → for each browser: // closure captures retriever + keychain pw lazily - inject(b) // type-assert retrieverSetter / keychainPasswordSetter + inject(b) // type-assert KeyManager / KeychainPasswordReceiver DiscoverBrowsers(opts) // used by `list` / `list --detail` → discoverFromConfigs(configs, opts) // same shared discovery core, NO injection @@ -118,13 +119,13 @@ Adding a new browser is a config-only change in `platformBrowsers()`; this secti ## 5. Extract() Orchestration -Both Chromium and Firefox engines follow the same extraction pattern: +Both Chromium and Firefox engines follow the same per-profile extraction pattern (Firefox runs it inside each `profile.extract()` call; for Firefox the master key comes from `key4.db` rather than a platform API): ``` -Extract(categories) +Extract(categories) // per-profile: one invocation per profile 1. NewSession() → create isolated temp directory 2. acquireFiles(session) → copy source files to temp dir (with dedup and WAL/SHM) - 3. getMasterKey(session) → platform-specific key retrieval + 3. getMasterKey(session) → platform-specific key retrieval (Firefox: key4.db) 4. for each category: extractCategory(data, cat, masterKey, path) 5. defer session.Cleanup() → remove temp directory @@ -146,7 +147,7 @@ The extraction loop maximizes data recovery. Each category is extracted independ ### 5.2 Custom Extractors -The `categoryExtractor` interface allows browser-specific extraction logic. Yandex and Opera use custom extractors for passwords and extensions respectively, while all other categories fall through to the default Chromium implementation. +The `categoryExtractor` interface allows browser-specific extraction logic. 
Yandex uses custom extractors for passwords and credit cards; Opera uses a custom extractor for extensions. All other categories fall through to the default Chromium implementation. ## 6. Dependency Constraints @@ -160,7 +161,7 @@ The module is pinned to `go 1.20` in `go.mod`. This is enforced by a CI lint che | `github.com/spf13/cobra` | v1.10.2 | CLI framework | | `github.com/moond4rk/keychainbreaker` | v0.2.5 | macOS keychain decryption | | `github.com/godbus/dbus/v5` | v5.2.2 | Linux D-Bus Secret Service | -| `golang.org/x/sys` | v0.27.0 | Windows syscalls (DPAPI, DuplicateHandle) | +| `golang.org/x/sys` | v0.30.0 | Windows syscalls (DPAPI, DuplicateHandle) | ## Related RFCs diff --git a/rfcs/002-chromium-data-storage.md b/rfcs/002-chromium-data-storage.md index 9bc78d0..b463be0 100644 --- a/rfcs/002-chromium-data-storage.md +++ b/rfcs/002-chromium-data-storage.md @@ -33,13 +33,13 @@ Yandex overrides two file names from the standard Chromium layout: | Password | `Login Data` | `Ya Passman Data` | | CreditCard | `Web Data` | `Ya Credit Cards` | -Yandex also uses `action_url` instead of `origin_url` in its password SQL query. +Yandex's password query selects extra columns (`username_element`, `password_element`, `signon_realm`) beyond the standard four; these columns are used to construct the per-row AAD for decryption. The URL column is `origin_url`, same as standard Chromium. -**Important limitation**: Yandex passwords and cookies currently cannot be decrypted because Yandex uses its own proprietary encryption algorithm. Only non-encrypted categories (bookmarks, history, downloads, extensions, storage) produce useful results. +Yandex passwords and credit cards use Yandex's proprietary two-layer encryption (see RFC-012) and are fully supported. Cookie decryption follows standard Chromium v10/v20 paths. 
### 2.2 Opera -Opera differs from standard Chromium in two ways: +Opera differs from standard Chromium in three ways: - **Extension key**: Opera stores extension settings under `extensions.opsettings` in Secure Preferences, instead of the standard `extensions.settings`. - **Windows path**: Opera uses `AppData/Roaming` rather than `AppData/Local`, unlike most Chromium browsers. @@ -106,8 +106,8 @@ No encrypted fields. Shares the same `History` SQLite database as browsing histo ### 4.6 Credit Cards (Web Data -- SQLite) ```sql -SELECT guid, name_on_card, expiration_month, expiration_year, - card_number_encrypted, nickname, billing_address_id FROM credit_cards +SELECT COALESCE(guid, ''), name_on_card, expiration_month, expiration_year, + card_number_encrypted, COALESCE(nickname, ''), COALESCE(billing_address_id, '') FROM credit_cards ``` The `card_number_encrypted` column contains encrypted bytes. diff --git a/rfcs/003-chromium-encryption.md b/rfcs/003-chromium-encryption.md index 28b1006..0f477b2 100644 --- a/rfcs/003-chromium-encryption.md +++ b/rfcs/003-chromium-encryption.md @@ -18,6 +18,7 @@ Every encrypted value begins with a 3-byte prefix that identifies the cipher ver |--------|---------|---------| | `v10` | CipherV10 | Chrome 80+ standard encryption (AES-GCM on Windows, AES-CBC on macOS/Linux) | | `v11` | CipherV11 | Linux-only: AES-CBC variant where the key comes from libsecret / kwallet. Same algorithm and parameters as `v10` — only the key source differs | +| `v12` | CipherV12 | Chromium SecretPortal/Flatpak (xdg-desktop-portal) — recognized by the version detector so a clear error can be returned; not yet implemented | | `v20` | CipherV20 | Chrome 127+ App-Bound Encryption | | (none) | CipherDPAPI | Pre-Chrome 80 raw DPAPI encryption (Windows only, no prefix) | @@ -92,20 +93,15 @@ Decryption uses AES-128-CBC with a fixed IV of 16 space bytes (`0x20`) and PKCS5 ## 6. 
v20 App-Bound Encryption (Chrome 127+) -Chrome 127 introduced App-Bound Encryption on Windows, identified by the `v20` prefix. This scheme binds the encryption key to the Chrome application identity, making it harder for external tools to decrypt. After decryption, the payload contains a 32-byte application header before the actual plaintext: +Chrome 127 introduced App-Bound Encryption on Windows, identified by the `v20` prefix. This scheme binds the encryption key to the Chrome application identity. The key is a 32-byte AES-256 key retrieved via reflective injection into the browser process (`ABERetriever`). Ciphertext layout: ``` -| v20 | nonce | AES-GCM payload | +| v20 | nonce | AES-GCM ciphertext + auth tag | |-------|--------|-------------------------------------| | 3B | 12B | remaining bytes | - -After decryption: -| app-bound header | plaintext | -|------------------|------------------------------------| -| 32B | remaining bytes | ``` -**Current status**: v20 decryption is not yet implemented. Encountering a `v20`-prefixed value returns an error. This primarily affects recent Chrome installations on Windows. +Decryption uses `DecryptChromiumGCM` with the ABE-retrieved key. Note: `DecryptChromiumGCM` strips only the version prefix (3B) and nonce (12B) before passing to AES-GCM; it does not strip any post-decrypt header from the result. ## 7. Decryption Flow @@ -113,8 +109,9 @@ The high-level decryption path for any encrypted Chromium value: 1. **Detect version** -- inspect the first 3 bytes of the ciphertext 2. **Route by version**: - - `v10` / `v11` -- strip prefix, call platform-specific decryption (AES-CBC on macOS/Linux, AES-GCM on Windows). On Linux, a failed decryption retries once with `kEmptyKey` to recover legacy crbug.com/40055416 data - - `v20` -- not yet supported, return error + - `v10` / `v11` -- strip prefix, call platform-specific decryption (AES-CBC on macOS/Linux, AES-GCM on Windows). 
On macOS/Linux, a failed AES-CBC decryption retries once with `kEmptyKey` to recover legacy crbug.com/40055416 data + - `v12` -- SecretPortal/Flatpak — recognized, returns known-gap error (not yet implemented) + - `v20` -- AES-256-GCM with 32-byte ABE key (retrieved via Windows reflective injection) - DPAPI (no prefix) -- call Windows `CryptUnprotectData` directly (Windows only; returns error on other platforms) 3. **Return plaintext** -- the decrypted bytes are interpreted as a UTF-8 string diff --git a/rfcs/004-firefox-data-storage.md b/rfcs/004-firefox-data-storage.md index e88355a..5df3c93 100644 --- a/rfcs/004-firefox-data-storage.md +++ b/rfcs/004-firefox-data-storage.md @@ -104,6 +104,7 @@ Firefox uses inconsistent timestamp units across data types. All are Unix epoch- | Cookies (`expiry`) | Seconds | direct | | History (`last_visit_date`) | Microseconds | / 1,000,000 | | Downloads (`dateAdded`) | Microseconds | / 1,000,000 | +| Downloads (`endTime`) | Milliseconds | / 1,000 | | Bookmarks (`dateAdded`) | Microseconds | / 1,000,000 | | Passwords (`timeCreated`) | Milliseconds | / 1,000 | diff --git a/rfcs/005-firefox-encryption.md b/rfcs/005-firefox-encryption.md index bce9689..de5ae44 100644 --- a/rfcs/005-firefox-encryption.md +++ b/rfcs/005-firefox-encryption.md @@ -62,7 +62,7 @@ key = dk[:24], iv = dk[32:40] // 3DES key + IV ### 3.2 passwordCheckPBE Key Derivation -Uses standard PBKDF2 with SHA-256 and parameters embedded in the ASN1 structure (entry salt, iteration count, key size). The IV is reconstructed by prepending the ASN.1 OCTET STRING header (`0x04 0x0E`) to the 14-byte IV value from the parsed structure, yielding a 16-byte AES IV. +Uses PBKDF2-SHA-256 with parameters embedded in the ASN1 structure (entry salt, iteration count, key size). The PBKDF2 password is `SHA1(globalSalt)` (a 20-byte digest), not `globalSalt` itself. 
The IV is reconstructed by prepending the ASN.1 OCTET STRING header (`0x04 0x0E`) to the 14-byte IV value from the parsed structure, yielding a 16-byte AES IV. ## 4. Password Decryption diff --git a/rfcs/006-key-retrieval-mechanisms.md b/rfcs/006-key-retrieval-mechanisms.md index 3aad882..52f0ec9 100644 --- a/rfcs/006-key-retrieval-mechanisms.md +++ b/rfcs/006-key-retrieval-mechanisms.md @@ -124,7 +124,7 @@ Windows populates two slots of the `masterkey.Retrievers` struct — V10 (legacy `browser/browser_windows.go::newCredentialInjector` calls `masterkey.DefaultRetrievers()` and wires the resulting struct through `Browser.SetRetrievers(r)`. At extract time `masterkey.NewMasterKeys` runs each slot independently — a failure on one tier does not prevent the other from succeeding, because mixed-tier Chrome profiles (upgraded from pre-127) need partial success to be useful. -**Why not a ChainRetriever?** `ChainRetriever` has first-success semantics: once ABE returns a key, DPAPI is never called. That semantics is wrong for orthogonal tiers — it was the root cause of issue #578, where upgraded profiles' v10-encrypted passwords silently failed because only the v20 key was retrieved. `NewMasterKeys` evaluates each tier independently and returns an `errors.Join` of per-tier failures; log severity is a caller-side decision. `browser/chromium::getMasterKeys` currently logs all tier errors uniformly at `Warnf` — the distinction between "partial" and "total" failure was judged low-value for a short-lived CLI where all warn lines are visible in the default output. +**Why not a ChainRetriever?** `ChainRetriever` has first-success semantics: once ABE returns a key, DPAPI is never called. That semantics is wrong for orthogonal tiers — it was the root cause of issue #578, where upgraded profiles' v10-encrypted passwords silently failed because only the v20 key was retrieved. 
`NewMasterKeys` evaluates each tier independently and returns an `errors.Join` of per-tier failures; log severity is a caller-side decision. `browser/chromium.(Browser).masterKeys` currently logs all tier errors uniformly at `Warnf` — the distinction between "partial" and "total" failure was judged low-value for a short-lived CLI where all warn lines are visible in the default output. **Non-ABE Chromium forks** (Opera, Vivaldi, Yandex, 360, QQ, Sogou) omit `WindowsABE` in `platformBrowsers()` (default false). The caller leaves `Hints.WindowsABEKey` empty, and `ABERetriever` returns `(nil, nil)` for empty `WindowsABEKey`, which `NewMasterKeys` treats silently as "not applicable" — so attempting ABE on these forks is a no-op, not a failure. Their V10 DPAPI key continues to work unchanged. @@ -178,7 +178,7 @@ The authoritative mapping lives in the `KeychainLabel` field of each entry in `p | Windows | V10 = DPAPIRetriever; V20 = ABERetriever (Chrome 127+) | No | AES-256 | | Linux | V10 = PosixRetriever ("peanuts" kV10Key); V11 = DBusRetriever (keyring kV11Key) | 1 iteration | AES-128 | -\* Only included when `--keychain-pw` is provided. +\* Only included when a non-empty password resolves — either via `--keychain-pw` flag or an interactive TTY prompt. ## 7. Safari Credential Extraction @@ -218,10 +218,10 @@ The macOS login password is resolved once at startup by `browser/browser_darwin. | Consumer | Capability interface | Defined in | Payload | |---|---|---|---| -| Chromium browsers | `keyRetrieversSetter` | `browser/browser.go` | `masterkey.Retrievers` struct (V10 / V11 / V20 slots; unused tiers nil) | -| Safari | `keychainPasswordSetter` | `browser/browser_darwin.go` | raw `string` | +| Chromium browsers | `KeyManager` | `browser/browser.go` | `masterkey.Retrievers` struct (V10 / V11 / V20 slots; unused tiers nil) | +| Safari | `KeychainPasswordReceiver` | `browser/browser.go` | raw `string` | -The two setters are **intentionally not unified**. 
They carry different abstractions — one hands the browser a pre-assembled retrieval chain, the other hands the browser a credential token to unlock its own access path. Unifying them would create a leaky polymorphic interface with no real shared semantics. Note that `keychainPasswordSetter` is defined in the darwin-only file because Safari (its only implementer) is darwin-only. +The two interfaces are **intentionally not unified**. They carry different abstractions — one hands the browser a pre-assembled retrieval chain, the other hands the browser a credential token to unlock its own access path. Unifying them would create a leaky polymorphic interface with no real shared semantics. `resolveKeychainPassword` additionally performs an early `TryUnlock` against `keychainbreaker` before the chain is built, so a bad password surfaces as a startup warning rather than a mid-extraction failure. The small cost of opening the keychain twice (once for validation, once inside `KeychainPasswordRetriever`) buys meaningful UX. diff --git a/rfcs/007-cli-and-output-design.md b/rfcs/007-cli-and-output-design.md index 66e5ccf..1f1172d 100644 --- a/rfcs/007-cli-and-output-design.md +++ b/rfcs/007-cli-and-output-design.md @@ -6,7 +6,7 @@ ## 1. Command Structure -The CLI is built on [cobra](https://github.com/spf13/cobra) with three subcommands: `dump`, `list`, and `version`. +The CLI is built on [cobra](https://github.com/spf13/cobra) with six subcommands: `dump`, `dumpkeys`, `archive`, `restore`, `list`, and `version`. ### 1.1 Root Command @@ -22,7 +22,7 @@ The primary command. Extracts, decrypts, and writes browser data to files. 
|------|-------|---------|-------------| | `--browser` | `-b` | `"all"` | Target browser | | `--category` | `-c` | `"all"` | Data categories (comma-separated) | -| `--format` | `-f` | `"csv"` | Output format: csv, json, cookie-editor | +| `--format` | `-f` | `"json"` | Output format: csv, json, cookie-editor | | `--dir` | `-d` | `"results"` | Output directory | | `--profile-path` | `-p` | | Custom profile directory | | `--keychain-pw` | | | macOS keychain password | @@ -38,7 +38,7 @@ Lists all detected browsers and profiles via `text/tabwriter`. **Basic mode** (default) — three columns: Browser, Profile, Path. -**Detail mode** (`--detail`) — adds a column for every category showing entry counts. This actually calls `Extract()` on each browser to count entries. +**Detail mode** (`--detail`) — adds a column for every category showing entry counts. This calls `CountEntries()` on each browser (not `Extract()`) — no decryption is performed. ### 1.4 version Command @@ -125,7 +125,7 @@ CLI: hack-browser-data dump -b chrome -c password,cookie -f csv -d results → parseCategories("password,cookie") → []Category → NewWriter("results", "csv") → *Writer → for each browser: - Extract(categories) → *BrowserData + Extract(categories) → []ExtractResult Writer.Add(browser, profile, data) → Writer.Write() → aggregate by category → format rows → write files diff --git a/rfcs/008-file-acquisition-and-platform-quirks.md b/rfcs/008-file-acquisition-and-platform-quirks.md index 195954c..65d1388 100644 --- a/rfcs/008-file-acquisition-and-platform-quirks.md +++ b/rfcs/008-file-acquisition-and-platform-quirks.md @@ -27,8 +27,10 @@ Acquire(src, dst, isDir) ├── isDir=true → copyDir(src, dst, skip="lock") │ └── isDir=false → copyFile(src, dst) - ├── success → copy -wal and -shm companions if present - └── failure + Windows → copyLocked(src, dst) fallback + ├── success ──┐ + └── failure + Windows → copyLocked(src, dst) + └── success ──┐ + copy -wal and -shm companions if present ``` ### SQLite 
Companion Files diff --git a/rfcs/009-windows-locked-file-bypass.md b/rfcs/009-windows-locked-file-bypass.md index ddfc006..29ca10c 100644 --- a/rfcs/009-windows-locked-file-bypass.md +++ b/rfcs/009-windows-locked-file-bypass.md @@ -43,6 +43,7 @@ Each entry in the result table: | Field | Size | Description | |-------|------|-------------| +| Object | `uintptr` | Kernel object pointer | | UniqueProcessID | `uintptr` | Owning process PID | | HandleValue | `uintptr` | Handle value in the owning process | | GrantedAccess | `uint32` | Access mask | @@ -76,13 +77,13 @@ Suffix: google\chrome\...\network\cookies Once we have a duplicated handle to the locked file: ``` -| DuplicateHandle (read access) | +| DuplicateHandle(DUPLICATE_SAME_ACCESS) | |-------------------------------------------------| ↓ | CreateFileMappingW(handle, PAGE_READONLY) | |-------------------------------------------------| ↓ -| MapViewOfFile(mapping, FILE_MAP_READ, fileSize) | +| MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0) | |-------------------------------------------------| ↓ | byte slice from kernel file cache | @@ -95,7 +96,7 @@ Once we have a duplicated handle to the locked file: Memory-mapped I/O reads from the OS kernel's **file cache**, which includes data Chrome has written but not yet checkpointed to disk. This produces a more complete snapshot than a raw `ReadFile`. -**Fallback**: if `CreateFileMappingW` fails (e.g., the file is empty or zero-length), falls back to `Seek(0)` + `ReadFile` on the duplicated handle. +**Fallback**: if `CreateFileMappingW` fails for any reason, falls back to `Seek(0)` + `ReadFile` on the duplicated handle. ## 4. 
Why This Works diff --git a/rfcs/010-chrome-abe-integration.md b/rfcs/010-chrome-abe-integration.md index ff6780e..5655ba2 100644 --- a/rfcs/010-chrome-abe-integration.md +++ b/rfcs/010-chrome-abe-integration.md @@ -46,7 +46,7 @@ End-to-end flow when `hack-browser-data.exe` encounters a v20 Chromium cookie on ``` browser/chromium.Extract() - → masterkey.Chain [ABERetriever, DPAPIRetriever] + → masterkey.Retrievers{V10: &DPAPIRetriever{}, V20: &ABERetriever{}} → ABERetriever.RetrieveKey(): reads Local State → extracts APPB-prefixed blob resolves browser exe via registry App Paths @@ -92,7 +92,7 @@ DoExtractKey → see §4.2 2. `ReadProcessMemory` for the 12-byte diagnostic header, then 32-byte key when `status == ready`. 3. `TerminateProcess(browser)` — the target was a throwaway from the start. -The returned key flows back up to `crypto.DecryptChromiumV20` (cross-platform AES-256-GCM; see §5.3) and then to the usual cookie/password extraction pipeline. +The returned key flows back up to `crypto.DecryptChromiumGCM` (cross-platform AES-256-GCM; see §5.3) and then to the usual cookie/password extraction pipeline. ## 4. C payload — `crypto/windows/abe_native/` @@ -163,12 +163,11 @@ Validity relies on Windows **KnownDlls + session-consistent ASLR** — `kernel32 ### 5.1 Injector package — `utils/injector/` -Three files collaborate: +Four files collaborate: | File | Role | |---|---| -| `reflective_windows.go` | `Reflective.Inject(exePath, payload, env) ([]byte, error)` — the orchestrator | -| `winapi_windows.go` | Package-level `windows.LazyProc` handles + `callBoolErr` helper. Centralizes `VirtualAllocEx` / `CreateRemoteThread` / NtFlushIC / import-address lookups. `ReadProcessMemory` / `WriteProcessMemory` use `x/sys/windows` typed wrappers directly. | +| `reflective_windows.go` | `Reflective.Inject(exePath, payload, env) ([]byte, error)` — the orchestrator. 
Win32 calls (`VirtualAllocEx`, `CreateRemoteThread`, `NtFlushIC`, import-address lookups) delegate to `utils/winapi/` via `CallBoolErr`. | | `errors_windows.go` | `formatABEError(scratchResult) string` — renders the C-side diag channel into human-readable strings via two lookup maps (`ABE_ERR_*` names + known HRESULT names like `E_ACCESSDENIED`). | | `pe_windows.go` | `FindExportFileOffset(dllBytes, "Bootstrap")` — raw-file offset via `debug/pe`. | | `arch_windows.go` | Architecture validation (amd64-only today). | @@ -185,7 +184,7 @@ _Static_assert(offsetof(struct BootstrapScratch, hresult) == 0x2C, "hresult offs _Static_assert(offsetof(struct BootstrapScratch, shared) == 0x40, "shared offset"); ``` -Go consumes the same constants via **`go tool cgo -godefs`** (a development-time tool, not a runtime dependency). `make gen-layout` regenerates `crypto/windows/abe_native/bootstrap/layout.go` from `bootstrap_layout.h` using `CC="zig cc"` for bit-identical results across host OSes. `make gen-layout-verify` is wired into CI to fail if the committed `layout.go` is stale. +Go consumes the same constants via **`go tool cgo -godefs`** (a development-time tool, not a runtime dependency). `make gen-layout` regenerates `crypto/windows/abe_native/bootstrap/layout.go` from `bootstrap_layout.h` using `CC="zig cc"` for bit-identical results across host OSes. `make gen-layout-verify` can be run locally to verify the committed `layout.go` matches the current header. **Why `cgo -godefs` rather than runtime `import "C"`**: we only need constants shared, not FFI to C functions. Runtime CGO would force the whole project into `CGO_ENABLED=1`, losing the "non-Windows contributor needs no C toolchain" guarantee. `cgo -godefs` bakes the values into a pure-Go file that commits to git; the project stays `CGO_ENABLED=0`. 
@@ -201,12 +200,12 @@ Go consumes the same constants via **`go tool cgo -godefs`** (a development-time On extraction success, logs at `Info` level (`abe: retrieved master key via reflective injection`). -**v20 decryption** is cross-platform by design: `browser/chromium/decrypt.go` routes `CipherV20` → `crypto.DecryptChromiumV20` (defined in `crypto/crypto.go`, uses `AESGCMDecrypt`). This lets Linux/macOS CI exercise the same decryption path as Windows — only the key-source side is platform-gated. +**v20 decryption** is cross-platform by design: `browser/chromium/decrypt.go` routes `CipherV20` → `crypto.DecryptChromiumGCM` (defined in `crypto/crypto.go`, uses `AESGCMDecrypt`). This lets Linux/macOS CI exercise the same decryption path as Windows — only the key-source side is platform-gated. ## 6. Build chain - **Default build** (any host, no zig): `go build ./cmd/hack-browser-data/` succeeds; ABE is stubbed out. Legacy v10/v11 cookies still decrypt via DPAPI. -- **Windows release with ABE**: `make build-windows` = `make payload` (zig cc → `crypto/abe_extractor_amd64.bin`) + `GOOS=windows go build -tags abe_embed`. The `abe_embed` tag activates `//go:embed` on the compiled binary. +- **Windows release with ABE**: `make build-windows` = `make payload` (zig cc → `crypto/windows/payload/abe_extractor_amd64.bin`) + `GOOS=windows go build -tags abe_embed`. The `abe_embed` tag activates `//go:embed` on the compiled binary. - **Layout regen**: `make gen-layout` after any change to `bootstrap_layout.h`. - **`go.mod` unchanged** — no new dependencies. `zig` is the only external toolchain, and only when actually rebuilding the payload. @@ -226,7 +225,7 @@ All ABE-specific Go code is behind `//go:build windows` (plus `&& abe_embed` for **No payload bytes ever touch disk on the target machine.** - Payload DLL exists only as: - 1. Build artifact on the developer machine (`crypto/abe_extractor_amd64.bin`, git-ignored) + 1. 
Build artifact on the developer machine (`crypto/windows/payload/abe_extractor_amd64.bin`, git-ignored) 2. `.rdata` section of `hack-browser-data.exe` (`//go:embed`) 3. Go `[]byte` in our process memory (one `copy()` for import patching) 4. `VirtualAllocEx`'d region in the target browser during injection; released on `TerminateProcess` diff --git a/rfcs/011-safari-data-storage.md b/rfcs/011-safari-data-storage.md index f002abe..1d30f28 100644 --- a/rfcs/011-safari-data-storage.md +++ b/rfcs/011-safari-data-storage.md @@ -62,6 +62,7 @@ Safari uses two different casings for the same profile UUID across the container | Cookie | `Container/Cookies/Cookies.binarycookies`, then `~/Library/Cookies/Cookies.binarycookies` | BinaryCookies | | Bookmark | `~/Library/Safari/Bookmarks.plist` | plist | | Download | `~/Library/Safari/Downloads.plist` | plist | +| Extension | `Container/Safari/AppExtensions/Extensions.plist`, `Container/Safari/WebExtensions/Extensions.plist` | plist | | LocalStorage | `Container/WebKit/WebsiteData/Default/` | WebKit Origins dir | | Password | macOS Keychain | — | @@ -87,9 +88,13 @@ Passwords live in the user-scope Keychain, not on a per-profile basis — only t ### 4.1 History (History.db — SQLite) ```sql -SELECT url, title, visit_count, visit_time -FROM history_items -LEFT JOIN history_visits ON history_items.id = history_visits.history_item +SELECT hi.url, COALESCE(hv.title, ''), hi.visit_count, COALESCE(hv.visit_time, 0) +FROM history_items hi +LEFT JOIN history_visits hv ON hv.id = ( + SELECT hv2.id FROM history_visits hv2 + WHERE hv2.history_item = hi.id + ORDER BY hv2.visit_time DESC LIMIT 1 +) ``` Schema notes: @@ -99,7 +104,7 @@ Schema notes: ### 4.2 Cookies (Cookies.binarycookies — binary) -Apple's proprietary BinaryCookies format — not SQLite, not a documented format. Parsed by the [go-binarycookies](https://github.com/moond4rk/go-binarycookies) library. +Apple's proprietary BinaryCookies format — not SQLite, not a documented format. 
Parsed by the [binarycookies](https://github.com/moond4rk/binarycookies) library. High-level layout: @@ -122,7 +127,7 @@ A nested dictionary tree with a `WebBookmarkType` discriminator at each node: | `WebBookmarkTypeList` | Folder | `Children` (array) | | `WebBookmarkTypeLeaf` | URL entry | `URLString`, `URIDictionary.title` | -The extractor walks the tree recursively, collecting leaf nodes into a flat list. Folder names are not preserved (only URL + title pairs are exported). +The extractor walks the tree recursively, collecting leaf nodes into a flat list. Folder names are preserved in the `Folder` field of each `BookmarkEntry`. ### 4.4 Downloads (Downloads.plist — property list) @@ -132,7 +137,7 @@ A flat structure with a `DownloadHistory` array. Relevant keys per entry: |-----|---------| | `DownloadEntryURL` | Source URL | | `DownloadEntryPath` | Local filesystem path | -| `DownloadEntryBytesReceivedSoFar` | Bytes downloaded | +| `DownloadEntryProgressTotalToLoad` | Total bytes to download | | `DownloadEntryProfileUUIDStringKey` | Owning profile's uppercase UUID, or `"DefaultProfile"` | The extractor filters by the caller-provided owner UUID so each profile reports its own downloads. MIME type and start/end times are not stored by Safari — `MimeType` is always empty in the output. @@ -241,7 +246,7 @@ The only encrypted category is passwords. Because they are not stored in Safari' - **Full Disk Access (TCC)** is required to read the sandboxed container. Without it, cookies / history / downloads / localStorage reads fail silently with permission errors at stat or open time. Legacy paths under `~/Library/Safari/` sometimes remain readable without FDA, but are mostly empty on modern systems. 
- **Live-file safety** follows a live-vs-temp split: - **Live reads** (`SafariTabs.db` during profile discovery in `profiles.go`) use `?mode=ro&immutable=1`, which disables WAL replay and locking so the extractor cannot disturb a running Safari — it sees a consistent snapshot of the main DB as of read time, at the cost of missing any pending WAL content. - - **Temp-copy reads** (`History.db`, `localstorage.sqlite3`, etc. via `filemanager.Session.Acquire`) use `?mode=ro` only. `Session.Acquire` copies the `-wal` / `-shm` sidecars alongside the main DB, so SQLite can replay uncommitted transactions on the copy — surfacing entries Safari has written to WAL but not yet checkpointed. Any `-shm` writes SQLite performs during replay land on the ephemeral copy and are deleted with the session. + - **Temp-copy reads** (via `filemanager.Session.Acquire`) vary by file: `localstorage.sqlite3` uses `?mode=ro` so SQLite can replay the copied `-wal` sidecar; `History.db` opens with `PRAGMA journal_mode=off` (WAL replay not needed for read-only history queries). `Session.Acquire` copies the `-wal` / `-shm` sidecars alongside the main DB. Any `-shm` writes SQLite performs during replay land on the ephemeral copy and are deleted with the session. - **Multi-profile availability**: requires Safari 17 (macOS 14 Sonoma) or newer. Older Safari versions have only the default profile; discovery degrades cleanly via the ReadDir fallback described in §2.1. - **File acquisition**: all per-profile files are copied into a `filemanager.Session` temp directory before extraction, except the discovery-time `SafariTabs.db` read which opens the live file directly. See [RFC-008](008-file-acquisition-and-platform-quirks.md) for the general pattern. 
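The live-vs-temp split above comes down to which SQLite URI parameters are attached when a database is opened. The following is a minimal sketch of building the two DSN shapes with the standard library only. The helper names `liveDSN` and `tempDSN` are hypothetical, and how the extractor actually assembles its connection strings is not shown in this RFC, so treat this as an illustration of the parameters rather than the project's implementation.

```go
package main

import (
	"fmt"
	"net/url"
)

// sqliteURI builds a "file:" URI for an absolute path with the given
// query parameters. url.Values.Encode sorts keys alphabetically.
func sqliteURI(path string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	u := url.URL{Scheme: "file", Path: path, RawQuery: q.Encode()}
	return u.String()
}

// liveDSN (hypothetical helper) targets a database Safari may have open:
// read-only and immutable, so SQLite neither locks nor replays the WAL.
func liveDSN(path string) string {
	return sqliteURI(path, map[string]string{"mode": "ro", "immutable": "1"})
}

// tempDSN (hypothetical helper) targets a session copy: read-only, but
// SQLite may still replay the copied -wal sidecar.
func tempDSN(path string) string {
	return sqliteURI(path, map[string]string{"mode": "ro"})
}

func main() {
	fmt.Println(liveDSN("/tmp/SafariTabs.db"))
	fmt.Println(tempDSN("/tmp/localstorage.sqlite3"))
}
```

Keeping `immutable=1` off the temp-copy DSN matters: with it set, SQLite would ignore the `-wal` file that `Session.Acquire` copied alongside the main database.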
diff --git a/rfcs/012-yandex-decryption.md b/rfcs/012-yandex-decryption.md index 450d2ca..98bf806 100644 --- a/rfcs/012-yandex-decryption.md +++ b/rfcs/012-yandex-decryption.md @@ -46,7 +46,7 @@ Deferred to a follow-up RFC / PR: ### 3.1 `meta.local_encryptor_data` ``` -[protobuf preamble bytes...] "v10" [12B nonce] [68B plaintext + 16B GCM tag] +[protobuf preamble bytes...] "v10" [12B nonce] [68B ciphertext + 16B GCM tag] ``` The 68-byte plaintext (decrypted with the Chromium master key, empty AAD) has the shape: diff --git a/rfcs/013-cli-redesign-cross-host.md b/rfcs/013-cli-redesign-cross-host.md index da54fb6..747323d 100644 --- a/rfcs/013-cli-redesign-cross-host.md +++ b/rfcs/013-cli-redesign-cross-host.md @@ -1,7 +1,7 @@ # RFC-013: CLI Redesign — Flat-Verb Surface & Cross-Host Restore **Author**: moonD4rk -**Status**: Accepted — `archive` (#607) implemented; cross-platform `restore` (#606) pending +**Status**: Implemented — `archive` (#610); cross-platform `restore` (#611) **Created**: 2026-06-03 **Revised**: 2026-06-06 (subdir-convention archive, dual-mode restore, Local State, delivery order) @@ -126,4 +126,4 @@ Working backwards from the chosen surface: | [RFC-003](003-chromium-encryption.md) | Cipher version dispatch (v10/v11/v20) consumed by restore | | [RFC-006](006-key-retrieval-mechanisms.md) | Master-key retrieval the cross-host split externalizes | | [RFC-001](001-project-architecture.md) | Browser interface and Extract() orchestration | -| [RFC-008](008-file-acquisition-and-platform-quirks.md) | Locked-file session and CompressDir used by archive | +| [RFC-008](008-file-acquisition-and-platform-quirks.md) | Locked-file session and ZipDir used by archive | diff --git a/utils/fileutil/fileutil.go b/utils/fileutil/fileutil.go index 46b5ce0..533dceb 100644 --- a/utils/fileutil/fileutil.go +++ b/utils/fileutil/fileutil.go @@ -30,7 +30,6 @@ func CompressDir(dir string) error { return fmt.Errorf("read dir error: %w", err) } if len(files) == 0 { - // Return 
an error if no files are found in the directory return fmt.Errorf("no files to compress in: %s", dir) }