mirror of
https://github.com/moonD4rk/HackBrowserData.git
synced 2026-05-19 18:58:03 +02:00
docs: rewrite readme, rfcs, and contributing (#555)
* docs: rewrite README, RFCs, and CONTRIBUTING
* docs: fix Linux storage labels in RFC-006 (Opera/Vivaldi swapped)
@@ -1,797 +0,0 @@
# RFC-001: Architecture Refactoring

**Author**: moonD4rk
**Status**: Proposed
**Created**: 2025-09-01
**Updated**: 2026-03-22

## Abstract

This RFC proposes a refactoring of the overall HackBrowserData architecture. It covers five areas:

1. **Data model redesign**: `Category` enum + browser-agnostic `*Entry` structs
2. **Crypto layer**: cipher version detection, master key retrieval abstraction
3. **Browser registration & discovery**: declarative config, direct profile scanning
4. **Yandex variant handling**: source overrides + query overrides
5. **Error handling**: collect-and-continue pattern

**Constraint**: the code must build with Go 1.20, the last Go release that supports Windows 7.

See RFC-002 for file acquisition, extract method details, and output.

---

## 1. Target Directory Structure

```
hackbrowserdata/
├── cmd/
│   └── hack-browser-data/
│       └── main.go                     # CLI: flag parsing → PickBrowsers → Extract → Output
│
├── browser/
│   ├── browser.go                      # Browser interface, BrowserKind, Config, PickBrowsers()
│   ├── browser_darwin.go               # platformBrowsers() → []Config
│   ├── browser_windows.go              # platformBrowsers() → []Config
│   ├── browser_linux.go                # platformBrowsers() → []Config
│   │
│   ├── chromium/
│   │   ├── chromium.go                 # Chromium struct (holds masterKey []byte), Extract()
│   │   ├── chromium_darwin.go          # platform key retriever wiring
│   │   ├── chromium_windows.go         # platform key retriever wiring
│   │   ├── chromium_linux.go           # platform key retriever wiring
│   │   ├── source.go                   # chromiumSources, yandexSources maps
│   │   ├── decrypt.go                  # decryptValue(): Chromium-specific DPAPI/AES fallback
│   │   ├── extract_password.go         # extractPasswords() + default SQL query
│   │   ├── extract_cookie.go           # extractCookies() + default SQL query
│   │   ├── extract_history.go          # extractHistories() + default SQL query
│   │   ├── extract_download.go         # extractDownloads() + default SQL query
│   │   ├── extract_bookmark.go         # extractBookmarks() (JSON)
│   │   ├── extract_creditcard.go       # extractCreditCards() + default SQL query
│   │   ├── extract_extension.go        # extractExtensions() (JSON)
│   │   └── extract_storage.go          # extractLocalStorage(), extractSessionStorage() (LevelDB)
│   │
│   ├── firefox/
│   │   ├── firefox.go                  # Firefox struct, Extract(), deriveMasterKey()
│   │   ├── firefox_test.go
│   │   ├── source.go                   # firefoxSources map
│   │   ├── extract_password.go         # extractPasswords() (JSON + ASN1PBE)
│   │   ├── extract_cookie.go           # extractCookies() (SQLite, no encryption)
│   │   ├── extract_history.go          # extractHistories() (SQLite)
│   │   ├── extract_download.go         # extractDownloads() (SQLite)
│   │   ├── extract_bookmark.go         # extractBookmarks() (SQLite)
│   │   ├── extract_extension.go        # extractExtensions() (JSON)
│   │   └── extract_storage.go          # extractLocalStorage() (SQLite)
│   │
│   └── exploit/
│       └── gcoredump/
│           └── gcoredump.go            # CVE-2025-24204 macOS exploit (darwin only)
│
├── browserdata/
│   ├── browserdata.go                  # BrowserData struct (typed slices)
│   ├── output.go                       # BrowserData.Output(): CSV/JSON writer
│   └── output_test.go
│
├── crypto/
│   ├── crypto.go                       # AESCBCDecrypt, AESGCMDecrypt, DES3, PKCS5
│   ├── crypto_darwin.go                # DecryptWithChromium (CBC), DecryptWithDPAPI (returns error)
│   ├── crypto_windows.go               # DecryptWithChromium (GCM), DecryptWithDPAPI
│   ├── crypto_linux.go                 # DecryptWithChromium (CBC), DecryptWithDPAPI (returns error)
│   ├── crypto_test.go
│   ├── version.go                      # DetectVersion(), StripPrefix(), CipherVersion
│   ├── asn1pbe.go                      # Firefox ASN.1 PBE key derivation
│   ├── asn1pbe_test.go
│   ├── pbkdf2.go
│   │
│   └── keyretriever/
│       ├── keyretriever.go             # KeyRetriever interface, ChainRetriever
│       ├── keyretriever_darwin.go      # GcoredumpRetriever, SecurityCmdRetriever
│       ├── keyretriever_windows.go     # DPAPIRetriever
│       ├── keyretriever_linux.go       # DBusRetriever, FallbackRetriever
│       └── params.go                   # PBKDF2Params (saltysalt, iterations)
│
├── filemanager/
│   └── session.go                      # Session: MkdirTemp, TempDir(), Acquire(), Cleanup()
│
├── types/
│   ├── category.go                     # Category enum (9 values)
│   ├── models.go                       # LoginEntry, CookieEntry, ... (browser-agnostic)
│   └── types_test.go
│
├── log/
│   ├── log.go
│   ├── logger.go
│   ├── logger_test.go
│   └── level.go                        # log levels (merged from level/ sub-package)
│
└── utils/
    ├── byteutil/
    │   └── byteutil.go
    ├── fileutil/
    │   ├── fileutil.go                 # renamed from filetutil.go
    │   └── fileutil_test.go
    ├── sqliteutil/
    │   ├── sqlite.go                   # QuerySQLite() helper
    │   └── query.go                    # QueryRows[T]() generic helper (Go 1.20)
    ├── typeutil/
    │   ├── typeutil.go
    │   └── typeutil_test.go
    └── chainbreaker/
        ├── chainbreaker.go
        └── chainbreaker_test.go
```

### What changed vs current structure

| Change | Current | Target |
|--------|---------|--------|
| **New** `utils/sqliteutil/` | - | QuerySQLite + QueryRows[T] helpers |
| **New** `filemanager/` | - | Session-based temp file management |
| **New** `crypto/keyretriever/` | - | Master key retrieval abstraction |
| **New** `crypto/version.go` | - | Cipher version detection |
| **New** `browser/chromium/extract_*.go` | - | Per-category extract methods |
| **New** `browser/firefox/extract_*.go` | - | Per-category extract methods |
| **New** `browser/*/source.go` | - | File source mapping per engine |
| **Restructured** `types/` | 22 DataType constants + file mappings | 9 Category constants + data model structs |
| **Deleted** `extractor/` | interface + registry + factory | not needed |
| **Deleted** `browserdata/imports.go` | init() side-effect registration | not needed |
| **Deleted** `browserdata/password/`, `cookie/`, etc. | 9 sub-packages | extract logic moved into browser engines |
| **Deleted** `browser/consts.go` | 27 scattered constants | inlined into Config |
| **Renamed** `filetutil.go` | typo | `fileutil.go` |
| **Renamed** `AES128CBCDecrypt` | misleading name | `AESCBCDecrypt` |

### Naming conventions

| Concept | Package | Type/Func | File |
|---------|---------|-----------|------|
| Data category | `types` | `Category` (int enum) | `category.go` |
| Data models | `types` | `LoginEntry`, `CookieEntry`, ... | `models.go` |
| Result container | `browserdata` | `BrowserData` | `browserdata.go` |
| Browser config | `browser` | `Config` | `browser.go` |
| Browser engine kind | `browser` | `BrowserKind` | `browser.go` |
| File source mapping | `chromium`/`firefox` | `source` struct, `chromiumSources` map | `source.go` |
| Key retrieval | `keyretriever` | `KeyRetriever` (interface) | `keyretriever.go` |
| Strategy chain | `keyretriever` | `ChainRetriever` | `keyretriever.go` |
| Cipher version | `crypto` | `CipherVersion` | `version.go` |
| Temp file session | `filemanager` | `Session` | `session.go` |
| SQLite helper | `sqliteutil` | `QuerySQLite` (func) | `sqlite.go` |
| Generic query helper | `sqliteutil` | `QueryRows[T]` (func) | `query.go` |
| Chromium decrypt | `chromium` | `decryptValue` (unexported func) | `decrypt.go` |

### Public vs private

| Symbol | Exported | Reason |
|--------|----------|--------|
| `Browser` interface | Yes | used by cmd/main.go |
| `Config` struct | Yes | passed to chromium.New() |
| `PickBrowsers()` | Yes | called by cmd/main.go |
| `platformBrowsers()` | No | browser package internal |
| `isValidBrowserDir()` | No | browser package internal |
| `Chromium.Extract()` | Yes | implements Browser interface |
| `Chromium.extractPasswords()` | No | chromium package internal |
| `Chromium.acquireFiles()` | No | chromium package internal |
| `discoverProfiles()` | No | chromium package internal |
| `BrowserData` struct | Yes | returned to cmd/main.go |
| `BrowserData.Output()` | Yes | called by cmd/main.go |
| `QuerySQLite()` | Yes | used by chromium and firefox |
| `QueryRows[T]()` | Yes | used by chromium and firefox |

### File naming convention for `extract_*.go`

Files inside `browser/chromium/` and `browser/firefox/` use the `extract_` prefix for extraction logic. The prefix groups them visually when the directory is sorted alphabetically:

```
chromium.go              ← struct + Extract orchestration
chromium_darwin.go       ← platform: master key
chromium_linux.go
chromium_windows.go
extract_bookmark.go      ← extract: one file per Category
extract_cookie.go
extract_creditcard.go
extract_download.go
extract_extension.go
extract_history.go
extract_password.go
extract_storage.go
source.go                ← file source mapping
```

This yields three natural groups: `chromium*` (struct + platform), `extract_*` (data extraction), and `source.go` (file mapping). Each `extract_*.go` file contains the extract method (~20-30 lines) and, for SQLite-backed categories, its default SQL query constant.

---

## 2. Core Data Model Redesign

### 2.1 Problem: MasterKey mixed with data types

The current `DataType` enum contains 22 constants that conflate three concerns:

- **Infrastructure** (keys): `ChromiumKey`, `FirefoxKey4`
- **Browser engine prefix**: `ChromiumPassword` vs `FirefoxPassword` vs `YandexPassword`
- **File layout**: `Filename()`, `TempFilename()` methods on the enum

A password is a password regardless of which browser it came from. The browser engine determines *how* to extract, not *what* the data is.

### 2.2 New design: Category + Models

**`types/category.go`**: 9 data categories (down from 22 DataType constants):

```go
package types

type Category int

const (
	Password Category = iota
	Cookie
	Bookmark
	History
	Download
	CreditCard
	Extension
	LocalStorage
	SessionStorage
)

var AllCategories = []Category{
	Password, Cookie, Bookmark, History, Download,
	CreditCard, Extension, LocalStorage, SessionStorage,
}

func (c Category) String() string { ... }

func (c Category) IsSensitive() bool {
	switch c {
	case Password, Cookie, CreditCard:
		return true
	default:
		return false
	}
}

func NonSensitiveCategories() []Category {
	var cats []Category
	for _, c := range AllCategories {
		if !c.IsSensitive() {
			cats = append(cats, c)
		}
	}
	return cats
}
```
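
The body of `String()` is elided above. One possible shape is a lookup table indexed by the enum value, as in the sketch below. The lowercase names in `categoryNames` are placeholders chosen for this example, not strings taken from the project.

```go
package main

import "fmt"

// Category mirrors the enum declared in types/category.go.
type Category int

const (
	Password Category = iota
	Cookie
	Bookmark
	History
	Download
	CreditCard
	Extension
	LocalStorage
	SessionStorage
)

// categoryNames is a hypothetical name table; the real strings may differ.
var categoryNames = [...]string{
	Password:       "password",
	Cookie:         "cookie",
	Bookmark:       "bookmark",
	History:        "history",
	Download:       "download",
	CreditCard:     "creditcard",
	Extension:      "extension",
	LocalStorage:   "localstorage",
	SessionStorage: "sessionstorage",
}

// String returns the table entry for c, or a numbered placeholder
// when c falls outside the declared range.
func (c Category) String() string {
	if c < 0 || int(c) >= len(categoryNames) {
		return fmt.Sprintf("category(%d)", int(c))
	}
	return categoryNames[c]
}

func main() {
	// fmt picks up String() automatically because Category is a Stringer.
	fmt.Println(Cookie, SessionStorage, Category(42))
}
```

Keeping the table keyed by the constants means a reordered `const` block cannot silently shift the names.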

**`types/models.go`**: browser-agnostic data models, no encrypted fields:

```go
package types

import "time"

type LoginEntry struct {
	URL       string    `json:"url" csv:"url"`
	Username  string    `json:"username" csv:"username"`
	Password  string    `json:"password" csv:"password"`
	CreatedAt time.Time `json:"created_at" csv:"created_at"`
}

type CookieEntry struct {
	Host       string    `json:"host" csv:"host"`
	Path       string    `json:"path" csv:"path"`
	Name       string    `json:"name" csv:"name"`
	Value      string    `json:"value" csv:"value"`
	IsSecure   bool      `json:"is_secure" csv:"is_secure"`
	IsHTTPOnly bool      `json:"is_httponly" csv:"is_httponly"`
	ExpireAt   time.Time `json:"expire_at" csv:"expire_at"`
	CreatedAt  time.Time `json:"created_at" csv:"created_at"`
}

type BookmarkEntry struct {
	Name      string    `json:"name" csv:"name"`
	URL       string    `json:"url" csv:"url"`
	Folder    string    `json:"folder" csv:"folder"`
	CreatedAt time.Time `json:"created_at" csv:"created_at"`
}

type HistoryEntry struct {
	URL        string    `json:"url" csv:"url"`
	Title      string    `json:"title" csv:"title"`
	VisitCount int       `json:"visit_count" csv:"visit_count"`
	LastVisit  time.Time `json:"last_visit" csv:"last_visit"`
}

type DownloadEntry struct {
	URL        string    `json:"url" csv:"url"`
	TargetPath string    `json:"target_path" csv:"target_path"`
	TotalBytes int64     `json:"total_bytes" csv:"total_bytes"`
	StartTime  time.Time `json:"start_time" csv:"start_time"`
	EndTime    time.Time `json:"end_time" csv:"end_time"`
}

type CreditCardEntry struct {
	Name     string `json:"name" csv:"name"`
	Number   string `json:"number" csv:"number"`
	ExpMonth string `json:"exp_month" csv:"exp_month"`
	ExpYear  string `json:"exp_year" csv:"exp_year"`
}

type StorageEntry struct {
	URL   string `json:"url" csv:"url"`
	Key   string `json:"key" csv:"key"`
	Value string `json:"value" csv:"value"`
}

type ExtensionEntry struct {
	Name        string `json:"name" csv:"name"`
	ID          string `json:"id" csv:"id"`
	Description string `json:"description" csv:"description"`
	Version     string `json:"version" csv:"version"`
}
```

### 2.3 Result container

**`browserdata/browserdata.go`**:

```go
type BrowserData struct {
	Passwords      []types.LoginEntry
	Cookies        []types.CookieEntry
	Bookmarks      []types.BookmarkEntry
	Histories      []types.HistoryEntry
	Downloads      []types.DownloadEntry
	CreditCards    []types.CreditCardEntry
	Extensions     []types.ExtensionEntry
	LocalStorage   []types.StorageEntry
	SessionStorage []types.StorageEntry
}
```

### 2.4 What was removed from types/

| Removed | Reason |
|---------|--------|
| `ChromiumKey`, `FirefoxKey4` | MasterKey is infrastructure, handled inside browser engine |
| `Chromium*`/`Firefox*`/`Yandex*` prefixes | Browser engine is extraction concern, not type concern |
| `Filename()`, `TempFilename()` | File layout is browser engine's internal knowledge |
| `itemFileNames` map | Moved into `chromium/source.go` and `firefox/source.go` |
| `DefaultChromiumTypes`, `DefaultFirefoxTypes`, `DefaultYandexTypes` | Replaced by `types.AllCategories` |
| `extractor/` package | No longer needed: browser engines have typed extract methods |
| `browserdata/imports.go` | No longer needed: no init() registration |

---

## 3. Crypto Layer

### 3.1 Cipher version detection

**New file**: `crypto/version.go`

```go
type CipherVersion string

const (
	CipherV10   CipherVersion = "v10"   // Chrome 80+
	CipherV20   CipherVersion = "v20"   // Chrome 127+ App-Bound Encryption
	CipherDPAPI CipherVersion = "dpapi" // pre-Chrome 80
)

func DetectVersion(ciphertext []byte) CipherVersion {
	if len(ciphertext) < 3 {
		return CipherDPAPI
	}
	prefix := string(ciphertext[:3])
	switch prefix {
	case "v10":
		return CipherV10
	case "v20":
		return CipherV20
	default:
		return CipherDPAPI
	}
}

func StripPrefix(ciphertext []byte) []byte {
	ver := DetectVersion(ciphertext)
	if ver == CipherV10 || ver == CipherV20 {
		return ciphertext[3:]
	}
	return ciphertext
}
```
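
To show how the two helpers compose, the sketch below repeats them in a standalone program and runs them on made-up inputs. The sample byte strings are illustrative only and are not real ciphertexts.

```go
package main

import "fmt"

type CipherVersion string

const (
	CipherV10   CipherVersion = "v10"
	CipherV20   CipherVersion = "v20"
	CipherDPAPI CipherVersion = "dpapi"
)

// DetectVersion classifies a ciphertext by its 3-byte ASCII prefix.
// Anything without a known prefix is treated as raw DPAPI data.
func DetectVersion(ciphertext []byte) CipherVersion {
	if len(ciphertext) < 3 {
		return CipherDPAPI
	}
	switch string(ciphertext[:3]) {
	case "v10":
		return CipherV10
	case "v20":
		return CipherV20
	default:
		return CipherDPAPI
	}
}

// StripPrefix removes the version prefix when one is present.
func StripPrefix(ciphertext []byte) []byte {
	if v := DetectVersion(ciphertext); v == CipherV10 || v == CipherV20 {
		return ciphertext[3:]
	}
	return ciphertext
}

func main() {
	samples := [][]byte{
		[]byte("v10payload"),           // prefixed, Chrome 80+ style
		[]byte("v20payload"),           // prefixed, App-Bound style
		{0x01, 0x00, 0x00, 0x00, 0xd0}, // no ASCII prefix
		{0x76},                         // shorter than a prefix
	}
	for _, s := range samples {
		fmt.Printf("%-5s payload=%q\n", DetectVersion(s), StripPrefix(s))
	}
}
```

Because unprefixed and too-short inputs both map to `CipherDPAPI`, callers never need a separate "unknown" case.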

Version-specific post-processing (e.g., v20 cookie value has a 32-byte header) belongs here, not in extract methods:

```go
// DecryptCookieValue handles version-specific cookie decryption.
func DecryptCookieValue(key, ciphertext []byte) ([]byte, error) {
	version := DetectVersion(ciphertext)
	payload := StripPrefix(ciphertext)

	switch version {
	case CipherV10:
		return decryptPayload(key, payload)
	case CipherV20:
		value, err := decryptPayload(key, payload)
		if err != nil {
			return nil, err
		}
		if len(value) > 32 {
			return value[32:], nil // strip App-Bound header
		}
		return value, nil
	default:
		return nil, fmt.Errorf("unsupported cipher version: %s", version)
	}
}
```

### 3.2 Key retriever abstraction

**New package**: `crypto/keyretriever/`

```go
type KeyRetriever interface {
	RetrieveKey(storage string, localStatePath string) ([]byte, error)
}

// Note: Windows DPAPIRetriever reads localStatePath to extract the encrypted key.
// macOS and Linux retrievers ignore localStatePath (they use keychain/dbus instead).

type ChainRetriever struct {
	retrievers []KeyRetriever
}

func NewChain(retrievers ...KeyRetriever) KeyRetriever { ... }

func (c *ChainRetriever) RetrieveKey(storage string, localStatePath string) ([]byte, error) {
	// Start with a real error so the final message is meaningful
	// even when the chain is empty.
	lastErr := errors.New("no key retrievers configured")
	for _, r := range c.retrievers {
		key, err := r.RetrieveKey(storage, localStatePath)
		if err == nil && len(key) > 0 {
			return key, nil
		}
		if err == nil {
			// An empty key without an error still counts as a failure.
			err = errors.New("retriever returned an empty key")
		}
		lastErr = err
	}
	return nil, fmt.Errorf("all key retrievers failed: %w", lastErr)
}
```

Platform defaults:

- macOS: `NewChain(&GcoredumpRetriever{}, &SecurityCmdRetriever{})`
- Windows: `&DPAPIRetriever{}`
- Linux: `NewChain(&DBusRetriever{}, &FallbackRetriever{})`
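
The fallback behaviour behind the macOS and Linux defaults can be sketched with stand-in retrievers. In the example below, `retrieverFunc`, `firstKey`, `demoChain`, and the error and key strings are all hypothetical names invented for illustration. Only the `KeyRetriever` interface comes from this RFC.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyRetriever matches the interface proposed in this RFC.
type KeyRetriever interface {
	RetrieveKey(storage string, localStatePath string) ([]byte, error)
}

// retrieverFunc is a hypothetical adapter that lets a plain function
// act as a KeyRetriever, which keeps the fakes below short.
type retrieverFunc func(storage, localStatePath string) ([]byte, error)

func (f retrieverFunc) RetrieveKey(storage, localStatePath string) ([]byte, error) {
	return f(storage, localStatePath)
}

// firstKey tries each retriever in order and returns the first
// non-empty key. It reports the last failure if none succeeds.
func firstKey(storage, localStatePath string, retrievers ...KeyRetriever) ([]byte, error) {
	lastErr := errors.New("no key retrievers configured")
	for _, r := range retrievers {
		key, err := r.RetrieveKey(storage, localStatePath)
		if err == nil && len(key) > 0 {
			return key, nil
		}
		if err == nil {
			err = errors.New("retriever returned an empty key")
		}
		lastErr = err
	}
	return nil, fmt.Errorf("all key retrievers failed: %w", lastErr)
}

// demoChain simulates a primary retriever that fails and a fallback
// that succeeds, then returns the key that the chain selected.
func demoChain() (string, error) {
	primary := retrieverFunc(func(_, _ string) ([]byte, error) {
		return nil, errors.New("keychain access denied")
	})
	fallback := retrieverFunc(func(_, _ string) ([]byte, error) {
		return []byte("fallback-key"), nil
	})
	key, err := firstKey("Chrome", "", primary, fallback)
	return string(key), err
}

func main() {
	key, err := demoChain()
	fmt.Println(key, err)
}
```

The order of the arguments is the priority order, so moving a retriever earlier in the list is the only change needed to prefer it.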

**`params.go`** centralizes PBKDF2 magic values with source links:

```go
var (
	// https://source.chromium.org/chromium/chromium/src/+/master:components/os_crypt/os_crypt_mac.mm
	macOSParams = PBKDF2Params{Salt: []byte("saltysalt"), Iterations: 1003, KeyLen: 16}
	// https://source.chromium.org/chromium/chromium/src/+/main:components/os_crypt/os_crypt_linux.cc
	linuxParams = PBKDF2Params{Salt: []byte("saltysalt"), Iterations: 1, KeyLen: 16}
)
```
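
These parameters feed PBKDF2 with HMAC-SHA1. As a dependency-free illustration, the sketch below implements the single-block case with the standard library only. The project may well rely on an existing PBKDF2 package instead. `deriveKey`, `deriveHex`, and the secret `example-secret` are placeholders for this example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// PBKDF2Params mirrors the struct used in params.go.
type PBKDF2Params struct {
	Salt       []byte
	Iterations int
	KeyLen     int
}

// deriveKey computes PBKDF2-HMAC-SHA1 for keys that fit in one SHA-1
// block (at most 20 bytes), which covers the 16-byte keys used here.
func deriveKey(secret []byte, p PBKDF2Params) ([]byte, error) {
	if p.KeyLen <= 0 || p.KeyLen > sha1.Size {
		return nil, fmt.Errorf("key length %d out of range 1..%d", p.KeyLen, sha1.Size)
	}
	if p.Iterations < 1 {
		return nil, fmt.Errorf("iterations must be positive, got %d", p.Iterations)
	}
	mac := hmac.New(sha1.New, secret)
	mac.Write(p.Salt)
	mac.Write([]byte{0, 0, 0, 1}) // big-endian index of block 1
	u := mac.Sum(nil)             // U1
	t := make([]byte, len(u))
	copy(t, u)
	for i := 1; i < p.Iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil) // U(i+1) = HMAC(secret, U(i))
		for j := range t {
			t[j] ^= u[j] // fold every round into the result
		}
	}
	return t[:p.KeyLen], nil
}

// deriveHex is a convenience wrapper for printing and testing.
func deriveHex(secret, salt string, iterations, keyLen int) string {
	key, err := deriveKey([]byte(secret), PBKDF2Params{
		Salt: []byte(salt), Iterations: iterations, KeyLen: keyLen,
	})
	if err != nil {
		return "error: " + err.Error()
	}
	return hex.EncodeToString(key)
}

func main() {
	fmt.Println("1003 iterations:", deriveHex("example-secret", "saltysalt", 1003, 16))
	fmt.Println("1 iteration:    ", deriveHex("example-secret", "saltysalt", 1, 16))
}
```

The same secret yields different keys under the two parameter sets, which is why the values are worth centralizing per platform.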

---

## 4. Browser Registration & Discovery

### 4.1 Declarative browser config

```go
// browser/browser.go
type BrowserKind int

const (
	KindChromium       BrowserKind = iota
	KindChromiumYandex // Chromium variant with different file names and SQL queries
	KindFirefox
)

type Config struct {
	Key         string // lookup key: "chrome", "firefox"
	Name        string // display name: "Chrome", "Firefox"
	Kind        BrowserKind
	Storage     string // keychain label (macOS/Linux); unused on Windows (DPAPI reads Local State directly)
	UserDataDir string // e.g. ~/Library/Application Support/Google/Chrome/
}

type Browser interface {
	Name() string
	Extract(categories []types.Category) (*browserdata.BrowserData, error)
}
```

### 4.2 Platform browser list & PickBrowsers

Each platform file defines `platformBrowsers()`. Every entry spells out its full path below the home directory instead of sharing a prefix variable:

```go
// browser/browser_darwin.go
func platformBrowsers() []Config {
	return []Config{
		{Key: "chrome", Name: "Chrome", Kind: KindChromium, Storage: "Chrome",
			UserDataDir: homeDir + "/Library/Application Support/Google/Chrome"},
		{Key: "edge", Name: "Edge", Kind: KindChromium, Storage: "Microsoft Edge",
			UserDataDir: homeDir + "/Library/Application Support/Microsoft Edge"},
		// ... other browsers
	}
}
```

```go
func PickBrowsers(name, profile string) ([]Browser, error) {
	name = strings.ToLower(name)
	var browsers []Browser
	configs := platformBrowsers()
	for _, cfg := range configs {
		if name != "all" && cfg.Key != name {
			continue
		}
		dir := cfg.UserDataDir
		if profile != "" {
			dir = profile
		}
		if !isValidBrowserDir(cfg.Kind, dir) {
			continue
		}
		bs, err := newBrowserFromConfig(cfg, dir)
		if err != nil {
			log.Debugf("skip %s: %v", cfg.Name, err)
			continue
		}
		browsers = append(browsers, bs...)
	}
	return browsers, nil
}

func newBrowserFromConfig(cfg Config, dir string) ([]Browser, error) {
	switch cfg.Kind {
	case KindChromium, KindChromiumYandex:
		return chromium.New(cfg, dir)
	case KindFirefox:
		return firefox.New(dir)
	default:
		return nil, fmt.Errorf("unknown browser kind: %d", cfg.Kind)
	}
}
```

### 4.3 Browser installation validation & profile discovery

Before enumerating profiles, confirm the directory is a real browser installation. For Chromium, the `Local State` file is the confirmation signal:

```go
func isValidBrowserDir(kind BrowserKind, dir string) bool {
	if !fileutil.IsDirExists(dir) {
		return false
	}
	switch kind {
	case KindChromium, KindChromiumYandex:
		return fileutil.IsFileExists(filepath.Join(dir, "Local State"))
	case KindFirefox:
		return true
	}
	return false
}
```

Chromium profile directory names are predictable (`Default/`, `Profile 1/`, ...). Call `os.ReadDir()` on the user data directory and check known file paths instead of walking the whole tree with `filepath.Walk`.

Firefox profiles are `xxxxxxxx.name/` directories. Enumerate them and check for `key4.db` or `logins.json`.
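
A minimal sketch of the Chromium side of this scan follows. It is not the project's `discoverProfiles()`. The marker list in `profileMarkers` and the helper `demoDiscover` are assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// profileMarkers are files whose presence suggests a Chromium profile.
// The exact list is an assumption for this sketch.
var profileMarkers = []string{"Preferences", "History", "Login Data"}

// discoverProfiles returns the immediate subdirectories of userDataDir
// that contain at least one marker file. It never recurses.
func discoverProfiles(userDataDir string) ([]string, error) {
	entries, err := os.ReadDir(userDataDir)
	if err != nil {
		return nil, fmt.Errorf("read user data dir: %w", err)
	}
	var profiles []string
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		for _, marker := range profileMarkers {
			info, err := os.Stat(filepath.Join(userDataDir, e.Name(), marker))
			if err == nil && !info.IsDir() {
				profiles = append(profiles, e.Name())
				break
			}
		}
	}
	return profiles, nil
}

// demoDiscover builds a throwaway directory tree and scans it.
func demoDiscover() ([]string, error) {
	root, err := os.MkdirTemp("", "userdata-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	files := []string{
		"Local State",           // top-level file, not a profile
		"Default/Preferences",   // profile with a marker
		"Profile 1/History",     // profile with a marker
		"Crashpad/settings.dat", // directory without any marker
	}
	for _, rel := range files {
		full := filepath.Join(root, filepath.FromSlash(rel))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(full, nil, 0o600); err != nil {
			return nil, err
		}
	}
	return discoverProfiles(root)
}

func main() {
	profiles, err := demoDiscover()
	fmt.Println(profiles, err)
}
```

One `os.ReadDir` call plus a few `os.Stat` probes per directory keeps the cost proportional to the number of top-level entries, independent of how large each profile is.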

---

## 5. Yandex Variant Handling

Yandex is Chromium-based with 3 differences:

| Aspect | Standard Chromium | Yandex |
|--------|------------------|--------|
| Password file | `Login Data` | `Ya Passman Data` |
| Password SQL | `SELECT origin_url, ...` | `SELECT action_url, ...` |
| CreditCard file | `Web Data` | `Ya Credit Cards` |

### 5.1 Separate source map

```go
// browser/chromium/source.go

var yandexSources = map[types.Category]source{
	types.Password:       {paths: []string{"Ya Passman Data"}}, // different
	types.Cookie:         {paths: []string{"Network/Cookies", "Cookies"}},
	types.History:        {paths: []string{"History"}},
	types.Download:       {paths: []string{"History"}},
	types.Bookmark:       {paths: []string{"Bookmarks"}},
	types.CreditCard:     {paths: []string{"Ya Credit Cards"}}, // different
	types.Extension:      {paths: []string{"Secure Preferences"}},
	types.LocalStorage:   {paths: []string{"Local Storage/leveldb"}, isDir: true},
	types.SessionStorage: {paths: []string{"Session Storage"}, isDir: true},
}
```

### 5.2 Query overrides (default + override pattern)

Each extract method defines its own default SQL query constant. The Chromium struct holds an optional override map:

```go
// browser/chromium/chromium.go
type Chromium struct {
	name           string
	profileDir     string
	masterKey      []byte                    // retrieved once in New(), shared across profiles
	sources        map[types.Category]source // chromiumSources or yandexSources
	queryOverrides map[types.Category]string // nil for standard Chromium
}

var yandexQueryOverrides = map[types.Category]string{
	types.Password: `SELECT action_url, username_value, password_value, date_created FROM logins`,
}
```

Extract methods check for overrides locally:

```go
// browser/chromium/extract_password.go
const defaultLoginQuery = `SELECT origin_url, username_value, password_value, date_created FROM logins`

func (c *Chromium) extractPasswords(masterKey []byte, path string) ([]types.LoginEntry, error) {
	query := defaultLoginQuery
	if q, ok := c.queryOverrides[types.Password]; ok {
		query = q
	}
	// ... rest of extraction
}
```

### 5.3 Wiring at creation time

```go
func New(cfg browser.Config, userDataDir string) ([]*Chromium, error) {
	sources := chromiumSources
	var overrides map[types.Category]string
	if cfg.Kind == browser.KindChromiumYandex {
		sources = yandexSources
		overrides = yandexQueryOverrides
	}

	// Retrieve master key ONCE for the entire browser, shared across all profiles.
	localStatePath := filepath.Join(userDataDir, "Local State")
	retriever := platformKeyRetriever() // returns the platform's KeyRetriever (a chain on macOS/Linux)
	masterKey, err := retriever.RetrieveKey(cfg.Storage, localStatePath)
	if err != nil {
		return nil, fmt.Errorf("retrieve master key: %w", err)
	}

	// ... discover profiles, create Chromium instances with masterKey + sources + overrides
}
```

Extract methods contain no variant-specific branches beyond the query-override lookup. All variant differences are concentrated in `source.go` and `New()`. The master key is retrieved once and injected into every `Chromium` instance (one per profile).

---

## 6. Error Handling

### 6.1 Collect-and-continue pattern

`Extract()` collects errors per category but continues extracting. The returned `data` and `err` can both be non-nil:

```go
func (c *Chromium) Extract(categories []types.Category) (*browserdata.BrowserData, error) {
	session, err := filemanager.NewSession()
	if err != nil {
		return nil, err
	}
	defer session.Cleanup()

	files := c.acquireFiles(session, categories)

	data := &browserdata.BrowserData{}
	var errs []error

	for _, cat := range categories {
		path, ok := files[cat]
		if !ok {
			continue
		}

		// c.masterKey was retrieved once in New() and stored on the struct.
		// err is scoped per category so a stale error is never reported twice.
		var err error
		switch cat {
		case types.Password:
			data.Passwords, err = c.extractPasswords(c.masterKey, path)
		case types.Cookie:
			data.Cookies, err = c.extractCookies(c.masterKey, path)
		case types.History:
			data.Histories, err = c.extractHistories(path)
		case types.Download:
			data.Downloads, err = c.extractDownloads(path)
		case types.Bookmark:
			data.Bookmarks, err = c.extractBookmarks(path)
		case types.CreditCard:
			data.CreditCards, err = c.extractCreditCards(c.masterKey, path)
		case types.Extension:
			data.Extensions, err = c.extractExtensions(path)
		case types.LocalStorage:
			data.LocalStorage, err = c.extractLocalStorage(path)
		case types.SessionStorage:
			data.SessionStorage, err = c.extractSessionStorage(path)
		}
		if err != nil {
			log.Debugf("extract %s: %v", cat, err)
			errs = append(errs, fmt.Errorf("%s: %w", cat, err))
		}
	}
	return data, errors.Join(errs...) // Go 1.20
}
```

### 6.2 Error severity levels

| Level | Behavior | Example |
|-------|----------|---------|
| Session/key failure | `return nil, err`: abort entirely | Disk full, keychain denied |
| Category failure | Log, skip, continue next category | Cookie file locked |
| Single record failure | Skip record, continue extraction | One cookie decryption failed |

### 6.3 Error wrapping convention

Use `fmt.Errorf` with `%w` for error context. No custom error types needed.

```go
// Good: wraps with context
raw, err := base64.StdEncoding.DecodeString(encoded)
if err != nil {
	return nil, fmt.Errorf("base64 decode: %w", err)
}

// Bad: swallows error
raw, _ := base64.StdEncoding.DecodeString(encoded)
```

The `%w` verb preserves the error chain for `errors.Is()` and `errors.As()` if needed later.

### 6.4 Caller pattern

```go
data, err := b.Extract(categories)
if err != nil {
	log.Warnf("%s: %v", b.Name(), err) // partial failure
}
if data == nil {
	continue // total failure
}
data.Output(dir, b.Name(), format) // output whatever succeeded
```

---

## 7. Implementation Order

| Phase | Scope | Risk |
|-------|-------|------|
| 1 | `types/category.go` + `types/models.go` + `browserdata/browserdata.go` | Zero: new files only |
| 2 | `utils/sqliteutil/sqlite.go` + `query.go` | Zero: new files only |
| 3 | `crypto/version.go`, rename `AES128CBCDecrypt` to `AESCBCDecrypt` | Low: internal crypto changes |
| 4 | `crypto/keyretriever/` | Low: new package |
| 5 | `browser/chromium/source.go` + `extract_*.go` | Medium: new extract methods |
| 6 | `browser/firefox/source.go` + `extract_*.go` | Medium: new extract methods |
| 7 | `filemanager/session.go` | Low: new package |
| 8 | Wire `Extract()` + `Config` + `PickBrowsers()` | High: connects everything |
| 9 | Delete old code: `extractor/`, `browserdata/*/`, `imports.go` | High: removal |
| 10 | Update CLI, tests, cross-platform build verification | Medium |

---

## 8. Relationship with RFC-002

| Area | RFC-001 (this doc) | RFC-002 |
|------|-------------------|---------|
| Data model (Category + *Entry) | defines | uses |
| BrowserData container | defines | implements Output |
| Cipher version | covered | - |
| Master key retrieval | covered | - |
| Browser registration | covered | - |
| Yandex variant | covered | - |
| Error handling pattern | covered | - |
| Extract() orchestration | covered | - |
| File source mapping | - | covered |
| File acquisition (Session) | - | covered |
| Extract method details | - | covered |
| datautil helpers | - | covered |
| Output implementation | - | covered |

---

## 9. Open Questions

1. **App-Bound Encryption (Chrome 127+ v20)**: `crypto/version.go` has the extension point. Implementation is deferred until it can be tested.
2. **Firefox version detection**: is the key-length heuristic in `processMasterKey()` sufficient, or should it be formalized?
3. **Sort direction**: should all categories be standardized to DESC by date? (Firefox history/download is currently ASC.)

---

## References

- [Chromium OS Crypt](https://source.chromium.org/chromium/chromium/src/+/main:components/os_crypt/)
- [Chrome Password Decryption](https://github.com/chromium/chromium/blob/main/components/os_crypt/sync/os_crypt_win.cc)
- [Firefox NSS](https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS)
@@ -0,0 +1,163 @@
|
||||
# RFC-001: Project Architecture & Data Model
|
||||
|
||||
**Author**: moonD4rk
|
||||
**Status**: Living Document
|
||||
**Created**: 2026-04-05
|
||||
|
||||
## 1. Project Positioning
|
||||
|
||||
HackBrowserData is a CLI security research tool that extracts and decrypts browser data from Chromium-based browsers and Firefox across Windows, macOS, and Linux.
|
||||
|
||||
Key constraints:
|
||||
|
||||
- **Go 1.20** — the module must build with Go 1.20 to maintain Windows 7 support. Features from Go 1.21+ (`log/slog`, `slices`, `maps`, `cmp`) must not be used.
|
||||
- **Supported engines**: Chromium (including Yandex and Opera variants) and Firefox.
|
||||
- **Supported platforms**: Windows (DPAPI), macOS (Keychain), Linux (D-Bus Secret Service).
|
||||
- **No root-level library API** — the CLI calls `browser.PickBrowsers()` directly; there is no importable `pkg/` surface.
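
The Go 1.20 constraint above means helpers from the newer standard-library packages have to be written locally. A minimal sketch, assuming a hypothetical `contains` helper (not taken from the repository) that stands in for `slices.Contains`:

```go
package main

// contains reports whether target is present in items.
// It is a hypothetical stand-in for slices.Contains, which is only
// available in the standard library from Go 1.21 onward. Generics
// themselves have been available since Go 1.18, so this builds on Go 1.20.
func contains[T comparable](items []T, target T) bool {
	for _, item := range items {
		if item == target {
			return true
		}
	}
	return false
}
```

A call such as `contains([]string{"chrome", "firefox"}, "firefox")` would then replace the `slices` import without raising the minimum Go version.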
|
||||
|
||||
## 2. Directory Structure
|
||||
|
||||
```
|
||||
HackBrowserData/
|
||||
├── cmd/hack-browser-data/ # CLI entrypoint: cobra root, dump, list, version
|
||||
├── browser/ # Browser interface, PickBrowsers(), platform browser lists
|
||||
│ ├── chromium/ # Chromium engine: extraction, decryption, profile discovery
|
||||
│ └── firefox/ # Firefox engine: extraction, NSS key derivation
|
||||
├── types/ # Data model: Category enum, Entry structs, BrowserData
|
||||
├── crypto/ # Encryption primitives, cipher version detection
|
||||
│ └── keyretriever/ # Platform-specific master key retrieval (Keychain/DPAPI/D-Bus)
|
||||
├── filemanager/ # Temp file session, locked file handling (Windows)
|
||||
├── output/ # Output Writer: CSV, JSON, CookieEditor formatters
|
||||
├── log/ # Logging with level filtering
|
||||
└── utils/ # SQLite query helpers, file utilities
|
||||
```
|
||||
|
||||
## 3. Core Data Model
|
||||
|
||||
### 3.1 Category
|
||||
|
||||
`Category` is an `int` enum representing 9 browser-agnostic data kinds: Password, Cookie, Bookmark, History, Download, CreditCard, Extension, LocalStorage, SessionStorage.
|
||||
|
||||
Three categories are classified as **sensitive** (Password, Cookie, CreditCard) via `IsSensitive()`, enabling safe-by-default export scenarios.
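
The enum and its sensitivity check could look roughly like the sketch below. The constant order and the `nonSensitive` helper are assumptions made for illustration; only the category names and `IsSensitive()` come from the text above.

```go
package main

// Category identifies one browser-agnostic kind of data.
type Category int

// The order of the constants is an assumption for this sketch.
const (
	Password Category = iota
	Cookie
	Bookmark
	History
	Download
	CreditCard
	Extension
	LocalStorage
	SessionStorage
)

// IsSensitive reports whether the category holds secrets
// (passwords, cookies, credit cards).
func (c Category) IsSensitive() bool {
	switch c {
	case Password, Cookie, CreditCard:
		return true
	default:
		return false
	}
}

// nonSensitive is a hypothetical helper that keeps only the categories
// that are safe to export by default.
func nonSensitive(cats []Category) []Category {
	var out []Category
	for _, c := range cats {
		if !c.IsSensitive() {
			out = append(out, c)
		}
	}
	return out
}
```

Filtering all nine categories through `nonSensitive` leaves the six non-secret ones, which is one way a safe-by-default export could be built.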
|
||||
|
||||
### 3.2 Entry Types
|
||||
|
||||
Each category has a corresponding Entry struct with `json` and `csv` struct tags. All structs are flat (no nesting) and use `time.Time` for timestamps.

| Struct | Category | Key Fields |
|--------|----------|------------|
| `LoginEntry` | Password | URL, Username, Password, CreatedAt |
| `CookieEntry` | Cookie | Host, Path, Name, Value, IsSecure, IsHTTPOnly, ExpireAt, CreatedAt |
| `BookmarkEntry` | Bookmark | Name, URL, Folder, CreatedAt |
| `HistoryEntry` | History | URL, Title, VisitCount, LastVisit |
| `DownloadEntry` | Download | URL, TargetPath, TotalBytes, StartTime, EndTime |
| `CreditCardEntry` | CreditCard | Name, Number, ExpMonth, ExpYear |
| `ExtensionEntry` | Extension | Name, ID, Description, Version |
| `StorageEntry` | LocalStorage, SessionStorage | URL, Key, Value |

`StorageEntry` is shared by both LocalStorage and SessionStorage.
|
||||
|
||||
### 3.3 BrowserData Container
|
||||
|
||||
`BrowserData` is the result container returned by `Extract()`. It holds typed slices — one per category. The container is populated field-by-field during extraction. The output layer uses `makeExtractor[T]()` generics to pull the correct slice for serialization.
|
||||
|
||||
## 4. Browser Interface & Registration
|
||||
|
||||
### 4.1 BrowserKind
|
||||
|
||||
Four engine kinds determine source paths and extractors:

| Kind | Description |
|------|-------------|
| `Chromium` | Standard Chromium layout |
| `ChromiumYandex` | Yandex variant: different file names and SQL queries |
| `ChromiumOpera` | Opera variant: different extension key, Roaming path on Windows |
| `Firefox` | Firefox: NSS encryption, SQLite + JSON files |

### 4.2 BrowserConfig
|
||||
|
||||
`BrowserConfig` is the declarative, platform-specific browser definition containing: Key (CLI matching), Name (display), Kind (engine), Storage (keychain label), UserDataDir (data path).
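
As a rough illustration, the config could be modelled as a plain struct. The field names follow the list above; the field types, the constant order, and the `pickByKey` filter are assumptions for this sketch rather than the project's actual code.

```go
package main

import "strings"

// BrowserKind selects the engine-specific source paths and extractors.
type BrowserKind int

const (
	Chromium BrowserKind = iota
	ChromiumYandex
	ChromiumOpera
	Firefox
)

// BrowserConfig is a declarative, platform-specific browser definition.
type BrowserConfig struct {
	Key         string      // CLI matching, e.g. "chrome"
	Name        string      // display name
	Kind        BrowserKind // engine
	Storage     string      // keychain label
	UserDataDir string      // data path
}

// pickByKey is a hypothetical filter: "all" selects every config,
// any other value selects the configs whose Key matches exactly.
func pickByKey(configs []BrowserConfig, key string) []BrowserConfig {
	key = strings.ToLower(strings.TrimSpace(key))
	if key == "all" {
		return configs
	}
	var picked []BrowserConfig
	for _, cfg := range configs {
		if cfg.Key == key {
			picked = append(picked, cfg)
		}
	}
	return picked
}
```

Keeping the definition declarative means a new browser is a one-line addition to the platform list rather than a new type.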
|
||||
|
||||
### 4.3 PickBrowsers() Flow
|
||||
|
||||
```
|
||||
PickBrowsers(opts)
|
||||
→ platformBrowsers() // build-tagged: returns []BrowserConfig for this OS
|
||||
→ pickFromConfigs(configs, opts) // filter by name, apply profile-path/keychain overrides
|
||||
→ newBrowsers(cfg) // dispatch by Kind to chromium.NewBrowsers or firefox.NewBrowsers
|
||||
→ discoverProfiles() // scan for profile subdirectories
|
||||
→ resolveSourcePaths() // stat each candidate path, first match wins
|
||||
```
|
||||
|
||||
Key design decisions:
|
||||
|
||||
- **One KeyRetriever per browser** — created once and shared across all profiles to prevent repeated keychain prompts on macOS.
|
||||
- **Profile discovery differs by engine**: Chromium looks for `Preferences` files in subdirectories; Firefox accepts any subdirectory containing known source files.
|
||||
- **Flat layout fallback** — Opera-style browsers that store data directly in UserDataDir (no profile subdirectories) are handled by falling back to the base directory.
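
The "one KeyRetriever per browser" decision can be sketched with `sync.Once`. The `KeyRetriever` interface shape, `cachedRetriever`, and `countingRetriever` are hypothetical names used only for this illustration; the real retriever API may differ.

```go
package main

import "sync"

// KeyRetriever is an assumed interface for platform key retrieval.
type KeyRetriever interface {
	RetrieveKey(storage string) ([]byte, error)
}

// cachedRetriever wraps another retriever and calls it at most once,
// so every profile of a browser reuses the same result and a macOS
// keychain prompt would appear only one time.
type cachedRetriever struct {
	inner KeyRetriever
	once  sync.Once
	key   []byte
	err   error
}

// RetrieveKey returns the cached result after the first call.
func (c *cachedRetriever) RetrieveKey(storage string) ([]byte, error) {
	c.once.Do(func() {
		c.key, c.err = c.inner.RetrieveKey(storage)
	})
	return c.key, c.err
}

// countingRetriever is a test stand-in that records how often it is called.
type countingRetriever struct{ calls int }

// RetrieveKey increments the call counter and returns a fixed key.
func (r *countingRetriever) RetrieveKey(string) ([]byte, error) {
	r.calls++
	return []byte("key"), nil
}
```

Sharing a single `*cachedRetriever` across all profile objects of a browser gives the behaviour described in the first bullet, including caching of a failure so the user is not prompted again.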
|
||||
|
||||
### 4.4 Platform Browser Lists
|
||||
|
||||
Browser configs are defined per-platform via build tags:
|
||||
|
||||
- **macOS** — 12 browsers (Chrome, Edge, Chromium, Chrome Beta, Opera, OperaGX, Vivaldi, CocCoc, Brave, Yandex, Arc, Firefox)
|
||||
- **Windows** — 16 browsers (all macOS minus Arc, plus 360 Speed, 360 Speed X, QQ, DC, Sogou)
|
||||
- **Linux** — 8 browsers (Chrome, Edge, Chromium, Chrome Beta, Opera, Vivaldi, Brave, Firefox)
|
||||
|
||||
## 5. Extract() Orchestration
|
||||
|
||||
Both Chromium and Firefox engines follow the same extraction pattern:
|
||||
|
||||
```
|
||||
Extract(categories)
|
||||
1. NewSession() → create isolated temp directory
|
||||
2. acquireFiles(session) → copy source files to temp dir (with dedup and WAL/SHM)
|
||||
3. getMasterKey(session) → platform-specific key retrieval
|
||||
4. for each category:
|
||||
extractCategory(data, cat, masterKey, path)
|
||||
5. defer session.Cleanup() → remove temp directory
|
||||
```
|
||||
|
||||
For details on file acquisition, see [RFC-008](008-file-acquisition-and-platform-quirks.md). For encryption details, see [RFC-003](003-chromium-encryption.md) (Chromium) and [RFC-005](005-firefox-encryption.md) (Firefox). For key retrieval, see [RFC-006](006-key-retrieval-mechanisms.md).
|
||||
|
||||
### 5.1 Collect-and-Continue Pattern
|
||||
|
||||
The extraction loop maximizes data recovery. Each category is extracted independently — a failure in one does not affect others. Errors are handled at three levels:

| Level | Trigger | Action |
|-------|---------|--------|
| **Session failure** | Temp dir cannot be created | Abort entirely, return error |
| **Category failure** | Source file missing or extraction error | Skip category, continue to next |
| **Record failure** | Single row decryption fails | Skip record, continue extraction |

**Master key failure is non-fatal.** If the key cannot be retrieved, categories requiring decryption (passwords, cookies, credit cards) produce empty values, while non-encrypted categories (history, bookmarks, downloads) still succeed.
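
A minimal sketch of the category-level part of this pattern follows. The `extractFunc` signature, `extractAll`, and `errMissing` are hypothetical; whether the real loop joins errors or only logs them is not stated here, so the use of `errors.Join` (available since Go 1.20) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Category names a data kind; a string is used here to keep the sketch short.
type Category string

// extractFunc is an assumed per-category extractor that returns the
// number of records it recovered.
type extractFunc func(masterKey []byte) (int, error)

// errMissing is a sample category-level failure.
var errMissing = errors.New("source file missing")

// extractAll runs every requested extractor. A failing category is
// recorded and skipped; the remaining categories still run.
func extractAll(masterKey []byte, order []Category, extractors map[Category]extractFunc) (map[Category]int, error) {
	counts := make(map[Category]int)
	var errs []error
	for _, cat := range order {
		fn, ok := extractors[cat]
		if !ok {
			continue // no extractor registered for this category
		}
		n, err := fn(masterKey)
		if err != nil {
			errs = append(errs, fmt.Errorf("extract %s: %w", cat, err))
			continue
		}
		counts[cat] = n
	}
	// errors.Join returns nil when no error was collected.
	return counts, errors.Join(errs...)
}
```

Passing a nil `masterKey` is allowed in this sketch, which mirrors the non-fatal master key rule: extractors that need the key yield empty values, while the others are unaffected.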
|
||||
|
||||
### 5.2 Custom Extractors
|
||||
|
||||
The `categoryExtractor` interface allows browser-specific extraction logic. Yandex and Opera use custom extractors for passwords and extensions respectively, while all other categories fall through to the default Chromium implementation.
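
The fall-through behaviour can be illustrated as a lookup with a default. Only the interface name `categoryExtractor` comes from the text; its method signature, `extractorFunc`, and `resolve` are assumptions for this sketch.

```go
package main

// Category names a data kind; a string keeps the sketch short.
type Category string

// categoryExtractor is shown with an assumed single-method shape.
type categoryExtractor interface {
	extract(path string) ([]string, error)
}

// extractorFunc adapts a plain function to categoryExtractor.
type extractorFunc func(path string) ([]string, error)

func (f extractorFunc) extract(path string) ([]string, error) { return f(path) }

// resolve returns the browser-specific extractor when one is registered
// for the category and otherwise falls through to the default.
func resolve(cat Category, custom map[Category]categoryExtractor, def categoryExtractor) categoryExtractor {
	if ex, ok := custom[cat]; ok {
		return ex
	}
	return def
}
```

With this shape, a variant such as Yandex would register only a password extractor and inherit everything else.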
|
||||
|
||||
## 6. Dependency Constraints
|
||||
|
||||
The module is pinned to `go 1.20` in `go.mod`. This is enforced by a CI lint check that fails if the directive changes.

| Dependency | Version | Purpose |
|-----------|---------|---------|
| `modernc.org/sqlite` | v1.31.1 (pinned) | Pure-Go SQLite. v1.32+ requires Go 1.21 |
| `github.com/syndtr/goleveldb` | v1.0.0 | LevelDB for Chromium localStorage/sessionStorage |
| `github.com/tidwall/gjson` | v1.18.0 | JSON path queries |
| `github.com/spf13/cobra` | v1.10.2 | CLI framework |
| `github.com/moond4rk/keychainbreaker` | v0.2.5 | macOS keychain decryption |
| `github.com/godbus/dbus/v5` | v5.2.2 | Linux D-Bus Secret Service |
| `golang.org/x/sys` | v0.27.0 | Windows syscalls (DPAPI, DuplicateHandle) |

## Related RFCs

| RFC | Topic |
|-----|-------|
| [RFC-002](002-chromium-data-storage.md) | Chromium data file locations and storage formats |
| [RFC-003](003-chromium-encryption.md) | Chromium encryption mechanisms per platform |
| [RFC-004](004-firefox-data-storage.md) | Firefox data file locations and storage formats |
| [RFC-005](005-firefox-encryption.md) | Firefox NSS encryption and key derivation |
| [RFC-006](006-key-retrieval-mechanisms.md) | Platform-specific master key retrieval |
| [RFC-007](007-cli-and-output-design.md) | CLI commands and output formats |
| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition and platform quirks |
| [RFC-009](009-windows-locked-file-bypass.md) | Windows locked file bypass technique |

@@ -1,829 +0,0 @@
|
||||
# RFC-002: Data Extraction & File Acquisition
|
||||
|
||||
**Author**: moonD4rk
|
||||
**Status**: Proposed
|
||||
**Created**: 2026-03-14
|
||||
**Updated**: 2026-03-22
|
||||
|
||||
## Abstract
|
||||
|
||||
This RFC covers the implementation details of data extraction and file acquisition:
|
||||
|
||||
1. **File source mapping**: how each browser engine maps categories to files
|
||||
2. **File acquisition**: Session-based temp file management with deduplication
|
||||
3. **Extract methods**: concrete implementations for each data category
|
||||
4. **Shared helpers**: `QuerySQLite()` and `DecryptChromiumValue()`
|
||||
5. **Output**: writing `Extract` results to CSV/JSON files
|
||||
|
||||
**Constraint**: Go 1.20 (Windows 7 support).
|
||||
|
||||
See RFC-001 for data model (`Category` + `*Entry` types), crypto layer, browser registration, and Yandex variant design.
|
||||
|
||||
---
|
||||
|
||||
## 1. Data Flow
|
||||
|
||||
```
|
||||
CLI: main.go
|
||||
│
|
||||
▼
|
||||
browser.PickBrowsers("all", "")
|
||||
│
|
||||
│ platformBrowsers() → []Config
|
||||
│ → chromium.New(cfg, dir) / firefox.New(dir)
|
||||
▼
|
||||
Browser.Extract(categories)
|
||||
│
|
||||
├─ filemanager.NewSession()
|
||||
│ └─ acquireFiles() with dedup → map[Category]tempPath
|
||||
│
|
||||
├─ masterKey
|
||||
│ Chromium: keyretriever.RetrieveKey(storage)
|
||||
│ Firefox: deriveMasterKey(key4dbPath)
|
||||
│
|
||||
└─ per-category extract methods
|
||||
├─ c.extractPasswords(masterKey, path) → []LoginEntry
|
||||
├─ c.extractCookies(masterKey, path) → []CookieEntry
|
||||
├─ c.extractHistories(path) → []HistoryEntry
|
||||
├─ c.extractDownloads(path) → []DownloadEntry
|
||||
├─ c.extractBookmarks(path) → []BookmarkEntry
|
||||
├─ c.extractCreditCards(masterKey, path) → []CreditCardEntry
|
||||
├─ c.extractExtensions(path) → []ExtensionEntry
|
||||
├─ c.extractLocalStorage(path) → []StorageEntry (LevelDB)
|
||||
└─ c.extractSessionStorage(path) → []StorageEntry (LevelDB)
|
||||
│
|
||||
▼
|
||||
browserdata.BrowserData{Passwords: [...], Cookies: [...], ...}
|
||||
│
|
||||
▼
|
||||
BrowserData.Output(dir, name, format)
|
||||
│
|
||||
▼
|
||||
chrome_default_password.csv
|
||||
chrome_default_cookie.json
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. File Source Mapping
|
||||
|
||||
### 2.1 Category → source (one flat map per engine)
|
||||
|
||||
```go
|
||||
// browser/chromium/source.go
|
||||
|
||||
type source struct {
|
||||
paths []string // candidates in priority order
|
||||
isDir bool
|
||||
}
|
||||
|
||||
var chromiumSources = map[types.Category]source{
|
||||
types.Password: {paths: []string{"Login Data"}},
|
||||
types.Cookie: {paths: []string{"Network/Cookies", "Cookies"}},
|
||||
types.History: {paths: []string{"History"}},
|
||||
types.Download: {paths: []string{"History"}}, // same file, different query
|
||||
types.Bookmark: {paths: []string{"Bookmarks"}},
|
||||
types.CreditCard: {paths: []string{"Web Data"}},
|
||||
types.Extension: {paths: []string{"Secure Preferences"}},
|
||||
types.LocalStorage: {paths: []string{"Local Storage/leveldb"}, isDir: true},
|
||||
types.SessionStorage: {paths: []string{"Session Storage"}, isDir: true},
|
||||
}
|
||||
```
|
||||
|
||||
```go
|
||||
// browser/firefox/source.go
|
||||
|
||||
var firefoxSources = map[types.Category]source{
|
||||
types.Password: {paths: []string{"logins.json"}},
|
||||
types.Cookie: {paths: []string{"cookies.sqlite"}},
|
||||
types.History: {paths: []string{"places.sqlite"}},
|
||||
types.Download: {paths: []string{"places.sqlite"}}, // same file
|
||||
types.Bookmark: {paths: []string{"places.sqlite"}}, // same file
|
||||
types.Extension: {paths: []string{"extensions.json"}},
|
||||
types.LocalStorage: {paths: []string{"webappsstore.sqlite"}},
|
||||
}
|
||||
```
|
||||
|
||||
The Yandex source map is defined in RFC-001, Section 5.
|
||||
|
||||
### 2.2 File acquisition with deduplication
|
||||
|
||||
When multiple categories map to the same file (e.g. History + Download), the file is copied once:
|
||||
|
||||
```go
|
||||
func (c *Chromium) acquireFiles(session *filemanager.Session, categories []types.Category) map[types.Category]string {
|
||||
result := make(map[types.Category]string)
|
||||
copied := make(map[string]string) // abs src → temp dst
|
||||
|
||||
for _, cat := range categories {
|
||||
src, ok := c.sources[cat] // uses c.sources (chromiumSources or yandexSources)
|
||||
if !ok { continue }
|
||||
|
||||
for _, rel := range src.paths {
|
||||
abs := filepath.Join(c.profileDir, rel)
|
||||
|
||||
if dst, ok := copied[abs]; ok {
|
||||
result[cat] = dst // reuse already-copied file
|
||||
break
|
||||
}
|
||||
|
||||
dst := filepath.Join(session.TempDir(), filepath.Base(rel))
|
||||
if err := session.Acquire(abs, dst, src.isDir); err == nil {
|
||||
copied[abs] = dst
|
||||
result[cat] = dst
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
return result
|
||||
}
|
||||
```
|
||||
|
||||
### 2.3 Firefox key4.db: infrastructure, not a Category
|
||||
|
||||
Each Firefox profile has its own `key4.db`. The master key is derived once in `New()` and stored on the struct, so `Extract()` never re-derives it:
|
||||
|
||||
```go
|
||||
// firefox.New() — called once per profile
|
||||
func New(profileDir string) (*Firefox, error) {
|
||||
// derive master key from this profile's key4.db
|
||||
keyPath := filepath.Join(profileDir, "key4.db")
|
||||
masterKey, err := deriveMasterKey(keyPath)
|
||||
if err != nil { return nil, err }
|
||||
|
||||
return &Firefox{
|
||||
profileDir: profileDir,
|
||||
masterKey: masterKey,
|
||||
sources: firefoxSources,
|
||||
}, nil
|
||||
}
|
||||
|
||||
func (f *Firefox) Extract(categories []types.Category) (*browserdata.BrowserData, error) {
|
||||
session, err := filemanager.NewSession()
if err != nil { return nil, err } // session failure aborts extraction
|
||||
defer session.Cleanup()
|
||||
|
||||
files := f.acquireFiles(session, categories)
|
||||
|
||||
// masterKey was derived in New() from this profile's key4.db
|
||||
data := &browserdata.BrowserData{}
|
||||
// ... extract each category using f.masterKey ...
|
||||
}
|
||||
```
|
||||
|
||||
### 2.4 Profile Discovery
|
||||
|
||||
Profile discovery functions are pure helpers (no struct receiver) that scan the filesystem:
|
||||
|
||||
```go
|
||||
// profile/finder.go
|
||||
|
||||
// discoverProfiles returns sub-directory names that look like Chrome profiles.
|
||||
// Matches "Default" or any name starting with "Profile ".
|
||||
// Falls back to ["."] for Opera-style layouts (data files live directly in userDataDir).
|
||||
func discoverProfiles(userDataDir string) []string {
|
||||
entries, err := os.ReadDir(userDataDir)
|
||||
if err != nil { return []string{"."} }
|
||||
|
||||
var profiles []string
|
||||
for _, e := range entries {
|
||||
if !e.IsDir() { continue }
|
||||
name := e.Name()
|
||||
if name == "Default" || strings.HasPrefix(name, "Profile ") {
|
||||
profiles = append(profiles, name)
|
||||
}
|
||||
}
|
||||
if len(profiles) == 0 {
|
||||
return []string{"."}
|
||||
}
|
||||
return profiles
|
||||
}
|
||||
|
||||
// discoverDataFiles checks which categories have actual data files in profileDir.
|
||||
func discoverDataFiles(profileDir string, sources map[types.Category]source) map[types.Category]string {
|
||||
found := make(map[types.Category]string)
|
||||
for cat, src := range sources {
|
||||
for _, rel := range src.paths {
|
||||
abs := filepath.Join(profileDir, rel)
|
||||
info, err := os.Stat(abs)
|
||||
if err != nil { continue }
|
||||
if src.isDir && !info.IsDir() { continue }
|
||||
if !src.isDir && info.IsDir() { continue }
|
||||
found[cat] = abs
|
||||
break
|
||||
}
|
||||
}
|
||||
return found
|
||||
}
|
||||
|
||||
// isValidBrowserDir checks whether the directory belongs to a real browser install.
|
||||
// Chromium: requires "Local State" file. Firefox: requires directory existence.
|
||||
func isValidBrowserDir(dir string, kind BrowserKind) bool {
|
||||
switch kind {
|
||||
case KindChromium, KindChromiumYandex:
|
||||
_, err := os.Stat(filepath.Join(dir, "Local State"))
|
||||
return err == nil
|
||||
case KindFirefox:
|
||||
info, err := os.Stat(dir)
|
||||
return err == nil && info.IsDir()
|
||||
}
|
||||
return false
|
||||
}
|
||||
```
|
||||
|
||||
**Testing approach**: all three functions are pure filesystem operations, easily testable with `t.TempDir()`:
|
||||
|
||||
```go
|
||||
func TestDiscoverProfiles(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
os.MkdirAll(filepath.Join(dir, "Default"), 0o755)
|
||||
os.MkdirAll(filepath.Join(dir, "Profile 1"), 0o755)
|
||||
os.MkdirAll(filepath.Join(dir, "System Profile"), 0o755)
|
||||
|
||||
profiles := discoverProfiles(dir)
|
||||
assert.Equal(t, []string{"Default", "Profile 1"}, profiles)
|
||||
}
|
||||
|
||||
func TestDiscoverDataFiles(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
os.WriteFile(filepath.Join(dir, "Login Data"), []byte{}, 0o644)
|
||||
os.MkdirAll(filepath.Join(dir, "Network"), 0o755)
|
||||
os.WriteFile(filepath.Join(dir, "Network", "Cookies"), []byte{}, 0o644)
|
||||
|
||||
files := discoverDataFiles(dir, chromiumSources)
|
||||
assert.Contains(t, files, types.Password)
|
||||
assert.Contains(t, files, types.Cookie)
|
||||
}
|
||||
|
||||
func TestAcquireFiles_Dedup(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
os.WriteFile(filepath.Join(dir, "History"), []byte("data"), 0o644)
|
||||
|
||||
session, _ := filemanager.NewSession()
|
||||
defer session.Cleanup()
|
||||
|
||||
c := &Chromium{profileDir: dir, sources: chromiumSources}
|
||||
files := c.acquireFiles(session, []types.Category{types.History, types.Download})
|
||||
assert.Equal(t, files[types.History], files[types.Download])
|
||||
}
|
||||
```
|
||||
|
||||
### 2.5 Platform Config Example
|
||||
|
||||
Each platform file returns the full list of known browsers with their `UserDataDir` paths:
|
||||
|
||||
```go
|
||||
// browser/browser_windows.go
|
||||
func platformBrowsers() []Config {
|
||||
return []Config{
|
||||
{Key: "chrome", Name: "Chrome", Kind: KindChromium, UserDataDir: homeDir + "/AppData/Local/Google/Chrome/User Data"},
|
||||
{Key: "edge", Name: "Microsoft Edge", Kind: KindChromium, UserDataDir: homeDir + "/AppData/Local/Microsoft/Edge/User Data"},
|
||||
{Key: "opera", Name: "Opera", Kind: KindChromium, UserDataDir: homeDir + "/AppData/Roaming/Opera Software/Opera Stable"},
|
||||
{Key: "yandex", Name: "Yandex", Kind: KindChromiumYandex, UserDataDir: homeDir + "/AppData/Local/Yandex/YandexBrowser/User Data"},
|
||||
{Key: "firefox", Name: "Firefox", Kind: KindFirefox, UserDataDir: homeDir + "/AppData/Roaming/Mozilla/Firefox/Profiles"},
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
`PickBrowsers()` iterates this list, calls `isValidBrowserDir()` to skip browsers that aren't installed, then calls `discoverProfiles()` to find all profiles within valid browser directories.
|
||||
|
||||
---
|
||||
|
||||
## 3. Shared Helpers: `utils/sqliteutil/`
|
||||
|
||||
### 3.1 SQLite query helper
|
||||
|
||||
```go
|
||||
// utils/sqliteutil/sqlite.go
|
||||
|
||||
func QuerySQLite(dbPath string, journalOff bool, query string, scanFn func(*sql.Rows) error) error {
|
||||
db, err := sql.Open("sqlite", dbPath)
|
||||
if err != nil { return err }
|
||||
defer db.Close()
|
||||
|
||||
if journalOff {
|
||||
if _, err := db.Exec("PRAGMA journal_mode=off"); err != nil { return err }
|
||||
}
|
||||
|
||||
rows, err := db.Query(query)
|
||||
if err != nil { return err }
|
||||
defer rows.Close()
|
||||
|
||||
for rows.Next() {
|
||||
if err := scanFn(rows); err != nil {
|
||||
log.Debugf("scan row error: %v", err)
|
||||
continue // skip bad row, continue extraction
|
||||
}
|
||||
}
|
||||
return rows.Err()
|
||||
}
|
||||
```
|
||||
|
||||
### 3.2 Generic query helper — `utils/sqliteutil/query.go`
|
||||
|
||||
```go
|
||||
package sqliteutil
|
||||
|
||||
// QueryRows is a generic helper (compatible with Go 1.20) that wraps QuerySQLite
|
||||
// and collects results into a typed slice. Each extract method
|
||||
// only needs to provide the scan function.
|
||||
func QueryRows[T any](path string, journalOff bool, query string, scanRow func(*sql.Rows) (T, error)) ([]T, error) {
|
||||
var items []T
|
||||
err := QuerySQLite(path, journalOff, query, func(rows *sql.Rows) error {
|
||||
item, err := scanRow(rows)
|
||||
if err != nil { return nil } // skip bad row
|
||||
items = append(items, item)
|
||||
return nil
|
||||
})
|
||||
return items, err
|
||||
}
|
||||
```
|
||||
|
||||
### 3.3 Chromium decrypt helper
|
||||
|
||||
Moved to `browser/chromium/decrypt.go` as an unexported function `decryptValue()`. It is Chromium-specific (DPAPI → AES-GCM/CBC fallback) and only used by Chromium extract methods. See RFC-001 for details.
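
Before choosing a decryption path, the helper has to decide which scheme a blob uses. A minimal sketch of prefix-based detection, assuming the three-byte `v10`, `v11`, and `v20` markers that Chromium prepends to encrypted values; the type and function names are hypothetical and do not reflect the actual `decryptValue()` internals.

```go
package main

import "bytes"

// cipherVersion labels the encryption scheme of a Chromium value.
type cipherVersion int

const (
	versionUnknown cipherVersion = iota // no known prefix, e.g. a raw DPAPI blob
	versionV10
	versionV11
	versionV20 // App-Bound Encryption
)

// prefixLen is the length of the version marker.
const prefixLen = 3

// detectVersion inspects the leading bytes of an encrypted value.
func detectVersion(ciphertext []byte) cipherVersion {
	switch {
	case bytes.HasPrefix(ciphertext, []byte("v10")):
		return versionV10
	case bytes.HasPrefix(ciphertext, []byte("v11")):
		return versionV11
	case bytes.HasPrefix(ciphertext, []byte("v20")):
		return versionV20
	default:
		return versionUnknown
	}
}

// stripPrefix removes the version marker when one is present and
// returns the input unchanged otherwise.
func stripPrefix(ciphertext []byte) []byte {
	if detectVersion(ciphertext) == versionUnknown {
		return ciphertext
	}
	return ciphertext[prefixLen:]
}
```

A caller would switch on the detected version to pick between the DPAPI path and the AES-GCM/CBC path, and could return an "unsupported" error for `versionV20` until that path is implemented.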
|
||||
|
||||
---
|
||||
|
||||
## 4. Extract Method Examples
|
||||
|
||||
Each extract method lives in its own `extract_*.go` file inside the browser engine package (see RFC-001 for naming convention). The default SQL query is a `const` in the same file. Override is checked via `c.queryOverrides`.
|
||||
|
||||
### 4.1 Chromium password (SQLite + decryption)
|
||||
|
||||
```go
|
||||
// browser/chromium/extract_password.go
|
||||
|
||||
const defaultLoginQuery = `SELECT origin_url, username_value, password_value, date_created FROM logins`
|
||||
|
||||
func (c *Chromium) extractPasswords(masterKey []byte, path string) ([]types.LoginEntry, error) {
|
||||
logins, err := sqliteutil.QueryRows(path, false, c.query(types.Password),
|
||||
func(rows *sql.Rows) (types.LoginEntry, error) {
|
||||
var url, username string
|
||||
var pwd []byte
|
||||
var created int64
|
||||
if err := rows.Scan(&url, &username, &pwd, &created); err != nil {
|
||||
return types.LoginEntry{}, err
|
||||
}
|
||||
password, _ := decryptValue(masterKey, pwd)
|
||||
return types.LoginEntry{
|
||||
URL: url,
|
||||
Username: username,
|
||||
Password: string(password),
|
||||
CreatedAt: typeutil.TimeEpoch(created),
|
||||
}, nil
|
||||
})
|
||||
if err != nil { return nil, err }
|
||||
|
||||
sort.Slice(logins, func(i, j int) bool {
|
||||
return logins[i].CreatedAt.After(logins[j].CreatedAt)
|
||||
})
|
||||
return logins, nil
|
||||
}
|
||||
```
|
||||
|
||||
### 4.2 Chromium cookie (SQLite + decryption)
|
||||
|
||||
```go
|
||||
// browser/chromium/extract_cookie.go
|
||||
|
||||
const defaultCookieQuery = `SELECT name, encrypted_value, host_key, path,
|
||||
creation_utc, expires_utc, is_secure, is_httponly,
|
||||
has_expires, is_persistent FROM cookies`
|
||||
|
||||
func (c *Chromium) extractCookies(masterKey []byte, path string) ([]types.CookieEntry, error) {
|
||||
cookies, err := sqliteutil.QueryRows(path, false, c.query(types.Cookie),
|
||||
func(rows *sql.Rows) (types.CookieEntry, error) {
|
||||
var (
|
||||
name, host, path string
|
||||
isSecure, isHTTPOnly, hasExpire, isPersistent int
|
||||
createdAt, expireAt int64
|
||||
encryptedValue []byte
|
||||
)
|
||||
if err := rows.Scan(&name, &encryptedValue, &host, &path,
|
||||
&createdAt, &expireAt, &isSecure, &isHTTPOnly,
|
||||
&hasExpire, &isPersistent); err != nil {
|
||||
return types.CookieEntry{}, err
|
||||
}
|
||||
|
||||
value, _ := decryptValue(masterKey, encryptedValue)
|
||||
return types.CookieEntry{
|
||||
Name: name,
|
||||
Host: host,
|
||||
Path: path,
|
||||
Value: string(value),
|
||||
IsSecure: isSecure != 0,
|
||||
IsHTTPOnly: isHTTPOnly != 0,
|
||||
ExpireAt: typeutil.TimeEpoch(expireAt),
|
||||
CreatedAt: typeutil.TimeEpoch(createdAt),
|
||||
}, nil
|
||||
})
|
||||
if err != nil { return nil, err }
|
||||
|
||||
sort.Slice(cookies, func(i, j int) bool {
|
||||
return cookies[i].CreatedAt.After(cookies[j].CreatedAt)
|
||||
})
|
||||
return cookies, nil
|
||||
}
|
||||
```
|
||||
|
||||
### 4.3 Firefox password (JSON + `decryptPBE()` helper)
|
||||
|
||||
Firefox uses `decryptPBE()` to combine the 3-step pipeline (base64 decode -> ASN1 PBE parse -> decrypt) into one call, reducing 6 error checks to 2.
|
||||
|
||||
```go
|
||||
// browser/firefox/extract_password.go
|
||||
|
||||
// decryptPBE combines base64 decode + ASN1 PBE parse + decrypt.
|
||||
func decryptPBE(encoded string, masterKey []byte) ([]byte, error) {
|
||||
raw, err := base64.StdEncoding.DecodeString(encoded)
|
||||
if err != nil { return nil, fmt.Errorf("base64 decode: %w", err) }
|
||||
pbe, err := crypto.NewASN1PBE(raw)
|
||||
if err != nil { return nil, fmt.Errorf("parse asn1 pbe: %w", err) }
|
||||
plaintext, err := pbe.Decrypt(masterKey)
|
||||
if err != nil { return nil, fmt.Errorf("decrypt: %w", err) }
|
||||
return plaintext, nil
|
||||
}
|
||||
|
||||
func (f *Firefox) extractPasswords(masterKey []byte, path string) ([]types.LoginEntry, error) {
|
||||
data, err := os.ReadFile(path)
|
||||
if err != nil { return nil, err }
|
||||
|
||||
var logins []types.LoginEntry
|
||||
for _, v := range gjson.GetBytes(data, "logins").Array() {
|
||||
user, err := decryptPBE(v.Get("encryptedUsername").String(), masterKey)
|
||||
if err != nil {
|
||||
log.Debugf("decrypt username: %v", err)
|
||||
continue
|
||||
}
|
||||
pwd, err := decryptPBE(v.Get("encryptedPassword").String(), masterKey)
|
||||
if err != nil {
|
||||
log.Debugf("decrypt password: %v", err)
|
||||
continue
|
||||
}
|
||||
|
||||
url := v.Get("formSubmitURL").String()
|
||||
if url == "" { url = v.Get("hostname").String() }
|
||||
|
||||
logins = append(logins, types.LoginEntry{
|
||||
URL: url,
|
||||
Username: string(user),
|
||||
Password: string(pwd),
|
||||
CreatedAt: typeutil.TimeStamp(v.Get("timeCreated").Int() / 1000),
|
||||
})
|
||||
}
|
||||
|
||||
sort.Slice(logins, func(i, j int) bool {
|
||||
return logins[i].CreatedAt.After(logins[j].CreatedAt)
|
||||
})
|
||||
return logins, nil
|
||||
}
|
||||
```
|
||||
|
||||
### 4.4 Firefox cookie (SQLite, no encryption)
|
||||
|
||||
```go
|
||||
// browser/firefox/extract_cookie.go
|
||||
|
||||
const firefoxCookieQuery = `SELECT name, value, host, path,
|
||||
creationTime, expiry, isSecure, isHttpOnly FROM moz_cookies`
|
||||
|
||||
func (f *Firefox) extractCookies(path string) ([]types.CookieEntry, error) {
|
||||
cookies, err := sqliteutil.QueryRows(path, true, firefoxCookieQuery,
|
||||
func(rows *sql.Rows) (types.CookieEntry, error) {
|
||||
var (
|
||||
name, value, host, path string
|
||||
isSecure, isHTTPOnly int
|
||||
createdAt, expiry int64
|
||||
)
|
||||
if err := rows.Scan(&name, &value, &host, &path,
|
||||
&createdAt, &expiry, &isSecure, &isHTTPOnly); err != nil {
|
||||
return types.CookieEntry{}, err
|
||||
}
|
||||
return types.CookieEntry{
|
||||
Name: name,
|
||||
Host: host,
|
||||
Path: path,
|
||||
Value: value, // not encrypted
|
||||
IsSecure: isSecure != 0,
|
||||
IsHTTPOnly: isHTTPOnly != 0,
|
||||
ExpireAt: typeutil.TimeStamp(expiry),
|
||||
CreatedAt: typeutil.TimeStamp(createdAt / 1000000),
|
||||
}, nil
|
||||
})
|
||||
if err != nil { return nil, err }
|
||||
|
||||
sort.Slice(cookies, func(i, j int) bool {
|
||||
return cookies[i].CreatedAt.After(cookies[j].CreatedAt)
|
||||
})
|
||||
return cookies, nil
|
||||
}
|
||||
```
|
||||
|
||||
### 4.5 Chromium local storage (LevelDB)
|
||||
|
||||
```go
|
||||
// browser/chromium/extract_storage.go
|
||||
|
||||
func (c *Chromium) extractLocalStorage(path string) ([]types.StorageEntry, error) {
|
||||
db, err := leveldb.OpenFile(path, nil)
|
||||
if err != nil { return nil, err }
|
||||
defer db.Close()
|
||||
|
||||
var entries []types.StorageEntry
|
||||
iter := db.NewIterator(nil, nil)
|
||||
defer iter.Release()
|
||||
|
||||
for iter.Next() {
|
||||
url, name := parseStorageKey(iter.Key(), []byte{0}) // \x00 separator
|
||||
if url == "" { continue }
|
||||
entries = append(entries, types.StorageEntry{
|
||||
URL: url,
|
||||
Key: name,
|
||||
Value: string(iter.Value()),
|
||||
})
|
||||
}
|
||||
return entries, iter.Error()
|
||||
}
|
||||
|
||||
func (c *Chromium) extractSessionStorage(path string) ([]types.StorageEntry, error) {
|
||||
db, err := leveldb.OpenFile(path, nil)
|
||||
if err != nil { return nil, err }
|
||||
defer db.Close()
|
||||
|
||||
var entries []types.StorageEntry
|
||||
iter := db.NewIterator(nil, nil)
|
||||
defer iter.Release()
|
||||
|
||||
for iter.Next() {
|
||||
url, name := parseStorageKey(iter.Key(), []byte("-")) // "-" separator
|
||||
if url == "" { continue }
|
||||
entries = append(entries, types.StorageEntry{
|
||||
URL: url,
|
||||
Key: name,
|
||||
Value: string(iter.Value()),
|
||||
})
|
||||
}
|
||||
return entries, iter.Error()
|
||||
}
|
||||
|
||||
func parseStorageKey(key []byte, separator []byte) (url, name string) {
|
||||
parts := bytes.SplitN(key, separator, 2)
|
||||
if len(parts) != 2 { return "", "" }
|
||||
return string(parts[0]), string(parts[1])
|
||||
}
|
||||
```
|
||||
|
||||
### 4.6 Key differences between engines

| Aspect | Chromium | Firefox |
|--------|----------|---------|
| Password source | SQLite (`Login Data`) | JSON (`logins.json`) |
| Password decryption | DPAPI → AES-GCM/CBC | ASN1PBE |
| Cookie encryption | Yes (masterKey needed) | No (plaintext) |
| Cookie journal_mode | Not needed | `PRAGMA journal_mode=off` |
| Time format | WebKit epoch (`TimeEpoch`) | Unix microseconds (`TimeStamp / 1e6`) |
| Storage format | LevelDB directory | SQLite (`webappsstore.sqlite`) |
| key4.db | Not used | Required for master key derivation |
| masterKey parameter | Passed to password, cookie, creditcard | Passed to password only |

|
||||
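
The time-format row deserves a concrete example. WebKit timestamps count microseconds since 1601-01-01 UTC, while Firefox stores Unix-based values. The sketch below shows both conversions with hypothetical function names; it does not reproduce the project's `typeutil.TimeEpoch` or `typeutil.TimeStamp`, whose exact behaviour is not shown in this document.

```go
package main

import "time"

// webkitToUnixSeconds is the number of seconds between
// 1601-01-01 (WebKit epoch) and 1970-01-01 (Unix epoch).
const webkitToUnixSeconds = 11644473600

// webkitToTime converts a Chromium timestamp (microseconds since 1601)
// to a time.Time in UTC. Zero is treated as "not set".
func webkitToTime(us int64) time.Time {
	if us == 0 {
		return time.Time{}
	}
	return time.UnixMicro(us - webkitToUnixSeconds*1_000_000).UTC()
}

// unixMicroToTime converts a Firefox-style timestamp
// (microseconds since 1970) to a time.Time in UTC.
func unixMicroToTime(us int64) time.Time {
	return time.UnixMicro(us).UTC()
}
```

`time.UnixMicro` has been in the standard library since Go 1.17, so this stays within the Go 1.20 constraint.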
### 4.7 Error handling in extract methods
|
||||
|
||||
Three-level rule:

| Level | Action | Example |
|-------|--------|---------|
| File/DB open failure | `return nil, err` | `os.ReadFile` fails, `sql.Open` fails |
| Single record failure | `log.Debugf` + `continue` | One password decryption failed |
| Entire Category failure | Collected into `errs` by caller | Cookie file locked |

Extract methods only `return error` for file-level failures. Record-level failures are logged at Debug level and skipped. The caller (`Extract()`) collects per-category errors with `errors.Join`.
|
||||
|
||||
Error wrapping uses `fmt.Errorf("context: %w", err)` — no custom error types.
|
||||
|
||||
---

## 5. File Acquisition Layer

### 5.1 Session manager

```go
// filemanager/session.go

// Session owns a temporary directory that holds copies of browser files.
type Session struct {
	tempDir string
}

// NewSession creates the temporary directory for one extraction run.
func NewSession() (*Session, error) {
	dir, err := os.MkdirTemp("", "hbd-*")
	if err != nil { return nil, err }
	return &Session{tempDir: dir}, nil
}

func (s *Session) TempDir() string { return s.tempDir }

// Acquire copies src to dst. Directories are copied recursively; files fall
// back to the platform-specific locked-file copy when a normal copy fails.
func (s *Session) Acquire(src, dst string, isDir bool) error {
	if isDir {
		return fileutil.CopyDir(src, dst, "lock")
	}
	// Try normal copy first
	err := fileutil.CopyFile(src, dst)
	if err != nil {
		// Normal copy failed (file may be locked), try platform-specific method
		if err2 := copyLocked(src, dst); err2 != nil {
			return fmt.Errorf("copy %s: %w; locked copy: %v", src, err, err2)
		}
	}
	// Copy SQLite WAL/SHM companion files if present (best effort)
	for _, suffix := range []string{"-wal", "-shm"} {
		if fileutil.IsFileExists(src + suffix) {
			_ = fileutil.CopyFile(src+suffix, dst+suffix)
		}
	}
	return nil
}

// Cleanup removes the temporary directory and everything copied into it.
func (s *Session) Cleanup() {
	_ = os.RemoveAll(s.tempDir)
}
```

### 5.2 Locked file handling (Windows)

On Windows, Chrome locks Cookie files while running. `Session.Acquire()` falls back to `copyLocked()`, which uses `syscall.CreateFile` with the `FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE` flags to bypass exclusive locks.

Platform-specific files:
- `filemanager/copy_windows.go`: `copyLocked()` with sharing flags
- `filemanager/copy_other.go`: stub returning error

This is transparent to callers: browser extract methods never know whether a file was copied normally or via the locked-file path.

### 5.3 Acquirer interface (deferred)

If only `CopyAcquirer` is needed, `Session.Acquire()` handles it directly. The `Acquirer` interface can be introduced later when VSS or other strategies are needed.

---

## 6. Output

```go
// browserdata/output.go

func (d *BrowserData) Output(dir, browserName, format string) error {
	items := []struct {
		name string
		data interface{}
		len  int
	}{
		{"password", d.Passwords, len(d.Passwords)},
		{"cookie", d.Cookies, len(d.Cookies)},
		{"bookmark", d.Bookmarks, len(d.Bookmarks)},
		{"history", d.Histories, len(d.Histories)},
		{"download", d.Downloads, len(d.Downloads)},
		{"creditcard", d.CreditCards, len(d.CreditCards)},
		{"extension", d.Extensions, len(d.Extensions)},
		{"localstorage", d.LocalStorage, len(d.LocalStorage)},
		{"sessionstorage", d.SessionStorage, len(d.SessionStorage)},
	}

	var errs []error
	for _, item := range items {
		if item.len == 0 { continue }
		filename := formatFilename(browserName, item.name, format)
		if err := writeFile(dir, filename, format, item.data); err != nil {
			errs = append(errs, fmt.Errorf("write %s: %w", filename, err))
			continue
		}
		log.Infof("exported: %s (%d items)", filename, item.len)
	}
	return errors.Join(errs...)
}

func writeFile(dir, filename, format string, data interface{}) error {
	if dir != "" {
		if err := os.MkdirAll(dir, 0o750); err != nil { return err }
	}
	path := filepath.Join(dir, filename)
	f, err := os.OpenFile(path, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0o600)
	if err != nil { return err }
	defer f.Close()

	switch format {
	case "json":
		return writeJSON(f, data)
	default:
		return writeCSV(f, data)
	}
}

func writeJSON(w io.Writer, data interface{}) error {
	enc := json.NewEncoder(w)
	enc.SetIndent("", " ")
	enc.SetEscapeHTML(false)
	return enc.Encode(data)
}

func writeCSV(w io.Writer, data interface{}) error {
	// UTF-8 BOM (3 bytes); replaces the golang.org/x/text dependency
	if _, err := w.Write([]byte{0xEF, 0xBB, 0xBF}); err != nil { return err }
	csvWriter := csv.NewWriter(w)
	return gocsv.MarshalCSV(data, gocsv.NewSafeCSVWriter(csvWriter))
}

func formatFilename(browserName, dataName, format string) string {
	r := strings.NewReplacer(" ", "_", ".", "_", "-", "_")
	ext := "csv"
	if format == "json" { ext = "json" }
	return strings.ToLower(fmt.Sprintf("%s_%s.%s", r.Replace(browserName), dataName, ext))
}
```

---

## 7. What Was Eliminated

| Before | After | Why |
|--------|-------|-----|
| `extractor/` package (interface + registry + factory) | Deleted | Browser engines have typed extract methods |
| `browserdata/password/`, `cookie/`, etc. (9 sub-packages) | Deleted | Extract logic moved into `browser/chromium/` and `browser/firefox/` |
| `browserdata/imports.go` | Deleted | No init() registration needed |
| `types.DataType` (22 iota constants) | `types.Category` (9 constants) | No browser prefix, no key types |
| `itemFileNames` map | `chromiumSources` / `firefoxSources` per engine | File layout is engine-internal |
| `TempFilename()` on DataType | `Session.TempDir()` + `filepath.Base()` | Session manages temp paths |
| `DefaultChromiumTypes`, `DefaultFirefoxTypes`, `DefaultYandexTypes` | `types.AllCategories` | One list for all engines |
| `loginData.encryptPass`, `cookie.encryptValue` | Local variables in extract methods | Encrypted fields don't belong in data models |
| 20 trivial `Name()` / `Len()` methods | Not needed | No Extractor interface |

---

## 8. Implementation Plan

### Phase 1: Foundation (new files only, zero risk)

1. `types/category.go`: Category enum
2. `types/models.go`: all `*Entry` structs
3. `browserdata/browserdata.go`: BrowserData struct
4. `utils/sqliteutil/sqlite.go`: QuerySQLite()
5. `browser/chromium/decrypt.go`: decryptValue() (Chromium-specific, unexported)
6. `filemanager/session.go`: Session

### Phase 2: Extract methods (new files, coexist with old code)

1. `browser/chromium/source.go`: chromiumSources, yandexSources
2. `browser/chromium/extract_*.go`: all 9 extract methods
3. `browser/firefox/source.go`: firefoxSources
4. `browser/firefox/extract_*.go`: all extract methods

### Phase 3: Wiring (modify existing files)

1. Update `Chromium.Extract()` to use new extract methods
2. Update `Firefox.Extract()` to use new extract methods
3. Update `Config` and `PickBrowsers()`
4. Update `browserdata/output.go`
5. Update CLI `main.go`

### Phase 4: Cleanup (delete old code)

1. Delete `extractor/` package
2. Delete `browserdata/imports.go`
3. Delete `browserdata/password/`, `cookie/`, etc.
4. Delete old `types.DataType`, `itemFileNames`
5. Delete `browser/consts.go`

### Phase 5: Verification

```bash
go test ./...
go vet ./...
gofmt -d .
GOOS=windows GOARCH=amd64 go build ./cmd/hack-browser-data/
GOOS=linux GOARCH=amd64 go build ./cmd/hack-browser-data/
GOOS=darwin GOARCH=amd64 go build ./cmd/hack-browser-data/
```

---

## 9. Open Questions

1. **Sort direction**: standardize all categories to DESC by date?
2. **Output format**: keep `gocsv` or switch to `encoding/csv`?
3. **LevelDB key parsing**: the current `fillKey`/`fillHeader`/`fillValue` logic in localstorage is complex; how much of that detail carries over?

---

## 10. Relationship with RFC-001

| Area | RFC-001 | RFC-002 (this doc) |
|------|---------|-------------------|
| Data model (`Category` + `*Entry`) | defines | uses |
| BrowserData container | defines | implements Output |
| Cipher version | covered | - |
| Master key retrieval | covered | - |
| Browser registration | covered | - |
| Yandex variant | covered | - |
| Error handling pattern | covered | - |
| File source mapping | - | covered |
| File acquisition | - | covered |
| Extract methods | - | covered |
| sqliteutil helpers | - | covered |
| Output | - | covered |

@@ -0,0 +1,139 @@
# RFC-002: Chromium Data Storage

**Author**: moonD4rk
**Status**: Living Document
**Created**: 2026-04-05

## 1. Data File Locations

All paths are relative to the profile directory (e.g. `~/.config/google-chrome/Default/`).

| Category | Candidate Paths (priority order) | Format |
|----------|----------------------------------|--------|
| Password | `Login Data` | SQLite |
| Cookie | `Network/Cookies`, `Cookies` | SQLite |
| History | `History` | SQLite |
| Download | `History` (same file) | SQLite |
| Bookmark | `Bookmarks` | JSON |
| CreditCard | `Web Data` | SQLite |
| Extension | `Secure Preferences` | JSON |
| LocalStorage | `Local Storage/leveldb/` | LevelDB dir |
| SessionStorage | `Session Storage/` | LevelDB dir |

Cookies have two candidate paths because older Chromium versions stored cookies at `<profile>/Cookies`, while newer versions moved them to `<profile>/Network/Cookies`. The first existing path wins.

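The "first existing path wins" rule can be sketched as a small helper. This is an illustrative sketch only: `firstExisting` is a hypothetical name, and the profile path in `main` is an example rather than a value taken from the project.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// firstExisting returns the first candidate (relative to profileDir) that
// exists on disk, or "" when none of them do.
func firstExisting(profileDir string, candidates []string) string {
	for _, rel := range candidates {
		// Candidates are written with forward slashes in the table above.
		p := filepath.Join(profileDir, filepath.FromSlash(rel))
		if _, err := os.Stat(p); err == nil {
			return p
		}
	}
	return ""
}

func main() {
	// Candidate order taken from the Cookie row of the table above.
	cookieCandidates := []string{"Network/Cookies", "Cookies"}
	fmt.Println(firstExisting("/home/user/.config/google-chrome/Default", cookieCandidates))
}
```

Keeping the candidates as an ordered slice means a future path change only needs a new entry at the front of the list.
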
## 2. Browser Variants

### 2.1 Yandex

Yandex overrides two file names from the standard Chromium layout:

| Category | Standard Chromium | Yandex |
|----------|-------------------|--------|
| Password | `Login Data` | `Ya Passman Data` |
| CreditCard | `Web Data` | `Ya Credit Cards` |

Yandex also uses `action_url` instead of `origin_url` in its password SQL query.

**Important limitation**: Yandex passwords and cookies currently cannot be decrypted because Yandex uses its own proprietary encryption algorithm. Only non-encrypted categories (bookmarks, history, downloads, extensions, storage) produce useful results.

### 2.2 Opera

Opera differs from standard Chromium in three ways:

- **Extension key**: Opera stores extension settings under `extensions.opsettings` in Secure Preferences, instead of the standard `extensions.settings`.
- **Windows path**: Opera uses `AppData/Roaming` rather than `AppData/Local`, unlike most Chromium browsers.
- **Flat layout**: Older Opera versions store data files directly in the user data directory without profile subdirectories (see Section 3).

## 3. Profile Discovery

Chromium supports multiple profiles (Default, Profile 1, Profile 2, ...) under a single user data directory. Profile discovery identifies which subdirectories are actual profiles versus internal directories like `Crashpad` or `ShaderCache`.

A directory is recognized as a profile if it contains a `Preferences` file. This convention follows Chromium's own source code -- Chromium creates a per-profile `Preferences` file on first use, making it a reliable marker even in early Chromium versions. Tencent-based browsers (QQ Browser, Sogou Explorer) use `Preferences_02` instead, which is also checked.

Certain directories are always skipped: `System Profile`, `Guest Profile`, and `Snapshot`.

**Flat layout fallback**: If no profile subdirectories are found, the user data directory itself is checked for any known source file. This handles Opera-style browsers that store data alongside `Local State` in the base directory.

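A minimal sketch of the discovery rules above, using only the standard library. The function and variable names (`discoverProfiles`, `isProfileDir`, `profileMarkers`, `skippedDirs`) are assumptions made for this illustration and may not match the project's code; the list of known source files is passed in by the caller.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Marker files and the skip list come from the text above.
var profileMarkers = []string{"Preferences", "Preferences_02"}

var skippedDirs = map[string]bool{
	"System Profile": true,
	"Guest Profile":  true,
	"Snapshot":       true,
}

func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// isProfileDir reports whether dir holds one of the marker files.
func isProfileDir(dir string) bool {
	for _, m := range profileMarkers {
		if exists(filepath.Join(dir, m)) {
			return true
		}
	}
	return false
}

// discoverProfiles lists profile directories under userDataDir. If none are
// found it falls back to the flat layout: userDataDir itself counts as a
// profile when it contains any of the known source files.
func discoverProfiles(userDataDir string, knownSources []string) ([]string, error) {
	entries, err := os.ReadDir(userDataDir)
	if err != nil {
		return nil, fmt.Errorf("read user data dir: %w", err)
	}
	var profiles []string
	for _, e := range entries {
		if !e.IsDir() || skippedDirs[e.Name()] {
			continue
		}
		if dir := filepath.Join(userDataDir, e.Name()); isProfileDir(dir) {
			profiles = append(profiles, dir)
		}
	}
	if len(profiles) > 0 {
		return profiles, nil
	}
	for _, src := range knownSources {
		if exists(filepath.Join(userDataDir, src)) {
			return []string{userDataDir}, nil // flat layout
		}
	}
	return nil, nil
}

func main() {
	// Example path and source names; adjust to the browser being inspected.
	profiles, err := discoverProfiles("/home/user/.config/google-chrome", []string{"Login Data", "History"})
	fmt.Println(profiles, err)
}
```
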
## 4. Data Storage Formats

### 4.1 Passwords (Login Data -- SQLite)

```sql
SELECT origin_url, username_value, password_value, date_created FROM logins
```

The `password_value` column contains encrypted bytes. See [RFC-003](003-chromium-encryption.md) for decryption.

### 4.2 Cookies (Cookies -- SQLite)

```sql
SELECT name, encrypted_value, host_key, path,
       creation_utc, expires_utc, is_secure, is_httponly,
       has_expires, is_persistent FROM cookies
```

The `encrypted_value` column contains encrypted bytes. Chrome 130+ (cookie DB schema version 24) prepends `SHA256(host_key)` to the cookie value before encryption as a cross-domain replay mitigation. After decryption, the cookie value layout is:

```
| SHA256(host_key) | actual cookie value |
|------------------|---------------------|
|       32B        |   remaining bytes   |
```

The first 32 bytes are verified against `SHA256(host_key)` and stripped if they match. If the decrypted value is shorter than 32 bytes or the hash does not match, the value is returned as-is (pre-Chrome 130 behavior).

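The verify-and-strip step described above could look like the sketch below. `stripHostKeyHash` is a hypothetical name; `prependHostKeyHash` exists only to build sample input for the demo and is not something the tool itself would need.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// stripHostKeyHash removes a leading SHA256(host_key) from a decrypted cookie
// value when it is present; otherwise the value is returned unchanged.
func stripHostKeyHash(decrypted []byte, hostKey string) []byte {
	sum := sha256.Sum256([]byte(hostKey))
	if len(decrypted) >= sha256.Size && bytes.Equal(decrypted[:sha256.Size], sum[:]) {
		return decrypted[sha256.Size:]
	}
	return decrypted
}

// prependHostKeyHash mimics the Chrome 130+ layout; used only for the demo.
func prependHostKeyHash(value []byte, hostKey string) []byte {
	sum := sha256.Sum256([]byte(hostKey))
	return append(sum[:], value...)
}

func main() {
	host := ".example.com"
	newStyle := prependHostKeyHash([]byte("session=abc"), host)
	fmt.Printf("%s\n", stripHostKeyHash(newStyle, host))              // hash matches: stripped
	fmt.Printf("%s\n", stripHostKeyHash([]byte("session=abc"), host)) // no hash: unchanged
}
```

Comparing against the hash instead of checking the browser version keeps the function correct for databases that mix old and new rows.
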
### 4.3 Bookmarks (Bookmarks -- JSON)

A JSON file with a `roots` object containing bookmark trees (bookmark_bar, other, synced). Each node has a `type` ("url" or "folder"), `name`, `url`, and `date_added`. Folder nodes contain a `children` array, forming a recursive tree that is walked to collect all URL entries.

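The recursive walk over the bookmark tree can be illustrated as below. This is a sketch under stated assumptions: the `node` and `bookmarkFile` types and the `parseBookmarks` and `collectURLs` functions are hypothetical, and only `type`, `name`, `url`, and `children` are decoded (`date_added` is left out to keep the example short).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// node mirrors a subset of the fields named above; unknown fields are ignored.
type node struct {
	Type     string `json:"type"`
	Name     string `json:"name"`
	URL      string `json:"url"`
	Children []node `json:"children"`
}

type bookmarkFile struct {
	Roots struct {
		BookmarkBar node `json:"bookmark_bar"`
		Other       node `json:"other"`
		Synced      node `json:"synced"`
	} `json:"roots"`
}

// collectURLs appends every "url" node below n to out, depth first.
func collectURLs(n node, out []node) []node {
	if n.Type == "url" {
		return append(out, n)
	}
	for _, child := range n.Children {
		out = collectURLs(child, out)
	}
	return out
}

// parseBookmarks decodes the file and walks the three roots in a fixed order.
func parseBookmarks(data []byte) ([]node, error) {
	var f bookmarkFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, fmt.Errorf("parse bookmarks: %w", err)
	}
	var out []node
	for _, root := range []node{f.Roots.BookmarkBar, f.Roots.Other, f.Roots.Synced} {
		out = collectURLs(root, out)
	}
	return out, nil
}

func main() {
	sample := `{"roots":{"bookmark_bar":{"type":"folder","name":"bar","children":[
		{"type":"url","name":"Go","url":"https://go.dev"}]}}}`
	urls, err := parseBookmarks([]byte(sample))
	fmt.Println(urls, err)
}
```
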
### 4.4 History (History -- SQLite)

```sql
SELECT url, title, visit_count, last_visit_time FROM urls
```

No encrypted fields. Results are sorted by visit count (descending).

### 4.5 Downloads (History -- SQLite, same file)

```sql
SELECT target_path, tab_url, total_bytes, start_time, end_time, mime_type FROM downloads
```

No encrypted fields. Shares the same `History` SQLite database as browsing history.

### 4.6 Credit Cards (Web Data -- SQLite)

```sql
SELECT guid, name_on_card, expiration_month, expiration_year,
       card_number_encrypted, nickname, billing_address_id FROM credit_cards
```

The `card_number_encrypted` column contains encrypted bytes.

### 4.7 Extensions (Secure Preferences -- JSON)

The `Secure Preferences` file contains extension metadata under `extensions.settings` (or variant-specific keys). Each extension entry includes a `manifest` object with name, description, version, and homepage URL. System/component extensions (location 5 or 10) are filtered out.

Extension enabled state is determined by `disable_reasons` (modern Chrome: empty array = enabled) or `state` (older Chrome: 1 = enabled).

### 4.8 LocalStorage / SessionStorage (LevelDB)

Both use LevelDB directories, but with different key encoding schemes.

**LocalStorage** keys use a binary format: a `_` prefix byte, followed by the origin URL, a null separator, and a Chromium-encoded string key. The string encoding uses a format byte: `0x01` for Latin-1, `0x00` for UTF-16 LE. Values follow the same encoding. Metadata entries (`META:`, `METAACCESS:`) and `VERSION` keys are recognized but not treated as user data.

**SessionStorage** uses a two-pass approach. First, `namespace-<guid>-<origin>` entries map GUIDs to origins. Then, `map-<map_id>-<key_name>` entries contain the actual data with raw UTF-16 LE values (no format byte prefix).

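The format-byte encoding used for LocalStorage strings can be decoded along the following lines. This sketch covers only the string payload (format byte plus data), not the surrounding key layout; `decodeChromiumString` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf16"
)

// decodeChromiumString decodes a LocalStorage key or value: the first byte is
// the format marker (0x01 Latin-1, 0x00 UTF-16 LE), the rest is the payload.
func decodeChromiumString(b []byte) (string, error) {
	if len(b) == 0 {
		return "", errors.New("empty value")
	}
	payload := b[1:]
	switch b[0] {
	case 0x01: // Latin-1: every byte is one Unicode code point
		runes := make([]rune, len(payload))
		for i, c := range payload {
			runes[i] = rune(c)
		}
		return string(runes), nil
	case 0x00: // UTF-16 little-endian code units
		if len(payload)%2 != 0 {
			return "", errors.New("odd-length UTF-16 payload")
		}
		units := make([]uint16, len(payload)/2)
		for i := range units {
			units[i] = uint16(payload[2*i]) | uint16(payload[2*i+1])<<8
		}
		return string(utf16.Decode(units)), nil
	default:
		return "", fmt.Errorf("unknown format byte 0x%02x", b[0])
	}
}

func main() {
	latin1 := []byte{0x01, 't', 'h', 'e', 'm', 'e'}
	utf16le := []byte{0x00, 'o', 0x00, 'k', 0x00}
	fmt.Println(decodeChromiumString(latin1))
	fmt.Println(decodeChromiumString(utf16le))
}
```

SessionStorage values, which have no format byte, would go straight through the UTF-16 LE branch.
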
## 5. Time Format

Chromium uses WebKit epoch timestamps: microseconds since 1601-01-01 00:00:00 UTC. This applies to `date_created`, `creation_utc`, `expires_utc`, `last_visit_time`, `start_time`, `end_time`, and `date_added`. To convert to Unix time, subtract 11644473600000000 microseconds (the offset between 1601 and 1970).

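As a worked example of the conversion, the sketch below subtracts the offset and builds a `time.Time`. The names `webkitEpochOffsetMicro` and `webkitToTime` are hypothetical, and mapping a zero timestamp to the zero `time.Time` is an assumption made here, not something the text above specifies.

```go
package main

import (
	"fmt"
	"time"
)

// webkitEpochOffsetMicro is the distance between 1601-01-01 and 1970-01-01
// in microseconds (11644473600 seconds).
const webkitEpochOffsetMicro int64 = 11644473600000000

// webkitToTime converts a WebKit timestamp (microseconds since 1601) to UTC.
// Assumption: a zero timestamp means "not set" and maps to the zero time.
func webkitToTime(webkitMicro int64) time.Time {
	if webkitMicro == 0 {
		return time.Time{}
	}
	return time.UnixMicro(webkitMicro - webkitEpochOffsetMicro).UTC()
}

func main() {
	// Exactly the offset, so this is the Unix epoch.
	fmt.Println(webkitToTime(11644473600000000).Format(time.RFC3339))
	// One day after the Unix epoch.
	fmt.Println(webkitToTime(11644473600000000 + 86400*1000000).Format(time.RFC3339))
}
```
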
## Related RFCs

| RFC | Topic |
|-----|-------|
| [RFC-003](003-chromium-encryption.md) | Chromium encryption mechanisms per platform |
| [RFC-006](006-key-retrieval-mechanisms.md) | Platform-specific master key retrieval |
| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition and platform quirks |

@@ -0,0 +1,120 @@
# RFC-003: Chromium Encryption

**Author**: moonD4rk
**Status**: Living Document
**Created**: 2026-04-05

## 1. Overview

Chromium encrypts sensitive fields in three data categories: passwords (`password_value`), cookies (`encrypted_value`), and credit cards (`card_number_encrypted`). The encryption algorithm varies by platform -- macOS and Linux use AES-128-CBC with a PBKDF2-derived key, while Windows uses AES-256-GCM with a DPAPI-protected key.

Non-sensitive categories (history, bookmarks, downloads, extensions, storage) are stored in plaintext and do not require decryption.

## 2. Cipher Version Detection

Encrypted values normally begin with a 3-byte prefix that identifies the cipher version; legacy DPAPI blobs have no prefix:

| Prefix | Version | Meaning |
|--------|---------|---------|
| `v10` | CipherV10 | Chrome 80+ standard encryption (AES-GCM on Windows, AES-CBC on macOS/Linux) |
| `v20` | CipherV20 | Chrome 127+ App-Bound Encryption |
| (none) | CipherDPAPI | Pre-Chrome 80 raw DPAPI encryption (Windows only, no prefix) |

If the ciphertext is shorter than 3 bytes or the prefix is unrecognized, it is treated as legacy DPAPI.

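The prefix check above amounts to a short switch. In this sketch the constant names follow the table, while the `CipherVersion` type and the `detectVersion` function are assumed names for illustration.

```go
package main

import "fmt"

// CipherVersion identifies how an encrypted value must be decrypted.
type CipherVersion int

const (
	CipherDPAPI CipherVersion = iota // no recognized prefix: legacy DPAPI blob
	CipherV10                        // "v10" prefix
	CipherV20                        // "v20" prefix (App-Bound Encryption)
)

const versionPrefixLen = 3

// detectVersion inspects the first three bytes of the ciphertext. Anything
// too short or with an unknown prefix is treated as legacy DPAPI.
func detectVersion(ciphertext []byte) CipherVersion {
	if len(ciphertext) < versionPrefixLen {
		return CipherDPAPI
	}
	switch string(ciphertext[:versionPrefixLen]) {
	case "v10":
		return CipherV10
	case "v20":
		return CipherV20
	default:
		return CipherDPAPI
	}
}

func main() {
	fmt.Println(detectVersion([]byte("v10...")) == CipherV10)
	fmt.Println(detectVersion([]byte("v20...")) == CipherV20)
	fmt.Println(detectVersion([]byte{0x01, 0x00}) == CipherDPAPI)
}
```
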
## 3. macOS Encryption

Chromium on macOS stores a per-browser secret in the macOS Keychain (e.g. "Chrome Safe Storage", "Brave Safe Storage"). The master key is derived from this secret via PBKDF2:

| Parameter | Value |
|-----------|-------|
| Hash | SHA-1 |
| Salt | `saltysalt` |
| Iterations | 1003 |
| Key length | 16 bytes (AES-128) |

Decryption uses AES-128-CBC with a fixed IV of 16 space bytes (`0x20`). The ciphertext layout:

```
| v10   | AES-CBC ciphertext (PKCS5 padded)   |
|-------|-------------------------------------|
| 3B    | remaining bytes                     |
```

There are three retrieval strategies for the Keychain secret, tried in order: (1) gcoredump exploit for securityd process memory, (2) direct keychain unlock with user's login password, (3) `security` CLI command (may trigger a GUI prompt). See [RFC-006](006-key-retrieval-mechanisms.md) for details.

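The key derivation and decryption steps above can be reproduced with the standard library alone. This is a sketch, not the project's implementation: `deriveKey` is a hand-written single-block PBKDF2-HMAC-SHA1 (sufficient for a 16-byte key), `encryptV10CBC` exists only so the example can round-trip, and `"example-secret"` stands in for the real Keychain secret.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/sha1"
	"errors"
	"fmt"
)

// deriveKey is PBKDF2-HMAC-SHA1 restricted to one output block, which covers
// the 16-byte key needed here (SHA-1 yields 20 bytes per block).
func deriveKey(secret, salt []byte, iterations int) []byte {
	mac := hmac.New(sha1.New, secret)
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1, big-endian
	u := mac.Sum(nil)
	t := append([]byte(nil), u...)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range t {
			t[j] ^= u[j]
		}
	}
	return t[:16]
}

// fixedIV returns the 16 space bytes used as the CBC IV.
func fixedIV() []byte { return bytes.Repeat([]byte{0x20}, aes.BlockSize) }

// encryptV10CBC builds a "v10" value; demo helper only.
func encryptV10CBC(key, plaintext []byte) []byte {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	pad := aes.BlockSize - len(plaintext)%aes.BlockSize
	padded := append(append([]byte(nil), plaintext...), bytes.Repeat([]byte{byte(pad)}, pad)...)
	out := make([]byte, len(padded))
	cipher.NewCBCEncrypter(block, fixedIV()).CryptBlocks(out, padded)
	return append([]byte("v10"), out...)
}

// decryptV10CBC strips the prefix, decrypts, and removes the PKCS#5 padding.
func decryptV10CBC(key, encrypted []byte) ([]byte, error) {
	if len(encrypted) < 3 || string(encrypted[:3]) != "v10" {
		return nil, errors.New("missing v10 prefix")
	}
	ct := encrypted[3:]
	if len(ct) == 0 || len(ct)%aes.BlockSize != 0 {
		return nil, errors.New("ciphertext is not a multiple of the block size")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	out := make([]byte, len(ct))
	cipher.NewCBCDecrypter(block, fixedIV()).CryptBlocks(out, ct)
	pad := int(out[len(out)-1])
	if pad == 0 || pad > aes.BlockSize {
		return nil, errors.New("invalid padding")
	}
	return out[:len(out)-pad], nil
}

func main() {
	key := deriveKey([]byte("example-secret"), []byte("saltysalt"), 1003)
	enc := encryptV10CBC(key, []byte("hunter2"))
	plain, err := decryptV10CBC(key, enc)
	fmt.Printf("%s %v\n", plain, err)
}
```

In real code a maintained PBKDF2 implementation is the better choice; the hand-rolled loop is here only to keep the example free of external dependencies.
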
## 4. Windows Encryption

Chromium on Windows stores a base64-encoded encrypted key in `Local State` at `os_crypt.encrypted_key`. The key recovery process is:

1. Base64-decode the `encrypted_key` value
2. Strip the 5-byte `DPAPI` ASCII prefix
3. Decrypt via Windows `CryptUnprotectData` (DPAPI) to obtain the 256-bit master key

With the master key, each encrypted value is decrypted as AES-256-GCM:

```
| v10   | nonce  | ciphertext + auth tag (16B) |
|-------|--------|-----------------------------|
| 3B    | 12B    | remaining bytes             |
```

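Given a master key, the AES-256-GCM layout above maps directly onto Go's `crypto/cipher` package. The sketch below assumes the key has already been recovered (the DPAPI step is Windows-only and not shown); `decryptV10GCM` and `sealV10GCM` are hypothetical names, and `sealV10GCM` with its all-zero key and nonce exists only to produce sample input.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"errors"
	"fmt"
)

const (
	gcmPrefixLen = 3  // "v10"
	gcmNonceLen  = 12 // standard GCM nonce size
)

func newGCM(masterKey []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(masterKey) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// decryptV10GCM splits the value into prefix, nonce, and ciphertext+tag and
// authenticates and decrypts it with the master key.
func decryptV10GCM(masterKey, encrypted []byte) ([]byte, error) {
	if len(encrypted) < gcmPrefixLen+gcmNonceLen || string(encrypted[:gcmPrefixLen]) != "v10" {
		return nil, errors.New("not a v10 AES-GCM value")
	}
	gcm, err := newGCM(masterKey)
	if err != nil {
		return nil, err
	}
	nonce := encrypted[gcmPrefixLen : gcmPrefixLen+gcmNonceLen]
	return gcm.Open(nil, nonce, encrypted[gcmPrefixLen+gcmNonceLen:], nil)
}

// sealV10GCM builds a value in the same layout; demo helper only.
func sealV10GCM(masterKey, nonce, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(masterKey)
	if err != nil {
		return nil, err
	}
	if len(nonce) != gcm.NonceSize() {
		return nil, errors.New("nonce must be 12 bytes")
	}
	out := append([]byte("v10"), nonce...)
	return gcm.Seal(out, nonce, plaintext, nil), nil
}

func main() {
	key := make([]byte, 32)            // demo key, all zero
	nonce := make([]byte, gcmNonceLen) // demo nonce, all zero
	enc, err := sealV10GCM(key, nonce, []byte("secret"))
	if err != nil {
		panic(err)
	}
	plain, err := decryptV10GCM(key, enc)
	fmt.Printf("%s %v\n", plain, err)
}
```

Because GCM is authenticated, a wrong key or a corrupted value surfaces as an error from `Open` rather than as garbage plaintext.
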
**Legacy DPAPI**: values without a `v10`/`v20` prefix (pre-Chrome 80) are passed directly to `CryptUnprotectData`:

```
| DPAPI blob (no prefix)              |
|-------------------------------------|
| variable length                     |
```

## 5. Linux Encryption

Chromium on Linux retrieves a per-browser secret from D-Bus Secret Service (GNOME Keyring or KDE Wallet). The label matches the browser's storage name (e.g. "Chrome Safe Storage", "Chromium Safe Storage"). If D-Bus is unavailable, the hardcoded fallback password `peanuts` is used.

The master key is derived via PBKDF2 with the same parameters as macOS except for the iteration count:

| Parameter | Value |
|-----------|-------|
| Hash | SHA-1 |
| Salt | `saltysalt` |
| Iterations | 1 |
| Key length | 16 bytes (AES-128) |

Decryption uses the same AES-128-CBC scheme as macOS (fixed IV of 16 space bytes, PKCS5 padding).

## 6. v20 App-Bound Encryption (Chrome 127+)

Chrome 127 introduced App-Bound Encryption on Windows, identified by the `v20` prefix. This scheme binds the encryption key to the Chrome application identity, making it harder for external tools to decrypt. After decryption, the payload contains a 32-byte application header before the actual plaintext:

```
| v20   | nonce  | AES-GCM payload                     |
|-------|--------|-------------------------------------|
| 3B    | 12B    | remaining bytes                     |

After decryption:
| app-bound header | plaintext                          |
|------------------|------------------------------------|
| 32B              | remaining bytes                    |
```

**Current status**: v20 decryption is not yet implemented. Encountering a `v20`-prefixed value returns an error. This primarily affects recent Chrome installations on Windows.

## 7. Decryption Flow

The high-level decryption path for any encrypted Chromium value:

1. **Detect version** -- inspect the first 3 bytes of the ciphertext
2. **Route by version**:
   - `v10` -- strip prefix, call platform-specific decryption (AES-CBC on macOS/Linux, AES-GCM on Windows)
   - `v20` -- not yet supported, return error
   - DPAPI (no prefix) -- call Windows `CryptUnprotectData` directly (Windows only; returns error on other platforms)
3. **Return plaintext** -- the decrypted bytes are interpreted as a UTF-8 string

Each record is decrypted independently. A failure to decrypt one value does not prevent extraction of other records in the same database.

## Related RFCs

| RFC | Topic |
|-----|-------|
| [RFC-002](002-chromium-data-storage.md) | Chromium data file locations and storage formats |
| [RFC-006](006-key-retrieval-mechanisms.md) | Platform-specific master key retrieval |

@@ -1,53 +0,0 @@
# RFC-003: Crypto Package and Naming Cleanup

**Author**: moonD4rk
**Status**: Proposed
**Created**: 2026-04-03

## Abstract

The `crypto/` package and cross-browser shared code have accumulated naming
and structural issues over time. This RFC tracks them for a future dedicated
refactoring pass. No code changes are proposed here.

## 1. crypto/asn1pbe.go

### Naming

| Current | Issue | Suggested |
|---------|-------|-----------|
| `nssPBE` | Too generic: "NSS" covers all Firefox crypto | `privateKeyPBE`: decrypts key4.db nssPrivate entries |
| `metaPBE` | "meta" is vague | `passwordCheckPBE`: decrypts key4.db metaData check |
| `loginPBE` | Acceptable but inconsistent | `credentialPBE`: decrypts logins.json credentials |
| `ASN1PBE` interface | Too technical for callers | `Decryptor` or `PBEDecryptor` |
| `SlatAttr` | **Typo**: should be `Salt` | `SaltAttr` |
| `AlgoAttr.Data.Data` | Nested names are meaningless | Flatten with descriptive field names |
| `AES128CBCDecrypt` | Misnomer: supports all AES key lengths | `AESCBCDecrypt` |

### Structure

`NewASN1PBE` uses trial-and-error `asn1.Unmarshal` to detect the type.
ASN1 parsing is lenient, so multiple structs may succeed. A safer approach
would be to parse the OID first, then unmarshal into the matching struct.

## 2. crypto/crypto_*.go

| Current | Issue |
|---------|-------|
| `DecryptWithChromium` | Platform-specific (AES-CBC on darwin, AES-GCM on windows); name doesn't reflect this |
| `DecryptWithYandex` | Nearly identical to `DecryptWithChromium` on Windows |

## 3. Shared code between Chromium and Firefox

`discoverProfiles`, `hasAnySource`, `resolveSourcePaths`, `resolvedPath`
are nearly identical in both packages (~40 lines duplicated). Currently
each package keeps its own copy for independence. If more browser engines
are added (e.g. Safari WebKit), consider extracting to a shared package.

## 4. Priority

1. **SlatAttr typo**: trivial fix, do anytime
2. **AES128CBCDecrypt rename**: grep + rename, low risk
3. **ASN1PBE type/naming cleanup**: medium effort, needs comprehensive tests
4. **NewASN1PBE OID-first detection**: higher effort, must not break any Firefox version
5. **Shared profile discovery**: only when a third browser engine is added

@@ -1,304 +0,0 @@
# RFC-004: CLI (Cobra) and Output Design

**Author**: moonD4rk
**Status**: Proposed
**Created**: 2026-04-03
**Updated**: 2026-04-03

## Context

The v2 architecture delivers `Extract() → *types.BrowserData`. The remaining
pieces are the CLI for user interaction and the output layer for writing results to files.
The current CLI uses `urfave/cli` with flat flags; the plan is to migrate to `cobra` with
subcommands for better extensibility.

## 1. CLI Design

### Subcommands

```
hack-browser-data
├── dump                          # extract browser data (default when no subcommand)
│   ├── -b, --browser             all|chrome|firefox|...     (default: all)
│   ├── -c, --category            all|password,cookie,...    (default: all)
│   ├── -f, --format              csv|json|cookie-editor     (default: csv)
│   ├── -d, --dir                 output directory           (default: results)
│   ├── -p, --profile-path        custom profile path
│   ├── --keychain-pw             macOS keychain password
│   └── --zip                     compress output
│
├── list                          # show detected browsers and profile paths
│   └── --detail                  show per-category entry counts (no decryption)
│
└── global flags
    ├── -v, --verbose
    └── --version
```

Running `hack-browser-data` with no subcommand defaults to `dump`.

### Examples

```bash
hack-browser-data                                     # dump all
hack-browser-data dump -b chrome -c password,cookie   # specific
hack-browser-data dump -b chrome -f json              # JSON output
hack-browser-data dump -f cookie-editor               # CookieEditor format
hack-browser-data list                                # show browsers
hack-browser-data list --detail                       # show counts
```

### Removed/changed flags vs current CLI

| Current flag | Action | Reason |
|-------------|--------|--------|
| `--full-export` | Removed | Replaced by `--category all` (default) |
| `--results-dir` | Renamed `--dir` | Shorter |
| (none) | New `--category` | Fine-grained control |
| (none) | New `--keychain-pw` | macOS keychain password |
| (none) | New `--format cookie-editor` | CookieEditor compatibility |

### Code structure

```
cmd/hack-browser-data/
├── main.go   # cobra root command setup
├── dump.go   # dump subcommand
└── list.go   # list subcommand
```

## 2. Output Design

### File organization

One file per category. Browser and profile are columns, not filenames:

```
results/
├── password.csv
├── cookie.csv
├── history.csv
├── bookmark.csv
├── download.csv
├── extension.csv
├── creditcard.csv
├── localstorage.csv
└── sessionstorage.csv
```

At most 9 files, regardless of how many browsers/profiles.

Example `password.csv`:
```
browser,profile,url,username,password,created_at
Chrome,Default,https://example.com,alice,xxx,2026-01-01
Chrome,Profile 1,https://github.com,bob,yyy,2026-02-01
Firefox,abc123.default,https://reddit.com,charlie,zzz,2026-03-01
```

Example `password.json`:
```json
[
  {"browser":"Chrome","profile":"Default","url":"https://example.com","username":"alice","password":"xxx","created_at":"2026-01-01T00:00:00Z"},
  {"browser":"Firefox","profile":"abc123.default","url":"https://reddit.com","username":"charlie","password":"zzz","created_at":"2026-03-01T00:00:00Z"}
]
```

### Architecture: encapsulated Writer struct

The `Writer` struct is the only exported type. All internals (formatter,
row types, file management) are unexported. The caller sees only three
entry points: `NewWriter()`, `Add()`, and `Write()`.

```go
// output/output.go: the only exported type

type Writer struct {
	dir       string
	formatter formatter // unexported
	results   []result  // unexported
}

func NewWriter(dir, format string) (*Writer, error) {
	f, err := newFormatter(format)
	if err != nil {
		return nil, err
	}
	return &Writer{dir: dir, formatter: f}, nil
}

func (w *Writer) Add(browser, profile string, data *types.BrowserData) {
	w.results = append(w.results, result{browser, profile, data})
}

func (w *Writer) Write() error {
	// 1. aggregate all results by category into row slices
	// 2. for each non-empty category, format to buffer, write file
	return nil // body elided in this RFC
}
```

Caller code (error handling omitted for brevity):

```go
w, _ := output.NewWriter(dir, "csv")
for _, b := range browsers {
	data, _ := b.Extract(categories)
	w.Add(b.BrowserName(), b.ProfileName(), data)
}
w.Write()
```

### Data layer stays pure

Entry structs do NOT contain browser/profile. Each field carries both
`json` and `csv` struct tags: JSON output reads `json` tags, CSV output
reads `csv` tags via reflection. No methods on entry types.

```go
// types/models.go: pure data, no methods
type LoginEntry struct {
	URL       string    `json:"url" csv:"url"`
	Username  string    `json:"username" csv:"username"`
	Password  string    `json:"password" csv:"password"`
	CreatedAt time.Time `json:"created_at" csv:"created_at"`
}
```

### Internal row type (unexported)

A single `row` type wraps any entry with browser/profile context:

```go
// output/row.go (unexported)

type row struct {
	Browser string
	Profile string
	entry   any
}
```

- **CSV**: `row.csvHeader()` / `row.csvRow()` use reflection to read `csv`
  struct tags and convert field values to strings (handles string, bool,
  int, int64, time.Time).
- **JSON**: `row.MarshalJSON()` uses `reflect.StructOf` to dynamically
  build a flat struct with browser/profile fields followed by entry fields,
  then delegates to `json.Marshal`. No manual string concatenation.

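The `reflect.StructOf` approach described for JSON can be sketched as follows. This is an illustration under stated assumptions rather than the planned implementation: it requires that every entry field is exported and that no entry field is itself named `Browser` or `Profile`, and `loginEntry` is a trimmed stand-in for the real entry type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// loginEntry is a reduced stand-in for types.LoginEntry.
type loginEntry struct {
	URL      string `json:"url" csv:"url"`
	Username string `json:"username" csv:"username"`
}

type row struct {
	Browser string
	Profile string
	entry   any
}

// MarshalJSON builds a flat struct type at runtime: browser and profile first,
// then every field of the wrapped entry with its original struct tags.
func (r row) MarshalJSON() ([]byte, error) {
	ev := reflect.ValueOf(r.entry)
	if ev.Kind() != reflect.Struct {
		return nil, fmt.Errorf("row entry must be a struct, got %T", r.entry)
	}
	et := ev.Type()
	fields := []reflect.StructField{
		{Name: "Browser", Type: reflect.TypeOf(""), Tag: `json:"browser"`},
		{Name: "Profile", Type: reflect.TypeOf(""), Tag: `json:"profile"`},
	}
	for i := 0; i < et.NumField(); i++ {
		f := et.Field(i)
		fields = append(fields, reflect.StructField{Name: f.Name, Type: f.Type, Tag: f.Tag})
	}
	// Allocate a value of the dynamic type and copy the data into it.
	flat := reflect.New(reflect.StructOf(fields)).Elem()
	flat.Field(0).SetString(r.Browser)
	flat.Field(1).SetString(r.Profile)
	for i := 0; i < et.NumField(); i++ {
		flat.Field(i + 2).Set(ev.Field(i))
	}
	return json.Marshal(flat.Interface())
}

func main() {
	rows := []row{{Browser: "Chrome", Profile: "Default", entry: loginEntry{URL: "https://example.com", Username: "alice"}}}
	out, err := json.Marshal(rows)
	fmt.Println(string(out), err)
}
```

Since the dynamic struct reuses the entry's own tags, adding a field to an entry type changes the JSON output without touching the output package.
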
### Internal formatter interface (unexported)

```go
// output/formatter.go (unexported)

type formatter interface {
	format(w io.Writer, rows []row) error
	ext() string
}

func newFormatter(name string) (formatter, error) {
	switch name {
	case "csv":
		return &csvFormatter{}, nil
	case "json":
		return &jsonFormatter{}, nil
	case "cookie-editor":
		return &cookieEditorFormatter{}, nil
	default:
		return nil, fmt.Errorf("unsupported format: %s", name)
	}
}
```

### Format support

**CSV** (default):
- Standard `encoding/csv` (**no gocsv dependency**)
- UTF-8 BOM for Excel compatibility
- Headers and values derived from `csv` struct tags via reflection

**JSON**:
- Valid JSON Array per file (not JSON Lines)
- Pretty-printed with `json.Encoder`, no HTML escape
- `reflect.StructOf` dynamically flattens browser/profile + entry fields

**CookieEditor** (`--format cookie-editor`):
- Only exports cookies, other categories skipped
- Field mapping: host→domain, IsSecure→secure, ExpireAt→expirationDate (unix)

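The CSV behavior listed above (standard `encoding/csv` plus a 3-byte UTF-8 BOM) needs very little code. The sketch below takes pre-built header and record slices instead of deriving them from struct tags; `writeCSV`, `renderCSV`, and `utf8BOM` are names assumed for this illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"io"
)

// utf8BOM lets Excel detect the file as UTF-8.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// writeCSV writes the BOM, the header row, and all records to w.
func writeCSV(w io.Writer, header []string, records [][]string) error {
	if _, err := w.Write(utf8BOM); err != nil {
		return fmt.Errorf("write BOM: %w", err)
	}
	cw := csv.NewWriter(w)
	if err := cw.Write(header); err != nil {
		return err
	}
	// WriteAll flushes the writer and reports any buffered write error.
	return cw.WriteAll(records)
}

// renderCSV is a convenience wrapper that returns the output as a string.
func renderCSV(header []string, records [][]string) (string, error) {
	var buf bytes.Buffer
	if err := writeCSV(&buf, header, records); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderCSV(
		[]string{"browser", "profile", "url"},
		[][]string{{"Chrome", "Default", "https://example.com"}},
	)
	fmt.Printf("%q %v\n", out, err)
}
```

Checking the error from the BOM write matters because a failed first write would otherwise only show up later, when the CSV writer flushes.
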
### Dependency changes

- **Remove**: `github.com/gocarina/gocsv`
- **Remove**: `golang.org/x/text` (UTF-8 BOM = 3 bytes directly)
- **Add**: `github.com/spf13/cobra`

### Output package structure

```
output/
├── output.go          # Writer struct (exported): NewWriter(), Add(), Write()
├── row.go             # Unified row type (unexported) + MarshalJSON
├── reflect.go         # Reflection helpers: csv tag parsing, field formatting
├── formatter.go       # formatter interface (unexported) + newFormatter()
├── csv.go             # csvFormatter (unexported)
├── json.go            # jsonFormatter (unexported)
└── cookie_editor.go   # cookieEditorFormatter (unexported)
```

## 3. `list` Command
|
||||
|
||||
### Basic mode
|
||||
|
||||
Shows real filesystem paths detected by `NewBrowsers`. No database access.
|
||||
|
||||
```
|
||||
$ hack-browser-data list
|
||||
|
||||
Browser Profile Path
|
||||
Chrome Default /Users/x/Library/.../Google/Chrome/Default
|
||||
Chrome Profile 1 /Users/x/Library/.../Google/Chrome/Profile 1
|
||||
Firefox abc123.default-release /Users/x/Library/.../Firefox/Profiles/abc123...
|
||||
```
|
||||
|
||||
### Detail mode (`--detail`)
|
||||
|
||||
Counts entries per category without decryption:
|
||||
|
||||
```
|
||||
$ hack-browser-data list --detail
|
||||
|
||||
Browser Profile Password Cookie History Bookmark Extension
|
||||
Chrome Default 1 3544 66 852 39
|
||||
Chrome Profile 1 2 802 32 0 3
|
||||
Firefox abc123.default-release 3 48 53 7 0
|
||||
```
|
||||
|
||||
## 4. Data flow
|
||||
|
||||
```
|
||||
CLI (cobra dump)
|
||||
→ Parse flags: browser, category, format, dir, keychain-pw
|
||||
→ browser.Pick(browserName, keychainPwd) → []Browser
|
||||
→ w, _ := output.NewWriter(dir, format)
|
||||
→ For each browser:
|
||||
→ data, _ := b.Extract(categories)
|
||||
→ w.Add(b.BrowserName(), b.ProfileName(), data)
|
||||
→ w.Write()
|
||||
→ Optional: compress dir to zip
|
||||
|
||||
CLI (cobra list)
|
||||
→ browser.Pick("all", "") → []Browser
|
||||
→ For each browser:
|
||||
→ Print BrowserName() + ProfileName() + profileDir
|
||||
→ If --detail: Extract + count entries
|
||||
```
|
||||
|
||||
## 5. Implementation status
|
||||
|
||||
- [x] `output/` package: Writer struct + unified row type + reflection-based CSV/JSON + formatters
|
||||
- [x] `types/category.go`: removed Each() and CategoryData
|
||||
- [x] `types/models.go`: pure data structs with `json` + `csv` tags, no methods
|
||||
- [x] Tests: 27 tests covering CSV/JSON/CookieEditor output, reflection helpers, MarshalJSON, csv tag coverage
|
||||
- [ ] (PR 2) Rewrite browser dispatch + cobra CLI
|
||||
- [ ] (PR 3) Delete old code + rename files
|
||||
|
||||
## 6. Future extensions
|
||||
|
||||
- `--group-by browser` — one file per browser+category (group by browser)
|
||||
- `--group-by profile` — one file per browser+profile+category (group by profile)
|
||||
- `--format netscape` — Netscape cookie.txt format (curl/wget compatible)
|
||||
- `--format har` — HAR (HTTP Archive) format
|
||||
@@ -0,0 +1,129 @@
|
||||
# RFC-004: Firefox Data Storage
|
||||
|
||||
**Author**: moonD4rk
|
||||
**Status**: Living Document
|
||||
**Created**: 2026-04-05
|
||||
|
||||
## 1. Profile Structure
|
||||
|
||||
Firefox stores per-user data in **profile directories** beneath a platform-specific root (e.g. `~/Library/Application Support/Firefox/Profiles/` on macOS). Each profile directory has a random-prefix name like `97nszz88.default-release`.
|
||||
|
||||
Profile discovery enumerates subdirectories of the root and accepts any directory that contains at least one known data file. Unlike Chromium (which looks for a `Preferences` sentinel), Firefox validation simply checks for the presence of any source file from the table below.
|
||||
|
||||
## 2. Data File Locations
|
||||
|
||||
All paths are relative to the profile directory.
|
||||
|
||||
| Category | File | Format |
|
||||
|----------|------|--------|
|
||||
| Password | `logins.json` | JSON |
|
||||
| Cookie | `cookies.sqlite` | SQLite |
|
||||
| History | `places.sqlite` | SQLite |
|
||||
| Download | `places.sqlite` | SQLite |
|
||||
| Bookmark | `places.sqlite` | SQLite |
|
||||
| Extension | `extensions.json` | JSON |
|
||||
| LocalStorage | `webappsstore.sqlite` | SQLite |
|
||||
|
||||
History, Download, and Bookmark all share `places.sqlite` but query different tables within it. CreditCard and SessionStorage extraction is not supported for Firefox.
|
||||
|
||||
The master encryption key is stored separately in `key4.db` (see [RFC-005](005-firefox-encryption.md)).
|
||||
|
||||
## 3. Data Storage Formats
|
||||
|
||||
### 3.1 Passwords (logins.json)
|
||||
|
||||
Passwords are stored as a JSON file with a top-level `logins` array. Each entry contains:
|
||||
|
||||
- `formSubmitURL` / `hostname` — the login URL (formSubmitURL preferred, hostname as fallback)
|
||||
- `encryptedUsername` — base64-encoded, ASN1 PBE-encrypted username
|
||||
- `encryptedPassword` — base64-encoded, ASN1 PBE-encrypted password
|
||||
- `timeCreated` — creation timestamp in **milliseconds**
|
||||
|
||||
Decryption pipeline: base64 decode the field, parse as ASN1 PBE structure, decrypt with the master key.
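
The first two steps of that pipeline (reading `logins.json` and base64-decoding the encrypted fields) might look like the sketch below. The JSON keys come from the list above; the Go type and function names (`loginsFile`, `rawLogin`, `parseLogins`) are hypothetical, and the ASN1 parsing and decryption steps are deliberately left out.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"time"
)

// loginsFile mirrors only the logins.json keys listed above.
type loginsFile struct {
	Logins []struct {
		FormSubmitURL     string `json:"formSubmitURL"`
		Hostname          string `json:"hostname"`
		EncryptedUsername string `json:"encryptedUsername"`
		EncryptedPassword string `json:"encryptedPassword"`
		TimeCreated       int64  `json:"timeCreated"`
	} `json:"logins"`
}

// rawLogin holds the still-encrypted ASN1 bytes of one login (hypothetical type).
type rawLogin struct {
	URL       string
	Username  []byte
	Password  []byte
	CreatedAt time.Time
}

func parseLogins(data []byte) ([]rawLogin, error) {
	var f loginsFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, fmt.Errorf("parse logins.json: %w", err)
	}
	out := make([]rawLogin, 0, len(f.Logins))
	for _, l := range f.Logins {
		user, err := base64.StdEncoding.DecodeString(l.EncryptedUsername)
		if err != nil {
			return nil, fmt.Errorf("decode username: %w", err)
		}
		pass, err := base64.StdEncoding.DecodeString(l.EncryptedPassword)
		if err != nil {
			return nil, fmt.Errorf("decode password: %w", err)
		}
		url := l.FormSubmitURL // preferred
		if url == "" {
			url = l.Hostname // fallback
		}
		out = append(out, rawLogin{
			URL:       url,
			Username:  user,
			Password:  pass,
			CreatedAt: time.UnixMilli(l.TimeCreated), // milliseconds
		})
	}
	return out, nil
}

func main() {
	sample := []byte(`{"logins":[{"hostname":"https://example.com",
		"encryptedUsername":"AQID","encryptedPassword":"BAUG",
		"timeCreated":1700000000000}]}`)
	logins, err := parseLogins(sample)
	fmt.Println(logins, err)
}
```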
|
||||
|
||||
### 3.2 Cookies (cookies.sqlite)
|
||||
|
||||
Cookies are **not encrypted** — values are stored in plaintext.
|
||||
|
||||
```sql
|
||||
SELECT name, value, host, path, creationTime, expiry, isSecure, isHttpOnly
|
||||
FROM moz_cookies
|
||||
```
|
||||
|
||||
The database must be opened with `journal_mode=off` to avoid locking conflicts with a running Firefox instance.
|
||||
|
||||
### 3.3 History (places.sqlite)
|
||||
|
||||
```sql
|
||||
SELECT url, COALESCE(last_visit_date, 0), COALESCE(title, ''), visit_count
|
||||
FROM moz_places
|
||||
```
|
||||
|
||||
The `last_visit_date` column uses **microseconds** since epoch.
|
||||
|
||||
### 3.4 Downloads (places.sqlite)
|
||||
|
||||
Downloads use the `moz_annos` annotation table joined with `moz_places`:
|
||||
|
||||
```sql
|
||||
SELECT place_id, GROUP_CONCAT(content), url, dateAdded
|
||||
FROM (SELECT * FROM moz_annos INNER JOIN moz_places ON moz_annos.place_id = moz_places.id)
|
||||
t GROUP BY place_id
|
||||
```
|
||||
|
||||
Download metadata is stored as a concatenated string: `target_path,{json}` where the JSON portion contains `fileSize` and `endTime`.
|
||||
|
||||
### 3.5 Bookmarks (places.sqlite)
|
||||
|
||||
```sql
|
||||
SELECT id, url, type, dateAdded, COALESCE(title, '')
|
||||
FROM (SELECT * FROM moz_bookmarks INNER JOIN moz_places ON moz_bookmarks.fk = moz_places.id)
|
||||
```
|
||||
|
||||
The `type` field distinguishes URL bookmarks (1) from folders (2) and separators (3).
|
||||
|
||||
### 3.6 Extensions (extensions.json)
|
||||
|
||||
Extensions are read from the `addons` array. Only entries with `location == "app-profile"` are included (user-installed extensions). Fields extracted: `defaultLocale.name`, `id`, `version`, `defaultLocale.description`, `defaultLocale.homepageURL`, `active`.
|
||||
|
||||
### 3.7 LocalStorage (webappsstore.sqlite)
|
||||
|
||||
```sql
|
||||
SELECT originKey, key, value FROM webappsstore2
|
||||
```
|
||||
|
||||
The `originKey` column uses a **reversed-host format**: `moc.buhtig.:https:443` represents `https://github.com:443`. The host portion is byte-reversed and dot-suffixed; the remaining fields are scheme and port.
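
As an illustration of the reversed-host format, the following sketch turns an `originKey` back into a URL-like origin. `parseOriginKey` and `reverse` are hypothetical helper names, and the sketch assumes the key always has exactly the three colon-separated parts shown in the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// reverse returns s with its bytes in reverse order.
func reverse(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// parseOriginKey converts "moc.buhtig.:https:443" style keys to "scheme://host:port".
func parseOriginKey(key string) (string, error) {
	parts := strings.SplitN(key, ":", 3)
	if len(parts) != 3 {
		return "", fmt.Errorf("unexpected originKey %q", key)
	}
	// Reversing "moc.buhtig." yields ".github.com"; drop the leading dot.
	host := strings.TrimPrefix(reverse(parts[0]), ".")
	return parts[1] + "://" + host + ":" + parts[2], nil
}

func main() {
	origin, err := parseOriginKey("moc.buhtig.:https:443")
	fmt.Println(origin, err)
}
```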
|
||||
|
||||
## 4. Time Formats
|
||||
|
||||
Firefox uses inconsistent timestamp units across data types. All are Unix epoch-based.
|
||||
|
||||
| Data Type | Unit | Conversion |
|
||||
|-----------|------|------------|
|
||||
| Cookies (`creationTime`) | Microseconds | / 1,000,000 |
|
||||
| Cookies (`expiry`) | Seconds | direct |
|
||||
| History (`last_visit_date`) | Microseconds | / 1,000,000 |
|
||||
| Downloads (`dateAdded`) | Microseconds | / 1,000,000 |
|
||||
| Bookmarks (`dateAdded`) | Microseconds | / 1,000,000 |
|
||||
| Passwords (`timeCreated`) | Milliseconds | / 1,000 |
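
A small helper makes the unit differences in the table explicit. This is a sketch; the `unit` type and `toTime` function are hypothetical and not part of the project's API.

```go
package main

import (
	"fmt"
	"time"
)

// unit names the timestamp resolution of a Firefox column.
type unit int

const (
	seconds      unit = iota // cookies: expiry
	milliseconds             // passwords: timeCreated
	microseconds             // cookies: creationTime, places.sqlite dates
)

// toTime converts a Unix-epoch value in the given unit to time.Time.
func toTime(v int64, u unit) time.Time {
	switch u {
	case milliseconds:
		return time.UnixMilli(v)
	case microseconds:
		return time.UnixMicro(v)
	default:
		return time.Unix(v, 0)
	}
}

func main() {
	fmt.Println(toTime(1700000000, seconds).UTC())
	fmt.Println(toTime(1700000000000, milliseconds).UTC())
	fmt.Println(toTime(1700000000000000, microseconds).UTC())
}
```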
|
||||
|
||||
## 5. Key Differences from Chromium
|
||||
|
||||
| Aspect | Chromium | Firefox |
|
||||
|--------|----------|---------|
|
||||
| Profile naming | Named directories (`Default`, `Profile 1`) | Random-prefix (`97nszz88.default-release`) |
|
||||
| Profile detection | `Preferences` sentinel file | Any known source file present |
|
||||
| Password storage | SQLite (`Login Data`) | JSON (`logins.json`) |
|
||||
| Cookie encryption | Encrypted with master key | **Plaintext** |
|
||||
| Shared database | Separate files per category | `places.sqlite` shared by History/Download/Bookmark |
|
||||
| LocalStorage | LevelDB | SQLite (`webappsstore.sqlite`) |
|
||||
| CreditCard support | Yes | No |
|
||||
| SessionStorage support | Yes | No |
|
||||
| Encryption scope | Passwords, cookies, credit cards | **Passwords only** (see [RFC-005](005-firefox-encryption.md)) |
|
||||
|
||||
## Related RFCs
|
||||
|
||||
| RFC | Topic |
|
||||
|-----|-------|
|
||||
| [RFC-005](005-firefox-encryption.md) | Firefox NSS encryption and master key derivation |
|
||||
| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition and platform quirks |
|
||||
@@ -0,0 +1,142 @@
|
||||
# RFC-005: Firefox Encryption
|
||||
|
||||
**Author**: moonD4rk
|
||||
**Status**: Living Document
|
||||
**Created**: 2026-04-05
|
||||
|
||||
## 1. Overview
|
||||
|
||||
Firefox uses Mozilla's NSS (Network Security Services) for credential encryption. Unlike Chromium, which delegates key storage to the OS (DPAPI, Keychain, D-Bus), Firefox manages its own encryption entirely within the profile directory via `key4.db`. This makes Firefox encryption **platform-agnostic** — the same derivation logic works on Windows, macOS, and Linux.
|
||||
|
||||
Only passwords are encrypted. Cookies, history, bookmarks, downloads, extensions, and localStorage are all stored in plaintext. See [RFC-004](004-firefox-data-storage.md) for storage details.
|
||||
|
||||
## 2. Master Key Derivation (key4.db)
|
||||
|
||||
### 2.1 Database Structure
|
||||
|
||||
`key4.db` is a SQLite database containing two relevant tables:
|
||||
|
||||
| Table | Purpose | Key Columns |
|
||||
|-------|---------|-------------|
|
||||
| `metaData` | Stores the global salt and an encrypted integrity marker | `item1` (global salt), `item2` (encrypted "password-check" string) |
|
||||
| `nssPrivate` | Stores encrypted master key candidates | `a11` (PBE-encrypted key blob), `a102` (key type tag) |
|
||||
|
||||
The `nssPrivate` table may contain multiple rows (certificates, other NSS objects). Only rows where `a102` matches a specific 16-byte type tag (`{0xF8, 0x00, ...0x01}`) are actual master key entries.
|
||||
|
||||
### 2.2 Derivation Flow
|
||||
|
||||
1. **Read metaData** — extract the global salt and encrypted password-check marker from the row where `id = 'password'`.
|
||||
2. **Verify integrity** — decrypt the password-check marker using the global salt via ASN1 PBE. The plaintext must contain the string `"password-check"`. This confirms the database is valid and the empty-password assumption holds (Firefox uses an empty master password by default).
|
||||
3. **Decrypt key candidates** — for each `nssPrivate` row matching the type tag, decrypt the `a11` blob using the global salt via ASN1 PBE. The result must be at least 24 bytes.
|
||||
4. **Validate against logins** — if `logins.json` is available, each candidate key is tested by attempting to decrypt an actual login entry (both username and password). The first key that succeeds is selected. This prevents selecting the wrong candidate when multiple keys exist.
|
||||
|
||||
## 3. ASN1 PBE Types
|
||||
|
||||
Firefox wraps all encrypted data in ASN1 structures. Three PBE (Password-Based Encryption) types are used, each with a distinct ASN1 layout:
|
||||
|
||||
| PBE Type | Used For | Cipher | Key Derivation |
|
||||
|----------|----------|--------|----------------|
|
||||
| `privateKeyPBE` | Master key entries in `nssPrivate` | 3DES-CBC | SHA1 + HMAC-SHA1 custom NSS derivation |
|
||||
| `passwordCheckPBE` | Integrity marker in `metaData` | AES-256-CBC | PBKDF2-SHA256 |
|
||||
| `credentialPBE` | Encrypted fields in `logins.json` | 3DES-CBC or AES-256-CBC | Master key used directly (no derivation) |
|
||||
|
||||
The `key` parameter has different semantics depending on the PBE type:
|
||||
|
||||
- **privateKeyPBE / passwordCheckPBE**: the key parameter is the **global salt**, used as input to key derivation.
|
||||
- **credentialPBE**: the key parameter is the **already-derived master key**, used directly for decryption.
|
||||
|
||||
`NewASN1PBE()` auto-detects the type by attempting to unmarshal the raw bytes against each ASN1 structure in order.
|
||||
|
||||
### 3.1 privateKeyPBE Key Derivation
|
||||
|
||||
The NSS PBE-SHA1-3DES derivation produces a 40-byte derived key from the global salt and an entry-specific salt:
|
||||
|
||||
```
|
||||
hp = SHA1(globalSalt)
|
||||
ck = SHA1(hp || entrySalt)
|
||||
k1 = HMAC-SHA1(ck, pad(entrySalt,20) || entrySalt)
|
||||
k2 = HMAC-SHA1(ck, HMAC-SHA1(ck, pad(entrySalt,20)) || entrySalt)
|
||||
dk = k1 || k2 // 40 bytes
|
||||
key = dk[:24], iv = dk[32:40] // 3DES key + IV
|
||||
```
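
The pseudocode above translates fairly directly into Go with the standard library. The sketch below assumes an empty master password and that `pad(entrySalt,20)` means zero-padding on the right to 20 bytes; `deriveKeyIV` and the helpers are hypothetical names, and the output has not been checked against a real `key4.db`.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"fmt"
)

func hmacSHA1(key, msg []byte) []byte {
	m := hmac.New(sha1.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// concat returns a fresh slice holding a followed by b.
func concat(a, b []byte) []byte {
	return append(append([]byte{}, a...), b...)
}

// padRight zero-pads b on the right to n bytes (assumed meaning of pad()).
func padRight(b []byte, n int) []byte {
	if len(b) >= n {
		return b
	}
	out := make([]byte, n)
	copy(out, b)
	return out
}

// deriveKeyIV follows the hp / ck / k1 / k2 steps and returns the
// 24-byte 3DES key and the 8-byte IV.
func deriveKeyIV(globalSalt, entrySalt []byte) (key, iv []byte) {
	hp := sha1.Sum(globalSalt)
	ckSum := sha1.Sum(concat(hp[:], entrySalt))
	ck := ckSum[:]

	pes := padRight(entrySalt, 20)
	k1 := hmacSHA1(ck, concat(pes, entrySalt))
	k2 := hmacSHA1(ck, concat(hmacSHA1(ck, pes), entrySalt))

	dk := concat(k1, k2) // 40 bytes
	return dk[:24], dk[32:40]
}

func main() {
	key, iv := deriveKeyIV([]byte("global-salt"), []byte("entry-salt"))
	fmt.Printf("key=%d bytes, iv=%d bytes\n", len(key), len(iv))
}
```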
|
||||
|
||||
### 3.2 passwordCheckPBE Key Derivation
|
||||
|
||||
Uses standard PBKDF2 with SHA-256 and parameters embedded in the ASN1 structure (entry salt, iteration count, key size). The IV is reconstructed by prepending the ASN.1 OCTET STRING header (`0x04 0x0E`) to the 14-byte IV value from the parsed structure, yielding a 16-byte AES IV.
|
||||
|
||||
## 4. Password Decryption
|
||||
|
||||
### 4.1 3DES-CBC (Firefox < 144)
|
||||
|
||||
Legacy Firefox versions encrypt login credentials with 3DES-CBC. The `credentialPBE` ASN1 structure wraps the ciphertext with its own IV:
|
||||
|
||||
```
|
||||
| ASN1 OID + params | IV | 3DES-CBC ciphertext (PKCS5 padded) |
|
||||
|--------------------|-------|------------------------------------|
|
||||
| variable | 8B | remaining bytes |
|
||||
```
|
||||
|
||||
Decryption details:
|
||||
- **Key**: the first 24 bytes of the master key (derived from `key4.db`, see Section 2)
|
||||
- **IV**: 8-byte IV embedded in the ASN1 structure
|
||||
- **Algorithm**: Triple DES in CBC mode with PKCS5 padding
|
||||
- **Padding removal**: after decryption, PKCS5 padding bytes are stripped. The last byte of plaintext indicates how many padding bytes to remove (1-8).
|
||||
|
||||
3DES uses three independent 8-byte DES keys (k1, k2, k3) packed into the 24-byte key:
|
||||
|
||||
```
|
||||
| k1 (DES key 1) | k2 (DES key 2) | k3 (DES key 3) |
|
||||
|-----------------|-----------------|-----------------|
|
||||
| 8B | 8B | 8B |
|
||||
```
|
||||
|
||||
Encryption: `E(k1) → D(k2) → E(k3)`. Decryption: `D(k3) → E(k2) → D(k1)`.
|
||||
|
||||
### 4.2 AES-256-CBC (Firefox 144+)
|
||||
|
||||
Starting with [Firefox 144](https://www.firefox.com/en-US/firefox/144.0/releasenotes/) (October 2025), Mozilla migrated password encryption from 3DES-CBC to AES-256-CBC. The ASN1 structure has the same layout but a larger IV:
|
||||
|
||||
```
|
||||
| ASN1 OID + params | IV | AES-256-CBC ciphertext (PKCS5 padded) |
|
||||
|--------------------|-------|---------------------------------------|
|
||||
| variable | 16B | remaining bytes |
|
||||
```
|
||||
|
||||
Decryption details:
|
||||
- **Key**: the full master key (32 bytes for AES-256)
|
||||
- **IV**: 16-byte IV embedded in the ASN1 structure
|
||||
- **Algorithm**: AES-256 in CBC mode with PKCS5 padding
|
||||
- **Cipher selection**: the cipher is inferred from the **IV length** rather than checking OIDs — 8-byte IV means 3DES, 16-byte IV means AES-256-CBC. This allows the same code path to handle both old and new Firefox profiles.
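
A minimal sketch of IV-length-based cipher selection, using only the Go standard library, is shown below. It assumes the IV and ciphertext have already been extracted from the ASN1 structure; `newBlock`, `decryptCredential`, and `encryptCredential` are hypothetical names, and the encrypt helper exists only so the example can round-trip.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/des"
	"errors"
	"fmt"
)

// newBlock picks the cipher from the IV length: 8 bytes means 3DES, 16 means AES-256.
func newBlock(masterKey, iv []byte) (cipher.Block, error) {
	switch len(iv) {
	case des.BlockSize:
		if len(masterKey) < 24 {
			return nil, errors.New("3DES needs at least 24 key bytes")
		}
		return des.NewTripleDESCipher(masterKey[:24])
	case aes.BlockSize:
		if len(masterKey) != 32 {
			return nil, errors.New("AES-256 needs a 32-byte key")
		}
		return aes.NewCipher(masterKey)
	default:
		return nil, fmt.Errorf("unexpected IV length %d", len(iv))
	}
}

// decryptCredential runs CBC decryption and strips PKCS5 padding.
func decryptCredential(masterKey, iv, ciphertext []byte) ([]byte, error) {
	block, err := newBlock(masterKey, iv)
	if err != nil {
		return nil, err
	}
	bs := block.BlockSize()
	if len(ciphertext) == 0 || len(ciphertext)%bs != 0 {
		return nil, errors.New("ciphertext is not a multiple of the block size")
	}
	plain := make([]byte, len(ciphertext))
	cipher.NewCBCDecrypter(block, iv).CryptBlocks(plain, ciphertext)
	n := int(plain[len(plain)-1]) // last byte is the padding length
	if n == 0 || n > bs || !bytes.Equal(plain[len(plain)-n:], bytes.Repeat([]byte{byte(n)}, n)) {
		return nil, errors.New("invalid PKCS5 padding")
	}
	return plain[:len(plain)-n], nil
}

// encryptCredential is the inverse operation, included for the demo only.
func encryptCredential(masterKey, iv, plaintext []byte) ([]byte, error) {
	block, err := newBlock(masterKey, iv)
	if err != nil {
		return nil, err
	}
	pad := block.BlockSize() - len(plaintext)%block.BlockSize()
	padded := append(append([]byte{}, plaintext...), bytes.Repeat([]byte{byte(pad)}, pad)...)
	out := make([]byte, len(padded))
	cipher.NewCBCEncrypter(block, iv).CryptBlocks(out, padded)
	return out, nil
}

func main() {
	key := bytes.Repeat([]byte{0x11, 0x22, 0x33, 0x44}, 8) // 32 demo bytes
	for _, ivLen := range []int{8, 16} {
		iv := make([]byte, ivLen)
		ct, _ := encryptCredential(key, iv, []byte("hunter2"))
		pt, err := decryptCredential(key, iv, ct)
		fmt.Println(ivLen, string(pt), err)
	}
}
```

Rejecting any AES key that is not exactly 32 bytes is a choice made for this sketch, so that a 24-byte key paired with a 16-byte IV does not silently select AES-192.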
|
||||
|
||||
### 4.3 Pipeline
|
||||
|
||||
Each encrypted login field (`encryptedUsername`, `encryptedPassword` in `logins.json`) follows the same decryption pipeline:
|
||||
|
||||
```
|
||||
logins.json
|
||||
→ encryptedUsername / encryptedPassword (base64 string)
|
||||
|
||||
| base64 encoded string |
|
||||
|----------------------------------------------------------|
|
||||
↓ base64 decode
|
||||
| raw ASN1 DER bytes |
|
||||
|----------------------------------------------------------|
|
||||
↓ ASN1 parse (auto-detect credentialPBE)
|
||||
| IV (8B or 16B) | ciphertext |
|
||||
|----------------------------------------------------------|
|
||||
↓ decrypt (3DES or AES-256 based on IV length)
|
||||
| plaintext + PKCS5 padding |
|
||||
|----------------------------------------------------------|
|
||||
↓ strip PKCS5 padding
|
||||
| plaintext (UTF-8 string) |
|
||||
|----------------------------------------------------------|
|
||||
```
|
||||
|
||||
The master key is passed through unchanged — `credentialPBE` uses the key directly without further derivation (unlike `privateKeyPBE` and `passwordCheckPBE` which derive from the global salt).
|
||||
|
||||
## Related RFCs
|
||||
|
||||
| RFC | Topic |
|
||||
|-----|-------|
|
||||
| [RFC-004](004-firefox-data-storage.md) | Firefox data file locations and storage formats |
|
||||
| [RFC-006](006-key-retrieval-mechanisms.md) | Platform-specific master key retrieval (Chromium only — Firefox is self-contained) |
|
||||
@@ -0,0 +1,180 @@
|
||||
# RFC-006: Key Retrieval Mechanisms
|
||||
|
||||
**Author**: moonD4rk
|
||||
**Status**: Living Document
|
||||
**Created**: 2026-04-05
|
||||
|
||||
## 1. Overview
|
||||
|
||||
Chromium-based browsers encrypt sensitive data (passwords, cookies, credit cards) using a **master key**. The master key is stored differently on each platform:
|
||||
|
||||
| Platform | Storage | Key Type |
|
||||
|----------|---------|----------|
|
||||
| macOS | macOS Keychain | Password string → PBKDF2 → AES-128 |
|
||||
| Windows | `Local State` JSON (DPAPI-encrypted) | Raw AES-256 key |
|
||||
| Linux | GNOME Keyring / KDE Wallet via D-Bus | Password string → PBKDF2 → AES-128 |
|
||||
|
||||
Each platform may have multiple retrieval strategies. The `KeyRetriever` interface and `ChainRetriever` pattern abstract over these strategies, trying each in priority order until one succeeds.
|
||||
|
||||
For Chromium encryption details (cipher versions, AES-CBC/GCM), see [RFC-003](003-chromium-encryption.md). Firefox manages its own keys via `key4.db` — see [RFC-005](005-firefox-encryption.md).
|
||||
|
||||
## 2. KeyRetriever Interface
|
||||
|
||||
The interface takes two parameters:
|
||||
|
||||
- **`storage`** — keychain/keyring label identifying the browser's secret (e.g. `"Chrome"` on macOS, `"Chrome Safe Storage"` on Linux). Unused on Windows.
|
||||
- **`localStatePath`** — path to `Local State` JSON file. Only used on Windows.
|
||||
|
||||
The return value is the **ready-to-use decryption key** — either the raw AES key (Windows) or the PBKDF2-derived key (macOS/Linux).
|
||||
|
||||
`ChainRetriever` wraps multiple retrievers and tries them in order. The first successful result wins. If all fail, errors from every retriever are combined into a single error.
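
The chain pattern can be sketched as follows. The method name `RetrieveKey`, the field name `retrievers`, and the helper constructors are assumptions made for this example; only the two parameters and the try-in-order, combine-errors behavior come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyRetriever returns a ready-to-use key (method name is an assumption).
type KeyRetriever interface {
	RetrieveKey(storage, localStatePath string) ([]byte, error)
}

// ChainRetriever tries each retriever in priority order.
type ChainRetriever struct {
	retrievers []KeyRetriever
}

func (c ChainRetriever) RetrieveKey(storage, localStatePath string) ([]byte, error) {
	if len(c.retrievers) == 0 {
		return nil, errors.New("no key retrievers configured")
	}
	var errs []error
	for _, r := range c.retrievers {
		key, err := r.RetrieveKey(storage, localStatePath)
		if err == nil && len(key) > 0 {
			return key, nil // first success wins
		}
		if err == nil {
			err = errors.New("empty key")
		}
		errs = append(errs, fmt.Errorf("%T: %w", r, err))
	}
	return nil, errors.Join(errs...) // errors.Join requires Go 1.20
}

// retrieverFunc adapts a plain function to KeyRetriever (demo helper).
type retrieverFunc func(storage, localStatePath string) ([]byte, error)

func (f retrieverFunc) RetrieveKey(s, p string) ([]byte, error) { return f(s, p) }

// failWith and fixedKey build stand-in retrievers for the demo.
func failWith(msg string) KeyRetriever {
	return retrieverFunc(func(string, string) ([]byte, error) { return nil, errors.New(msg) })
}

func fixedKey(key []byte) KeyRetriever {
	return retrieverFunc(func(string, string) ([]byte, error) { return key, nil })
}

func main() {
	chain := ChainRetriever{retrievers: []KeyRetriever{
		failWith("keyring unavailable"),
		fixedKey([]byte("0123456789abcdef")),
	}}
	key, err := chain.RetrieveKey("Chrome Safe Storage", "")
	fmt.Printf("%s %v\n", key, err)
}
```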
|
||||
|
||||
**Caching**: the retriever is created once per browser and shared across all profiles. macOS retrievers use `sync.Once` internally, so multi-profile browsers only trigger one keychain prompt or memory dump.
|
||||
|
||||
## 3. macOS Key Retrieval
|
||||
|
||||
Chromium on macOS stores the encryption password in the user's login keychain under a browser-specific account name (e.g. `"Chrome"`, `"Brave"`, `"Microsoft Edge"`).
|
||||
|
||||
### 3.1 Retrieval Strategies
|
||||
|
||||
**GcoredumpRetriever** — exploits **CVE-2025-24204** to extract keychain secrets from `securityd` process memory. Requires root. The exploit works because the `gcore` binary holds the `com.apple.system-task-ports.read` entitlement, bypassing TCC protections:
|
||||
|
||||
1. Find `securityd` PID via `sysctl`
|
||||
2. Dump process memory via `gcore`
|
||||
3. Parse heap regions via `vmmap`, scan `MALLOC_SMALL` regions for 24-byte key pattern
|
||||
4. Try each candidate against `login.keychain-db`
|
||||
|
||||
**KeychainPasswordRetriever** — unlocks `login.keychain-db` directly using the user's macOS login password (from `--keychain-pw` flag), powered by the [moond4rk/keychainbreaker](https://github.com/moond4rk/keychainbreaker) library which implements a full macOS Keychain file parser and decryptor in pure Go. Non-root, non-interactive.
|
||||
|
||||
**SecurityCmdRetriever** — invokes `security find-generic-password -wa <label>`. Triggers a macOS password dialog. Last resort.
|
||||
|
||||
### 3.2 Chain Order
|
||||
|
||||
| Priority | Strategy | Requires | Interactive? |
|
||||
|----------|----------|----------|:------------:|
|
||||
| 1 | Gcoredump (CVE-2025-24204) | Root | No |
|
||||
| 2 | Keychain password | `--keychain-pw` flag | No |
|
||||
| 3 | `security` CLI command | Nothing | Yes (dialog) |
|
||||
|
||||
### 3.3 PBKDF2 Derivation
|
||||
|
||||
All macOS strategies produce a raw password string from the keychain. This is derived into an AES-128 key via PBKDF2:
|
||||
|
||||
| Parameter | Value | Source |
|
||||
|-----------|-------|--------|
|
||||
| Salt | `"saltysalt"` | [os_crypt_mac.mm](https://source.chromium.org/chromium/chromium/src/+/master:components/os_crypt/os_crypt_mac.mm;l=157) |
|
||||
| Iterations | 1003 | |
|
||||
| Key length | 16 bytes (AES-128) | |
|
||||
| Hash | HMAC-SHA1 | |
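
For reference, the derivation with these parameters can be written with the standard library alone. This hand-rolled PBKDF2-HMAC-SHA1 is a sketch to keep the example self-contained (a real implementation would more likely call an existing PBKDF2 package); `pbkdf2SHA1` and `chromiumKey` are hypothetical names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/binary"
	"fmt"
)

// pbkdf2SHA1 implements PBKDF2 (RFC 8018) with HMAC-SHA1 as the PRF.
func pbkdf2SHA1(password, salt []byte, iterations, keyLen int) []byte {
	prf := hmac.New(sha1.New, password)
	hashLen := prf.Size()
	numBlocks := (keyLen + hashLen - 1) / hashLen
	dk := make([]byte, 0, numBlocks*hashLen)
	var idx [4]byte
	for block := 1; block <= numBlocks; block++ {
		// U1 = PRF(password, salt || INT(block))
		prf.Reset()
		prf.Write(salt)
		binary.BigEndian.PutUint32(idx[:], uint32(block))
		prf.Write(idx[:])
		u := prf.Sum(nil)
		t := append([]byte{}, u...)
		// Un = PRF(password, Un-1), XORed into the running block
		for i := 1; i < iterations; i++ {
			prf.Reset()
			prf.Write(u)
			u = prf.Sum(nil)
			for j := range t {
				t[j] ^= u[j]
			}
		}
		dk = append(dk, t...)
	}
	return dk[:keyLen]
}

// chromiumKey derives the 16-byte AES-128 key from a keychain password.
func chromiumKey(password string, iterations int) []byte {
	return pbkdf2SHA1([]byte(password), []byte("saltysalt"), iterations, 16)
}

func main() {
	// 1003 iterations per the table above; the password is a placeholder.
	fmt.Printf("%x\n", chromiumKey("example-keychain-secret", 1003))
}
```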
|
||||
|
||||
### 3.4 Storage Labels
|
||||
|
||||
| Browser | Keychain Account |
|
||||
|---------|-----------------|
|
||||
| Chrome / Chrome Beta | `"Chrome"` |
|
||||
| Edge | `"Microsoft Edge"` |
|
||||
| Chromium | `"Chromium"` |
|
||||
| Opera / OperaGX | `"Opera"` |
|
||||
| Vivaldi | `"Vivaldi"` |
|
||||
| Brave | `"Brave"` |
|
||||
| Yandex | `"Yandex"` |
|
||||
| Arc | `"Arc"` |
|
||||
| CocCoc | `"CocCoc"` |
|
||||
|
||||
## 4. Windows Key Retrieval
|
||||
|
||||
Chromium on Windows stores the master key in `Local State` JSON, encrypted with DPAPI.
|
||||
|
||||
### 4.1 DPAPI Background
|
||||
|
||||
Windows Data Protection API (DPAPI) is a built-in symmetric encryption service provided by `Crypt32.dll`. It uses the logged-in user's Windows credentials (derived from the user's login password) as the root key material. Applications call `CryptProtectData` to encrypt and `CryptUnprotectData` to decrypt, without needing to manage keys themselves.
|
||||
|
||||
Key characteristics:
|
||||
- **User-scoped** — data encrypted by one Windows user cannot be decrypted by another user, even on the same machine
|
||||
- **Machine-bound** — the encrypted blob cannot be decrypted on a different machine (unless roaming credentials are used)
|
||||
- **No password prompt** — decryption is transparent to the calling process as long as it runs under the correct user session
|
||||
|
||||
### 4.2 Retrieval Flow
|
||||
|
||||
```
|
||||
Local State → os_crypt.encrypted_key (base64 string)
|
||||
|
||||
| "DPAPI" prefix | DPAPI-encrypted AES key |
|
||||
|----------------|--------------------------|
|
||||
| 5B (ASCII) | remaining bytes |
|
||||
|
||||
→ strip prefix
|
||||
→ CryptUnprotectData (Crypt32.dll)
|
||||
→ 32-byte AES-256 master key
|
||||
```
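
The platform-independent part of this flow (reading `os_crypt.encrypted_key`, base64-decoding it, and stripping the `DPAPI` prefix) might be sketched as below. The function name `dpapiBlobFromLocalState` is hypothetical, and the final `CryptUnprotectData` call is omitted because it only exists on Windows.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// dpapiBlobFromLocalState returns the DPAPI-encrypted key bytes,
// ready to be passed to CryptUnprotectData on Windows.
func dpapiBlobFromLocalState(localState []byte) ([]byte, error) {
	var state struct {
		OSCrypt struct {
			EncryptedKey string `json:"encrypted_key"`
		} `json:"os_crypt"`
	}
	if err := json.Unmarshal(localState, &state); err != nil {
		return nil, fmt.Errorf("parse Local State: %w", err)
	}
	raw, err := base64.StdEncoding.DecodeString(state.OSCrypt.EncryptedKey)
	if err != nil {
		return nil, fmt.Errorf("decode encrypted_key: %w", err)
	}
	prefix := []byte("DPAPI") // 5-byte ASCII marker
	if !bytes.HasPrefix(raw, prefix) {
		return nil, errors.New("encrypted_key is missing the DPAPI prefix")
	}
	return raw[len(prefix):], nil
}

func main() {
	// "RFBBUElibG9i" is base64 for "DPAPI" followed by the placeholder bytes "blob".
	sample := []byte(`{"os_crypt":{"encrypted_key":"RFBBUElibG9i"}}`)
	blob, err := dpapiBlobFromLocalState(sample)
	fmt.Printf("%q %v\n", blob, err)
}
```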
|
||||
|
||||
The implementation loads `Crypt32.dll` at runtime via `syscall.NewLazyDLL` and calls `CryptUnprotectData` with a `DATA_BLOB` structure pointing to the ciphertext. Windows internally derives the decryption key from the user's credentials and returns the plaintext master key.
|
||||
|
||||
### 4.3 No PBKDF2 Needed
|
||||
|
||||
Unlike macOS/Linux, DPAPI gives the **final AES-256 key directly**. No intermediate password, no derivation step. The key is used as-is for AES-256-GCM decryption (see [RFC-003](003-chromium-encryption.md)).
|
||||
|
||||
### 4.4 Single Retriever
|
||||
|
||||
Windows uses only `DPAPIRetriever`, so no chain is needed. The `storage` argument and any `--keychain-pw` value are ignored; only `localStatePath` is used.
|
||||
|
||||
## 5. Linux Key Retrieval
|
||||
|
||||
### 5.1 Retrieval Strategies
|
||||
|
||||
**DBusRetriever** — queries the D-Bus Secret Service API (provided by `gnome-keyring-daemon` or `kwalletd`). Iterates all collections and items, looking for a label matching the browser's storage name.
|
||||
|
||||
**FallbackRetriever** — when D-Bus is unavailable (headless servers, Docker, CI), uses the hardcoded password `"peanuts"`. This matches Chromium's own fallback behavior.
|
||||
|
||||
### 5.2 Chain Order
|
||||
|
||||
| Priority | Strategy | Requires | Interactive? |
|
||||
|----------|----------|----------|:------------:|
|
||||
| 1 | D-Bus Secret Service | D-Bus session + keyring | No |
|
||||
| 2 | Fallback (`"peanuts"`) | Nothing | No |
|
||||
|
||||
### 5.3 PBKDF2 Derivation
|
||||
|
||||
Both strategies produce a password, derived via PBKDF2 with notably weaker parameters than macOS:
|
||||
|
||||
| Parameter | Value | Source |
|
||||
|-----------|-------|--------|
|
||||
| Salt | `"saltysalt"` | [os_crypt_linux.cc](https://source.chromium.org/chromium/chromium/src/+/main:components/os_crypt/os_crypt_linux.cc;l=100) |
|
||||
| Iterations | **1** | |
|
||||
| Key length | 16 bytes (AES-128) | |
|
||||
| Hash | HMAC-SHA1 | |
|
||||
|
||||
With a single iteration, PBKDF2 reduces to one HMAC-SHA1 computation and provides no key stretching. When no keyring is available and Chromium falls back to the well-known password `"peanuts"`, the resulting key is effectively public and the encrypted data is trivial to decrypt.
|
||||
|
||||
### 5.4 Storage Labels
|
||||
|
||||
| Browser | D-Bus Label |
|
||||
|---------|-------------|
|
||||
| Chrome / Chrome Beta / Vivaldi | `"Chrome Safe Storage"` |
|
||||
| Chromium / Edge / Opera | `"Chromium Safe Storage"` |
|
||||
| Brave | `"Brave Safe Storage"` |
|
||||
|
||||
## 6. Platform Summary
|
||||
|
||||
| Platform | Chain | PBKDF2 | Key Size |
|
||||
|----------|-------|:------:|----------|
|
||||
| macOS | Gcoredump → KeychainPassword* → SecurityCmd | 1003 iterations | AES-128 |
|
||||
| Windows | DPAPI only | No | AES-256 |
|
||||
| Linux | DBus → Fallback | 1 iteration | AES-128 |
|
||||
|
||||
\* Only included when `--keychain-pw` is provided.
|
||||
|
||||
## References
|
||||
|
||||
- **macOS**: [os_crypt_mac.mm](https://source.chromium.org/chromium/chromium/src/+/master:components/os_crypt/os_crypt_mac.mm;l=157)
|
||||
- **Windows**: [os_crypt_win.cc](https://source.chromium.org/chromium/chromium/src/+/main:components/os_crypt/os_crypt_win.cc)
|
||||
- **Linux**: [os_crypt_linux.cc](https://source.chromium.org/chromium/chromium/src/+/main:components/os_crypt/os_crypt_linux.cc;l=100)
|
||||
- **CVE-2025-24204**: [Exploit PoC](https://github.com/FFRI/CVE-2025-24204/tree/main/decrypt-keychain), [Apple advisory](https://support.apple.com/en-us/122373)
|
||||
- **DPAPI**: [CryptUnprotectData](https://learn.microsoft.com/en-us/windows/win32/api/dpapi/nf-dpapi-cryptunprotectdata)
|
||||
|
||||
## Related RFCs
|
||||
|
||||
| RFC | Topic |
|
||||
|-----|-------|
|
||||
| [RFC-003](003-chromium-encryption.md) | Chromium encryption mechanisms per platform |
|
||||
| [RFC-005](005-firefox-encryption.md) | Firefox NSS encryption and key derivation |
|
||||
@@ -0,0 +1,140 @@
|
||||
# RFC-007: CLI & Output Design
|
||||
|
||||
**Author**: moonD4rk
|
||||
**Status**: Living Document
|
||||
**Created**: 2026-04-05
|
||||
|
||||
## 1. Command Structure
|
||||
|
||||
The CLI is built on [cobra](https://github.com/spf13/cobra) with three subcommands: `dump`, `list`, and `version`.
|
||||
|
||||
### 1.1 Root Command
|
||||
|
||||
The root command defines one persistent flag: `--verbose` / `-v` (enable debug logging).
|
||||
|
||||
**Default-to-dump**: when no subcommand is given, the root delegates to `dump`. All of `dump`'s flags are copied onto the root command, so `hack-browser-data -b chrome` and `hack-browser-data dump -b chrome` are equivalent.
|
||||
|
||||
### 1.2 dump Command
|
||||
|
||||
The primary command. Extracts, decrypts, and writes browser data to files.
|
||||
|
||||
| Flag | Short | Default | Description |
|
||||
|------|-------|---------|-------------|
|
||||
| `--browser` | `-b` | `"all"` | Target browser |
|
||||
| `--category` | `-c` | `"all"` | Data categories (comma-separated) |
|
||||
| `--format` | `-f` | `"csv"` | Output format: csv, json, cookie-editor |
|
||||
| `--dir` | `-d` | `"results"` | Output directory |
|
||||
| `--profile-path` | `-p` | | Custom profile directory |
|
||||
| `--keychain-pw` | | | macOS keychain password |
|
||||
| `--zip` | | `false` | Compress output to zip |
|
||||
|
||||
**Workflow**: PickBrowsers (filter by `-b`) → parseCategories (split `-c` on commas) → NewWriter (select formatter by `-f`) → Extract loop (each browser) → Write → optional CompressDir.
|
||||
|
||||
The nine recognized categories are: `password`, `cookie`, `bookmark`, `history`, `download`, `creditcard`, `extension`, `localstorage`, `sessionstorage`. The string `"all"` maps to all nine.
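
The category parsing step can be illustrated as follows. This is a sketch: the `Category` string type, the whitespace trimming, case folding, and de-duplication are assumptions for the example and may differ from the actual `parseCategories` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Category is modeled as a string here (an assumption for this sketch).
type Category string

var allCategories = []Category{
	"password", "cookie", "bookmark", "history", "download",
	"creditcard", "extension", "localstorage", "sessionstorage",
}

// parseCategories splits the -c flag value on commas and validates each name.
func parseCategories(s string) ([]Category, error) {
	known := make(map[Category]bool, len(allCategories))
	for _, c := range allCategories {
		known[c] = true
	}
	seen := map[Category]bool{}
	var out []Category
	for _, part := range strings.Split(s, ",") {
		name := Category(strings.ToLower(strings.TrimSpace(part)))
		switch {
		case name == "":
			continue
		case name == "all":
			// "all" expands to every category.
			return append([]Category{}, allCategories...), nil
		case !known[name]:
			return nil, fmt.Errorf("unknown category %q", name)
		case !seen[name]:
			seen[name] = true
			out = append(out, name)
		}
	}
	if len(out) == 0 {
		return nil, fmt.Errorf("no categories in %q", s)
	}
	return out, nil
}

func main() {
	fmt.Println(parseCategories("password,cookie"))
	fmt.Println(parseCategories("all"))
}
```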
|
||||
|
||||
### 1.3 list Command
|
||||
|
||||
Lists all detected browsers and profiles via `text/tabwriter`.
|
||||
|
||||
**Basic mode** (default) — three columns: Browser, Profile, Path.
|
||||
|
||||
**Detail mode** (`--detail`) — adds a column for every category showing entry counts. This actually calls `Extract()` on each browser to count entries.
|
||||
|
||||
### 1.4 version Command
|
||||
|
||||
Prints version, commit hash (truncated to 8 chars), and build date. Values are injected at build time via `-ldflags`. When building without ldflags (development mode), falls back to `runtime/debug.ReadBuildInfo()` to extract `vcs.revision` and `vcs.time`.
|
||||
|
||||
## 2. Output Architecture
|
||||
|
||||
All output logic lives in the `output` package. Only one type is exported: `Writer`.
|
||||
|
||||
### 2.1 Writer
|
||||
|
||||
Three methods define the entire API:
|
||||
|
||||
- **`NewWriter(dir, format)`** — creates a writer with the specified formatter
|
||||
- **`Add(browser, profile, data)`** — accumulates one browser profile's extraction results
|
||||
- **`Write()`** — aggregates all results by category and writes each non-empty category to its own file
|
||||
|
||||
### 2.2 Row Type
|
||||
|
||||
An unexported `row` wraps any entry struct with browser and profile context. It provides CSV header/value generation via reflection on `csv` struct tags, and flat JSON output via `reflect.StructOf` dynamic struct building (browser + profile fields prepended to entry fields).
|
||||
|
||||
### 2.3 Formatter Interface
|
||||
|
||||
An unexported interface with two methods: `format(w, rows)` and `ext()` (file extension).
|
||||
|
||||
| Format | Extension | Description |
|
||||
|--------|-----------|-------------|
|
||||
| `csv` | `.csv` | Standard `encoding/csv`, reflection-based headers from `csv` struct tags |
|
||||
| `json` | `.json` | `json.Encoder` with indent, no HTML escape, flat objects |
|
||||
| `cookie-editor` | `.json` | CookieEditor-compatible format, non-cookie categories fall back to standard JSON |
|
||||
|
||||
## 3. Output Formats
|
||||
|
||||
### 3.1 CSV
|
||||
|
||||
Headers and values are extracted via reflection on the `csv` struct tag of each entry field. A UTF-8 BOM (`0xEF 0xBB 0xBF`) is prepended for Excel compatibility. Field types are converted to strings: `time.Time` → RFC3339, `bool` → `"true"`/`"false"`, integers → base-10 string.
|
||||
|
||||
### 3.2 JSON
|
||||
|
||||
Each row is serialized via `MarshalJSON()`, which uses `reflect.StructOf` to dynamically build a flat struct at runtime — browser and profile are top-level fields alongside the entry fields, avoiding nested JSON. The encoder uses two-space indent and disables HTML escaping to preserve URLs.
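
The `reflect.StructOf` flattening technique can be sketched like this. `flatten` and `historyEntry` are hypothetical names; the sketch assumes the entry is a struct value with exported fields only and no fields named `Browser` or `Profile`, since `reflect.StructOf` panics on unexported or duplicate field names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// historyEntry is a hypothetical entry type used only for this example.
type historyEntry struct {
	URL        string `json:"url"`
	VisitCount int    `json:"visit_count"`
}

// flatten builds a struct type at runtime with browser and profile fields
// followed by the entry's own fields, then marshals one value of it.
func flatten(browser, profile string, entry any) ([]byte, error) {
	ev := reflect.ValueOf(entry)
	et := ev.Type()

	fields := []reflect.StructField{
		{Name: "Browser", Type: reflect.TypeOf(""), Tag: `json:"browser"`},
		{Name: "Profile", Type: reflect.TypeOf(""), Tag: `json:"profile"`},
	}
	for i := 0; i < et.NumField(); i++ {
		f := et.Field(i)
		fields = append(fields, reflect.StructField{Name: f.Name, Type: f.Type, Tag: f.Tag})
	}

	flat := reflect.New(reflect.StructOf(fields)).Elem()
	flat.Field(0).SetString(browser)
	flat.Field(1).SetString(profile)
	for i := 0; i < et.NumField(); i++ {
		flat.Field(i + 2).Set(ev.Field(i)) // copy entry fields after the two prefixes
	}
	return json.Marshal(flat.Interface())
}

func main() {
	out, err := flatten("Chrome", "Default", historyEntry{URL: "https://example.com", VisitCount: 2})
	fmt.Println(string(out), err)
}
```

Because the dynamic struct carries the original `json` tags, `encoding/json` does the escaping and ordering, which is what avoids manual string concatenation.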
|
||||
|
||||
### 3.3 CookieEditor

Produces JSON compatible with the [CookieEditor](https://cookie-editor.cgagnier.ca/) browser extension. Cookie entries are converted to a specific field mapping:

| CookieEntry field | CookieEditor field | Notes |
|-------------------|--------------------|-------|
| Host | domain | |
| Path | path | |
| Name | name | |
| Value | value | |
| IsSecure | secure | |
| IsHTTPOnly | httpOnly | |
| ExpireAt | expirationDate | Unix timestamp as float64 |

Non-cookie categories fall back to the standard JSON formatter.

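
The field mapping in the table might look like this in code. This is only a sketch: the struct names and the Go field types of `cookieEntry` are assumptions, while the JSON keys come from the table.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// cookieEntry mirrors the left-hand column of the table; field types are assumed.
type cookieEntry struct {
	Host, Path, Name, Value string
	IsSecure, IsHTTPOnly    bool
	ExpireAt                time.Time
}

// cookieEditorItem uses the JSON keys from the right-hand column.
type cookieEditorItem struct {
	Domain         string  `json:"domain"`
	Path           string  `json:"path"`
	Name           string  `json:"name"`
	Value          string  `json:"value"`
	Secure         bool    `json:"secure"`
	HTTPOnly       bool    `json:"httpOnly"`
	ExpirationDate float64 `json:"expirationDate"`
}

// toCookieEditor applies the mapping; ExpireAt becomes Unix seconds as float64.
func toCookieEditor(c cookieEntry) cookieEditorItem {
	return cookieEditorItem{
		Domain:         c.Host,
		Path:           c.Path,
		Name:           c.Name,
		Value:          c.Value,
		Secure:         c.IsSecure,
		HTTPOnly:       c.IsHTTPOnly,
		ExpirationDate: float64(c.ExpireAt.Unix()),
	}
}

// sampleCookie returns a fixed cookie for demonstration.
func sampleCookie() cookieEntry {
	return cookieEntry{
		Host: ".example.com", Path: "/", Name: "sid", Value: "abc",
		IsSecure: true, IsHTTPOnly: true,
		ExpireAt: time.Unix(1700000000, 0),
	}
}

func main() {
	b, err := json.MarshalIndent([]cookieEditorItem{toCookieEditor(sampleCookie())}, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```
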
## 4. File Organization

Output follows a **one file per category** convention:

```
results/
├── password.csv
├── cookie.csv
├── history.csv
├── bookmark.csv
├── download.csv
├── creditcard.csv
├── extension.csv
├── localstorage.csv
└── sessionstorage.csv
```

Data from all browser profiles is aggregated into the same file. The `browser` and `profile` columns identify which browser and profile each row came from. Empty categories produce no file.

File permissions are restrictive: directories `0750`, files `0600` (data may contain passwords and cookies).

## 5. Data Flow

```
CLI: hack-browser-data dump -b chrome -c password,cookie -f csv -d results
  → PickBrowsers(name="chrome") → []Browser
  → parseCategories("password,cookie") → []Category
  → NewWriter("results", "csv") → *Writer
  → for each browser:
      Extract(categories) → *BrowserData
      Writer.Add(browser, profile, data)
  → Writer.Write()
      → aggregate by category → format rows → write files
  → (optional) CompressDir → results.zip
```

## Related RFCs

| RFC | Topic |
|-----|-------|
| [RFC-001](001-project-architecture.md) | Browser interface and Extract() orchestration |
| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File utilities (CompressDir) |

@@ -0,0 +1,94 @@
# RFC-008: File Acquisition & Platform Quirks

**Author**: moonD4rk
**Status**: Living Document
**Created**: 2026-04-05

## 1. Overview

Browsers keep their data files open and often locked while running. Chromium on Windows is particularly aggressive: it holds exclusive locks on databases like `Cookies` via `PRAGMA locking_mode=EXCLUSIVE`. Even on macOS and Linux, reading directly from a live database can produce corrupt or inconsistent results.

The solution is a **copy-then-read** strategy: copy all needed files to an isolated temporary directory, then extract data from the copies. The `filemanager` package manages this lifecycle, while platform-specific fallbacks handle locked files on Windows.

## 2. Session Management

A `Session` wraps a single temporary directory for one browser profile extraction run:

1. **Create**: `NewSession()` creates a unique temp directory via `os.MkdirTemp("", "hbd-*")`
2. **Acquire**: `Acquire(src, dst, isDir)` copies a browser file or directory into the session
3. **Cleanup**: removes the entire temp directory tree, always called with `defer`

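
The create/cleanup half of this lifecycle can be sketched as below. The lowercase `session` type and its methods are illustrative stand-ins for the real `Session`, whose exact fields and method set are not shown in this RFC.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// session is a hypothetical stand-in for the Session type described above.
type session struct {
	dir string
}

// newSession creates a unique temp directory using the "hbd-*" pattern.
func newSession() (*session, error) {
	dir, err := os.MkdirTemp("", "hbd-*")
	if err != nil {
		return nil, err
	}
	return &session{dir: dir}, nil
}

// path returns a destination path inside the session directory.
func (s *session) path(name string) string {
	return filepath.Join(s.dir, name)
}

// cleanup removes the whole temp directory tree.
func (s *session) cleanup() error {
	return os.RemoveAll(s.dir)
}

// exists reports whether a path is present on disk.
func exists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	s, err := newSession()
	if err != nil {
		panic(err)
	}
	defer s.cleanup() // always paired with creation

	fmt.Println("session dir exists:", exists(s.dir))
	fmt.Println("cookie copy would go to:", s.path("Cookies"))
}
```

Deferring `cleanup` right after creation means the temporary copies, which may contain credentials, are removed even when extraction fails partway through.
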
## 3. Acquire Flow

`Acquire` is the single entry point for copying browser files:

```
Acquire(src, dst, isDir)
  ├── isDir=true  → copyDir(src, dst, skip="lock")
  │
  └── isDir=false → copyFile(src, dst)
        ├── success → copy -wal and -shm companions if present
        └── failure + Windows → copyLocked(src, dst) fallback
```

### SQLite Companion Files

SQLite databases using WAL mode maintain `-wal` (write-ahead log) and `-shm` (shared memory) files. After a successful file copy, `Acquire` automatically copies these companions if they exist. Without the WAL file, recently written data (cookies set in the last few seconds) would be missing.

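
A rough sketch of the file branch of `Acquire`, without the Windows fallback. The helper names are hypothetical, and treating a missing companion as "nothing to copy" while propagating other errors is an assumption about the intended behavior.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// copyFile copies src to dst, creating dst with 0600 permissions.
func copyFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o600)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return err
	}
	return out.Close()
}

// copyWithCompanions copies the database and then its -wal and -shm
// companions when they exist. A missing companion is not an error.
func copyWithCompanions(src, dst string) error {
	if err := copyFile(src, dst); err != nil {
		return err
	}
	for _, suffix := range []string{"-wal", "-shm"} {
		err := copyFile(src+suffix, dst+suffix)
		if err != nil && !errors.Is(err, fs.ErrNotExist) {
			return err
		}
	}
	return nil
}

// demo creates a fake database with only a -wal companion, copies it,
// and returns the names found in the destination directory.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "hbd-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	src := filepath.Join(root, "Cookies")
	for _, p := range []string{src, src + "-wal"} {
		if err := os.WriteFile(p, []byte("data"), 0o600); err != nil {
			return nil, err
		}
	}
	dstDir := filepath.Join(root, "copy")
	if err := os.Mkdir(dstDir, 0o750); err != nil {
		return nil, err
	}
	if err := copyWithCompanions(src, filepath.Join(dstDir, "Cookies")); err != nil {
		return nil, err
	}
	entries, err := os.ReadDir(dstDir)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```
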
## 4. File Deduplication

Multiple categories can share the same source file:

| Engine | Categories | Shared Source |
|--------|-----------|---------------|
| Chromium | History + Download | `History` |
| Firefox | History + Download + Bookmark | `places.sqlite` |

Each category gets its own destination path in the temp directory, so the same source file may be copied multiple times. This is intentional: each extract function expects its own independent file path, and the copy cost is negligible for small SQLite files.

## 5. Windows Locked File Handling

Chromium on Windows holds exclusive locks on certain databases (notably `Cookies`), causing standard file reads to fail. Chrome introduced `PRAGMA locking_mode=EXCLUSIVE` for the cookies database starting from Chrome 114 (2023) via the "Lock profile cookie files on disk" feature, preventing external processes from reading cookie data while the browser is running. This is a Windows-specific problem: macOS and Linux use `fcntl`/`flock` advisory locks that do not prevent reading by other processes.

A dedicated technique using Windows kernel APIs (DuplicateHandle + memory-mapped I/O) is used to bypass these locks. See [RFC-009](009-windows-locked-file-bypass.md) for the full technical details.

## 6. LevelDB Directory Handling

Chromium stores localStorage and sessionStorage as LevelDB directories:

| Category | Path | Type |
|----------|------|------|
| LocalStorage | `Local Storage/leveldb/` | directory |
| SessionStorage | `Session Storage/` | directory |

When `isDir=true`, `Acquire` copies the entire directory while **skipping the `LOCK` file**. LevelDB uses this file for single-process access control; copying it could interfere with the running browser.

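
A directory copy that skips `LOCK` could be sketched like this. The `copyDir` signature is modeled on the `copyDir(src, dst, skip="lock")` call in the Acquire flow, but the case-insensitive name match and the walk-based implementation are assumptions.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// copyDir copies the tree under src to dst, skipping any file whose
// name matches skip case-insensitively (e.g. LevelDB's LOCK file).
func copyDir(src, dst, skip string) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		if d.IsDir() {
			return os.MkdirAll(target, 0o750)
		}
		if strings.EqualFold(d.Name(), skip) {
			return nil // leave the lock file behind
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		return os.WriteFile(target, data, 0o600)
	})
}

// demo builds a fake LevelDB directory, copies it, and returns the
// names that ended up in the destination.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "hbd-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	src := filepath.Join(root, "leveldb")
	if err := os.Mkdir(src, 0o750); err != nil {
		return nil, err
	}
	for _, name := range []string{"LOCK", "CURRENT", "000003.log"} {
		if err := os.WriteFile(filepath.Join(src, name), []byte("x"), 0o600); err != nil {
			return nil, err
		}
	}
	dst := filepath.Join(root, "copy")
	if err := copyDir(src, dst, "lock"); err != nil {
		return nil, err
	}
	entries, err := os.ReadDir(dst)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```
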
## 7. SQLite Query Helpers

### QuerySQLite

Encapsulates the common SQLite extraction pattern: validate file exists → open database → optional `PRAGMA journal_mode=off` → execute query → iterate rows with error-tolerant scan callback.

Row-level scan errors are logged and skipped (graceful degradation for corrupt records), while database-level errors abort the query.

### QueryRows[T]

A generic wrapper (Go 1.18+) that collects results into a typed slice, eliminating boilerplate. Each extract function only needs to provide the scan function.

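
The collect-and-skip behavior of the generic wrapper can be illustrated without a database driver. In this sketch `rowIter`, `collectRows`, and `fakeRows` are hypothetical names; `rowIter` lists the three `*sql.Rows` methods the loop needs, so a real `*sql.Rows` would satisfy it, and logging of skipped rows is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// rowIter is the subset of *sql.Rows used by the collection loop.
type rowIter interface {
	Next() bool
	Scan(dest ...any) error
	Err() error
}

// collectRows gathers scanned values into a typed slice. A row whose
// scan fails is skipped; an iteration error aborts the whole query.
func collectRows[T any](rows rowIter, scan func(rowIter) (T, error)) ([]T, error) {
	var out []T
	for rows.Next() {
		v, err := scan(rows)
		if err != nil {
			continue // row-level error: skip the corrupt record
		}
		out = append(out, v)
	}
	if err := rows.Err(); err != nil {
		return nil, err // database-level error
	}
	return out, nil
}

// fakeRows is an in-memory rowIter; an empty string simulates a corrupt row.
type fakeRows struct {
	data []string
	pos  int
}

func (f *fakeRows) Next() bool {
	f.pos++
	return f.pos <= len(f.data)
}

func (f *fakeRows) Scan(dest ...any) error {
	v := f.data[f.pos-1]
	if v == "" {
		return errors.New("corrupt row")
	}
	p, ok := dest[0].(*string)
	if !ok {
		return errors.New("unsupported destination type")
	}
	*p = v
	return nil
}

func (f *fakeRows) Err() error { return nil }

// scanName is the only piece an extract function would need to supply.
func scanName(r rowIter) (string, error) {
	var s string
	err := r.Scan(&s)
	return s, err
}

func main() {
	names, err := collectRows[string](&fakeRows{data: []string{"a", "", "b"}}, scanName)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```
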
### Firefox journal_mode=off

All Firefox extract calls use `journal_mode=off`. Firefox databases use WAL mode in production, and the `modernc.org/sqlite` driver may attempt WAL replay on a temp copy. Disabling the journal prevents this and treats the database as a read-only snapshot.

Chromium extract calls do **not** disable journal mode because `Acquire` already copies the WAL/SHM companions, giving SQLite everything it needs for a clean WAL replay.

## 8. File Utilities

- **CompressDir**: compresses all files in the output directory into a single `.zip` file (used by the `--zip` flag). Original files are removed after archiving.

## Related RFCs

| RFC | Topic |
|-----|-------|
| [RFC-002](002-chromium-data-storage.md) | Chromium data file locations |
| [RFC-004](004-firefox-data-storage.md) | Firefox data file locations |
| [RFC-009](009-windows-locked-file-bypass.md) | Windows locked file bypass technique |

@@ -0,0 +1,120 @@
# RFC-009: Windows Locked File Bypass

**Author**: moonD4rk
**Status**: Living Document
**Created**: 2026-04-05

## 1. Problem

Chromium on Windows sets `PRAGMA locking_mode=EXCLUSIVE` on certain SQLite databases, most notably the `Cookies` file (`Network/Cookies`). This means Chrome's process holds the file open with `dwShareMode=0` (no sharing), preventing any other process from opening it, even for reading.

```
Chrome.exe
  → CreateFileW("Network/Cookies", ..., dwShareMode=0)  // exclusive lock
  → PRAGMA locking_mode=EXCLUSIVE

hack-browser-data.exe
  → os.ReadFile("Network/Cookies")
  → ERROR: access denied
```

This is a **Windows-specific problem**. On macOS and Linux, SQLite uses `fcntl`/`flock` advisory locks, which do not prevent other processes from reading the file. The standard copy path works fine on those platforms.

## 2. Solution Overview

Bypass the exclusive lock using Windows kernel APIs: enumerate system handles to find Chrome's file handle, duplicate it into our process, then read the file contents via memory-mapped I/O. **No admin privileges required.**

```
NtQuerySystemInformation → find Chrome's handle to Cookies file
  → DuplicateHandle into our process
  → CreateFileMappingW + MapViewOfFile (read from kernel cache)
  → write bytes to temp destination
```

## 3. Step-by-Step

### 3.1 Enumerate System Handles

Call `NtQuerySystemInformation` with `SystemExtendedHandleInformation` (class 64) to get every open handle in the system. The "extended" variant uses `ULONG_PTR` for PIDs and handle values, avoiding truncation on 64-bit Windows.

The query starts with a 4 MB buffer and doubles it (up to 256 MB) each time the API returns `STATUS_INFO_LENGTH_MISMATCH`.

Each entry in the result table contains, among others, these fields:

| Field | Go type | Description |
|-------|---------|-------------|
| UniqueProcessID | `uintptr` | Owning process PID |
| HandleValue | `uintptr` | Handle value in the owning process |
| GrantedAccess | `uint32` | Access mask |
| ObjectTypeIndex | `uint16` | Kernel object type |

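
The buffer-doubling loop is independent of the Windows call itself, so it can be sketched portably. Here `growBuffer` and `needs` are hypothetical helpers, and the `query` callback is a stand-in for a wrapper around `NtQuerySystemInformation` that reports whether the buffer was large enough (that is, whether it avoided `STATUS_INFO_LENGTH_MISMATCH`).

```go
package main

import (
	"errors"
	"fmt"
)

const (
	initialBufSize = 4 << 20   // 4 MB starting size
	maxBufSize     = 256 << 20 // 256 MB upper bound
)

var errBufferTooLarge = errors.New("handle table does not fit in the maximum buffer")

// growBuffer calls query with a buffer of size start, doubling the size
// until query reports success or the size would exceed max.
// query returns ok=false when the buffer was too small.
func growBuffer(start, max int, query func(buf []byte) (ok bool, err error)) ([]byte, error) {
	for size := start; size <= max; size *= 2 {
		buf := make([]byte, size)
		ok, err := query(buf)
		if err != nil {
			return nil, err
		}
		if ok {
			return buf, nil
		}
	}
	return nil, errBufferTooLarge
}

// needs returns a fake query that succeeds once the buffer holds n bytes.
func needs(n int) func([]byte) (bool, error) {
	return func(buf []byte) (bool, error) {
		return len(buf) >= n, nil
	}
}

func main() {
	buf, err := growBuffer(initialBufSize, maxBufSize, needs(10<<20))
	if err != nil {
		panic(err)
	}
	fmt.Printf("buffer size used: %d MB\n", len(buf)>>20)
}
```

Capping the growth keeps a misbehaving query from exhausting memory; the caller sees a distinct error instead.
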
### 3.2 Find the Target Handle

For each handle entry:

1. `OpenProcess(PROCESS_DUP_HANDLE, pid)`: open the owning process
2. `DuplicateHandle`: duplicate the handle into our process with `DUPLICATE_SAME_ACCESS`
3. `GetFileType`: verify it is `FILE_TYPE_DISK` (skip pipes, sockets, etc.)
4. `GetFinalPathNameByHandleW`: get the full file path

### 3.3 Path Matching with Short-Name Tolerance

Windows 8.3 short path names (e.g. `RUNNER~1` vs `runneradmin`) cause direct path comparison to fail. The solution extracts a **stable suffix** by stripping everything before `AppData\Local\` or `AppData\Roaming\` and comparing in lowercase:

```
Input:  C:\Users\RUNNER~1\AppData\Local\Google\Chrome\...\Network\Cookies
Suffix: google\chrome\...\network\cookies

Input:  C:\Users\runneradmin\AppData\Local\Google\Chrome\...\Network\Cookies
Suffix: google\chrome\...\network\cookies

→ match!
```

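
A minimal sketch of the suffix comparison. `stableSuffix` and `samePath` are hypothetical names, the example paths are invented, and returning the whole lowercased path when neither marker is present is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// stableSuffix lowercases the path and returns the part that follows
// AppData\Local\ or AppData\Roaming\. If neither marker is found, the
// whole lowercased path is returned.
func stableSuffix(p string) string {
	lower := strings.ToLower(strings.ReplaceAll(p, "/", `\`))
	for _, marker := range []string{`appdata\local\`, `appdata\roaming\`} {
		if i := strings.Index(lower, marker); i >= 0 {
			return lower[i+len(marker):]
		}
	}
	return lower
}

// samePath reports whether two paths refer to the same file once the
// user-specific prefix (which may use an 8.3 short name) is ignored.
func samePath(a, b string) bool {
	return stableSuffix(a) == stableSuffix(b)
}

func main() {
	short := `C:\Users\RUNNER~1\AppData\Local\Example\Browser\Network\Cookies`
	long := `C:\Users\runneradmin\AppData\Local\Example\Browser\Network\Cookies`
	fmt.Println(stableSuffix(short))
	fmt.Println(samePath(short, long))
}
```

Because the prefix is discarded entirely, the comparison also tolerates the `\\?\` prefix that `GetFinalPathNameByHandleW` typically adds.
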
### 3.4 Read via Memory-Mapped I/O

Once we have a duplicated handle to the locked file:

```
DuplicateHandle (read access)
        ↓
CreateFileMappingW(handle, PAGE_READONLY)
        ↓
MapViewOfFile(mapping, FILE_MAP_READ, fileSize)
        ↓
byte slice from kernel file cache
(includes uncommitted WAL data from Chrome)
        ↓
os.WriteFile(destination, bytes, 0600)
```

Memory-mapped I/O reads from the OS kernel's **file cache**, which includes data Chrome has written but not yet checkpointed to disk. This produces a more complete snapshot than a raw `ReadFile`.

**Fallback**: if `CreateFileMappingW` fails (e.g., because the file is zero-length), the code falls back to `Seek(0)` + `ReadFile` on the duplicated handle.

## 4. Why This Works

The key insight is that `dwShareMode=0` only prevents **new** `CreateFileW` calls from opening the file. It does **not** prevent:

- `DuplicateHandle`: creates a copy of an existing handle (Chrome's own handle)
- `CreateFileMappingW`: operates on a handle we already own
- `MapViewOfFile`: reads from the kernel's page cache

This is a documented Windows behavior, not an exploit. The technique requires only standard user privileges because `PROCESS_DUP_HANDLE` access is available for processes owned by the same user.

## 5. Limitations

- **Performance**: enumerating all system handles is expensive (the system may have 100,000+ handles). The entire table must be scanned to find the target file.
- **Race condition**: Chrome could close and reopen the file between enumeration and duplication, though this is unlikely for long-lived database files.
- **Not needed on macOS/Linux**: advisory locking on these platforms does not prevent reading, so the standard `copyFile` path is always sufficient.

## Related RFCs

| RFC | Topic |
|-----|-------|
| [RFC-008](008-file-acquisition-and-platform-quirks.md) | File acquisition lifecycle and session management |