docs: add architecture RFC and development guidelines (#486)

- Add RFC-001 for architecture refactoring proposal
- Add CLAUDE.md with development guidelines and security analysis
- Document current issues and proposed solutions for library support
- Include cross-platform considerations and encryption versioning

The RFC addresses key architectural challenges:
* Limited encryption version support (only v10)
* Scattered cross-platform MasterKey retrieval
* Windows Cookie file access permission issues
* Coupled code architecture preventing library usage
* Inconsistent error handling
* Testing and maintenance difficulties

Proposed improvements include versioned encryption strategies,
unified MasterKey abstraction, and a clean library API design.
Roger
2025-09-02 23:23:19 +08:00
committed by GitHub
parent d101da627d
commit 3e9abed2b3
2 changed files with 505 additions and 0 deletions
+264
@@ -0,0 +1,264 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## ⚠️ CRITICAL SECURITY AND LEGAL NOTICE
**THIS PROJECT IS STRICTLY FOR SECURITY RESEARCH AND DEFENSIVE PURPOSES ONLY**
- This tool is ONLY intended for legitimate security research, authorized audits, and defensive security operations
- ANY use of this project for unauthorized access, data theft, or malicious purposes is STRICTLY PROHIBITED and may violate computer fraud and abuse laws
- Users are SOLELY responsible for ensuring compliance with all applicable laws and regulations in their jurisdiction
- The original author and contributors assume NO legal responsibility for misuse of this tool
- You MUST have explicit authorization before using this tool on any system you do not own
- This tool should NEVER be used for attacking, credential harvesting, or any malicious intent
- All security research must be conducted ethically and within legal boundaries
## Project Overview
HackBrowserData is a command-line security research tool for extracting and decrypting browser data across multiple platforms (Windows, macOS, Linux). It supports data extraction from Chromium-based browsers (Chrome, Edge, Brave, etc.) and Firefox.
**Legitimate Use Cases**:
- Personal data backup and recovery
- Authorized enterprise security audits
- Digital forensics investigations (with proper authorization)
- Security vulnerability research and defense improvement
- Understanding browser security mechanisms for defensive purposes
## Development Commands
### Build the Project
```bash
# Build for current platform
cd cmd/hack-browser-data
go build
# Cross-compile for Windows from macOS/Linux
GOOS=windows GOARCH=amd64 go build
# Cross-compile for Linux from macOS/Windows
GOOS=linux GOARCH=amd64 go build
# Cross-compile for macOS from Linux/Windows
GOOS=darwin GOARCH=amd64 go build
```
### Testing
```bash
# Run all tests
go test -v ./...
# Run tests with coverage
go test -v ./... -covermode=count -coverprofile=coverage.out
# Run specific package tests
go test -v ./browser/chromium/...
go test -v ./crypto/...
```
### Code Quality
```bash
# Format check
gofmt -d .
# Run linter (requires golangci-lint)
golangci-lint run
# Check spelling
typos
# Tidy dependencies
go mod tidy
```
## Architecture Overview
### Core Components
**Browser Abstraction Layer** (`browser/`)
- Interface-based design allowing easy addition of new browsers
- Platform-specific implementations selected by Go build constraints (file name suffixes `_darwin.go`, `_windows.go`, `_linux.go`)
- Automatic profile discovery and multi-profile support
**Data Extraction Pipeline**
1. **Profile Discovery**: `profile/finder.go` locates browser profiles
2. **File Management**: `filemanager/` handles secure copying of browser files
3. **Decryption**: `crypto/` provides platform-specific decryption
   - Windows: DPAPI via Windows API
   - macOS: Keychain access (requires user password)
   - Linux: PBKDF2 key derivation
4. **Data Processing**: `browserdata/` parses and structures extracted data
5. **Output**: `browserdata/outputter.go` exports to CSV/JSON
**Key Interfaces**
- `Browser`: Main interface for browser implementations
- `DataType`: Enum for different data types (passwords, cookies, etc.)
- `BrowserData`: Container for all extracted browser data
### Platform-Specific Considerations
**macOS**
- Requires user password for Keychain access to decrypt Chrome passwords
- Uses Security framework for keychain operations
- Profile paths: `~/Library/Application Support/[Browser]/`
**Windows**
- Uses DPAPI for decryption (no password required)
- Accesses Local State file for encryption keys
- Profile paths: `%LOCALAPPDATA%/[Browser]/User Data/`
**Linux**
- Uses PBKDF2 with the salt "saltysalt"; the password comes from the keyring, with "peanuts" as the hardcoded fallback password
- Requires gnome-keyring or kwallet access
- Profile paths: `~/.config/[Browser]/`
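The three path patterns above can be collected into one lookup keyed by platform. This is a minimal sketch, not code from the repository: `profileRoot` is a hypothetical helper, `ExampleBrowser` is a placeholder for the `[Browser]` segment, and the home and `%LOCALAPPDATA%` values are passed in as arguments so the function can be tested without touching the environment.

```go
package main

import (
	"fmt"
	"path"
)

// profileRoot returns the profile directory for a browser on the given
// platform, following the path patterns listed above. goos uses the same
// values as runtime.GOOS. Forward slashes are used throughout, as in the
// patterns above; real Windows code would use path/filepath instead.
func profileRoot(goos, home, localAppData, browser string) (string, error) {
	switch goos {
	case "darwin":
		return path.Join(home, "Library", "Application Support", browser), nil
	case "windows":
		return path.Join(localAppData, browser, "User Data"), nil
	case "linux":
		return path.Join(home, ".config", browser), nil
	default:
		return "", fmt.Errorf("unsupported platform %q", goos)
	}
}

func main() {
	for _, goos := range []string{"darwin", "windows", "linux"} {
		p, err := profileRoot(goos, "/home/alice", "C:/Users/alice/AppData/Local", "ExampleBrowser")
		fmt.Println(goos, p, err)
	}
}
```

Passing `goos` explicitly, rather than reading `runtime.GOOS` inside the function, lets one test binary exercise all three branches.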
### Security Mechanisms
**Data Protection**
- Temporary file cleanup after extraction
- No persistent storage of decrypted master keys
- Secure memory handling for sensitive data
**File Operations**
- Copy-on-read to avoid modifying original browser files
- Lock file filtering to prevent conflicts
- Atomic operations where possible
## Adding New Browser Support
1. Create browser-specific package in `browser/[name]/`
2. Implement the `Browser` interface
3. Add platform-specific profile paths in `browser/consts.go`
4. Register in `browser/browser.go` picker functions
5. Add data type mappings in `types/types.go`
## Important Files and Their Roles
- `cmd/hack-browser-data/main.go`: CLI entry point and flag handling
- `browser/chromium/chromium.go`: Core Chromium implementation
- `crypto/crypto_[platform].go`: Platform-specific decryption
- `extractor/extractor.go`: Main extraction orchestration
- `profile/finder.go`: Browser profile discovery logic
- `browserdata/password/password.go`: Password parsing and decryption
## Testing Considerations
- Tests use mocked data to avoid requiring actual browser installations
- Platform-specific tests are isolated with build tags
- Sensitive operations (like keychain access) are mocked in tests
- Use `DATA-DOG/go-sqlmock` for database operation testing
## Browser Security Analysis
### Chromium-Based Browsers Security
**Encryption Methods**:
- **Chrome v80+ (Windows)**: AES-256-GCM encryption for sensitive data
- **Pre-v80 (Windows)**: Values protected directly with DPAPI
- **macOS and Linux**: AES-128-CBC with PKCS#5 padding
- **Master Key Storage**:
  - Windows: Encrypted with DPAPI in `Local State` file
  - macOS: Stored in system Keychain (requires user password)
  - Linux: Derived using PBKDF2 with the "saltysalt" salt, from the keyring password or the hardcoded "peanuts" fallback password
**Data Protection Layers**:
1. **Password Storage**: Encrypted in SQLite database (`Login Data`)
2. **Cookie Encryption**: Encrypted values in `Cookies` database
3. **Credit Card Data**: Encrypted with same master key as passwords
4. **Local Storage**: Stored in LevelDB format, some values encrypted
### Firefox Security
**Encryption Architecture**:
- **Master Password**: Optional user-defined password for additional protection
- **Key Database**: `key4.db` stores encrypted master keys
- **NSS Library**: Network Security Services for cryptographic operations
- **Profile Encryption**: Each profile has independent encryption keys
**Key Derivation**:
- Uses PKCS#5 PBKDF2 for key derivation
- Triple-DES (3DES) for legacy compatibility
- AES-256-CBC for modern encryption
- ASN.1 encoding for key storage
### Platform-Specific Security Mechanisms
**Windows DPAPI (Data Protection API)**:
- User-specific encryption tied to Windows login
- No additional password required for decryption
- Keys protected by Windows security subsystem
- Vulnerable if attacker has user-level access
**macOS Keychain Services**:
- Requires user password for access
- Integration with system security framework
- Protected by System Integrity Protection (SIP)
- Security command-line tool for programmatic access
**Linux Secret Service**:
- GNOME Keyring or KDE Wallet integration
- D-Bus communication for key retrieval
- User session-based protection
- Fallback to the hardcoded "peanuts" password (still run through PBKDF2) if the keyring is unavailable
### Security Vulnerabilities and Mitigations
**Known Attack Vectors**:
1. **Physical Access**: Direct file system access to browser profiles
2. **Memory Dumps**: Extraction of decrypted data from RAM
3. **Malware**: Keyloggers and info-stealers targeting browsers
4. **Process Injection**: DLL injection to extract decrypted data
**Defensive Recommendations**:
1. **Enable Master Password**: Firefox users should set master password
2. **Use OS-Level Encryption**: FileVault (macOS), BitLocker (Windows), LUKS (Linux)
3. **Regular Updates**: Keep browsers updated for latest security patches
4. **Profile Isolation**: Use separate profiles for sensitive activities
5. **Hardware Keys**: Use FIDO2/WebAuthn for critical accounts
### Cryptographic Implementation Details
**AES-GCM (Galois/Counter Mode)**:
- Authenticated encryption with associated data (AEAD)
- 96-bit nonce/IV for randomization
- 128-bit authentication tag for integrity
- Used in Chrome v80+ for enhanced security
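The properties above map directly onto Go's standard `crypto/cipher` AEAD API. The following is a minimal sketch that assumes the commonly described layout of a `v10` value on Windows: a 3-byte `v10` prefix, a 12-byte nonce, then the ciphertext with the 16-byte tag appended. `decryptV10` and `sealV10` are hypothetical names, not functions from this repository, and the key is a dummy value rather than a real master key.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"errors"
	"fmt"
)

const (
	v10Prefix = "v10" // assumed version marker at the start of the value
	nonceSize = 12    // 96-bit nonce
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// decryptV10 opens prefix || nonce || ciphertext+tag and verifies the tag.
func decryptV10(key, blob []byte) ([]byte, error) {
	if !bytes.HasPrefix(blob, []byte(v10Prefix)) {
		return nil, errors.New("missing v10 prefix")
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	body := blob[len(v10Prefix):]
	if len(body) < nonceSize+gcm.Overhead() {
		return nil, errors.New("value too short")
	}
	return gcm.Open(nil, body[:nonceSize], body[nonceSize:], nil)
}

// sealV10 is the inverse operation, included only so the sketch can be
// exercised end to end without real browser data.
func sealV10(key, nonce, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(nonce) != gcm.NonceSize() {
		return nil, errors.New("nonce must be 12 bytes")
	}
	out := append([]byte(v10Prefix), nonce...)
	return gcm.Seal(out, nonce, plaintext, nil), nil
}

func main() {
	key := bytes.Repeat([]byte{0x42}, 32)   // dummy AES-256 key
	nonce := bytes.Repeat([]byte{0x01}, 12) // fixed nonce, for the demo only
	blob, err := sealV10(key, nonce, []byte("example-value"))
	if err != nil {
		panic(err)
	}
	plain, err := decryptV10(key, blob)
	fmt.Println(string(plain), err)
}
```

Because GCM is authenticated, `Open` returns an error if a single byte of the value was altered, instead of returning garbage plaintext.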
**PBKDF2 (Password-Based Key Derivation Function 2)**:
- Iterations: 1003 (macOS), 1 (Linux default)
- Hash function: SHA-1 (legacy) or SHA-256
- Salt: "saltysalt" (Chromium on macOS and Linux); "peanuts" is the hardcoded Linux fallback password, not a salt
- Output: 128-bit or 256-bit keys
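To make these parameters concrete, here is a small PBKDF2-HMAC-SHA1 sketch built only from the Go standard library. It is written out by hand to stay self-contained; real code would more likely call `pbkdf2.Key` from `golang.org/x/crypto/pbkdf2`. The Linux example assumes the password "peanuts" with the salt "saltysalt", one iteration, and a 128-bit key, as in Chromium's Linux fallback. `pbkdf2SHA1` and `deriveHex` are illustrative names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// pbkdf2SHA1 derives keyLen bytes from password and salt using
// PBKDF2 with HMAC-SHA1 as the pseudorandom function.
func pbkdf2SHA1(password, salt []byte, iter, keyLen int) []byte {
	prf := hmac.New(sha1.New, password)
	hashLen := prf.Size()
	numBlocks := (keyLen + hashLen - 1) / hashLen
	out := make([]byte, 0, numBlocks*hashLen)
	var counter [4]byte
	for block := 1; block <= numBlocks; block++ {
		// U1 = PRF(password, salt || INT(block))
		prf.Reset()
		prf.Write(salt)
		binary.BigEndian.PutUint32(counter[:], uint32(block))
		prf.Write(counter[:])
		u := prf.Sum(nil)
		t := make([]byte, len(u))
		copy(t, u)
		// T = U1 xor U2 xor ... xor Uiter, with Un = PRF(password, Un-1)
		for i := 1; i < iter; i++ {
			prf.Reset()
			prf.Write(u)
			u = prf.Sum(nil)
			for j := range t {
				t[j] ^= u[j]
			}
		}
		out = append(out, t...)
	}
	return out[:keyLen]
}

// deriveHex is a convenience wrapper that returns the key as hex.
func deriveHex(password, salt string, iter, keyLen int) string {
	return hex.EncodeToString(pbkdf2SHA1([]byte(password), []byte(salt), iter, keyLen))
}

func main() {
	// Assumed Linux fallback parameters: 1 iteration, 16-byte (128-bit) key.
	fmt.Println("linux fallback key:", deriveHex("peanuts", "saltysalt", 1, 16))
	// Same salt and key size with the macOS iteration count; the password
	// here is a placeholder for the value stored in the Keychain.
	fmt.Println("macOS-style key:   ", deriveHex("keychain-secret", "saltysalt", 1003, 16))
}
```

With a single iteration and a public fallback password, the Linux fallback key is effectively a constant, which is why the keyring-backed password matters for any real protection.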
**DPAPI Internals**:
- Uses CryptProtectData/CryptUnprotectData Windows APIs
- Machine-specific or user-specific encryption
- Automatic key management by Windows
- Integrates with Windows credential manager
## Dependencies
- `modernc.org/sqlite`: Pure Go SQLite for cross-platform compatibility
- `github.com/godbus/dbus`: Linux keyring access
- `github.com/ppacher/go-dbus-keyring`: Secret service integration
- `github.com/tidwall/gjson`: JSON parsing for browser preferences
- `github.com/syndtr/goleveldb`: LevelDB for IndexedDB/LocalStorage
## Ethical Usage Guidelines
### Responsible Disclosure
- Report vulnerabilities to browser vendors through official channels
- Allow reasonable time for patches before public disclosure
- Never exploit vulnerabilities for personal gain
### Legal Compliance
- Obtain written authorization before testing third-party systems
- Comply with GDPR, CCPA, and other privacy regulations
- Respect intellectual property and terms of service
- Maintain audit logs of all security testing activities
### Best Practices for Security Researchers
1. **Scope Definition**: Clearly define testing boundaries
2. **Data Handling**: Securely delete any extracted sensitive data
3. **Documentation**: Maintain detailed records of methodologies
4. **Collaboration**: Work with security community ethically
5. **Education**: Share knowledge to improve overall security
+241
@@ -0,0 +1,241 @@
# RFC-001: HackBrowserData Architecture Refactoring
**Author**: moonD4rk
**Status**: Proposed
**Created**: 2025-09-01
**Updated**: 2025-09-01
## Abstract
This RFC analyzes the current architectural issues in the HackBrowserData project and proposes refactoring directions. The core goal of the refactoring is to establish a modular, extensible, and testable architecture while supporting usage as a library that can be imported by other projects.
## Current Issues Analysis
### 1. Limited Encryption Version Support
**Current State**:
- Only supports the `v10` ciphertext format (the AES-GCM scheme introduced in Chrome 80); `v10` is a prefix on the encrypted value, not a browser version
- Hardcoded "v10" prefix handling logic in the code
- Lacks version detection and dynamic selection mechanism
**Impact**:
- Unable to support data extraction from older browser versions
- Cannot adapt to future browser encryption algorithm upgrades (e.g., v11, v20)
- Chrome is introducing new encryption mechanisms (e.g., App-Bound Encryption in Chrome 127+), which the current architecture struggles to extend
### 2. Scattered Cross-Platform MasterKey Retrieval
**Current State**:
- Windows: Decrypts encrypted_key from Local State via DPAPI
- macOS: Accesses Keychain through security command, derives key using PBKDF2
- Linux: Accesses Secret Service via D-Bus or falls back to the hardcoded "peanuts" password
**Issues**:
- Each platform implementation is completely independent without a unified interface
- Difficult to add new key retrieval methods
- Code duplication and maintenance challenges
- Chrome on Windows is updating retrieval methods, requiring support for multiple strategies
### 3. Windows Cookie File Access Permission Issues
**Specific Issues**:
- On Windows, browsers lock Cookie files during runtime
- Direct reading may encounter "The process cannot access the file" errors
- Some security software blocks access to Cookie files
**Current Approach Limitations**:
- Simple file copying may fail due to file locking
- Lacks alternative access strategies (e.g., shadow copy, process injection)
- No abstraction for permission elevation or bypass mechanisms
### 4. Coupled Code Architecture
**Problems**:
- CLI logic mixed with core functionality
- Data extraction, decryption, and output are tightly coupled
- Uses global variables and functions, difficult to use as a library
**Specific Impact**:
- Cannot use core functionality independently
- Difficult to unit test
- Code reuse challenges
### 5. Inconsistent Error Handling
**Current State**:
- Some functions return errors, others directly use logging
- Error messages lack context (which browser, data type, platform)
- Cannot distinguish error severity (ignorable vs. fatal errors)
**Impact**:
- Debugging difficulties with insufficient error information
- Cannot implement flexible error handling strategies
- Inconsistent user experience
### 6. Testing and Maintenance Difficulties
**Issues**:
- Depends on real file system and browser installations
- Cannot mock system calls and external dependencies
- Low test coverage
- Adding new features requires modifying multiple code locations
## Architecture Improvement Proposals
### 1. Versioned Encryption Strategies
**Design Approach**:
- Create encryption version interface where each version implements its own detection and decryption logic
- Use registration mechanism to manage all supported versions
- Support both automatic detection and manual version specification
**Key Capabilities**:
- Version Detection: Automatically identify encryption version through data characteristics
- Version Registration: Dynamically register new encryption version implementations
- Priority Control: Try different versions by priority
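One possible shape for the registration mechanism is sketched below. The RFC does not define any of these names: `CipherVersion`, `Registry`, and `prefixVersion` are hypothetical, and the `Decrypt` method of the toy implementation only strips the prefix as a placeholder for real cryptography.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sort"
)

// CipherVersion is a hypothetical interface: each encryption version
// carries its own detection and decryption logic.
type CipherVersion interface {
	Name() string
	Detect(blob []byte) bool
	Decrypt(key, blob []byte) ([]byte, error)
}

var ErrUnknownVersion = errors.New("no registered version matches")

type entry struct {
	priority int
	version  CipherVersion
}

// Registry holds versions ordered from highest to lowest priority.
type Registry struct{ entries []entry }

// Register adds a version; higher priority values are tried first.
func (r *Registry) Register(priority int, v CipherVersion) {
	r.entries = append(r.entries, entry{priority, v})
	sort.SliceStable(r.entries, func(i, j int) bool {
		return r.entries[i].priority > r.entries[j].priority
	})
}

// Detect returns the first registered version that recognizes blob.
func (r *Registry) Detect(blob []byte) (CipherVersion, error) {
	for _, e := range r.entries {
		if e.version.Detect(blob) {
			return e.version, nil
		}
	}
	return nil, ErrUnknownVersion
}

// ByName supports manual version selection.
func (r *Registry) ByName(name string) (CipherVersion, error) {
	for _, e := range r.entries {
		if e.version.Name() == name {
			return e.version, nil
		}
	}
	return nil, ErrUnknownVersion
}

// prefixVersion is a toy version identified by a ciphertext prefix.
type prefixVersion struct{ prefix string }

func (p prefixVersion) Name() string            { return p.prefix }
func (p prefixVersion) Detect(blob []byte) bool { return bytes.HasPrefix(blob, []byte(p.prefix)) }

// Decrypt is a placeholder: it strips the prefix and performs no cryptography.
func (p prefixVersion) Decrypt(_, blob []byte) ([]byte, error) {
	return blob[len(p.prefix):], nil
}

func newRegistry() *Registry {
	r := &Registry{}
	r.Register(10, prefixVersion{"v10"})
	r.Register(20, prefixVersion{"v20"})
	return r
}

func main() {
	r := newRegistry()
	v, err := r.Detect([]byte("v20payload"))
	fmt.Println(v.Name(), err)
	_, err = r.Detect([]byte("plaintext"))
	fmt.Println(err)
}
```

Supporting a new format then means registering one more `CipherVersion`, without touching the call sites that use `Detect`.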
### 2. Unified MasterKey Retrieval Abstraction
**Design Approach**:
- Define cross-platform MasterKey retrieval interface
- Each platform can have multiple retrieval strategies
- Support strategy chain, trying different methods sequentially
**Windows Strategy Examples**:
- DPAPI Strategy (traditional method)
- App-Bound Strategy (Chrome 127+)
- Cloud Sync Strategy (potential future)
**Key Capabilities**:
- Platform detection and automatic selection
- Strategy priority and fallback mechanisms
- Error handling and logging
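A strategy chain with fallback could look roughly like the sketch below. `KeyRetriever`, `Chain`, and the stub strategies are hypothetical names introduced for illustration; the stubs return canned results and do not call DPAPI or any App-Bound mechanism.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyRetriever is a hypothetical cross-platform interface for one
// master key retrieval strategy.
type KeyRetriever interface {
	Name() string
	RetrieveKey() ([]byte, error)
}

// Chain tries each strategy in order and stops at the first success.
type Chain []KeyRetriever

// RetrieveKey returns the key and the name of the strategy that produced
// it. If every strategy fails, the individual errors are joined so the
// caller can see why each one was skipped.
func (c Chain) RetrieveKey() (key []byte, used string, err error) {
	if len(c) == 0 {
		return nil, "", errors.New("no strategies configured")
	}
	var errs []error
	for _, r := range c {
		k, e := r.RetrieveKey()
		if e == nil {
			return k, r.Name(), nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", r.Name(), e))
	}
	return nil, "", errors.Join(errs...)
}

// stubRetriever returns a canned result; it stands in for a real strategy.
type stubRetriever struct {
	name string
	key  []byte
	err  error
}

func (s stubRetriever) Name() string                 { return s.name }
func (s stubRetriever) RetrieveKey() ([]byte, error) { return s.key, s.err }

func main() {
	chain := Chain{
		stubRetriever{name: "app-bound", err: errors.New("not available on this system")},
		stubRetriever{name: "dpapi", key: []byte("dummy-master-key")},
	}
	key, used, err := chain.RetrieveKey()
	fmt.Printf("strategy=%s keyLen=%d err=%v\n", used, len(key), err)
}
```

Ordering the slice is the priority mechanism here: newer strategies go first, and the traditional one acts as the fallback.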
### 3. File Access Abstraction Layer
**Design Approach**:
- Create file access interface encapsulating different access strategies
- For Windows Cookie issues, implement multiple access methods
- Provide unified error handling and retry mechanisms
**Windows Cookie Access Strategies**:
- Direct Copy (current method)
- Volume Shadow Copy Service (VSS)
- Memory Reading (from browser process)
- Stream Reading (bypass exclusive locks)
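The access-strategy idea can be sketched with the one strategy that needs no platform APIs, a direct copy, placed behind an interface with a simple retry-and-fallback loop. `FileAccessor`, `accessFile`, and `lockedStub` are hypothetical names; the stub merely simulates a sharing violation, and VSS, memory reading, and stream reading are deliberately left out.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// FileAccessor copies src into destDir and returns the path of the copy.
type FileAccessor interface {
	Name() string
	CopyTo(src, destDir string) (string, error)
}

// directCopy is the plain copy strategy (the current method).
type directCopy struct{}

func (directCopy) Name() string { return "direct-copy" }
func (directCopy) CopyTo(src, destDir string) (string, error) {
	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()
	dst := filepath.Join(destDir, filepath.Base(src))
	out, err := os.Create(dst)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return "", err
	}
	return dst, out.Close()
}

// lockedStub always fails, simulating a file held open by the browser.
type lockedStub struct{}

func (lockedStub) Name() string { return "locked-stub" }
func (lockedStub) CopyTo(src, _ string) (string, error) {
	return "", fmt.Errorf("simulated sharing violation on %s", src)
}

// accessFile tries each accessor up to attempts times before moving on.
func accessFile(src, destDir string, accessors []FileAccessor, attempts int) (path, used string, err error) {
	for _, a := range accessors {
		for i := 0; i < attempts; i++ {
			path, err = a.CopyTo(src, destDir)
			if err == nil {
				return path, a.Name(), nil
			}
		}
	}
	if err == nil {
		err = fmt.Errorf("no access attempt was made")
	}
	return "", "", fmt.Errorf("access %s: %w", src, err)
}

// demo builds a throwaway source file and copies it through the chain.
func demo() (content, used string, err error) {
	srcDir, err := os.MkdirTemp("", "src")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(srcDir)
	destDir, err := os.MkdirTemp("", "dst")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(destDir)
	src := filepath.Join(srcDir, "Cookies")
	if err := os.WriteFile(src, []byte("dummy-cookie-db"), 0o600); err != nil {
		return "", "", err
	}
	copied, used, err := accessFile(src, destDir, []FileAccessor{lockedStub{}, directCopy{}}, 2)
	if err != nil {
		return "", "", err
	}
	data, err := os.ReadFile(copied)
	if err != nil {
		return "", "", err
	}
	return string(data), used, nil
}

func main() {
	content, used, err := demo()
	fmt.Println(content, used, err)
}
```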
### 4. Layered Package Structure
**Design Principles**:
- Separate public API from internal implementation
- Separate interface definitions from concrete implementations
- Isolate platform-specific code
**Package Structure Plan**:
```
pkg/ # Public API (externally importable)
├── browser/ # Browser interface definitions
├── crypto/ # Encryption interface definitions
└── extractor/ # Data extractor interface definitions
internal/ # Internal implementation (not exposed)
├── browser/ # Browser implementations
├── crypto/ # Encryption algorithm implementations
└── platform/ # Platform-specific implementations
```
### 5. Improved Browser Interface
**Design Goals**:
- Support dependency injection
- Configurable and extensible
- Easy to test
**Core Methods**:
- Configuration settings (profile, crypto provider, etc.)
- Data extraction (support selecting data types)
- Capability queries (supported data types and platforms)
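The three groups of core methods, plus dependency injection, can be illustrated with functional options. Everything named here (`Browser`, `CryptoProvider`, `WithProfile`, `WithCrypto`, `fakeChromium`, `upperCrypto`) is a hypothetical sketch, not the project's existing `Browser` interface; the fake browser keeps its rows in memory instead of reading SQLite, and the mock provider upper-cases bytes instead of decrypting them.

```go
package main

import (
	"fmt"
	"strings"
)

type DataType string

const (
	Passwords DataType = "passwords"
	Cookies   DataType = "cookies"
)

// CryptoProvider is the injected dependency that turns stored bytes into plaintext.
type CryptoProvider interface {
	Decrypt(blob []byte) ([]byte, error)
}

// Browser covers capability queries and selective data extraction.
type Browser interface {
	Name() string
	Supports(t DataType) bool
	Extract(types ...DataType) (map[DataType][]string, error)
}

// Option configures a browser at construction time.
type Option func(*fakeChromium)

func WithProfile(p string) Option        { return func(b *fakeChromium) { b.profile = p } }
func WithCrypto(c CryptoProvider) Option { return func(b *fakeChromium) { b.crypto = c } }

// fakeChromium holds "encrypted" rows in memory for illustration.
type fakeChromium struct {
	profile string
	crypto  CryptoProvider
	rows    map[DataType][][]byte
}

func NewFakeChromium(rows map[DataType][][]byte, opts ...Option) Browser {
	b := &fakeChromium{profile: "Default", rows: rows}
	for _, o := range opts {
		o(b)
	}
	return b
}

func (b *fakeChromium) Name() string { return "chromium(" + b.profile + ")" }

func (b *fakeChromium) Supports(t DataType) bool {
	_, ok := b.rows[t]
	return ok
}

func (b *fakeChromium) Extract(types ...DataType) (map[DataType][]string, error) {
	if b.crypto == nil {
		return nil, fmt.Errorf("%s: no crypto provider configured", b.Name())
	}
	out := make(map[DataType][]string)
	for _, t := range types {
		if !b.Supports(t) {
			return nil, fmt.Errorf("%s: unsupported data type %q", b.Name(), t)
		}
		for _, blob := range b.rows[t] {
			plain, err := b.crypto.Decrypt(blob)
			if err != nil {
				return nil, fmt.Errorf("%s: decrypt %s: %w", b.Name(), t, err)
			}
			out[t] = append(out[t], string(plain))
		}
	}
	return out, nil
}

// upperCrypto is a mock provider for tests: it upper-cases the input.
type upperCrypto struct{}

func (upperCrypto) Decrypt(blob []byte) ([]byte, error) {
	return []byte(strings.ToUpper(string(blob))), nil
}

func main() {
	rows := map[DataType][][]byte{Cookies: {[]byte("sid=abc")}}
	b := NewFakeChromium(rows, WithProfile("Profile 1"), WithCrypto(upperCrypto{}))
	got, err := b.Extract(Cookies)
	fmt.Println(b.Name(), got, err)
}
```

Because the crypto provider is injected, a unit test can swap in `upperCrypto` and never touch the Keychain, DPAPI, or a keyring.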
### 6. Unified Error Handling
**Design Approach**:
- Define structured error types
- Include rich context information
- Support error classification and handling strategies
**Error Information Should Include**:
- Operation type
- Browser name
- Data type
- Platform information
- Severity level
- Original error
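The fields listed above translate naturally into a Go error type that wraps the original error. This is one possible sketch; `ExtractError`, `Severity`, and `IsFatal` are hypothetical names, not types that exist in the codebase today.

```go
package main

import (
	"errors"
	"fmt"
)

// Severity separates errors a caller may skip from those that must abort.
type Severity int

const (
	SeverityIgnorable Severity = iota
	SeverityFatal
)

func (s Severity) String() string {
	if s == SeverityFatal {
		return "fatal"
	}
	return "ignorable"
}

// ExtractError carries the context listed above alongside the original error.
type ExtractError struct {
	Op       string // operation type, e.g. "copy" or "decrypt"
	Browser  string
	DataType string
	Platform string
	Severity Severity
	Err      error // original error
}

func (e *ExtractError) Error() string {
	return fmt.Sprintf("%s: browser=%s data=%s platform=%s severity=%s: %v",
		e.Op, e.Browser, e.DataType, e.Platform, e.Severity, e.Err)
}

// Unwrap lets errors.Is and errors.As reach the original error.
func (e *ExtractError) Unwrap() error { return e.Err }

// IsFatal reports whether err contains an ExtractError marked fatal.
func IsFatal(err error) bool {
	var ee *ExtractError
	return errors.As(err, &ee) && ee.Severity == SeverityFatal
}

// errLocked is a sample sentinel used only by this sketch.
var errLocked = errors.New("file locked")

func main() {
	err := fmt.Errorf("extract failed: %w", &ExtractError{
		Op: "copy", Browser: "chrome", DataType: "cookies",
		Platform: "windows", Severity: SeverityIgnorable, Err: errLocked,
	})
	fmt.Println(err)
	fmt.Println("fatal:", IsFatal(err), "locked:", errors.Is(err, errLocked))
}
```

A caller extracting several data types can then log and skip ignorable errors while stopping on fatal ones, which addresses the severity gap described under Current Issues.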
### 7. Library API Design
**Design Goals**:
- Provide clean client interface
- Support convenient methods for common use cases
- Allow advanced users to customize behavior
**Use Cases**:
- Simple: One-click extraction of all browser data
- Advanced: Custom encryption versions, error handling, data filtering
### 8. Testing Strategy
**Improvement Directions**:
- Use interfaces instead of concrete implementations
- Support dependency injection
- Provide mock implementations
**Test Types**:
- Unit tests: Test independent components
- Integration tests: Test component interactions
- Platform tests: Test platform-specific functionality
## Implementation Recommendations
### Priority Levels
1. **High Priority**:
   - Versioned encryption strategies (solve version support issues)
   - MasterKey retrieval abstraction (unify cross-platform implementations)
   - Windows Cookie access issues (solve permission problems)
2. **Medium Priority**:
   - Browser interface refactoring
   - Unified error handling
   - Basic testing framework
3. **Low Priority**:
   - Complete library API
   - Advanced feature extensions
   - Performance optimizations
### Compatibility Considerations
- Keep CLI backward compatible, internally calling new architecture
- Provide migration documentation
- Gradually deprecate old APIs across versions
## Security Considerations
1. **Minimize Permissions**: Only request necessary system permissions
2. **Memory Safety**: Zero out sensitive data after use
3. **Error Messages**: Avoid leaking sensitive information
4. **Input Validation**: Strictly validate paths and data
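Two of these points, memory safety and input validation, lend themselves to small helpers. The sketch below is illustrative only: `zero` and `resolveInside` are hypothetical names, and zeroing a slice in Go is best effort, since the runtime may already have copied the data elsewhere.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// zero overwrites a sensitive buffer in place. This is best effort: it
// cannot reach copies made earlier by the runtime or by string conversion.
func zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

// resolveInside joins name onto base and rejects results that would
// escape base, for example through ".." segments.
func resolveInside(base, name string) (string, error) {
	p := filepath.Join(base, name)
	rel, err := filepath.Rel(base, p)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes %q", name, base)
	}
	return p, nil
}

func main() {
	key := []byte("dummy-master-key")
	defer zero(key) // clear the key when it is no longer needed

	p, err := resolveInside("/tmp/profile", "Cookies")
	fmt.Println(p, err)
	_, err = resolveInside("/tmp/profile", "../../etc/passwd")
	fmt.Println(err)
}
```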
## Open Questions
1. **File Access Strategy Selection**: How to automatically select the best file access strategy?
2. **Error Recovery**: How to gracefully recover and continue when encountering partial failures?
3. **Configuration Management**: Should configuration files be supported to control behavior?
4. **Plugin System**: Should user-defined data extractors be supported?
## References
- [Chromium OS Crypt](https://source.chromium.org/chromium/chromium/src/+/main:components/os_crypt/)
- [Chrome Password Decryption](https://github.com/chromium/chromium/blob/main/components/os_crypt/sync/os_crypt_win.cc)
- [Firefox NSS](https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS)
- [Windows File Locking](https://docs.microsoft.com/en-us/windows/win32/fileio/locking-and-unlocking-byte-ranges-in-files)