RFC-001: HackBrowserData Architecture Refactoring

Author: moonD4rk
Status: Proposed
Created: 2025-09-01
Updated: 2025-09-01

Abstract

This RFC analyzes the current architectural issues in the HackBrowserData project and proposes refactoring directions. The core goal of the refactoring is to establish a modular, extensible, and testable architecture while supporting usage as a library that can be imported by other projects.

Current Issues Analysis

1. Limited Encryption Version Support

Current State:

  • Only supports the Chrome v10 AES-GCM encryption format (Chrome 80+)
  • The "v10" prefix handling is hardcoded
  • Lacks a version detection and dynamic selection mechanism

Impact:

  • Unable to support data extraction from older browser versions
  • Cannot adapt to other or future browser encryption formats (e.g., v11, v20)
  • Chrome is introducing new encryption mechanisms (e.g., App-Bound Encryption in Chrome 127+), which the current architecture cannot easily accommodate

2. Scattered Cross-Platform MasterKey Retrieval

Current State:

  • Windows: Decrypts encrypted_key from Local State via DPAPI
  • macOS: Accesses Keychain through security command, derives key using PBKDF2
  • Linux: Accesses Secret Service via D-Bus, or falls back to the hardcoded "peanuts" password

Issues:

  • Each platform implementation is completely independent without a unified interface
  • Difficult to add new key retrieval methods
  • Code duplication and maintenance challenges
  • Chrome on Windows is updating retrieval methods, requiring support for multiple strategies

3. Windows Cookie File Access Permission Issues

Specific Issues:

  • On Windows, browsers lock Cookie files during runtime
  • Direct reading may encounter "The process cannot access the file" errors
  • Some security software blocks access to Cookie files

Current Approach Limitations:

  • Simple file copying may fail due to file locking
  • Lacks alternative access strategies (e.g., shadow copy, process injection)
  • No abstraction for permission elevation or bypass mechanisms

4. Coupled Code Architecture

Problems:

  • CLI logic mixed with core functionality
  • Data extraction, decryption, and output are tightly coupled
  • Relies on global variables and functions, which makes it difficult to use as a library

Specific Impact:

  • Cannot use core functionality independently
  • Difficult to unit test
  • Code reuse challenges

5. Inconsistent Error Handling

Current State:

  • Some functions return errors, while others only log them
  • Error messages lack context (which browser, data type, platform)
  • Cannot distinguish error severity (ignorable vs. fatal errors)

Impact:

  • Debugging is difficult because errors carry insufficient information
  • Cannot implement flexible error handling strategies
  • Inconsistent user experience

6. Testing and Maintenance Difficulties

Issues:

  • Depends on real file system and browser installations
  • Cannot mock system calls and external dependencies
  • Low test coverage
  • Adding new features requires modifying multiple code locations

Architecture Improvement Proposals

1. Versioned Encryption Strategies

Design Approach:

  • Create encryption version interface where each version implements its own detection and decryption logic
  • Use registration mechanism to manage all supported versions
  • Support both automatic detection and manual version specification

Key Capabilities:

  • Version Detection: Automatically identify encryption version through data characteristics
  • Version Registration: Dynamically register new encryption version implementations
  • Priority Control: Try different versions by priority

2. Unified MasterKey Retrieval Abstraction

Design Approach:

  • Define cross-platform MasterKey retrieval interface
  • Each platform can have multiple retrieval strategies
  • Support strategy chain, trying different methods sequentially

Windows Strategy Examples:

  • DPAPI Strategy (traditional method)
  • App-Bound Strategy (Chrome 127+)
  • Cloud Sync Strategy (potential future)

Key Capabilities:

  • Platform detection and automatic selection
  • Strategy priority and fallback mechanisms
  • Error handling and logging

3. File Access Abstraction Layer

Design Approach:

  • Create file access interface encapsulating different access strategies
  • For Windows Cookie issues, implement multiple access methods
  • Provide unified error handling and retry mechanisms

Windows Cookie Access Strategies:

  • Direct Copy (current method)
  • Volume Shadow Copy Service (VSS)
  • Memory Reading (from browser process)
  • Stream Reading (bypass exclusive locks)

4. Layered Package Structure

Design Principles:

  • Separate public API from internal implementation
  • Separate interface definitions from concrete implementations
  • Isolate platform-specific code

Package Structure Plan:

pkg/           # Public API (externally importable)
├── browser/   # Browser interface definitions
├── crypto/    # Encryption interface definitions
└── extractor/ # Data extractor interface definitions

internal/      # Internal implementation (not exposed)
├── browser/   # Browser implementations
├── crypto/    # Encryption algorithm implementations
└── platform/  # Platform-specific implementations

5. Improved Browser Interface

Design Goals:

  • Support dependency injection
  • Configurable and extensible
  • Easy to test

Core Methods:

  • Configuration settings (profile, crypto provider, etc.)
  • Data extraction (support selecting data types)
  • Capability queries (supported data types and platforms)
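
As a rough sketch of these goals, the hypothetical `Browser` interface below groups the three kinds of core methods and uses functional options for configuration and dependency injection. All identifiers (`Browser`, `Decryptor`, `WithProfile`, `WithDecryptor`, `NewStubBrowser`) are assumptions, and `stubBrowser` serves in-memory records rather than reading real browser files.

```go
package main

import (
	"fmt"
	"sort"
)

type DataType string

const (
	Passwords DataType = "passwords"
	Cookies   DataType = "cookies"
)

// Decryptor is the crypto provider injected into a browser.
type Decryptor interface {
	Decrypt(ciphertext []byte) ([]byte, error)
}

// DecryptFunc lets a plain function act as a Decryptor.
type DecryptFunc func([]byte) ([]byte, error)

func (f DecryptFunc) Decrypt(c []byte) ([]byte, error) { return f(c) }

// Browser is a hypothetical interface covering the method groups above.
type Browser interface {
	Name() string
	SupportedTypes() []DataType
	Extract(types ...DataType) (map[DataType][]string, error)
}

type config struct {
	profile   string
	decryptor Decryptor
}

// Option configures a browser at construction time.
type Option func(*config)

func WithProfile(p string) Option      { return func(c *config) { c.profile = p } }
func WithDecryptor(d Decryptor) Option { return func(c *config) { c.decryptor = d } }

// stubBrowser serves in-memory records instead of reading browser files.
type stubBrowser struct {
	cfg  config
	data map[DataType][][]byte
}

func NewStubBrowser(data map[DataType][][]byte, opts ...Option) Browser {
	identity := DecryptFunc(func(b []byte) ([]byte, error) { return b, nil })
	cfg := config{profile: "Default", decryptor: identity}
	for _, opt := range opts {
		opt(&cfg)
	}
	return &stubBrowser{cfg: cfg, data: data}
}

func (b *stubBrowser) Name() string { return "stub/" + b.cfg.profile }

func (b *stubBrowser) SupportedTypes() []DataType {
	types := make([]DataType, 0, len(b.data))
	for t := range b.data {
		types = append(types, t)
	}
	sort.Slice(types, func(i, j int) bool { return types[i] < types[j] })
	return types
}

func (b *stubBrowser) Extract(types ...DataType) (map[DataType][]string, error) {
	out := make(map[DataType][]string)
	for _, t := range types {
		records, ok := b.data[t]
		if !ok {
			return nil, fmt.Errorf("%s: data type %q is not supported", b.Name(), t)
		}
		for _, rec := range records {
			plain, err := b.cfg.decryptor.Decrypt(rec)
			if err != nil {
				return nil, fmt.Errorf("%s: decrypt %s: %w", b.Name(), t, err)
			}
			out[t] = append(out[t], string(plain))
		}
	}
	return out, nil
}

func main() {
	b := NewStubBrowser(
		map[DataType][][]byte{Cookies: {[]byte("sid=1")}},
		WithProfile("Profile 1"),
	)
	fmt.Println(b.Name(), b.SupportedTypes())
	data, err := b.Extract(Cookies)
	fmt.Println(data, err)
}
```

Because the decryptor is injected, a test can pass a fake `Decryptor` without touching any platform key store.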

6. Unified Error Handling

Design Approach:

  • Define structured error types
  • Include rich context information
  • Support error classification and handling strategies

Error Information Should Include:

  • Operation type
  • Browser name
  • Data type
  • Platform information
  • Severity level
  • Original error

7. Library API Design

Design Goals:

  • Provide clean client interface
  • Support convenient methods for common use cases
  • Allow advanced users to customize behavior

Use Cases:

  • Simple: One-click extraction of all browser data
  • Advanced: Custom encryption versions, error handling, data filtering

8. Testing Strategy

Improvement Directions:

  • Use interfaces instead of concrete implementations
  • Support dependency injection
  • Provide mock implementations

Test Types:

  • Unit tests: Test independent components
  • Integration tests: Test component interactions
  • Platform tests: Test platform-specific functionality

Implementation Recommendations

Priority Levels

  1. High Priority:

    • Versioned encryption strategies (solve version support issues)
    • MasterKey retrieval abstraction (unify cross-platform implementations)
    • Windows Cookie access issues (solve permission problems)
  2. Medium Priority:

    • Browser interface refactoring
    • Unified error handling
    • Basic testing framework
  3. Low Priority:

    • Complete library API
    • Advanced feature extensions
    • Performance optimizations

Compatibility Considerations

  • Keep the CLI backward compatible while it calls the new architecture internally
  • Provide migration documentation
  • Gradually deprecate old APIs across versions

Security Considerations

  1. Minimize Permissions: Only request necessary system permissions
  2. Memory Safety: Zero out sensitive data after use
  3. Error Messages: Avoid leaking sensitive information
  4. Input Validation: Strictly validate paths and data
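
For the memory safety item, a best-effort helper might look like the following sketch. `Zero` and `WithKey` are hypothetical names. The approach only wipes the slice it is given: it cannot wipe copies made elsewhere or key material converted to a Go `string`, since strings are immutable.

```go
package main

import "fmt"

// Zero overwrites every byte of b in place.
func Zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

// WithKey hands key to fn and wipes the slice once fn returns, even when
// fn returns an error. The caller must not retain key or sub-slices of it.
func WithKey(key []byte, fn func(key []byte) error) error {
	defer Zero(key)
	return fn(key)
}

func main() {
	key := []byte("demo-master-key")
	err := WithKey(key, func(k []byte) error {
		fmt.Printf("using %d-byte key\n", len(k))
		return nil
	})
	fmt.Println("error:", err, "wiped:", key[0] == 0)
}
```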

Open Questions

  1. File Access Strategy Selection: How to automatically select the best file access strategy?
  2. Error Recovery: How to gracefully recover and continue when encountering partial failures?
  3. Configuration Management: Should configuration files be supported to control behavior?
  4. Plugin System: Should user-defined data extractors be supported?
