932 Commits

Author SHA1 Message Date
Cuong Manh Le 2742669bc1 fix: prevent panic on network change during SetSelfIP
SetSelfIP unconditionally accessed t.dhcp, but t.dhcp is only
initialized when DHCP discovery is enabled. A network change event
can fire SetSelfIP regardless of the discovery configuration,
causing a nil pointer dereference.

Guard the t.dhcp access with a nil check so the self IP is still
updated on the Table even when DHCP discovery is disabled.
2026-04-30 19:19:19 +07:00
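
The nil guard described in this commit can be sketched as follows. The `Table` and `dhcpDiscover` types here are simplified, hypothetical stand-ins and not the actual ctrld definitions; only the shape of the check is meant to carry over.

```go
package main

import "fmt"

// dhcpDiscover is a hypothetical stand-in for the DHCP discovery component.
type dhcpDiscover struct{ selfIP string }

// Table is a hypothetical stand-in for the client info table.
type Table struct {
	selfIP string
	dhcp   *dhcpDiscover // nil when DHCP discovery is disabled
}

// SetSelfIP always updates the table and only touches dhcp when it exists.
func (t *Table) SetSelfIP(ip string) {
	t.selfIP = ip
	if t.dhcp != nil {
		t.dhcp.selfIP = ip
	}
}

func main() {
	tbl := &Table{} // DHCP discovery disabled, so tbl.dhcp is nil
	tbl.SetSelfIP("192.168.1.10")
	fmt.Println(tbl.selfIP)
}
```

With the guard in place, a network change event can call `SetSelfIP` safely regardless of the discovery configuration.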
Cuong Manh Le a767ebdaa5 doq: use OpenStreamSync and retry on StreamLimitReachedError
Replace conn.OpenStream (non-blocking) with conn.OpenStreamSync so that
the resolver waits for the server's MAX_STREAMS credit replenishment frame
instead of immediately failing when the stream limit is temporarily
exhausted. Also retry on StreamLimitReachedError as defense-in-depth for
servers that are slow or fail to send MAX_STREAMS updates.
2026-04-30 19:19:19 +07:00
Codescribe a92d20cef8 doq: configure QUIC keep-alive and retry on idle timeout
Pass a quic.Config with KeepAlivePeriod (15s) to DoQ dial calls instead
of nil, so pooled connections send periodic QUIC PINGs to stay alive and
detect dead paths proactively.

Also add IdleTimeoutError to the DoQ retry conditions alongside io.EOF,
so stale pooled connections trigger a transparent retry instead of
propagating as a query failure.
2026-04-30 19:19:19 +07:00
Codescribe a8821e6d00 fix(darwin): support non-standard listener port in intercept mode
When port 53 is taken (e.g. by mDNSResponder), ctrld failed with
'could not find available listen ip and port' instead of falling back
to port 5354. Root cause: tryUpdateListenerConfig() checked the
dnsIntercept bool, which is derived in prog.run() AFTER listener
config is resolved.

Fix: check interceptMode string directly (CLI flag + config fallback)
in a new tryUpdateListenerConfigIntercept() that tries 127.0.0.1:53
then 127.0.0.1:5354.

Also updates buildPFAnchorRules() to use the actual listener IP/port
from config instead of hardcoded 127.0.0.1:53, so pf rules redirect
to wherever ctrld is actually listening.
2026-04-30 19:19:19 +07:00
Codescribe a3880beec2 docs: port IPv6 learnings and comment fixes to master
- Update comment in ensurePFAnchorReference: pfctl -sn returns
  rdr-anchor only (nat-anchor not used by ctrld)
- Update nat-anchor table entry in pf-dns-intercept.md
- Add pf nuances 10-16 from investigation: cross-AF redirect,
  block return, sendmsg EINVAL, nat-on-lo0, raw sockets, DIOCNATLOOK,
  and the pragmatic IPv6 block solution
2026-04-30 19:19:19 +07:00
Codescribe d7124995d2 fix: bracket IPv6 addresses in VPN DNS upstream config
upstreamConfigFor() used strings.Contains(":") to decide whether to
append ":53", which always evaluates true for IPv6 addresses. This left
bare addresses like "2a0d:6fc0:9b0:3600::1" without brackets or port,
causing net.Dial to reject with "too many colons in address".

Use net.JoinHostPort() which handles IPv6 bracketing automatically,
producing "[2a0d:6fc0:9b0:3600::1]:53".
2026-04-30 19:19:19 +07:00
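
A minimal illustration of why `net.JoinHostPort` avoids the bug described above. The helper name `withDefaultPort` is hypothetical and is not the actual `upstreamConfigFor()` code.

```go
package main

import (
	"fmt"
	"net"
)

// withDefaultPort is a hypothetical helper that appends the DNS port to a
// bare IP. net.JoinHostPort adds brackets when the host contains a colon.
func withDefaultPort(ip string) string {
	return net.JoinHostPort(ip, "53")
}

func main() {
	fmt.Println(withDefaultPort("10.0.0.1"))              // 10.0.0.1:53
	fmt.Println(withDefaultPort("2a0d:6fc0:9b0:3600::1")) // [2a0d:6fc0:9b0:3600::1]:53
}
```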
Codescribe 86dafc432d Add log tail command for live log streaming
This commit adds a new `ctrld log tail` subcommand that streams
runtime debug logs to the terminal in real-time, similar to `tail -f`.

Changes:
- log_writer.go: Add Subscribe/tailLastLines for fan-out to tail clients
- control_server.go: Add /log/tail endpoint with streaming response
  - Internal logging: subscribes to logWriter for live data
  - File-based logging: polls log file for new data (200ms interval)
  - Sends last N lines as initial context on connect
- commands.go: Add `log tail` cobra subcommand with --lines/-n flag
- control_client.go: Add postStream() with no timeout for long-lived connections

Usage:
  sudo ctrld log tail          # shows last 10 lines then follows
  sudo ctrld log tail -n 50    # shows last 50 lines then follows
  Ctrl+C to stop
2026-04-30 19:19:19 +07:00
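
The fan-out to tail clients mentioned for log_writer.go might look roughly like the sketch below. The `fanoutWriter` type, its buffer size, and the drop-on-slow-subscriber policy are all assumptions made for illustration; the real `Subscribe` implementation may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// fanoutWriter is a hypothetical writer that copies each write to all subscribers.
type fanoutWriter struct {
	mu   sync.Mutex
	subs map[chan []byte]struct{}
}

func newFanoutWriter() *fanoutWriter {
	return &fanoutWriter{subs: make(map[chan []byte]struct{})}
}

// Subscribe registers a buffered channel and returns it with an unsubscribe func.
func (w *fanoutWriter) Subscribe() (<-chan []byte, func()) {
	ch := make(chan []byte, 16) // buffer size is an arbitrary choice
	w.mu.Lock()
	w.subs[ch] = struct{}{}
	w.mu.Unlock()
	unsubscribe := func() {
		w.mu.Lock()
		defer w.mu.Unlock()
		if _, ok := w.subs[ch]; ok {
			delete(w.subs, ch)
			close(ch)
		}
	}
	return ch, unsubscribe
}

// Write sends a copy of p to every subscriber without blocking the logger.
func (w *fanoutWriter) Write(p []byte) (int, error) {
	data := append([]byte(nil), p...)
	w.mu.Lock()
	defer w.mu.Unlock()
	for ch := range w.subs {
		select {
		case ch <- data:
		default: // slow subscriber: drop this chunk rather than stall logging
		}
	}
	return len(p), nil
}

func main() {
	w := newFanoutWriter()
	ch, stop := w.Subscribe()
	defer stop()
	fmt.Fprintln(w, "listener started")
	fmt.Print(string(<-ch))
}
```

Closing the channel under the same mutex that guards writes means a write can never send on a closed channel.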
Cuong Manh Le ca8d07d3f5 fix(darwin): correct pf rules tests 2026-04-30 19:19:19 +07:00
Cuong Manh Le 2aaa78ef48 fix(windows): make staticcheck happy 2026-04-30 19:19:19 +07:00
Codescribe 0f2a930cf8 feat: robust username detection and CI updates
Add platform-specific username detection for Control D metadata:
- macOS: directory services (dscl) with console user fallback
- Linux: systemd loginctl, utmp, /etc/passwd traversal
- Windows: WTS session enumeration, registry, token lookup
2026-04-30 19:19:19 +07:00
Codescribe 5a6163142c feat: add VPN DNS split routing 2026-04-30 19:19:19 +07:00
Codescribe 402771bed6 feat: add Windows NRPT and WFP DNS interception 2026-04-30 19:19:19 +07:00
Codescribe a99dcca288 feat: add macOS pf DNS interception 2026-04-30 19:19:19 +07:00
Codescribe 395335162f feat: introduce DNS intercept mode infrastructure 2026-04-30 19:19:19 +07:00
Codescribe c56d4771de docs: add DNS Intercept Mode section to README 2026-04-30 19:19:19 +07:00
Codescribe ea48186d73 Fix dnsFromResolvConf not filtering loopback IPs
The continue statement only broke out of the inner loop, so
loopback/local IPs (e.g. 127.0.0.1) were never filtered.
This caused ctrld to use itself as bootstrap DNS when already
installed as the system resolver — a self-referential loop.

Use the same isLocal flag pattern as getDNSFromScutil() and
getAllDHCPNameservers().
2026-04-30 19:19:19 +07:00
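
The flag pattern referred to above can be sketched like this. `filterLocal` and its parameters are hypothetical; the point is that a `continue` inside the inner loop only advances the inner loop, so the decision has to be carried out to the outer loop in a variable.

```go
package main

import (
	"fmt"
	"net"
)

// filterLocal is a hypothetical helper that drops loopback nameservers and
// any nameserver equal to one of the machine's own addresses.
func filterLocal(nameservers, localAddrs []string) []string {
	var out []string
	for _, ns := range nameservers {
		ip := net.ParseIP(ns)
		if ip == nil {
			continue // skip entries that are not IP addresses
		}
		isLocal := ip.IsLoopback()
		for _, l := range localAddrs {
			if ip.Equal(net.ParseIP(l)) {
				isLocal = true
				break // leaves only the inner loop
			}
		}
		if isLocal {
			continue // this continue applies to the outer loop
		}
		out = append(out, ns)
	}
	return out
}

func main() {
	ns := []string{"127.0.0.1", "192.168.1.5", "1.1.1.1"}
	fmt.Println(filterLocal(ns, []string{"192.168.1.5"}))
}
```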
Cuong Manh Le bc71622deb Use go1.25 for CI 2026-04-30 19:19:19 +07:00
Codescribe 846aaac27a fix: include hostname hints in metadata for API-side fallback
Send all available hostname sources (ComputerName, LocalHostName,
HostName, os.Hostname) in the metadata map when provisioning.
This allows the API to detect and repair generic hostnames like
'Mac' by picking the best available source server-side.

Belt and suspenders: preferredHostname() picks the right one
client-side, but metadata gives the API a second chance.
2026-04-30 19:19:19 +07:00
Codescribe f1e49a7ee6 fix(darwin): use scutil for provisioning hostname (#485)
macOS Sequoia with Private Wi-Fi Address enabled causes os.Hostname()
to return generic names like "Mac.lan" from DHCP instead of the real
computer name. The /utility provisioning endpoint sends this raw,
resulting in devices named "Mac-lan" in the dashboard.

Fallback chain: ComputerName → LocalHostName → os.Hostname()

LocalHostName can also be affected by DHCP. ComputerName is the
user-set display name from System Settings, fully immune to network state.
2026-04-30 19:19:19 +07:00
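
The fallback chain amounts to picking the first usable name in priority order. The sketch below assumes the three values have already been obtained (for example from scutil and `os.Hostname()`); `firstNonEmpty` is a hypothetical helper, not the project's own function.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmpty returns the first candidate that is not blank.
// Candidates are expected in priority order:
// ComputerName, LocalHostName, os.Hostname().
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if s := strings.TrimSpace(c); s != "" {
			return s
		}
	}
	return ""
}

func main() {
	// ComputerName unavailable, LocalHostName present.
	fmt.Println(firstNonEmpty("", "alices-macbook", "Mac.lan"))
}
```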
Cuong Manh Le 878b3d7920 fix(cli): avoid warning when HTTP log server is not yet available
Treat "socket missing" (ENOENT) and connection refused as expected when
probing the log server, and only log when the error indicates something
unexpected. This prevents noisy warnings when the log server has not
started yet.

Discovered while doing captive portal tests.
2026-04-30 19:19:19 +07:00
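
One way to classify the two expected errors is with `errors.Is`, as in the sketch below. The function names and the socket path are hypothetical, and the example assumes a Unix-like system where dialing a missing socket path yields ENOENT.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isExpectedProbeError reports whether err only means that the log server
// has not started yet: the socket file is missing or nothing is listening.
func isExpectedProbeError(err error) bool {
	return errors.Is(err, os.ErrNotExist) || errors.Is(err, syscall.ECONNREFUSED)
}

// probe tries to connect to a Unix socket and returns any dial error.
func probe(socketPath string) error {
	c, err := net.Dial("unix", socketPath)
	if err != nil {
		return err
	}
	return c.Close()
}

func main() {
	err := probe("/nonexistent/ctrld-log.sock") // hypothetical path
	switch {
	case err == nil:
		fmt.Println("log server reachable")
	case isExpectedProbeError(err):
		// Expected during startup: stay quiet.
	default:
		fmt.Println("unexpected probe error:", err)
	}
}
```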
Cuong Manh Le f1b93c81bc refactor(doq): simplify DoQ connection pool implementation
Replace the map-based pool and refCount bookkeeping with a channel-based
pool. Drop the closed state, per-connection address tracking, and extra
mutexes so the pool relies on the channel for concurrency and lifecycle,
matching the approach used in the DoT pool.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 60dd366cc4 refactor(dot): simplify DoT connection pool implementation
Replace the map-based pool and refCount bookkeeping with a channel-based
pool. Drop the closed state, per-connection address tracking, and
extra mutexes so the pool relies on the channel for concurrency and
lifecycle.
2026-04-30 19:19:19 +07:00
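
A channel-based pool of the kind described in this and the preceding DoQ refactor could look like the sketch below. The `conn` and `pool` types are hypothetical placeholders; a real pool would hold TLS or QUIC connections and close them properly.

```go
package main

import "fmt"

// conn is a hypothetical placeholder for a pooled connection.
type conn struct {
	id     int
	closed bool
}

// pool keeps idle connections in a buffered channel, which provides both
// the storage and the synchronization.
type pool struct {
	idle chan *conn
	dial func() *conn
}

func newPool(size int, dial func() *conn) *pool {
	return &pool{idle: make(chan *conn, size), dial: dial}
}

// get reuses an idle connection when one is available, otherwise dials.
func (p *pool) get() *conn {
	select {
	case c := <-p.idle:
		return c
	default:
		return p.dial()
	}
}

// put returns a connection to the pool, or closes it when the pool is full.
func (p *pool) put(c *conn) {
	select {
	case p.idle <- c:
	default:
		c.closed = true
	}
}

func main() {
	dials := 0
	p := newPool(2, func() *conn { dials++; return &conn{id: dials} })
	c := p.get()
	p.put(c)
	fmt.Println(p.get() == c, dials)
}
```

Because the channel already serializes access, no map, reference count, or extra mutex is needed in this sketch.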
Cuong Manh Le e45e56c021 fix(dot): validate connections before reuse to prevent io.EOF errors
Add connection health check in getConn to validate TLS connections
before reusing them from the pool. This prevents io.EOF errors when
reusing connections that were closed by the server (e.g., due to idle
timeout).
2026-04-30 19:19:19 +07:00
Cuong Manh Le 6f331f19c8 fix(dns): handle empty and invalid IP addresses gracefully
Add guard checks to prevent panics when processing client info with
empty IP addresses. Replace netip.MustParseAddr with ParseAddr to
handle invalid IP addresses gracefully instead of panicking.

Add test to verify queryFromSelf handles IP addresses safely.
2026-04-30 19:19:19 +07:00
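
The difference between the two parsing calls is that `netip.MustParseAddr` panics on bad input while `netip.ParseAddr` returns an error. A minimal sketch of the guarded form follows; `isLoopbackClient` is a hypothetical helper, not the actual `queryFromSelf` code.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isLoopbackClient reports whether ip is a loopback address. Empty or
// malformed input yields false instead of a panic.
func isLoopbackClient(ip string) bool {
	if ip == "" {
		return false
	}
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	return addr.IsLoopback()
}

func main() {
	for _, ip := range []string{"", "not-an-ip", "127.0.0.1", "8.8.8.8"} {
		fmt.Printf("%q -> %v\n", ip, isLoopbackClient(ip))
	}
}
```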
Cuong Manh Le e4ca728ef0 refactor(network): consolidate network change monitoring
Remove separate watchLinkState function and integrate link state change
handling directly into monitorNetworkChanges. This consolidates network
monitoring logic into a single place and simplifies the codebase.

Update netlink dependency from v1.2.1-beta.2 to v1.3.1 and netns from
v0.0.4 to v0.0.5 to use stable versions.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 8117084d39 fix(windows): improve DNS server discovery for domain-joined machines
Add DNS suffix matching for non-physical adapters when domain-joined.
This allows interfaces with matching DNS suffix to be considered valid
even if not in validInterfacesMap, improving DNS server discovery for
remote VPN scenarios.
2026-04-30 19:19:19 +07:00
Cuong Manh Le e23451df37 fix(system): disable ghw warnings to reduce log noise
Disable warnings from ghw library when retrieving chassis information.
These warnings are undesirable but recoverable errors that emit unnecessary
log messages. Using WithDisableWarnings() suppresses them while maintaining
functionality.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 43d4e1957c fix: remove incorrect transport close on DoH3 error
Remove the transport Close() call from DoH3 error handling path.
The transport is shared and reused across requests, and closing it
on error would break subsequent requests. The transport lifecycle
is already properly managed by the http.Client and the finalizer
set in newDOH3Transport().
2026-04-30 19:19:19 +07:00
Cuong Manh Le ba3dd3a4b0 Include system metadata when posting to the utility API 2026-04-30 19:19:19 +07:00
Cuong Manh Le 8b92dc97a3 perf(dot): implement connection pooling for improved performance
Implement TCP/TLS connection pooling for DoT resolver to match DoQ
performance. Previously, DoT created a new TCP/TLS connection for every
DNS query, incurring significant TLS handshake overhead. Now connections are
reused across queries, eliminating this overhead for subsequent requests.

The implementation follows the same pattern as DoQ, using parallel dialing
and connection pooling to achieve comparable performance characteristics.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 9158cd7835 fix(config): use three-state atomic for rebootstrap to prevent data race
Replace boolean rebootstrap flag with a three-state atomic integer to
prevent concurrent SetupTransport calls during rebootstrap. The atomic
state machine ensures only one goroutine can proceed from "started" to
"in progress", eliminating the need for a mutex while maintaining
thread safety.

States: NotStarted -> Started -> InProgress -> NotStarted

Note that the race condition is still acceptable because any additional
transports created during the race are functional. Once the connection
is established, the unused transports are safely handled by the garbage
collector.
2026-04-30 19:19:19 +07:00
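
The state machine above can be sketched with `atomic.Int32` and compare-and-swap. The type and method names below are hypothetical; only the three states and their transitions are taken from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	rebootstrapNotStarted int32 = iota
	rebootstrapStarted
	rebootstrapInProgress
)

// upstream is a hypothetical holder of the rebootstrap state.
type upstream struct {
	state atomic.Int32
}

// requestRebootstrap moves NotStarted -> Started; duplicate requests are no-ops.
func (u *upstream) requestRebootstrap() bool {
	return u.state.CompareAndSwap(rebootstrapNotStarted, rebootstrapStarted)
}

// maybeRebootstrap runs setup only for the caller that wins the
// Started -> InProgress transition, then resets to NotStarted.
func (u *upstream) maybeRebootstrap(setup func()) bool {
	if !u.state.CompareAndSwap(rebootstrapStarted, rebootstrapInProgress) {
		return false
	}
	defer u.state.Store(rebootstrapNotStarted)
	setup()
	return true
}

func main() {
	u := &upstream{}
	u.requestRebootstrap()
	ran := u.maybeRebootstrap(func() { fmt.Println("setting up transport") })
	fmt.Println(ran, u.maybeRebootstrap(func() {}))
}
```

Only one goroutine can succeed at the second compare-and-swap, so setup is not entered concurrently for the same request.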
Cuong Manh Le 2d9603609f refactor(config): consolidate transport setup and eliminate duplication
Consolidate DoH/DoH3/DoQ transport initialization into a single
SetupTransport method and introduce generic helper functions to eliminate
duplicated IP stack selection logic across transport getters.

This reduces code duplication by ~77 lines while maintaining the same
functionality.
2026-04-30 19:19:19 +07:00
Cuong Manh Le e4e655414c perf(doq): implement connection pooling for improved performance
Implement QUIC connection pooling for DoQ resolver to match DoH3
performance. Previously, DoQ created a new QUIC connection for every
DNS query, incurring significant handshake overhead. Now connections are
reused across queries, eliminating this overhead for subsequent requests.

The implementation follows the same pattern as DoH3, using parallel dialing
and connection pooling to achieve comparable performance characteristics.
2026-04-30 19:19:19 +07:00
Cuong Manh Le aacba92698 docs: add documentation for runtime internal logging 2026-04-30 19:19:19 +07:00
Cuong Manh Le c3c9e1a4d7 .github/workflows: temporarily use actions/setup-go
Since WillAbides/setup-go-faster failed with macOS-latest.

See: https://github.com/WillAbides/setup-go-faster/issues/37
2026-04-30 19:19:19 +07:00
Cuong Manh Le 9a3840954b Upgrade quic-go to v0.57.0 2026-04-30 19:19:19 +07:00
Cuong Manh Le 673308a1fe docs: add v2.0.0 breaking changes documentation
- Add comprehensive documentation for ctrld v2.0.0 breaking changes
- Document removal of automatic configuration for router/server platforms
- Provide step-by-step migration guide for affected users
- Include detailed dnsmasq and Windows Server configuration examples
- Update README.md to reflect v2.0.0 installer URLs and Go version requirements
- Remove references to automatic dnsmasq upstream configuration in README
2026-04-30 19:19:19 +07:00
Cuong Manh Le 2cb0456265 .github/workflows: upgrade staticcheck-action to v1.4.0
While at it, also bump the Go version to 1.24.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 9dd4183981 Upgrade quic-go to v0.56.0
Updates #461
2026-04-30 19:19:19 +07:00
Cuong Manh Le aacbcad133 cmd/cli: work around TB.TempDir path being too long for a Unix socket path
Discovered while testing the v2.0.0 GitHub MR.

See: https://github.com/golang/go/issues/62614

While at it, also fix staticcheck linter on Windows.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 1489245f50 cmd/cli: ensure error message ends with newline 2026-04-30 19:19:19 +07:00
Cuong Manh Le 6aedc2b2d3 docs: add comprehensive package documentation for rulematcher
- Add detailed package documentation to engine.go explaining the rule matching
  system, supported rule types (Network, MAC, Domain), and priority ordering
- Include usage example demonstrating typical API usage patterns
- Remove unused Type() method from RuleMatcher interface and implementations
- Maintain backward compatibility while improving code documentation

The documentation explains the policy-based DNS routing system and how different
rule types interact with configurable priority ordering.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 9b1f102315 refactor: remove unused StopOnFirstMatch field from MatchingConfig
Remove StopOnFirstMatch field that was defined but never used in the
actual matching logic.

The current implementation always evaluates all rule types and applies
a fixed precedence (Domain > MAC > Network), making the StopOnFirstMatch
field unnecessary.

Changes:
- Remove StopOnFirstMatch from MatchingConfig structs
- Update DefaultMatchingConfig() function
- Update all test cases and references
- Simplify configuration to only include Order field

This cleanup removes dead code and simplifies the configuration API
without changing any functional behavior.
2026-04-30 19:19:19 +07:00
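
The behavior described here (every rule type is evaluated, then a fixed Domain > MAC > Network precedence picks the winner) might be sketched as follows. The `matcher` type, the string keys, and `selectUpstream` are hypothetical simplifications and not the rulematcher package API.

```go
package main

import "fmt"

// matcher is a hypothetical rule matcher that reports the upstream it
// selects for a query, if any.
type matcher func(query string) (upstream string, ok bool)

// selectUpstream evaluates all known matchers in the configured order, then
// applies the fixed precedence domain > mac > network to the results.
func selectUpstream(order []string, matchers map[string]matcher, query string) (string, bool) {
	results := make(map[string]string)
	for _, name := range order {
		m, known := matchers[name]
		if !known {
			continue // unknown rule types are ignored
		}
		if up, ok := m(query); ok {
			results[name] = up
		}
	}
	for _, name := range []string{"domain", "mac", "network"} {
		if up, ok := results[name]; ok {
			return up, true
		}
	}
	return "", false
}

func main() {
	matchers := map[string]matcher{
		"network": func(string) (string, bool) { return "upstream.0", true },
		"domain":  func(q string) (string, bool) { return "upstream.1", q == "example.com." },
	}
	order := []string{"network", "mac", "domain"}
	fmt.Println(selectUpstream(order, matchers, "example.com."))
	fmt.Println(selectUpstream(order, matchers, "other.test."))
}
```

In this sketch the configured order changes only the sequence of evaluation, not which match wins.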
Cuong Manh Le c365051732 feat: add configurable rule matching with improved code structure
Implement configurable DNS policy rule matching order and refactor
upstreamFor method for better maintainability.

New features:
- Add MatchingConfig to ListenerPolicyConfig for rule order configuration
- Support custom rule evaluation order (network, mac, domain)
- Add stop_on_first_match configuration option
- Hidden from config files (mapstructure:"-" toml:"-") for future release

Code improvements:
- Create upstreamForRequest struct to reduce method parameter count
- Refactor upstreamForWithConfig to use single struct parameter
- Improve code readability and maintainability
- Maintain full backward compatibility

Technical details:
- String-based configuration converted to RuleType enum internally
- Default behavior preserved (network → mac → domain order)
- Domain rules still override MAC/network rules regardless of order
- Comprehensive test coverage for configuration integration

The matching configuration is programmatically accessible but hidden
from user configuration files until ready for public release.
2026-04-30 19:19:19 +07:00
Cuong Manh Le 6294ba4028 feat: add configurable rule matching engine
Implement MatchingEngine in internal/rulematcher package to enable
configurable DNS policy rule evaluation order and behavior.

New components:
- MatchingConfig: Configuration for rule order and stop behavior
- MatchingEngine: Orchestrates rule matching with configurable order
- MatchingResult: Standardized result structure
- DefaultMatchingConfig(): Maintains backward compatibility

Key features:
- Configurable rule evaluation order (e.g., domain-first, MAC-first)
- StopOnFirstMatch configuration option
- Graceful handling of invalid rule types
- Comprehensive test coverage for all scenarios

The engine supports custom matching strategies while preserving
the default Networks → Macs → Domains order for backward compatibility.
This enables future configuration-driven rule matching without
breaking existing functionality.
2026-04-30 19:19:18 +07:00
Cuong Manh Le 261f9483a2 refactor: extract rule matching logic into internal/rulematcher package
Extract DNS policy rule matching logic from dns_proxy.go into a dedicated
internal/rulematcher package to improve code organization and maintainability.

The new package provides:
- RuleMatcher interface for extensible rule matching
- NetworkRuleMatcher for IP-based network rules
- MacRuleMatcher for MAC address-based rules
- DomainRuleMatcher for domain/wildcard rules
- Comprehensive unit tests for all matchers

This refactoring improves:
- Separation of concerns between DNS proxy and rule matching
- Testability with isolated rule matcher components
- Reusability of rule matching logic across the codebase
- Maintainability with focused, single-responsibility modules
2026-04-30 19:19:18 +07:00
Cuong Manh Le e17a538312 Fix staticcheck linter 2026-04-30 19:19:18 +07:00
Cuong Manh Le 650e47a504 refactor: consolidate network interface detection logic
Move platform-specific network interface detection from cmd/cli/ to root package
as ValidInterfaces function. This eliminates code duplication and provides a
consistent interface for determining valid physical network interfaces across
all platforms.

- Remove duplicate validInterfacesMap functions from platform-specific files
- Add context parameter to virtualInterfaces for proper logging
- Update all callers to use ctrld.ValidInterfaces instead of local functions
- Improve error handling in virtual interface detection on Linux
2026-04-30 19:19:18 +07:00
Cuong Manh Le f24059885f feat: add --rfc1918 flag for explicit LAN client support
Make RFC1918 listener spawning opt-in via --rfc1918 flag instead of automatic behavior.
This allows users to explicitly control when ctrld listens on private network addresses
to receive DNS queries from LAN clients, improving security and configurability.

Refactor network interface detection to better distinguish between physical and virtual
interfaces, ensuring only real hardware interfaces are used for RFC1918 address binding.
2026-04-30 19:19:18 +07:00
Cuong Manh Le 52cfb4c302 Change download URL for v2
While at it, also update the CI flow to reflect the new path.
2026-04-30 19:19:18 +07:00