Replace conn.OpenStream (non-blocking) with conn.OpenStreamSync so that
the resolver waits for the server's MAX_STREAMS credit replenishment frame
instead of immediately failing when the stream limit is temporarily
exhausted. Also retry on StreamLimitReachedError as defense-in-depth for
servers that are slow or fail to send MAX_STREAMS updates.
Pass a quic.Config with KeepAlivePeriod (15s) to DoQ dial calls instead
of nil, so pooled connections send periodic QUIC PINGs to stay alive and
detect dead paths proactively.
Also add IdleTimeoutError to the DoQ retry conditions alongside io.EOF,
so stale pooled connections trigger a transparent retry instead of
propagating as a query failure.
When port 53 is taken (e.g. by mDNSResponder), ctrld failed with
'could not find available listen ip and port' instead of falling back
to port 5354. Root cause: tryUpdateListenerConfig() checked the
dnsIntercept bool, which is derived in prog.run() AFTER listener
config is resolved.
Fix: check interceptMode string directly (CLI flag + config fallback)
in a new tryUpdateListenerConfigIntercept() that tries 127.0.0.1:53
then 127.0.0.1:5354.
Also updates buildPFAnchorRules() to use the actual listener IP/port
from config instead of hardcoded 127.0.0.1:53, so pf rules redirect
to wherever ctrld is actually listening.
- Update comment in ensurePFAnchorReference: pfctl -sn returns
rdr-anchor only (nat-anchor not used by ctrld)
- Update nat-anchor table entry in pf-dns-intercept.md
- Add pf nuances 10-16 from investigation: cross-AF redirect,
block return, sendmsg EINVAL, nat-on-lo0, raw sockets, DIOCNATLOOK,
and the pragmatic IPv6 block solution
upstreamConfigFor() used strings.Contains(":") to decide whether to
append ":53", which always evaluates true for IPv6 addresses. This left
bare addresses like "2a0d:6fc0:9b0:3600::1" without brackets or port,
causing net.Dial to reject with "too many colons in address".
Use net.JoinHostPort() which handles IPv6 bracketing automatically,
producing "[2a0d:6fc0:9b0:3600::1]:53".
This commit adds a new `ctrld log tail` subcommand that streams
runtime debug logs to the terminal in real-time, similar to `tail -f`.
Changes:
- log_writer.go: Add Subscribe/tailLastLines for fan-out to tail clients
- control_server.go: Add /log/tail endpoint with streaming response
- Internal logging: subscribes to logWriter for live data
- File-based logging: polls log file for new data (200ms interval)
- Sends last N lines as initial context on connect
- commands.go: Add `log tail` cobra subcommand with --lines/-n flag
- control_client.go: Add postStream() with no timeout for long-lived connections
Usage:
sudo ctrld log tail # shows last 10 lines then follows
sudo ctrld log tail -n 50 # shows last 50 lines then follows
Ctrl+C to stop
The continue statement only broke out of the inner loop, so
loopback/local IPs (e.g. 127.0.0.1) were never filtered.
This caused ctrld to use itself as bootstrap DNS when already
installed as the system resolver — a self-referential loop.
Use the same isLocal flag pattern as getDNSFromScutil() and
getAllDHCPNameservers().
Send all available hostname sources (ComputerName, LocalHostName,
HostName, os.Hostname) in the metadata map when provisioning.
This allows the API to detect and repair generic hostnames like
'Mac' by picking the best available source server-side.
Belt and suspenders: preferredHostname() picks the right one
client-side, but metadata gives the API a second chance.
macOS Sequoia with Private Wi-Fi Address enabled causes os.Hostname()
to return generic names like "Mac.lan" from DHCP instead of the real
computer name. The /utility provisioning endpoint sends this raw,
resulting in devices named "Mac-lan" in the dashboard.
Fallback chain: ComputerName → LocalHostName → os.Hostname()
LocalHostName can also be affected by DHCP. ComputerName is the
user-set display name from System Settings, fully immune to network state.
Treat "socket missing" (ENOENT) and connection refused as expected when
probing the log server, and only log when the error indicates something
unexpected. This prevents noisy warnings when the log server has not
started yet.
Discover while doing captive portal tests.
Replace the map-based pool and refCount bookkeeping with a channel-based
pool. Drop the closed state, per-connection address tracking, and extra
mutexes so the pool relies on the channel for concurrency and lifecycle,
matching the approach used in the DoT pool.
Replace the map-based pool and refCount bookkeeping with a channel-based
pool. Drop the closed state, per-connection address tracking, and
extra mutexes so the pool relies on the channel for concurrency and
lifecycle.
Add connection health check in getConn to validate TLS connections
before reusing them from the pool. This prevents io.EOF errors when
reusing connections that were closed by the server (e.g., due to idle
timeout).
Add guard checks to prevent panics when processing client info with
empty IP addresses. Replace netip.MustParseAddr with ParseAddr to
handle invalid IP addresses gracefully instead of panicking.
Add test to verify queryFromSelf handles IP addresses safely.
Remove separate watchLinkState function and integrate link state change
handling directly into monitorNetworkChanges. This consolidates network
monitoring logic into a single place and simplifies the codebase.
Update netlink dependency from v1.2.1-beta.2 to v1.3.1 and netns from
v0.0.4 to v0.0.5 to use stable versions.
Add DNS suffix matching for non-physical adapters when domain-joined.
This allows interfaces with matching DNS suffix to be considered valid
even if not in validInterfacesMap, improving DNS server discovery for
remote VPN scenarios.
Disable warnings from ghw library when retrieving chassis information.
These warnings are undesirable but recoverable errors that emit unnecessary
log messages. Using WithDisableWarnings() suppresses them while maintaining
functionality.
Remove the transport Close() call from DoH3 error handling path.
The transport is shared and reused across requests, and closing it
on error would break subsequent requests. The transport lifecycle
is already properly managed by the http.Client and the finalizer
set in newDOH3Transport().
Implement TCP/TLS connection pooling for DoT resolver to match DoQ
performance. Previously, DoT created a new TCP/TLS connection for every
DNS query, incurring significant TLS handshake overhead. Now connections are
reused across queries, eliminating this overhead for subsequent requests.
The implementation follows the same pattern as DoQ, using parallel dialing
and connection pooling to achieve comparable performance characteristics.
Replace boolean rebootstrap flag with a three-state atomic integer to
prevent concurrent SetupTransport calls during rebootstrap. The atomic
state machine ensures only one goroutine can proceed from "started" to
"in progress", eliminating the need for a mutex while maintaining
thread safety.
States: NotStarted -> Started -> InProgress -> NotStarted
Note that the race condition is still acceptable because any additional
transports created during the race are functional. Once the connection
is established, the unused transports are safely handled by the garbage
collector.
Consolidate DoH/DoH3/DoQ transport initialization into a single
SetupTransport method and introduce generic helper functions to eliminate
duplicated IP stack selection logic across transport getters.
This reduces code duplication by ~77 lines while maintaining the same
functionality.
Implement QUIC connection pooling for DoQ resolver to match DoH3
performance. Previously, DoQ created a new QUIC connection for every
DNS query, incurring significant handshake overhead. Now connections are
reused across queries, eliminating this overhead for subsequent requests.
The implementation follows the same pattern as DoH3, using parallel dialing
and connection pooling to achieve comparable performance characteristics.
- Add comprehensive documentation for ctrld v2.0.0 breaking changes
- Document removal of automatic configuration for router/server platforms
- Provide step-by-step migration guide for affected users
- Include detailed dnsmasq and Windows Server configuration examples
- Update README.md to reflect v2.0.0 installer URLs and Go version requirements
- Remove references to automatic dnsmasq upstream configuration in README
- Add detailed package documentation to engine.go explaining the rule matching
system, supported rule types (Network, MAC, Domain), and priority ordering
- Include usage example demonstrating typical API usage patterns
- Remove unused Type() method from RuleMatcher interface and implementations
- Maintain backward compatibility while improving code documentation
The documentation explains the policy-based DNS routing system and how different
rule types interact with configurable priority ordering.
Remove StopOnFirstMatch field that was defined but never used in the
actual matching logic.
The current implementation always evaluates all rule types and applies
a fixed precedence (Domain > MAC > Network), making the StopOnFirstMatch
field unnecessary.
Changes:
- Remove StopOnFirstMatch from MatchingConfig structs
- Update DefaultMatchingConfig() function
- Update all test cases and references
- Simplify configuration to only include Order field
This cleanup removes dead code and simplifies the configuration API
without changing any functional behavior.
Implement configurable DNS policy rule matching order and refactor
upstreamFor method for better maintainability.
New features:
- Add MatchingConfig to ListenerPolicyConfig for rule order configuration
- Support custom rule evaluation order (network, mac, domain)
- Add stop_on_first_match configuration option
- Hidden from config files (mapstructure:"-" toml:"-") for future release
Code improvements:
- Create upstreamForRequest struct to reduce method parameter count
- Refactor upstreamForWithConfig to use single struct parameter
- Improve code readability and maintainability
- Maintain full backward compatibility
Technical details:
- String-based configuration converted to RuleType enum internally
- Default behavior preserved (network → mac → domain order)
- Domain rules still override MAC/network rules regardless of order
- Comprehensive test coverage for configuration integration
The matching configuration is programmatically accessible but hidden
from user configuration files until ready for public release.
Implement MatchingEngine in internal/rulematcher package to enable
configurable DNS policy rule evaluation order and behavior.
New components:
- MatchingConfig: Configuration for rule order and stop behavior
- MatchingEngine: Orchestrates rule matching with configurable order
- MatchingResult: Standardized result structure
- DefaultMatchingConfig(): Maintains backward compatibility
Key features:
- Configurable rule evaluation order (e.g., domain-first, MAC-first)
- StopOnFirstMatch configuration option
- Graceful handling of invalid rule types
- Comprehensive test coverage for all scenarios
The engine supports custom matching strategies while preserving
the default Networks → Macs → Domains order for backward compatibility.
This enables future configuration-driven rule matching without
breaking existing functionality.
Extract DNS policy rule matching logic from dns_proxy.go into a dedicated
internal/rulematcher package to improve code organization and maintainability.
The new package provides:
- RuleMatcher interface for extensible rule matching
- NetworkRuleMatcher for IP-based network rules
- MacRuleMatcher for MAC address-based rules
- DomainRuleMatcher for domain/wildcard rules
- Comprehensive unit tests for all matchers
This refactoring improves:
- Separation of concerns between DNS proxy and rule matching
- Testability with isolated rule matcher components
- Reusability of rule matching logic across the codebase
- Maintainability with focused, single-responsibility modules
Move platform-specific network interface detection from cmd/cli/ to root package
as ValidInterfaces function. This eliminates code duplication and provides a
consistent interface for determining valid physical network interfaces across
all platforms.
- Remove duplicate validInterfacesMap functions from platform-specific files
- Add context parameter to virtualInterfaces for proper logging
- Update all callers to use ctrld.ValidInterfaces instead of local functions
- Improve error handling in virtual interface detection on Linux
Make RFC1918 listener spawning opt-in via --rfc1918 flag instead of automatic behavior.
This allows users to explicitly control when ctrld listens on private network addresses
to receive DNS queries from LAN clients, improving security and configurability.
Refactor network interface detection to better distinguish between physical and virtual
interfaces, ensuring only real hardware interfaces are used for RFC1918 address binding.
Replace the legacy Unix socket log communication between `ctrld start` and
`ctrld run` with a modern HTTP-based system for better reliability and
maintainability.
Benefits:
- More reliable communication protocol using standard HTTP
- Better error handling and connection management
- Cleaner separation of concerns with dedicated endpoints
- Easier to test and debug with HTTP-based communication
- More maintainable code with proper abstraction layers
This change maintains backward compatibility while providing a more robust
foundation for inter-process communication between ctrld commands.