Commit Graph

70 Commits

Author SHA1 Message Date
Alex
cf6d16b439 set new dialer on every request
debugging

debugging

debugging

debugging

use default route interface IP for OS resolver queries

remove retries

fix resolv.conf clobbering on MacOS, set custom local addr for os resolver queries

remove the client info discovery logic on network change, this was overkill just for the IP, and was causing service failure after switching networks many times rapidly

handle ipv6 local addresses

guard ciTable from nil pointer

debugging failure count
2025-02-06 15:40:41 +07:00
Alex
2d3779ec27 fix MacOS nameserver detection, fix not installed errors for commands
copy

fix get valid ifaces in nameservers_bsd

nameservers on MacOS can be found in resolv.conf reliably

nameservers on MacOS can be found in resolv.conf reliably

exclude local IPs from MacOS resolve conf check

use scutil for MacOS, simplify reinit logic to prevent duplicate calls

add more dns server fetching options

never skip OS resolver in IsDown check

split dsb and darwin nameserver methods, add delay for setting DNS on interface on network change.

increase delay to 5s but only on MacOS
2025-02-05 13:18:06 +07:00
Cuong Manh Le
595071b608 all: update client info table on network changes
So the client metadata will be updated correctly when the device roaming
between networks.
2025-02-05 13:15:01 +07:00
Cuong Manh Le
57ef717080 cmd/cli: improve error message returned by FlushDNSCache
By recording both the error and output of external commands.

While at it:

 - Removing un-necessary usages of sudo, since ctrld already
   running with root privilege.
 - Removing un-used function triggerCaptiveCheck.
2025-02-05 13:14:52 +07:00
Cuong Manh Le
eb27d1482b cmd/cli: use warn level for network changes logging
So these events will be recorded separately from normal runtime log,
making troubleshooting later more easily.

While at it, only update ctrld.ProxyLogger for runCmd, it's the only one
which needs to log the query when proxying requests.
2025-02-05 13:14:39 +07:00
Alex
168eaf538b increase OSresolver timeout, fix debug log statements
flush dns cache, manually hit captive portal on MacOS

fix real ip in debug log

treat all upstreams as down upon network change

delay upstream checks when leaking queries on network changes
2025-02-04 18:03:41 +07:00
Alex
e573a490c9 ignore non physical ifaces in validInterfaces method on Windows
debugging

skip type 24 in nameserver detection

skip type 24 in nameserver detection

remove interface type check from valid interfaces for now

skip non hardware interfaces in DNS nameserver lookup

ignore win api log output

set retries to 5 and 1s backoff

reset DNS when upgrading to make sure we get the proper OS nameservers on start

init running iface for upgrade

update windows service options for auto restarts on failure

make upgrade use the actual stop and start commands

fix the windows service retry logic

fix the windows service retry logic

task debugging

more task debugging

windows service name fix

windows service name fix

fix start command args

fix restart delay

dont recover from non crash failures

fix upgrade flow
2025-01-30 17:06:43 +07:00
Alex
ce3281e70d much more debugging, improved nameserver detection, no more testing nameservers
fix logging

fix logging

try to enable nameserver logs

try to enable nameserver logs

handle flags in interface state changes

debugging

debugging

debugging

fix state detection, AD status fix

fix debugging line

more dc info

always log state changes

remove unused method

windows AD IP discovery

windows AD IP discovery

windows AD IP discovery
2025-01-29 12:28:49 +07:00
Cuong Manh Le
0fbfd160c9 cmd/cli: log interfaces state after dns set
The data will be useful for troubleshooting later.
2025-01-24 14:54:28 +07:00
Cuong Manh Le
20759017e6 all: use local resolver for ADDC
For normal OS resolver, ctrld does not use local addresses as nameserver
to avoid possible looping. However, on AD environment with local DNS
running, AD queries must be sent to the local DNS server for proper
resolving.
2025-01-24 14:54:20 +07:00
Alex
2687a4a018 remove leaking timeout, fix blocking upstreams checks, leaking is per listener, OS resolvers are tested in parallel, reset is only done is os is down
fix test

use upstreamIS var

init map, fix watcher flag

attempt to detect network changes

attempt to detect network changes

cancel and rerun reinitializeOSResolver

cancel and rerun reinitializeOSResolver

cancel and rerun reinitializeOSResolver

ignore invalid inferaces

ignore invalid inferaces

allow OS resolver upstream to fail

dont wait for dnsWait group on reinit, check for active interfaces to trigger reinit

fix unused var

simpler active iface check, debug logs

dont spam network service name patching on Mac

dont wait for os resolver nameserver testing

remove test for osresovlers for now

async nameserver testing

remove unused test
2025-01-20 15:03:27 +07:00
Alex Paguis
7833132917 Don't automatically restore saved DNS settings when switching networks
smol tweaks to nameserver test queries

fix restoreDNS errors

add some debugging information

fix wront type in log msg

set send logs command timeout to 5 mins

when the runningIface is no longer up, attempt to find a new interface

prefer default route, ignore non physical interfaces

prefer default route, ignore non physical interfaces

add max context timeout on performLeakingQuery with more debug logs
2025-01-20 14:59:31 +07:00
Cuong Manh Le
89600f6091 cmd/cli: new flow for leaking queries to OS resolver
The current flow involves marking OS resolver as down, which is not
right at all, since ctrld depends on it for leaking queries.

This commits implements new flow, which ctrld will restore DNS settings
once leaking marked, allowing queries go to OS resolver until the
internet connection is established.
2025-01-20 14:57:23 +07:00
Cuong Manh Le
a95d50c0af cmd/cli: ensure set/reset DNS is done before checking OS resolver
Otherwise, new DNS settings could be reverted by dns watchers, causing
the checking will be always false.
2025-01-14 14:33:15 +07:00
Cuong Manh Le
5db7d3577b cmd/cli: handle . domain query
By returning FormErr response, the same behavior with ControlD.
2025-01-14 14:33:05 +07:00
Cuong Manh Le
cb49d0d947 cmd/cli: perform leaking queries in non-cd mode 2024-12-19 21:50:00 +07:00
Cuong Manh Le
37d41bd215 Skip public DNS for LAN query
So we don't blindly send requests to public DNS even though they can not
handle these queries.
2024-12-19 21:50:00 +07:00
Cuong Manh Le
09426dcd36 cmd/cli: new flow for LAN hostname query
If there is no explicit rules for LAN hostname queries, using OS
resolver instead of forwarding requests to remote upstreams.
2024-12-19 21:50:00 +07:00
Cuong Manh Le
17941882a9 cmd/cli: split-route SRV record to OS resolver
Since SRV record is mostly useful in AD environment. Even in non-AD one,
the OS resolver could still resolve the query for external services.

Users who want special treatment can still specify domain rules to
forward requests to ControlD upstreams explicitly.
2024-12-19 21:50:00 +07:00
Cuong Manh Le
c654398981 cmd/cli: make widcard rules match case-insensitively
Domain name comparisons are done in case-insensitive manner.

See: https://datatracker.ietf.org/doc/html/rfc1034#section-3.1
2024-11-13 15:03:17 +07:00
Cuong Manh Le
5ac9d17bdf cmd/cli: simplify queryFromSelf
By using netmon.LocalAddresses instead of looping through interfaces
list manually.
2024-10-08 22:08:48 +07:00
Cuong Manh Le
e88372fc8c cmd/cli: log request id when leaking 2024-09-30 18:21:30 +07:00
Cuong Manh Le
f507bc8f9e cmd/cli: cache query from self result
So we don't waste time to compute a result which is not likely to be
changed.
2024-09-30 18:20:39 +07:00
Cuong Manh Le
3e388c2857 all: leaking queries to OS resolver instead of SRVFAIL
So it would work in more general case than just captive portal network,
which ctrld have supported recently.

Uses who may want no leaking behavior can use a config to turn off this
feature.
2024-09-30 18:20:27 +07:00
Cuong Manh Le
5a88a7c22c cmd/cli: decouple reset DNS task from ctrld status
So it can be run regardless of ctrld current status. This prevents a
racy behavior when reset DNS task restores DNS settings of the system,
but current running ctrld process may revert it immediately.
2024-09-30 18:17:31 +07:00
Cuong Manh Le
e6f256d640 all: add pull API config based on special DNS query
For query domain that matches "uid.verify.controld.com" in cd mode, and
the uid has the same value with "--cd" flag, ctrld will fetch uid config
from ControlD API, using this config if valid.

This is useful for force syncing API without waiting until the API
reload ticker fire.
2024-09-30 18:17:00 +07:00
Cuong Manh Le
082d14a9ba cmd/cli: implement auto captive portal detection
ControlD have global list of known captive portals that user can augment
with proper setup. However, this requires manual actions, and involving
restart ctrld for taking effects.

By allowing ctrld "leaks" DNS queries to OS resolver, this process
becomes automatically, the captive portal could intercept these queries,
and as long as it was passed, ctrld will resume normal operation.
2024-09-30 18:14:46 +07:00
Cuong Manh Le
617674ce43 all: update tailscale.com to v1.74.0 2024-09-30 18:14:30 +07:00
Cuong Manh Le
5af3ec4f7b cmd/cli: ensure DNS goroutines terminated before self-uninstall
Otherwise, these goroutines could mess up with what resetDNS function
do, reverting DHCP DNS settings to ctrld listeners.
2024-08-16 13:50:11 +07:00
Cuong Manh Le
c233ad9b1b cmd/cli: write new config file on reload 2024-08-07 15:51:11 +07:00
Cuong Manh Le
905f2d08c5 cmd/cli: fix reset DNS when doing self-uninstall
While at it, also using "ctrld uninstall" on unix platform, ensuring
everything is cleanup properly.
2024-08-07 15:51:11 +07:00
Cuong Manh Le
80cf79b9cb all: implement self-uninstall ctrld based on REFUSED queries 2024-08-07 15:51:11 +07:00
Cuong Manh Le
a1fda2c0de cmd/cli: make self-check process faster
The "ctrld start" command is running slow, and using much CPU than
necessary. The problem was made because of several things:

1. ctrld process is waiting for 5 seconds before marking listeners up.
   That ends up adding those seconds to the self-check process, even
   though the listeners may have been already available.

2. While creating socket control client, "s.Status()" is called to
   obtain ctrld service status, so we could terminate early if the
   service failed to run. However, that would make a lot of syscall in a
   hot loop, eating the CPU constantly while the command is running. On
   Windows, that call would become slower after each calls. The same
   effect could be seen using Windows services manager GUI, by pressing
   start/stop/restart button fast enough, we could see a timeout raised.

3. The socket control server is started lately, after all the listeners
   up. That would make the loop for creating socket control client run
   longer and use much resources than necessary.

Fixes for these problems are quite obvious:

1. Removing hard code 5 seconds waiting. NotifyStartedFunc is enough to
   ensure that listeners are ready for accepting requests.

2. Check "s.Status()" only once before the loop. There has been already
   30 seconds timeout, so if anything went wrong, the self-check process
   could be terminated, and won't hang forever.

3. Starting socket control server earlier, so newSocketControlClient can
   connect to server with fewest attempts, then querying "/started"
   endpoint to ensure the listeners have been ready.

With these fixes, "ctrld start" now run much faster on modern machines,
taking ~1-2 seconds (previously ~5-8 seconds) to finish. On dual cores
VM, it takes ~5-8 seconds (previously a few dozen seconds or timeout).

---

While at it, there are two refactoring for making the code easier to
read/maintain:

- PersistentPreRun is now used in root command to init console logging,
  so we don't have to initialize them in sub-commands.

- NotifyStartedFunc now use channel for synchronization, instead of a
  mutex, making the ugly asymetric calls to lock goes away, making the
  code more idiom, and theoretically have better performance.
2024-05-09 18:39:30 +07:00
Cuong Manh Le
f499770d45 cmd/cli: use channel instead of mutex in runDNSServer
So the code is easier to read/follow, and possible reduce the overhead
of using mutex in low resources system.
2024-05-09 18:39:30 +07:00
Cuong Manh Le
da01a146d2 internal/clientinfo: check hostname mapping for both ipv4/ipv6 2024-04-19 14:32:21 +07:00
Cuong Manh Le
dd9f2465be internal/clientinfo: map ::1 to the right host MAC address
So queries originating from host using ::1 as source will be recognized
properly, and treated the same as other queries from host itself.
2024-04-19 14:32:09 +07:00
Cuong Manh Le
1a8c1ec73d Provide better error message when self-check failed
By connecting to all upstreams when self-check failed, so it's clearer
to users what causes self-check failed.
2024-04-01 14:14:57 +07:00
Cuong Manh Le
a5025e35ea cmd/cli: add internal domain test query during self-check
So it's clear that client could be reached ctrld's listener or not.
2024-04-01 14:14:32 +07:00
Cuong Manh Le
b50cccac85 all: add flush cache domains config 2024-03-22 16:09:06 +07:00
Cuong Manh Le
34ebe9b054 cmd/cli: allow MAC wildcard matching 2024-03-22 16:08:53 +07:00
Cuong Manh Le
3ca754b438 cmd/cli: use loopback mapping for query from self
So queries from host will always use the same hostname consistently.
2024-03-22 15:58:31 +07:00
Cuong Manh Le
71f26a6d81 Add prometheus exporter
Updates #6
2024-01-22 23:12:17 +07:00
Cuong Manh Le
b82ad3720c cmd/cli: guard against nil client info
Though it's only possible raised in testing, still better to be safe.
2023-12-19 01:48:07 +07:00
Cuong Manh Le
8d2cb6091e cmd/cli: add QUERY/REPLY prefix to proxying log
So the log in INFO log is aligned, making it easier for human to
monitoring the log, either via console or running "tail" command.
2023-12-19 01:31:30 +07:00
Cuong Manh Le
8db28cb76e cmd/cli: improving logging of proxying action
INFO level becomes a sensible setting for normal operation that does not
overwhelm. Adding some small details to make DEBUG level more useful.
2023-12-18 21:31:08 +07:00
Cuong Manh Le
41846b6d4c all: add config to enable/disable answering WAN clients 2023-12-13 14:53:29 +07:00
Cuong Manh Le
684019c2e3 all: force re-bootstrapping with timeout error 2023-12-11 22:55:16 +07:00
Cuong Manh Le
0bb51aa71d cmd/cli: add loop guard for LAN/PTR queries 2023-12-06 15:33:05 +07:00
Cuong Manh Le
af2c1c87e0 cmd/cli: improve logging for new LAN/PTR flow 2023-12-06 15:33:05 +07:00
Cuong Manh Le
7591a0ccc6 all: add client id preference config param
So client can chose how client id is generated.
2023-12-06 15:33:05 +07:00