After installing as a system service, "ctrld start" does an end-to-end
test for ensuring DNS can be resolved correctly. However, in case the
system is mis-configured (by firewall, other softwares ...) and the test
query could not be sent to ctrld listener, the current error message is
not helpful, causing the confusion from users perspective.
To improve this, selfCheckStatus function now returns the actual status
and error during its process. The caller can now rely on the service
status and the error to produce more useful/friendly message to users.
In some old Windows systems, s.Uninstall does not remove the service
completely at the time s.Install was running, prevent ctrld from being
installed again.
Workaround this by attempting to uninstall ctrld several times, re-check
for service status after each attempt to ensure it was uninstalled.
An interface may have multiple MAC addresses, that leads to the problem
when looking up hostname for its multiple <ip, mac> pairs, because the
"ip" map, which storing "mac => ip" mapping can only store 1 entry. It
ends up returns an empty hostname for a known MAC address.
Fixing this by filling empty hostname based on clients which is already
listed, ensuring all clients with the same MAC address will have the
same hostname information.
Since go1.21, Go standard library have added support for QUIC protocol.
The binary size gains between quic and quic-free version is now minimal.
Removing the quic free build, simplify the code and build process.
Since ctrld now supports MAC rules, the client's mac and ip must always
be sent to ctrld. Otherwise, the mac policy won't work when ctrld is an
upstream of dnsmasq.
When avahi-daemon is avaibale, reading data from its cache help ctrld
populate the mdns data with already known services within local network,
allowing discover client info more quickly.
On some routers, dnsmasq config may change cache-size dynamically after
ctrld starts, causing dnsmasq crashes.
Fixing this by using max-cache-ttl, which have the same effect with
setting cache-size=0 but won't conflict with existing routers config.
Windows may raise WSAEHOSTUNREACH instead WSAENETUNREACH in case of
network not available when resuming from sleep or switching network, so
checkUpstream is never kicked in for this type of error.
To prevent duplicated running of checkUpstream function at the same
time, upstream monitor uses a boolean to report whether the upstream is
checking. If this boolean is true, then other calls after the first one
will be returned immediately.
However, checkUpstream does not set this boolean to false when it
finishes, thus all future calls to checkUpstream won't be run, causing
the upstream is marked as down forever.
Fixing this by ensuring the boolean is reset once checkUpstream done.
While at it, also guarding all upstream monitor operations with a mutex,
ensuring there's no race condition between marking upstream state.
We see number of failed test in Github Action, mostly on MacOS or
Windows due to the fact that goroutines are scheduled to be run
consequently.
This commit improves the test, ensuring at least 2 goroutines were
started before increasing the counting.