Instead of always doubling the request, first wrap the request with a
failover timeout of 500ms, which is the average time for a normal
request. If this request fails, trigger re-bootstrapping and retry the
request.
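The flow above can be sketched roughly as follows. This is a minimal
illustration, not the actual ctrld code: resolveFunc, resolveWithFailover,
demoFailover and the rebootstrap callback are hypothetical names, and only
the 500ms value comes from the text.

```go
package main

import (
	"context"
	"time"
)

// failoverTimeout is the 500ms budget mentioned in the text.
const failoverTimeout = 500 * time.Millisecond

// resolveFunc is a hypothetical stand-in for "send one request upstream".
type resolveFunc func(ctx context.Context) (string, error)

// resolveWithFailover makes one attempt bounded by failoverTimeout. If that
// attempt fails (a timeout counts as a failure), it calls rebootstrap and
// retries once, bounded only by the caller's context.
func resolveWithFailover(ctx context.Context, resolve resolveFunc, rebootstrap func()) (string, error) {
	attemptCtx, cancel := context.WithTimeout(ctx, failoverTimeout)
	answer, err := resolve(attemptCtx)
	cancel()
	if err == nil {
		return answer, nil
	}
	rebootstrap()
	return resolve(ctx)
}

// demoFailover simulates a stale transport: until re-bootstrapping happens,
// a request only returns when its context expires.
func demoFailover() (answer string, attempts, rebootstraps int, err error) {
	resolve := func(ctx context.Context) (string, error) {
		attempts++
		if rebootstraps == 0 {
			<-ctx.Done() // stale connection: nothing ever comes back
			return "", ctx.Err()
		}
		return "192.0.2.1", nil
	}
	answer, err = resolveWithFailover(context.Background(), resolve, func() { rebootstraps++ })
	return answer, attempts, rebootstraps, err
}
```

In this sketch a healthy request costs a single round trip, and a broken
one costs at most the failover timeout before the retry starts.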
When the network changes, for example when connecting to or
disconnecting from a VPN, the old connection becomes broken but can
still be re-used for new requests. That causes unnecessary delay for
ctrld clients:
- Time 0 - do request with broken transport, 5s timeout.
- Time 0.5 - network stack becomes usable.
- Time 5 - timeout reached.
- Time 5.1 - do request with new transport -> success.
Instead, we can do two requests in parallel, with the failover one using
a fresh transport. So if the main one is broken, we can still get the
result from the failover one.
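A rough sketch of the parallel approach, assuming each transport can be
modelled as a function that takes a context. The names query,
raceTransports and demoRace are hypothetical and only illustrate the idea:

```go
package main

import (
	"context"
	"time"
)

// query is a hypothetical signature for "send the request over one transport".
type query func(ctx context.Context) (string, error)

// raceTransports runs the primary and failover queries in parallel and
// returns the first successful answer. If both fail, it returns the error
// of whichever query finished last.
func raceTransports(ctx context.Context, primary, failover query) (string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // stop the slower query once a result has been chosen

	type result struct {
		answer string
		err    error
	}
	// Buffered so the losing goroutine never blocks on send.
	results := make(chan result, 2)
	for _, q := range []query{primary, failover} {
		go func(q query) {
			answer, err := q(ctx)
			results <- result{answer, err}
		}(q)
	}

	var lastErr error
	for i := 0; i < 2; i++ {
		r := <-results
		if r.err == nil {
			return r.answer, nil
		}
		lastErr = r.err
	}
	return "", lastErr
}

// demoRace mimics the timeline above: the old transport is broken and only
// returns when its context ends, while the fresh transport answers quickly.
func demoRace() (string, error) {
	broken := func(ctx context.Context) (string, error) {
		<-ctx.Done()
		return "", ctx.Err()
	}
	fresh := func(ctx context.Context) (string, error) {
		time.Sleep(10 * time.Millisecond)
		return "192.0.2.1", nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return raceTransports(ctx, broken, fresh)
}
```

With this shape the client gets the answer as soon as the fresh transport
responds, instead of waiting out the 5s timeout on the broken one.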
In practice, when testing on a fresh VM, we saw a DNS server that
returns an answer containing records that are not for the query domain.
To work around this, filter out the answers not for the query domain.
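A simplified sketch of such a filter. The record type, canonical and
filterAnswers are hypothetical; real code would work on the DNS library's
own record types, and this version deliberately ignores CNAME chains:

```go
package main

import "strings"

// record is a hypothetical, simplified resource record.
type record struct {
	Name  string // owner name of the record
	Value string // record data, for example an IP address
}

// canonical lowercases a domain name and ensures it ends with a dot, so
// that "Example.com" and "example.com." compare equal.
func canonical(name string) string {
	name = strings.ToLower(name)
	if !strings.HasSuffix(name, ".") {
		name += "."
	}
	return name
}

// filterAnswers keeps only the records whose owner name matches the
// queried domain.
func filterAnswers(domain string, answers []record) []record {
	want := canonical(domain)
	var kept []record
	for _, rr := range answers {
		if canonical(rr.Name) == want {
			kept = append(kept, rr)
		}
	}
	return kept
}
```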
This reverts commit 00fe7f59d13774f2ea6c325bdbb8165be58a1edd.
The purpose was to disable cd mode for an already installed service,
which is a harder problem than we thought. So leave it out of the v1.2
cycle.
When writing the default config file, the content must first be
marshalled to the config object before being written to disk.
While at it, also use the full path for the default config file to make
it clear to the user where the config is written.
This commit adds the ability for ctrld to gather client information,
including MAC/IP/hostname, and send it to the Control-D server,
controlled by a per-upstream config.
- Add send_client_info upstream config.
- Read/Watch dnsmasq leases files on supported platforms.
- Add corresponding client info to DoH query header.
All of these only apply for Control-D upstream, though.
So we don't have to depend on network stack probing to decide whether
IPv4 or IPv6 will be used.
While at it, also prevent a race report when doing the same parallel
resolving for the os resolver, even though this race is harmless.
Otherwise, we experience a slow ctrld start after rebooting, because
the network check continuously reports a failed status even though the
network state is up. By restoring the DNS before stopping, we leave the
network state as default; once ctrld starts, the DNS is configured again.
The bootstrap process has two issues that can make ctrld stop resolving
after the host machine restarts.
ctrld uses bootstrap DNS and os nameservers for resolving the upstream.
On unix, the content of /etc/resolv.conf is used to get the available
nameservers. This works well when installing ctrld. However, after being
installed, ctrld may modify /etc/resolv.conf itself, to make other apps
use its listener as the DNS resolver. So when ctrld starts after an OS
restart, it ends up using [bootstrap DNS + ctrld's listener] for
resolving the upstream. At that moment, if ctrld cannot contact the
bootstrap DNS for any reason, the upstream domain will not be resolved.
For the above reason, an upstream may not have bootstrap IPs after ctrld
starts. When re-bootstrapping, if there are no bootstrap IPs, ctrld
should call the setup bootstrap process again. Currently, it does not,
causing all queries to fail.
This commit fixes the first issue by adding a mechanism for retrieving
OS nameservers properly, by querying routing table information:
- Parsing /proc/net subsystem on Linux.
- For BSD variants, just fetching routing information base from OS.
- On Windows, just include the gateway information when reading iface.
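For the Linux case, one way to read gateway addresses is to parse a
routing table in the /proc/net/route text format. The sketch below assumes
that specific file and a little-endian host (addresses are printed as hex
in host byte order); defaultGateways and demoGateways are hypothetical
names, and the parser takes an io.Reader so it can run on sample data:

```go
package main

import (
	"bufio"
	"encoding/hex"
	"io"
	"net"
	"strings"
)

// defaultGateways scans content in the /proc/net/route format and returns
// the gateway address of every default route that has one.
func defaultGateways(r io.Reader) []net.IP {
	var gws []net.IP
	sc := bufio.NewScanner(r)
	sc.Scan() // skip the header line
	for sc.Scan() {
		// Columns: Iface Destination Gateway Flags ...
		fields := strings.Fields(sc.Text())
		if len(fields) < 3 || fields[1] != "00000000" {
			continue // malformed line, or not a default route
		}
		if fields[2] == "00000000" {
			continue // default route without a gateway
		}
		b, err := hex.DecodeString(fields[2])
		if err != nil || len(b) != 4 {
			continue
		}
		// Assumption: little-endian host, so the bytes appear reversed.
		gws = append(gws, net.IPv4(b[3], b[2], b[1], b[0]))
	}
	return gws
}

// demoGateways runs the parser on a made-up routing table.
func demoGateways() []net.IP {
	const sample = "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n" +
		"eth0\t00000000\t0102A8C0\t0003\t0\t0\t100\t00000000\t0\t0\t0\n" +
		"eth0\t0002A8C0\t00000000\t0001\t0\t0\t100\t00FFFFFF\t0\t0\t0\n"
	return defaultGateways(strings.NewReader(sample))
}
```

On a real Linux host the same function could be fed the opened
/proc/net/route file instead of the sample string.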
The fix for the second issue is trivial: just kick off a bootstrap
process if there are no bootstrap IPs when re-bootstrapping.
While at it, also ensure that fetching resolver information from the
ControlD API uses the same approach.
Fixes #34
For the os resolver, ctrld queries all servers concurrently and takes
the first successful result. However, if all servers fail, the result
channel is not closed, causing ctrld to hang.
Fix this by closing the result channel once all responses from the
servers have been received.
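The pattern can be sketched like this. It is an illustration under
assumptions, not the ctrld implementation: lookup, firstSuccess and
errAllFailed are hypothetical names.

```go
package main

import (
	"errors"
	"sync"
)

// errAllFailed is returned when no server produced an answer.
var errAllFailed = errors.New("all servers failed")

// lookup is a hypothetical signature for querying a single server.
type lookup func(server string) (string, error)

// firstSuccess queries every server concurrently and returns the first
// successful answer. The result channel is closed once all servers have
// responded, so the receive below cannot block forever when they all fail.
func firstSuccess(servers []string, query lookup) (string, error) {
	// Buffered so late successes never block their goroutine.
	results := make(chan string, len(servers))
	var wg sync.WaitGroup
	for _, s := range servers {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			if answer, err := query(s); err == nil {
				results <- answer
			}
		}(s)
	}
	go func() {
		wg.Wait()
		close(results) // every server has responded, success or not
	}()

	answer, ok := <-results
	if !ok {
		return "", errAllFailed
	}
	return answer, nil
}
```

Without the close, the receive would wait forever when every query
returns an error, which matches the hang described above.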
While at it, also shorten the backoff time when waiting for the network
to come up; ctrld should serve as fast as possible after the network is
available.
Updates #34
- Include version/OS information when logging
- Make time field human readable in log file
- Force root privilege when running status command on darwin
Updates #34
On startup, ctrld waits for the network to be up before calling s.Run
to start its logic. However, if the network is down on startup, ctrld
hangs waiting for it. That makes the OS service manager unhappy: since
ctrld does not respond to it, the service manager marks ctrld as a
failed service and never starts it again.
To fix this, we should call s.Run as soon as possible, and use a channel
to wait for a signal that we can actually do our logic once the network
is up.
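The idea can be sketched with plain channels. The program type and its
methods below are hypothetical and do not use the real service library;
they only show a Start that returns immediately while the actual work
waits for a signal:

```go
package main

import "sync"

// program is a hypothetical service whose Start must return quickly so
// the OS service manager sees it as responsive.
type program struct {
	networkUp chan struct{} // closed once the network is usable
	started   chan struct{} // closed once the real work has begun
	once      sync.Once
}

func newProgram() *program {
	return &program{
		networkUp: make(chan struct{}),
		started:   make(chan struct{}),
	}
}

// Start returns immediately. The real logic runs in a goroutine that
// blocks until NotifyNetworkUp is called.
func (p *program) Start() error {
	go func() {
		<-p.networkUp
		p.serve()
	}()
	return nil
}

// NotifyNetworkUp signals that the network is available. It is safe to
// call more than once.
func (p *program) NotifyNetworkUp() {
	p.once.Do(func() { close(p.networkUp) })
}

// serve stands in for the actual DNS serving logic.
func (p *program) serve() {
	close(p.started)
}
```

A separate goroutine that probes the network would call NotifyNetworkUp
when the probe succeeds.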
Updates #34
Avoid reading/writing the global config, which causes a data race. While
at it, also guard read/write access to the cfg.Service.AllocateIP field,
since it is read and written by multiple goroutines.
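One common way to guard such a field is a mutex with accessor methods.
The sketch below assumes the field is a bool and uses a hypothetical
serviceConfig type; the actual ctrld config layout may differ:

```go
package main

import "sync"

// serviceConfig is a hypothetical config section whose allocateIP flag is
// shared between goroutines.
type serviceConfig struct {
	mu         sync.RWMutex
	allocateIP bool // assumption: the guarded field is a bool
}

// AllocateIP returns the current value under a read lock.
func (c *serviceConfig) AllocateIP() bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.allocateIP
}

// SetAllocateIP updates the value under the write lock.
func (c *serviceConfig) SetAllocateIP(v bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.allocateIP = v
}
```

With all access going through these methods, concurrent readers and
writers no longer touch the field directly.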