Commit Graph

28 Commits

Author SHA1 Message Date
Cuong Manh Le
b65a5ac283 all: fix bug that causes ctrld stop working if bootstrap failed
The bootstrap process has two issues that can make ctrld stop resolving
after restarting machine host.

ctrld uses bootstrap DNS and os nameservers for resolving upstream. On
unix, /etc/resolv.conf content is used to get available nameservers.
This works well when installing ctrld. However, after being installed,
ctrld may modify the content of /etc/resolv.conf itself, to make other
apps use its listener as DNS resolver. So when ctrld starts after OS
restart, it ends up using [bootstrap DNS + ctrld's listener], for
resolving upstream. At this moment, if ctrld could not contact bootstrap
DNS for any reason, upstream domain will not be resolved.

For above reason, an upstream may not have bootstrap IPs after ctrld
starts. When re-bootstrapping, if there's no bootstrap IPs, ctrld should
call the setup bootstrap process again. Currently, it does not, causing
all queries failed.

This commit fixes above issue by adding mechanism for retrieving OS
nameservers properly, by querying routing table information:

 - Parsing /proc/net subsystem on Linux.
 - For BSD variants, just fetching routing information base from OS.
 - On Windows, just include the gateway information when reading iface.

The fixing for second issue is trivial, just kickoff a bootstrap process
if there's no bootstrap IPs when re-boostrapping.

While at it, also ensure that fetching resolver information from
ControlD API is also used the same approach.

Fixes #34
2023-03-31 10:23:05 +07:00
Cuong Manh Le
8ffb42962a Use rcode string in error message
So it's clearer what went wrong.
2023-03-17 22:21:39 +07:00
Cuong Manh Le
e4eb3b2ded Do not query ipv6 eagerly when setup bootstrap IP
We only need on demand information when re-bootstrapping. On Bootsrap,
this is already checked by ctrldnet.Up, so on demand query will cause
un-necessary slow down if external ipv6 is slow to response.
2023-03-16 09:52:57 +07:00
Cuong Manh Le
77b62f8734 cmd/ctrld: add default timeout for os resolver
So it can fail fast if internet broken suddenly. While at it, also
filtering out ipv6 nameservers if ipv6 not available.
2023-03-16 09:52:39 +07:00
Cuong Manh Le
14bc29751f Use both os and bootstrap DNS to resolve bootstrap IP 2023-03-10 23:20:22 +07:00
Cuong Manh Le
d1589bd9d6 Use separate context when querying upstream ips
While at it, also include query type in log, and only honor upstream
timeout when it greater than zero.
2023-03-10 09:25:35 +07:00
Cuong Manh Le
85c95a6a3a all: set timeout for re-bootstrapping 2023-03-10 09:25:29 +07:00
Cuong Manh Le
fa50cd4df4 all: another rework on discovering bootstrap IPs
Instead of re-query DNS record for upstream when re-bootstrapping, just
query all records on startup, then selecting the next bootstrap ip
depends on the current network stack.
2023-03-10 09:25:17 +07:00
Cuong Manh Le
018f6651c1 Fix wrong time precision in bootstrapping timeout
The timeout is in millisecond, not second.
2023-03-08 10:19:49 +07:00
Cuong Manh Le
1a40767cb7 Use upstream timeout when querying bootstrap IP 2023-03-08 10:16:56 +07:00
Cuong Manh Le
12512a60da Always use first record from DNS response 2023-03-07 10:45:29 +07:00
Cuong Manh Le
b0114dfaeb cmd/ctrld: make staticcheck happy 2023-03-07 10:28:49 +07:00
Cuong Manh Le
fb20d443c1 all: retry the request more agressively
For better recovery and dealing with network stack changes, this commit
change the request flow to:

failure of any kind -> recreate transport/re-bootstrap -> retry once

That would make ctrld recover from all scenarios in theory.
2023-03-07 10:25:48 +07:00
Cuong Manh Le
8b08cc8a6e all: rework bootstrap IP discovering
At startup, ctrld gathers bootstrap IP information and use this
bootstrap IP for connecting to upstream. However, in case the network
stack changed, for example, dues to VPN connection, ctrld will still use
this old (maybe invalid) bootstrap IP for the current network stack.

This commit rework the discovering process, and re-initializing the
bootstrap IP if connecting to upstream failed.
2023-03-07 10:25:48 +07:00
Cuong Manh Le
8852f60ccb Add idle conn timeout for HTTP transport
Allowing the connection to be re-new once it becomes un-usable.
2023-03-07 10:25:48 +07:00
Cuong Manh Le
cad71997aa cmd/ctrld: allocate new ip instead of port
So the alternative listener address can still be used as system
resolver.
2023-02-27 20:50:01 +07:00
Cuong Manh Le
3218b5fac1 Add quic-free binaries in build pipeline
Updates #51
2023-02-27 19:54:18 +07:00
Cuong Manh Le
df514d15a5 Update quic-go to v0.32.0
Updates #51
2023-02-27 19:51:39 +07:00
Cuong Manh Le
279e938b2a cmd/ctrld: make "--cd" always owerwrites the config
While at it, also make the toml encoded config format nicer.
2023-01-20 21:37:34 +07:00
Cuong Manh Le
ec72af1916 cmd/ctrld: add commands to control ctrld as a system service
Supported actions:

 - start: install and start ctrld as a system service
 - stop: stop the ctrld service
 - restart: restart ctrld service
 - status: show status of ctrld service
 - uninstall: remove ctrld from system service
2023-01-20 21:33:31 +07:00
Cuong Manh Le
b93970ccfd all: add CLI flags for no config start
This commit adds the ability to start `ctrld` without config file. All
necessary information can be provided via command line flags, either in
base64 encoded config or launch arguments.
2023-01-20 21:33:05 +07:00
Cuong Manh Le
30fefe7ab9 all: add local caching
This commit adds config params to enable local DNS response caching and
control its behavior, allow tweaking the cache size, ttl override and
serving stale response.
2023-01-20 21:33:01 +07:00
Cuong Manh Le
a6b3c4a757 Don't set default log level in config file 2023-01-20 21:32:47 +07:00
Cuong Manh Le
b03aa39b83 all: support ipv6 for doh3 upstream bootstrap ip
We did it for doh, but the doh3 transport also needs to be changed.
2023-01-20 21:32:41 +07:00
Cuong Manh Le
a7ae6c9853 all: support ipv6 for upstream bootstrap ip 2023-01-20 21:32:26 +07:00
Cuong Manh Le
ebcc545547 all: improving DoH query performance
Previously, for each DoH query, we use the net/http default transport
with DialContext function re-assigned. This has some problems:

 - The first query to server will be slow.
 - Using the default transport for all upstreams can have race condition
   in case of multiple queries to multiple DoH upstreams

This commit fixes those issues, by initializing a separate transport for
each DoH upstream, the warming up the transport by doing a test query.
Later queries can take the advantage and re-use the connection.
2023-01-20 21:32:14 +07:00
Cuong Manh Le
ccada70e31 all: implement policy failover rcodes
While at it, ensure that config is validated, and fixing a bug related
to reuse ctx between multiple upstreams resolving.
2022-12-14 23:34:24 +07:00
Cuong Manh Le
91d60d2a64 Import code, preparing for release 2022-12-13 01:27:48 +07:00