For local config, we don't want to alter what user explicitly set, and
only try filling in missing value.
While at it, also remove the dnsmasq port delete on openwrt, we don't
need that hack anymore.
Config fetching/generating in cd mode is currently weird, error prone,
and easy for user to break ctrld when using custom config.
This commit reworks the flow:
- Fetching config from Control D API.
- No custom config, use the current default config.
- If custom config presents, but there's no listener, use 0.0.0.0:53.
- Try listening on current ip+port config, if ok, ctrld could be a
direct listener with current setup, moving on.
- If failed, trying 127.0.0.1:53.
- If failed, trying current ip + port 5354
- If still failed, pick a random ip:port pair, retry until listening ok.
With this flow, thing is more predictable/stable, and help removing the
Config interface for router.
On some Merlin routers reported by users, ctrld some how is not stopped
properly. So the router does not have a working DNS at boot time to do
ntp synchronization.
To fix it, just clean up the router before start waiting for ntp ready.
Firewalla ignores 127.0.0.1 in all VLAN config, so making 127.0.0.1 as
dnsmasq upstream would break thing when multiple VLAN presents.
To deal with this, we need to gather all interfaces available, and
making them as upstream of dnsmasq. Then changing ctrld to listen on all
interfaces, too.
It also leads to better improvement for dnsmasq configuration template,
as the upstream server can now be generated dynamically instead of hard
coding to 127.0.0.1:5354.
On Firewalla, lo interface is excluded in all dnsmasq settings of all
interfaces, to prevent conflicts. The one that ctrld adds in
dnsmasq_local directory could not work if there're multiple dnsmasq
configs for multiple interfaces (real example from an user who uses
VLAN in router setup).
Instead, if we detect 127.0.0.1 on Firewalla, fallback to "br0"
interface IP address instead.
When running on routers, ctrld leverages default setup, let dnsmasq runs
on port 53, and forward queries to ctrld listener on port 5354. However,
this setup is not serialized to config file, causing confusion to users.
Fixing this by writing the correct routers setup to config file. While
at it, updating documentation to refelct that, and also adding note that
changing default router setup could break things.
UniFi Gateway (USG) uses its own DNS forwarding rule, which is
configured default in /etc/dnsmasq.conf file. Adding ctrld own config in
/etc/dnsmasq.d won't take effects. Instead, we must make changes
directly to /etc/dnsmasq.conf, configuring ctrld as the only upstream.
The current state of ctrld is very "high stakes" and easy to mess up,
and is unforgiving when "ctrld start" failed. That would cause the
router is in broken state, unrecoverable.
This commit makes these changes to improve the state:
- Moving router setup process after ctrld listeners are ready, so
dnsmasq won't flood requests to ctrld even though the listeners are
not ready to serve requests.
- On router, when ctrld stopped, restore router DNS setup. That leaves
the router in good state on reboot/startup, help removing the custom
DNS server for NTP synchronization on some routers.
- If self-check failed, uninstall ctrld to restore router to good
state, prevent confusion that ctrld process is still running even
though self-check reports it did not started.
On some platforms, like pfsense, ntpd is not problem, so do not spawn
the DNS server for it, which may conflict with default DNS server.
While at it, also make sure that ctrld will be run at last on startup.
On DD-WRT v3.0-r52189, dnsmasq version 2.89 lease format looks like:
1685794060 <mac> <ip> <hostname> 00:00:00:00:00:04 9
It has 6 fields, while the current parser only looks for line with exact
5 fields, which is too restricted. In fact, the parser shold just skip
line with less than 4 fields, because the 4th field is the hostname,
which is the last client info that ctrld needs.
Currently, on routers that require NTP waiting, ctrld makes the cleanup
process, and restart dnsmasq for restoring default DNS config, so ntpd
can query the NTP servers. It did work, but the code will depends on
router platforms.
Instead, we can spawn a plain DNS listener before PreRun on routers,
this listener will serve NTP dns queries and once ntp is configured, the
listener is terminated and ctrld will start serving using its configured
upstreams.
While at it, also fix the userHomeDir function on freshtomato, which
must return the binary directory for routers that requires JFFS.
The assignment is changed wrongly in process of refactoring parallel
dialer for resolving bootstrap IP.
While at it, also satisfy staticheck for jffs not enabled error.
On some Merlin routers, the time is broken when system reboot, and need
to wait for NTP synced to get the correct time. For fetching API in cd
mode successfully, ctrld need to wait until NTP set the time correctly,
otherwise, the certificate validation would complain.
On some Merlin routers, due to ntp bug, after rebooing, dnsmasq config
was restored to default without ctrld changes, causing ctrld stop
working. Workaround this problem by catching restart diskmon event,
which is triggered by ntpd_synced, then restart dnsmasq.