Some users reported that when there is an Internet outage, ctrld fails
to recover, crashing or locking up the router. When requests start
failing, clients emit more queries, creating a resource death spiral
that can brick the device entirely.
To guard against this case, this commit implements an upstream monitor
approach:
- Mark the upstream as down after 100 consecutive failed queries.
- Start a goroutine to check when the upstream is back up.
- While the upstream is down, answer all queries with SERVFAIL.
- Use backoff retry in the checking process to keep the request rate low.
- As soon as a check query succeeds, mark the upstream as alive and
resume normal operation.
The current approach to getting the default route IP is to find the LAN
interface with the same MAC address. However, there can be multiple
such interfaces, which confuses ctrld.
This commit fixes the issue by listing all possible private IPs,
sorting them, and using the smallest one for router self queries.
For reporting router queries, ctrld uses the private IP of the default
route interface. However, when the default route is connected directly
to the ISP, the interface will have a public IP, and another interface
with the same MAC address will be created for the LAN IP. So when no
private IP is found for the default route interface, ctrld must look at
the other interface to find the correct LAN IP.