Some users mentioned that when there is an Internet outage, ctrld fails
to recover, crashing or locks up the router. When requests start
failing, this results in the clients emitting more queries, creating a
resource spiral of death that can brick the device entirely.
To guard against this case, this commit implement an upstream monitor
approach:
- Marking upstream as down after 100 consecutive failed queries.
- Start a goroutine to check when the upstream is back again.
- When upstream is down, answer all queries with SERVFAIL.
- The checking process uses backoff retry to reduce high requests rate.
- As long as the query succeeded, marking the upstream as alive then
start operate normally.
So ctrld can record the raw/original client IP instead of looking up
from MAC to IP, which may not the right choice in some network setup
like using wireguard/vpn on Merlin router.
Using the same approach as in cd mode, but do it only once when running
ctrld the first time, then the config will be re-used then.
While at it, also adding Dockerfile.debug for better troubleshooting
with alpine base image.