467 Commits

Author SHA1 Message Date
Brad Fitzpatrick 78b0bd2957 net/dns/resolver: add clientmetrics for DNS
Fixes tailscale/corp#1811

Change-Id: I864d11e0332a177e8c5ff403591bff6fec548f5a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-26 17:57:48 -08:00
Brad Fitzpatrick 25525b7754 net/dns/resolver, ipn/ipnlocal: wire up peerapi DoH server to DNS forwarder
Updates #1713

Change-Id: Ia4ed9d8c9cef0e70aa6d30f2852eaab80f5f695a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-23 18:59:36 -08:00
Josh Bleecher Snyder d10cefdb9b net/dns: require space after nameserver/search parsing resolv.conf
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-23 15:11:46 -08:00
Josh Bleecher Snyder 9f00510833 net/dns: handle comments in resolv.conf
Currently, comments in resolv.conf cause our parser to fail,
with error messages like:

ParseIP("192.168.0.100 # comment"): unexpected character (at " # comment")

Fix that.

Noticed while looking through logs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-23 15:11:46 -08:00
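
The two resolv.conf commits above (tolerate comments, require whitespace after the keyword) can be illustrated with a minimal sketch. This is not the parser from net/dns: `parseNameserver` is a hypothetical helper, and treating both `#` and `;` as comment markers anywhere on the line is an assumption.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseNameserver is a hypothetical helper, not the actual net/dns parser.
// It extracts the address from a resolv.conf "nameserver" line, ignoring
// any trailing comment and requiring whitespace after the keyword.
func parseNameserver(line string) (netip.Addr, bool) {
	// Assumption: '#' and ';' both start a comment that runs to end of line.
	if i := strings.IndexAny(line, "#;"); i >= 0 {
		line = line[:i]
	}
	// strings.Fields splits on whitespace, so "nameserver8.8.8.8" stays a
	// single field and is rejected by the checks below.
	fields := strings.Fields(line)
	if len(fields) < 2 || fields[0] != "nameserver" {
		return netip.Addr{}, false
	}
	addr, err := netip.ParseAddr(fields[1])
	if err != nil {
		return netip.Addr{}, false
	}
	return addr, true
}

func main() {
	addr, ok := parseNameserver("nameserver 192.168.0.100 # comment")
	fmt.Println(addr, ok)
}
```

Stripping the comment before the address is parsed is what avoids the `ParseIP("192.168.0.100 # comment")` failure quoted in the commit message.
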
Josh Bleecher Snyder 73beaaf360 net/tstun: rate limit "self disco out packet" logging
When this happens, it is incredibly noisy in the logs.
It accounts for about a third of all remaining
"unexpected" log lines from a recent investigation.

It's not clear that we know how to fix this,
we have a functioning workaround,
and we now have a (cheap and efficient) metric for this
that we can use for measurements.

So reduce the logging to approximately once per minute.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-23 12:52:52 -08:00
Brad Fitzpatrick 283ae702c1 ipn/ipnlocal: start adding DoH DNS server to peerapi when exit node
Updates #1713

Change-Id: I8d9c488f779e7acc811a9bc18166a2726198a429
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-23 08:21:41 -08:00
Josh Bleecher Snyder ad5e04249b wgengine/monitor: ignore adding/removing uninteresting IPs
One of the most common "unexpected" log lines is:

"network state changed, but stringification didn't"

One way that this can occur is if an interesting interface
(non-Tailscale, has interesting IP address)
gains or loses an uninteresting IP address (link local or loopback).

The fact that the interface is interesting is enough for EqualFiltered
to inspect it. The fact that an IP address changed is enough for
EqualFiltered to declare that the interfaces are not equal.

But the State.String method reasonably declines to print any
uninteresting IP addresses. As a result, the network state appears
to have changed, but the stringification did not.

The String method is correct; nothing interesting happened.

This change fixes this by adding an IP address filter to EqualFiltered
in addition to the interface filter. This lets the network monitor
ignore the addition/removal of uninteresting IP addresses.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-22 16:33:15 -08:00
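
The idea of filtering IP addresses before comparing interface state can be sketched as follows. The function names are illustrative and the signature is much simpler than the real EqualFiltered; "uninteresting" is taken to mean loopback or link-local unicast, following the commit message.

```go
package main

import (
	"fmt"
	"net/netip"
)

// interestingIP reports whether a is worth reacting to. Per the commit
// message, loopback and link-local addresses are not.
func interestingIP(a netip.Addr) bool {
	return !a.IsLoopback() && !a.IsLinkLocalUnicast()
}

// mustAddrs parses literal addresses; it exists only to keep the example short.
func mustAddrs(ss ...string) []netip.Addr {
	out := make([]netip.Addr, 0, len(ss))
	for _, s := range ss {
		out = append(out, netip.MustParseAddr(s))
	}
	return out
}

// equalFiltered compares two address lists after dropping every address
// that keep rejects. Simplification: the comparison is order-sensitive.
func equalFiltered(a, b []netip.Addr, keep func(netip.Addr) bool) bool {
	filter := func(in []netip.Addr) []netip.Addr {
		var out []netip.Addr
		for _, ip := range in {
			if keep(ip) {
				out = append(out, ip)
			}
		}
		return out
	}
	fa, fb := filter(a), filter(b)
	if len(fa) != len(fb) {
		return false
	}
	for i := range fa {
		if fa[i] != fb[i] {
			return false
		}
	}
	return true
}

func main() {
	before := mustAddrs("192.168.1.10")
	after := mustAddrs("192.168.1.10", "fe80::1") // gained a link-local address
	fmt.Println(equalFiltered(before, after, interestingIP))
}
```

With the address filter in place, gaining or losing a link-local address no longer counts as a state change, which matches what State.String prints.
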
Josh Bleecher Snyder ca1b3fe235 net/tshttpproxy: use correct size for Windows BOOL argument
The Windows BOOL type is an int32. We were using a bool,
which is one byte wide. This could be responsible for the
ERROR_INVALID_PARAMETER errors we were seeing for calls to
WinHttpGetProxyForUrl.

We manually checked all other existing Windows syscalls
for similar mistakes and did not find any.

Updates #879

Co-authored-by: Aaron Klotz <aaron@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-22 12:24:24 -08:00
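
The size mismatch described above can be shown portably, without calling any Windows API. `winBOOL` and `toWinBOOL` are hypothetical names introduced for illustration; the only facts relied on are that the Windows BOOL type is a 32-bit integer and that Go's gc toolchain stores a `bool` in one byte.

```go
package main

import (
	"fmt"
	"unsafe"
)

// winBOOL is a hypothetical Go mirror of the Windows BOOL typedef,
// which is a 32-bit signed integer.
type winBOOL int32

// toWinBOOL converts a Go bool into the 4-byte representation that a
// Windows API expecting BOOL would read.
func toWinBOOL(b bool) winBOOL {
	if b {
		return 1
	}
	return 0
}

// sizes returns the in-memory size of a Go bool and of winBOOL.
func sizes() (goBool, win uintptr) {
	var b bool
	var w winBOOL
	return unsafe.Sizeof(b), unsafe.Sizeof(w)
}

func main() {
	g, w := sizes()
	fmt.Printf("Go bool: %d byte(s), Windows BOOL: %d byte(s)\n", g, w)
	fmt.Println(toWinBOOL(true), toWinBOOL(false))
}
```

A struct field or argument declared as `bool` therefore leaves three of the four bytes the callee reads undefined, which is consistent with the intermittent ERROR_INVALID_PARAMETER results the commit mentions.
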
Josh Bleecher Snyder 1a629a4715 net/portmapper: mark fewer PMP probe failures as unexpected
There are lots of lines in the logs of the form:

portmapper: unexpected PMP probe response: {OpCode:128 ResultCode:3
SecondsSinceEpoch:NNN MappingValidSeconds:0 InternalPort:0
ExternalPort:0 PublicAddr:0.0.0.0}

ResultCode 3 here means a network failure, e.g. the NAT box itself has
not obtained a DHCP lease. This is not an indication that something
is wrong in the Tailscale client, so use different wording here
to reflect that. Keep logging, so that we can analyze and debug
the reasons that PMP probes fail.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-22 11:13:15 -08:00
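
A sketch of the distinction drawn above: pick the log wording from the NAT-PMP result code so that a network failure on the NAT device is not reported as unexpected. The constant names and `probeLogPrefix` are hypothetical; the numeric values follow the NAT-PMP result codes in RFC 6886, and treating only code 3 as expected is an assumption based on the commit message.

```go
package main

import "fmt"

// pmpResultCode is a NAT-PMP result code (values per RFC 6886).
type pmpResultCode uint16

const (
	pmpCodeOK                 pmpResultCode = 0
	pmpCodeUnsupportedVersion pmpResultCode = 1
	pmpCodeNotAuthorized      pmpResultCode = 2
	pmpCodeNetworkFailure     pmpResultCode = 3 // e.g. NAT box has no DHCP lease
	pmpCodeOutOfResources     pmpResultCode = 4
	pmpCodeUnsupportedOpcode  pmpResultCode = 5
)

// probeLogPrefix returns the wording to log for a probe response.
// It returns "" for a successful response, which needs no failure log.
func probeLogPrefix(code pmpResultCode) string {
	switch code {
	case pmpCodeOK:
		return ""
	case pmpCodeNetworkFailure:
		// Assumption: only this code is treated as an environmental
		// failure rather than a sign of a client bug.
		return "portmapper: PMP probe failed due to network failure on NAT device"
	default:
		return "portmapper: unexpected PMP probe response"
	}
}

func main() {
	for _, c := range []pmpResultCode{pmpCodeOK, pmpCodeNetworkFailure, pmpCodeOutOfResources} {
		fmt.Printf("%d -> %q\n", c, probeLogPrefix(c))
	}
}
```

Both failure branches still produce a log line, in keeping with the commit's intent to keep logging so probe failures can be analyzed.
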
David Anderson 88b8a09d37 net/dns: make constants for the various DBus strings.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-19 11:09:32 -08:00
David Anderson 6c82cebe57 health: add a health state for net/dns.OSConfigurator.
Lets the systemd-resolved OSConfigurator report health changes
for out of band config resyncs.

Updates #3327

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-19 11:09:32 -08:00
David Anderson 4ef3fed100 net/dns: resync config to systemd-resolved when it restarts.
Fixes #3327

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-19 11:09:32 -08:00
David Anderson cf9169e4be net/dns: remove unused Config struct element.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-19 11:09:32 -08:00
Josh Bleecher Snyder 758c37b83d net/netns: thread logf into control functions
So that darwin can log there without panicking during tests.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-18 15:09:51 -08:00
Brad Fitzpatrick cf06f9df37 net/tstun, wgengine: add packet-level and drop metrics
Primarily tstun work, but some MagicDNS stuff spread into wgengine.

No wireguard reconfig metrics (yet).

Updates #3307

Change-Id: Ide768848d7b7d0591e558f118b553013d1ec94ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-17 16:18:52 -08:00
Brad Fitzpatrick 400ed799e6 net/dns: work around old systemd-resolved setLinkDomain length limit
Don't set all the *.arpa. reverse DNS lookup domains if systemd-resolved
is old and can't handle them.

Fixes #3188

Change-Id: I283f8ce174daa8f0a972ac7bfafb6ff393dde41d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-16 12:54:21 -08:00
Brad Fitzpatrick 24ea365d48 netcheck, controlclient, magicsock: add more metrics
Updates #3307

Change-Id: Ibb33425764a75bde49230632f1b472f923551126
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-16 10:48:19 -08:00
David Anderson c5d572f371 net/dns: correctly handle NetworkManager-managed DNS that points to resolved.
Fixes #3304

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-15 12:21:25 -08:00
Maisem Ali eccc2ac6ee net/interfaces/windows: update Tailscale interface detection logic to
account for new wintun naming.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-11-08 07:44:33 -08:00
David Anderson 0532eb30db all: replace tailcfg.DiscoKey with key.DiscoPublic.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-03 14:00:16 -07:00
Josh Bleecher Snyder 94fb42d4b2 all: use testingutil.MinAllocsPerRun
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which we dynamically calculate the allocations
target based on a different AllocsPerRun run.

This also allows us to tighten the "no allocs"
test in wgengine/filter.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-10-28 12:48:37 -07:00
Denton Gentry 5302e4be96 net/portmapper: only print PCP/PMP if VerboseLogs
Make UPnP, NAT-PMP, and PCP packet reception logs be [v1] so
they will never appear on stdout and instead only go to logtail.

```
$ tailscale netcheck
2021/10/15 22:50:31 portmap: Got PMP response; IP: w.x.y.z, epoch: 1012707
2021/10/15 22:50:31 portmap: Got PCP response: epoch: 1012707

Report:
        * UDP: true
        * IPv4: yes, w.x.y.z:1511
        * IPv6: no
        * MappingVariesByDestIP: true
        * HairPinning: false
        * PortMapping: NAT-PMP, PCP
        * Nearest DERP: San Francisco
        * DERP latency:
                - sfo: 5.9ms   (San Francisco)
                - sea: 24ms    (Seattle)
                - dfw: 45ms    (Dallas)
                - ord: 53.7ms  (Chicago)
                - nyc: 74.1ms  (New York City)
                - tok: 111.1ms (Tokyo)
                - lhr: 139.4ms (London)
                - syd: 152.7ms (Sydney)
                - fra: 153.1ms (Frankfurt)
                - sin: 182.1ms (Singapore)
                - sao: 190.1ms (São Paulo)
                - blr: 218.6ms (Bangalore)
```

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-10-28 10:18:51 -07:00
David Anderson 060ba86baa net/portmapper: ignore IGD SSDP responses from !defaultgw
Now that we multicast the SSDP query, we can get IGD offers from
devices other than the current device's default gateway. We don't want
to accidentally bind ourselves to those.

Updates #3197

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-27 15:34:27 -07:00
David Anderson 4a65b07e34 net/portmapper: also send UPnP SSDP query to the SSDP multicast address.
Fixes #3197

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-27 15:02:03 -07:00
Brad Fitzpatrick b0b0a80318 net/netcheck: implement netcheck for js/wasm clients
And the derper change to add a CORS endpoint for latency measurement.

And a little magicsock change to cut down some log spam on js/wasm.

Updates #3157

Change-Id: I5fd9e6f5098c815116ddc8ac90cbcd0602098a48
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-27 09:59:31 -07:00
Denton Gentry 139a6c4c9c net/dns: detect when resolvconf points to systemd-resolved.
There are /etc/resolv.conf files out there where resolvconf wrote
the file but pointed to systemd-resolved as the nameserver.
We're better off handling those as systemd-resolved.

> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
> #     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
> # 127.0.0.53 is the systemd-resolved stub resolver.
> # run "systemd-resolve --status" to see details about the actual nameservers.

Fixes https://github.com/tailscale/tailscale/issues/3026
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-10-26 18:00:31 -07:00
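
One way to recognize the situation described above is to look at the nameserver entries rather than at the "generated by" header. The sketch below is illustrative only: `pointsAtResolved` is a hypothetical function, and requiring every nameserver to be the stub address 127.0.0.53 is an assumption, not necessarily the check net/dns performs.

```go
package main

import (
	"fmt"
	"strings"
)

// resolvedStubAddr is the systemd-resolved stub resolver address quoted
// in the resolv.conf comment above.
const resolvedStubAddr = "127.0.0.53"

// pointsAtResolved reports whether resolvConf lists at least one
// nameserver and all listed nameservers are the systemd-resolved stub.
func pointsAtResolved(resolvConf string) bool {
	found := false
	for _, line := range strings.Split(resolvConf, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != "nameserver" {
			continue // comments, search lines, options, blanks
		}
		if fields[1] != resolvedStubAddr {
			return false
		}
		found = true
	}
	return found
}

func main() {
	conf := "# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)\n" +
		"# 127.0.0.53 is the systemd-resolved stub resolver.\n" +
		"nameserver 127.0.0.53\n"
	fmt.Println(pointsAtResolved(conf))
}
```
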
David Anderson a320d70614 net/dns: fall back to copy+delete/truncate if moving to/from /etc/resolv.conf fails.
In some containers, /etc/resolv.conf is a bind-mount from outside the container.
This prevents renaming to or from /etc/resolv.conf, because it's on a different
filesystem from linux's perspective. It also prevents removing /etc/resolv.conf,
because doing so would break the bind-mount.

If we find ourselves within this environment, fall back to using copy+delete when
renaming to /etc/resolv.conf, and copy+truncate when renaming from /etc/resolv.conf.

Fixes #3000

Co-authored-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-26 09:03:37 -07:00
David Anderson 04d24d3a38 net/dns: move directManager function below directManager's definition.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-26 09:03:37 -07:00
David Anderson 422ea4980f net/dns: remove a tiny wrapper function that isn't contributing anything.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-26 09:03:37 -07:00
Brad Fitzpatrick a8e2cceefd net/netcheck: hard-code preferred DERP region 900 on js/wasm for now
See TODO in code.

Updates #3157

Change-Id: I3a14dd2cf51d3c21336bb357af5abc362a079ff4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-22 09:08:15 -07:00
Brad Fitzpatrick 9b101bd6af net/tstun: don't compile the code New constructor on js/wasm
Updates #3157

Change-Id: I81603edf3e69e6f1517b0074eef6b648f2981c50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-21 10:36:30 -07:00
Maxim Merzhanov 9f954628e5 net/dns: ignore UnknownMethod error in SetLinkDefaultRoute for resolved manager
Signed-off-by: Maxim Merzhanov <maksimmerzh@gmail.com>
2021-10-20 16:31:24 -07:00
Brad Fitzpatrick 8efc306e4f net/interfaces: assume the network's up on js/wasm
Updates #3157

Change-Id: If4acd33598ad5e8ef7fb5960964c9ac32bc8f68b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-20 12:23:22 -07:00
Joe Tsai 9af27ba829 cmd/cloner: mangle "go:generate" in cloner.go
The "go generate" command blindly looks for "//go:generate" anywhere
in the file regardless of whether it is truly a comment.
Prevent this false positive in cloner.go by mangling the string
to look less like "//go:generate".

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-10-16 17:53:43 -07:00
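
A minimal sketch of the mangling idea above, under the assumption that the generator needs to emit the directive text into its output. The names `directive` and `generatedHeader` are hypothetical and this is not the code in cmd/cloner; the point is only that the directive is assembled from pieces, so no line of the generator's own source starts with it.

```go
package main

import "fmt"

// directive is built from two pieces so that this source file never
// contains a line beginning with the literal directive text, which
// "go generate" would otherwise pick up even inside a string literal.
const directive = "//go:" + "generate"

// generatedHeader returns the directive line to write into generated
// output for the given tool invocation.
func generatedHeader(tool string) string {
	return fmt.Sprintf("%s %s\n", directive, tool)
}

func main() {
	fmt.Print(generatedHeader("go run example.com/hypothetical/tool"))
}
```
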
Maisem Ali 7817ab6b20 net/dns/resolver: set maxDoHInFlight to 1000 on iOS 15+.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-10-14 23:29:23 -04:00
Brad Fitzpatrick 4a3e2842d9 net/interfaces: add List, GetList
And start moving funcs to methods on List.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-14 15:06:12 -07:00
David Crawshaw 77696579f5 net/dns/resolver: drop dropping log
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-10-14 13:58:24 -07:00
Brad Fitzpatrick 676fb458c3 net/dns/resolver: make hasRDNSBonjourPrefix match shorter queries too
Fixes tailscale/corp#2886
Updates tailscale/corp#2820
Updates #2442

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-13 15:49:45 -07:00
nicksherron f01ff18b6f all: fix spelling mistakes
Signed-off-by: nicksherron <nsherron90@gmail.com>
2021-10-12 21:23:14 -07:00
Aaron Klotz 1991a1ac6a net/tstun: update tun_windows for wintun 0.14 API revisions, update wireguard-go dependency to 82d2aa87aa623cb5143a41c3345da4fb875ad85d
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2021-10-12 16:07:46 -06:00
Smitty b382161fe5 tsdns: don't forward transient DNS errors
When a DNS server claims to be unable or unwilling to handle a request,
instead of passing that refusal along to the client, just treat it as
any other error trying to connect to the DNS server. This prevents DNS
requests from failing based on if a server can respond with a transient
error before another server is able to give an actual response. DNS
requests only failing *sometimes* is really hard to find the cause of
(#1033).

Signed-off-by: Smitty <me@smitop.com>
2021-10-12 09:35:25 -04:00
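
The behavior described above can be sketched by inspecting the RCODE field of each upstream response and skipping refusals. The function names are hypothetical, and mapping "unable or unwilling" to SERVFAIL (2) and REFUSED (5) is an assumption; the header layout (12 bytes, RCODE in the low four bits of the fourth byte) is from RFC 1035.

```go
package main

import "fmt"

const (
	rcodeServFail = 2 // server was unable to process the query
	rcodeRefused  = 5 // server refuses to perform the operation
)

// header builds a bare 12-byte DNS response header with the given RCODE.
// It exists only to keep the example self-contained.
func header(rcode byte) []byte {
	h := make([]byte, 12)
	h[2] = 0x80 // QR bit: this is a response
	h[3] = rcode & 0x0f
	return h
}

// isTransientFailure reports whether resp should be treated like a
// failed connection rather than forwarded to the client.
func isTransientFailure(resp []byte) bool {
	if len(resp) < 12 {
		return true // too short to contain a DNS header
	}
	switch resp[3] & 0x0f {
	case rcodeServFail, rcodeRefused:
		return true
	}
	return false
}

// firstUsable returns the first response that is not a transient failure.
func firstUsable(responses [][]byte) ([]byte, bool) {
	for _, r := range responses {
		if !isTransientFailure(r) {
			return r, true
		}
	}
	return nil, false
}

func main() {
	resp, ok := firstUsable([][]byte{header(rcodeRefused), header(0)})
	fmt.Println(ok, resp[3]&0x0f)
}
```

Note that NXDOMAIN (3) is deliberately not in the list: it is a real answer and is still passed along in this sketch.
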
Denton Gentry 5d6198adee netcheck: don't log ErrGatewayRange
"skipping portmap; gateway range likely lacks support" is really
spammy on cloud systems, and not very useful in debugging.

Fixes https://github.com/tailscale/tailscale/issues/3034

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-10-10 10:47:03 -07:00
Denton Gentry d883747d8b net/dns/resolver: don't forward DNS-SD on all platforms
We added the initial handling only for macOS and iOS.
With 1.16.0 now released, suppress forwarding DNS-SD
on all platforms to test it through the 1.17.x cycle.

Updates #2442

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-10-08 17:14:59 -07:00
Brad Fitzpatrick 297d1b7cb6 net/dns/resolver: don't forward DNS-SD queries
Updates #2442
Fixes tailscale/corp#2820

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-07 12:38:55 -07:00
Brad Fitzpatrick 47044f3af7 net/dns/resolver: fix log prefix
The passed in logf already has a "dns: " prefix so they were
doubled up.
2021-10-07 12:19:41 -07:00
Brad Fitzpatrick 7634af5c6f all: gofmt
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-07 12:18:31 -07:00
Avery Pennarun 0d4a0bf60e magicsock: if STUN failed to send before, rebind before STUNning again.
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.

If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.

This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.

Fixes #2994

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-10-08 02:17:09 +09:00
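
The recovery logic described above amounts to remembering whether the previous STUN round failed to send anything at all, and rebinding before the next round if so. The sketch below uses hypothetical names (`stunRebinder`, `startRound`, `finishRound`) and does not reflect magicsock's actual structure.

```go
package main

import (
	"errors"
	"fmt"
)

// errSend stands in for whatever error a UDP send returns.
var errSend = errors.New("send failed")

// stunRebinder tracks whether the previous STUN round failed completely.
type stunRebinder struct {
	allSendsFailed bool
	rebind         func() // rebinds the UDP socket
}

// startRound is called before sending STUN packets. If every send in the
// previous round failed, it rebinds the socket first.
func (s *stunRebinder) startRound() {
	if s.allSendsFailed {
		s.rebind()
		s.allSendsFailed = false
	}
}

// finishRound records the send errors of the round that just ended; a
// nil entry means that packet was sent successfully.
func (s *stunRebinder) finishRound(sendErrs []error) {
	if len(sendErrs) == 0 {
		return
	}
	for _, err := range sendErrs {
		if err == nil {
			s.allSendsFailed = false
			return
		}
	}
	s.allSendsFailed = true
}

func main() {
	r := &stunRebinder{rebind: func() { fmt.Println("rebinding UDP socket") }}
	r.startRound()
	r.finishRound([]error{errSend, errSend}) // nothing could be sent
	r.startRound()                           // triggers a rebind
}
```

A round in which at least one packet goes out does not trigger a rebind, so a single unreachable server is not mistaken for a broken socket.
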
Brad Fitzpatrick 2501a694cb net/interfaces: add RegisterInterfaceGetter for Android
Updates #2293

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-06 10:43:12 -07:00
Brad Fitzpatrick 22a1a5d7cf ipn/ipnlocal: for IPv6-only nodes, publish IPv6 MagicDNS records of peers
See https://github.com/tailscale/tailscale/issues/2970#issuecomment-931885268

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-04 08:54:23 -07:00
Brad Fitzpatrick 09c2462ae5 net/tlsdial: add forgotten test file for go mod tidy
I forgot to include this file in the earlier
7cf8ec8108 commit.

This exists purely to keep "go mod tidy" happy.

Updates #1609

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-01 10:30:01 -07:00