Commit Graph

2525 Commits

Author SHA1 Message Date
Josh Bleecher Snyder 773fcfd007 Revert "wgengine/bench: skip flaky test"
This reverts commit d707e2f7e5.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-11 11:28:30 -07:00
Josh Bleecher Snyder 68911f6778 wgengine/bench: ignore "engine closing" errors
On benchmark completion, we shut down the wgengine.
If we happen to poll for status during shutdown,
we get an "engine closing" error.
It doesn't hurt anything; ignore it.

Fixes tailscale/corp#1776

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-11 11:28:30 -07:00
Brad Fitzpatrick d707e2f7e5 wgengine/bench: skip flaky test
Updates tailscale/corp#1776

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-11 11:10:21 -07:00
David Anderson cfde997699 net/dns: don't use interfaces.Tailscale to find the tailscale interface index.
interfaces.Tailscale only returns an interface if it has at least one Tailscale
IP assigned to it. In the resolved DNS manager, when we're called upon to tear
down DNS config, the interface no longer has IPs.

Instead, look up the interface index on construction and reuse it throughout
the daemon lifecycle.

Fixes #1892.

Signed-off-by: David Anderson <dave@natulte.net>
2021-05-10 15:24:42 -07:00
Brad Fitzpatrick d82b28ba73 go.mod: bump wireguard-go 2021-05-10 14:41:39 -07:00
Brad Fitzpatrick 366b3d3f62 ipn{,/ipnserver}: delay JSON marshaling of ipn.Notifies
If nobody is connected to the IPN bus, don't burn CPU & waste
allocations (causing more GC) by encoding netmaps for nobody.

This will notably help hello.ipn.dev.

Updates tailscale/corp#1773

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-10 14:36:27 -07:00
David Anderson dc32b4695c util/dnsname: normalize leading dots in ToFQDN.
Fixes #1888.

Signed-off-by: David Anderson <dave@natulte.net>
2021-05-10 13:07:03 -07:00
Josh Bleecher Snyder c0a70f3a06 go.mod: pull in wintun alignment fix from upstream wireguard-go
6cd106ab13...030c638da3

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-10 11:10:09 -07:00
Maisem Ali 7027fa06c3 wf: implement windows firewall using inet.af/wf.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-05-10 09:57:07 -07:00
Josh Bleecher Snyder 8d2a90529e wgengine/bench: hold lock in TrafficGen.GotPacket while calling first packet callback
Without any synchronization here, the "first packet" callback can
be delayed indefinitely, while other work continues.
Since the callback starts the benchmark timer, this could skew results.
Worse, if the benchmark manages to complete before the benchmark timer begins,
it'll cause a data race with the benchmark shutdown performed by package testing.
That is what is reported in #1881.

This is a bit unfortunate, in that it means that users of TrafficGen have
to be careful to keep this callback speedy and lightweight and to avoid deadlocks.

Fixes #1881

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-10 09:45:35 -07:00
Josh Bleecher Snyder a72fb7ac0b wgengine/bench: handle multiple Engine status callbacks
It is possible to get multiple status callbacks from an Engine.
We need to wait for at least one from each Engine.
Without limiting to one per Engine,
wait.Wait can exit early or can panic due to a negative counter.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-10 09:45:35 -07:00
Josh Bleecher Snyder 6618e82ba2 wgengine/bench: close Engines on benchmark completion
This reduces the speed with which these benchmarks exhaust their supply fds.
Not to zero unfortunately, but it's still helpful when doing long runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-10 09:45:35 -07:00
Josh Bleecher Snyder e9066ee625 types/wgkey: optimize Key.ShortString
name           old time/op    new time/op    delta
ShortString-8    82.6ns ± 0%    15.6ns ± 0%  -81.07%  (p=0.008 n=5+5)

name           old alloc/op   new alloc/op   delta
ShortString-8      104B ± 0%        8B ± 0%  -92.31%  (p=0.008 n=5+5)

name           old allocs/op  new allocs/op  delta
ShortString-8      3.00 ± 0%      1.00 ± 0%  -66.67%  (p=0.008 n=5+5)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-10 09:43:44 -07:00
Josh Bleecher Snyder 7cd4766d5e types/wgkey: add BenchmarkShortString
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-10 09:43:44 -07:00
Brad Fitzpatrick 3173c5a65c net/interface: remove darwin fetchRoutingTable workaround
Fixed upstream. Bump dep.

Updates #1345

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-10 08:24:11 -07:00
Josh Bleecher Snyder ceb568202b tailcfg: optimize keyMarshalText
This function accounted for ~1% of all allocs by tailscaled.
It is trivial to improve, so may as well.

name              old time/op    new time/op    delta
KeyMarshalText-8     197ns ± 0%      47ns ± 0%  -76.12%  (p=0.016 n=4+5)

name              old alloc/op   new alloc/op   delta
KeyMarshalText-8      200B ± 0%       80B ± 0%  -60.00%  (p=0.008 n=5+5)

name              old allocs/op  new allocs/op  delta
KeyMarshalText-8      5.00 ± 0%      1.00 ± 0%  -80.00%  (p=0.008 n=5+5)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-07 18:50:10 -07:00
Brad Fitzpatrick 5190435d6e cmd/tailscale: rewrite the "up" checker, fix bugs
The old way was way too fragile and had felt like it had more special
cases than normal cases. (see #1874, #1860, #1834, etc) It became very
obvious the old algorithm didn't work when we made the output be
pretty and try to show the user the command they need to run in
5ecc7c7200 for #1746)

The new algorithm is to map the prefs (current and new) back to flags
and then compare flags. This nicely handles the OS-specific flags and
the n:1 and 1:n flag:pref cases.

No change in the existing already-massive test suite, except some ordering
differences (the missing items are now sorted), but some new tests are
added for behavior that was broken before. In particular, it now:

* preserves non-pref boolean flags set to false, and preserves exit
  node IPs (mapping them back from the ExitNodeID pref, as well as
  ExitNodeIP),

* doesn't ignore --advertise-exit-node when doing an EditPrefs call
  (#1880)

* doesn't lose the --operator on the non-EditPrefs paths (e.g. with
  --force-reauth, or when the backend was not in state Running).

Fixes #1880

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-07 09:31:55 -07:00
Brad Fitzpatrick e72ed3fcc2 ipn/{ipnlocal,ipnstate}: add PeerStatus.ID stable ID to status --json output
Needed for the "up checker" to map back from exit node stable IDs (the
ipn.Prefs.ExitNodeID) back to an IP address in error messages.

But also previously requested so people can use it to then make API
calls. The upcoming "tailscale admin" subcommand will probably need it
too.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-07 09:31:55 -07:00
David Anderson 3c8e230ee1 Revert "net/dns: set IPv4 auto mode in NM, so it lets us set DNS."
This reverts commit 7d16c8228b.

I have no idea how I ended up here. The bug I was fixing with this change
fails to reproduce on Ubuntu 18.04 now, and this change definitely does
break 20.04, 20.10, and Debian Buster. So, until we can reliably reproduce
the problem this was meant to fix, reverting.

Part of #1875

Signed-off-by: David Anderson <dave@natulte.net>
2021-05-06 22:31:54 -07:00
David Anderson a3b15bdf7e .github: remove verbose issue templates, add triage label.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-06 19:14:19 -07:00
David Anderson 5bd38b10b4 net/dns: log the correct error when NM Reapply fails.
Found while debugging #1870.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-06 16:02:09 -07:00
David Anderson 7d16c8228b net/dns: set IPv4 auto mode in NM, so it lets us set DNS.
Part of #1870.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-06 16:02:09 -07:00
David Anderson 77e2375501 net/dns: don't try to configure LLMNR or mdns in NetworkManager.
Fixes #1870.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-06 16:02:09 -07:00
Brad Fitzpatrick e78e26b6fb cmd/tailscale: fix another up warning with exit nodes
The --advertise-routes and --advertise-exit-node flags both mutating
one pref is the gift that keeps on giving.

I need to rewrite the this up warning code to first map prefs back to
flag values and then just compare flags instead of comparing prefs,
but this is the minimal fix for now.

This also includes work on the tests, to make them easier to write
(and more accurate), by letting you write the flag args directly and
have that parse into the upArgs/MaskedPrefs directly, the same as the
code, rather than them being possibly out of sync being written by
hand.

Fixes https://twitter.com/EXPbits/status/1390418145047887877

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-06 15:50:58 -07:00
Josh Bleecher Snyder ddd85b9d91 wgengine/magicsock: rename discoEndpoint.wgEndpointHostPort to wgEndpoint
Fields rename only.

Part of the general effort to make our code agnostic about endpoint formatting.
It's just a name, but it will soon be a misleading one; be more generic.
Do this as a separate commit because it generates a lot of whitespace changes.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder e0bd3cc70c wgengine/magicsock: use netaddr.MustParseIPPrefix
Delete our bespoke helper.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder bc68e22c5b all: s/CreateEndpoint/ParseEndpoint/ in docs
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I missed some comments. Fix them.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder 9bce1b7fc1 wgengine/wgcfg: make device test endpoint-format-agnostic
By using conn.NewDefaultBind, this test requires that our endpoints
be comprehensible to wireguard-go. Instead, use a no-op bind that
treats endpoints as opaque strings.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder 73ad1f804b wgengine/wgcfg: use autogenerated Clone methods
Delete the manually written ones named Copy.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder 05bed64772 types/wgkey: simplify Key.UnmarshalJSON
Instead of calling ParseHex, do the hex.Decode directly.

name             old time/op    new time/op    delta
UnmarshalJSON-8    86.9ns ± 0%    42.6ns ± 0%   -50.94%  (p=0.000 n=15+14)

name             old alloc/op   new alloc/op   delta
UnmarshalJSON-8      128B ± 0%        0B       -100.00%  (p=0.000 n=15+15)

name             old allocs/op  new allocs/op  delta
UnmarshalJSON-8      2.00 ± 0%      0.00       -100.00%  (p=0.000 n=15+15)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder a0dacba877 wgengine/magicsock: simplify legacy endpoint DstToString
Legacy endpoints (addrSet) currently reconstruct their dst string when requested.

Instead, store the dst string we were given to begin with.
In addition to being simpler and cheaper, this makes less code
aware of how to interpret endpoint strings.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder 777c816b34 wgengine/wgcfg: return better errors from DeviceConfig, ReconfigDevice
Prefer the error from the actual wireguard-go device method call,
not {To,From}UAPI, as those tend to be less interesting I/O errors.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder 1f6c4ba7c3 wgengine/wgcfg: prevent ReconfigDevice from hanging on error
When wireguard-go's UAPI interface fails with an error, ReconfigDevice hangs.
Fix that by buffering the channel and closing the writer after the call.
The code now matches the corresponding code in DeviceConfig, where I got it right.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder 462f7e38fc tailcfg: fix typo in comment
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 12:44:22 -07:00
Josh Bleecher Snyder ed63a041bf wgengine/userspace: delete HandshakeDone
It is unused, and has been since early Feb 2021 (Tailscale 1.6).
We can't get delete the DeviceOptions entirely yet;
first #1831 and #1839 need to go in, along with some wireguard-go changes.
Deleting this chunk of code now will make the later commits more clearly correct.

Pingers can now go too.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-06 11:20:46 -07:00
Brad Fitzpatrick 4b14f72f1f VERSION.txt: the 1.9.x dev cycle hath begun 2021-05-06 10:35:05 -07:00
Brad Fitzpatrick b8fb8264a5 wgengine/netstack: avoid delivering incoming packets to both netstack + host
The earlier eb06ec172f fixed
the flaky SSH issue (tailscale/corp#1725) by making sure that packets
addressed to Tailscale IPs in hybrid netstack mode weren't delivered
to netstack, but another issue remained:

All traffic handled by netstack was also potentially being handled by
the host networking stack, as the filter hook returned "Accept", which
made it keep processing. This could lead to various random racey chaos
as a function of OS/firewalls/routes/etc.

Instead, once we inject into netstack, stop our caller's packet
processing.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-06 06:43:16 -07:00
Brad Fitzpatrick 7f2eb1d87a net/tstun: fix TUN log spam when ACLs drop a packet
Whenever we dropped a packet due to ACLs, wireguard-go was logging:

Failed to write packet to TUN device: packet dropped by filter

Instead, just lie to wireguard-go and pretend everything is okay.

Fixes #1229

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-06 06:42:58 -07:00
Brad Fitzpatrick 2585edfaeb cmd/tailscale: fix tailscale up --advertise-exit-node validation
Fixes #1859

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 20:50:47 -07:00
Brad Fitzpatrick 1a1123d461 wgengine: fix pendopen debug to not track SYN+ACKs, show Node.Online state
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 15:25:11 -07:00
Brad Fitzpatrick b2de34a45d version: bump date
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 14:49:20 -07:00
Brad Fitzpatrick eb06ec172f wgengine/netstack: don't pass non-subnet traffic to netstack in hybrid mode
Fixes tailscale/corp#1725

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 13:38:55 -07:00
Brad Fitzpatrick 7629cd6120 net/tsaddr: add NewContainsIPFunc (move from wgengine)
I want to use this from netstack but it's not exported.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 13:15:50 -07:00
Josh Bleecher Snyder 78d4c561b5 types/logger: add key grinder stats lines to rate-limiting exemption list
Updates #1749

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-05 08:25:15 -07:00
Josh Bleecher Snyder f116a4c44f types/logger: fix rate limiter allowlist
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I updated the log call site but not the allowlist.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 21:59:05 -07:00
Josh Bleecher Snyder be56aa4962 workflows: execute benchmarks
#1817 removed the only place in our CI where we executed our benchmark code.
Fix that by executing it everywhere.

The benchmarks are generally cheap and fast, 
so this should add minimal overhead.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 20:21:03 -07:00
Brad Fitzpatrick 52e1031428 cmd/tailscale: gofmt
From 6d10655dc3

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-04 13:04:33 -07:00
Josh Bleecher Snyder ac75958d2e workflows: run staticcheck on more platforms
To prevent issues like #1786, run staticcheck on the primary GOOSes:
linux, mac, and windows.

Windows also has a fair amount of GOARCH-specific code.
If we ever have GOARCH staticcheck failures on other GOOSes,
we can expand the test matrix further.

This requires installing the staticcheck binary so that
we can execute it with different GOOSes.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 12:50:13 -07:00
Avery Pennarun 6d10655dc3 ipnlocal: accept a new opts.UpdatePrefs field.
This is needed because the original opts.Prefs field was at some point
subverted for use in frontend->backend state migration for backward
compatibility on some platforms. We still need that feature, but we
also need the feature of providing the full set of prefs from
`tailscale up`, *not* including overwriting the prefs.Persist keys, so
we can't use the original field from `tailscale up`.

`tailscale up` had attempted to compensate for that by doing SetPrefs()
before Start(), but that violates the ipn.Backend contract, which says
you should call Start() before anything else (that's why it's called
Start()). As a result, doing SetPrefs({ControlURL=...,
WantRunning=true}) would cause a connection to the *previous* control
server (because WantRunning=true), and then connect to the *new*
control server only after running Start().

This problem may have been avoided before, but only by pure luck.

It turned out to be relatively harmless since the connection to the old
control server was immediately closed and replaced anyway, but it
created a race condition that could have caused spurious notifications
or rejected keys if the server responded quickly.

As already covered by existing TODOs, a better fix would be to have
Start() get out of the business of state migration altogether. But
we're approaching a release so I want to make the minimum possible fix.

Fixes #1840.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-05-04 15:19:25 -04:00
Josh Bleecher Snyder 7dbbe0c7c7 cmd/tailscale/cli: fix running from Xcode
We were over-eager in running tailscale in GUI mode.
f42ded7acf fixed that by
checking for a variety of shell-ish env vars and using those
to force us into CLI mode.

However, for reasons I don't understand, those shell env vars
are present when Xcode runs Tailscale.app on my machine.
(I've changed no configs, modified nothing on a brand new machine.)
Work around that by adding an additional "only in GUI mode" check.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 11:37:02 -07:00