Commit Graph

58 Commits

Author SHA1 Message Date
Brad Fitzpatrick 16a9cfe2f4 wgengine: configure wireguard peers lazily, as needed
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).

Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.

This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).

We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-07-24 12:50:15 -07:00
Brad Fitzpatrick a6559a8924 wgengine/magicsock: run test DERP in mode where only disco packets allowed
So we don't accidentally pass a NAT traversal test by having DERP pick up our slack
when we really just wanted DERP as an OOB messaging channel.
2020-07-16 12:58:35 -07:00
David Anderson 45578b47f3 tstest/natlab: refactor PacketHandler into a larger interface.
The new interface lets implementors more precisely distinguish
local traffic from forwarded traffic, and applies different
forwarding logic within Machines for each type. This allows
Machines to be packet forwarders, which didn't quite work
with the implementation of Inject.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-07-15 14:38:33 -07:00
David Anderson 88e8456e9b wgengine/magicsock: add a connectivity test for facing firewalls.
The test demonstrates that magicsock can traverse two stateful
firewalls facing each other, that each require localhost to
initiate connections.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-07-11 07:04:08 +00:00
David Anderson 1f7b1a4c6c wgengine/magicsock: rearrange TwoDevicePing test for future natlab tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-07-11 06:48:08 +00:00
David Anderson 977381f9cc wgengine/magicsock: make trivial natlab test pass.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-07-11 01:53:21 +00:00
Brad Fitzpatrick 6c74065053 wgengine/magicsock, tstest/natlab: start hooking up natlab to magicsock
Also adds ephemeral port support to natlab.

Work in progress.

Pairing with @danderson.
2020-07-10 14:32:58 -07:00
Brad Fitzpatrick 6196b7e658 wgengine/magicsock: change API to not permit disco key changes
Generate the disco key ourselves and give out the public half instead.

Fixes #525
2020-07-06 12:10:39 -07:00
Brad Fitzpatrick 5132edacf7 wgengine/magicsock: fix data race from undocumented wireguard-go requirement
Endpoints need to be Stringers apparently.

Fixes tailscale/corp#422
2020-07-03 22:27:52 -07:00
Brad Fitzpatrick 77e89c4a72 wgengine/magicsock: handle CallMeMaybe discovery mesages
Roughly feature complete now. Testing and polish remains.

Updates #483
2020-07-01 15:30:25 -07:00
Brad Fitzpatrick 2d6e84e19e net/netcheck, wgengine/magicsock: replace more UDPAddr with netaddr.IPPort 2020-06-30 13:25:13 -07:00
Brad Fitzpatrick e96f22e560 wgengine/magicsock: start handling disco message, use netaddr.IPPort more
Updates #483
2020-06-30 12:24:23 -07:00
Brad Fitzpatrick a83ca9e734 wgengine/magicsock: cache precomputed nacl/box shared keys
Updates #483
2020-06-29 14:26:25 -07:00
Brad Fitzpatrick a975e86bb8 wgengine/magicsock: add new endpoint type used for discovery-supporting peers
This adds a new magicsock endpoint type only used when both sides
support discovery (that is, are advertising a discovery
key). Otherwise the old code is used.

So far the new code only communicates over DERP as proof that the new
code paths are wired up. None of the actually discovery messaging is
implemented yet.

Support for discovery (generating and advertising a key) are still
behind an environment variable for now.

Updates #483
2020-06-29 13:59:54 -07:00
Brad Fitzpatrick 103c06cc68 wgengine/magicsock: open discovery naclbox messages from known peers
And track known peers.

Doesn't yet do anything with the messages. (nor does it send any yet)

Start of docs on the message format. More will come in subsequent changes.

Updates #483
2020-06-26 14:57:12 -07:00
Dmytro Shynkevych de5f6d70a8 magicsock: eliminate logging race in test
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-06-22 11:06:12 -04:00
Brad Fitzpatrick 280e8884dd wgengine/magicsock: limit redundant log spam on packets from low-pri addresses
Fixes #407

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-06-11 09:40:55 -07:00
Brad Fitzpatrick becce82246 net/netns, misc tests: remove TestOnlySkipPrivilegedOps, argv checks
The netns UID check is sufficient for now. We can do something else
later if/when needed.
2020-05-31 14:40:18 -07:00
David Anderson 5114df415e net/netns: set the bypass socket mark on linux.
This allows tailscaled's own traffic to bypass Tailscale-managed routes,
so that things like tailscale-provided default routes don't break
tailscaled itself.

Progress on #144.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-29 15:16:58 -07:00
Brad Fitzpatrick b0c10fa610 stun, netcheck: move under net 2020-05-25 09:18:24 -07:00
Brad Fitzpatrick e6b84f2159 all: make client use server-provided DERP map, add DERP region support
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.

And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)

This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.

Also fix some test flakes & races.

Fixes #387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes #399 (wgengine: flaky tests)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-23 22:31:59 -07:00
David Anderson e8b3a5e7a1 wgengine/filter: implement a destination IP pre-filter.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-22 17:03:30 +00:00
Brad Fitzpatrick e6d0c92b1d wgengine/magicsock: clean up earlier fix a bit
Move WaitReady from fc88e34f42 into the
test code, and keep the derp-reading goroutine named for debugging.
2020-05-14 10:01:48 -07:00
Avery Pennarun fc88e34f42 wgengine/magicsock/tests: wait for home DERP connection before sending packets.
This fixes an elusive test flake. Fixes #161.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-13 23:50:25 -04:00
Avery Pennarun 4f128745d8 magicsock/test: oops, fix a data race in nested-test logf hack.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-13 23:50:09 -04:00
Avery Pennarun 42a0e0c601 wgengine/magicsock/tests: call tstest.ResourceCheck for each test.
This didn't catch anything yet, but it's good practice for detecting
goroutine leaks that we might not find otherwise.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-13 23:17:51 -04:00
Avery Pennarun 08acb502e5 Add tstest.PanicOnLog(), and fix various problems detected by this.
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.

This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.

To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-13 23:12:35 -04:00
Dmytro Shynkevych 33b2f30cea
wgengine: wrap tun.Device to support filtering and packet injection (#358)
Right now, filtering and packet injection in wgengine depend
on a patch to wireguard-go that probably isn't suitable for upstreaming.

This need not be the case: wireguard-go/tun.Device is an interface.
For example, faketun.go implements it to mock a TUN device for testing.

This patch implements the same interface to provide filtering
and packet injection at the tunnel device level,
at which point the wireguard-go patch should no longer be necessary.

This patch has the following performance impact on i7-7500U @ 2.70GHz,
tested in the following namespace configuration:
┌────────────────┐    ┌─────────────────────────────────┐     ┌────────────────┐
│      $ns1      │    │               $ns0              │     │      $ns2      │
│    client0     │    │      tailcontrol, logcatcher    │     │     client1    │
│  ┌─────┐       │    │  ┌──────┐         ┌──────┐      │     │  ┌─────┐       │
│  │vethc│───────┼────┼──│vethrc│         │vethrs│──────┼─────┼──│veths│       │
│  ├─────┴─────┐ │    │  ├──────┴────┐    ├──────┴────┐ │     │  ├─────┴─────┐ │
│  │10.0.0.2/24│ │    │  │10.0.0.1/24│    │10.0.1.1/24│ │     │  │10.0.1.2/24│ │
│  └───────────┘ │    │  └───────────┘    └───────────┘ │     │  └───────────┘ │
└────────────────┘    └─────────────────────────────────┘     └────────────────┘
Before:
---------------------------------------------------
| TCP send               | UDP send               |
|------------------------|------------------------|
| 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec |
---------------------------------------------------
After:
---------------------------------------------------
| TCP send               | UDP send               |
|------------------------|------------------------|
| 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec |
---------------------------------------------------
The impact on receive performance is similar.

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
Brad Fitzpatrick 577f321c38 wgengine/magicsock: revise derp fallback logic
Revision to earlier 6284454ae5

Don't be sticky if we have no peers.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-25 13:09:18 -07:00
Brad Fitzpatrick e085aec8ef all: update to wireguard-go API changes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-17 08:53:05 -07:00
Brad Fitzpatrick b9c6d3ceb8 netcheck: work behind UDP-blocked networks again, add tests
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-12 14:49:06 -07:00
Brad Fitzpatrick b0f8931d26 wgengine/magicsock: make a test signature a bit more explicit 2020-03-11 09:51:33 -07:00
Brad Fitzpatrick 01b4bec33f stunner: re-do how Stunner works
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.

Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.

This drops netcheck from ~5 seconds to ~250-500ms.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-11 08:08:48 -07:00
David Anderson 77af7e5436 wgengine/magicsock: mark test logfunc as a helper.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson 7eda3af034 wgengine/magicsock: clean up derp http servers on shutdown.
Failure to do this leads to fd exhaustion at -count=10000,
and increasingly poor execution north of -count=100.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson d651715528 wgengine/magicsock: synchronize test STUN shutdown.
Failure to do so triggers either a data race or a panic
in the testing package, due to racey use of t.Logf.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson 592fec7606 wgengine/magicsock: move device close to uncursed portion of test.
Device close used to suffer from deadlocks, but no longer.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 11:57:57 -07:00
Brad Fitzpatrick 39c0ae1dba derp/derpmap: new DERP config package, merge netcheck into magicsock more
Fixes #153
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-10 10:37:25 -07:00
Brad Fitzpatrick 4800926006 wgengine/magicsock: add AddrSet appendDests+UpdateDst tests 2020-03-09 09:13:28 -07:00
David Crawshaw e201f63230 magicsock: unskip tests that are reliable
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-08 09:29:37 -04:00
David Anderson bb93d7aaba wgengine/magicsock: plumb logf throughout, and expose in Options.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-07 14:11:28 -08:00
David Anderson e3172ae267 wgengine/magicsock: uncurse TestDeviceStartStop, let CI run it.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-06 20:43:57 -08:00
David Anderson 77354d4617 wgengine/magicsock: unblock wireguard-go's read on magicsock shutdown.
wireguard-go closes magicsock, and expects this to unblock reads
so that its internal goroutines can wind down. We were incorrectly
blocking the read indefinitey and breaking this contract.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-06 18:28:47 -08:00
David Anderson fdee5fb639 wgengine/magicsock: don't mutexly reach inside Conn to tweak DERP settings.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-06 18:28:47 -08:00
David Anderson 643bf14653 wgengine/magicsock: disable the new ping test.
It's extremely flaky in several dimensions, as well as very slow.
It's making the CI completely red all the time without telling us
useful information.

Set RUN_CURSED_TESTS=1 to run locally.
2020-03-06 13:35:59 -08:00
David Anderson c8ebac2def wgengine/magicsock: try deflaking again.
This change just alters the semantics of the one flaky test, without
trying to speed up timeouts on the others. Empirically, speeding up
the timeouts causes _more_ flakes right now :(
2020-03-06 12:43:49 -08:00
David Anderson cd1ac63b4c Revert "wgengine/magicsock: temporarily deflake."
This reverts commit c5835c6ced.
2020-03-06 12:37:19 -08:00
David Anderson c5835c6ced wgengine/magicsock: temporarily deflake.
The remaining flake occurs due to a mysterious packet loss. This
doesn't affect normal tailscaled operations, so until I track down
where the loss occurs and fix it, the flaky test is going to be
lenient about packet loss (but not about whether the spray logic
worked).

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-06 12:14:54 -08:00
David Anderson bd60a750e8 wgengine/magicsock: fix packet spraying test to (mostly) pass.
It previously passed incorrectly due to bugs. With those fixed,
it becomes flaky for 2 reasons. One of them is the wireguard handshake
race, which can eat the 1st sprayed packet and prevent roamAddr
discovery. This change fixes that failure, by spreading the test
traffic out enough that additional spraying occurs.

Signed-Off-By: David Anderson <danderson@tailscale.com>
2020-03-06 11:10:13 -08:00
David Crawshaw a65b2a0efd magicsock: add some DERP tests
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-04 12:40:33 -05:00