Commit Graph

3167 Commits

Author SHA1 Message Date
David Anderson 4fdb88efe1 types/key: add MachinePrivate and MachinePublic.
Plumb throughout the codebase as a replacement for the mixed use of
tailcfg.MachineKey and wgkey.Private/Public.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-03 10:07:15 -07:00
David Anderson 4ce091cbd8 version: use `go` from the current toolchain to compile in tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-02 20:11:20 -07:00
Brad Fitzpatrick d1cb7a2639 metrics: use SYS_OPENAT
New systems like arm64 don't even have SYS_OPEN.
2021-09-02 15:28:19 -07:00
David Anderson 159d88aae7 go.mod: tidy.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-02 14:26:27 -07:00
David Anderson b96159e820 go.mod: update github.com/ulikunitz/xz for https://github.com/advisories/GHSA-25xm-hr59-7c27
Our code is not vulnerable to the issue in question: it only happens in the decompression
path for untrusted inputs, and we only use xz as part of mkpkg, which is write-only
and operates on trusted build system outputs to construct deb and rpm packages.

Still, it's nice to keep the dependabot dashboard clean.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-02 14:02:57 -07:00
Brad Fitzpatrick 99a1c74a6a metrics: optimize CurrentFDs to not allocate on Linux
It was 50% of our allocs on one of our servers. (!!)

Updates #2784

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 13:28:39 -07:00
Brad Fitzpatrick db3586cd43 go.mod: upgrade staticcheck
It was crashing on a PR of mine and this fixes it.
2021-09-02 13:13:42 -07:00
Brad Fitzpatrick ec2b7c7da6 all: bump minimum Go to 1.17
In prep for using 1.17 features.

Note the go.mod changes are due to:
https://golang.org/doc/go1.17#go-command

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 12:51:11 -07:00
Brad Fitzpatrick fc160f80ee metrics: move currentFDs code to the metrics package
Updates #2784

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 11:14:14 -07:00
Brad Fitzpatrick 5d800152d9 cmd/derper: increase port 80's WriteTimeout to permit longer CPU profiles
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 10:49:14 -07:00
Chuangbo Li e4e4d336d9
cmd/derper: listen on host of flag server addr for port 80 and 3478 (#2768)
cmd/derper: listen on host of flag server addr for port 80 and 3478

When using custom derp on the server with multiple IP addresses,
we would like to bind derp 80, 443 and stun 3478 to a certain IP.

derp command provides flag `-a` to customize which address to bind
for port 443. But port :80 and :3478 were hard-coded.

Fixes #2767

Signed-off-by: Li Chuangbo <im@chuangbo.li>
2021-09-02 10:42:27 -07:00
David Crawshaw 4e18cca62e testcontrol: replace panic with error
I have seen this once in the VM test (caused by an EOF, I believe on
shutdown) that didn't need to cause the test to fail.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-09-02 10:39:32 -07:00
Brad Fitzpatrick 5bacbf3744 wgengine/magicsock, health, ipn/ipnstate: track DERP-advertised health
And add health check errors to ipnstate.Status (tailscale status --json).

Updates #2746
Updates #2775

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 10:20:25 -07:00
Brad Fitzpatrick 722942dd46 tsweb: restore CPU profiling handler
It was accidentally deleted in the earlier 0022c3d2e (#2143) refactor.
Lock it in with a test.

Fixes tailscale/corp#2503

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 09:24:29 -07:00
David Anderson daf54d1253 control/controlclient: remove TS_DEBUG_USE_DISCO=only.
It was useful early in development when disco clients were the
exception and tailscale logs were noisier than today, but now
non-disco is the exception.

Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson 39748e9562 net/dns/resolver: authoritatively return NXDOMAIN for reverse zones we own.
Fixes #2774

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson 954064bdfe wgengine/wgcfg/nmcfg: don't configure peers who can't DERP or disco.
Fixes #2770

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson f90ac11bd8 wgengine: remove unnecessary magicConnStarted channel.
Having removed magicconn.Start, there's no need to synchronize startup
of other things to it any more.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson bb10443edf wgengine/wgcfg: use just the hexlified node key as the WireGuard endpoint.
The node key is all magicsock needs to find the endpoint that WireGuard
needs.

Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
David Anderson d00341360f wgengine/magicsock: remove unused debug knob.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
David Anderson dfd978f0f2 wgengine/magicsock: use NodeKey, not DiscoKey, as the trigger for lazy reconfig.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
David Anderson 4c27e2fa22 wgengine/magicsock: remove Start method from Conn.
Over time, other magicsock refactors have made Start effectively a
no-op, except that some other functions choose to panic if called
before Start.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
David Anderson 1a899344bd wgengine/magicsock: don't store tailcfg.Nodes alongside endpoints.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
David Anderson b2181608b5 wgengine/magicsock: eagerly create endpoints in SetNetworkMap.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
Maisem Ali 0842e2f45b ipn/store: add ability to store data as k8s secrets.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-09-01 12:50:59 -07:00
David Crawshaw f53792026e tstest/integration/vms: move build tags from linux to !windows
The tests build fine on other Unix's, they just can't run there.
But there is already a t.Skip by default, so `go test` ends up
working fine elsewhere and checks the code compiles.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-09-01 11:38:18 -07:00
Brad Fitzpatrick 7f29dcaac1 cmd/tailscale/cli: make up block until state Running, not just Starting
At "Starting", the DERP connection isn't yet up. After the first netmap
and DERP connect, then it transitions into "Running".

Fixes #2708

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-01 08:25:42 -07:00
Brad Fitzpatrick fb8b821710 tsnet: fix typo in comment
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-01 07:55:25 -07:00
Brad Fitzpatrick 7a7aa8f2b0 cmd/derper: also add port 80 timeouts
Didn't notice this one in earlier 00b3c1c042

Updates tailscale/corp#2486

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 21:18:36 -07:00
Brad Fitzpatrick 3c8ca4b357 client/tailscale, cmd/tailscale/cli: move version mismatch check to CLI
So people can use the package for whois checks etc without version
skew errors.

The earlier change faa891c1f2 for #1905
was a bit too aggressive.

Fixes #2757

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 15:27:25 -07:00
Brad Fitzpatrick 8744394cde version: bump date
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 15:27:25 -07:00
Brad Fitzpatrick 21cb0b361f safesocket: add connect retry loop to wait for tailscaled
Updates #2708

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 15:13:42 -07:00
Brad Fitzpatrick a59b389a6a derp: add new health update and server restarting frame types
Updates #2756
Updates #2746

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 13:31:51 -07:00
Christine Dodrill 0b9e938152 tstest/integration/vms: test DNS configuration
This uses a neat little tool to dump the output of DNS queries to
standard out. This is the first end-to-end test of DNS that runs against
actual linux systems. The /etc/resolv.conf test may look superflous,
however this will help for correlating system state if one of the DNS
tests fails.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-08-31 12:31:54 -07:00
Brad Fitzpatrick 00b3c1c042 cmd/derper: add missing read/write timeouts
Updates tailscale/corp#2486

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 10:23:53 -07:00
David Crawshaw 9b7fc2ed1f .github: add Ubuntu VM test
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-08-31 08:50:55 -07:00
Brad Fitzpatrick 73280595a8 derp: accept dup clients without closing prior's connection
A public key should only have max one connection to a given
DERP node (or really: one connection to a node in a region).

But if people clone their machine keys (e.g. clone their VM, Raspbery
Pi SD card, etc), then we can get into a situation where a public key
is connected multiple times.

Originally, the DERP server handled this by just kicking out a prior
connections whenever a new one came. But this led to reconnect fights
where 2+ nodes were in hard loops trying to reconnect and kicking out
their peer.

Then a909d37a59 tried to add rate
limiting to how often that dup-kicking can happen, but empirically it
just doesn't work and ~leaks a bunch of goroutines and TCP
connections, tying them up for hour+ while more and more accumulate
and waste memory. Mostly because we were doing a time.Sleep forever
while not reading from their TCP connections.

Instead, just accept multiple connections per public key but track
which is the most recent. And if two both are writing back & forth,
then optionally disable them both. That last part is only enabled in
tests for now. The current default policy is just last-sender-wins
while we gather the next round of stats.

Updates #2751

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-08-31 08:21:21 -07:00
David Crawshaw debaaebf3b tstest/integration/vms: turn on logcatcher logging by default
Absolutely vital to debugging failures.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-08-31 06:40:28 -07:00
David Crawshaw a1f1020042 tstest/integration/vms: avoid log after test completion
Avoids a panic in the Go testing package if a late log comes in.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-08-31 06:40:28 -07:00
David Crawshaw 583af7c1a6 tstest/integration/vms: give guest multiple cores and use generic machine
Speeds up tests.
Allows the use of more version of qemu.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-08-31 06:40:28 -07:00
David Crawshaw 8668103f06 tstest/integration/vms: print qemu console output, fix printing issues
Fix a few test printing issues when tests fail.

Qemu console output is super useful when something is wrong in the
harness and we cannot even bring up the tests.
Also useful for figuring out where all the time goes in tests.

A little noisy, but not too noisy as long as you're only running one VM
as part of the tests, which is my plan.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-08-31 06:40:28 -07:00
David Crawshaw 1a9fba5b04 tstest/integration/vms: fix ubuntu URLs
Also remove extra distros for now.
We can bring them back later if useful.
Though our most important distros are these two Ubuntu, debian stable,
and Raspbian (not currently supported).
And before doing more Linux, we should do Windows.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-08-31 06:40:28 -07:00
Emmanuel T Odeke 0daa32943e all: add (*testing.B).ReportAllocs() to every benchmark
This ensures that we can properly track and catch allocation
slippages that could otherwise have been missed.

Fixes #2748
2021-08-30 21:41:04 -07:00
David Anderson 44d71d1e42 wgengine/magicsock: fix race in test shutdown, again.
We were returning an error almost, but not quite like errConnClosed in
a single codepath, which could still trip the panic on reconfig in the
test logic.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 21:26:38 -07:00
David Anderson f09ede9243 wgengine/magicsock: don't configure eager WireGuard handshaking in tests.
Our prod code doesn't eagerly handshake, because our disco layer enables
on-demand handshaking. Configuring both peers to eagerly handshake leads
to WireGuard handshake races that make TestTwoDevicePing flaky.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 17:28:12 -07:00
David Anderson 86d1c4eceb wgengine/magicsock: ignore close races even harder.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 17:09:45 -07:00
David Anderson 8bacfe6a37 wgengine/magicsock: remove unused sendLogLimit limiter.
Magicsock these days gets its logs limited by the global log limiter.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 17:09:45 -07:00
David Anderson e151b74f93 wgengine/magicsock: remove opts.SimulatedNetwork.
It only existed to override one test-only behavior with a
different test-only behavior, in both cases working around
an annoying feature of our CI environments. Instead, handle
that weirdness entirely in the test code, with a tweaked
TestOnlyPacketListener that gets injected.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 17:09:45 -07:00
David Anderson 58c1f7d51a wgengine/magicsock: rename opts.PacketListener to TestOnlyPacketListener.
The docstring said it was meant for use in tests, but it's specifically a
special codepath that is _only_ used in tests, so make the claim stronger.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 17:09:45 -07:00
David Anderson 8049063d35 wgengine/magicsock: rename discoEndpoint to just endpoint.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-08-30 17:09:45 -07:00