Commit Graph

1008 Commits

Author SHA1 Message Date
David Anderson a6bd3a7e53 logpolicy: use netns for dialing log.tailscale.io. 2020-05-28 23:53:19 +00:00
David Anderson e9f7d01b91 derp/derphttp: make DERP client use netns for dial-outs. 2020-05-28 23:48:08 +00:00
Brad Fitzpatrick 9e3ad4f79f net/netns: add package for start of network namespace support
And plumb in netcheck STUN packets.

TODO: derphttp, logs, control.

Updates #144

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-28 16:20:16 -07:00
Brad Fitzpatrick a428656280 wgengine/magicsock: don't report v4 localhost addresses on IPv6-only systems
Updates #376
2020-05-28 14:16:23 -07:00
David Anderson fff062b461 wgengine/router: make runner.go linux-only for now.
Otherwise, staticcheck complains that these functions are unused
and unexported on macOS.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-28 12:19:01 -07:00
Brad Fitzpatrick f0204098d8 Revert "control/controlclient: use "getprop net.hostname" for Android hostname"
This reverts commit afb9c6a6ab.

Doesn't work. See:

    https://github.com/tailscale/tailscale/issues/409#issuecomment-635241550

Looks pretty dire:

    https://medium.com/capital-one-tech/how-to-get-an-android-device-nickname-d5eab12f4ced

Updates #409
2020-05-28 10:50:11 -07:00
Brad Fitzpatrick 0245bbe97b Make netcheck handle v6-only interfaces better, faster.
Also:

* add -verbose flag to cmd/tailscale netcheck
* remove some API from the interfaces package
* convert some of the interfaces package to netaddr.IP
* don't even send IPv4 probes on machines with no IPv4 (or only v4
  loopback)
* and once three regions have replied, stop waiting for other probes
  at 2x the slowest duration.

Updates #376
2020-05-28 10:04:20 -07:00
Brad Fitzpatrick c5495288a6 Bump inet.af/netaddr dep for FromStdIP behavior change I want to depend on. 2020-05-28 09:34:41 -07:00
Brad Fitzpatrick 9bbcdba2b3 tempfork/internal/testenv: remove
It was for our x509 fork and no longer needed. (x509 changes
went into our Go fork instead)
2020-05-28 09:34:22 -07:00
Brad Fitzpatrick a96165679c cmd/tailscale: add netcheck flags for incremental reports, JSON output 2020-05-28 08:28:04 -07:00
Avery Pennarun f69003fd46 router_linux: work around terrible bugs in old iptables-compat versions.
Specifically, this sequence:
	iptables -N ts-forward
	iptables -A ts-forward -m mark --mark 0x10000 -j ACCEPT
	iptables -A FORWARD -j ts-forward
doesn't work on Debian-9-using-nftables, but this sequence:
	iptables -N ts-forward
	iptables -A FORWARD -j ts-forward
	iptables -A ts-forward -m mark --mark 0x10000 -j ACCEPT
does work.

I'm sure the reason why is totally fascinating, but it's an old version
of iptables and the bug doesn't seem to exist on modern nftables, so
let's refactor our code to add rules in the always-safe order and
pretend this never happened.

Fixes #401.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:06 -04:00
Avery Pennarun 9ff51909a3 router_linux: fix behaviour when switching --netfilter-mode.
On startup, and when switching into =off and =nodivert, we were
deleting netfilter rules even if we weren't the ones that added them.

In order to avoid interfering with rules added by the sysadmin, we have
to be sure to delete rules only in the case that we added them in the
first place.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:05 -04:00
Avery Pennarun a496cdc943 router_linux: remove need for iptables.ListChains().
Instead of retrieving the list of chains, or the list of rules in a
chain, just try deleting the ones we don't want and then adding the
ones we do want. An error in flushing/deleting still means the rule
doesn't exist anymore, so there was no need to check for it first.

This avoids the need to parse iptables output, which avoids the need to
ever call iptables -S, which fixes #403, among other things. It's also
much more future proof in case the iptables command line changes.

Unfortunately the iptables go module doesn't properly pass the iptables
command exit code back up when doing .Delete(), so we can't correctly
check the exit code there. (exit code 1 really means the rule didn't
exist, rather than some other weird problem).

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:05 -04:00
Avery Pennarun 8a6bd21baf router_linux: extract process runner routines into runner.go.
These will probably be useful across platforms. They're not really
Linux-specific at all.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:05 -04:00
Avery Pennarun 34c30eaea0 router_linux: use only baseline 'ip rule' features that exist in old kernels.
This removes the use of suppress_ifgroup and fwmark "x/y" notation,
which are, among other things, not available in busybox and centos6.

We also use the return codes from the 'ip' program instead of trying to
parse its output.

I also had to remove the previous hack that routed all of 100.64.0.0/10
by default, because that would add the /10 route into the 'main' route
table instead of the new table 88, which is no good. It was a terrible
hack anyway; if we wanted to capture that route, we should have
captured it explicitly as a subnet route, not as part of the addr. Note
however that this change affects all platforms, so hopefully there
won't be any surprises elsewhere.

Fixes #405
Updates #320, #144

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:07:39 -04:00
Avery Pennarun 85d93fc4e3 cmd/tailscale: make ip_forward warnings more actionable.
Let's actually list the file we checked
(/proc/sys/net/ipv4/ip_forward). That gives the admin something
specific to look for when they get this message.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:07:39 -04:00
Avery Pennarun 99aa33469e cmd/tailscale: be quiet when no interaction or errors are needed.
We would print a message about "nothing more to do", which some people
thought was an error or warning. Let's only print a message after
authenticating if we previously asked for interaction, and let's
shorten that message to just "Success," which is what it means.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:07:39 -04:00
Avery Pennarun 30e5c19214 magicsock: work around race condition initializing .Regions[].
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 03:42:03 -04:00
Avery Pennarun 7cd9ff3dde net/netcheck: fix race condition initializting RegionLatency maps.
Under some conditions, code would try to look things up in the maps
before the first call to updateLatency. I don't see any reason to delay
initialization of the maps, so let's just init them right away when
creating the Report instance.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 03:41:37 -04:00
Avery Pennarun 5eb09c8f5e filch_test: clarify the use of os.RemoveAll().
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-27 18:50:44 -04:00
Brad Fitzpatrick afb9c6a6ab control/controlclient: use "getprop net.hostname" for Android hostname
Updates #409
2020-05-27 12:50:41 -07:00
David Anderson 2b74236567 ipn: move e2e_test back to corp repo.
It depends on corp things, so can't run here anyway.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-27 19:23:36 +00:00
David Anderson 557b310e67 control/controlclient: move auto_test back to corp repo.
It can't run without corp stuff anyway, and makes it harder to
refactor the control server.
2020-05-27 19:08:21 +00:00
Dmytro Shynkevych 737124ef70 tstun: tolerate zero reads
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-27 14:32:09 -04:00
David Anderson 7317e73bf4 control/controlclient: move direct_test back to corp repo.
It can only be built with corp deps anyway, and having it split
from the control code makes our lives harder.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-27 17:00:23 +00:00
Dmytro Shynkevych 7508b67c54 cmd/tailscale: expose --enable-derp
Signed-off-by: Dmytro Shynkevych <dm.shynk@gmail.com>
2020-05-26 21:38:26 -04:00
Brad Fitzpatrick 703d789005 tailcfg: add MapResponse.Debug mechanism to trigger logging heap pprof
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-25 15:22:13 -07:00
Brad Fitzpatrick b0c10fa610 stun, netcheck: move under net 2020-05-25 09:18:24 -07:00
Brad Fitzpatrick 43ded2b581 wgengine/packet: add some tests, more docs, minor Go style, performance changes 2020-05-25 08:58:10 -07:00
Brad Fitzpatrick 3f4a567032 types/strbuilder: add a variant of strings.Builder that uses sync.Pool
... and thus does not need to worry about when it escapes into
unprovable fmt interface{} land.

Also, add some convenience methods for efficiently writing integers.
2020-05-25 08:50:48 -07:00
Brad Fitzpatrick e6b84f2159 all: make client use server-provided DERP map, add DERP region support
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.

And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)

This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.

Also fix some test flakes & races.

Fixes #387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes #399 (wgengine: flaky tests)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-23 22:31:59 -07:00
David Anderson e8b3a5e7a1 wgengine/filter: implement a destination IP pre-filter.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-22 17:03:30 +00:00
Brad Fitzpatrick 35a8586f7e go.sum: go mod tidy 2020-05-22 09:07:02 -07:00
Avery Pennarun 3ed2124356 ipn: Resolve some resource leaks in test.
Updates tailscale/corp#255.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-21 16:37:25 -04:00
Avery Pennarun ea8f92b312 ipn/local: get rid of some straggling calls to the log module.
Use b.logf() instead.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-21 15:51:41 -04:00
Avery Pennarun af9328c1b7 log rate limiting: reformat limiter messages, and use nonempty burst size.
- Reformat the warning about a message being rate limited to print the
  format string, rather than the formatted message. This helps give a
  clue what "type" of message is being limited.

- Change the rate limit warning to be [RATE LIMITED] in all caps. This
  uses less space on each line, plus is more noticeable.

- In tailscaled, change the frequency to be less often (once every 5
  seconds per format string) but to allow bursts of up to 5 messages.
  This greatly reduces the number of messages that are rate limited
  during startup, but allows us to tighten the limit even further during
  normal runtime.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-20 11:59:21 -04:00
Avery Pennarun f2db4ac277 cmd/tailscaled: SetGCPercent() if GOGC is not set.
This cuts RSS from ~30MB to ~20MB on my machine, after the previous fix
to get rid of unnecessary zstd buffers.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-20 11:40:50 -04:00
Avery Pennarun db051fb013 ipnserver and logpolicy: configure zstd with low-memory settings.
The compressed blobs we send back and forth are small and infrequent,
which doesn't justify the 8MB * GOMAXPROCS memory that was being
allocated. This was the overwhelming majority of memory use in
tailscaled. On my system it goes from ~100M RSS to ~15M RSS (which is
still suspiciously high, but we can worry about that more later).

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-20 11:23:26 -04:00
Avery Pennarun d074ec6571 cmd/tailscaled: eliminate unnecessary use of an init() function.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-20 11:23:26 -04:00
Avery Pennarun c5fcc38bf1 controlclient tests: fix more memory leaks and add resource checking.
I can now run these tests with -count=1000 without running out of RAM.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-20 11:23:26 -04:00
Avery Pennarun d03de31404 controlclient/direct: fix a race condition accessing auth keys.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-19 03:02:09 -04:00
Avery Pennarun 1013cda799 controlclient/auto_test: don't print the s.control object.
This contains atomic ints that trigger a race check error if we access
them non-atomically.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-19 02:07:05 -04:00
Avery Pennarun 806de4ac94 portlist: fix "readdirent: no such file or directory" errors on Linux.
This could happen when a process disappeared while we were reading its
file descriptor list.

I was able to replicate the problem by running this in another
terminal:

    while :; do for i in $(seq 10); do
      /bin/true & done >&/dev/null; wait >&/dev/null;
    done

And then running the portlist tests thousands of times.

Fixes #339.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-19 01:51:21 -04:00
David Anderson c97c45f268 ipn: sprinkle documentation and clarity rewrites through LocalBackend.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-19 02:32:34 +00:00
David Anderson 39d20e8a75 go.mod: bump wireguard-go version. 2020-05-18 21:03:48 +00:00
David Anderson cd2f6679bb go.mod: bump wireguard-go version.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-15 22:29:27 +00:00
David Anderson 7fb33123d3 wgengine/router: warn about another variation of busybox's `ip`.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-15 20:48:25 +00:00
Wendi Yu bb55694c95
wgengine: log node IDs when peers are added/removed (#381)
Also stop logging data sent/received from nodes we're not connected to (ie all those `x`s being logged in the `peers: ` line)
Signed-off-by: Wendi <wendi.yu@yahoo.ca>
2020-05-15 14:13:44 -06:00
Dmytro Shynkevych 635f7b99f1 wgengine: pass tun.NativeDevice to router
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-15 10:11:56 -04:00
David Anderson 9c914dc7dd wgengine/router: stop using -m comment.
The comment module is compiled out on several embedded systems (and
also gentoo, because netfilter can't go brrrr with comments holding it
back). Attempting to use comments results in a confusing error, and a
non-functional firewall.

Additionally, make the legacy rule cleanup non-fatal, because we *do*
have to probe for the existence of these -m comment rules, and doing
so will error out on these systems.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-15 07:09:33 +00:00