Commit Graph

1033 Commits

Author SHA1 Message Date
Brad Fitzpatrick a5d6c9d616 net/netns: optimize defaultRouteInterface a bit
It'll be called a bunch, so worth a bit of effort. Could go further, but not yet.
(really, should hook into wgengine/monitor and only re-read on netlink changes?)

name                     old time/op    new time/op    delta
DefaultRouteInterface-8    60.8µs ±11%    44.6µs ± 5%  -26.65%  (p=0.000 n=20+19)

name                     old alloc/op   new alloc/op   delta
DefaultRouteInterface-8    3.29kB ± 0%    0.55kB ± 0%  -83.21%  (p=0.000 n=20+20)

name                     old allocs/op  new allocs/op  delta
DefaultRouteInterface-8      9.00 ± 0%      6.00 ± 0%  -33.33%  (p=0.000 n=20+20)
2020-05-31 15:37:09 -07:00
Brad Fitzpatrick 9e5d79e2f1 wgengine/magicsock: drop a bytes.Buffer sync.Pool, use logger.ArgWriter instead 2020-05-31 15:29:04 -07:00
Brad Fitzpatrick becce82246 net/netns, misc tests: remove TestOnlySkipPrivilegedOps, argv checks
The netns UID check is sufficient for now. We can do something else
later if/when needed.
2020-05-31 14:40:18 -07:00
Brad Fitzpatrick 7a410f9236 net/netns: unindent, refactor to remove some redunant code
Also:
* always error on Control failing. That's very unexpected.
* pull out sockopt funcs into their own funcs for easier future testing
2020-05-31 14:29:54 -07:00
Brad Fitzpatrick 45b139d338 net/netns: remove redundant build tag
Filename is sufficient.
2020-05-31 14:05:54 -07:00
Brad Fitzpatrick dcd7a118d3 net/netns: add a test that tailscaleBypassMark stays in sync between packages 2020-05-31 14:02:13 -07:00
Brad Fitzpatrick 1e837b8e81 net/netns: refactor the sync.Once usage a bit 2020-05-31 14:01:20 -07:00
Avery Pennarun e7ae6a2e06 net/netns, wgengine/router: support Linux machines that don't have 'ip rule'.
We'll use SO_BINDTODEVICE instead of fancy policy routing. This has
some limitations: for example, we will route all traffic through the
interface that has the main "default" (0.0.0.0/0) route, so machines
that have multiple physical interfaces might have to go through DERP to
get to some peers. But machines with multiple physical interfaces are
very likely to have policy routing (ip rule) support anyway.

So far, the only OS I know of that needs this feature is ChromeOS
(crostini). Fixes #245.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-31 04:31:01 -04:00
Avery Pennarun 8575b21ca8 Merge branch 'master' of github.com:tailscale/tailscale
* 'master' of github.com:tailscale/tailscale:
  tailcfg: remove unused, unimplemented DERPNode.CertFingerprint for now
  net/netns: also don't err on tailscaled -fake as a regular user
  net/netcheck: fix HTTPS fallback bug from earlier today
  net/netns: don't return an error if we're not root and running the tailscale binary
2020-05-31 03:05:51 -04:00
Avery Pennarun e46238a2af wgengine: separately dedupe wireguard configs and router configs.
Otherwise iOS/macOS will reconfigure their routing every time anything
minor changes in the netmap (in particular, endpoints and DERP homes),
which is way too often.

Some users reported "network reconfigured" errors from Chrome when this
happens.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-31 02:37:58 -04:00
Avery Pennarun f0b6ba78e8 wgengine: don't pass nil router.Config objects.
These are hard for swift to decode in the iOS app.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-31 02:37:22 -04:00
Brad Fitzpatrick 096d7a50ff tailcfg: remove unused, unimplemented DERPNode.CertFingerprint for now 2020-05-30 20:44:18 -07:00
Brad Fitzpatrick 765695eaa2 net/netns: also don't err on tailscaled -fake as a regular user
That's one of my dev workflows.
2020-05-29 22:40:26 -07:00
Brad Fitzpatrick 7f68e097dd net/netcheck: fix HTTPS fallback bug from earlier today
My earlier 3fa58303d0 tried to implement
the net/http.Tranhsport.DialTLSContext hook, but I didn't return a
*tls.Conn, so we ended up sending a plaintext HTTP request to an HTTPS
port. The response ended up being Go telling as such, not the
/derp/latency-check handler's response (which is currently still a
404). But we didn't even get the 404.

This happened to work well enough because Go's built-in error response
was still a valid HTTP response that we can measure for timing
purposes, but it's not a great answer. Notably, it means we wouldn't
be able to get a future handler to run server-side and count those
latency requests.
2020-05-29 22:33:08 -07:00
Brad Fitzpatrick 1407540b52 net/netns: don't return an error if we're not root and running the tailscale binary
tailscale netcheck was broken otherwise.

We can fix this a better way later; I'm just fixing a regression in
some way because I'm trying to work on netcheck at the moment.
2020-05-29 21:58:31 -07:00
David Anderson 5114df415e net/netns: set the bypass socket mark on linux.
This allows tailscaled's own traffic to bypass Tailscale-managed routes,
so that things like tailscale-provided default routes don't break
tailscaled itself.

Progress on #144.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-29 15:16:58 -07:00
Brad Fitzpatrick 3fa58303d0 netcheck: address some HTTP fallback measurement TODOs 2020-05-29 13:34:09 -07:00
Brad Fitzpatrick db2a216561 wgengine/magicsock: don't log on UDP send errors if address family known missing
Fixes #376
2020-05-29 12:41:30 -07:00
Brad Fitzpatrick d3134ad0c8 syncs: add AtomicBool 2020-05-29 12:41:30 -07:00
Brad Fitzpatrick 7247e896b5 net/netcheck: add Report.IPv4 and another TODO 2020-05-29 12:41:30 -07:00
Brad Fitzpatrick dd6b96ba68 types/logger: add TS_DEBUG_LOG_RATE knob to easily turn off rate limiting 2020-05-29 12:41:29 -07:00
David Crawshaw cf5d25e15b wgengine: ensure pingers are gone before returning from Close
We canceled the pingers in Close, but didn't wait around for their
goroutines to be cleaned up. This caused the ipn/e2e_test to catch
pingers in its resource leak check.

This commit introduces an object, but also simplifies the semantics
around the pinger's cancel functions. They no longer need to be called
while holding the mutex.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-05-30 05:30:26 +10:00
Brad Fitzpatrick 004780b312 ipn: restore LiveDERPs assignment in LocalBackend.parseWgStatus
Updates #421 (likely fixes it; need to do an iOS build to be sure)
2020-05-29 09:53:04 -07:00
David Anderson 03682cb271 control/controlclient: use netns package to dial connections.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-29 00:06:08 +00:00
David Anderson 1617a232e1 logpolicy: remove deprecated DualStack directive.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-29 00:04:28 +00:00
David Anderson a6bd3a7e53 logpolicy: use netns for dialing log.tailscale.io. 2020-05-28 23:53:19 +00:00
David Anderson e9f7d01b91 derp/derphttp: make DERP client use netns for dial-outs. 2020-05-28 23:48:08 +00:00
Brad Fitzpatrick 9e3ad4f79f net/netns: add package for start of network namespace support
And plumb in netcheck STUN packets.

TODO: derphttp, logs, control.

Updates #144

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-28 16:20:16 -07:00
Brad Fitzpatrick a428656280 wgengine/magicsock: don't report v4 localhost addresses on IPv6-only systems
Updates #376
2020-05-28 14:16:23 -07:00
David Anderson fff062b461 wgengine/router: make runner.go linux-only for now.
Otherwise, staticcheck complains that these functions are unused
and unexported on macOS.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-28 12:19:01 -07:00
Brad Fitzpatrick f0204098d8 Revert "control/controlclient: use "getprop net.hostname" for Android hostname"
This reverts commit afb9c6a6ab.

Doesn't work. See:

    https://github.com/tailscale/tailscale/issues/409#issuecomment-635241550

Looks pretty dire:

    https://medium.com/capital-one-tech/how-to-get-an-android-device-nickname-d5eab12f4ced

Updates #409
2020-05-28 10:50:11 -07:00
Brad Fitzpatrick 0245bbe97b Make netcheck handle v6-only interfaces better, faster.
Also:

* add -verbose flag to cmd/tailscale netcheck
* remove some API from the interfaces package
* convert some of the interfaces package to netaddr.IP
* don't even send IPv4 probes on machines with no IPv4 (or only v4
  loopback)
* and once three regions have replied, stop waiting for other probes
  at 2x the slowest duration.

Updates #376
2020-05-28 10:04:20 -07:00
Brad Fitzpatrick c5495288a6 Bump inet.af/netaddr dep for FromStdIP behavior change I want to depend on. 2020-05-28 09:34:41 -07:00
Brad Fitzpatrick 9bbcdba2b3 tempfork/internal/testenv: remove
It was for our x509 fork and no longer needed. (x509 changes
went into our Go fork instead)
2020-05-28 09:34:22 -07:00
Brad Fitzpatrick a96165679c cmd/tailscale: add netcheck flags for incremental reports, JSON output 2020-05-28 08:28:04 -07:00
Avery Pennarun f69003fd46 router_linux: work around terrible bugs in old iptables-compat versions.
Specifically, this sequence:
	iptables -N ts-forward
	iptables -A ts-forward -m mark --mark 0x10000 -j ACCEPT
	iptables -A FORWARD -j ts-forward
doesn't work on Debian-9-using-nftables, but this sequence:
	iptables -N ts-forward
	iptables -A FORWARD -j ts-forward
	iptables -A ts-forward -m mark --mark 0x10000 -j ACCEPT
does work.

I'm sure the reason why is totally fascinating, but it's an old version
of iptables and the bug doesn't seem to exist on modern nftables, so
let's refactor our code to add rules in the always-safe order and
pretend this never happened.

Fixes #401.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:06 -04:00
Avery Pennarun 9ff51909a3 router_linux: fix behaviour when switching --netfilter-mode.
On startup, and when switching into =off and =nodivert, we were
deleting netfilter rules even if we weren't the ones that added them.

In order to avoid interfering with rules added by the sysadmin, we have
to be sure to delete rules only in the case that we added them in the
first place.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:05 -04:00
Avery Pennarun a496cdc943 router_linux: remove need for iptables.ListChains().
Instead of retrieving the list of chains, or the list of rules in a
chain, just try deleting the ones we don't want and then adding the
ones we do want. An error in flushing/deleting still means the rule
doesn't exist anymore, so there was no need to check for it first.

This avoids the need to parse iptables output, which avoids the need to
ever call iptables -S, which fixes #403, among other things. It's also
much more future proof in case the iptables command line changes.

Unfortunately the iptables go module doesn't properly pass the iptables
command exit code back up when doing .Delete(), so we can't correctly
check the exit code there. (exit code 1 really means the rule didn't
exist, rather than some other weird problem).

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:05 -04:00
Avery Pennarun 8a6bd21baf router_linux: extract process runner routines into runner.go.
These will probably be useful across platforms. They're not really
Linux-specific at all.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:15:05 -04:00
Avery Pennarun 34c30eaea0 router_linux: use only baseline 'ip rule' features that exist in old kernels.
This removes the use of suppress_ifgroup and fwmark "x/y" notation,
which are, among other things, not available in busybox and centos6.

We also use the return codes from the 'ip' program instead of trying to
parse its output.

I also had to remove the previous hack that routed all of 100.64.0.0/10
by default, because that would add the /10 route into the 'main' route
table instead of the new table 88, which is no good. It was a terrible
hack anyway; if we wanted to capture that route, we should have
captured it explicitly as a subnet route, not as part of the addr. Note
however that this change affects all platforms, so hopefully there
won't be any surprises elsewhere.

Fixes #405
Updates #320, #144

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:07:39 -04:00
Avery Pennarun 85d93fc4e3 cmd/tailscale: make ip_forward warnings more actionable.
Let's actually list the file we checked
(/proc/sys/net/ipv4/ip_forward). That gives the admin something
specific to look for when they get this message.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:07:39 -04:00
Avery Pennarun 99aa33469e cmd/tailscale: be quiet when no interaction or errors are needed.
We would print a message about "nothing more to do", which some people
thought was an error or warning. Let's only print a message after
authenticating if we previously asked for interaction, and let's
shorten that message to just "Success," which is what it means.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 07:07:39 -04:00
Avery Pennarun 30e5c19214 magicsock: work around race condition initializing .Regions[].
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 03:42:03 -04:00
Avery Pennarun 7cd9ff3dde net/netcheck: fix race condition initializting RegionLatency maps.
Under some conditions, code would try to look things up in the maps
before the first call to updateLatency. I don't see any reason to delay
initialization of the maps, so let's just init them right away when
creating the Report instance.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 03:41:37 -04:00
Avery Pennarun 5eb09c8f5e filch_test: clarify the use of os.RemoveAll().
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-27 18:50:44 -04:00
Brad Fitzpatrick afb9c6a6ab control/controlclient: use "getprop net.hostname" for Android hostname
Updates #409
2020-05-27 12:50:41 -07:00
David Anderson 2b74236567 ipn: move e2e_test back to corp repo.
It depends on corp things, so can't run here anyway.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-27 19:23:36 +00:00
David Anderson 557b310e67 control/controlclient: move auto_test back to corp repo.
It can't run without corp stuff anyway, and makes it harder to
refactor the control server.
2020-05-27 19:08:21 +00:00
Dmytro Shynkevych 737124ef70 tstun: tolerate zero reads
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-27 14:32:09 -04:00
David Anderson 7317e73bf4 control/controlclient: move direct_test back to corp repo.
It can only be built with corp deps anyway, and having it split
from the control code makes our lives harder.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-05-27 17:00:23 +00:00