Commit Graph

2981 Commits

Author SHA1 Message Date
Josh Bleecher Snyder f6e833748b wgengine: use mono.Time
Migrate wgengine to mono.Time for performance-sensitive call sites.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder 8a3d52e882 wgengine/magicsock: use mono.Time
magicsock makes multiple calls to Now per packet.
Move to mono.Now. Changing some of the calls to
use package mono has a cascading effect,
causing non-per-packet call sites to also switch.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder c2202cc27c net/tstun: use mono.Time
There's a call to Now once per packet.
Move to mono.Now.

Though the current implementation provides high precision,
we document it to be coarse, to preserve the ability
to switch to a coarse monotonic time later.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder 142670b8c2 tstime/mono: new package
Package mono provides a fast monotonic time.

Its primary advantage is that it is fast:
It is approximately twice as fast as time.Now.
This is because time.Now uses two clock calls,
one for wall time and one for monotonic time.

We ask for the current time 4-6 times per network packet.
At ~50ns per call to time.Now, that's enough to show
up in CPU profiles.

Package mono is a first step towards addressing that.
It is designed to be a near drop-in replacement for package time.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder 881bb8bcdc net/dns/resolver: allow an extra alloc for go closure allocation
Go 1.17 switches to a register ABI on amd64 platforms.
Part of that switch is that go and defer calls use an argument-less
closure, which allocates. This means that we have an extra
alloc in some DNS work. That's unfortunate but not a showstopper,
and I don't see a clear path to fixing it.
The other performance benefits from the register ABI will all
but certainly outweigh this extra alloc.

Fixes #2545

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:28 -07:00
Brad Fitzpatrick b6179b9e83 net/dnsfallback: add new nodes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-29 10:50:49 -07:00
Pratik 2d35737a7a
Dockerfile: remove extra COPY step (#2355)
Signed-off-by: pratikbalar <pratik@improwised.com>
2021-07-28 11:07:50 -07:00
Aaron Bieber c179b9b535 cmd/tsshd: switch from github.com/kr/pty to github.com/creack/pty
The kr/pty module moved to creack/pty per the kr/pty README[1].

creack/pty brings in support for a number of OS/arch combos that
are lacking in kr/pty.

Run `go mod tidy` while here.

[1] https://github.com/kr/pty/blob/master/README.md

Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
2021-07-28 09:14:47 -07:00
Brad Fitzpatrick 690ade4ee1 ipn/ipnlocal: add URL to IP forwarding error message
Updates #606

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-28 08:00:53 -07:00
David Crawshaw f414a9cc01 net/dns/resolver: EDNS OPT record off-by-one
I don't know how to get access to a real packet. Basing this commit
entirely off:

       +------------+--------------+------------------------------+
       | Field Name | Field Type   | Description                  |
       +------------+--------------+------------------------------+
       | NAME       | domain name  | MUST be 0 (root domain)      |
       | TYPE       | u_int16_t    | OPT (41)                     |
       | CLASS      | u_int16_t    | requestor's UDP payload size |
       | TTL        | u_int32_t    | extended RCODE and flags     |
       | RDLEN      | u_int16_t    | length of all RDATA          |
       | RDATA      | octet stream | {attribute,value} pairs      |
       +------------+--------------+------------------------------+

From https://datatracker.ietf.org/doc/html/rfc6891#section-6.1.2

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-07-27 16:39:27 -07:00
Josh Bleecher Snyder 1034b17bc7 net/tstun: buffer outbound channel
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.

Making outbound buffered increases throughput.

It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-27 15:54:34 -07:00
Josh Bleecher Snyder 965dccd4fc net/tstun: buffer outbound channel
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.

Making outbound buffered increases throughput.

It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-27 15:54:34 -07:00
Brad Fitzpatrick 7b9f02fcb1 cmd/tailscale/cli: document that empty string disable exit nodes, routes
Updates #2529

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-27 13:00:50 -07:00
Brad Fitzpatrick d8d9036dbb tailcfg: add Node.PrimaryRoutes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-27 12:09:40 -07:00
Brad Fitzpatrick 1b14e1d6bd version: bump date
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-27 08:05:17 -07:00
Denton Gentry bf7ad05230 VERSION.txt: this is v1.13.0.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-07-27 07:15:59 -07:00
Brad Fitzpatrick 68df379a7d net/portmapper: rename ErrGatewayNotFound to ErrGatewayRange, reword text
It confused & scared people. And it was just bad.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 20:30:28 -07:00
Brad Fitzpatrick aaf2df7ab1 net/{dnscache,interfaces}: use netaddr.IP.IsPrivate, delete copied code
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 20:30:28 -07:00
Christine Dodrill dde8e28f00 disable vm tests on every commit to main
This experiment apparently failed.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-26 16:42:56 -07:00
Brad Fitzpatrick c17d743886 net/dnscache: update a comment
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 16:16:08 -07:00
Brad Fitzpatrick 281d503626 net/dnscache: make Dialer try all resolved IPs
Tested manually with:

$ go test -v ./net/dnscache/ -dial-test=bogusplane.dev.tailscale.com:80

Where bogusplane has three A records, only one of which works.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 15:44:32 -07:00
Brad Fitzpatrick dfa5e38fad control/controlclient: report whether we're in a snap package
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 15:16:40 -07:00
Brad Fitzpatrick e299300b48 net/dnscache: cache all IPs per hostname
Not yet used in the dialer, but plumbed around.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 12:27:46 -07:00
Brad Fitzpatrick 7428ecfebd ipn/ipnlocal: populate Hostinfo.Package on Android
Fixes tailscale/corp#2266

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 10:35:37 -07:00
Brad Fitzpatrick 5c266bdb73 wgengine: re-set DNS config on Linux after a major link change
Updates #2458 (maybe fixes it)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 08:01:27 -07:00
julianknodt 3377089583 tsweb: add float64 to logged metrics
A previously added metric which was float64 was being ignored in tsweb, because it previously
only accepted int64 and ints. It can be handled in the same way as ints.

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-07-25 21:02:36 -07:00
Brad Fitzpatrick 53a2f63658 net/dns/resolver: race well-known resolvers less aggressively
Instead of blasting away at all upstream resolvers at the same time,
make a timing plan upon reconfiguration and have each upstream have an
associated start delay, depending on the overall forwarding config.

So now if you have two or four upstream Google or Cloudflare DNS
servers (e.g. two IPv4 and two IPv6), we now usually only send a
query, not four.

This is especially nice on iOS where we start fewer DoH queries and
thus fewer HTTP/1 requests (because we still disable HTTP/2 on iOS),
fewer sockets, fewer goroutines, and fewer associated HTTP buffers,
etc, saving overall memory burstiness.

Fixes #2436
Updates tailscale/corp#2250
Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-25 20:45:47 -07:00
Brad Fitzpatrick e94ec448a7 net/dns/resolver: add forwardQuery type as race work prep
Add a place to hang state in a future change for #2436.
For now this just simplifies the send signature without
any functional change.

Updates #2436

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-25 15:43:49 -07:00
Brad Fitzpatrick 064b916b1a net/dns/resolver: fix func used as netaddr.IP in printf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-25 15:21:51 -07:00
Joe Tsai d145c594ad
util/deephash: improve cycle detection (#2470)
The previous algorithm used a map of all visited pointers.
The strength of this approach is that it quickly prunes any nodes
that we have ever visited before. The detriment of the approach
is that pruning is heavily dependent on the order that pointers
were visited. This is especially relevant for hashing a map
where map entries are visited in a non-deterministic manner,
which would cause the map hash to be non-deterministic
(which defeats the point of a hash).

This new algorithm uses a stack of all visited pointers,
similar to how github.com/google/go-cmp performs cycle detection.
When we visit a pointer, we push it onto the stack, and when
we leave a pointer, we pop it from the stack.
Before visiting a pointer, we first check whether the pointer exists
anywhere in the stack. If yes, then we prune the node.
The detriment of this approach is that we may hash a node more often
than before since we do not prune as aggressively.

The set of visited pointers up until any node is only the
path of nodes up to that node and not any other pointers
that may have been visited elsewhere. This provides us
deterministic hashing regardless of visit order.
We can now delete hashMapFallback and associated complexity,
which only exists because the previous approach was non-deterministic
in the presence of cycles.

This fixes a failure of the old algorithm where obviously different
values are treated as equal because the pruning was too aggresive.
See https://github.com/tailscale/tailscale/issues/2443#issuecomment-883653534

The new algorithm is slightly slower since it prunes less aggresively:
	name              old time/op    new time/op    delta
	Hash-8              66.1µs ± 1%    68.8µs ± 1%   +4.09%        (p=0.000 n=19+19)
	HashMapAcyclic-8    63.0µs ± 1%    62.5µs ± 1%   -0.76%        (p=0.000 n=18+19)
	TailcfgNode-8       9.79µs ± 2%    9.88µs ± 1%   +0.95%        (p=0.000 n=19+17)
	HashArray-8          643ns ± 1%     653ns ± 1%   +1.64%        (p=0.000 n=19+19)
However, a slower but more correct algorithm seems
more favorable than a faster but incorrect algorithm.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-07-22 15:22:48 -07:00
Brad Fitzpatrick 7b295f3d21 net/portmapper: disable UPnP on iOS for now
Updates #2495

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 13:33:38 -07:00
Brad Fitzpatrick 4a2c3e2a0a control/controlclient: grow goroutine debug buffer as needed
To not allocate 1MB up front on iOS.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 13:18:05 -07:00
Brad Fitzpatrick 1986d071c3 control/controlclient: don't use regexp in goroutine stack scrubbing
To reduce binary size on iOS.

Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 13:18:05 -07:00
Christine Dodrill 60f34c70a2
tstest/integration/vms: disable rDNS for sshd on centos (#2492)
This prevents centos tests from timing out because sshd does reverse dns
lookups on every session being established instead of doing it once on
the acutal ssh connection being established. This is odd. Appending this
to the sshd config and restarting it seems to fix it though.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-22 15:24:52 -04:00
Christine Dodrill 8db26a2261
tstest/integration/vms: disable nixos unstable (#2491)
cloud-init broke with the upgrade to python 3.9:
https://github.com/NixOS/nixpkgs/issues/131098

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-22 15:16:11 -04:00
Brad Fitzpatrick cecfc14875 net/dns: don't build init*.go on non-windows
To remove the regexp dep on iOS, notably.

Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 11:58:42 -07:00
Brad Fitzpatrick 2968893add net/dns/resolver: bound DoH usage on iOS
Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 10:54:24 -07:00
Brad Fitzpatrick 95a9adbb97 wgengine/netstack: implement UDP relaying to advertised subnets
TCP was done in 662fbd4a09.

This does the same for UDP.

Tested by hand. Integration tests will have to come later. I'd wanted
to do it in this commit, but the SOCKS5 server needed for interop
testing between two userspace nodes doesn't yet support UDP and I
didn't want to invent some whole new userspace packet injection
interface at this point, as SOCKS seems like a better route, but
that's its own bug.

Fixes #2302

RELNOTE=netstack mode can now UDP relay to subnets

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 22:32:26 -07:00
Brad Fitzpatrick 3daf27eaad net/dns/resolver: fall back to IPv6 for well-known DoH servers if v4 fails
Should help with IPv6-only environments when the tailnet admin
only specified IPv4 DNS IPs.

See https://github.com/tailscale/tailscale/issues/2447#issuecomment-884188562

Co-Author: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 12:45:25 -07:00
Brad Fitzpatrick 74eee4de1c net/dns/resolver: use correct Cloudflare DoH hostnames
We were using the wrong ones for the malware & adult content
variants. Docs:

https://developers.cloudflare.com/1.1.1.1/1.1.1.1-for-families/setup-instructions/dns-over-https

Earlier commit which added them:
236eb4d04d

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 12:24:36 -07:00
Joe Tsai d666bd8533
util/deephash: disambiguate hashing of AppendTo (#2483)
Prepend size to AppendTo output.

Fixes #2443

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-07-21 11:29:08 -07:00
Joe Tsai 23ad028414
util/deephash: include type as part of hash for interfaces (#2476)
A Go interface may hold any number of different concrete types.
Just because two underlying values hash to the same thing
does not mean the two values are identical if they have different
concrete types. As such, include the type in the hash.
2021-07-21 10:26:04 -07:00
julianknodt 3a4201e773 net/portmapper: return correct upnp port
Previously, this was incorrectly returning the internal port, and using that with the external
exposed IP when it did not use WANIPConnection2. In the case when we must provide a port, we
return it instead.

Noticed this while implementing the integration test for upnp.

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-07-21 10:11:47 -07:00
Joe Tsai a5fb8e0731
util/deephash: introduce deliberate instability (#2477)
Seed the hash upon first use with the current time.
This ensures that the stability of the hash is bounded within
the lifetime of one program execution.
Hopefully, this prevents future bugs where someone assumes that
this hash is stable.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-07-21 09:23:04 -07:00
Brad Fitzpatrick ecac74bb65 wgengine/netstack: fix doc comment
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 08:25:05 -07:00
Brad Fitzpatrick e4fecfe31d wgengine/{monitor,router}: restore Linux ip rules when systemd deletes them
Thanks.

Fixes #1591

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-20 15:52:22 -07:00
Josh Bleecher Snyder 0aa77ba80f tstest/integration: fix filch test flake
Filch doesn't like having multiple processes competing
for the same log files (#937).

Parallel integration tests were all using the same log files.

Add a TS_LOGS_DIR env var that the integration test can use
to use separate log files per test.

Fixes #2269

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-20 14:16:28 -07:00
Brad Fitzpatrick ed8587f90d wgengine/router: take a link monitor
Prep for #1591 which will need to make Linux's router react to changes
that the link monitor observes.

The router package already depended on the monitor package
transitively. Now it's explicit.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-20 13:43:40 -07:00
Josh Bleecher Snyder 24db1a3c9b safesocket: print full lsof command on failure
This makes it easier to manually run the command
to discover why it is failing.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-20 13:35:31 -07:00
Josh Bleecher Snyder 130c5e727b safesocket: reduce log spam while running integration tests
Instead of logging lsof execution failures to stdout,
incorporate them into the returned error.

While we're here, make it clear that the file
success case always returns a nil error.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-20 13:35:31 -07:00