magicsock makes multiple calls to Now per packet.
Move to mono.Now. Changing some of the calls to
use package mono has a cascading effect,
causing non-per-packet call sites to also switch.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Pull in the latest version of wireguard-windows.
Switch to upstream wireguard-go.
This requires reverting all of our import paths.
Unfortunately, this has to happen at the same time.
The wireguard-go change is very low risk,
as that commit matches our fork almost exactly.
(The only changes are import paths, CI files, and a go.mod entry.)
So if there are issues as a result of this commit,
the first place to look is wireguard-windows changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
magicsock.Conn.ParseEndpoint requires a peer's public key,
disco key, and legacy ip/ports in order to do its job.
We currently accomplish that by:
* adding the public key in our wireguard-go fork
* encoding the disco key as magic hostname
* using a bespoke comma-separated encoding
It's a bit messy.
Instead, switch to something simpler: use a json-encoded struct
containing exactly the information we need, in the form we use it.
Our wireguard-go fork still adds the public key to the
address when it passes it to ParseEndpoint, but now the code
compensating for that is just a couple of simple, well-commented lines.
Once this commit is in, we can remove that part of the fork
and remove the compensating code.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Fields rename only.
Part of the general effort to make our code agnostic about endpoint formatting.
It's just a name, but it will soon be a misleading one; be more generic.
Do this as a separate commit because it generates a lot of whitespace changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I missed some comments. Fix them.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
One of the consequences of the bind refactoring in 6f23087175
is that attempting to bind an IPv6 socket will always
result in c.pconn6.pconn being non-nil.
If the bind fails, it'll be set to a placeholder packet conn
that blocks forever.
As a result, we can always run ReceiveIPv6 and health check it.
This removes IPv4/IPv6 asymmetry and also will allow health checks
to detect any IPv6 receive func failures.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
It must be an IP address; enforce that at the type level.
Suggested-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
We had two separate code paths for the initial UDP listener bind
and any subsequent rebinds.
IPv6 got left out of the rebind code.
Rather than duplicate it there, unify the two code paths.
Then improve the resulting code:
* Rebind had nested listen attempts to try the user-specified port first,
and then fall back to :0 if that failed. Convert that into a loop.
* Initial bind tried only the user-specified port.
Rebind tried the user-specified port and 0.
But there are actually three ports of interest:
The one the user specified, the most recent port in use, and 0.
We now try all three in order, as appropriate.
* In the extremely rare case in which binding to port 0 fails,
use a dummy net.PacketConn whose reads block until close.
This will keep the wireguard-go receive func goroutine alive.
As a pleasant side-effect of this, if we decide that
we need to resuscitate #1796, it will now be much easier.
Fixes#1799
Co-authored-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Assume it'll stay at 0 forever, so hard-code it
and delete code conditional on it being non-0.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
It was set to context.Background by all callers, for the same reasons.
Set it locally instead, to simplify call sites.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
The old implementation knew too much about how wireguard-go worked.
As a result, it missed genuine problems that occurred due to unrelated bugs.
This fourth attempt to fix the health checks takes a black box approach.
A receive func is healthy if one (or both) of these conditions holds:
* It is currently running and blocked.
* It has been executed recently.
The second condition is required because receive functions
are not continuously executing. wireguard-go calls them and then
processes their results before calling them again.
There is a theoretical false positive if wireguard-go go takes
longer than one minute to process the results of a receive func execution.
If that happens, we have other problems.
Updates #1790
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
They were not doing their job.
They need yet another conceptual re-think.
Start by clearing the decks.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
The existing implementation was completely, embarrassingly conceptually broken.
We aren't able to see whether wireguard-go's receive function goroutines
are running or not. All we can do is model that based on what we have done.
This commit fixes that model.
Fixes#1781
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Avery reported a sub-ms health transition from "receiveIPv4 not running" to "ok".
To avoid these transient false-positives, be more precise about
the expected lifetime of receive funcs. The problematic case is one in which
they were started but exited prior to a call to connBind.Close.
Explicitly represent started vs running state, taking care with the order of updates.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
We were accidentally logging oldPort -> oldPort.
Log oldPort as well as c.port; if we failed to get the preferred port
in a previous rebind, oldPort might differ from c.port.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Track endpoints internally with a new tailcfg.Endpoint type that
includes a typed netaddr.IPPort (instead of just a string) and
includes a type for how that endpoint was discovered (STUN, local,
etc).
Use []tailcfg.Endpoint instead of []string internally.
At the last second, send it to the control server as the existing
[]string for endpoints, but also include a new parallel
MapRequest.EndpointType []tailcfg.EndpointType, so the control server
can start filtering out less-important endpoint changes from
new-enough clients. Notably, STUN-discovered endpoints can be filtered
out from 1.6+ clients, as they can discover them amongst each other
via CallMeMaybe disco exchanges started over DERP. And STUN endpoints
change a lot, causing a lot of MapResposne updates. But portmapped
endpoints are worth keeping for now, as they they work right away
without requiring the firewall traversal extra RTT dance.
End result will be less control->client bandwidth. (despite negligible
increase in client->control bandwidth)
Updates tailscale/corp#1543
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It existed to work around the frequent opening and closing
of the conn.Bind done by wireguard-go.
The preceding commit removed that behavior,
so we can simply close the connections
when we are done with them.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Upstream wireguard-go has changed its receive model.
NewDevice now accepts a conn.Bind interface.
The conn.Bind is stateless; magicsock.Conns are stateful.
To work around this, we add a connBind type that supports
cheap teardown and bring-up, backed by a Conn.
The new conn.Bind allows us to specify a set of receive functions,
rather than having to shoehorn everything into ReceiveIPv4 and ReceiveIPv6.
This lets us plumbing DERP messages directly into wireguard-go,
instead of having to mux them via ReceiveIPv4.
One consequence of the new conn.Bind layer is that
closing the wireguard-go device is now indistinguishable
from the routine bring-up and tear-down normally experienced
by a conn.Bind. We thus have to explicitly close the magicsock.Conn
when the close the wireguard-go device.
One downside of this change is that we are reliant on wireguard-go
to call receiveDERP to process DERP messages. This is fine for now,
but is perhaps something we should fix in the future.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
e.g.
$ tailscale ping 1.1.1.1
exit node found but not enabled
$ tailscale ping 10.2.200.2
node "tsbfvlan2" found, but not using its 10.2.200.0/24 route
$ sudo tailscale up --accept-routes
$ tailscale ping 10.2.200.2
pong from tsbfvlan2 (100.124.196.94) via 10.2.200.34:41641 in 1ms
$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 83ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 21ms
pong from monitoring (100.88.178.64) via [2604:a880:4:d1::37:d001]:41641 in 22ms
This necessarily moves code up from magicsock to wgengine, so we can
look at the actual wireguard config.
Fixes#1564
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No server support yet, but we want Tailscale 1.6 clients to be able to respond
to them when the server can do it.
Updates #1310
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There was a logical race where Conn.Rebind could acquire the
RebindingUDPConn mutex, close the connection, fail to rebind, release
the mutex, and then because the mutex was no longer held, ReceiveIPv4
wouldn't retry reads that failed with net.ErrClosed, letting that
error back to wireguard-go, which would then stop running that receive
IP goroutine.
Instead, keep the RebindingUDPConn mutex held for the entirety of the
replacement in all cases.
Updates tailscale/corp#1289
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Also change the type to netaddr.IP while here, because it made sorting
easier.
Updates tailscale/corp#1397
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>