Context: https://github.com/tailscale/tailscale/pull/5588#issuecomment-1260655929
It seems that if the interface at index 1 is down, the rule is not installed. As such,
we increase the range we detect up to 2004 in the hope that at least one of the interfaces
1-4 will be up.
Signed-off-by: Tom DNetto <tom@tailscale.com>
This fixes a race condition which caused `c.muCond.Broadcast()` to
never fire in the `firstDerp` if block. It resulted in `Close()`
hanging forever.
Signed-off-by: Kyle Carberry <kyle@carberry.com>
As the comment in the code says, netstack should always respond to ICMP
echo requests to a 4via6 address, even if the netstack instance isn't
normally processing subnet traffic.
Follow-up to #5709
Change-Id: I504d0776c5824071b2a2e0e687bc33e24f6c4746
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
From 5c42990c2f, not yet released in a stable build.
Caught by existing tests.
Fixes #5685
Change-Id: Ia76bb328809d9644e8b96910767facf627830600
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Baby steps towards turning off heartbeat pings entirely as per #540.
This doesn't change any current magicsock functionality and requires additional
changes to send/disco paths before the flag can be turned on.
Updates #540
Change-Id: Idc9a72748e74145b068d67e6dd4a4ffe3932efd0
Signed-off-by: Jenny Zhang <jz@tailscale.com>
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
Reference: https://golang.org/doc/go1.16#ioutil
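As an illustration of the kind of substitution this involves, here is a minimal sketch. The helper names (`roundTrip`, `readAll`) are made up for the example and are not taken from the commit; the standard-library calls are the Go 1.16 replacements for the corresponding io/ioutil functions.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a file in a fresh temporary directory and reads
// it back, using only the Go 1.16+ replacements for io/ioutil.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-demo") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.txt")
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// readAll drains a reader into a string.
func readAll(s string) (string, error) {
	b, err := io.ReadAll(strings.NewReader(s)) // was ioutil.ReadAll
	return string(b), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)

	// io.Discard replaces ioutil.Discard.
	n, err := io.Copy(io.Discard, strings.NewReader("ignored"))
	fmt.Println(n, err)
}
```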
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
This change masks the bitspace used when setting and querying the fwmark on packets. This allows
tailscaled to play nicer with other networking software on the host, assuming the other networking
software is also using fwmarks & a different mask.
IPTables / mark module has always supported masks, so this is safe on the netfilter front.
However, busybox only gained support for parsing + setting masks in 1.33.0, so we make sure we
aren't running an older version before we add the "/<mask>" syntax to an ip rule command.
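A rough sketch of what masking the mark means in practice. The constant values and function names below are hypothetical, chosen only for illustration; they are not claimed to be the values tailscaled uses.

```go
package main

import "fmt"

const (
	// Hypothetical values: the bits this program owns, and the value it
	// stores inside those bits.
	fwmarkMask   uint32 = 0x00ff0000
	fwmarkBypass uint32 = 0x00080000
)

// setMark writes value into the masked bits of mark and leaves every other
// bit (possibly owned by other software) untouched.
func setMark(mark, value, mask uint32) uint32 {
	return (mark &^ mask) | (value & mask)
}

// matchMark reports whether the masked bits of mark equal value.
func matchMark(mark, value, mask uint32) bool {
	return mark&mask == value&mask
}

// ruleArg formats the "<value>/<mask>" form used with a masked fwmark match.
func ruleArg(value, mask uint32) string {
	return fmt.Sprintf("0x%x/0x%x", value, mask)
}

func main() {
	other := uint32(0x1) // a mark bit set by some other software
	m := setMark(other, fwmarkBypass, fwmarkMask)
	fmt.Printf("mark=0x%x match=%v arg=%s\n",
		m, matchMark(m, fwmarkBypass, fwmarkMask), ruleArg(fwmarkBypass, fwmarkMask))
}
```

Without the mask, a match on the full 32-bit mark would fail as soon as another program set any bit of its own.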
Signed-off-by: Tom DNetto <tom@tailscale.com>
Fixes a panic in `(*magicsock.Conn).ServeHTTPDebug` when the
`recentPongs` ring buffer for an endpoint wraps around.
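The commit message does not say what the faulty index computation was. As a generic sketch (hypothetical `ring` type, not the magicsock code), walking a ring buffer from newest to oldest has to keep the index non-negative once the write position has wrapped:

```go
package main

import "fmt"

// ring is a hypothetical fixed-size ring buffer of recent values.
type ring struct {
	buf  [4]int
	next int // index where the next value will be written
	n    int // number of valid entries, at most len(buf)
}

// add stores v, overwriting the oldest entry once the buffer is full.
func (r *ring) add(v int) {
	r.buf[r.next] = v
	r.next = (r.next + 1) % len(r.buf)
	if r.n < len(r.buf) {
		r.n++
	}
}

// newestFirst returns the stored values from newest to oldest.
func (r *ring) newestFirst() []int {
	out := make([]int, 0, r.n)
	for i := 1; i <= r.n; i++ {
		// Adding len(r.buf) before the modulo keeps the index
		// non-negative when r.next-i would otherwise go below zero.
		idx := (r.next - i + len(r.buf)) % len(r.buf)
		out = append(out, r.buf[idx])
	}
	return out
}

func main() {
	var r ring
	for v := 1; v <= 6; v++ {
		r.add(v)
	}
	fmt.Println(r.newestFirst())
}
```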
Signed-off-by: Colin Adler <colin1adler@gmail.com>
If we accept a forwarded TCP connection before dialing, we can
erroneously signal to a client that we support IPv6 (or IPv4) without
that actually being possible. Instead, we only complete the client's TCP
handshake after we've dialed the outbound connection; if that fails, we
respond with a RST.
Updates #5425 (maybe fixes!)
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Incoming disco packets are now dropped unless they match one of the
current bound ports, or have a zero port*.
The BPF filter passes all packets with a disco header to the raw packet
sockets regardless of destination port (in order to avoid needing to
reconfigure BPF on rebind).
If a BPF enabled node has just rebound, due to restart or rebind, it may
receive and reply to disco ping packets destined for ports other than
those which are presently bound. If the pong is accepted, the pinging
node will now assume that it can send WireGuard traffic to the pinged
port - such traffic will not reach the node as it is not destined for a
bound port.
*The zero port is ignored, if received. This is a speculative defense
and would indicate a problem in the receive path, or the BPF filter.
This condition is allowed to pass as it may enable traffic to flow,
however it will also enable problems with the same symptoms this patch
otherwise fixes.
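The acceptance rule described above can be sketched as follows. The function name and signature are assumptions made for the example; the real check lives in the receive path and is not reproduced here.

```go
package main

import "fmt"

// acceptDiscoPort reports whether a disco packet addressed to dstPort should
// be processed, given the ports that are currently bound.
func acceptDiscoPort(dstPort uint16, bound ...uint16) bool {
	if dstPort == 0 {
		// A zero port is let through, per the note above; it would
		// indicate a problem in the receive path or the BPF filter.
		return true
	}
	for _, p := range bound {
		if p != 0 && p == dstPort {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(acceptDiscoPort(41641, 41641)) // currently bound port
	fmt.Println(acceptDiscoPort(12345, 41641)) // stale port from before a rebind
	fmt.Println(acceptDiscoPort(0, 41641))     // zero port
}
```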
Fixes #5536
Signed-off-by: James Tucker <james@tailscale.com>
1f959edeb0 introduced a regression for JS
where the initial bind no longer occurred at all.
The condition is moved deeper in the call tree to avoid proliferation of
higher level conditions.
Updates #5537
Signed-off-by: James Tucker <james@tailscale.com>
Both RebindingUDPConns now always exist. The initial bind (which now
just calls rebind) now ensures that bind is called for both, such that
they both at least contain a blockForeverConn. Calling code no longer
needs to assert their state.
Signed-off-by: James Tucker <james@tailscale.com>
This is entirely optional (i.e. failing in this code is non-fatal) and
only enabled on Linux for now. Additionally, this new behaviour can be
disabled by setting the TS_DEBUG_DISABLE_AF_PACKET environment variable.
Updates #3824
Replaces #5474
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: David Anderson <danderson@tailscale.com>
On sufficiently large tailnets, even writing the peer header (~95 bytes)
can result in a large amount of data that needs to be serialized and
deserialized. Only write headers for peers that need to have their
configuration changed.
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Avoid contention from fetching status for all peers, and instead fetch
status for a single peer.
Updates tailscale/coral#72
Signed-off-by: James Tucker <james@tailscale.com>
In addition to printing goroutine stacks, explicitly track all in-flight
operations and print them when the watchdog triggers (along with the
time they were started at). This should make debugging watchdog failures
easier, since we can look at the longest-running operation(s) first.
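A minimal sketch of this kind of tracking, assuming a hypothetical `tracker` type (none of these names come from the actual watchdog code):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// op is one in-flight operation. All names here are hypothetical.
type op struct {
	id    int
	name  string
	start time.Time
}

// tracker records operations that have begun but not yet finished.
type tracker struct {
	mu     sync.Mutex
	nextID int
	ops    map[int]op
}

// begin registers an operation and returns a func that marks it finished.
func (t *tracker) begin(name string) (done func()) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.ops == nil {
		t.ops = make(map[int]op)
	}
	id := t.nextID
	t.nextID++
	t.ops[id] = op{id: id, name: name, start: time.Now()}
	return func() {
		t.mu.Lock()
		defer t.mu.Unlock()
		delete(t.ops, id)
	}
}

// oldestFirst returns the in-flight operations, longest-running first.
func (t *tracker) oldestFirst() []op {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make([]op, 0, len(t.ops))
	for _, o := range t.ops {
		out = append(out, o)
	}
	sort.Slice(out, func(i, j int) bool {
		if !out[i].start.Equal(out[j].start) {
			return out[i].start.Before(out[j].start)
		}
		return out[i].id < out[j].id // tie-break by registration order
	})
	return out
}

func main() {
	var t tracker
	doneA := t.begin("reconfig")
	t.begin("set-dns")
	doneA()
	// What a watchdog might log when it fires.
	for _, o := range t.oldestFirst() {
		fmt.Printf("in flight: %s (started %s)\n", o.name, o.start.Format(time.RFC3339))
	}
}
```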
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
The Start method was removed in 4c27e2fa22, but the comment on NewConn
still mentioned it doesn't do anything until this method is called.
Signed-off-by: Kris Brandow <kris.brandow@gmail.com>
Hashing []any is slow since hashing of interfaces is slow.
Hashing of interfaces is slow since we pessimistically assume
that cycles can occur through them and start cycle tracking.
Drop the variadic signature of Update and fix callers to pass in
an anonymous struct so that we are hashing concrete types
near the root of the value tree.
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This adds a lighter mechanism for endpoint updates from control.
Change-Id: If169c26becb76d683e9877dc48cfb35f90cc5f24
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The iOS and macOS networking extension API only exposes a single setter
for the entire routing and DNS configuration, and does not appear to
do any kind of diffing or deltas when applying changes. This results
in spurious "network changed" errors in Chrome, even when the
`OneCGNATRoute` flag from df9ce972c7 is
used (because we're setting the same configuration repeatedly).
Since we already keep track of the current routing and DNS configuration
in CallbackRouter, use that to detect if they're actually changing, and
only invoke the platform setter if it's actually necessary.
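The idea can be sketched like this. The `config` and `callbackRouter` types below are simplified stand-ins invented for the example, and `reflect.DeepEqual` is just one possible way to compare; the real CallbackRouter may compare differently.

```go
package main

import (
	"fmt"
	"reflect"
)

// config is a hypothetical stand-in for the combined routing and DNS state.
type config struct {
	Routes []string
	DNS    []string
}

// callbackRouter remembers the last applied config and only calls the
// (expensive, side-effecting) platform setter when something changed.
type callbackRouter struct {
	cur    *config
	setter func(config) error
}

// set applies c unless it is identical to the config already in effect.
func (r *callbackRouter) set(c config) error {
	if r.cur != nil && reflect.DeepEqual(*r.cur, c) {
		return nil // nothing changed; skip the platform call
	}
	if err := r.setter(c); err != nil {
		return err
	}
	r.cur = &c
	return nil
}

func main() {
	calls := 0
	r := &callbackRouter{setter: func(config) error { calls++; return nil }}
	c := config{Routes: []string{"100.64.0.0/10"}, DNS: []string{"100.100.100.100"}}
	_ = r.set(c)
	_ = r.set(c) // same config again: setter is not invoked
	fmt.Println("setter calls:", calls)
}
```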
Updates #3102
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
Link-local addresses on the Tailscale interface are not routable.
Ideally they would be removed, however, a concern exists that the
operating system will attempt to re-add them which would lead to
thrashing.
Setting SkipAsSource attempts to avoid production of packets using the
address as a source in any default behaviors.
Before, in PowerShell: `ping (hostname)` would ping the link-local
address of the Tailscale interface, and fail.
After: `ping (hostname)` now pings the link-local address on the next
highest priority metric local interface.
Fixes #4647
Signed-off-by: James Tucker <james@tailscale.com>
Per post-submit code review feedback of 1336fb740b from @maisem.
Change-Id: Ic5c16306cbdee1029518448642304981f77ea1fd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixes #4647
It seems that Windows creates a link-local address for the TUN driver, seemingly
based on the (fixed) adapter GUID. This results in a fixed MAC address, which
for some reason doesn't handle loopback correctly. Given the derived link-local
address is preferred for lookups (thanks LLMNR), traffic which addresses the
current node by hostname uses this broken address and never works.
To address this, we remove the broken link-local address from the wintun adapter.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Profiling identified this as a fairly hot path for growing a slice.
Given this is only used in control & when a new packet filter is received, this shouldn't be hot in the client.
We were marking them as gauges, but they are only ever incremented,
thus counter is more appropriate.
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
* net/dns, wgengine: implement DNS over TCP
Signed-off-by: Tom DNetto <tom@tailscale.com>
* wgengine/netstack: intercept only relevant port/protocols to quad-100
Signed-off-by: Tom DNetto <tom@tailscale.com>
These were intended to be pushed to #4408, but in my excitement I
forgot to git push :/ better late than never.
Signed-off-by: Tom DNetto <tom@tailscale.com>
This change wires netstack with a hook for traffic coming from the host
into the tun, allowing interception and handling of traffic to quad-100.
With this hook wired, magicDNS queries over UDP are now handled within
netstack. The existing logic in wgengine to handle magicDNS remains for now,
but its hook operates after the netstack hook so the netstack implementation
takes precedence. This is done in case we need to support platforms with
netstack longer than expected.
Signed-off-by: Tom DNetto <tom@tailscale.com>
A subsequent commit implements handling of magicDNS traffic via netstack.
Implementing this requires a hook for traffic originating from the host and
hitting the tun, so we make another hook to support this.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Well, goimports actually (which adds the normal import grouping order we do)
Change-Id: I0ce1b1c03185f3741aad67c14a7ec91a838de389
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Setting keepalive ensures that idle connections will eventually be
closed. In userspace mode, any application configured TCP keepalive is
effectively swallowed by the host kernel, and is not easy to detect.
Failure to close connections when a peer tailscaled goes offline or
restarts may result in an otherwise indefinite connection for any
protocol endpoint that does not initiate new traffic.
This patch does not take any new opinion on a sensible default for the
keepalive timers, though as noted in the TODO, doing so likely deserves
further consideration.
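For comparison only, this is how keepalive is enabled on an ordinary kernel TCP socket with the Go standard library. It is an analogy, not the netstack code path this patch touches; `enableKeepAlive` and `demoLoopback` are names invented for the example, and the 30-second period is arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// enableKeepAlive turns on TCP keepalive probes for c so that a peer that
// silently disappears is eventually detected and the connection closed.
func enableKeepAlive(c net.Conn, period time.Duration) error {
	tc, ok := c.(*net.TCPConn)
	if !ok {
		return errors.New("not a TCP connection")
	}
	if err := tc.SetKeepAlive(true); err != nil {
		return err
	}
	return tc.SetKeepAlivePeriod(period)
}

// demoLoopback opens a loopback TCP connection and enables keepalive on it.
func demoLoopback() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return err
	}
	defer c.Close()

	return enableKeepAlive(c, 30*time.Second)
}

func main() {
	fmt.Println("keepalive error:", demoLoopback())
}
```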
Update #4522
Signed-off-by: James Tucker <james@tailscale.com>
One current theory (among other things) on battery consumption is that
magicsock is resorting to using the IPv6 over LTE even on WiFi.
One thing that could explain this is that we do not get link change updates
for the LTE modem as we ignore them in this list.
This commit makes us not ignore changes to `pdp_ip` as a test.
Updates #3363
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This reverts commit 8d6793fd70.
Reason: breaks Android build (cgo/pthreads addition)
We can try again next cycle.
Change-Id: I5e7e1730a8bf399a8acfce546a6d22e11fb835d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Attempt to load the xt_mark kernel module when it is not present. If the
load fails, log error information.
It may be tempting to promote this failure to an error once it has been
in use for some time, so as to avoid reaching an error with the iptables
invocation, however, there are conditions under which the two stages may
disagree - this change adds more useful breadcrumbs.
Example new output from tailscaled running under my WSL2:
```
router: ensure module xt_mark: "/usr/sbin/modprobe xt_mark" failed: exit status 1; modprobe: FATAL: Module xt_mark not found in directory /lib/modules/5.10.43.3-microsoft-standard-WSL2
```
Background:
There are two places to look up modules, one is `/proc/modules` "old",
the other is `/sys/module/` "new".
There was query_modules(2) in Linux <2.6, alas, it is gone.
In a docker container in the default configuration, you would get
/proc/modules and /sys/module/ both populated. lsmod may work fine,
modprobe will fail with EPERM at `finit_module()` for an unprivileged
container.
In a privileged container the load may *succeed*, if some conditions are
met. This condition should be avoided, but the code landing in this
change does not attempt to avoid this scenario as it is both difficult
to detect, and has a very uncertain impact.
In an nspawn container `/proc/modules` is populated, but `/sys/module`
does not exist. Modern `lsmod` versions will fail to gather most module
information, without sysfs being populated with module information.
In WSL2 modules are likely missing, as the in-use kernel typically is
not provided by the distribution filesystem, and WSL does not mount in a
module filesystem of its own. Notably the WSL2 kernel supports iptables
marks without listing the xt_mark module in /sys/module, and
/proc/modules is empty.
On a recent kernel, we can ask the capabilities system about SYS_MODULE,
that will help to disambiguate between the non-privileged container case
and just being root. On older kernels these calls may fail.
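To make the two lookup locations concrete, here is a small sketch that checks both. The `moduleListed` helper and its `root` parameter are assumptions for the example (the parameter only exists so a fake tree can be used); it is not the code added by this change, and as noted above neither listing is authoritative on WSL2.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// moduleListed reports whether name appears in <root>/proc/modules and
// whether <root>/sys/module/<name> exists. Either, both, or neither may be
// true depending on the environment (docker, nspawn, WSL2, ...).
func moduleListed(root, name string) (inProc, inSys bool) {
	if f, err := os.Open(filepath.Join(root, "proc", "modules")); err == nil {
		defer f.Close()
		sc := bufio.NewScanner(f)
		for sc.Scan() {
			// The first field of each line is the module name.
			fields := strings.Fields(sc.Text())
			if len(fields) > 0 && fields[0] == name {
				inProc = true
				break
			}
		}
	}
	if fi, err := os.Stat(filepath.Join(root, "sys", "module", name)); err == nil && fi.IsDir() {
		inSys = true
	}
	return inProc, inSys
}

func main() {
	inProc, inSys := moduleListed("/", "xt_mark")
	fmt.Printf("xt_mark: /proc/modules=%v /sys/module=%v\n", inProc, inSys)
}
```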
Update #4329
Signed-off-by: James Tucker <james@tailscale.com>
It unfortunately gets truncated because it's too long, so split it into 3
different log lines to circumvent truncation.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Currently we ignore these interfaces in the darwin osMon but then would consider them
interesting when checking if anything had changed.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
In `(*Mon).Start` we don't run a timer to update `(*Mon).lastWall` on iOS and
Android as their sleep patterns are bespoke. However, in the debounce
goroutine we would notice that the wall clock hadn't been updated
since the last event and would assume that a time jump had occurred. This would
result in non-events being considered as major-change events.
This commit makes it so that `(*Mon).timeJumped` is never set to `true`
on iOS and Android.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Remove the weird netstack -> tailssh dependency and instead have tailssh
register itself with ipnlocal when linked.
This makes tailssh.server a singleton, so we can have a global map of
all sessions.
Updates #3802
Change-Id: Iad5caec3a26a33011796878ab66b8e7b49339f29
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This defines a new magic IPv6 prefix, fd7a:115c:a1e0:b1a::/64, a
subset of our existing /48, where the final 32 bits are an IPv4
address, and the middle 32 bits are a user-chosen "site ID" (which
must currently be 0000:00xx; the top 3 bytes must be zero for now).
e.g., I can say my home LAN's "site ID" is "0000:00bb" and then
advertise its 10.2.0.0/16 IPv4 range via IPv6, like:
tailscale up --advertise-routes=fd7a:115c:a1e0:b1a::bb:10.2.0.0/112
(112 being the /96 v6 prefix length plus the /16 IPv4 prefix length)
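The mapping can be worked through in code. This is a sketch using the standard `net/netip` package and a made-up helper name (`via6Prefix`); it is not the implementation from this change.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// via6Prefix maps an IPv4 prefix to its 4via6 form for the given site ID:
// 64 bits of fixed prefix, 32 bits of site ID, 32 bits of IPv4 address.
func via6Prefix(siteID uint32, v4Prefix string) (netip.Prefix, error) {
	p, err := netip.ParsePrefix(v4Prefix)
	if err != nil {
		return netip.Prefix{}, err
	}
	if !p.Addr().Is4() {
		return netip.Prefix{}, fmt.Errorf("%v is not an IPv4 prefix", p)
	}
	if siteID > 0xff {
		return netip.Prefix{}, fmt.Errorf("site ID %#x: top 3 bytes must be zero", siteID)
	}
	// fd7a:115c:a1e0:b1a::/64; the remaining bytes start out as zero.
	a := [16]byte{0xfd, 0x7a, 0x11, 0x5c, 0xa1, 0xe0, 0x0b, 0x1a}
	binary.BigEndian.PutUint32(a[8:12], siteID)
	v4 := p.Addr().As4()
	copy(a[12:], v4[:])
	// The v6 prefix length is the fixed 96 bits plus the IPv4 prefix length.
	return netip.PrefixFrom(netip.AddrFrom16(a), 96+p.Bits()), nil
}

func main() {
	p, err := via6Prefix(0xbb, "10.2.0.0/16")
	fmt.Println(p, err)
}
```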
Then people in my tailnet can:
$ curl '[fd7a:115c:a1e0:b1a::bb:10.2.0.230]'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ....
Updates #3616, etc
RELNOTE=initial support for TS IPv6 addresses to route v4 "via" specific nodes
Change-Id: I9b49b6ad10410a24b5866b9fbc69d3cae1f600ef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Ignoring the events at this layer is the simpler path for right now; a
broader change should follow to suppress irrelevant change events in a
higher layer so as to avoid related problems with other monitoring paths
on other platforms. This approach may also carry a small risk that it
applies an at-most-once invariant low in the chain that could be assumed
otherwise higher in the code.
I adjusted the newAddrMessage type to include interface index rather
than a label, as labels are not always supplied, and in particular on my
test hosts they were consistently missing for ipv6 address messages.
I adjusted the newAddrMessage.Addr field to be populated from
Attributes.Address rather than Attributes.Local, as again for ipv6
.Local was always empty, and with ipv4 the .Address and .Local contained
the same contents in each of my test environments.
Update #4282
Signed-off-by: James Tucker <james@tailscale.com>
While I trust the test behavior, I also want to assert the behavior in a
reproduction environment; this envknob gives me the log information I
need to do so.
Update #4282
Signed-off-by: James Tucker <james@tailscale.com>
* net/dns, net/dns/resolver, wgengine: refactor DNS request path
Previously, method calls into the DNS manager/resolver types handled DNS
requests rather than DNS packets. This is fine for UDP as one packet
corresponds to one request or response, however will not suit an
implementation that supports DNS over TCP.
To support PRs implementing this in the future, wgengine delegates
all handling/construction of packets to/from the magic DNS endpoint
to the DNS types themselves. Handling IP packets at this level enables
future support for both UDP and TCP.
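To illustrate why "one packet equals one message" stops holding over TCP: DNS over TCP prefixes each message with a two-byte big-endian length (RFC 1035, section 4.2.2), so several messages share one byte stream. The helper names below are invented for this sketch and are not part of this refactor.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// writeTCPMessage writes msg with the two-byte length prefix used for DNS
// messages carried over TCP.
func writeTCPMessage(w io.Writer, msg []byte) error {
	if len(msg) > 0xffff {
		return fmt.Errorf("message too large: %d bytes", len(msg))
	}
	var hdr [2]byte
	binary.BigEndian.PutUint16(hdr[:], uint16(len(msg)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(msg)
	return err
}

// readTCPMessage reads one length-prefixed message from the stream.
func readTCPMessage(r io.Reader) ([]byte, error) {
	var hdr [2]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	msg := make([]byte, binary.BigEndian.Uint16(hdr[:]))
	if _, err := io.ReadFull(r, msg); err != nil {
		return nil, err
	}
	return msg, nil
}

// frameRoundTrip writes all msgs into one stream and reads them back.
func frameRoundTrip(msgs ...string) ([]string, error) {
	var stream bytes.Buffer
	for _, m := range msgs {
		if err := writeTCPMessage(&stream, []byte(m)); err != nil {
			return nil, err
		}
	}
	var out []string
	for range msgs {
		m, err := readTCPMessage(&stream)
		if err != nil {
			return nil, err
		}
		out = append(out, string(m))
	}
	return out, nil
}

func main() {
	// Placeholder payloads; real payloads would be encoded DNS messages.
	fmt.Println(frameRoundTrip("query-1", "query-2"))
}
```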
Signed-off-by: Tom DNetto <tom@tailscale.com>
In addition, an envknob (TS_DEBUG_NETSTACK_LEAK_MODE) now provides access
to set leak tracking to more useful values.
Fixes #4309
Signed-off-by: James Tucker <james@tailscale.com>
When `setWgengineStatus` is invoked concurrently from multiple
goroutines, it is possible that the call invoked with a newer status is
processed before a call with an older status. e.g. a status that has
endpoints might be followed by a status without endpoints. This causes
unnecessary work in the engine and can result in packet loss.
This patch adds an `AsOf time.Time` field to the status to specify when the
status was calculated, which later allows `setWgengineStatus` to ignore
any status messages it receives that are older than the one it has
already processed.
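A simplified sketch of the idea, with hypothetical `status` and `backend` types standing in for the real ones (only the `AsOf` field name comes from the patch description):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// status is a hypothetical, trimmed-down engine status.
type status struct {
	AsOf      time.Time // when the status was calculated
	Endpoints []string
}

// backend keeps the most recent status it has processed.
type backend struct {
	mu   sync.Mutex
	last time.Time
	cur  status
}

// setStatus applies s unless a newer status has already been processed.
// It reports whether s was applied.
func (b *backend) setStatus(s status) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if s.AsOf.Before(b.last) {
		return false // stale: computed before the status we already have
	}
	b.last = s.AsOf
	b.cur = s
	return true
}

// statusAt builds a status calculated at the given Unix second.
func statusAt(sec int64, endpoints ...string) status {
	return status{AsOf: time.Unix(sec, 0), Endpoints: endpoints}
}

func main() {
	b := &backend{}
	fmt.Println(b.setStatus(statusAt(2, "192.0.2.1:41641"))) // newer status arrives first
	fmt.Println(b.setStatus(statusAt(1)))                    // older status arrives late
	fmt.Println(b.cur.Endpoints)
}
```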
Updates tailscale/corp#2579
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Plumb the outbound injection path to allow passing netstack
PacketBuffers down to the tun Read, where they are decref'd to enable
buffer re-use. This removes one packet alloc & copy, and reduces GC
pressure by pooling outbound injected packets.
Fixes #2741
Signed-off-by: James Tucker <james@tailscale.com>
The version string changed slightly. Adapt.
And always check the current Go version to prevent future
accidental regressions. I would have missed this one had
I not explicitly manually checked it.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>