Commit Graph

7473 Commits

Author SHA1 Message Date
Irbe Krumina df8f40905b
cmd/k8s-operator,k8s-operator: optionally serve tailscaled metrics on Pod IP (#11699)
Adds a new .spec.metrics field to ProxyClass to allow users to optionally serve
client metrics (tailscaled --debug) on <Pod-IP>:9001.
Metrics cannot currently be enabled for proxies that egress traffic to tailnet
and for Ingress proxies with tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation
(because they currently forward all cluster traffic to their respective backends).

The assumption is that users will want to have these metrics enabled
continuously to be able to monitor proxy behaviour (as opposed to enabling
them temporarily for debugging). Hence we expose them on Pod IP to make it
easier to consume them i.e via Prometheus PodMonitor.

Updates tailscale/tailscale#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-04-26 08:25:06 +01:00
Brad Fitzpatrick 723c775dbb tsd, ipnlocal, etc: add tsd.System.HealthTracker, start some plumbing
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.

In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.

The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).

Updates #11874
Updates #4136

Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 22:13:04 -07:00
Brad Fitzpatrick cb66952a0d health: permit Tracker method calls on nil receiver
In prep for tsd.System Tracker plumbing throughout tailscaled,
defensively permit all methods on Tracker to accept a nil receiver
without crashing, lest I screw something up later. (A health tracking
system that itself causes crashes would be no good.) Methods on nil
receivers should not be called, so a future change will also collect
their stacks (and panic during dev/test), but we should at least not
crash in prod.

This also locks that in with a test using reflect to automatically
call all methods on a nil receiver and check they don't crash.

Updates #11874
Updates #4136

Change-Id: I8e955046ebf370ec8af0c1fb63e5123e6282a9d3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 20:45:57 -07:00
Chris Palmer 7349b274bd
safeweb: handle mux pattern collisions more generally (#11801)
Fixes #11800

Signed-off-by: Chris Palmer <cpalmer@tailscale.com>
2024-04-25 16:08:30 -07:00
Brad Fitzpatrick 5b32264033 health: break Warnable into a global and per-Tracker value halves
Previously it was both metadata about the class of warnable item as
well as the value.

Now it's only metadata and the value is per-Tracker.

Updates #11874
Updates #4136

Change-Id: Ia1ed1b6c95d34bc5aae36cffdb04279e6ba77015
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 14:40:11 -07:00
Brad Fitzpatrick ebc552d2e0 health: add Tracker type, in prep for removing global variables
This moves most of the health package global variables to a new
`health.Tracker` type.

But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.

A future change will eliminate that global.

Updates #11874
Updates #4136

Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 13:46:22 -07:00
Claire Wang d5fc52a0f5
tailcfg: add auto exit node attribute (#11871)
Updates tailscale/corp#19515

Signed-off-by: Claire Wang <claire@tailscale.com>
2024-04-25 15:05:39 -04:00
Sonia Appasamy 18765cd4f9 release/dist/qnap: omit .qpkg.codesigning files
Updates tailscale/tailscale-qpkg#135

Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
2024-04-25 11:20:40 -04:00
Percy Wegmann 955ad12489 ipn/ipnlocal: only show Taildrive peers to which ACLs grant us access
This improves convenience and security.

* Convenience - no need to see nodes that can't share anything with you.
* Security - malicious nodes can't expose shares to peers that aren't
             allowed to access their shares.

Updates tailscale/corp#19432

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-24 17:49:04 -05:00
Sonia Appasamy 5d4b4ffc3c release/dist/qnap: update perms for tmpDir files
Allows all users to read all files, and .sh/.cgi files to be
executable.

Updates tailscale/tailscale-qpkg#135

Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
2024-04-24 14:48:20 -04:00
Lee Briggs 14ac41febc
cmd/k8s-operator,k8s-operator: proxyclass affinity (#11862)
add ability to set affinity rules to proxyclass

Updates#11861

Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
2024-04-24 09:31:35 -07:00
Anton Tolchanov 31e6bdbc82 ipn/ipnlocal: always stop the engine on auth when key has expired
If seamless key renewal is enabled, we typically do not stop the engine
(deconfigure networking). However, if the node key has expired there is
no point in keeping the connection up, and it might actually prevent
key renewal if auth relies on endpoints routed via app connectors.

Fixes tailscale/corp#5800

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-04-24 14:47:57 +01:00
Andrea Gottardo 1d3e77f373
util/syspolicy: add ReadStringArray interface (#11857)
Fixes tailscale/corp#19459

This PR adds the ability for users of the syspolicy handler to read string arrays from the MDM solution configured on the system.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-04-23 22:23:48 -07:00
Sonia Appasamy 0cce456ee5 release/dist/qnap: use tmp file directory for qpkg building
This change allows for the release/dist/qnap package to be used
outside of the tailscale repo (notably, will be used from corp),
by using an embedded file system for build files which gets
temporarily written to a new folder during qnap build runs.

Without this change, when used from corp, the release/dist/qnap
folder will fail to be found within the corp repo, causing
various steps of the build to fail.

The file renames in this change are to combine the build files
into a /files folder, separated into /scripts and /Tailscale.

Updates tailscale/tailscale-qpkg#135

Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
2024-04-23 21:34:45 -04:00
Percy Wegmann c8e912896e wgengine/router: consolidate routes before reconfiguring router for mobile clients
This helps reduce memory pressure on tailnets with large numbers
of routes.

Updates tailscale/corp#19332

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-23 20:15:56 -05:00
Irbe Krumina add62af7c6
util/linuxfw,go.{mod,sum}: don't log errors when deleting non-existant chains and rules (#11852)
This PR bumps iptables to a newer version that has a function to detect
'NotExists' errors and uses that function to determine whether errors
received on iptables rule and chain clean up are because the rule/chain
does not exist- if so don't log the error.

Updates corp#19336

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-04-23 21:08:18 +01:00
Irbe Krumina 3af0f526b8
cmd{containerboot,k8s-operator},util/linuxfw: support ExternalName Services (#11802)
* cmd/containerboot,util/linuxfw: support proxy backends specified by DNS name

Adds support for optionally configuring containerboot to proxy
traffic to backends configured by passing TS_EXPERIMENTAL_DEST_DNS_NAME env var
to containerboot.
Containerboot will periodically (every 10 minutes) attempt to resolve
the DNS name and ensure that all traffic sent to the node's
tailnet IP gets forwarded to the resolved backend IP addresses.

Currently:
- if the firewall mode is iptables, traffic will be load balanced
accross the backend IP addresses using round robin. There are
no health checks for whether the IPs are reachable.
- if the firewall mode is nftables traffic will only be forwarded
to the first IP address in the list. This is to be improved.

* cmd/k8s-operator: support ExternalName Services

 Adds support for exposing endpoints, accessible from within
a cluster to the tailnet via DNS names using ExternalName Services.
This can be done by annotating the ExternalName Service with
tailscale.com/expose: "true" annotation.
The operator will deploy a proxy configured to route tailnet
traffic to the backend IPs that service.spec.externalName
resolves to. The backend IPs must be reachable from the operator's
namespace.

Updates tailscale/tailscale#10606

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-04-23 17:30:00 +01:00
License Updater bf46bff678 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2024-04-23 09:10:39 -07:00
Percy Wegmann b7e5122226 util/osuser: add unit test for parseGroupIds
Updates #11682

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-23 08:54:17 -05:00
Andrew Dunham e985c6e58f ssh/tailssh: try fetching group IDs for user with the 'id' command
Since the tailscaled binaries that we distribute are static and don't
link cgo, we previously wouldn't fetch group IDs that are returned via
NSS. Try shelling out to the 'id' command, similar to how we call
'getent', to detect such cases.

Updates #11682

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9bdc938bd76c71bc130d44a97cc2233064d64799
2024-04-23 08:54:17 -05:00
Kristoffer Dalby 9779eb6dba api.md: move device posture api to api.md
Updates tailscale/corp#18572

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-04-23 10:51:39 +02:00
Brad Fitzpatrick c07aa2cfed syncs: fix flaky test by deleting the code it tested (Watch)
Fixes #11766

Change-Id: Id5a875aab23eb1b48a57dc379d0cdd42412fd18b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-22 21:16:14 -07:00
Joe Tsai 63b3c82587
ipn/local: log OS-specific diagnostic information as JSON (#11700)
There is an undocumented 16KiB limit for text log messages.
However, the limit for JSON messages is 256KiB.
Even worse, logging JSON as text results in significant overhead
since each double quote needs to be escaped.

Instead, use logger.Logf.JSON to explicitly log the info as JSON.

We also modify osdiag to return the information as structured data
rather than implicitly have the package log on our behalf.
This gives more control to the caller on how to log.

Updates #7802

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-04-22 16:45:01 -07:00
Andrew Lytvynov 06502b9048
ipn/ipnlocal: reset auto-updates if unsupported on profile load (#11838)
Prior to
1613b18f82 (diff-314ba0d799f70c8998940903efb541e511f352b39a9eeeae8d475c921d66c2ac),
nodes could set AutoUpdate.Apply=true on unsupported platforms via
`EditPrefs`. Specifically, this affects tailnets where default
auto-updates are on.

Fix up those invalid prefs on profile reload, as a migration.

Updates #11544

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-04-22 16:55:25 -06:00
Sonia Appasamy 0a84215036 release/dist/qnap: add qnap target builder
Creates new QNAP builder target, which builds go binaries then uses
docker to build into QNAP packages. Much of the docker/script code
here is pulled over from https://github.com/tailscale/tailscale-qpkg,
with adaptation into our builder structures.

The qnap/Tailscale folder contains static resources needed to build
Tailscale qpkg packages, and is an exact copy of the existing folder
in the tailscale-qpkg repo.

Builds can be run with:
```
sudo ./tool/go run ./cmd/dist build qnap
```

Updates tailscale/tailscale-qpkg#135

Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
2024-04-22 17:43:28 -04:00
Andrew Lytvynov b743b85dad
ipn/ipnlocal,ssh/tailssh: reject c2n /update if SSH conns are active (#11820)
Since we already track active SSH connections, it's not hard to
proactively reject updates until those finish. We attempt to do the same
on the control side, but the detection latency for new connections is in
the minutes, which is not fast enough for common short sessions.

Handle a `force=true` query parameter to override this behavior, so that
control can still trigger an update on a server where some long-running
abandoned SSH session is open.

Updates https://github.com/tailscale/corp/issues/18556

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-04-22 10:27:12 -06:00
Brad Fitzpatrick 5100bdeba7 types/persist: remove unused field Persist.Provider
It was only obviously unused after the previous change, c39cde79d.

Updates #19334

Change-Id: I9896d5fa692cb4346c070b4a339d0d12340c18f7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 10:48:25 -07:00
Brad Fitzpatrick c39cde79d2 tailcfg: remove some unused fields from RegisterResponseAuth
Fixes #19334

Change-Id: Id6463f28af23078a7bc25b9280c99d4491bd9651
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 10:29:19 -07:00
Brad Fitzpatrick 05bfa022f2 tailcfg: pointerify RegisterRequest.Auth, omitemptify RegisterResponseAuth
We were storing server-side lots of:

    "Auth":{"Provider":"","LoginName":"","Oauth2Token":null,"AuthKey":""},

That was about 7% of our total storage of pending RegisterRequest
bodies.

Updates tailscale/corp#19327

Change-Id: Ib73842759a2b303ff5fe4c052a76baea0d68ae7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 07:10:43 -07:00
Andrew Dunham 375617c5c8 net/tsdial: assume all connections are affected if no default route is present
If this happens, it results in us pessimistically closing more
connections than might be necessary, but is more correct since we won't
"miss" a change to the default route interface and keep trying to send
data over a nonexistent interface, or one that can't reach the internet.

Updates tailscale/corp#19124

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia0b8b04cb8cdcb0da0155fd08751c9dccba62c1a
2024-04-19 22:14:36 -04:00
Nick Khyl 9e1c86901b wgengine\router: fix the Tailscale-In firewall rule to work on domain networks
The Network Location Awareness service identifies networks authenticated against
an Active Directory domain and categorizes them as "Domain Authenticated".
This includes the Tailscale network if a Domain Controller is reachable through it.

If a network is categories as NLM_NETWORK_CATEGORY_DOMAIN_AUTHENTICATED,
it is not possible to override its category, and we shouldn't attempt to do so.
Additionally, our Windows Firewall rules should be compatible with both private
and domain networks.

This fixes both issues.

Fixes #11813

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-04-19 15:43:15 -05:00
Andrew Lytvynov bff527622d
ipn/ipnlocal,clientupdate: disallow auto-updates in containers (#11814)
Containers are typically immutable and should be updated as a whole (and
not individual packages within). Deny enablement of auto-updates in
containers.

Also, add the missing check in EditPrefs in LocalAPI, to catch cases
like tailnet default auto-updates getting enabled for nodes that don't
support it.

Updates #11544

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-04-19 14:37:21 -06:00
Andrew Lytvynov b3fb3bf084
clientupdate: return OS-specific version from LatestTailscaleVersion (#11812)
We don't always have the same latest version for all platforms (like
with 1.64.2 is only Synology+Windows), so we should use the OS-specific
result from pkgs JSON response instead of the main Version field.

Updates #11795

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-04-19 13:04:11 -06:00
Irbe Krumina bbe194c80d
cmd/k8s-operator: correctly determine cluster domain (#11512)
Kubernetes cluster domain defaults to 'cluster.local', but can also be customized.
We need to determine cluster domain to set up in-cluster forwarding to our egress proxies.
This was previously hardcoded to 'cluster.local', so was the egress proxies were not usable in clusters with custom domains.
This PR ensures that we attempt to determine the cluster domain by parsing /etc/resolv.conf.
In case the cluster domain cannot be determined from /etc/resolv.conf, we fall back to 'cluster.local'.

Updates tailscale/tailscale#10399,tailscale/tailscale#11445

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-04-19 16:49:46 +01:00
Percy Wegmann d16c1293e9 ipn/ipnlocal: remove origin and referer headers from Taildrive requests
peerapi does not want these, but rclone includes them.
Removing them allows rclone to work with Taildrive configured
as a WebDAV remote.

Updates #cleanup

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-18 17:00:22 -05:00
Percy Wegmann 94c0403104 ipn/ipnlocal: strip origin and referer headers from Taildrive requests
peerapi does not want these, but rclone includes them.
Stripping them out allows rclone to work with Taildrive configured
as a WebDAV remote.

Updates #cleanup

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-18 17:00:22 -05:00
Percy Wegmann 787f8c08ec drive: rewrite Location headers
This ensures that MOVE, LOCK and any other verbs that use the Location
header work correctly.

Fixes #11758

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-18 15:50:18 -05:00
Claire Wang c24f2eee34
tailcfg: rename exit node destination network flow log node attribute (#11779)
Updates tailscale/corp#18625

Signed-off-by: Claire Wang <claire@tailscale.com>
2024-04-18 16:07:08 -04:00
kari-ts 048cb61dd0
interfaces: create android impl (#11784)
-Move Android impl into interfaces_android.go
-Instead of using ip route to get the interface name, use the one passed in by Android (ip route is restricted in Android 13+ per termux/termux-app#2993)

Follow-up will be to do the same for router

Fixes tailscale/corp#19215
Fixes tailscale/corp#19124

Signed-off-by: kari-ts <kari@tailscale.com>
2024-04-18 12:49:02 -07:00
Aaron Klotz 7132b782d4 hostinfo: use Distro field for distinguishing Windows Server builds
Some editions of Windows server share the same build number as their
client counterpart; we must use an additional field found in the OS
version information to distinguish between them.

Even though "Distro" has Linux connotations, it is the most appropriate
hostinfo field. What is Windows Server if not an alternate distribution
of Windows? This PR populates Distro with "Server" when applicable.

Fixes #11785

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2024-04-18 13:48:50 -06:00
Percy Wegmann 02c6af2a69 cmd/tailscale: clarify Taildrive grants in help text
Fixes #cleanup

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-04-18 13:27:15 -05:00
Chris Palmer bdfaef4879
safeweb: allow object-src: self in CSP (#11782)
This change is safe (self is still safe, by
definition), and makes the code match the comment.

Updates #cleanup

Signed-off-by: Chris Palmer <cpalmer@tailscale.com>
2024-04-18 10:39:11 -07:00
Andrew Lytvynov e775de3c63
go.mod: bump golang.org/x/net (#11775)
One more place to pick up a fix for
https://pkg.go.dev/vuln/GO-2024-2687.

Updates https://github.com/tailscale/corp/issues/18893

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-04-18 09:55:34 -06:00
Adrian Dewhurst c8b0adb382 docs/windows/policy: add missing key expiration warning interval
Fixes #11345

Change-Id: Ib53b639690b77d1b7d857304dca2119f197227ce
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-04-18 10:49:14 -04:00
Brad Fitzpatrick 03d5d1f0f9 wgengine/magicsock: disable portmapper in tunchan-faked tests
Most of the magicsock tests fake the network, simulating packets going
out and coming in. There's no reason to actually hit your router to do
UPnP/NAT-PMP/PCP during in tests. But while debugging thousands of
iterations of tests to deflake some things, I saw it slamming my
router. This stops that.

Updates #11762

Change-Id: I59b9f48f8f5aff1fa16b4935753d786342e87744
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-17 21:47:38 -07:00
Andrew Lytvynov 22bd506129
ipn/ipnlocal: hold the mutex when in onTailnetDefaultAutoUpdate (#11786)
Turns out, profileManager is not safe for concurrent use and I missed
all the locking infrastructure in LocalBackend, oops.

I was not able to reproduce the race even with `go test -count 100`, but
this seems like an obvious fix.

Fixes #11773

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-04-17 21:15:09 -06:00
Chris Palmer 88a7767492
safeweb: set SameSite=Strict, with an option for Lax (#11781)
Fixes #11780

Signed-off-by: Chris Palmer <cpalmer@tailscale.com>
2024-04-17 16:20:14 -07:00
dependabot[bot] dd48cad89a build(deps-dev): bump vite from 5.1.4 to 5.1.7 in /client/web
Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 5.1.4 to 5.1.7.
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/v5.1.7/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v5.1.7/packages/vite)

---
updated-dependencies:
- dependency-name: vite
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-17 15:16:35 -07:00
Andrew Dunham b85c2b2313 net/dns/resolver: use SystemDial in DoH forwarder
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.

Updates #10821
Updates tailscale/corp#19124

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
2024-04-17 17:24:38 -04:00
Paul Scott 82394debb7 cmd/tailscale: add shell tab-completion
The approach is lifted from cobra: `tailscale completion bash` emits a bash
script for configuring the shell's autocomplete:

    . <( tailscale completion bash )

so that typing:

    tailscale st<TAB>

invokes:

    tailscale completion __complete -- st

RELNOTE=tailscale CLI now supports shell tab-completion

Fixes #3793

Signed-off-by: Paul Scott <paul@tailscale.com>
2024-04-17 18:54:10 +01:00