docs(talm): document DRBD sysctl tuning, keepalive toggle, etcd quota #567
Open
Aleksei Sviridkin (lexfrei) wants to merge 1 commit into main from docs/talm-drbd-sysctl-etcd-defaults
+10
−1
@@ -206,7 +206,7 @@ The `cozystack` preset ships curated defaults for `machine.kernel.modules`, `mac
 | --- | --- | --- |
 | `extraKernelModules` | list | Appended to the built-in modules (`openvswitch`, `drbd`, `zfs`, `spl`, `vfio_pci`, `vfio_iommu_type1`). Each entry is a Talos kernel-module spec. |
 | `extraKubeletExtraArgs` | map | Merged into `kubelet.extraConfig` after the preset's `cpuManagerPolicy: static`, `maxPods: 512`. Operator keys must NOT collide with built-ins: yaml.v3 rejects duplicate map keys on decode, so a collision fails the render with a precise hint pointing at the offending key. Fork the preset if you need a different default. |
-| `extraSysctls` | map | Merged into `machine.sysctls` after the preset's `gc_thresh*` entries. Same collision-fails-render contract as `extraKubeletExtraArgs`. Values must be YAML strings (Talos expects strings even for numeric sysctls). |
+| `extraSysctls` | map | Merged into `machine.sysctls` after the preset's built-in entries: the `gc_thresh1/2/3` ARP-cache sizes, the always-on DRBD/LINSTOR tuning (`tcp_orphan_retries`, `tcp_fin_timeout`, `netdev_max_backlog`, `netdev_budget`, `netdev_budget_usecs`), `vm.nr_hugepages` (when set), and the `tcp_keepalive_*` triplet while `tcpKeepaliveTuning` is enabled. All of these are preset-owned, so the same collision-fails-render contract as `extraKubeletExtraArgs` applies. Values must be YAML strings (Talos expects strings even for numeric sysctls). |
 | `extraMachineFiles` | list | Appended to the preset's CRI customization and `lvm.conf` entries. Talos rejects duplicate `path:` at apply time. |

 Example `values.yaml` addition:
@@ -226,6 +226,15 @@ extraMachineFiles:

 The `generic` preset ships no defaults under any of these sections; each block emits only when the matching `extra*` key is non-empty.

+Beyond the `extra*` extension points, the `cozystack` preset exposes two opinionated tunables you can change without forking the chart:
+
+| Key | Default | Effect |
+| --- | --- | --- |
+| `tcpKeepaliveTuning` | `false` | When `true`, adds `net.ipv4.tcp_keepalive_time=600` / `intvl=10` / `probes=6` to `machine.sysctls`, reaping a dead idle socket in ~660s instead of the kernel default ~2h. These sysctls are kernel-wide (they change failure detection for every long-lived idle TCP connection on the node, not just DRBD), so they are opt-in. DRBD already detects dead peers in seconds via its own protocol-level ping, so leave this off unless you specifically want faster node-wide dead-socket detection. |
+| `etcd.quotaBackendBytes` | `"8589934592"` (8 GiB) | etcd backend DB size ceiling, emitted as `cluster.etcd.extraArgs.quota-backend-bytes` on controlplane nodes only. Raises etcd's own 2 GiB default so a LINSTOR-heavy control plane holding many DRBD-resource CRDs in aggregate does not trip the NOSPACE alarm. It is a ceiling, not a reservation: a small DB stays small and costs no extra RAM/disk. Set it to `""` to fall back to etcd's built-in default. This governs total DB size, not single-object size; per-object writes stay bounded by kube-apiserver's fixed 3 MiB request-body limit, which has no configuration knob. |
+
+The five always-on DRBD/LINSTOR sysctls listed in the `extraSysctls` row above ship unconditionally on the `cozystack` preset; they address TCP-port exhaustion observed under DRBD reconnect storms and have no equivalent on the `generic` preset.
+
 ### 2.3 Add Keycloak Configuration

 By default, the cluster will be accessible only by authentication with a token.
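The two tunables documented in this diff, `tcpKeepaliveTuning` and `etcd.quotaBackendBytes`, could be set in `values.yaml` roughly as follows. This is a sketch and not part of the PR: it assumes the dotted key `etcd.quotaBackendBytes` corresponds to a nested `etcd:` map in the values file, and the 4 GiB figure is an arbitrary illustration. The comments work through the arithmetic behind the "~660s" and "~2h" figures.

```yaml
# values.yaml (cozystack preset): illustrative overrides only

# Opt in to node-wide keepalive tuning (documented default: false).
# Per the table this adds time=600, intvl=10, probes=6, so a dead idle
# socket is reaped after roughly 600 + 6 * 10 = 660 seconds.
# Linux defaults are 7200 / 75 / 9, i.e. 7200 + 9 * 75 = 7875 seconds (~2h11m).
tcpKeepaliveTuning: true

# Assumed nesting for the documented key `etcd.quotaBackendBytes`.
etcd:
  # 4 GiB = 4 * 1024^3 = 4294967296 bytes, quoted as a string like the
  # documented default "8589934592" (8 GiB). Example value, not a recommendation.
  quotaBackendBytes: "4294967296"
```

According to the table, setting `quotaBackendBytes` to `""` instead would fall back to etcd's built-in default.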
For clarity and precision, it is recommended to use the fully qualified sysctl names instead of shorthands. This helps operators easily identify the exact keys being configured and avoids any confusion when they configure their own `extraSysctls`.

For example, consider using:
- `net.ipv4.tcp_orphan_retries` instead of `tcp_orphan_retries`
- `net.ipv4.tcp_fin_timeout` instead of `tcp_fin_timeout`
- `net.core.netdev_max_backlog` instead of `netdev_max_backlog`
- `net.core.netdev_budget` instead of `netdev_budget`
- `net.core.netdev_budget_usecs` instead of `netdev_budget_usecs`
- `net.ipv4.tcp_keepalive_*` instead of `tcp_keepalive_*`
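To illustrate why the full names matter, here is a hedged sketch of an operator's `extraSysctls` block. The two operator keys (`net.core.somaxconn`, `vm.max_map_count`) and all values are arbitrary examples chosen for this note and do not come from the PR. The commented-out entry uses one of the fully qualified names from this comment, on the assumption that it is exactly the key the preset owns.

```yaml
# values.yaml (cozystack preset): illustrative extraSysctls block
extraSysctls:
  # Operator-owned keys; values are YAML strings even when numeric.
  net.core.somaxconn: "4096"
  vm.max_map_count: "262144"
  # Preset-owned key: per the documented contract, declaring it here would
  # collide with the built-in entry and fail the render.
  # net.ipv4.tcp_fin_timeout: "30"
```

With only shorthands such as `tcp_fin_timeout` in the docs, an operator has to guess which full key would trigger the collision.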