guest: unify pod model for V1, virtual pod, and V2 shim support by shreyanshjain7174 · Pull Request #2699 · microsoft/hcsshim

shreyanshjain7174 · 2026-04-22T06:13:46Z

The GCS guest runtime (internal/guest/runtime/hcsv2/uvm.go) tracks virtual pods separately from V1 sandbox containers — a dedicated VirtualPod type, seven exported methods, a parent cgroup manager, and a reverse-lookup map. V1 sandboxes have no pod-level tracking at all. Adding V2 shim support would need a third path.

This collapses all three into one: a private uvmPod type and a single pods map on Host. Every sandbox — V1, virtual pod, or V2 shim — goes through createPodInUVM, which allocates a cgroup under /pods/{sandboxID}. Workload containers nest at /pods/{sandboxID}/{containerID}. Container-to-pod membership is tracked via addContainerToPod. Cleanup in RemoveContainer is a single code path: remove the container from the pod, and when the sandbox container itself is removed, delete the pod's cgroup.
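The unified model described above can be sketched in a few lines. This is a minimal illustration, not the code in the PR: the type and method names `uvmPod`, `createPodInUVM` and `addContainerToPod` come from the description, while the struct fields, the `newHost` and `removeContainer` helpers, the single mutex, and the choice to record a path string instead of a real cgroup manager are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// uvmPod reuses the type name from the description; the fields are assumed.
type uvmPod struct {
	sandboxID  string
	cgroupPath string              // placeholder for a real cgroup manager
	containers map[string]struct{} // container IDs tracked for this pod
}

// Host models only the pod-tracking part of the guest Host.
type Host struct {
	mu   sync.Mutex
	pods map[string]*uvmPod
}

// newHost is a hypothetical constructor for this sketch.
func newHost() *Host { return &Host{pods: make(map[string]*uvmPod)} }

// createPodInUVM registers a pod under /pods/{sandboxID}. The real code
// would create the cgroup; this sketch only records the path.
func (h *Host) createPodInUVM(sandboxID string) (*uvmPod, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, exists := h.pods[sandboxID]; exists {
		return nil, fmt.Errorf("pod %q already exists", sandboxID)
	}
	pod := &uvmPod{
		sandboxID:  sandboxID,
		cgroupPath: "/pods/" + sandboxID,
		containers: make(map[string]struct{}),
	}
	h.pods[sandboxID] = pod
	return pod, nil
}

// addContainerToPod records membership; unknown pods are ignored (assumption).
func (h *Host) addContainerToPod(sandboxID, containerID string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if pod, exists := h.pods[sandboxID]; exists {
		pod.containers[containerID] = struct{}{}
	}
}

// removeContainer drops the container from its pod and reports whether the
// pod itself was deleted, which happens when the sandbox container goes away.
func (h *Host) removeContainer(sandboxID, containerID string) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	pod, exists := h.pods[sandboxID]
	if !exists {
		return false
	}
	delete(pod.containers, containerID)
	if containerID != sandboxID {
		return false
	}
	delete(h.pods, sandboxID) // the real code would also delete the pod cgroup here
	return true
}

func main() {
	h := newHost()
	if _, err := h.createPodInUVM("sb1"); err != nil {
		panic(err)
	}
	h.addContainerToPod("sb1", "c1")
	fmt.Println(h.removeContainer("sb1", "c1"))  // workload removed, pod stays
	fmt.Println(h.removeContainer("sb1", "sb1")) // sandbox removed, pod deleted
}
```

Keeping the map and the membership sets behind one lock means removal never observes a pod that is half torn down, at the cost of holding the lock during cleanup.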

Cgroup hierarchy changes from:

/containers/{id}                         (V1 sandbox)
/containers/virtual-pods/{virtualPodID}  (virtual pod)

to:

/pods/{sandboxID}                        (all pod types)
/pods/{sandboxID}/{containerID}          (workload containers)

Standalone (non-CRI) containers keep their own cgroup at /pods/{id} with no pod entry — same isolation as before, just under the new prefix.
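As a rough illustration of the new layout, the three path shapes could be produced by helpers like the ones below. The helper names and the `podsRoot` constant are hypothetical; only the resulting paths are taken from the description.

```go
package main

import (
	"fmt"
	"path"
)

// podsRoot is the new top-level prefix named in the description.
const podsRoot = "/pods"

// podCgroupPath returns the cgroup path shared by all pod types.
func podCgroupPath(sandboxID string) string {
	return path.Join(podsRoot, sandboxID)
}

// workloadCgroupPath nests a workload container under its pod.
func workloadCgroupPath(sandboxID, containerID string) string {
	return path.Join(podsRoot, sandboxID, containerID)
}

// standaloneCgroupPath places a non-CRI container directly under the prefix.
// It has the same shape as a pod path, but no pod entry is created for it.
func standaloneCgroupPath(id string) string {
	return path.Join(podsRoot, id)
}

func main() {
	fmt.Println(podCgroupPath("sb1"))
	fmt.Println(workloadCgroupPath("sb1", "c1"))
	fmt.Println(standaloneCgroupPath("solo"))
}
```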

Network namespace teardown for virtual pod sandboxes is preserved: RemoveContainer skips RemoveNetworkNamespace for virtual pod sandbox containers since the host-driven path (TearDownNetworking → RemoveNetNS → removeNIC) handles adapter removal first.
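The skip condition can be read as a simple predicate. The sketch below is an assumption about its shape: `containerInfo` and `skipNetNSRemoval` are hypothetical names, and the real RemoveContainer may derive the same facts differently.

```go
package main

import "fmt"

// containerInfo is a hypothetical stand-in for the guest's container record.
type containerInfo struct {
	id           string
	isSandbox    bool
	virtualPodID string // empty when the container is not part of a virtual pod
}

// skipNetNSRemoval reports whether the guest should leave the network
// namespace alone because the host-driven teardown handles it first.
func skipNetNSRemoval(c containerInfo) bool {
	return c.isSandbox && c.virtualPodID != ""
}

func main() {
	vpSandbox := containerInfo{id: "vp1", isSandbox: true, virtualPodID: "vp1"}
	v1Sandbox := containerInfo{id: "sb1", isSandbox: true}
	fmt.Println(skipNetNSRemoval(vpSandbox), skipNetNSRemoval(v1Sandbox))
}
```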

cmd/gcs/main.go replaces the /containers/virtual-pods parent cgroup with /pods and drops the InitializeVirtualPodSupport call.

Tested E2E with both shims:

| | V1 shim (`io.containerd.runhcs.v1`) | V2 shim (`io.containerd.lcow.v2`) |
|---|---|---|
| OCIBundlePath | `/run/gcs/c/<podId>` | `/run/gcs/pods/<podId>/<podId>` |
| Pod cgroup | `/sys/fs/cgroup/memory/pods/<podId>` | `/sys/fs/cgroup/memory/pods/<podId>` |
| `/containers/virtual-pods/` | absent | absent |

rawahars · 2026-04-30T05:19:05Z

-			msg = "memory usage for virtual pods cgroup exceeded threshold"
+		if strings.HasPrefix(cgName, "/pods") {
+			msg = "memory usage for pods cgroup exceeded threshold"
 		} else {


With this new change, would we ever go through the else condition?

Removed the /containers cgroup entirely — all containers now nest under /pods. Simplified the message check too.
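For reference, a message check along these lines might look like the sketch below. The pods message is taken from the diff; the fallback wording and both helper names are assumptions, since the other branch is not shown in this thread. One detail worth noting: a bare `strings.HasPrefix(cgName, "/pods")` also matches a name such as `/podsfoo`, so the sketch compares against `/pods` exactly or the `/pods/` prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// isPodsCgroup reports whether cgName is /pods itself or nested under it.
func isPodsCgroup(cgName string) bool {
	return cgName == "/pods" || strings.HasPrefix(cgName, "/pods/")
}

// thresholdMessage picks the log message for a memory threshold event.
// The fallback text is a placeholder, not the wording used in the PR.
func thresholdMessage(cgName string) string {
	if isPodsCgroup(cgName) {
		return "memory usage for pods cgroup exceeded threshold"
	}
	return "memory usage for cgroup exceeded threshold"
}

func main() {
	for _, name := range []string{"/pods", "/pods/sb1", "/gcs"} {
		fmt.Printf("%s: %s\n", name, thresholdMessage(name))
	}
}
```

If the `/gcs` events pass through the same check, they would take the fallback branch, so that branch stays reachable even with every container nested under `/pods`.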

rawahars · 2026-04-30T05:21:39Z

Do we use containersControl after this change?

rawahars · 2026-04-30T05:22:10Z

Same here also.

rawahars · 2026-04-30T05:22:39Z

@@ -430,7 +426,7 @@ func main() {

 	go readMemoryEvents(startTime, gefdFile, "/gcs", int64(*gcsMemLimitBytes), gcsControl)
 	go readMemoryEvents(startTime, oomFile, "/containers", containersLimit, containersControl)


rawahars · 2026-04-30T05:23:19Z


 type Container struct {
-	id string
+	id        string


Please add a concise one-line comment for both id and sandboxID.

rawahars · 2026-04-30T05:43:44Z

+	h.podsMutex.Lock()
+	if c.sandboxID != "" {
+		if pod, exists := h.pods[c.sandboxID]; exists {
+			delete(pod.containers, id)


What is the behaviour with standalone containers?
Do we delete the map entry and the cgroup for them too?

rawahars · 2026-04-30T05:45:38Z

 	// Check for virtual pod annotation
-	virtualPodID, isVirtualPod := settings.OCISpecification.Annotations[annotations.VirtualPodID]
+	virtualPodID := settings.OCISpecification.Annotations[annotations.VirtualPodID]
+	isVirtualPod := virtualPodID != ""


We do not compare the id here with virtualPodID.

We do — line 412: if isVirtualPod && id == virtualPodID. That's where we force criType = "sandbox" for the first virtual pod container.
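A small sketch of how the two pieces fit together: the annotation lookup from the diff, followed by the comparison mentioned in the reply. The function name `classify` and the annotation key are placeholders (the real key is `annotations.VirtualPodID`, whose string value is not shown here). One behavioural difference between the two forms in the diff: with the comma-ok lookup, an annotation that is present but empty still counts as a virtual pod, while the `!= ""` form treats it as absent.

```go
package main

import "fmt"

// virtualPodIDKey is a placeholder for annotations.VirtualPodID.
const virtualPodIDKey = "example/virtual-pod-id"

// classify returns the virtual pod ID, whether the container belongs to a
// virtual pod, and the effective CRI type after the sandbox override.
func classify(id, criType string, ann map[string]string) (string, bool, string) {
	virtualPodID := ann[virtualPodIDKey]
	isVirtualPod := virtualPodID != ""
	// The first container of a virtual pod shares the pod's ID and is
	// treated as the sandbox.
	if isVirtualPod && id == virtualPodID {
		criType = "sandbox"
	}
	return virtualPodID, isVirtualPod, criType
}

func main() {
	ann := map[string]string{virtualPodIDKey: "vp1"}
	fmt.Println(classify("vp1", "container", ann))
	fmt.Println(classify("c1", "container", ann))
	fmt.Println(classify("c2", "container", map[string]string{virtualPodIDKey: ""}))
}
```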

rawahars · 2026-04-30T05:49:46Z


 	delete(h.containers, id)

+	// Extract pod cgroup manager under lock, delete cgroup outside lock to


Presently, we delete the cgroup for virtual pods under lock.
Let's continue that behaviour. It would simplify the code below too.

Done — moved cgroup delete back under the lock and merged into containersMutex.

rawahars · 2026-04-30T06:02:24Z

-		if err := h.AddContainerToVirtualPod(id, virtualPodID); err != nil {
-			return nil, errors.Wrapf(err, "failed to add container %s to virtual pod %s", id, virtualPodID)
+	// Determine the sandboxID for this container.
+	sandboxID := id


Can we move this logic to the next switch?
Under line 507, we already extract the sandboxID and check it's not empty. We can set c.sandboxID after that.

virtualPodID is set only when criType is container.

Moved into the switch — each case sets its own sandboxID now.
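One possible shape for a switch in which each case sets its own sandboxID is sketched below. Everything here is an assumption made for illustration: the function name, the parameters, the error text, and in particular the rule that a virtual pod ID takes precedence over the sandbox annotation for workload containers.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveSandboxID is a hypothetical helper; an empty result means the
// container is standalone and gets no pod entry.
func resolveSandboxID(id, criType, sandboxAnnotation, virtualPodID string) (string, error) {
	switch criType {
	case "sandbox":
		// The sandbox container's own ID names the pod.
		return id, nil
	case "container":
		if virtualPodID != "" {
			return virtualPodID, nil
		}
		if sandboxAnnotation == "" {
			return "", errors.New("workload container is missing its sandbox ID")
		}
		return sandboxAnnotation, nil
	default:
		// Standalone (non-CRI) container.
		return "", nil
	}
}

func main() {
	fmt.Println(resolveSandboxID("sb1", "sandbox", "", ""))
	fmt.Println(resolveSandboxID("c1", "container", "sb1", ""))
	fmt.Println(resolveSandboxID("solo", "", "", ""))
}
```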

rawahars · 2026-04-30T06:04:34Z

-			entry.WithField("path", vpRootDir).Debug("Removed virtual pod root directory")
-		}
+// addContainerToPod registers a container as belonging to a pod.
+func (h *Host) addContainerToPod(sandboxID, containerID string) {


Can we inline this method?

Replace the separate VirtualPod tracking (dedicated type, 7 exported methods, parent cgroup manager, reverse-lookup map) with a unified uvmPod type and a single pods map on Host.

All pod types (V1 sandbox, virtual pod, V2 shim) now go through the same code path:
- createPodInUVM allocates a cgroup under /pods/{sandboxID}
- addContainerToPod tracks container→pod membership
- RemoveContainer handles cleanup uniformly

Cgroup hierarchy changes from:

/containers/{id}                         (V1 sandbox)
/containers/virtual-pods/{virtualPodID}  (virtual pod)

to:

/pods/{sandboxID}                        (all pod types)

Workload containers nest under their pod:

/pods/{sandboxID}/{containerID}

Signed-off-by: Shreyansh Jain <shreyanshjain7174@gmail.com>
Signed-off-by: Shreyansh Sancheti <shsancheti@microsoft.com>

shreyanshjain7174 requested a review from a team as a code owner April 22, 2026 06:13

shreyanshjain7174 mentioned this pull request Apr 22, 2026

Adds guest-side GCS changes for V2 shim support #2669

Open

shreyanshjain7174 requested a review from rawahars April 22, 2026 06:43

shreyanshjain7174 mentioned this pull request Apr 22, 2026

guest/spec: remove VirtualPod path helpers and dead code #2700

Open

shreyanshjain7174 force-pushed the guest-pod-unification-v2 branch from 62fc02c to a724fae Compare April 28, 2026 16:17

msscotb assigned rawahars Apr 28, 2026

rawahars requested changes Apr 30, 2026

shreyanshjain7174 force-pushed the guest-pod-unification-v2 branch from a724fae to ad3ee5f Compare April 30, 2026 06:25

shreyanshjain7174 requested a review from rawahars April 30, 2026 06:55
