Recently, I encountered the following error while working with a statefulset that had only one replica:
Warning FailedAttachVolume attachdetach-controller Multi-Attach error for volume "<pvc_name>": Volume is already exclusively attached to one node and can't be attached to another
Getting a "Multi-Attach error" in a GKE StatefulSet can be super stressful. It's like hitting a roadblock while driving blindfolded: What does this even mean? Why is my volume stuck on one node? And how do I fix it without causing further damage?
As others have pointed out, one way to make the Multi-Attach error go away is to change the access mode from ReadWriteOnce to ReadWriteMany, but that's like putting a temporary patch on a bigger problem. ReadWriteMany might seem like a quick fix, but it isn't the right approach: the error occurs precisely because a ReadWriteOnce volume is meant to be attached to a single node at a time, not shared across many. (Standard GCE persistent disks don't support ReadWriteMany anyway; that mode requires shared storage such as Filestore.) Switching modes could make things more complicated and cause more issues later on, so it's better to focus on fixing the root cause and following the recommended practices for managing volumes in Kubernetes.
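For reference, the access mode lives in the PersistentVolumeClaim spec. Here is a minimal sketch of a volumeClaimTemplates entry from a StatefulSet, assuming a GKE persistent-disk-backed claim; the claim name, storage class, and size below are placeholders, so yours may differ:
volumeClaimTemplates:
  - metadata:
      name: data                       # placeholder claim name
    spec:
      accessModes: ["ReadWriteOnce"]   # single-node attach: the mode behind the Multi-Attach error
      storageClassName: standard-rwo   # common GKE PD CSI class; yours may differ
      resources:
        requests:
          storage: 10Gi                # placeholder size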
I would recommend reviewing the logs from around the time of the issue. You may encounter the following message:
volume_linux.go:49] Setting volume ownership for /var/lib/kubelet/pods/61fde90f-48d8-4704-9e9d-c5660db3f23b/volumes/kubernetes.io~csi/<PVC_name>/mount and fsGroup set. If the volume has a lot of files then setting volume ownership could be slow, see https://github.com/kubernetes/kubernetes/issues/69699
If you encounter a log message like the one above, review your StatefulSet configuration file and verify that you have set fsGroup under the securityContext, as shown below:
securityContext:
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000
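One detail worth double-checking: fsGroup is only valid in the Pod-level securityContext, i.e. under spec.template.spec of the StatefulSet, not in an individual container's securityContext. A minimal sketch with placeholder names:
spec:
  template:
    spec:
      securityContext:      # Pod level: fsGroup is only accepted here
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      containers:
        - name: app         # placeholder container name
          image: nginx:1.25 # placeholder image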
Cause of the issue: By default, when a volume is mounted, Kubernetes recursively changes the ownership and permissions of the volume's contents to match the fsGroup specified in the Pod's securityContext. For large volumes, checking and changing ownership and permissions can take a long time; a volume holding millions of small files can add minutes to every Pod startup.
Resolution: The fsGroupChangePolicy field inside the securityContext can be used to control how Kubernetes checks and manages ownership and permissions for a volume.
fsGroupChangePolicy - This setting determines how the ownership and permissions of a volume are changed before the volume is made available inside a Pod. It is relevant only for volume types that support fsGroup-controlled ownership and permissions, and it accepts two values:
OnRootMismatch: Only change permissions and ownership if the permissions and ownership of the volume's root directory do not match the expected values. This can significantly shorten the time it takes to mount a large volume.
Always: Always change the permissions and ownership of the volume when the volume is mounted.
For example:
securityContext:
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000
  fsGroupChangePolicy: "OnRootMismatch"
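In practice, OnRootMismatch is usually the better choice for large volumes: once the first mount has set the root directory's ownership to match fsGroup, subsequent mounts can skip the recursive walk entirely. Keep in mind that fsGroupChangePolicy, like fsGroup itself, only affects volume types that support fsGroup-based ownership management.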
Please note that the insights shared in this article are based on recent observations. However, it's important to acknowledge that experiences may vary depending on the specific issue encountered.