Understanding Volumes in Kubernetes

20 December 2017

In this blog post, we'll see different ways to attach external storage to Pods. The focus is on covering the fundamentals, not on doing a deep dive.

The first thing one does after learning about containers is to run some stateless containers, such as a simple web server.
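As a minimal sketch of such a stateless workload, the manifest below runs a single nginx container in a Pod. The Pod name and label are hypothetical, chosen only for illustration.

```yaml
# Hypothetical example: a single stateless nginx Pod.
# Nothing written inside the container survives a restart.
apiVersion: v1
kind: Pod
metadata:
  name: web            # illustrative name
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```

Saved as a file, it could be created with `kubectl apply -f <file>`.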

As confidence grows, one tries some stateful applications (maybe a database) to be convinced that containers can really be used in production. To run a stateful application, we need to make sure that the persistent data is stored outside the container. To achieve this, Docker uses Docker Volumes and Volume Plugins. Similarly, Kubernetes has the concept of Volumes, by which we can attach external storage to Pods.

With the help of the volumes section in the Pod's definition, we can attach external storage to a Pod. The external storage can come from NFS, GlusterFS, a cloud provider, the host, etc. More details can be found in the Kubernetes documentation on Volumes.
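A minimal sketch of a Pod that mounts an NFS export through its volumes section might look like the following. The server address, export path, and all names are assumptions made up for illustration.

```yaml
# Hypothetical example: a Pod that mounts an NFS export directly.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-nfs
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: site-data                  # must match a volume name below
          mountPath: /usr/share/nginx/html
  volumes:
    - name: site-data
      nfs:
        server: nfs.example.com            # assumed NFS server address
        path: /exports/site                # assumed export path
```

Note that whoever writes the Pod has to know the NFS server and path, which is exactly the coupling between storage details and Pod definitions that this style implies.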

In the approach above, managing volumes is left to each individual user, and as a developer I don't like that. I don't want to take responsibility for managing storage. I should be able to just say that I want 10 GB of storage and have it allocated from somewhere, even though in reality it will be backed by some physical storage.

To decouple storage provisioning from its use, Kubernetes created two objects: Persistent Volume (PV) and Persistent Volume Claim (PVC). Persistent Volumes are created by the Kubernetes administrator and are backed by different kinds of physical storage, such as AWS EBS, GCE Disk, NFS, iSCSI, etc. The administrator can create a pool of several PVs, each of which can be backed by different physical storage.
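As a sketch of what an administrator might create, here is one NFS-backed PV. The name, size, server, and path are all hypothetical, and Kubernetes sizes are usually written in binary units such as Gi rather than GB.

```yaml
# Hypothetical example: one PV in the administrator's pool.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-12g                         # illustrative name
spec:
  capacity:
    storage: 12Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain    # keep the data after the claim is deleted
  nfs:
    server: nfs.example.com                # assumed NFS server address
    path: /exports/pv-12g                  # assumed export path
```

A pool is simply several such objects, possibly with different backends and sizes.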

Now, coming back to the problem I mentioned earlier: as a developer, I shouldn't have to worry about the underlying storage management, and storage should be allocated automatically from somewhere. Kubernetes provides a special volume type called Persistent Volume Claim (PVC), in which we (developers) define the storage requirement, for example "I want 10 GB of storage".

Once we define our requirement in a PVC, Kubernetes searches the existing pool of PVs and binds the claim to the best possible match. If there is no match, the claim stays unbound (Pending) until a suitable PV becomes available.
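The claim-and-use flow can be sketched as a PVC requesting 10Gi plus a Pod that references the claim by name. The names, the MySQL image, and the password value are assumptions for illustration only.

```yaml
# Hypothetical example: a claim for 10Gi and a Pod that uses it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  storageClassName: ""       # empty string: bind only to pre-created PVs without a class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: mysql
      image: mysql:5.7
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: change-me   # placeholder; use a Secret in practice
      volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim   # the Pod names the claim, not the backing storage
```

The Pod no longer mentions NFS, EBS, or any other backend. Running `kubectl get pvc` shows whether the claim is Bound or still Pending.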

Suppose we request 10 GB of storage but the smallest PV that satisfies the claim is 12 GB. That PV gets bound, because it is the best match. This works, but 2 GB extra is allocated and might be wasted. This is an example of Static Volume Provisioning, in which such fragmentation is common.

To overcome this, Kubernetes provides Dynamic Volume Provisioning using the StorageClass object. An administrator can create StorageClasses that suit their setup.
For example, an admin can create a StorageClass named platinum that creates SSD-backed disks on GCE. A user/developer then only needs to mention platinum as the StorageClass in the PVC.
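A sketch of such a platinum class is shown below. It assumes the legacy in-tree GCE persistent disk provisioner, and newer clusters typically use a CSI driver instead, so treat the provisioner value as illustrative.

```yaml
# Hypothetical example: an SSD-backed StorageClass on GCE.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: platinum
provisioner: kubernetes.io/gce-pd   # legacy in-tree provisioner (assumption for this sketch)
parameters:
  type: pd-ssd                      # SSD persistent disk
```

A PVC selects this class by setting `storageClassName: platinum`, and a disk of the requested size is then created on demand.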

For example, a PVC that uses the rook-block storage class would create a PV of exactly 10 GB on Rook storage. Rook is a cloud-native storage solution built on top of Ceph.
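A minimal sketch of such a claim follows. It assumes that a StorageClass named rook-block has already been set up by the administrator, and the claim name is hypothetical.

```yaml
# Hypothetical example: a dynamically provisioned 10Gi claim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-claim               # illustrative name
spec:
  storageClassName: rook-block   # assumes this StorageClass exists in the cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Because the volume is created to order, its size matches the request instead of being rounded up to whatever PV happens to be in the pool.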

Above, we have seen how to attach external storage to Kubernetes Pods using Volumes. This is very specific to Kubernetes. On other container orchestrators, we have to follow different methods and use other plugins to consume external storage. This is a nightmare for any storage vendor, as they have to make sure that their storage solution is compatible with each orchestrator. To solve this, storage vendors and the container community are trying to build a common interface, the Container Storage Interface (CSI), so that a storage plugin written once works across orchestrators. Do check it out.

Happy Learning !!!
