The first thing one does after learning about containers is to run some stateless containers, something like the following:
$ docker container run -i -t alpine sh
$ kubectl run mynginx --image nginx:alpine --replicas=3
As one gains confidence, one tries a stateful application (maybe a database) to get convinced that containers can really be used in production. To run a stateful application, we need to make sure that the persistent data is stored outside the container. To achieve this, Docker uses Docker Volumes and Volume Plugins. Similarly, Kubernetes has the concept of Volumes, by which we can attach external storage to the Pods.
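With Docker, for example, one can create a named volume and mount it into a container; a minimal sketch (the volume and container names below are just illustrative):

$ docker volume create mydata
$ docker container run -d -v mydata:/var/lib/mysql --name mydb mysql:5.7

Here the data written by MySQL to /data/lib/mysql... rather, to /var/lib/mysql, survives even if the container is removed, because it lives in the mydata volume on the host.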
In this blog post, we’ll see different ways to attach external storage to the Pods. The focus of the blog is to cover fundamentals and not to do a deep dive.
With the help of the volumes section in the Pod's definition, we can attach external storage to a Pod. The external storage can come from NFS, GlusterFS, a cloud provider, the host, etc. More details about Volumes can be found here.
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: gcr.io/google_containers/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    # This GCE PD must already exist.
    gcePersistentDisk:
      pdName: my-data-disk
      fsType: ext4
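As the comment says, the GCE Persistent Disk has to exist before the Pod is created. Assuming the gcloud CLI is set up, it could be created with something like the following (size and zone are illustrative):

$ gcloud compute disks create my-data-disk --size 200GB --zone us-central1-a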
In the above, the management of Volumes is offloaded to individual users, but as a developer I don't like that. I don't want to take responsibility for managing storage; I should just be able to say that I want 10 GB of storage and have it allocated from somewhere. In reality, of course, that storage would be backed by some physical storage.
To de-couple storage provisioning from its use, Kubernetes created two objects – Persistent Volume (PV) and Persistent Volume Claim (PVC). Persistent Volumes are created by the Kubernetes administrator and are backed by different physical storage like AWS EBS, GCE Persistent Disk, NFS, iSCSI etc. The administrator can create a pool of PVs, which can be backed by different physical storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-aws
spec:
  capacity:
    storage: 12Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  awsElasticBlockStore:
    fsType: "ext4"
    volumeID: "vol-f37a03aa"
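The administrator would create this PV with kubectl and can then inspect the pool (the file name below is just illustrative):

$ kubectl create -f pv-aws.yaml
$ kubectl get pv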
Now, coming back to the problem I mentioned earlier: as a developer I shouldn't have to worry about the underlying storage management, and it should be automatically allocated from somewhere. For this, Kubernetes provides another object, the Persistent Volume Claim (PVC), in which we (developers) define the storage requirement, like "I want 10 GB of storage".

Once we define our requirement in a PVC, Kubernetes searches the existing pool of PVs and attaches the best possible match. If there is no match, the claim stays pending and Kubernetes keeps looking.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pv-claim
  labels:
    app: mongodb
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rsvp-db
spec:
  replicas: 1
  template:
    metadata:
      labels:
        appdb: rsvpdb
    spec:
      containers:
      - name: rsvpd-db
        image: mongo:3.3
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongodb-persistent-storage
          mountPath: /data/db
      volumes:
      - name: mongodb-persistent-storage
        persistentVolumeClaim:
          claimName: mongodb-pv-claim
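Once the claim is created, we can check its status and which PV it got bound to:

$ kubectl get pvc mongodb-pv-claim

The STATUS column should show Bound once a matching PV from the pool is attached to the claim.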
In the above, we requested 10 GB worth of storage but a PV with 12 GB got attached, because that was the best match. This is good, but we allocated 2 GB extra, which might get wasted. What we saw is an example of Static Volume Provisioning, in which such fragmentation would be common.
To overcome this, Kubernetes provides Dynamic Volume Provisioning using the StorageClass object. An administrator can create StorageClasses based on his/her setup.

For example, an admin can create a StorageClass with the name platinum, which would create an SSD-backed disk on GCE. A user/developer would then just need to mention platinum as the StorageClass in the PVC.
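A minimal sketch of such a platinum StorageClass, assuming the in-tree GCE Persistent Disk provisioner, might look like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: platinum
# In-tree GCE Persistent Disk provisioner
provisioner: kubernetes.io/gce-pd
parameters:
  # pd-ssd creates an SSD-backed disk instead of the default pd-standard
  type: pd-ssd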
Following is an example in which we use the rook-block storage class, which would create a PV of exactly 10 GB on Rook storage. Rook is a cloud-native storage solution, built on top of Ceph.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pv-claim
  labels:
    app: mongodb
spec:
  storageClassName: rook-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rsvp-db
spec:
  replicas: 1
  template:
    metadata:
      labels:
        appdb: rsvpdb
    spec:
      containers:
      - name: rsvpd-db
        image: mongo:3.3
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongodb-persistent-storage
          mountPath: /data/db
      volumes:
      - name: mongodb-persistent-storage
        persistentVolumeClaim:
          claimName: mongodb-pv-claim
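After applying the above manifest (the file name is illustrative), a 10 Gi PV should get provisioned on the fly and bound to the claim, which we can verify with:

$ kubectl apply -f rsvp-db.yaml
$ kubectl get pv,pvc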
In the above, we have seen how we can attach external storage to Kubernetes Pods using Volumes. This is very specific to Kubernetes; on other container orchestrators, we have to follow different methods and use other plugins to consume external storage. That is a nightmare for any storage vendor, as they have to make sure that their storage solution is compatible with different orchestrators. To solve this, the storage vendors and the container community are trying to build a common interface, the Container Storage Interface (CSI), so that a storage plugin written for one orchestrator works with others as well. Do check it out.
Happy Learning !!!