Kubernetes – An Open-Source Container Management Framework
Docker is an essential open-source tool for building, deploying, and maintaining containerized applications, and it makes delivering distributed applications much easier.
Kubernetes provides a simpler way to scale your application(s). It helps maintain high availability, scalability, and operational stability.
With Kubernetes, resource management, provisioning, testing, and many other processes can be automated.
What is Docker
Consider a real-world example: a developer writes the application (code) and hands it over to a tester for evaluation.
The first issue that surfaces is that the code will not run on the tester's system. This happens because the two computing environments differ.
To get around this and make the code work for the tester, we could create a virtual machine and reproduce the complete environment.
Using Docker (containers) is a simpler and more effective alternative.
Virtual machines and Docker containers differ primarily in the following ways:
The main distinction is that Docker containers share the host machine's OS kernel instead of each carrying a full guest OS. As a result, a container is much lighter while still offering the same benefits.
The advantages and disadvantages of a virtual machine versus Docker
Comparatively speaking, Docker uses less RAM on the host computer than a virtual machine does.
The two have extremely different boot-up times. Docker starts up quicker. Compared to a virtual machine, the Docker environment performs faster and more reliably.
Compared to a virtual machine environment, Docker is also incredibly simple to start up and scale.
Docker is simpler to port across several systems.
Because a guest OS is not required for each container, the difference in disk space between Docker and a virtual machine is substantial. The Docker environment is inherently smaller since all containers share the host's OS kernel.
As a developer, you can now build your solution or application and hand it to a tester, and everything will work as expected as long as Docker is running in the tester's environment.
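For example, a minimal sketch of how this handoff might look (the image name myrepo/myapp and the registry are illustrative, not part of the original setup):
# Developer: package the application and its environment into an image
$ docker build -t myrepo/myapp:1.0 .
# Share the image via a registry the tester can pull from
$ docker push myrepo/myapp:1.0
# Tester: pull and run the exact same environment anywhere Docker is installed
$ docker pull myrepo/myapp:1.0
$ docker run -d myrepo/myapp:1.0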
Docker brings a lot of advantages:
- Isolation
- Consistency
- Cost-effectiveness
- Fast deployment
- Mobility
- Portability
- Repeatability
- Automation
- Rollback
- Flexibility
- Modularity, and
- Scaling
Imagine that the application you have created has to be distributed to several testers or other end users. In other words, one Docker container is created for each of these users.
So, what happens if there is an upgrade to the application?
All users of the application should see updates and modifications as well, so every container has to be updated. Managing all of these containers manually would be challenging.
We need a framework to make maintaining and operating these containers easier. Kubernetes, a container management framework created at Google, was released as open source in 2014.
Let's examine Kubernetes, sometimes referred to as K8s:
Kubernetes is a framework and technology for managing distributed containers (microservices). It manages the life cycle of Docker containers and is itself made up of a number of distributed components.
These components run on nodes, the Linux machines on which Kubernetes operates. Nodes come in two varieties:
- Worker/Slave Node: In charge of running your Docker containers
- Master Node: In charge of managing the cluster's state
The roles of the Kubernetes components operating on these nodes are well specified.
A typical workflow in Kubernetes is:
List of Components in a Kubernetes Cluster:
Component | Node |
---|---|
API Server | Master |
Etcd | Master |
Scheduler | Master |
Controller Manager | Master |
Kubelet | Worker |
Kube-Proxy | Worker |
Container Engine | Worker |
API Server
- The heart of any Kubernetes cluster.
- It is a REST API.
- It is also stateless.
- It listens on port 6443 by default (rather than the usual HTTP port 80 or HTTPS port 443).
- Resources managed through the kube-apiserver include:
- Pod
- ReplicaSet
- PersistentVolume
- NetworkPolicy
- Deployment
Note: all are related to container management only.
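Because the API server is a plain REST API, it can be queried directly. A minimal sketch using kubectl's built-in proxy (the proxy listens on localhost:8001 by default; the paths shown are standard Kubernetes API paths):
# Open an authenticated local proxy to the API server (run in a separate terminal)
$ kubectl proxy
# List pods in the default namespace via the REST API
$ curl http://127.0.0.1:8001/api/v1/namespaces/default/pods
# Deployments live under the apps API group
$ curl http://127.0.0.1:8001/apis/apps/v1/namespaces/default/deployments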
Etcd
- A distributed key-value store
- The primary datastore of the Kubernetes cluster
- An independent project, maintained separately from Kubernetes
- There is no alternative to Etcd for cluster data storage
- The API server reads data from and writes data to Etcd
- Stores the cluster state information
- A single point of failure: if Etcd crashes, the Kubernetes cluster becomes inaccessible
- Not an in-memory database; data is stored on disk
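As a rough illustration, the keys Kubernetes writes into Etcd can be listed with etcdctl, assuming etcdctl is installed and has access to the cluster's Etcd endpoint (on a secured cluster the --endpoints, --cacert, --cert, and --key flags are also needed):
# List the keys Kubernetes stores under /registry (keys only)
$ ETCDCTL_API=3 etcdctl get /registry --prefix --keys-only
# Example: keys for pod objects
$ ETCDCTL_API=3 etcdctl get /registry/pods --prefix --keys-only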
Scheduler
- Responsible for selecting a worker node for each pod.
- The scheduler continuously queries the API server at regular intervals to list the pods that have not yet been scheduled.
- Each pod object stored in Etcd has a property called 'nodeName' - the name of the worker node where the pod will be deployed (once nodeName is assigned, the pod is considered scheduled).
- The scheduler is responsible for assigning the name of the worker node for each pod.
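The nodeName assignment can be observed directly. A small sketch ({pod-name} is a placeholder):
# Show which worker node the scheduler picked for a pod
$ kubectl get pod {pod-name} -o jsonpath='{.spec.nodeName}'
# Pending pods are the ones the scheduler has not yet assigned to a node
$ kubectl get pods --field-selector=status.phase=Pending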
Controller-Manager
- Works to keep the actual state of the cluster in line with the desired state recorded in the Etcd datastore.
- It performs garbage collection of pods, nodes, events, etc. on the cluster.
- Different controllers are invoked to handle its various responsibilities.
- A few of these controllers:
- NodeController
- NamespaceController
- EndpointsController
- ServiceAccountController
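On a kubeadm- or minikube-style cluster, the controller manager runs as a static pod in the kube-system namespace; a quick way to locate it (the label used here is the standard one on such clusters, stated as an assumption):
$ kubectl -n kube-system get pods -l component=kube-controller-manager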
Kubelet
- The most important component of the worker node
- Interacts with the local Docker daemon on the worker node
- Kubelet is required to run on the host machine
- Kubelet refers to the configuration file ~/kubernetes/kubelet.conf
- The config file specifies two key parameters:
- The endpoint of the Kube-apiserver component
- The local Docker daemon UNIX socket
- Kubelet is a bridge between Apiserver and the local container (Docker) daemon
- The kubelet polls the API server with a GET request (at an interval of about 20 seconds) to check the list of pods created in Etcd that it should run.
- The kubelet also performs garbage collection:
- Checks for unused images every five minutes
- Checks for unused containers every minute
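Unlike most other components, the kubelet usually runs as a host service rather than as a pod. On a typical systemd-based Linux node it can be inspected as sketched below (the unit name may vary by distribution):
# Check whether the kubelet service is running
$ systemctl status kubelet
# Follow the kubelet logs (useful when pods fail to start)
$ journalctl -u kubelet -f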
Kube-Proxy
- Handles all the networking-related tasks for containers on a Kubernetes worker node
- Enables access to the running pods from other pods or from external applications
- The Proxy uses a feature called Service
- Services route traffic to Pods
- Maintains network rules on nodes to manage and allow network communication
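As a quick illustration of the Service concept that kube-proxy implements (the deployment name and port are illustrative):
# Create a Service that routes traffic to the pods of a deployment
$ kubectl expose deployment nginx-depl --port=80 --type=NodePort
# List services and the cluster IPs / node ports kube-proxy will route for
$ kubectl get services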
Container Engine
- Kubernetes is a container management tool
- The Default Container engine is Docker
- Common container runtimes:
- Docker Engine
- containerd
- CRI-O
- Mirantis Container Runtime
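To see which container runtime each node is actually using, check the CONTAINER-RUNTIME column of:
$ kubectl get nodes -o wide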
Kubernetes Installation on Windows
Installation Steps:
1: Check if Virtualization is supported in your machine
$ systeminfo
2: Download/Install kubectl, minikube utility & Add it to the Local ENV PATH
https://kubernetes.io/docs/tasks/tools/install-kubectl-windows/
https://github.com/kubernetes/minikube/releases
- a. Download the kubectl.exe file and place it in a dedicated folder (the minikube folder).
Add that folder to the PATH environment variable.
Test kubectl:
open a command prompt (or PowerShell)
type: kubectl and press Enter
- b. Download the file from the GitHub link shared above:
minikube-windows-amd64.exe
Place the file in the same folder as kubectl (the minikube folder).
3: Download/Install a Hypervisor | ORACLE VM BOX
https://www.virtualbox.org/wiki/Downloads
4: Start the Minikube (Specify the Driver)
syntax: $ minikube start --driver=driver_name
$ minikube start
or
$ minikube start --driver=virtualbox
5: To verify the Minikube installation
$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
6: Create & Expose a container in the Minikube Cluster
syntax: $ kubectl create deployment <name> --image=<image>
> kubectl get pod
No resources found in default namespace.
> kubectl get deployment
No resources found in default namespace.
> kubectl create deployment test1-v1 --image=spark
deployment.apps/test1-v1 created
> kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
test1-v1 0/1 1 0 28s
> kubectl get pod
NAME READY STATUS RESTARTS AGE
test1-v1-56f4d5f5b4-qm7qg 0/1 ErrImagePull 0 41s
> kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1
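The deployment also needs to be exposed as a Service before it can be reached in step 7. A minimal sketch, assuming the container listens on port 8080 (the port number is illustrative):
> kubectl expose deployment test1-v1 --type=NodePort --port=8080
service/test1-v1 exposed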
7: Test or Access the Container
$ minikube service test1-v1 --url
8: Cleanup
$ kubectl delete deployment test1-v1
$ kubectl delete service test1-v1
$ minikube stop
$ minikube delete
A few kubectl commands for reference:
kubectl get nodes
kubectl get pod
kubectl get services
kubectl create deployment nginx-depl --image=nginx
kubectl get deployment
kubectl get replicaset
kubectl edit deployment nginx-depl
### debugging
kubectl logs {pod-name}
kubectl exec -it {pod-name} -- /bin/bash
### create mongo deployment
kubectl create deployment mongo-depl --image=mongo
kubectl logs mongo-depl-{pod-name}
kubectl describe pod mongo-depl-{pod-name}
### delete deployment
kubectl delete deployment mongo-depl
kubectl delete deployment nginx-depl
### create or edit config file
vim nginx-deployment.yaml
kubectl apply -f nginx-deployment.yaml
kubectl get pod
kubectl get deployment
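A minimal sketch of what nginx-deployment.yaml might contain (the replica count, labels, and image tag are illustrative):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80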
### delete with config
kubectl delete -f nginx-deployment.yaml
### Metrics
kubectl top
# The kubectl top command returns current CPU and memory usage for a cluster’s pods or nodes, or for a particular pod or node if specified.
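Note that kubectl top relies on the metrics-server add-on being installed in the cluster; on minikube it can be enabled as sketched below:
# Enable the metrics-server addon (needed for kubectl top)
minikube addons enable metrics-server
kubectl top nodes
kubectl top pods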