Tag: Platform Engineer

Automating Kubernetes Clusters

Kubernetes is the de facto standard for container orchestration, powering modern cloud-native applications. As organizations scale their infrastructure, managing Kubernetes clusters efficiently becomes increasingly critical. Manual cluster provisioning is time-consuming and error-prone, leading to operational inefficiencies. To address these challenges, the Kubernetes project created the Cluster API, a subproject that enables the management of Kubernetes clusters through a Kubernetes-native, declarative API. In this blog post, we’ll delve into using ClusterClass and the Cluster API to automate the creation of Kubernetes clusters.

Let’s understand ClusterClass

ClusterClass is a Kubernetes Custom Resource Definition (CRD) introduced as part of the Cluster API. It serves as a blueprint for defining the desired state of a Kubernetes cluster. ClusterClass encapsulates various configuration parameters such as node instance types, networking settings, and authentication mechanisms, enabling users to define standardized cluster configurations.

Setting Up Cluster API

Before diving into ClusterClass, it’s essential to set up the Cluster API components within your Kubernetes environment. This typically involves deploying the Cluster API controllers and providers, such as AWS, Azure, or vSphere, depending on your infrastructure provider.
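As a rough sketch of that setup, the snippet below wraps clusterctl init in a small helper. It assumes clusterctl is installed and that your kubeconfig points at the management cluster; the helper name init_capi is made up for illustration. Depending on your Cluster API release, ClusterClass support may sit behind the ClusterTopology feature gate, which clusterctl reads from the CLUSTER_TOPOLOGY environment variable.

```shell
#!/usr/bin/env bash
# Enable the ClusterClass (cluster topology) feature gate for the controllers
# that clusterctl installs. Some releases may not need this.
export CLUSTER_TOPOLOGY=true

# init_capi is a hypothetical helper: it installs the core Cluster API
# controllers plus the infrastructure provider named in $1.
init_capi() {
  provider="$1"   # e.g. aws, azure, vsphere, docker
  clusterctl init --infrastructure "${provider}"
}

# Usage (needs clusterctl and a reachable management cluster):
#   init_capi docker
```

Most infrastructure providers also expect credentials in environment variables before initialization, so check the provider's own documentation first.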

Creating a ClusterClass

Once the Cluster API is set up, defining a ClusterClass involves creating a Custom Resource (CR) using the ClusterClass schema. This example YAML manifest defines a ClusterClass:

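apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: my-cluster-class
spec:
  # Template for the provider-specific cluster infrastructure.
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      name: my-infrastructure-cluster
  # Templates for the control plane and its machines.
  controlPlane:
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: my-control-plane
    machineInfrastructure:
      ref:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: my-control-plane-machine
  # Named worker classes that clusters can instantiate.
  workers:
    machineDeployments:
    - class: default-worker
      template:
        bootstrap:
          ref:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            name: my-worker-bootstrap
        infrastructure:
          ref:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: DockerMachineTemplate
            name: my-worker-machine

In this example:

  • metadata.name specifies the name of the ClusterClass.
  • spec.infrastructure.ref references the provider's cluster template that defines the underlying infrastructure. The Docker provider kinds are used here for illustration; substitute the equivalent kinds for AWS, Azure, or vSphere. All template names are placeholders and must exist as separate resources.
  • spec.controlPlane references the control plane template and the machine template used for control plane nodes.
  • spec.workers.machineDeployments defines named worker classes, each with a bootstrap template and an infrastructure template.
  • Node counts and the Kubernetes version are not part of the ClusterClass. They are set in spec.topology of each Cluster that uses the class (for example, 1 control plane node, 3 workers, and version v1.22.4).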

Applying the ClusterClass

Once the ClusterClass is defined, apply it to the management cluster so that it is available as a template. Applying it does not create a cluster by itself: the Cluster API controllers create an actual cluster only when a Cluster resource references the class in spec.topology.class, at which point they interpret the ClusterClass definition and orchestrate the creation of the cluster accordingly. Registering the ClusterClass looks like this:

kubectl apply -f my-cluster-class.yaml
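
In Cluster API, a cluster is instantiated from a ClusterClass by a Cluster object whose spec.topology.class names the class. The following is a minimal sketch of such an object, written to a file with a heredoc. The names my-cluster, md-0, and default-worker are assumptions for illustration; the worker class must match one defined in your ClusterClass.

```shell
#!/usr/bin/env bash
# Write a Cluster that consumes the ClusterClass named my-cluster-class.
# my-cluster, md-0 and default-worker are placeholder names (assumptions).
cat > my-cluster.yaml <<'EOF'
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
spec:
  topology:
    class: my-cluster-class     # ClusterClass to instantiate
    version: v1.22.4            # Kubernetes version for this cluster
    controlPlane:
      replicas: 1
    workers:
      machineDeployments:
      - class: default-worker   # must match a worker class in the ClusterClass
        name: md-0
        replicas: 3
EOF

# Then create the cluster on the management cluster:
#   kubectl apply -f my-cluster.yaml
```

Keeping counts and versions in the Cluster rather than the class lets many clusters share one template while differing in size.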

Managing Cluster Lifecycle

The Cluster API facilitates the entire lifecycle management of Kubernetes clusters, including creation, scaling, upgrading, and deletion. Users can modify the ClusterClass definition to roll configuration changes out to every cluster that uses it. Per-cluster settings live in the Cluster's spec.topology; for example, scaling a cluster is achieved by updating the replicas field of a worker entry under spec.topology.workers.machineDeployments and reapplying the changes.
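
For a quick scale operation without editing the manifest, one option is a JSON Patch against the Cluster object. The sketch below only builds the patch document; replicas_patch is a hypothetical helper, and the cluster name my-cluster in the usage comment is an assumption.

```shell
#!/usr/bin/env bash
# replicas_patch is a hypothetical helper that prints a JSON Patch setting the
# replica count of the worker entry at index $1 to $2.
replicas_patch() {
  printf '[{"op":"replace","path":"/spec/topology/workers/machineDeployments/%s/replicas","value":%s}]\n' "$1" "$2"
}

# Usage (needs kubectl access to the management cluster):
#   kubectl patch cluster my-cluster --type=json -p "$(replicas_patch 0 5)"
```

In a GitOps workflow, editing the replica count in the stored manifest and reapplying it is usually preferable, because the change is then recorded in version control.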

Monitoring and Maintenance

Automation of cluster creation with ClusterClass and the Cluster API streamlines the provisioning process, reduces manual intervention, and enhances reproducibility. However, monitoring and maintenance of clusters remain essential tasks. Utilizing Kubernetes-native monitoring solutions like Prometheus and Grafana can provide insights into cluster health and performance metrics.

Wrapping it up

Automating Kubernetes cluster creation using ClusterClass and the Cluster API simplifies the management of infrastructure at scale. By defining cluster configurations as code and leveraging Kubernetes-native APIs, organizations can achieve consistency, reliability, and efficiency in their Kubernetes deployments. Embracing these practices empowers teams to focus more on application development and innovation, accelerating the journey towards cloud-native excellence.

Declarative vs Imperative Operations in Kubernetes: A Deep Dive with Code Examples

Kubernetes, the de facto orchestrator for containerized applications, offers two distinct approaches to managing resources: declarative and imperative. Understanding the nuances between these two can significantly impact the efficiency, reliability, and scalability of your applications. In this post, we’ll dissect the differences, advantages, and use cases of declarative and imperative operations in Kubernetes, supplemented with code examples for popular workloads.

Imperative Operations: Direct Control at Your Fingertips

Imperative operations in Kubernetes involve commands that make changes to the cluster directly. This approach is akin to giving step-by-step instructions to Kubernetes about what you want to happen. It’s like standing beside a chef and dictating each step, rather than describing the finished dish and letting them work out how to make it.

Example: Running an NGINX Deployment

Consider deploying an NGINX server. An imperative command would be:

kubectl create deployment nginx --image=nginx:1.17.10 --replicas=3

This command creates a deployment named nginx, using the nginx:1.17.10 image, and scales it to three replicas. (In current kubectl releases, kubectl run creates a single Pod rather than a Deployment and no longer accepts --replicas.) It’s straightforward and excellent for quick tasks or one-off deployments.

Modifying a Deployment Imperatively

To update the number of replicas imperatively, you’d execute:

kubectl scale deployment/nginx --replicas=5

This command changes the replica count to five. While this method offers immediate results, it lacks the self-documenting and version control benefits of declarative operations.

Declarative Operations: The Power of Describing Desired State

Declarative operations, on the other hand, involve defining the desired state of the system in configuration files. Kubernetes then works to make the cluster match the desired state. It’s like describing the finished dish to the chef; they know the intended outcome and can figure out how to get there.

Example: NGINX Deployment via a Manifest File

Here’s how you would define the same NGINX deployment declaratively:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17.10

You would apply this configuration using:

kubectl apply -f nginx-deployment.yaml

Updating a Deployment Declaratively

To change the number of replicas, you would edit the nginx-deployment.yaml file to set replicas: 5 and reapply it.

spec:
  replicas: 5

Then apply the changes:

kubectl apply -f nginx-deployment.yaml

Kubernetes compares the desired state in the YAML file with the current state of the cluster and makes the necessary changes. This approach is idempotent, meaning you can apply the configuration multiple times without changing the result beyond the initial application.
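
The compare-and-converge behaviour can be pictured with a toy model. The function below is only an illustration of the idea, not how Kubernetes controllers are actually implemented; the name reconcile and its output format are made up.

```shell
#!/usr/bin/env bash
# Toy model of one reconcile step: compare desired and observed replica counts
# and print the action needed. Illustrative only, not Kubernetes source code.
reconcile() {
  desired="$1"
  current="$2"
  if [ "${current}" -lt "${desired}" ]; then
    echo "create $((desired - current))"
  elif [ "${current}" -gt "${desired}" ]; then
    echo "delete $((current - desired))"
  else
    echo "no-op"
  fi
}

reconcile 5 3   # first apply: two pods are missing, prints "create 2"
reconcile 5 5   # second apply: state already matches, prints "no-op"
```

Applying the same desired state a second time yields no work, which is the idempotence described above.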

Best Practices and When to Use Each Approach

Imperative:

  • Quick Prototyping: When you need to quickly test or prototype something, imperative commands are the way to go.
  • Learning and Debugging: For beginners learning Kubernetes or when debugging, imperative commands can be more intuitive and provide immediate feedback.

Declarative:

  • Infrastructure as Code (IaC): Declarative configurations can be stored in version control, providing a history of changes and facilitating collaboration.
  • Continuous Deployment: In a CI/CD pipeline, declarative configurations ensure that the deployed application matches the source of truth in your repository.
  • Complex Workloads: Declarative operations shine with complex workloads, where dependencies and the order of operations can become cumbersome to manage imperatively.

Conclusion

In Kubernetes, the choice between declarative and imperative operations boils down to the context of your work. For one-off tasks, imperative commands offer simplicity and speed. However, for managing production workloads and achieving reliable, repeatable deployments, declarative operations are the gold standard.

As you grow in your Kubernetes journey, you’ll likely find yourself using a mix of both approaches. The key is to understand the strengths and limitations of each and choose the right tool for the job at hand.

Remember, Kubernetes is a powerful system that demands respect for its complexity. Whether you choose the imperative wand or the declarative blueprint, always aim for practices that enhance maintainability, scalability, and clarity within your team. Happy orchestrating!

Leveraging Automation in Managing Kubernetes Clusters: The Path to Efficient Operation

Automation in managing Kubernetes clusters has burgeoned into an essential practice that enhances efficiency, security, and the seamless deployment of applications. With the exponential growth in containerized applications, automation has facilitated streamlined operations, reducing the room for human error while significantly saving time. Let’s delve deeper into the crucial role automation plays in managing Kubernetes clusters.

The Imperative of Automation in Kubernetes

The Kubernetes Landscape

Before delving into the nuances of automation, let’s briefly recap the fundamental components of Kubernetes: pods are the smallest deployable units, nodes are the machines that run them, and a cluster is a set of nodes managed by a control plane.

The Need for Automation

Automation emerges as a vanguard in managing complex environments effortlessly, fostering efficiency, reducing downtime, and ensuring the optimal utilization of resources.

Efficiency and Scalability

Automation in Kubernetes ensures that clusters can dynamically scale based on the workload, fostering efficiency, and resource optimization.

Reduced Human Error

Automating repetitive tasks curtails the scope of human error, facilitating seamless operations and mitigating security risks.

Cost Optimization

Through efficient resource management, automation aids in cost reduction by optimizing resource allocation dynamically.

Automation Tools and Processes

CI/CD Pipelines

Continuous Integration and Continuous Deployment (CI/CD) pipelines are at the helm of automation, fostering swift and efficient deployment cycles.

pipeline:
  build:
    image: node:14
    commands:
      - npm install
      - npm test
  deploy:
    image: google/cloud-sdk
    commands:
      - gcloud container clusters get-credentials cluster-name --zone us-central1-a
      - kubectl apply -f k8s/

Declarative Example 1: A simple CI/CD pipeline.

Infrastructure as Code (IaC)

IaC makes infrastructure programmable: systems and devices are defined, provisioned, and managed through code rather than by hand.

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: mycontainer
    image: nginx

Declarative Example 2: Defining a Kubernetes pod using IaC.

Configuration Management

Tools like Ansible and Chef aid in configuration management, ensuring system uniformity and adherence to policies.

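- hosts: kubernetes_nodes
  become: true  # package installation needs root privileges
  tasks:
    # Assumes the Kubernetes apt repository is already configured on the hosts.
    - name: Ensure kubelet is installed
      apt:
        name: kubelet
        state: present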

Declarative Example 3: Using Ansible for configuration management.

Automation Use Cases in Kubernetes

Auto-scaling

Auto-scaling facilitates automatic adjustments to the system’s computational resources, optimizing performance and curtailing costs.

Horizontal Pod Autoscaler

Kubernetes’ Horizontal Pod Autoscaler automatically adjusts the number of pod replicas in a replication controller, deployment, or replica set based on observed CPU utilization.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Declarative Example 4: Defining a Horizontal Pod Autoscaler in Kubernetes.
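
To see how a target such as averageUtilization: 50 translates into a replica count, the sketch below applies the scaling formula from the Kubernetes documentation, desired = ceil(currentReplicas × currentMetric / targetMetric), and clamps the result to the configured bounds. It is a simplification: the real controller also applies a tolerance band and stabilization windows. The function name hpa_desired_replicas is made up for illustration.

```shell
#!/usr/bin/env bash
# hpa_desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Computes ceil(current * currentUtil / targetUtil), clamped to [MIN, MAX].
# Simplified: tolerance and stabilization behaviour are omitted.
hpa_desired_replicas() {
  current="$1"; util="$2"; target="$3"; min="$4"; max="$5"
  desired=$(( (current * util + target - 1) / target ))   # integer ceiling
  if [ "${desired}" -lt "${min}" ]; then desired="${min}"; fi
  if [ "${desired}" -gt "${max}" ]; then desired="${max}"; fi
  echo "${desired}"
}

# 4 pods averaging 90% CPU against a 50% target, bounds 1..10: prints 8
hpa_desired_replicas 4 90 50 1 10
```

With the same bounds, 4 pods at 200% utilization would compute to 16 and be capped at maxReplicas, 10.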

Automated Rollouts and Rollbacks

Kubernetes aids in automated rollouts and rollbacks, ensuring application uptime and facilitating seamless updates and reversions.

Deployment Strategies

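Kubernetes Deployments support rolling updates natively, as in the manifest below. Strategies such as blue-green and canary releases can also be automated, typically by combining Deployments with Services and label selectors or with additional tooling, facilitating controlled and safe deployments.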

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:v2

Declarative Example 5: Configuring a rolling update strategy in a Kubernetes deployment.
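
The percentages in a RollingUpdate strategy resolve against the replica count: maxSurge rounds up and maxUnavailable rounds down. A small sketch of that arithmetic follows; rollout_bounds is a hypothetical helper, and the replica count of 10 is an assumption, since the manifest above does not set replicas.

```shell
#!/usr/bin/env bash
# rollout_bounds REPLICAS SURGE_PCT UNAVAILABLE_PCT
# Prints "max_pods min_available" for a RollingUpdate. Percentages resolve
# against the replica count: maxSurge rounds up, maxUnavailable rounds down.
rollout_bounds() {
  replicas="$1"; surge_pct="$2"; unavailable_pct="$3"
  surge=$(( (replicas * surge_pct + 99) / 100 ))        # ceiling
  unavailable=$(( replicas * unavailable_pct / 100 ))   # floor
  echo "$((replicas + surge)) $((replicas - unavailable))"
}

# 10 replicas with 25% / 25%: at most 13 pods, at least 8 available
rollout_bounds 10 25 25

# Watching and reverting a rollout (needs kubectl and cluster access):
#   kubectl rollout status deployment/myapp
#   kubectl rollout undo deployment/myapp
```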

Conclusion: The Future of Kubernetes with Automation

As Kubernetes continues to be the front-runner in orchestrating containerized applications, the automation integral to its ecosystem fosters efficiency, security, and scalability. Through a plethora of tools and evolving best practices, automation stands central in leveraging Kubernetes to its fullest potential, orchestrating seamless operations, and steering towards an era of self-healing systems and zero-downtime deployments.

The ever-evolving landscape of Kubernetes managed through automation points to a future where complex deployments are handled with increased efficiency and reduced manual intervention. Leveraging automation tools and practices helps ensure that Kubernetes clusters not only meet current requirements but are also future-ready, paving the way for a robust, scalable, and secure operational environment.


Kubernetes quickstarts – AKS, EKS, GKE

There have been a lot of inquiries about how to get started quickly with what are commonly referred to as the hyperscalers. Let’s dive in for a super quick primer!

All of these quickstarts assume the reader has an account with each service and the appropriate rights, and in most cases the reader needs the provider’s command-line client installed (gcloud, az, or eksctl).

Starting with Google Kubernetes Engine (GKE)

export NAME="$(whoami)-$RANDOM"
export ZONE="us-west2-a"
gcloud container clusters create "${NAME}" --zone "${ZONE}" --num-nodes=1
gcloud container clusters get-credentials "${NAME}" --zone "${ZONE}"

Moving on to Azure Kubernetes Service (AKS)

export NAME="$(whoami)-$RANDOM"
export AZURE_RESOURCE_GROUP="${NAME}-group"
az group create --name "${AZURE_RESOURCE_GROUP}" -l westus2
az aks create --resource-group "${AZURE_RESOURCE_GROUP}" --name "${NAME}"
az aks get-credentials --resource-group "${AZURE_RESOURCE_GROUP}" --name "${NAME}"

For Elastic Kubernetes Service (EKS)

export NAME="$(whoami)-$RANDOM"
eksctl create cluster --name "${NAME}"

As you can see, setting up these clusters is very simple. Now that you have a cluster, what are you going to do with it? Ensure you’ve installed the tools needed to manage the cluster. The get-credentials commands above merge each cluster’s credentials into ~/.kube/config, and eksctl writes the kubeconfig to the appropriate place automagically. To manipulate the cluster, install kubectl with your favorite package manager; to install applications, the easiest way is via helm.

Standing up a Kubernetes cluster on one of the major hyperscalers takes only a few lines of code. Add those lines into a shell script and standing up clusters can be a single command…just don’t forget to tear it down when you’re done!
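
For the teardown side, a minimal sketch is shown below. The helper names (run, teardown_gke, teardown_aks, teardown_eks) and the DRY_RUN switch are made up for illustration; the underlying delete commands are the counterparts of the create commands above. Note that the AKS helper deletes the whole resource group, which assumes the group was created only for this cluster.

```shell
#!/usr/bin/env bash
# run executes its arguments, or only prints them when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical teardown helpers mirroring the create commands above.
teardown_gke() { run gcloud container clusters delete "$1" --zone "$2" --quiet; }
teardown_aks() { run az group delete --name "$1" --yes --no-wait; }
teardown_eks() { run eksctl delete cluster --name "$1"; }

# Usage:
#   teardown_gke "${NAME}" "${ZONE}"
#   teardown_aks "${AZURE_RESOURCE_GROUP}"
#   teardown_eks "${NAME}"
```

Running with DRY_RUN=1 first prints each command without executing it, which is a cheap safeguard before deleting anything.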

10x the DevEX!

Recently there has been a shift in language from Site Reliability Engineering (SRE) and DevOps to Platform Engineering. Granted, these terms have been used in various ways for a while, but how terms are used shapes how markets evolve. This post provides a few key areas of thought around ways to ultimately get products to production faster. Remember…code means nothing until it’s in production.

No matter the title, anyone in the pipeline touching production code is part of the team ensuring the success of critical applications in an enterprise. This is an important concept because everyone is part of the larger team, and how teams work together ultimately determines the success of any project.

The focus here will be on the actual development team that is primarily writing the code. The code in question would be delivered as microservices running on a K8S cluster. Keep in mind the use of microservices lends itself to multiple teams individually creating a service for other teams to consume. Already there are significant dependencies and a single line of code has yet to be written.

Each team ultimately needs to consume one or more code repositories, one or more “testing” systems, at least one pipeline for continuous integration, continuous delivery/deployment (CI/CD), and many other systems to get code to production.

The Platform Engineering team is ultimately responsible for ensuring the “platforms” are working in a way to support the developers. Ensuring a great experience is paramount.

The question is: how do Platform Engineers continually improve the developer experience? The answer many teams turn to is to create powerful systems with guardrails or opinions on how they are to be utilized, based on the collective understanding of the team’s modus operandi, or how they work most effectively.

The key to the how is reducing repetitive work: the mundane, menial tasks that take a toll on developers’ cognitive load. Removing them lets developers focus on writing good, clean code.

Giving developers the power to consume what is needed in a self-service fashion is one major step, as is offering a limited set of choices in what toolsets to use. Make it easy for developers to build and deliver software without removing the useful capabilities of the core services.

In the ideal world, limit restrictions on the how, allowing choices between GitOps and ClickOps, or between an API, a CLI, and a UI. Use an “as a service” approach to create a system built iteratively by the entirety of the team based on direct feedback.

What it all comes down to is the fact that everyone has different ways they want to work. It’s the platform engineering team who can help ensure all of the tools are available and functional to create a great developer experience, which in turn will increase productivity and get new, shiny things to market faster.

90 days to success in DevOps

Starting a new role? Maybe this is the first foray into DevOps or Platform Engineering? What is needed to “hit the ground running” in a new role? Leaders in high positions of a company typically have a “100 day rule” to prove themselves. Let’s round it out with 3 months of progress for success.

In most enterprises, onboarding new talent is typically left to the new employee. This is very unfortunate because the first 90 days of a new role will impact not only the new employee, but their immersion into the culture and their view of the company. Bottom line, in most cases it is up to the new employee to “learn the ropes” in navigating their new position.

The first 30 days

This month is usually the most important for everyone. The first thing a new employee needs to do is find a good mentor, especially if they are not assigned one. Seek out those with institutional knowledge who know how to navigate the company politics. Find someone who knows how the systems work and how to gain the access needed to be successful in the role. The mentor would have knowledge of “how things work” and what is seen as best practice for accomplishing the tasks at hand.

Some things to know:

  • Who’s who in the organization? – an org chart
  • How mature are they as a development organization?
  • What are the processes to put code into production?
  • Are the processes manual or automated?
  • What is the expectation of you on a day to day basis?

There is plenty more to uncover, but this will help to get started. Once the processes are understood and access is granted to perform the role, find some quick wins. Listen closely to where the frustrations may lie within your organization. Maybe the previous employee in this role didn’t automate certain tasks…submit a small PR to help.

It’s important to find some quick wins for many reasons. First it helps “break the ice”. It also shows strengths. Maybe there’s a way to improve some docs. There may be some ideas brought in from previous experience to help with a particular pain point.

The first 30 days is important to uncover the expectations of the team. Talking to stakeholders and “the customer” is important to get a big picture of what works and what doesn’t in order to find quick wins to make an impact early.

Days 30-60

The first 4 weeks are usually greeted with firehose sessions daily. Take a bit to digest everything. Review notes, brainstorm ideas, understand how the team and the company works. Armed with the broader knowledge about the organization, the team, and how things work at a high level it’s time to dig deeper into where the biggest impacts can be achieved.

In this 30 day block uncover:

  • How mature is the team?
  • What is the approval process for delivering code to production?
  • What steps are needed to approve PRs?
  • How does code flow through the various systems?
  • What amount of QA is performed?

Find ways to help the team be more efficient. Listen to the complaints and see where possible improvements could be made. Again, quick wins are key at this stage. As a fresh face, gaining access to otherwise inaccessible groups within the organization is usually fairly easy. Keep an ear to the ground to find ways to create impactful suggestions.

It is important to remember that as people get to know a new employee, the interactions have lasting impacts. Ensure there is adequate listening and relevant questions to get underneath a complaint. Avoid making offhand suggestions, but rather find some common issues. Start to tackle the common issues and socialize improvements. The key here is to avoid “calling the baby ugly”.

Days 60-90

This is where a new employee’s impact can accelerate. At this stage, the access needed to be successful should be in place. Hopefully there have been a few quick wins, new co-workers are impressed, and there’s been positive impact on the team.

Regular interaction with your leader would have been established. A solid understanding of what is expected is created and the mentor has made an impact. Knowing where to go to get answers if there is a roadblock and knowing how to avoid the “potholes in the road” is key.

This stage is where the “rubber hits the road”. Gaining traction in the day to day and making regular impact to the business is routine at this point. This is where all of the knowledge gained in the first 60 days can be parlayed into a winning hand.

What success looks like

The first 3 months of any new position set the stage for every new employee. Creating a positive impression on the team helps build credibility within the broader organization and is key to instilling the confidence needed to be successful overall.

It may take far more than 90 days to feel comfortable with the role and that is okay. As long as there is a consistent method for learning and mistakes are not repeated the impact new employees make is usually sustainable for a long time. Make the best of it and keep track of the wins and losses for the inevitable review with “the boss”.

You got this. Go.

What’s missing in Kubernetes

Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of containerized applications. It is widely used for its ability to manage containers at scale and is the de facto standard for container orchestration. However, despite its broad adoption, there are still a few missing pieces that need to be addressed to make it fully functional.

Network Setup

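One of the main missing pieces in Kubernetes is a proper network setup. Kubernetes defines a networking model in which every pod can reach every other pod, but it does not ship an implementation of it: each cluster needs a network plugin that implements the Container Network Interface (CNI). Kubernetes also allows for the creation of multiple clusters, each with its own set of nodes, but communication between these clusters requires a well-defined network setup of its own.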

Without proper network setup, nodes in the same cluster may not be able to communicate with each other, or there may be issues with cross-cluster communication. This can result in application downtime, loss of data, and other issues that can impact business operations.

One solution to this problem is to use a software-defined networking (SDN) approach that allows for the creation of a virtual network infrastructure. An SDN controller can be used to manage the virtual network infrastructure and provide network services such as load balancing, routing, and security. With SDN, Kubernetes clusters can be properly connected, and communication between clusters can be streamlined.

Security

Another missing piece in Kubernetes is security. Kubernetes provides some basic security features such as role-based access control (RBAC) and network policies, but these are not always enough to secure the entire system.

Security is a critical aspect of any container orchestration system, and Kubernetes is no exception. Kubernetes clusters are complex systems with many components, and securing them requires a multi-layered approach.

To enhance security, Kubernetes clusters should be set up with secure communication channels and encrypted data storage. Additionally, it is important to create and enforce security policies that prevent unauthorized access to the system. This includes implementing identity and access management (IAM) policies, network segmentation, and regular vulnerability scanning.
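
As one small example of the network segmentation mentioned above, the sketch below writes a default-deny ingress NetworkPolicy to a file. The namespace my-namespace and the file name are assumptions for illustration, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy.

```shell
#!/usr/bin/env bash
# Write a NetworkPolicy that selects every pod in the namespace and allows no
# ingress traffic. my-namespace is a placeholder (assumption).
cat > default-deny-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-namespace
spec:
  podSelector: {}      # empty selector matches all pods in the namespace
  policyTypes:
  - Ingress            # no ingress rules are listed, so all ingress is denied
EOF

# Apply it (needs kubectl and cluster access):
#   kubectl apply -f default-deny-ingress.yaml
```

Starting from deny-all and then adding narrowly scoped allow policies keeps the permitted traffic explicit and reviewable.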

Monitoring and Logging

Kubernetes also lacks comprehensive built-in monitoring and logging. It includes some basic features, such as health checks and resource usage metrics, but it does not aggregate, store, or analyze logs and metrics across the cluster for you.

In a production environment, it is essential to have comprehensive monitoring and logging capabilities to ensure the health and availability of the system. Kubernetes clusters should be set up with a logging and monitoring stack that can collect and analyze logs and metrics from all nodes in the cluster. This can provide insights into the health and performance of the system, as well as help identify and troubleshoot issues.

Conclusion

Kubernetes is a powerful container orchestration system, but there are still a few missing pieces that need to be addressed to make it fully functional. A well-defined network setup, enhanced security, and proper monitoring and logging are all essential components of a fully functional Kubernetes environment.

With the increasing adoption of containers and cloud-native applications, Kubernetes is becoming more important than ever. As organizations continue to adopt Kubernetes, it is important to ensure that the missing pieces are addressed to provide a reliable and scalable platform for containerized applications. By addressing these missing pieces, Kubernetes can continue to evolve and improve, providing a robust and secure platform for developers and IT teams.


Securing cloud native containers

Security, in and of itself, is a broad topic. Container security adds yet another facet to the already nebulous subject of security. In a lot of enterprises today, security is first and foremost, and the process for securing applications continues to shift left, meaning security is being integrated earlier in the development process. This post will focus on some of the high-level tasks and automations developers and operators can implement to mitigate risk.

The issues.

Misconfiguration

The #1 security risk in any cloud native environment is misconfiguration. How do operators know if what they are deploying is secured properly? In a lot of cases, deployments are left insecure for long periods of time without anyone noticing. This is a massive problem, especially for new technologies such as Kubernetes.
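One way to make "secured properly" concrete is to state the expected settings in the manifests themselves. The sketch below assumes a cluster with Pod Security Admission available. The namespace `payments`, the pod name, and the image are hypothetical.

```yaml
# Ask the built-in Pod Security Admission controller to reject pods
# that do not meet the "restricted" profile in this namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
---
# A pod written to satisfy that profile.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
  namespace: payments
spec:
  securityContext:
    runAsNonRoot: true               # refuse to start containers running as UID 0
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
```

With the namespace label in place, a deployment that omits these settings is rejected at admission time instead of running insecurely unnoticed. This covers only pod-level settings, not every kind of misconfiguration.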

Software Defects

Another security risk is software bugs. Every day new vulnerabilities are found in software. Some of the vulnerabilities are minor, but increasingly the discoveries are potentially critical issues when the software is deployed to a public-facing system. Vulnerabilities are “what is known”. Each known vulnerability has a signature that can be used to scan software.

However, “you don’t know what you don’t know”. Keep in mind many defects exist which are not known. These are the zero-day vulnerabilities.

Defense-in-depth

Scanning

Scanning software for known vulnerabilities is an absolute must-have in any defense-in-depth strategy. However, even the best scanning tools have blind spots (or known limitations). The best defense is a good offense, so building a system where your container images go through multiple scanners is always a good strategy.

It is important to scan at many different points in the development process and also continually in production. Any change could potentially be a breach. It is also very important to have layers that back up the other layers if one layer proves permeable. Impervious security requires layers, and your goal as a security architect is to create impervious security. Read on for other layers.

Network visualization

“It starts and ends with the network”. Kubernetes, being the orchestrator of choice for cloud native deployments, attempts to keep things simple, which has led to a number of CNI (Container Network Interface) plugins that give platform engineering many choices when deploying workloads. Having something to visualize the network is important, especially when you can act upon those connections. NeuVector provides these capabilities. Being able to quarantine a pod or take a packet capture is key to ensuring continuous protection against unknown attacks and for any required forensics.

Data protection

Many different regulations apply to data for enterprises. Being able to provide audit reports for specific data regulations, such as HIPAA or PCI DSS, is massively important. Being able to provide reporting for SOC 2 compliance may be important too. If your tool cannot “see” into the packet before it traverses the kernel, then it cannot prevent data from crossing a “domain” or prevent sensitive information from being leaked.

WAF

A lot of cloud native security tools have the ability to block layer 3 packets. Very few have true layer 7 capabilities. Being able to manage traffic at layer 7 is critical to anyone who is running applications on Kubernetes. Layer 7 is where a lot of unknown vulnerabilities are stopped, but only if the tool can look into the application packet before it traverses the kernel. This is the important point. Once the packet crosses the kernel, you are compromised. Use a tool which will learn your workload's behavior. This behavior is the workload's signature and would be the ONLY traffic allowed to traverse the network.

Wrapping it up

Security is the highest-scoring word in buzzword bingo these days. Everyone wants to ensure their environments are secure, and it takes specialized tools for specific platforms. Don’t use the perimeter firewall as a Kubernetes firewall…it simply will not suffice for complete security inside a Kubernetes cluster. Use a tool which can watch every packet and the data inside every packet, ensuring that only packets with your workload's signature, and nothing else, traverse the network. Pick one that allows for visualization of the network along with the traditional scanning, admission control, and runtime security that every cloud native implementation requires.

Knowledge spew on GitOps

In working with a handful of customers the concept of GitOps continues to resonate more and more. Let us dive into a brain dump of some of the conversations related to GitOps and how these customers tackled the task at hand.

The first thing to remember is that these customers are not massive. They are rather common, actually: a Gartner-defined “medium-sized” enterprise. Keep in mind these customers have the same issues as the giant enterprises, just at a different scale.

In every case there was a user story. At a high level, a common theme was the need to roll out updates to a specific application regularly enough to find ways to entice the consumer to purchase a widget of some sort. Ok, A/B testing. Simple enough.

Each of the customers was at a different maturity level when it came to development processes, Kubernetes knowledge, and DevOps methods. However, they all had one thing in common…the need to deliver an application to their customer base on a deadline and continuously improve the application based on user feedback. All three of them were successful in meeting their self-imposed deadlines. How?

Simple. Every one of them came together, ironed out a plan, and implemented the plan. The interesting part, every one of them already knew how to get the product to market. All they needed was a bit of guidance on how to overcome obstacles and get shit done. How?

  • Step one. Define the top of the mountain, the finish line, the end result.
  • Step two. The project leads built out a high level timeline from end to beginning.
  • Step three. All of the team members came together to build out the task teams.
  • Step four. Each of the teams built out their respective timeline for contribution.
  • Step five. Build.

Now how does this relate to GitOps? GitOps was the pivotal methodology to get it done. The pipeline was built with all of the parts in mind. If you recall the DevOps “infinity loop”, the key is to use that and combine it with the OODA loop decision model. The combination creates a very powerful decision-making framework facilitating agile development with constant improvement. Sound simple? It’s not. It is in theory, but the implementation is like a relationship. Everything is great when dating, but the hard work is when dating turns to marriage. Same goes for creating a product. Designing the product, what it needs to do, and all of the moving parts is fun. The real work comes in when the first working build is complete.

This is where GitOps shines. The developers build things, test locally, and commit. The pipelines move it through the process and all of the other teams contribute to each part in this machine. If one part breaks down, the other work stops to crowdsource the problem. The problem is fixed and the machine continues on. GitOps is the magical fairy dust. What about the technology?

The technology is rather mundane, actually. Git. A code repository. A CI/CD pipeline. A build system. A test harness. A deployment platform. Git…the tools of choice are GitHub or GitLab. GitHub is pretty slick, but GitLab allows for running locally in small environments building closed-source deliverables. Each has a pipeline mechanism, or there are many other tools such as Tekton, Argo, CircleCI, and others, with various features depending on what is needed. For build systems, many exist, and again each has features as needed. However, the deployment platform consistently remains the same: Kubernetes.
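To make the "Git as the source of truth" idea a bit more tangible, here is a rough sketch of what the deployment end of such a pipeline can look like. It assumes Argo CD is installed in an `argocd` namespace, which is only one of the tools listed above. The application name, repository URL, and path are hypothetical and not taken from any of these customers.

```yaml
# An Argo CD Application: the controller watches the Git repo below and
# keeps the cluster in sync with whatever manifests are committed there.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: storefront                 # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/acme/storefront-deploy.git   # hypothetical repo
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: storefront
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made directly on the cluster
```

With something like this in place, a merged commit is the deployment. Nobody runs `kubectl apply` by hand, and rolling back is a `git revert`.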

Building deployable applications at scale is hard. There are many other moving parts, processes, tools, etc. in play. However, one thing stands out in all of these engagements…give the right people who have the will to succeed the skills needed to succeed, and the execution part will look easy.

It is always fun to be a part of something, but its most precious reward is being able to step away and watch the machine run on its own.

That’s the end of this spew. It went everywhere…maybe it’s more like a sneeze.

Peace out.