Tanzu Kubernetes Cluster lifecycle on vSphere Supervisor with ArgoCD

Navneet Verma
8 min read · Jan 17, 2024

Introduction

vSphere with Tanzu introduces the Supervisor, a powerful capability that enables organizations to run Kubernetes workloads directly on their vSphere infrastructure. This integration simplifies the deployment and management of Kubernetes clusters, providing a seamless experience for both virtual machine workloads and containerized applications.

Since the Supervisor provides a Kubernetes dial tone, it can be leveraged to run many core and critical data center/organization services, called Supervisor Services. Users can leverage vSphere Namespaces to provide secure tenancy for these vital services. VMware and some partners currently curate and release these Supervisor Services, but the concept can easily be extended to any Kubernetes application. Developing these services is discussed here.

Today, we will discuss one such service, ArgoCD, and how to leverage it to deploy and manage the entire lifecycle of a Tanzu Kubernetes Cluster.

ArgoCD

ArgoCD is a popular GitOps tool that automates deployment and lifecycle management both of applications running in Kubernetes clusters and of the clusters themselves. It provides a declarative approach to defining and managing application configurations using Git repositories. ArgoCD continuously monitors the configured Git repositories for changes and automatically applies them to the specified Kubernetes clusters, ensuring that the desired state is always maintained.

Advantage of GitOps

GitOps is a modern approach to continuous delivery that leverages version control systems like Git to manage infrastructure and application configurations. This method brings several advantages, including:

  • versioning
  • auditability
  • the ability to roll back easily to previous states

In deploying Tanzu Kubernetes Clusters (TKC) and the applications running within them, GitOps ensures that the entire infrastructure is defined as code and can be easily reproduced, promoting consistency and reliability.

Let us get started!

Artifacts and Configurations

To follow along with this blog, you can refer to the GitHub repository: argocd-gitops-tanzu. The repository contains several subdirectories, each covering a part of the Tanzu Kubernetes Cluster deployment through ArgoCD. The folders are as follows:

tkc

This directory holds the desired state configuration for the Tanzu Kubernetes Cluster. It includes specifications for the TKC, such as size, version, and networking details. It also contains the configuration for a Kubernetes job that ArgoCD executes upon successful completion of the TKC deployment.
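
For reference, below is a minimal sketch of the shape such a ClusterClass-based cluster definition takes. This is not the repository's exact file; the TKR version, CIDRs, VM class, and storage class are illustrative placeholders that you would replace with values valid for your environment.

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: workload-vsphere-tkg1
  namespace: demo1
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.96.0.0/12"]     # illustrative
    pods:
      cidrBlocks: ["192.168.0.0/16"]   # illustrative
    serviceDomain: cluster.local
  topology:
    class: tanzukubernetescluster      # built-in ClusterClass on the Supervisor
    version: v1.26.5+vmware.2-fips.1   # illustrative TKR version
    controlPlane:
      replicas: 1
    workers:
      machineDeployments:
      - class: node-pool
        name: node-pool-1
        replicas: 1
    variables:
    - name: vmClass
      value: best-effort-small              # illustrative VM class
    - name: storageClass
      value: vsan-default-storage-policy    # illustrative storage policy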

Dockerfile-synchook

The Dockerfile-synchook directory contains the artifacts for building a custom image that automates registering the deployed TKC with the ArgoCD installation. This image is executed by the job in the tkc folder (see above). You can modify the image based on the details provided here, or you can use the image whoami6443/argocd-hook:0.1.3 referenced in tkc/argocd-job.yaml.
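
Under the hood, the hook is a regular Kubernetes Job carrying ArgoCD hook annotations, so ArgoCD runs it only after the TKC resources sync successfully. A minimal sketch, assuming the image referenced above (the service account and RBAC needed to read the kubeconfig secret are omitted):

apiVersion: batch/v1
kind: Job
metadata:
  generateName: app-tkc-create-    # ArgoCD supports generateName for hook resources
  annotations:
    argocd.argoproj.io/hook: PostSync                     # run only after the TKC manifests sync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded  # clean up the Job once it completes
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: argocd-hook
        image: whoami6443/argocd-hook:0.1.3
        # The image waits for the TKC kubeconfig secret, registers the new
        # cluster with ArgoCD, and creates the add-on Applications.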

cert-manager

This directory contains the Kubernetes manifests required to deploy and manage cert-manager within the Tanzu Kubernetes Cluster.
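
The layout here can be as simple as the upstream release manifest plus a kustomization.yaml that references it. An illustrative sketch, not the repository's exact contents:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
# Upstream release manifest, e.g. downloaded from the cert-manager
# GitHub releases page into this folder.
- cert-manager.yaml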

contour

Contour is an Ingress controller for Kubernetes. This directory contains the Kubernetes manifest files required to deploy and configure Contour for routing external traffic to services running in the Tanzu Kubernetes Cluster. A sample kustomization.yaml file shows how to leverage a proxy cache registry to change the image in the envoy deployment.
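
With kustomize, that image swap is a single entry in the images transformer. A sketch along these lines, where harbor.example.com is a placeholder for your proxy cache registry and the tag is illustrative:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- contour.yaml   # upstream Contour/Envoy manifest
images:
- name: docker.io/envoyproxy/envoy    # image name as it appears in the manifest
  newName: harbor.example.com/proxy-cache/envoyproxy/envoy   # pull through the proxy cache instead
  newTag: v1.27.0                     # illustrative tag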

gatekeeper

Gatekeeper is a validating and mutating webhook that enforces CRD-based policies executed by Open Policy Agent (OPA), a CNCF-hosted policy engine for cloud-native environments. This directory contains the deployment manifest and a standard kustomization.yaml file.
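
Once Gatekeeper is running, policies are written as ConstraintTemplates and Constraints. As a quick illustration (not part of this repository), the canonical K8sRequiredLabels example from the Gatekeeper documentation requires every namespace to carry a given label; it assumes the corresponding ConstraintTemplate has already been installed:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-owner
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Namespace"]
  parameters:
    labels: ["owner"]   # every Namespace must carry an "owner" label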

Note: You can install additional applications and add-ons to the TKC (beyond Contour, Gatekeeper, and cert-manager). To do so, create a folder for each in the repository and provide the relevant manifest files. You must also update tkc/argocd-app-cm.yaml with an appropriate ArgoCD application object referencing the new add-ons/applications, as sketched below.
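
Each such entry boils down to an ArgoCD Application object. A hedged sketch for a hypothetical my-addon folder (the destination cluster name depends on how the post-sync hook registered the TKC):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-addon                    # hypothetical add-on
  namespace: demo1
spec:
  project: default
  source:
    repoURL: https://github.com/papivot/argocd-gitops-tanzu.git
    targetRevision: HEAD
    path: my-addon                  # the new folder added to the repo
  destination:
    name: workload-vsphere-tkg1     # cluster name as registered by the post-sync hook
    namespace: my-addon
  syncPolicy:
    automated:
      prune: true
      selfHeal: true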

vCenter

In vCenter, the Supervisor is enabled and a vSphere Namespace demo1 has been created. An ArgoCD instance named demo1-argocd is running within the demo1 namespace, and a Kubernetes Service of type LoadBalancer exposes the ArgoCD installation on the IP address 10.220.3.197.

Note: In a separate article, we will discuss the installation of ArgoCD as a Supervisor Service within a Supervisor using an Operator pattern.

Steps

Log in to the ArgoCD instance

  • If you have not already done so, download the latest stable argocd CLI and install it in your PATH.
  • Log in to the Supervisor
$ kubectl vsphere login --server 10.220.3.194 -u administrator@vsphere.local


KUBECTL_VSPHERE_PASSWORD environment variable is not set. Please enter the password below
Password:
Logged in successfully.

You have access to the following contexts:
10.220.3.194
demo1
svc-argocd-operator-domain-c8
svc-mongodb-operator-domain-c8

If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`
  • Get the IP address of the ArgoCD instance’s LoadBalancer service (10.220.3.197 in our example).
$ kubectl get svc -n demo1
NAME                          TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)                      AGE
demo1-argocd-metrics          ClusterIP      10.96.0.44    <none>         8082/TCP                     5d1h
demo1-argocd-redis            ClusterIP      10.96.0.55    <none>         6379/TCP                     5d1h
demo1-argocd-repo-server      ClusterIP      10.96.1.26    <none>         8081/TCP,8084/TCP            5d1h
demo1-argocd-server           LoadBalancer   10.96.0.134   10.220.3.197   80:31375/TCP,443:31592/TCP   5d1h
demo1-argocd-server-metrics   ClusterIP      10.96.1.126   <none>         8083/TCP                     5d1h
  • Log in to the ArgoCD instance.
$ argocd login 10.220.3.197 --insecure
Username: admin
Password:
'admin:login' logged in successfully
Context '10.220.3.197' updated
  • For a UI experience, you can also log in to the ArgoCD UI from your browser. On a new install, you should not see any applications configured.

Deploying the main application

ArgoCD works with applications defined in Git repositories. Deploy the main application by synchronizing the GitOps repository with ArgoCD. This process will trigger the deployment of all the components necessary for the Tanzu Kubernetes Cluster.

  • Create an application tkc-deploy within ArgoCD. This application points to the https://github.com/papivot/argocd-gitops-tanzu.git repo (referenced above), with the path set to the tkc folder. The target cluster is the in-cluster Supervisor, and the target namespace is demo1.
$ argocd app create tkc-deploy --repo https://github.com/papivot/argocd-gitops-tanzu.git --path tkc --dest-server https://kubernetes.default.svc --dest-namespace demo1 --auto-prune --sync-policy auto
application 'tkc-deploy' created

That is all it takes to deploy your Tanzu Kubernetes Cluster, with all the required applications and add-ons running within it.

Results

  • The first thing you will observe is that the cluster deployment initiates in vCenter, and soon a TKC named workload-vsphere-tkg1 is created. This cluster has one worker node and a single control plane node, as per the configuration defined in tkc/tkgs-cluster-class-noaz.yaml.
  • Once the application responsible for the cluster deployment is complete, ArgoCD will execute the post-hook job app-tkc-create-xxxx. This job registers the new cluster (using its kubeconfig secret) with the ArgoCD installation and schedules the applications and add-ons onto the new cluster (cert-manager, Contour, and Gatekeeper in this example).
  • Within a few minutes, ArgoCD should reconcile the applications within the TKC, and the UI should show something similar to this:
  • Log in to the newly created TKC.
$ kubectl vsphere login --server 10.220.3.194 -u administrator@vsphere.local --tanzu-kubernetes-cluster-name workload-vsphere-tkg1 --tanzu-kubernetes-cluster-namespace demo1


KUBECTL_VSPHERE_PASSWORD environment variable is not set. Please enter the password below
Password:
Logged in successfully.

You have access to the following contexts:
10.220.3.194
demo1
svc-argocd-operator-domain-c8
svc-mongodb-operator-domain-c8
workload-vsphere-tkg1

If the context you wish to use is not in this list, you may need to try
logging in again later, or contact your cluster administrator.

To change context, use `kubectl config use-context <workload name>`

$ kubectl config use-context workload-vsphere-tkg1
Switched to context "workload-vsphere-tkg1".
  • List the pods, and you can see that cert-manager, Contour, and Gatekeeper are installed.
$ kubectl get pods -A
NAMESPACE           NAME                                             READY   STATUS      RESTARTS        AGE
cert-manager        cert-manager-7865d8497c-rd7sb                    1/1     Running     0               14m
cert-manager        cert-manager-cainjector-7769cd8976-942s2         1/1     Running     0               14m
cert-manager        cert-manager-webhook-7768cfb496-wfzpn            1/1     Running     0               14m
...
projectcontour      contour-certgen-v1-27-0-49jfq                    0/1     Completed   0               14m
projectcontour      contour-fbd7dcdcd-cr728                          1/1     Running     0               14m
projectcontour      contour-fbd7dcdcd-v4mbj                          1/1     Running     0               14m
projectcontour      envoy-mb2pm                                      2/2     Running     0               13m
...
gatekeeper-system   gatekeeper-audit-56f7f4bd77-7bvlg                1/1     Running     1 (3m56s ago)   4m
gatekeeper-system   gatekeeper-controller-manager-86487c88f4-j6pdr   1/1     Running     0               4m
gatekeeper-system   gatekeeper-controller-manager-86487c88f4-lng22   1/1     Running     0               4m
gatekeeper-system   gatekeeper-controller-manager-86487c88f4-spnts   1/1     Running     0               4m

With a single argocd command, we deployed a fully functioning cluster with several applications running in it. Let us perform an additional workflow on the cluster.

Additional Step

We need to scale out the cluster from one worker node to two. In true GitOps fashion, I will update the code in my Git repository, and ArgoCD should reconcile the configuration as needed. To do so, I only need to modify tkc/tkgs-cluster-class-noaz.yaml from

    workers:
      machineDeployments:
      - class: node-pool
        name: node-pool-1
        replicas: 1

to

    workers:
      machineDeployments:
      - class: node-pool
        name: node-pool-1
        replicas: 2

and check in my code.

In a few minutes, the application should show as OutOfSync within the ArgoCD UI and trigger a node scale-out.

This is the view within vCenter.

Summary

The above example shows how easy it is to execute the full LCM (install, upgrade, update, scale-out, scale-in, and application and add-on installs) of a Tanzu Kubernetes Cluster without a dedicated cluster management tool. Once DevOps users have access to their vSphere Namespaces, they can manage this workflow without requiring vSphere admins to install and maintain cluster management tools. Besides reducing costs, this eases the administrative burden on vSphere admins.
