Kubernetes Custom Resource Definitions — Implement in Java (Part 1)

Hin Lam
Sep 9, 2020
Blueprint of a U-boat: the operator of the submarine (the controller) has to align with the design of the boat (the CRD).

In this post, we will dive into Custom Resource Definitions (CRD) as a way to extend K8S.

We will first describe the use case for CRD and some hygiene for writing one, and finally you will learn how to use Java to generate a CRD instead of editing it manually.

This post demonstrates how to implement a controller and a CRD in Java: building the CRD without YAML, running a custom controller for your CRD, and walking through the controller implementation step by step.

This post will not dive deep into whether you should use a CRD or not; for that, read the great comparison documentation from K8S.

This is advanced material and only needed if you extend K8S, so I assume you know K8S in depth, know how to use the K8S Java Client, and understand the previous posts: Coding K8S resource in Java Part 1 and Part 2.

All the sample code can be downloaded from the GitHub repo and Gist.

What is Custom Resource Definition?

Most of the time, using K8S with the built-in resources is sufficient. All the built-in resources have well-structured and comprehensive YAML.

Before CRD

This YAML can be extremely verbose. For example, if you want to deploy a MySQL cluster, it will require a StatefulSet, a Service, leader election, load balancing, scaling, health checking, monitoring, log gathering and many other properties in YAML files.

With CRD:

It would be nice to have a custom resource that describes only what is specific to a MySQL database:

#A Custom Resource Definition I just imagine (ie make it up!)
#But I'll show you how to implement it
#Let's call this `FirstMySQL.yaml`
apiVersion: database.hinlam.io/v1alpha1
kind: MySQL
metadata:
  name: mysql-cluster-a
  namespace: default
spec:
  mysql-version: 8.0
  mysql-db-encoding: utf8
  cluster-size: 5
  autoscaling:
    by: IO
    scale-out: 0.8 #80%
    scale-in: 0.2 #20%
  resource-limits:
    CPU: 2
    RAM: 10G
    Disk: 200G

Then you can have a custom controller, MySQLController, that reads this resource object and produces the required StatefulSet, Volume, Service and so on. From the K8S user's point of view, they just follow the specification of this MySQL Kind and get a working MySQL server cluster in return.
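As a rough illustration of that translation step, here is a minimal sketch in plain Java (no Kubernetes client involved). The class MySqlPlanner, its plan method and the "Kind/name" output format are all made up for this example; a real controller would build and submit actual API objects instead of strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical planning step of a MySQLController; names and format are illustrative only.
class MySqlPlanner {
    // Lists the built-in resources to create for one MySQL custom resource.
    static List<String> plan(String name, int clusterSize) {
        return Arrays.asList(
            "Service/" + name,
            "StatefulSet/" + name + " replicas=" + clusterSize);
    }

    public static void main(String[] args) {
        // Values taken from FirstMySQL.yaml: metadata.name and spec.cluster-size.
        for (String child : plan("mysql-cluster-a", 5)) {
            System.out.println(child);
        }
    }
}
```

The point is only that one short, MySQL-specific object fans out into several built-in resources, and that the fan-out logic lives in the controller rather than in the user's YAML.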

By doing so, FirstMySQL.yaml becomes context-aware and specific, instead of building up the whole infrastructure from vanilla Lego blocks.

Combined with your custom controller, it becomes possible to perform smart actions: re-packaging your existing system, offloading your daily operation tasks to a controller (the operator pattern), provisioning third-party and external resources, and more. The sky is the limit.

Let's see how to create a CRD with some Java code and *NOT* with hand-written YAML.

Creating CRD in API Server

A CustomResourceDefinition is a well-formatted YAML/JSON document that tells the API server there is a custom type.

#Let's call this file `crd.yml`, BTW, this file is invalid...just for concept only:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: mysqls.database.hinlam.io
spec:
  group: database.hinlam.io
  names:
    kind: MySQL
    plural: mysqls
    shortNames:
    - ms
    singular: mysql
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true

This is simple! Once your CRD has been created successfully, you can verify it with kubectl api-resources.

Now you can create the MySQL resource using kubectl create -f mysql.yml. Of course, mysql.yml is only a record of intent, so we will still need a controller to react to this MySQL record.
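One detail of the crd.yml above that is easy to get wrong: the API server only accepts the definition if metadata.name is exactly <plural>.<group>, and objects of the custom kind then use <group>/<version> as their apiVersion. A minimal sketch in plain Java (no Kubernetes client needed) that derives both strings from the same inputs; the class and method names are made up for illustration:

```java
// Hypothetical helper, not part of the Kubernetes Java client.
class CrdNaming {
    // metadata.name of a CRD must have the form "<plural>.<group>".
    static String crdName(String plural, String group) {
        return plural + "." + group;
    }

    // apiVersion used by objects of the custom kind: "<group>/<version>".
    static String apiVersion(String group, String version) {
        return group + "/" + version;
    }

    public static void main(String[] args) {
        String group = "database.hinlam.io";
        System.out.println(crdName("mysqls", group));      // mysqls.database.hinlam.io
        System.out.println(apiVersion(group, "v1alpha1")); // database.hinlam.io/v1alpha1
    }
}
```

Deriving both values from one group constant keeps the CRD and the resources that use it from drifting apart.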

Generate Custom Resource Definition using Java

We are going to generate the crd.yml file mentioned above.

At the time of writing (Sep 2020), a 9.0.x release of the Java Client has just come out, so the easy way is to add the following to your pom.xml:

<!-- If you just want to talk to K8S -->
<dependency>
  <groupId>io.kubernetes</groupId>
  <artifactId>client-java</artifactId>
  <version>9.0.2</version>
</dependency>
<!-- Dependency for writing your own controller -->
<dependency>
  <groupId>io.kubernetes</groupId>
  <artifactId>client-java-extended</artifactId>
  <version>9.0.2</version>
</dependency>

My GitHub repo has all the code for this section; here I am going to dissect it step by step:

Step 1: Start with the top-level CustomResourceDefinition object:

V1CustomResourceDefinition crd = new V1CustomResourceDefinitionBuilder()
    .withApiVersion("apiextensions.k8s.io/v1")
    .withKind("CustomResourceDefinition")
    .withMetadata(crdMeta)
    .withSpec(crdSpec)
    .withStatus(crdStatus)
    .build();

Assuming that you know what to put in crdMeta, let's focus on crdSpec.

Step 2: Create crdSpec:

V1CustomResourceDefinitionSpec crdSpec = new V1CustomResourceDefinitionSpecBuilder()
    .withGroup(groupName)
    .withVersions(crdV1)
    .withScope("Namespaced")
    .withNames(names)
    .build();

Step 3: Create dependent names:

V1CustomResourceDefinitionNames names = new V1CustomResourceDefinitionNamesBuilder()
    .withKind(kindName)
    .withSingular(singular)
    .withPlural(plural)
    .withShortNames("ms")
    .withCategories("hinlamdb", "all")
    .build();

Step 4: Profit!
Now you can dump the crd object using the technique introduced in the previous post:

// Yaml here is io.kubernetes.client.util.Yaml from the K8S Java Client
Yaml.dump(crd);

Bug alert!

YAML dumping does not work for Boolean values, so the output of Yaml.dump(crd) will be missing those values.

This is a known issue and has already been reported.

A temporary workaround is as follows (it may not work in some scenarios):

// Gson serialization is not affected, so convert the object to JSON first
// (Gson here is com.google.gson.Gson)
Gson gson = new Gson();
String json = gson.toJson(crd);
// kubectl accepts the JSON output as-is; optionally, use SnakeYAML to convert the JSON to YAML
org.yaml.snakeyaml.Yaml yaml = new org.yaml.snakeyaml.Yaml();
Object result = yaml.load(json);
String output = yaml.dumpAsMap(result);
// Now you've got the YAML

So you will get something like this:

#This is a version without validation
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: mysqls.database.hinlam.io
spec:
  group: database.hinlam.io
  names:
    categories:
    - hinlamdb
    - all
    kind: MySQL
    plural: mysqls
    shortNames:
    - ms
    singular: mysql
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - description: Size of a cluster
      jsonPath: .spec.cluster-size
      name: ClusterSize
      type: integer
    name: v1alpha1
    served: true
    storage: true
    subresources:
      scale:
        specReplicasPath: .spec.cluster-size
        statusReplicasPath: .status.cluster-size
      status:
        clusterSize: 0

This is pretty minimal, but a more complex version can be found in the GitHub Gist.

At this point, you’ve read:

  1. What is CRD
  2. Generation of CRD using Java

However, you may have realized that you can create any invalid object of kind MySQL and kubectl won't even warn you. Why is that?

apiVersion: database.hinlam.io/v1alpha1
kind: MySQL
metadata:
  name: mysql-cluster-a
  namespace: default
spec:
  INVALID_KEY: INVALID_VALUE
  autoscaling:
    by: INVALID_VALUE
    scale-out: 1.2
    scale-in: 0.0 #20%
  testing: value

Validation of CRD

We can tell the API server to validate every custom resource against the CRD when the resource is created.

Remember the days of XML being validated by XML Schema? The same happens with CRD: we describe how to validate the submitted resource (e.g. MySQL.yaml) in an openAPIV3Schema.

First, read the OpenAPI 3 Specification on the Swagger website; it provides a way to describe required values and map, array and object types.

Typically, the validation rules are written in YAML, under the schema.openAPIV3Schema field of each version in the CRD.

The K8S Java Client provides the essential schema classes (such as V1JSONSchemaProps) for you to start generating the validation in code.

Rather than pouring all of the code into this post, I suggest you take a look at the sample code that generates a validation object.

The validation schema is similar to XML's XSD validation: it is lengthy and complex, and editing it manually in YAML is a recipe for mistakes.

It may be very quick to write a schema of just 20–40 lines in YAML, but for a complex CRD, generating it with the Java Client makes it easier to get all the relationships between properties and values right.

Learning from the history of XML, there is a lot of room for improvement. How about automatically generating the validation code as a starting point? That would be interesting to see, but it is out of the scope of this blog post.
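To make the idea of validation concrete without pulling in the client library, here is a minimal plain-Java sketch of the kind of checks a schema lets the API server run on the autoscaling section. The rules themselves (the allowed values for by, the 0 to 1 range, the set of known keys) are assumptions for illustration; this post does not show the real schema, and the API server derives its checks from the openAPIV3Schema rather than from hand-written code like this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for schema validation of the `autoscaling` section.
class AutoscalingRules {
    // Assumed rules for illustration; the real schema is not shown in this post.
    static final Set<String> KNOWN_KEYS =
        new HashSet<>(Arrays.asList("by", "scale-out", "scale-in"));
    static final Set<String> ALLOWED_BY =
        new HashSet<>(Arrays.asList("IO", "CPU", "RAM"));

    // Returns one message per violated rule; an empty list means the input passes.
    static List<String> validate(Map<String, Object> autoscaling) {
        List<String> errors = new ArrayList<>();
        // Unknown keys, visited in sorted order so the output is deterministic.
        for (String key : new TreeSet<>(autoscaling.keySet())) {
            if (!KNOWN_KEYS.contains(key)) {
                errors.add("unknown field: " + key);
            }
        }
        // Enum-style check, similar to `enum` in an OpenAPI v3 schema.
        Object by = autoscaling.get("by");
        if (!(by instanceof String) || !ALLOWED_BY.contains(by)) {
            errors.add("by: unsupported value " + by);
        }
        // Range checks, similar to `minimum` / `maximum`.
        checkRatio(autoscaling, "scale-out", errors);
        checkRatio(autoscaling, "scale-in", errors);
        return errors;
    }

    private static void checkRatio(Map<String, Object> section, String key, List<String> errors) {
        Object value = section.get(key);
        if (!(value instanceof Number)) {
            errors.add(key + ": must be a number");
            return;
        }
        double ratio = ((Number) value).doubleValue();
        if (ratio < 0.0 || ratio > 1.0) {
            errors.add(key + ": must be between 0 and 1");
        }
    }
}
```

Under these assumed rules, the by: INVALID_VALUE and scale-out: 1.2 entries from the invalid object above are each reported as an error, while the autoscaling values from FirstMySQL.yaml pass. Every such rule needs a counterpart in the schema, which is why a complex CRD is easier to keep consistent when the schema is generated from code.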

Summary

CRD is a building block for tailoring K8S to your specific needs. It also serves as an abstraction and encapsulation layer for complex systems.

By now you should be confident about building your own CRD and even start thinking about the Operator Framework.

Even more, by generating the CRD YAML programmatically, you have opened the door to a reusable and easy-to-maintain way of extending K8S.

[Update 27-May-2021: Part 2 released!]
Next: in Part 2, we will discuss more advanced features for customizing the CRD experience: PrintColumn, Categories, Sub-resource and Defaulting.

Enjoy!
