Helm Installation

The current Spring Cloud Data Flow chart is based on Helm 2. The Helm project will be ending support for Helm 2 in November of 2020. At that time the Spring Cloud Data Flow chart will be based on Helm 3, dropping support for Helm 2.

Migration steps from Helm 2 to Helm 3 are required. In preparation for the migration, it is advised to read the Helm v2 to v3 Migration Guide for more information. Additionally, some helpful tips on data migration and upgrades can be found in the post migration issues article.

Spring Cloud Data Flow offers a Helm Chart for deploying the Spring Cloud Data Flow server and its required services to a Kubernetes Cluster.

The following sections cover how to initialize Helm and install Spring Cloud Data Flow on a Kubernetes cluster.

If using Minikube, see Setting Minikube Resources for details on CPU and RAM resource requirements.

Installing Helm

The Spring Cloud Data Flow Helm chart is currently tested against Helm 2. Helm 2 consists of two components: the client (Helm) and the server (Tiller). The Helm client runs on your local machine and can be installed by following the instructions found here. If Tiller has not been installed on your cluster, run the following commands to create a service account and initialize Helm:

kubectl create serviceaccount tiller -n kube-system
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount kube-system:tiller
helm init --wait --service-account tiller

See the Helm documentation for additional Helm security configuration. With Helm initialized, update the chart repositories:

helm repo update

To verify that the Tiller pod is running, run the following command:

kubectl get pod --namespace kube-system

You should see the Tiller pod running.

Installing the Spring Cloud Data Flow Server and Required Services

Spring Cloud Data Flow Chart

Spring Cloud Data Flow is a toolkit for building microservices-based streaming and batch data processing pipelines in Cloud Foundry and Kubernetes.

Data processing pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks. This makes Spring Cloud Data Flow suitable for a range of data processing use cases, from import/export to event streaming and predictive analytics.

This Helm chart is deprecated

Given the deprecation timeline for the stable repository, the Bitnami-maintained Spring Cloud Data Flow Helm chart is now located at bitnami/charts.

The Bitnami repository is already included in the Hubs, and we will continue to provide the same cadence of updates and support that we have provided here over the years. The installation instructions are very similar: add the bitnami repo and use it during the installation (bitnami/<chart> instead of stable/<chart>).

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-release bitnami/<chart>           # Helm 3
$ helm install --name my-release bitnami/<chart>    # Helm 2

To update an existing stable deployment to a chart hosted in the bitnami repository, you can run:

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm upgrade my-release bitnami/<chart>

Issues and PRs related to the chart itself will be redirected to the bitnami/charts GitHub repository. Likewise, we are happy to answer questions related to this migration process in the issue created as a common place for discussion.

Chart Details

This chart will provision a fully functional and fully featured Spring Cloud Data Flow installation that can deploy and manage data processing pipelines in the cluster that it is deployed to.

Either the default MySQL deployment or an external database can be used as the data store for Spring Cloud Data Flow state and either RabbitMQ or Kafka can be used as the messaging layer for streaming apps to communicate with one another.

For more information on Spring Cloud Data Flow and its capabilities, see its documentation.

Prerequisites

This chart assumes that serviceAccount credentials are available so that the deployed Data Flow server can access the API server (this works on GKE and Minikube by default). See Configure Service Accounts for Pods.

Installing the Chart

To install the chart with the release name my-release:

$ helm install --name my-release stable/spring-cloud-data-flow

If you are using a cluster that does not have a load balancer (like Minikube) then you can install using a NodePort:

$ helm install --name my-release --set server.service.type=NodePort stable/spring-cloud-data-flow

To restrict the load balancer to an IP address range:

$ helm install --name my-release  --set server.service.loadBalancerSourceRanges='[10.0.0.0/8]' stable/spring-cloud-data-flow

Data Store

By default, MySQL is deployed with this chart. However, if you wish to use an external database, pass the following --set flag to the helm command to disable the MySQL deployment:

--set mysql.enabled=false

In addition, you are required to set all fields listed in External Database Configuration.
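For example, an install against an external database might combine those flags as follows. This is a sketch: the driver, scheme, host, and credentials shown are placeholders, and the full list of parameters appears in the External Database Configuration table below.

helm install --name my-release stable/spring-cloud-data-flow \
    --set mysql.enabled=false \
    --set database.driver=org.mariadb.jdbc.Driver \
    --set database.scheme=mariadb \
    --set database.host=my-database-host \
    --set database.port=3306 \
    --set database.user=scdf \
    --set database.password=changeme \
    --set database.dataflow=dataflow \
    --set database.skipper=skipper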

Messaging Layer

There are three messaging layers available in this chart:

  • RabbitMQ (default)
  • RabbitMQ HA
  • Kafka

To change the messaging layer to a highly available (HA) version of RabbitMQ, pass the following --set flags to the helm command:

--set rabbitmq-ha.enabled=true,rabbitmq.enabled=false

Alternatively, to change the messaging layer to Kafka, pass the following --set flags to the helm command:

--set kafka.enabled=true,rabbitmq.enabled=false

Only one messaging layer can be used at a given time. If both RabbitMQ and Kafka are enabled, both charts are installed, but RabbitMQ is used in the deployment.
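For example, a complete install command that uses Kafka as the messaging layer might look like the following sketch (using the release name my-release used elsewhere in this guide):

helm install --name my-release stable/spring-cloud-data-flow \
    --set kafka.enabled=true \
    --set rabbitmq.enabled=false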

Note that this chart pulls in many different Docker images, so it can take a while to fully install.

Feature Toggles

If you only need to deploy tasks and schedules, streams can be disabled:

--set features.streaming.enabled=false --set rabbitmq.enabled=false

If you only need to deploy streams, tasks and schedules can be disabled:

--set features.batch.enabled=false

NOTE: Both features.streaming.enabled and features.batch.enabled should not be set to false at the same time.

Streaming and batch applications can be monitored through Prometheus and Grafana. To deploy these components and enable monitoring, set the following:

--set features.monitoring.enabled=true
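For example, a complete install with monitoring enabled might look like the following sketch:

helm install --name my-release stable/spring-cloud-data-flow \
    --set features.monitoring.enabled=true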

When using Minikube, the Grafana URL can be obtained, for example, by running:

minikube service my-release-grafana --url

On a platform that provides a LoadBalancer, such as GKE, you can run the following command repeatedly until the EXTERNAL-IP field is populated with the assigned load balancer IP address:

kubectl get svc my-release-grafana

See the Grafana table below for default credentials and override parameters.
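For example, with the default secret name and keys listed in the Grafana table, you could decode the login credentials as follows (a sketch; adjust the secret name if you override grafana.admin.existingSecret):

kubectl get secret scdf-grafana-secret -o jsonpath='{.data.admin-user}' | base64 --decode
kubectl get secret scdf-grafana-secret -o jsonpath='{.data.admin-password}' | base64 --decode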

Using an Ingress

If you would like to use an Ingress instead of having the services use the LoadBalancer type, there are a few things to consider.

First, you need to have an Ingress Controller installed in your cluster. If you don't already have one installed, you can run the following commands to install an NGINX Ingress Controller:

kubectl create namespace nginx-ingress
helm install --name nginx-ingress --namespace nginx-ingress stable/nginx-ingress

You can look up the IP address used by the NGINX Ingress Controller with:

ingress=$(kubectl get svc nginx-ingress-controller -n nginx-ingress -ojsonpath='{.status.loadBalancer.ingress[0].ip}')

This is useful if you would like to use xip.io instead of your own DNS resolution. The following options assume that you will use xip.io, but you can replace the host values below with your own DNS hosts if you prefer.

To enable the creation of an Ingress resource and configure the services to use the ClusterIP type, use the following --set options in your helm install command:

  --set server.service.type=ClusterIP \
  --set ingress.enabled=true \
  --set ingress.protocol=http \
  --set ingress.server.host=scdf.${ingress}.xip.io \
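Those options plug into a regular install command. A complete example, assuming the ingress variable captured above, might look like this sketch:

helm install --name my-release stable/spring-cloud-data-flow \
    --set server.service.type=ClusterIP \
    --set ingress.enabled=true \
    --set ingress.protocol=http \
    --set ingress.server.host=scdf.${ingress}.xip.io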

If you want to use an Ingress with the monitoring feature enabled, then use these options instead:

  --set features.monitoring.enabled=true \
  --set server.service.type=ClusterIP \
  --set grafana.service.type=ClusterIP \
  --set prometheus.proxy.service.type=ClusterIP \
  --set ingress.enabled=true \
  --set ingress.protocol=http \
  --set ingress.server.host=scdf.${ingress}.xip.io \
  --set ingress.grafana.host=grafana.${ingress}.xip.io \

Configuration

The following tables list the configurable parameters and their default values.

RBAC Configuration

Parameter Description Default
rbac.create Create RBAC configurations true

ServiceAccount Configuration

Parameter Description Default
serviceAccount.create Create ServiceAccount true
serviceAccount.name ServiceAccount name (generated if not specified)

Data Flow Server Configuration

Parameter Description Default
server.version The version/tag of the Data Flow server 2.6.0
server.imagePullPolicy The imagePullPolicy of the Data Flow server IfNotPresent
server.service.type The service type for the Data Flow server LoadBalancer
server.service.annotations Extra annotations for service resource {}
server.service.externalPort The external port for the Data Flow server 80
server.service.labels Extra labels for the service resource {}
server.service.loadBalancerSourceRanges A list of IP address ranges to allow through the load balancer no restriction
server.platformName The name of the configured platform account default
server.configMap Custom ConfigMap name for Data Flow server configuration
server.trustCerts Trust self signed certs false
server.extraEnv Extra environment variables to add to the server container {}
server.containerConfiguration.container.registry-configurations.<id>.registry-host The registry host to use for the profile represented by <id>
server.containerConfiguration.container.registry-configurations.<id>.authorization-type The registry authorization type to use for the profile represented by <id>

Skipper Server Configuration

Parameter Description Default
skipper.version The version/tag of the Skipper server 2.5.0
skipper.imagePullPolicy The imagePullPolicy of the Skipper server IfNotPresent
skipper.platformName The name of the configured platform account default
skipper.service.type The service type for the Skipper server ClusterIP
skipper.service.annotations Extra annotations for service resources {}
skipper.service.labels Extra labels for the service resource {}
skipper.configMap Custom ConfigMap name for Skipper server configuration
skipper.trustCerts Trust self signed certs false
skipper.extraEnv Extra environment variables to add to the skipper container {}

Spring Cloud Deployer for Kubernetes Configuration

Parameter Description Default
deployer.resourceLimits.cpu Deployer resource limit for cpu 500m
deployer.resourceLimits.memory Deployer resource limit for memory 1024Mi
deployer.readinessProbe.initialDelaySeconds Deployer readiness probe initial delay 120
deployer.livenessProbe.initialDelaySeconds Deployer liveness probe initial delay 90

RabbitMQ Configuration

Parameter Description Default
rabbitmq.enabled Enable RabbitMQ as the middleware to use true
rabbitmq.rabbitmq.username RabbitMQ user name user
rabbitmq.rabbitmq.password RabbitMQ password to encode into the secret changeme

RabbitMQ HA Configuration

Parameter Description Default
rabbitmq-ha.enabled Enable RabbitMQ HA as the middleware to use false
rabbitmq-ha.rabbitmqUsername RabbitMQ user name user

Kafka Configuration

Parameter Description Default
kafka.enabled Enable Kafka as the middleware to use false
kafka.replicas The number of Kafka replicas to use 1
kafka.configurationOverrides Kafka deployment configuration overrides replication.factor=1, metrics.enabled=false
kafka.zookeeper.replicaCount The number of ZooKeeper replicas to use 1

MySQL Configuration

Parameter Description Default
mysql.enabled Enable deployment of MySQL true
mysql.mysqlDatabase MySQL database name dataflow

External Database Configuration

Parameter Description Default
database.driver Database driver nil
database.scheme Database scheme nil
database.host Database host nil
database.port Database port nil
database.user Database user scdf
database.password Database password nil
database.dataflow Database name for SCDF server dataflow
database.skipper Database name for SCDF skipper skipper

Feature Toggles

Parameter Description Default
features.streaming.enabled Enables or disables streams true
features.batch.enabled Enables or disables tasks and schedules true
features.monitoring.enabled Enables or disables monitoring false

Ingress

Parameter Description Default
ingress.enabled Enables or disables ingress support true
ingress.protocol Sets the protocol used by ingress server https
ingress.server.host Sets the host used for server data-flow.local
ingress.grafana.host Sets the host used for Grafana grafana.local

Grafana

Parameter Description Default
grafana.service.type Service type to use LoadBalancer
grafana.admin.existingSecret Existing Secret to use for login credentials scdf-grafana-secret
grafana.admin.userKey Secret userKey field admin-user
grafana.admin.passwordKey Secret passwordKey field admin-password
grafana.admin.defaultUsername The default base64 encoded login username used in the secret admin
grafana.admin.defaultPassword The default base64 encoded login password used in the secret password
grafana.extraConfigmapMounts ConfigMap mount for datasources scdf-grafana-ds-cm
grafana.dashboardProviders Dashboard provider for imported dashboards default
grafana.dashboards Dashboards to auto import SCDF Apps, Streams & Tasks

Prometheus

Parameter Description Default
prometheus.server.global.scrape_interval Scrape interval 10s
prometheus.server.global.scrape_timeout Scrape timeout 9s
prometheus.server.global.evaluation_interval Evaluation interval 10s
prometheus.extraScrapeConfigs Additional scrape configs for proxied applications proxied-applications & proxies jobs
prometheus.podSecurityPolicy Enable or disable PodSecurityPolicy true
prometheus.alertmanager Enable or disable alert manager false
prometheus.kubeStateMetrics Enable or disable kube state metrics false
prometheus.nodeExporter Enable or disable node exporter false
prometheus.pushgateway Enable or disable push gateway false
prometheus.proxy.service.type Service type to use LoadBalancer

Expected output

After issuing the helm install command, you should see output similar to the following:

NAME:   my-release
LAST DEPLOYED: Sat Mar 10 11:33:29 2018
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Secret
NAME                  TYPE    DATA  AGE
my-release-mysql      Opaque  2     1s
my-release-data-flow  Opaque  2     1s
my-release-rabbitmq   Opaque  2     1s

==> v1/ConfigMap
NAME                          DATA  AGE
my-release-data-flow-server   1     1s
my-release-data-flow-skipper  1     1s

==> v1/PersistentVolumeClaim
NAME                 STATUS   VOLUME                                    CAPACITY  ACCESSMODES  STORAGECLASS  AGE
my-release-rabbitmq  Bound    pvc-e9ed7f55-2499-11e8-886f-08002799df04  8Gi       RWO          standard      1s
my-release-mysql     Pending  standard                                  1s

==> v1/ServiceAccount
NAME                  SECRETS  AGE
my-release-data-flow  1        1s

==> v1/Service
NAME                          CLUSTER-IP      EXTERNAL-IP  PORT(S)                                AGE
my-release-mysql              10.110.98.253   <none>       3306/TCP                               1s
my-release-data-flow-server   10.105.216.155  <pending>    80:32626/TCP                           1s
my-release-rabbitmq           10.106.76.215   <none>       4369/TCP,5672/TCP,25672/TCP,15672/TCP  1s
my-release-data-flow-skipper  10.100.28.64    <none>       80/TCP                                 1s

==> v1beta1/Deployment
NAME                          DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
my-release-mysql              1        1        1           0          1s
my-release-rabbitmq           1        1        1           0          1s
my-release-data-flow-skipper  1        1        1           0          1s
my-release-data-flow-server   1        1        1           0          1s

Get Spring Cloud Data Flow's application URL by running these commands:

export SERVICE_IP=$(kubectl get svc --namespace default my-release-data-flow-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SERVICE_IP:80

It may take a few minutes for the LoadBalancer IP to be available. You can watch the status of the server by running kubectl get svc -w my-release-data-flow-server.

If you are using Minikube, you can use the following command to get the URL for the server:

minikube service --url my-release-data-flow-server

You have just created a new release in the default namespace of your Kubernetes cluster. It takes a couple of minutes for the application and its required services to start. You can check on the status by issuing a kubectl get pod -w command. You need to wait for the READY column to show 1/1 for all pods.

When all pods are ready, you can access the Spring Cloud Data Flow dashboard by accessing http://<SERVICE_ADDRESS>/dashboard where <SERVICE_ADDRESS> is the address returned by either the kubectl or minikube commands above.

To see which Helm releases of Spring Cloud Data Flow you have running, you can use the helm list command. When it is time to delete the previously installed SCDF release, run helm delete my-release. This command removes any resources created for the release but keeps the release information so that you can roll back any changes by using a helm rollback my-release 1 command. To completely delete the release and purge any release metadata, use helm delete my-release --purge.
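In summary, the release lifecycle commands mentioned above are:

helm list                          # list installed releases
helm delete my-release             # remove the release's resources but keep its metadata
helm rollback my-release 1         # roll back to revision 1
helm delete my-release --purge     # remove the release and its metadata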

Secret management

There is an issue where the generated secrets used by the required services are rotated on chart upgrades. To avoid this, set the passwords for these services explicitly when installing the chart. You can use the following command to do so:

helm install --name my-release \
    --set rabbitmq.rabbitmqPassword=rabbitpwd \
    --set mysql.mysqlRootPassword=mysqlpwd stable/spring-cloud-data-flow

Version Compatibility

The following listing shows Spring Cloud Data Flow’s version compatibility with the respective Helm Chart releases:

SCDF Version Chart Version
SCDF-K8S-Server 1.7.x 1.0.x
SCDF-K8S-Server 2.0.x 2.2.x
SCDF-K8S-Server 2.1.x 2.3.x
SCDF-K8S-Server 2.2.x 2.4.x
SCDF-K8S-Server 2.3.x 2.5.x

Register prebuilt applications

All the prebuilt streaming applications:

  • Are available as Apache Maven artifacts or Docker images.
  • Use RabbitMQ or Apache Kafka.
  • Support monitoring via Prometheus and InfluxDB.
  • Contain metadata for application properties used in the UI and code completion in the shell.

Applications can be registered individually by using the app register functionality or as a group by using the app import functionality. There are also dataflow.spring.io links that represent the group of prebuilt applications for a specific release, which is useful for getting started.

You can register applications using the UI or the shell. Even though we are only using two prebuilt applications, we will register the full set of prebuilt applications.

The easiest way to install Data Flow on Kubernetes is to use the Helm chart, which uses RabbitMQ as the default messaging middleware. The command to import the Kafka versions of the applications is:

dataflow:>app import --uri https://dataflow.spring.io/kafka-docker-latest

The URL above imports the Kafka variants of the applications, which is appropriate if you set kafka.enabled=true in the Helm chart or if you followed the manual kubectl based installation instructions and chose Kafka as the messaging middleware. If you kept the default RabbitMQ messaging layer, change kafka to rabbitmq in the above URL.
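For example, the equivalent import for the RabbitMQ variants of the prebuilt applications would be:

dataflow:>app import --uri https://dataflow.spring.io/rabbitmq-docker-latest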

Only applications registered with a --uri property pointing to a Docker resource are supported by the Data Flow Server for Kubernetes. However, we do support Maven resources for the --metadata-uri property, which is used to list the properties supported by each application. For example, the following application registration is valid:

app register --type source --name time --uri docker://springcloudstream/time-source-rabbit:{docker-time-source-rabbit-version} --metadata-uri maven://org.springframework.cloud.stream.app:time-source-rabbit:jar:metadata:{docker-time-source-rabbit-version}

Any application registered with a Maven, HTTP, or File resource or with an executable jar (that is, with a --uri property prefixed with maven://, http://, or file://) is not supported.

Application and Server Properties

This section covers how you can customize the deployment of your applications. You can use a number of properties to influence settings for the applications that are deployed. Properties can be applied on a per-application basis or in the appropriate server configuration for all deployed applications.

Properties set on a per-application basis always take precedence over properties set as the server configuration. This arrangement lets you override global server level properties on a per-application basis.

Properties to be applied for all deployed Tasks are defined in the src/kubernetes/server/server-config.yaml file and for Streams in src/kubernetes/skipper/skipper-config-(binder).yaml. Replace (binder) with the messaging middleware you are using — for example, rabbit or kafka.

Memory and CPU Settings

Applications are deployed with default memory and CPU settings. If needed, these values can be adjusted. The following example shows how to set Limits to 1000m for CPU and 1024Mi for memory and Requests to 800m for CPU and 640Mi for memory:

deployer.<app>.kubernetes.limits.cpu=1000m
deployer.<app>.kubernetes.limits.memory=1024Mi
deployer.<app>.kubernetes.requests.cpu=800m
deployer.<app>.kubernetes.requests.memory=640Mi

Those values result in the following container settings being used:

Limits:
  cpu: 1
  memory: 1Gi
Requests:
  cpu: 800m
  memory: 640Mi
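These deployer properties are supplied at deployment time. The following sketch shows how they might be passed from the Data Flow shell for a hypothetical stream named mystream that contains a time application:

dataflow:>stream deploy --name mystream --properties "deployer.time.kubernetes.limits.cpu=1000m,deployer.time.kubernetes.limits.memory=1024Mi,deployer.time.kubernetes.requests.cpu=800m,deployer.time.kubernetes.requests.memory=640Mi"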

You can also control the default values for cpu and memory globally.

The following example shows how to set the CPU and memory for streams and tasks:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    limits:
                      memory: 640Mi
                      cpu: 500m
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    limits:
                      memory: 640Mi
                      cpu: 500m

The settings we have used so far only affect the settings for the container. They do not affect the memory setting for the JVM process in the container. If you would like to set JVM memory settings, you can provide an environment variable to do so. See the next section for details.

Environment Variables

To influence the environment settings for a given application, you can use the spring.cloud.deployer.kubernetes.environmentVariables deployer property. For example, a common requirement in production settings is to influence the JVM memory arguments. You can do so by using the JAVA_TOOL_OPTIONS environment variable, as the following example shows:

deployer.<app>.kubernetes.environmentVariables=JAVA_TOOL_OPTIONS=-Xmx1024m

The environmentVariables property accepts a comma-delimited string. If an environment variable contains a value which is also a comma-delimited string, it must be enclosed in single quotation marks — for example,

spring.cloud.deployer.kubernetes.environmentVariables=spring.cloud.stream.kafka.binder.brokers='somehost:9092, anotherhost:9093'

This overrides the JVM memory setting for the desired <app> (replace <app> with the name of your application).

Liveness and Readiness Probes

The liveness and readiness probes use the /health and /info paths, respectively. Both use an initial delay of 10 seconds; the periods are 60 seconds and 10 seconds, respectively. You can change these defaults when you deploy the stream by using deployer properties. Liveness and readiness probes are applied only to streams.

The following example changes the liveness probe (replace <app> with the name of your application) by setting deployer properties:

deployer.<app>.kubernetes.livenessProbePath=/health
deployer.<app>.kubernetes.livenessProbeDelay=120
deployer.<app>.kubernetes.livenessProbePeriod=20

You can declare the same as part of the server global configuration for streams, as the following example shows:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    livenessProbePath: /health
                    livenessProbeDelay: 120
                    livenessProbePeriod: 20

Similarly, you can swap liveness for readiness to override the default readiness settings.

By default, port 8080 is used as the probe port. You can change the defaults for both liveness and readiness probe ports by using deployer properties, as the following example shows:

deployer.<app>.kubernetes.readinessProbePort=7000
deployer.<app>.kubernetes.livenessProbePort=7000

You can declare the same as part of the global configuration for streams, as the following example shows:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    readinessProbePort: 7000
                    livenessProbePort: 7000

By default, the liveness and readiness probe paths use Spring Boot 2.x+ actuator endpoints. To use Spring Boot 1.x actuator endpoint paths, you must adjust the liveness and readiness values, as the following example shows (replace <app> with the name of your application):

deployer.<app>.kubernetes.livenessProbePath=/health
deployer.<app>.kubernetes.readinessProbePath=/info

To automatically set both liveness and readiness endpoints on a per-application basis to the default Spring Boot 1.x paths, you can set the following property:

deployer.<app>.kubernetes.bootMajorVersion=1

You can access secured probe endpoints by using credentials stored in a Kubernetes secret. You can use an existing secret, provided the credentials are contained under the credentials key name of the secret’s data block. You can configure probe authentication on a per-application basis. When enabled, it is applied to both the liveness and readiness probe endpoints by using the same credentials and authentication type. Currently, only Basic authentication is supported.

To create a new secret:

  1. Generate the base64 string with the credentials used to access the secured probe endpoints.

    Basic authentication encodes a username and password as a base64 string in the format of username:password.

    The following example (which includes output and in which you should replace user and pass with your values) shows how to generate a base64 string:

    echo -n "user:pass" | base64
    dXNlcjpwYXNz
  2. With the encoded credentials, create a file (for example, myprobesecret.yml) with the following contents:

    apiVersion: v1
    kind: Secret
    metadata:
      name: myprobesecret
    type: Opaque
    data:
      credentials: GENERATED_BASE64_STRING
  3. Replace GENERATED_BASE64_STRING with the base64-encoded value generated earlier.
  4. Create the secret by using kubectl, as the following example shows:

    kubectl create -f ./myprobesecret.yml
    secret "myprobesecret" created
  5. Set the following deployer properties to use authentication when accessing probe endpoints, as the following example shows:

    deployer.<app>.kubernetes.probeCredentialsSecret=myprobesecret

    Replace <app> with the name of the application to which to apply authentication.

Using SPRING_APPLICATION_JSON

You can use a SPRING_APPLICATION_JSON environment variable to set Data Flow server properties (including the configuration of maven repository settings) that are common across all of the Data Flow server implementations. These settings go at the server level in the container env section of a deployment YAML. The following example shows how to do so:

env:
  - name: SPRING_APPLICATION_JSON
    value: |-
      {
        "maven": {
          "local-repository": null,
          "remote-repositories": {
            "repo1": {
              "url": "https://repo.spring.io/libs-snapshot"
            }
          }
        }
      }
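To confirm that the variable reached the server, you can inspect the environment of the running Data Flow server pod (a sketch; substitute the actual pod name):

kubectl exec <data-flow-server-pod> -- printenv SPRING_APPLICATION_JSON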

Private Docker Registry

You can pull Docker images from a private registry on a per-application basis. First, you must create a secret in the cluster. Follow the Pull an Image from a Private Registry guide to create the secret.
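One common way to create such a secret is with kubectl, as the following sketch shows (the registry address and credentials are placeholders):

kubectl create secret docker-registry mysecret \
    --docker-server=registry.example.com \
    --docker-username=myuser \
    --docker-password=mypassword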

Once you have created the secret, you can use the imagePullSecret property to set the secret to use, as the following example shows:

deployer.<app>.kubernetes.imagePullSecret=mysecret

Replace <app> with the name of your application and mysecret with the name of the secret you created earlier.

You can also configure the image pull secret at the global server level.

The following example shows how to do so for streams and tasks:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullSecret: mysecret
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullSecret: mysecret

Replace mysecret with the name of the secret you created earlier.

Volume Mounted Secrets

Data Flow uses the application metadata stored in a container image label. To access the metadata labels in a private registry, you have to extend the Data Flow deployment configuration and mount the registry secrets as a Secrets PropertySource:

    spec:
      containers:
      - name: scdf-server
        ...
        volumeMounts:
          - name: mysecret
            mountPath: /etc/secrets/mysecret
            readOnly: true
        ...
      volumes:
        - name: mysecret
          secret:
            secretName: mysecret

Annotations

You can add annotations to Kubernetes objects on a per-application basis. The supported object types are Pod, Service, and Job. Annotations are defined in a key:value format, allowing multiple annotations to be separated by a comma. For more information and use cases on annotations, see Annotations.

The following example shows how you can configure applications to use annotations:

deployer.<app>.kubernetes.podAnnotations=annotationName:annotationValue
deployer.<app>.kubernetes.serviceAnnotations=annotationName:annotationValue,annotationName2:annotationValue2
deployer.<app>.kubernetes.jobAnnotations=annotationName:annotationValue

Replace <app> with the name of your application, and supply your annotation names and values.

Entry Point Style

An entry point style affects how application properties are passed to the container to be deployed. Currently, three styles are supported:

  • exec (default): Passes all application properties and command line arguments in the deployment request as container arguments. Application properties are transformed into the format of --key=value.
  • shell: Passes all application properties and command line arguments as environment variables. Each of the application and command line argument properties is transformed into an uppercase string and . characters are replaced with _.
  • boot: Creates an environment variable called SPRING_APPLICATION_JSON that contains a JSON representation of all application properties. Command line arguments from the deployment request are set as container args.

In all cases, environment variables defined at the server-level configuration and on a per-application basis are set onto the container as is.
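As an illustration (a sketch, not output from the server), a hypothetical application property server.port=8080 would be passed to the container as follows for each style:

# exec:  container argument         --server.port=8080
# shell: environment variable       SERVER_PORT=8080
# boot:  environment variable       SPRING_APPLICATION_JSON={"server.port":"8080"}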

You can configure applications as follows:

deployer.<app>.kubernetes.entryPointStyle=<Entry Point Style>

Replace <app> with the name of your application and <Entry Point Style> with your desired entry point style.

You can also configure the entry point style at the global server level.

The following example shows how to do so for streams:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    entryPointStyle: entryPointStyle

The following example shows how to do so for tasks:

data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    entryPointStyle: entryPointStyle

Replace entryPointStyle with the desired entry point style.

You should choose an Entry Point Style of either exec or shell to correspond to how the ENTRYPOINT syntax is defined in the container's Dockerfile. For more information and use cases on exec versus shell, see the ENTRYPOINT section of the Docker documentation.

Using the boot entry point style corresponds to using the exec style ENTRYPOINT. Command line arguments from the deployment request are passed to the container, with the addition of application properties being mapped into the SPRING_APPLICATION_JSON environment variable rather than command line arguments.

When you use the boot Entry Point Style, the deployer.<app>.kubernetes.environmentVariables property must not contain SPRING_APPLICATION_JSON.

Deployment Service Account

You can configure a custom service account for application deployments through properties. You can use an existing service account or create a new one. One way to create a service account is by using kubectl, as the following example shows:

kubectl create serviceaccount myserviceaccountname
serviceaccount "myserviceaccountname" created

Then you can configure individual applications as follows:

deployer.<app>.kubernetes.deploymentServiceAccountName=myserviceaccountname

Replace <app> with the name of your application and myserviceaccountname with your service account name.

You can also configure the service account name at the global server level.

The following example shows how to do so for streams:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    deploymentServiceAccountName: myserviceaccountname

The following example shows how to do so for tasks:

data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    deploymentServiceAccountName: myserviceaccountname

Replace myserviceaccountname with the service account name to be applied to all deployments.

Image Pull Policy

An image pull policy defines when a Docker image should be pulled to the local registry. Currently, three policies are supported:

  • IfNotPresent (default): Do not pull an image if it already exists.
  • Always: Always pull the image regardless of whether it already exists.
  • Never: Never pull an image. Use only an image that already exists.

The following example shows how you can individually configure applications:

deployer.<app>.kubernetes.imagePullPolicy=Always

Replace <app> with the name of your application and Always with your desired image pull policy.

You can configure an image pull policy at the global server level.

The following example shows how to do so for streams:

data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullPolicy: Always

The following example shows how to do so for tasks:

data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullPolicy: Always

Replace Always with your desired image pull policy.

Deployment Labels

You can set custom labels on objects related to Deployment. See Labels for more information on labels. Labels are specified in key:value format.

The following example shows how you can individually configure applications:

deployer.<app>.kubernetes.deploymentLabels=myLabelName:myLabelValue

Replace <app> with the name of your application, myLabelName with your label name, and myLabelValue with the value of your label.

Additionally, you can apply multiple labels, as the following example shows:

deployer.<app>.kubernetes.deploymentLabels=myLabelName:myLabelValue,myLabelName2:myLabelValue2

NodePort

Applications are deployed by using a Service type of ClusterIP, which is the default Kubernetes Service type if not otherwise defined. ClusterIP services are reachable only from within the cluster itself.

To make the deployed application available externally, one option is to use NodePort. See the NodePort documentation for more information.

The following example shows how you can individually configure applications using Kubernetes assigned ports:

deployer.<app>.kubernetes.createNodePort=true

Replace <app> with the name of your application.

Additionally, you can define the port to use for the NodePort Service as shown below:

deployer.<app>.kubernetes.createNodePort=31101

Replace <app> with the name of your application and the value of 31101 with your desired port.

When defining the port manually, the port must not already be in use and must be within the defined NodePort range. Per the NodePort documentation, the default port range is 30000-32767.
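After deployment, you can look up the port that Kubernetes assigned (or confirm the one you set) by inspecting the application's Service (a sketch; replace the service name with the one created for your application):

kubectl get svc <app-service-name> -o jsonpath='{.spec.ports[0].nodePort}'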

Monitoring

To learn more about the monitoring experience in Data Flow using Prometheus running on Kubernetes, please refer to the Stream Monitoring feature guide.