Helm Installation
The current Spring Cloud Data Flow chart is based on Helm 2. The Helm project will be ending support for Helm 2 in November of 2020. At that time the Spring Cloud Data Flow chart will be based on Helm 3, dropping support for Helm 2.
Migration steps from Helm 2 to Helm 3 are required. In preparation for the migration, it is advised to read the Helm v2 to v3 Migration Guide for more information. Additionally, some helpful tips on data migration and upgrades can be found in the post migration issues article.
Spring Cloud Data Flow offers a Helm Chart for deploying the Spring Cloud Data Flow server and its required services to a Kubernetes Cluster.
The following sections cover how to initialize Helm and install Spring Cloud Data Flow on a Kubernetes cluster.
If using Minikube, see Setting Minikube Resources for details on CPU and RAM resource requirements.
Installing Helm
The Spring Cloud Data Flow Helm chart is currently tested against Helm 2.
Helm comprises two components: the client (Helm) and the server (Tiller). The Helm client runs on your local machine and can be installed by following the instructions found here.
If Tiller has not been installed on your cluster, run the following commands to create a service account and run the helm init client command:
kubectl create serviceaccount tiller -n kube-system
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount kube-system:tiller
helm init --wait --service-account tiller
Please see the Helm documentation for additional Helm security configuration.
helm repo update
To verify that the Tiller pod is running, run the following command:
kubectl get pod --namespace kube-system
You should see the Tiller pod running.
Installing the Spring Cloud Data Flow Server and Required Services
Spring Cloud Data Flow Chart
Spring Cloud Data Flow is a toolkit for building microservices-based streaming and batch data processing pipelines on Cloud Foundry and Kubernetes.
Data processing pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks. This makes Spring Cloud Data Flow suitable for a range of data processing use cases, from import/export to event streaming and predictive analytics.
This Helm chart is deprecated
Given the stable deprecation timeline, the Bitnami-maintained Spring Cloud Data Flow Helm chart is now located at bitnami/charts.
The Bitnami repository is already included in the Hubs and we will continue providing the same cadence of updates, support, etc. that we've been keeping here these years. Installation instructions are very similar: add the bitnami repo and use it during the installation (bitnami/<chart> instead of stable/<chart>):
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-release bitnami/<chart> # Helm 3
$ helm install --name my-release bitnami/<chart> # Helm 2
To update an existing stable deployment with a chart hosted in the bitnami repository, you can execute:
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm upgrade my-release bitnami/<chart>
Issues and PRs related to the chart itself will be redirected to the bitnami/charts GitHub repository. In the same way, we'll be happy to answer questions related to this migration process in this issue created as a common place for discussion.
Chart Details
This chart will provision a fully functional and fully featured Spring Cloud Data Flow installation that can deploy and manage data processing pipelines in the cluster that it is deployed to.
Either the default MySQL deployment or an external database can be used as the data store for Spring Cloud Data Flow state and either RabbitMQ or Kafka can be used as the messaging layer for streaming apps to communicate with one another.
For more information on Spring Cloud Data Flow and its capabilities, see its documentation.
Prerequisites
This chart assumes that serviceAccount credentials are available so that the deployed Data Flow server can access the API server (this works on GKE and Minikube by default). See Configure Service Accounts for Pods.
Installing the Chart
To install the chart with the release name my-release:
$ helm install --name my-release stable/spring-cloud-data-flow
If you are using a cluster that does not have a load balancer (like Minikube) then you can install using a NodePort:
$ helm install --name my-release --set server.service.type=NodePort stable/spring-cloud-data-flow
To restrict the load balancer to an IP address range:
$ helm install --name my-release --set server.service.loadBalancerSourceRanges='[10.0.0.0/8]' stable/spring-cloud-data-flow
Data Store
By default, MySQL is deployed with this chart. However, if you wish to use an external database, pass the following set flag to the helm command to disable the MySQL deployment, for example:
--set mysql.enabled=false
In addition, you are required to set all fields listed in External Database Configuration.
Messaging Layer
There are three messaging layers available in this chart:
- RabbitMQ (default)
- RabbitMQ HA
- Kafka
To change the messaging layer to a highly available (HA) version of RabbitMQ, pass the following set flags to the helm command, for example:
--set rabbitmq-ha.enabled=true,rabbitmq.enabled=false
Alternatively, to change the messaging layer to Kafka, pass the following set flags to the helm command, for example:
--set kafka.enabled=true,rabbitmq.enabled=false
Only one messaging layer can be used at a given time. If RabbitMQ and Kafka are enabled, both charts will be installed with RabbitMQ being used in the deployment.
Note that this chart pulls in many different Docker images, so it can take a while to fully install.
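The flag choices above can be wrapped in a small helper so that only one messaging layer is requested at a time. The following is a minimal sketch, not part of the chart: the function name messaging_flags is hypothetical, and the flags are the ones listed in this section. It prints the install command for review instead of running it.

```shell
# Hypothetical helper (not part of the chart): maps a messaging layer name
# to the --set flags documented in this section.
messaging_flags() {
  case "$1" in
    rabbitmq)    : ;;  # chart default, no override needed
    rabbitmq-ha) printf '%s\n' "--set rabbitmq-ha.enabled=true,rabbitmq.enabled=false" ;;
    kafka)       printf '%s\n' "--set kafka.enabled=true,rabbitmq.enabled=false" ;;
    *)           echo "unknown messaging layer: $1" >&2; return 1 ;;
  esac
}

# Print the command for review rather than executing it.
echo "helm install --name my-release $(messaging_flags kafka) stable/spring-cloud-data-flow"
```

Because each branch disables rabbitmq.enabled whenever another layer is selected, the helper avoids the case described above where two messaging charts are installed but only RabbitMQ is used.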
Feature Toggles
If you only need to deploy tasks and schedules, streams can be disabled:
--set features.streaming.enabled=false --set rabbitmq.enabled=false
If you only need to deploy streams, tasks and schedules can be disabled:
--set features.batch.enabled=false
NOTE: features.streaming.enabled and features.batch.enabled should not both be set to false at the same time.
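As an illustration of the rule in the note, a wrapper script could refuse the invalid combination before helm is ever called. This is a sketch under stated assumptions: the function name feature_flags is made up for this example, and only flags shown in this section are emitted.

```shell
# Hypothetical pre-flight check (illustrative name): builds the feature
# toggle flags and rejects the combination the note warns against.
feature_flags() {
  streaming="$1"   # "true" or "false"
  batch="$2"       # "true" or "false"
  if [ "$streaming" = "false" ] && [ "$batch" = "false" ]; then
    echo "streaming and batch must not both be disabled" >&2
    return 1
  fi
  flags="--set features.streaming.enabled=$streaming --set features.batch.enabled=$batch"
  # When streams are disabled, RabbitMQ is switched off as well,
  # mirroring the tasks-only example in this section.
  if [ "$streaming" = "false" ]; then
    flags="$flags --set rabbitmq.enabled=false"
  fi
  printf '%s\n' "$flags"
}

feature_flags false true
```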
Streaming and batch applications can be monitored through Prometheus and Grafana. To deploy these components and enable monitoring, set the following:
--set features.monitoring.enabled=true
When using Minikube, you can obtain the Grafana URL with, for example:
minikube service my-release-grafana --url
On a platform that provides a LoadBalancer, such as GKE, run the following command until the EXTERNAL-IP field is populated with the assigned load balancer IP address:
kubectl get svc my-release-grafana
See the Grafana table below for default credentials and override parameters.
Using an Ingress
If you would like to use an Ingress instead of having the services use the LoadBalancer type, there are a few things to consider.
First, you need to have an Ingress Controller installed in your cluster. If you don't already have one installed, you can use the following commands to install an NGINX Ingress Controller:
kubectl create namespace nginx-ingress
helm install --name nginx-ingress --namespace nginx-ingress stable/nginx-ingress
You can look up the IP address used by the NGINX Ingress Controller with:
ingress=$(kubectl get svc nginx-ingress-controller -n nginx-ingress -ojsonpath='{.status.loadBalancer.ingress[0].ip}')
This is useful if you would like to use xip.io instead of your own DNS resolution. The following options assume that you will use xip.io, but you can replace the host values below with your own DNS hosts if you prefer.
To enable the creation of an Ingress resource and configure the services to use the ClusterIP type, use the following set options in your helm install command:
--set server.service.type=ClusterIP \
--set ingress.enabled=true \
--set ingress.protocol=http \
--set ingress.server.host=scdf.${ingress}.xip.io \
If you want to use an Ingress with the monitoring feature enabled, then use these options instead:
--set features.monitoring.enabled=true \
--set server.service.type=ClusterIP \
--set grafana.service.type=ClusterIP \
--set prometheus.proxy.service.type=ClusterIP \
--set ingress.enabled=true \
--set ingress.protocol=http \
--set ingress.server.host=scdf.${ingress}.xip.io \
--set ingress.grafana.host=grafana.${ingress}.xip.io \
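One practical detail: the kubectl lookup shown earlier returns an empty string while the load balancer address is still pending, which would produce a host such as scdf..xip.io. The sketch below guards against that. The function name ingress_flags is hypothetical, and the address used is a documentation-only example value.

```shell
# Hypothetical helper: emits the Ingress-related --set options from this
# section, refusing to build host names from an empty IP address.
ingress_flags() {
  ip="$1"
  if [ -z "$ip" ]; then
    echo "ingress IP is empty; the load balancer may still be pending" >&2
    return 1
  fi
  printf '%s\n' \
    "--set server.service.type=ClusterIP" \
    "--set ingress.enabled=true" \
    "--set ingress.protocol=http" \
    "--set ingress.server.host=scdf.${ip}.xip.io"
}

# 203.0.113.10 is an example address (reserved documentation range).
# On a cluster, pass the value captured in the ingress variable instead.
ingress_flags "203.0.113.10"
```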
Configuration
The following tables list the configurable parameters and their default values.
RBAC Configuration
Parameter | Description | Default |
---|---|---|
rbac.create | Create RBAC configurations | true |
ServiceAccount Configuration
Parameter | Description | Default |
---|---|---|
serviceAccount.create | Create ServiceAccount | true |
serviceAccount.name | ServiceAccount name | (generated if not specified) |
Data Flow Server Configuration
Parameter | Description | Default |
---|---|---|
server.version | The version/tag of the Data Flow server | 2.6.0 |
server.imagePullPolicy | The imagePullPolicy of the Data Flow server | IfNotPresent |
server.service.type | The service type for the Data Flow server | LoadBalancer |
server.service.annotations | Extra annotations for service resource | {} |
server.service.externalPort | The external port for the Data Flow server | 80 |
server.service.labels | Extra labels for the service resource | {} |
server.service.loadBalancerSourceRanges | A list of IP address ranges to allow through the load balancer | no restriction |
server.platformName | The name of the configured platform account | default |
server.configMap | Custom ConfigMap name for Data Flow server configuration | |
server.trustCerts | Trust self signed certs | false |
server.extraEnv | Extra environment variables to add to the server container | {} |
server.containerConfiguration.container.registry-configurations. | The registry host to use for the profile represented by | |
server.containerConfiguration.container.registry-configurations. | The registry authorization type to use for the profile represented by | |
Skipper Server Configuration
Parameter | Description | Default |
---|---|---|
skipper.version | The version/tag of the Skipper server | 2.5.0 |
skipper.imagePullPolicy | The imagePullPolicy of the Skipper server | IfNotPresent |
skipper.platformName | The name of the configured platform account | default |
skipper.service.type | The service type for the Skipper server | ClusterIP |
skipper.service.annotations | Extra annotations for service resources | {} |
skipper.service.labels | Extra labels for the service resource | {} |
skipper.configMap | Custom ConfigMap name for Skipper server configuration | |
skipper.trustCerts | Trust self signed certs | false |
skipper.extraEnv | Extra environment variables to add to the skipper container | {} |
Spring Cloud Deployer for Kubernetes Configuration
Parameter | Description | Default |
---|---|---|
deployer.resourceLimits.cpu | Deployer resource limit for cpu | 500m |
deployer.resourceLimits.memory | Deployer resource limit for memory | 1024Mi |
deployer.readinessProbe.initialDelaySeconds | Deployer readiness probe initial delay | 120 |
deployer.livenessProbe.initialDelaySeconds | Deployer liveness probe initial delay | 90 |
RabbitMQ Configuration
Parameter | Description | Default |
---|---|---|
rabbitmq.enabled | Enable RabbitMQ as the middleware to use | true |
rabbitmq.rabbitmq.username | RabbitMQ user name | user |
rabbitmq.rabbitmq.password | RabbitMQ password to encode into the secret | changeme |
RabbitMQ HA Configuration
Parameter | Description | Default |
---|---|---|
rabbitmq-ha.enabled | Enable RabbitMQ HA as the middleware to use | false |
rabbitmq-ha.rabbitmqUsername | RabbitMQ user name | user |
Kafka Configuration
Parameter | Description | Default |
---|---|---|
kafka.enabled | Enable Kafka as the middleware to use | false |
kafka.replicas | The number of Kafka replicas to use | 1 |
kafka.configurationOverrides | Kafka deployment configuration overrides | replication.factor=1, metrics.enabled=false |
kafka.zookeeper.replicaCount | The number of ZooKeeper replicas to use | 1 |
MySQL Configuration
Parameter | Description | Default |
---|---|---|
mysql.enabled | Enable deployment of MySQL | true |
mysql.mysqlDatabase | MySQL database name | dataflow |
External Database Configuration
Parameter | Description | Default |
---|---|---|
database.driver | Database driver | nil |
database.scheme | Database scheme | nil |
database.host | Database host | nil |
database.port | Database port | nil |
database.user | Database user | scdf |
database.password | Database password | nil |
database.dataflow | Database name for SCDF server | dataflow |
database.skipper | Database name for SCDF skipper | skipper |
Feature Toggles
Parameter | Description | Default |
---|---|---|
features.streaming.enabled | Enables or disables streams | true |
features.batch.enabled | Enables or disables tasks and schedules | true |
features.monitoring.enabled | Enables or disables monitoring | false |
Ingress
Parameter | Description | Default |
---|---|---|
ingress.enabled | Enables or disables ingress support | true |
ingress.protocol | Sets the protocol used by ingress server | https |
ingress.server.host | Sets the host used for server | data-flow.local |
ingress.grafana.host | Sets the host used for grafana | grafana.local |
Grafana
Parameter | Description | Default |
---|---|---|
grafana.service.type | Service type to use | LoadBalancer |
grafana.admin.existingSecret | Existing Secret to use for login credentials | scdf-grafana-secret |
grafana.admin.userKey | Secret userKey field | admin-user |
grafana.admin.passwordKey | Secret passwordKey field | admin-password |
grafana.admin.defaultUsername | The default base64 encoded login username used in the secret | admin |
grafana.admin.defaultPassword | The default base64 encoded login password used in the secret | password |
grafana.extraConfigmapMounts | ConfigMap mount for datasources | scdf-grafana-ds-cm |
grafana.dashboardProviders | Dashboard provider for imported dashboards | default |
grafana.dashboards | Dashboards to auto import | SCDF Apps, Streams & Tasks |
Prometheus
Parameter | Description | Default |
---|---|---|
prometheus.server.global.scrape_interval | Scrape interval | 10s |
prometheus.server.global.scrape_timeout | Scrape timeout | 9s |
prometheus.server.global.evaluation_interval | Evaluation interval | 10s |
prometheus.extraScrapeConfigs | Additional scrape configs for proxied applications | proxied-applications & proxies jobs |
prometheus.podSecurityPolicy | Enable or disable PodSecurityContext | true |
prometheus.alertmanager | Enable or disable alert manager | false |
prometheus.kubeStateMetrics | Enable or disable kube state metrics | false |
prometheus.nodeExporter | Enable or disable node exporter | false |
prometheus.pushgateway | Enable or disable push gateway | false |
prometheus.proxy.service.type | Service type to use | LoadBalancer |
Expected output
After issuing the helm install command, you should see output similar to the following:
NAME: my-release
LAST DEPLOYED: Sat Mar 10 11:33:29 2018
NAMESPACE: default
STATUS: DEPLOYED
RESOURCES:
==> v1/Secret
NAME TYPE DATA AGE
my-release-mysql Opaque 2 1s
my-release-data-flow Opaque 2 1s
my-release-rabbitmq Opaque 2 1s
==> v1/ConfigMap
NAME DATA AGE
my-release-data-flow-server 1 1s
my-release-data-flow-skipper 1 1s
==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
my-release-rabbitmq Bound pvc-e9ed7f55-2499-11e8-886f-08002799df04 8Gi RWO standard 1s
my-release-mysql Pending standard 1s
==> v1/ServiceAccount
NAME SECRETS AGE
my-release-data-flow 1 1s
==> v1/Service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-release-mysql 10.110.98.253 <none> 3306/TCP 1s
my-release-data-flow-server 10.105.216.155 <pending> 80:32626/TCP 1s
my-release-rabbitmq 10.106.76.215 <none> 4369/TCP,5672/TCP,25672/TCP,15672/TCP 1s
my-release-data-flow-skipper 10.100.28.64 <none> 80/TCP 1s
==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
my-release-mysql 1 1 1 0 1s
my-release-rabbitmq 1 1 1 0 1s
my-release-data-flow-skipper 1 1 1 0 1s
my-release-data-flow-server 1 1 1 0 1s
Get the Spring Cloud Data Flow's application URL by running these commands:
export SERVICE_IP=$(kubectl get svc --namespace default my-release-data-flow-server -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo http://$SERVICE_IP:80
It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of the server by running kubectl get svc -w my-release-data-flow-server
If you are using Minikube, you can use the following command to get the URL for the server:
minikube service --url my-release-data-flow-server
You have just created a new release in the default namespace of your Kubernetes cluster.
It takes a couple of minutes for the application and its required services to start.
You can check on the status by issuing a kubectl get pod -w command.
You need to wait for the READY column to show 1/1 for all pods.
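The READY check can also be scripted. The following is a minimal sketch, assuming the default column layout of kubectl get pod (NAME, then READY); the function name all_pods_ready and the pod names in the sample are made up for illustration.

```shell
# Hypothetical helper: reads `kubectl get pod` output on stdin and succeeds
# only if at least one pod is listed and every READY value has matching
# numbers on both sides of the slash (for example 1/1).
all_pods_ready() {
  awk 'NR > 1 { n++; split($2, r, "/"); if (r[1] != r[2]) bad = 1 }
       END { exit (n == 0 || bad) ? 1 : 0 }'
}

# Sample input with made-up pod names; on a cluster, pipe in `kubectl get pod`.
sample='NAME READY STATUS RESTARTS AGE
my-release-mysql-abc 1/1 Running 0 2m
my-release-data-flow-server-xyz 0/1 Running 0 2m'

if printf '%s\n' "$sample" | all_pods_ready; then
  echo "all pods ready"
else
  echo "still waiting"
fi
```

On a live cluster, the same function could be called in a loop with a sleep between attempts. Recent kubectl versions also provide a kubectl wait subcommand that may serve the same purpose.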
When all pods are ready, you can access the Spring Cloud Data Flow dashboard at http://<SERVICE_ADDRESS>/dashboard, where <SERVICE_ADDRESS> is the address returned by either the kubectl or minikube commands above.
To see which Helm releases of Spring Cloud Data Flow you have running, you can use the helm list command.
When it is time to delete the previously installed SCDF release, run helm delete my-release.
This command removes any resources created for the release but keeps release information so that you can roll back any changes by using a helm rollback my-release 1 command.
To completely delete the release and purge any release metadata, you can use helm delete my-release --purge.
Secret management
There is an issue with generated secrets that are used for the required services getting rotated on chart upgrades. To avoid this issue, set the password for these services when installing the chart. You can use the following command to do so:
helm install --name my-release \
--set rabbitmq.rabbitmq.password=rabbitpwd \
--set mysql.mysqlRootPassword=mysqlpwd stable/spring-cloud-data-flow
Version Compatibility
The following listing shows Spring Cloud Data Flow’s version compatibility with the respective Helm Chart releases:
SCDF Version | Chart Version |
---|---|
SCDF-K8S-Server 1.7.x | 1.0.x |
SCDF-K8S-Server 2.0.x | 2.2.x |
SCDF-K8S-Server 2.1.x | 2.3.x |
SCDF-K8S-Server 2.2.x | 2.4.x |
SCDF-K8S-Server 2.3.x | 2.5.x |
Register prebuilt applications
All the prebuilt streaming applications:
- Are available as Apache Maven artifacts or Docker images.
- Use RabbitMQ or Apache Kafka.
- Support monitoring via Prometheus and InfluxDB.
- Contain metadata for application properties used in the UI and code completion in the shell.
Applications can be registered individually using the app register functionality or as a group using the app import functionality.
There are also dataflow.spring.io links that represent the group of prebuilt applications for a specific release, which is useful for getting started.
You can register applications using the UI or the shell. Even though we are only using two prebuilt applications, we will register the full set of prebuilt applications.
The easiest way to install Data Flow on Kubernetes is to use the Helm chart, which uses RabbitMQ as the default messaging middleware. The command to import the Kafka version of the applications is:
dataflow:>app import --uri https://dataflow.spring.io/kafka-docker-latest
Change kafka to rabbitmq in the above URL if you kept the chart's default RabbitMQ messaging layer. Keep the Kafka URL if you set kafka.enabled=true in the Helm chart, or if you followed the manual kubectl-based installation instructions for installing Data Flow on Kubernetes and chose Kafka as the messaging middleware.
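Since the two import URLs differ only in the binder name, the choice can be derived from whichever messaging layer was installed. A small sketch follows; the function name import_uri is hypothetical, and the URL pattern is taken from the command above.

```shell
# Hypothetical helper: builds the bulk-import URI for a given binder.
import_uri() {
  case "$1" in
    kafka|rabbitmq) printf '%s\n' "https://dataflow.spring.io/$1-docker-latest" ;;
    *) echo "unsupported binder: $1" >&2; return 1 ;;
  esac
}

# Prints the shell command to paste at the dataflow:> prompt.
echo "app import --uri $(import_uri rabbitmq)"
```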
Only applications registered with a --uri property pointing to a Docker resource are supported by the Data Flow Server for Kubernetes. However, we do support Maven resources for the --metadata-uri property, which is used to list the properties supported by each application. For example, the following application registration is valid:
app register --type source --name time --uri docker://springcloudstream/time-source-rabbit:{docker-time-source-rabbit-version} --metadata-uri maven://org.springframework.cloud.stream.app:time-source-rabbit:jar:metadata:{docker-time-source-rabbit-version}
Any application registered with a Maven, HTTP, or File resource for the executable jar (by using a --uri property prefixed with maven://, http://, or file://) is not supported.
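A registration script could apply this rule before calling app register. The sketch below only inspects the URI prefix; the function name check_app_uri is made up for this example, and the image name in the usage line is a placeholder.

```shell
# Hypothetical pre-check: accepts only docker:// URIs for --uri, as required
# by the Data Flow Server for Kubernetes.
check_app_uri() {
  case "$1" in
    docker://*) return 0 ;;
    *) echo "unsupported --uri for Kubernetes: $1" >&2; return 1 ;;
  esac
}

# Placeholder image reference, used only to exercise the check.
check_app_uri "docker://example/time-source:1.0" && echo "ok to register"
```

The check applies to --uri only. A maven:// value remains acceptable for --metadata-uri, as described above.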
Application and Server Properties
This section covers how you can customize the deployment of your applications. You can use a number of properties to influence settings for the applications that are deployed. Properties can be applied on a per-application basis or in the appropriate server configuration for all deployed applications.
Properties set on a per-application basis always take precedence over properties set in the server configuration. This arrangement lets you override global server-level properties on a per-application basis.
Properties to be applied for all deployed Tasks are defined in the src/kubernetes/server/server-config.yaml file and for Streams in src/kubernetes/skipper/skipper-config-(binder).yaml. Replace (binder) with the messaging middleware you are using, for example, rabbit or kafka.
Memory and CPU Settings
Applications are deployed with default memory and CPU settings. If needed, these values can be adjusted. The following example shows how to set Limits to 1000m for CPU and 1024Mi for memory and Requests to 800m for CPU and 640Mi for memory:
deployer.<app>.kubernetes.limits.cpu=1000m
deployer.<app>.kubernetes.limits.memory=1024Mi
deployer.<app>.kubernetes.requests.cpu=800m
deployer.<app>.kubernetes.requests.memory=640Mi
Those values result in the following container settings being used:
Limits:
  cpu: 1
  memory: 1Gi
Requests:
  cpu: 800m
  memory: 640Mi
You can also set the default cpu and memory values globally.
The following example shows how to set the CPU and memory for streams and tasks:
# Streams: Skipper server configuration
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    limits:
                      memory: 640Mi
                      cpu: 500m
# Tasks: Data Flow server configuration
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    limits:
                      memory: 640Mi
                      cpu: 500m
The settings we have used so far only affect the settings for the container. They do not affect the memory setting for the JVM process in the container. If you would like to set JVM memory settings, you can provide an environment variable to do so. See the next section for details.
Environment Variables
To influence the environment settings for a given application, you can use the spring.cloud.deployer.kubernetes.environmentVariables deployer property. For example, a common requirement in production settings is to influence the JVM memory arguments. You can do so by using the JAVA_TOOL_OPTIONS environment variable, as the following example shows:
deployer.<app>.kubernetes.environmentVariables=JAVA_TOOL_OPTIONS=-Xmx1024m
This overrides the JVM memory setting for the desired <app> (replace <app> with the name of your application).
The environmentVariables property accepts a comma-delimited string. If an environment variable contains a value that is itself a comma-delimited string, it must be enclosed in single quotation marks, for example:
spring.cloud.deployer.kubernetes.environmentVariables=spring.cloud.stream.kafka.binder.brokers='somehost:9092, anotherhost:9093'
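The quoting rule for values that contain commas can be sketched as a small formatter. The function name env_entry is hypothetical; it only illustrates when the single quotation marks are needed and does not escape quotes inside the value.

```shell
# Hypothetical formatter: renders one NAME=value entry for the
# environmentVariables property, single-quoting values that contain commas.
env_entry() {
  case "$2" in
    *,*) printf "%s='%s'\n" "$1" "$2" ;;
    *)   printf "%s=%s\n" "$1" "$2" ;;
  esac
}

# Join two entries with a comma to form the property value.
value="$(env_entry JAVA_TOOL_OPTIONS -Xmx1024m),$(env_entry spring.cloud.stream.kafka.binder.brokers 'somehost:9092, anotherhost:9093')"
echo "deployer.<app>.kubernetes.environmentVariables=$value"
```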
Liveness and Readiness Probes
The liveness and readiness probes use paths called /health and /info, respectively. They use a delay of 10 for both and a period of 60 and 10, respectively. You can change these defaults when you deploy the stream by using deployer properties. Liveness and readiness probes are only applied to streams.
The following example changes the liveness probe (replace <app> with the name of your application) by setting deployer properties:
deployer.<app>.kubernetes.livenessProbePath=/health
deployer.<app>.kubernetes.livenessProbeDelay=120
deployer.<app>.kubernetes.livenessProbePeriod=20
You can declare the same as part of the server global configuration for streams, as the following example shows:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    livenessProbePath: /health
                    livenessProbeDelay: 120
                    livenessProbePeriod: 20
Similarly, you can swap liveness for readiness to override the default readiness settings.
By default, port 8080 is used as the probe port. You can change the defaults for both liveness and readiness probe ports by using deployer properties, as the following example shows:
deployer.<app>.kubernetes.readinessProbePort=7000
deployer.<app>.kubernetes.livenessProbePort=7000
You can declare the same as part of the global configuration for streams, as the following example shows:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    readinessProbePort: 7000
                    livenessProbePort: 7000
By default, the liveness and readiness probe paths use Spring Boot 2.x+ actuator endpoints. To use Spring Boot 1.x actuator endpoint paths, you must adjust the liveness and readiness values, as the following example shows (replace <app> with the name of your application):
deployer.<app>.kubernetes.livenessProbePath=/health
deployer.<app>.kubernetes.readinessProbePath=/info
To automatically set both liveness and readiness endpoints on a per-application basis to the default Spring Boot 1.x paths, you can set the following property:
deployer.<app>.kubernetes.bootMajorVersion=1
You can access secured probe endpoints by using credentials stored in a Kubernetes secret. You can use an existing secret, provided the credentials are contained under the credentials key name of the secret’s data block. You can configure probe authentication on a per-application basis. When enabled, it is applied to both the liveness and readiness probe endpoints by using the same credentials and authentication type. Currently, only Basic authentication is supported.
To create a new secret:
- Generate the base64 string with the credentials used to access the secured probe endpoints. Basic authentication encodes a username and password as a base64 string in the format of username:password. The following example (which includes output and in which you should replace user and pass with your values) shows how to generate a base64 string:
  echo -n "user:pass" | base64
  dXNlcjpwYXNz
- With the encoded credentials, create a file (for example, myprobesecret.yml) with the following contents:
  apiVersion: v1
  kind: Secret
  metadata:
    name: myprobesecret
  type: Opaque
  data:
    credentials: GENERATED_BASE64_STRING
- Replace GENERATED_BASE64_STRING with the base64-encoded value generated earlier.
- Create the secret by using kubectl, as the following example shows:
  kubectl create -f ./myprobesecret.yml
  secret "myprobesecret" created
- Set the following deployer properties to use authentication when accessing probe endpoints, as the following example shows:
  deployer.<app>.kubernetes.probeCredentialsSecret=myprobesecret
  Replace <app> with the name of the application to which to apply authentication.
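Before placing the encoded string in the secret, it can be worth confirming that it decodes back to the intended credentials. A stray trailing newline, for example from an echo without -n, changes the encoded value. The following sketch uses the placeholder credentials user:pass from the steps above and assumes a base64 tool that accepts --decode.

```shell
# Placeholder credentials from the steps above; replace with real values.
credentials="user:pass"

# printf avoids the trailing newline that a plain echo would add.
encoded=$(printf '%s' "$credentials" | base64)
echo "$encoded"

# Round-trip check: decoding must give back exactly the original string.
decoded=$(printf '%s' "$encoded" | base64 --decode)
if [ "$decoded" = "$credentials" ]; then
  echo "round trip ok"
else
  echo "encoded value does not match the credentials" >&2
fi
```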
Using SPRING_APPLICATION_JSON
You can use a SPRING_APPLICATION_JSON environment variable to set Data Flow server properties (including the configuration of Maven repository settings) that are common across all of the Data Flow server implementations. These settings go at the server level in the container env section of a deployment YAML. The following example shows how to do so:
env:
- name: SPRING_APPLICATION_JSON
  value: |-
    {
      "maven": {
        "local-repository": null,
        "remote-repositories": {
          "repo1": {
            "url": "https://repo.spring.io/libs-snapshot"
          }
        }
      }
    }
Private Docker Registry
You can pull Docker images from a private registry on a per-application basis. First, you must create a secret in the cluster. Follow the Pull an Image from a Private Registry guide to create the secret.
Once you have created the secret, you can use the imagePullSecret property to set the secret to use, as the following example shows:
deployer.<app>.kubernetes.imagePullSecret=mysecret
Replace <app> with the name of your application and mysecret with the name of the secret you created earlier.
You can also configure the image pull secret at the global server level.
The following example shows how to do so for streams and tasks:
# Streams: Skipper server configuration
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullSecret: mysecret
# Tasks: Data Flow server configuration
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullSecret: mysecret
Replace mysecret with the name of the secret you created earlier.
Volume Mounted Secrets
Data Flow uses the application metadata stored in a container image label. To access the metadata labels in a private registry, you have to extend the Data Flow deployment configuration and mount the registry secrets as a Secrets PropertySource:
spec:
  containers:
  - name: scdf-server
    ...
    volumeMounts:
    - name: mysecret
      mountPath: /etc/secrets/mysecret
      readOnly: true
  ...
  volumes:
  - name: mysecret
    secret:
      secretName: mysecret
Annotations
You can add annotations to Kubernetes objects on a per-application basis. The supported object types are pod Deployment, Service, and Job. Annotations are defined in a key:value format, allowing for multiple annotations separated by a comma. For more information and use cases on annotations, see Annotations.
The following example shows how you can configure applications to use annotations:
deployer.<app>.kubernetes.podAnnotations=annotationName:annotationValue
deployer.<app>.kubernetes.serviceAnnotations=annotationName:annotationValue,annotationName2:annotationValue2
deployer.<app>.kubernetes.jobAnnotations=annotationName:annotationValue
Replace <app> with the name of your application and the value of your annotations.
Entry Point Style
An entry point style affects how application properties are passed to the container to be deployed. Currently, three styles are supported:
- exec (default): Passes all application properties and command line arguments in the deployment request as container arguments. Application properties are transformed into the format of --key=value.
- shell: Passes all application properties and command line arguments as environment variables. Each of the application and command line argument properties is transformed into an uppercase string, and . characters are replaced with _.
- boot: Creates an environment variable called SPRING_APPLICATION_JSON that contains a JSON representation of all application properties. Command line arguments from the deployment request are set as container args.
In all cases, environment variables defined at the server-level configuration and on a per-application basis are set onto the container as is.
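To make the shell style concrete, the name transformation it describes (uppercase, with . replaced by _) can be approximated as follows. This is an illustration of the stated rule only, not the deployer's actual code. The function name to_env_name is hypothetical, and characters other than letters and dots are left untouched here.

```shell
# Hypothetical illustration of the shell entry point style naming rule:
# uppercase the property name and replace each "." with "_".
to_env_name() {
  printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_'
}

# server.port is used here as a familiar Spring Boot property name.
echo "$(to_env_name server.port)=8081"
```

With the exec style, the same property would instead be passed to the container as the argument --server.port=8081.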
You can configure applications as follows:
deployer.<app>.kubernetes.entryPointStyle=<Entry Point Style>
Replace <app> with the name of your application and <Entry Point Style> with your desired entry point style.
You can also configure the entry point style at the global server level.
The following example shows how to do so for streams:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    entryPointStyle: entryPointStyle
The following example shows how to do so for tasks:
data:
application.yaml: |-
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
entryPointStyle: entryPointStyle
Replace entryPointStye
with the desired entry point style.
You should choose an Entry Point Style of either exec
or shell
, to
correspond to how the ENTRYPOINT
syntax is defined in the container’s
Dockerfile
. For more information and uses cases on exec
versus
shell
, see the
ENTRYPOINT
section of the Docker documentation.
Using the boot entry point style corresponds to using the exec style ENTRYPOINT. Command line arguments from the deployment request are passed to the container, with the addition of application properties being mapped into the SPRING_APPLICATION_JSON environment variable rather than command line arguments.
When you use the boot Entry Point Style, the deployer.<app>.kubernetes.environmentVariables property must not contain SPRING_APPLICATION_JSON.
Deployment Service Account
You can configure a custom service account for application deployments through properties. You can use an existing service account or create a new one. One way to create a service account is by using kubectl, as the following example shows:
kubectl create serviceaccount myserviceaccountname
serviceaccount "myserviceaccountname" created
Then you can configure individual applications as follows:
deployer.<app>.kubernetes.deploymentServiceAccountName=myserviceaccountname
Replace <app> with the name of your application and myserviceaccountname with your service account name.
You can also configure the service account name at the global server level.
The following example shows how to do so for streams:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    deploymentServiceAccountName: myserviceaccountname
The following example shows how to do so for tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    deploymentServiceAccountName: myserviceaccountname
Replace myserviceaccountname with the service account name to be applied to all deployments.
Image Pull Policy
An image pull policy defines when a Docker image should be pulled to the local registry. Currently, three policies are supported:
IfNotPresent (default): Do not pull an image if it already exists.
Always: Always pull the image regardless of whether it already exists.
Never: Never pull an image. Use only an image that already exists.
The following example shows how you can individually configure applications:
deployer.<app>.kubernetes.imagePullPolicy=Always
Replace <app> with the name of your application and Always with your desired image pull policy.
You can configure an image pull policy at the global server level.
The following example shows how to do so for streams:
data:
  application.yaml: |-
    spring:
      cloud:
        skipper:
          server:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullPolicy: Always
The following example shows how to do so for tasks:
data:
  application.yaml: |-
    spring:
      cloud:
        dataflow:
          task:
            platform:
              kubernetes:
                accounts:
                  default:
                    imagePullPolicy: Always
Replace Always with your desired image pull policy.
Deployment Labels
You can set custom labels on objects related to Deployment. See Labels for more information on labels. Labels are specified in key:value format.
The following example shows how you can individually configure applications:
deployer.<app>.kubernetes.deploymentLabels=myLabelName:myLabelValue
Replace <app> with the name of your application, myLabelName with your label name, and myLabelValue with the value of your label.
Additionally, you can apply multiple labels, as the following example shows:
deployer.<app>.kubernetes.deploymentLabels=myLabelName:myLabelValue,myLabelName2:myLabelValue2
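To make the comma-separated key:value format concrete, the following Python sketch splits such a property value into individual labels. It is only an illustration of the format. The function name is hypothetical, and the deployer's real parsing rules (for example, around whitespace or malformed entries) may differ.

```python
def parse_deployment_labels(value):
    """Split a 'name:value,name2:value2' string into a dict of labels."""
    labels = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty entries such as a trailing comma
        name, separator, label_value = pair.partition(":")
        if not separator or not name.strip():
            raise ValueError(f"expected key:value, got {pair!r}")
        labels[name.strip()] = label_value.strip()
    return labels


if __name__ == "__main__":
    print(parse_deployment_labels("myLabelName:myLabelValue,myLabelName2:myLabelValue2"))
```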
NodePort
Applications are deployed using a Service type of ClusterIP, which is the default Kubernetes Service type if not defined otherwise.
ClusterIP services are only reachable from within the cluster itself.
To make the deployed application available externally, one option is to use NodePort. See the NodePort documentation for more information.
The following example shows how you can individually configure applications to use Kubernetes-assigned ports:
deployer.<app>.kubernetes.createNodePort=true
Replace <app> with the name of your application.
Additionally, you can define the port to use for the NodePort Service as shown below:
deployer.<app>.kubernetes.createNodePort=31101
Replace <app> with the name of your application and the value of 31101 with your desired port.
When defining the port manually, the port must not already be in use and must be within the defined NodePort range.
Per the NodePort documentation, the default port range is 30000-32767.
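Before setting a manual port, it can help to check the value against the range. The following Python sketch covers only the two forms of createNodePort shown above (true or an explicit port) and assumes the default range of 30000-32767. The function name is hypothetical, and a cluster configured with a different NodePort range would need different bounds. It cannot tell whether a port is already in use.

```python
DEFAULT_NODE_PORT_MIN = 30000
DEFAULT_NODE_PORT_MAX = 32767


def resolve_node_port(value, low=DEFAULT_NODE_PORT_MIN, high=DEFAULT_NODE_PORT_MAX):
    """Return None when Kubernetes should assign the port, or the validated port number."""
    text = str(value).strip()
    if text.lower() == "true":
        return None  # let Kubernetes pick a free port from its range
    port = int(text)  # raises ValueError if the value is not numeric
    if not low <= port <= high:
        raise ValueError(f"port {port} is outside the NodePort range {low}-{high}")
    return port


if __name__ == "__main__":
    print(resolve_node_port("true"))
    print(resolve_node_port("31101"))
```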
Monitoring
To learn more about the monitoring experience in Data Flow using Prometheus running on Kubernetes, please refer to the Stream Monitoring feature guide.