Application Metadata
Spring Boot provides support for bundling metadata about an application's configuration properties within the executable jar, used for tooling and document generation. In this section we discuss how to configure and build an application, including how to provide application configuration metadata as a label in a container image, to work with Data Flow.
For your own applications, you can generate application configuration metadata from classes annotated with @ConfigurationProperties by using the spring-boot-configuration-processor library. This library includes a Java annotation processor that is invoked when you compile your project to generate the configuration metadata file, stored in the uber-jar as META-INF/spring-configuration-metadata.json.
To use the configuration processor, include the following dependency in your application's pom.xml:
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-configuration-processor</artifactId>
  <optional>true</optional>
</dependency>
Exposing Application Properties for Data Flow
Stream and Task applications are Spring Boot applications that provide common application properties, as well as common properties used by Data Flow, along with others included with application dependencies. For a typical application, the complete set of available properties can be extensive. This presents a usability challenge for Data Flow tooling. For this reason, application configuration features in the Data Flow UI and shell rely on additional configuration property metadata to ensure that only the most relevant configuration properties are included (by default) when providing contextual help, such as listing available properties, handling auto completion, and performing front-end validation.
Data Flow Configuration Metadata
To define which application properties are most relevant to Data Flow, create a file named META-INF/dataflow-configuration-metadata.properties in the project resource directory. This file must include one or both of the following properties:
- configuration-properties.classes, containing a comma-separated list of fully qualified @ConfigurationProperties class names.
- configuration-properties.names, containing a comma-separated list of property names. This may be the full property name, such as server.port, or a prefix, such as spring.jmx, to include all related properties.
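The full-name-or-prefix rule for configuration-properties.names can be sketched in a few lines of Java. This is only an illustration: the class ExposedPropertyFilter and its method are hypothetical names, and the matching that Data Flow actually performs may differ in detail (for example, in how it treats separators).

```java
import java.util.List;

// Hypothetical helper, not part of Data Flow: illustrates matching a property
// against a list of full names and prefixes.
final class ExposedPropertyFilter {

    // Returns true if the property equals a listed name or sits under a listed prefix.
    static boolean isExposed(String property, List<String> names) {
        for (String name : names) {
            if (property.equals(name) || property.startsWith(name + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = List.of("server.port", "spring.jmx");
        System.out.println(isExposed("server.port", names));        // full name match
        System.out.println(isExposed("spring.jmx.enabled", names)); // prefix match
        System.out.println(isExposed("server.address", names));     // not listed
    }
}
```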
The Spring Cloud Stream applications Git repository is a good place to look for examples. For instance, the jdbc sink's dataflow-configuration-metadata.properties file contains:
configuration-properties.classes=org.springframework.cloud.fn.consumer.jdbc.JdbcConsumerProperties
configuration-properties.names=\
spring.datasource.url,\
spring.datasource.driver-class-name,\
spring.datasource.username,\
spring.datasource.password,\
spring.datasource.schema,\
spring.datasource.data,\
spring.datasource.initialization-mode
Here, we want to expose specific @ConfigurationProperties used by the sink, along with some standard spring.datasource configuration properties needed to configure the JDBC datasource.
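The file uses standard Java properties syntax, where a trailing backslash continues a value on the next line. The following sketch shows how such content could be read with java.util.Properties and split into individual entries. The class DataflowMetadataFile is a hypothetical name used only for illustration; it is not how Data Flow itself is implemented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical reader for dataflow-configuration-metadata.properties content.
final class DataflowMetadataFile {

    // Parses properties-file content; a trailing backslash joins the next line to the value.
    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Splits a comma-separated value into trimmed, non-empty entries.
    static List<String> entries(Properties props, String key) {
        List<String> result = new ArrayList<>();
        for (String item : props.getProperty(key, "").split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String content = String.join("\n",
                "configuration-properties.classes=org.springframework.cloud.fn.consumer.jdbc.JdbcConsumerProperties",
                "configuration-properties.names=\\",
                "    spring.datasource.url,\\",
                "    spring.datasource.username");
        Properties props = parse(content);
        System.out.println(entries(props, "configuration-properties.classes"));
        System.out.println(entries(props, "configuration-properties.names"));
    }
}
```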
Packaging Configuration Metadata
The common steps for packaging configuration properties in an executable jar or container image are:
- Add Boot's configuration processor to your pom.xml, as explained above.
- Specify which properties you want to expose, as explained above.
- Configure the spring-cloud-app-starter-metadata-maven-plugin, if necessary, as explained in Creating Metadata Artifacts.
The additional steps to create a label in a container image with the application metadata are:
- Configure the properties-maven-plugin, as explained in Properties Maven Plugin, to load the META-INF/spring-configuration-metadata-encoded.properties file as Maven properties. Among other steps, it loads the org.springframework.cloud.dataflow.spring.configuration.metadata.json property.
- Extend the jib-maven-plugin (or docker-maven-plugin) configuration, if necessary, as explained in Container Maven Plugin.
Dedicated Metadata Artifacts
Including the application metadata inside the uber jar has the downside of needing to download a potentially very large uber jar just to inspect the metadata. This can cause noticeable delays when invoking certain Data Flow operations that require this metadata. Creating a separate jar that contains only the application metadata has several advantages:
- The metadata artifact is usually a few kilobytes, as opposed to megabytes for the actual application. Consequently, it is quicker to download, enabling quick response times when using Data Flow's UI and shell.
- A smaller size also helps in resource-constrained environments, such as Cloud Foundry, where the local disk size is often limited.
For environments that use container images (for example, Kubernetes), Data Flow accesses the configured container registry through a REST API to query the metadata without having to download the image. Optionally, if you choose to create a dedicated metadata jar, Data Flow will use it.
Creating Metadata Artifacts
The spring-cloud-app-starter-metadata-maven-plugin helps to prepare all necessary metadata files for your application.
Depending on the runtime, the metadata is packaged either as a separate companion artifact jar or as a configuration label inside the application's container image. To use the plugin, add the following to your pom.xml:
<plugin>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-dataflow-apps-metadata-plugin</artifactId>
  <version>1.0.2</version>
  <configuration>
    <storeFilteredMetadata>true</storeFilteredMetadata>
  </configuration>
  <executions>
    <execution>
      <id>aggregate-metadata</id>
      <phase>compile</phase>
      <goals>
        <goal>aggregate-metadata</goal>
      </goals>
    </execution>
  </executions>
</plugin>
You must use this plugin in conjunction with the spring-boot-configuration-processor that creates the spring-configuration-metadata.json files. Be sure to configure both of them.
NOTE: The Cloud Native buildpack for Spring Boot, used by the build-image goal of the Spring Boot Maven Plugin (version 2.4.1 or later is strongly recommended), provides this metadata automatically for a properly configured application.
Metadata Jar File
For applications packaged as an uber-jar, the plugin creates a companion artifact that contains the metadata. Specifically, it contains the Spring Boot JSON file with the configuration properties metadata and the Data Flow configuration metadata file described in the previous section. The following example shows the contents of such an artifact, for the canonical log sink:
$ jar tvf log-sink-rabbit-3.0.0.BUILD-SNAPSHOT-metadata.jar
373848 META-INF/spring-configuration-metadata.json
174 META-INF/dataflow-configuration-metadata.properties
The spring-cloud-app-starter-metadata-maven-plugin generates a ready-to-use application metadata.jar artifact. Make sure the plugin is configured in your application's pom.xml.
Metadata Container Image Label
For applications packaged as container images, the spring-cloud-app-starter-metadata-maven-plugin copies the contents of the spring-configuration-metadata.json file, as well as the exposed properties for Data Flow, into a configuration label in the generated application container image, under the org.springframework.cloud.dataflow.spring-configuration-metadata.json label. All the configuration metadata is included in the container image, so there is no need for a companion artifact.
At compile time, the plugin generates a META-INF/spring-configuration-metadata-encoded.properties file with a single property inside: org.springframework.cloud.dataflow.spring.configuration.metadata.json. The property value is the stringified, exposed subset of the configuration metadata. The following listing shows the content of a typical encoded properties file:
org.springframework.cloud.dataflow.spring.configuration.metadata.json={\n \"groups\": [{\n \"name\": \"log\",\n \"type\": \"org.springframework.cloud.stream.app.log.sink.LogSinkProperties\",\n \"sourceType\": \"org.springframework.cloud.stream.app.log.sink.LogSinkProperties\"\n }],\n \"properties\": [\n {\n \"name\": \"log.expression\",\n \"type\": \"java.lang.String\",\n \"description\": \"A SpEL expression (against the incoming message) to evaluate as the logged message.\",\n \"sourceType\": \"org.springframework.cloud.stream.app.log.sink.LogSinkProperties\",\n \"defaultValue\": \"payload\"\n },\n {\n \"name\": \"log.level\",\n \"type\": \"org.springframework.integration.handler.LoggingHandler$Level\",\n \"description\": \"The level at which to log messages.\",\n \"sourceType\": \"org.springframework.cloud.stream.app.log.sink.LogSinkProperties\"\n },\n {\n \"name\": \"log.name\",\n \"type\": \"java.lang.String\",\n \"description\": \"The name of the logger to use.\",\n \"sourceType\": \"org.springframework.cloud.stream.app.log.sink.LogSinkProperties\"\n }\n ],\n \"hints\": []\n}
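The escaped \n and \" sequences in the listing follow Java properties-file escaping, so loading the file with java.util.Properties yields plain JSON again. The sketch below demonstrates this with a shortened value. The class EncodedMetadataReader is a hypothetical name, and the assumption that downstream tooling decodes the value in the same way is not confirmed by this guide.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical reader showing how the encoded property value decodes to JSON.
final class EncodedMetadataReader {

    static final String KEY =
            "org.springframework.cloud.dataflow.spring.configuration.metadata.json";

    // Loads properties content and returns the decoded metadata value (null if absent).
    static String readJson(String propertiesContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(KEY);
    }

    public static void main(String[] args) {
        // The file content as stored on disk: \n and \" are literal two-character escapes.
        String onDisk = KEY + "={\\n \\\"groups\\\": [],\\n \\\"hints\\\": []\\n}";
        // Prints multi-line JSON with unescaped quotes.
        System.out.println(readJson(onDisk));
    }
}
```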
Properties Maven Plugin
To turn this property into a Docker label, we first need to load it as a Maven property by using the properties-maven-plugin:
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>properties-maven-plugin</artifactId>
  <version>1.0.0</version>
  <executions>
    <execution>
      <phase>process-classes</phase>
      <goals>
        <goal>read-project-properties</goal>
      </goals>
      <configuration>
        <files>
          <file>${project.build.outputDirectory}/META-INF/spring-configuration-metadata-encoded.properties</file>
        </files>
      </configuration>
    </execution>
  </executions>
</plugin>
Container Maven Plugin
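With the help of the fabric8 docker-maven-plugin or the jib Maven plugin, insert the org.springframework.cloud.dataflow.spring.configuration.metadata.json property into a Docker label named org.springframework.cloud.dataflow.spring-configuration-metadata.json. The following listing shows the jib-maven-plugin configuration: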
<plugin>
  <groupId>com.google.cloud.tools</groupId>
  <artifactId>jib-maven-plugin</artifactId>
  <version>2.0.0</version>
  <configuration>
    <from>
      <image>springcloud/openjdk</image>
    </from>
    <to>
      <image>springcloudstream/${project.artifactId}</image>
      <tags>
        <tag>3.0.0.BUILD-SNAPSHOT</tag>
      </tags>
    </to>
    <container>
      <creationTime>USE_CURRENT_TIMESTAMP</creationTime>
      <format>Docker</format>
      <labels>
        <org.springframework.cloud.dataflow.spring-configuration-metadata.json>
          ${org.springframework.cloud.dataflow.spring.configuration.metadata.json}
        </org.springframework.cloud.dataflow.spring-configuration-metadata.json>
      </labels>
    </container>
  </configuration>
</plugin>
NOTE: The docker-maven-plugin version must be 0.33.0 or newer.
<plugin>
  <groupId>io.fabric8</groupId>
  <artifactId>docker-maven-plugin</artifactId>
  <version>0.33.0</version>
  <configuration>
    <images>
      <image>
        <name>springcloudstream/${project.artifactId}:2.1.3.BUILD-SNAPSHOT</name>
        <build>
          <from>springcloud/openjdk</from>
          <volumes>
            <volume>/tmp</volume>
          </volumes>
          <labels>
            <org.springframework.cloud.dataflow.spring-configuration-metadata.json>
              ${org.springframework.cloud.dataflow.spring.configuration.metadata.json}
            </org.springframework.cloud.dataflow.spring-configuration-metadata.json>
          </labels>
          <entryPoint>
            <exec>
              <arg>java</arg>
              <arg>-jar</arg>
              <arg>/maven/log-sink-kafka.jar</arg>
            </exec>
          </entryPoint>
          <assembly>
            <descriptor>assembly.xml</descriptor>
          </assembly>
        </build>
      </image>
    </images>
  </configuration>
</plugin>
Using Application Metadata
Once you have generated the application configuration metadata (either as a separate, companion artifact or embedded in the application container image as a configuration label), you may need some additional configuration to let Data Flow know where to look for it.
Using Metadata Jar files
When registering a single app with the app register command, you can use the optional --metadata-uri option in the shell, as follows:
dataflow:>app register --name log --type sink
--uri maven://org.springframework.cloud.stream.app:log-sink:2.1.0.RELEASE
--metadata-uri maven://org.springframework.cloud.stream.app:log-sink:jar:metadata:2.1.0.RELEASE
When registering several applications by using the app import command, the file should contain a <type>.<name>.metadata line in addition to each <type>.<name> line. Strictly speaking, doing so is optional (if some apps have it but some others do not, it works), but it is best practice.
The following example shows an uber jar app, where the metadata artifact is hosted in a Maven repository (retrieving it through http:// or file:// is equally possible).
sink.log=maven://org.springframework.cloud.stream.app:log-sink:2.1.0.RELEASE
sink.log.metadata=maven://org.springframework.cloud.stream.app:log-sink:jar:metadata:2.1.0.RELEASE
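To make the pairing of <type>.<name> and <type>.<name>.metadata lines concrete, the following sketch groups the lines of an import file by application. It assumes the file uses Java properties syntax. The class AppImportFile, its Registration type, and the processor.demo entry are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical parser that pairs each app URI with its optional metadata URI.
final class AppImportFile {

    // URIs registered for one <type>.<name> key.
    static final class Registration {
        String uri;
        String metadataUri; // stays null when the file has no .metadata line for the app

        @Override
        public String toString() {
            return "uri=" + uri + ", metadataUri=" + metadataUri;
        }
    }

    static Map<String, Registration> parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Registration> apps = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            String[] parts = key.split("\\.");
            // A third segment named "metadata" marks the companion metadata line.
            boolean isMetadata = parts.length == 3 && parts[2].equals("metadata");
            String app = isMetadata ? parts[0] + "." + parts[1] : key;
            Registration reg = apps.computeIfAbsent(app, k -> new Registration());
            if (isMetadata) {
                reg.metadataUri = props.getProperty(key);
            } else {
                reg.uri = props.getProperty(key);
            }
        }
        return apps;
    }

    public static void main(String[] args) {
        String content = String.join("\n",
                "sink.log=maven://org.springframework.cloud.stream.app:log-sink:2.1.0.RELEASE",
                "sink.log.metadata=maven://org.springframework.cloud.stream.app:log-sink:jar:metadata:2.1.0.RELEASE",
                "processor.demo=maven://com.example:demo-processor:1.0.0"); // hypothetical app without metadata
        parse(content).forEach((app, reg) -> System.out.println(app + " -> " + reg));
    }
}
```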
Using Metadata Container Image Labels
When registering a single Docker app with the app register command, the Data Flow server automatically checks for metadata in the org.springframework.cloud.dataflow.spring-configuration-metadata.json configuration label:
dataflow:>app register --name log --type sink --uri container:springcloudstream/log-sink-rabbit:2.1.13.RELEASE
Configurations are specific for each target Container Registry provider or instance.
For a private container registry with volume-mounted secrets, the registry configurations are automatically inferred from the secrets.
In addition, spring.cloud.dataflow.container.registry-configurations has properties that let you explicitly configure multiple container registries, as described in the following section.
Container Registry Support
Out of the box, you can connect to various cloud-hosted and on-premises container registries, such as Harbor, Artifactory/JFrog, Amazon ECR, and Azure Container Registry, or to a private registry that you host yourself.
Because different registries may impose different authentication schemes, the following sections provide registry-specific configuration details:
- Docker Hub - public Docker Hub registry
- spring.cloud.dataflow.container.registry-configurations[default].registry-host=registry-1.docker.io
- spring.cloud.dataflow.container.registry-configurations[default].authorization-type=dockeroauth2
- spring.cloud.dataflow.container.registry-configurations[default].extra[registryAuthUri]=https://auth.docker.io/token?service=registry.docker.io&scope=repository:{repository}:pull&offline_token=1&client_id=shell
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          default:
            registry-host: registry-1.docker.io
            authorization-type: dockeroauth2
            extra:
              'registryAuthUri': 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:{repository}:pull&offline_token=1&client_id=shell'
This registry is used by default if the image name does not provide a registry host prefix. Public Docker Hub repositories do not require username and password authorization. Credentials are required, though, for private Docker Hub repositories.
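The sketch below illustrates two ideas from this configuration: falling back to the default registry when an image name has no host prefix, and filling the {repository} token of a registryAuthUri template. The host detection follows the general Docker image-reference convention (a first segment containing a dot or colon, or equal to localhost, is a registry host). RegistryResolver is a hypothetical class, and Data Flow's own resolution logic may differ.

```java
// Hypothetical helper, not part of Data Flow.
final class RegistryResolver {

    static final String DEFAULT_HOST = "registry-1.docker.io";

    // Returns the registry host prefix of an image name, or the default host if none is present.
    static String registryHost(String image) {
        int slash = image.indexOf('/');
        if (slash > 0) {
            String first = image.substring(0, slash);
            if (first.contains(".") || first.contains(":") || first.equals("localhost")) {
                return first;
            }
        }
        return DEFAULT_HOST;
    }

    // Substitutes the repository name into a registryAuthUri template.
    static String authUri(String template, String repository) {
        return template.replace("{repository}", repository);
    }

    public static void main(String[] args) {
        String template = "https://auth.docker.io/token?service=registry.docker.io"
                + "&scope=repository:{repository}:pull&offline_token=1&client_id=shell";
        System.out.println(registryHost("springcloudstream/log-sink-rabbit:2.1.13.RELEASE"));
        System.out.println(registryHost("demo.goharbor.io/library/log-sink:1.0")); // hypothetical image name
        System.out.println(authUri(template, "springcloudstream/log-sink-rabbit"));
    }
}
```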
- Harbor Registry
- spring.cloud.dataflow.container.registry-configurations[harbor].registry-host=demo.goharbor.io
- spring.cloud.dataflow.container.registry-configurations[harbor].authorization-type=dockeroauth2
- spring.cloud.dataflow.container.registry-configurations[harbor].user=admin
- spring.cloud.dataflow.container.registry-configurations[harbor].secret=Harbor12345
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          harbor:
            registry-host: demo.goharbor.io
            authorization-type: dockeroauth2
            user: admin
            secret: Harbor12345
The Harbor Registry configuration uses OAuth2 token authorization, similar to Docker Hub, but with a different registryAuthUri. The latter is automatically resolved at bootstrap, but you can override it as follows:
- spring.cloud.dataflow.container.registry-configurations[harbor].extra[registryAuthUri]=https://demo.goharbor.io/service/token?service=harbor-registry&scope=repository:{repository}:pull
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          harbor:
            extra:
              'registryAuthUri': https://demo.goharbor.io/service/token?service=harbor-registry&scope=repository:{repository}:pull
- Artifactory/JFrog Container Registry
- spring.cloud.dataflow.container.registry-configurations[myjfrog].registry-host=springsource-docker-private-local.jfrog.io
- spring.cloud.dataflow.container.registry-configurations[myjfrog].authorization-type=basicauth
- spring.cloud.dataflow.container.registry-configurations[myjfrog].user=[artifactory user]
- spring.cloud.dataflow.container.registry-configurations[myjfrog].secret=[artifactory encrypted password]
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          myjfrog:
            registry-host: springsource-docker-private-local.jfrog.io
            authorization-type: basicauth
            user: [artifactory user]
            secret: [artifactory encrypted password]
NOTE: You need to create an Encrypted Password in JFrog.
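For orientation, the following sketch shows how a user and secret are combined under the standard HTTP Basic scheme (RFC 7617). It assumes that the basicauth authorization type maps to this standard scheme, which this guide does not state explicitly. BasicAuthHeader is a hypothetical class name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper illustrating the standard HTTP Basic Authorization header value.
final class BasicAuthHeader {

    // Builds "Basic <base64(user:secret)>" as defined by RFC 7617.
    static String of(String user, String secret) {
        String credentials = user + ":" + secret;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Sample credentials taken from the RFC 7617 example.
        System.out.println(of("Aladdin", "open sesame")); // prints "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
    }
}
```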
- Amazon Elastic Container Registry (ECR)
- spring.cloud.dataflow.container.registry-configurations[myecr].registry-host=283191309520.dkr.ecr.us-west-1.amazonaws.com
- spring.cloud.dataflow.container.registry-configurations[myecr].authorization-type=awsecr
- spring.cloud.dataflow.container.registry-configurations[myecr].user=[your AWS accessKey]
- spring.cloud.dataflow.container.registry-configurations[myecr].secret=[your AWS secretKey]
- spring.cloud.dataflow.container.registry-configurations[myecr].extra[region]=us-west-1
- spring.cloud.dataflow.container.registry-configurations[myecr].extra[registryIds]=283191309520
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          myecr:
            registry-host: 283191309520.dkr.ecr.us-west-1.amazonaws.com
            authorization-type: awsecr
            user: [your AWS accessKey]
            secret: [your AWS secretKey]
            extra:
              region: us-west-1
              'registryIds': 283191309520
In addition to the credentials, you have to provide the registry's region through the extra properties configuration (for example, .extra[region]=us-west-1).
Optionally, you can set the registry IDs by setting the .extra[registryIds] property to a comma-separated value.
- Azure Container Registry
- spring.cloud.dataflow.container.registry-configurations[myazurecr].registry-host=tzolovazureregistry.azurecr.io
- spring.cloud.dataflow.container.registry-configurations[myazurecr].authorization-type=basicauth
- spring.cloud.dataflow.container.registry-configurations[myazurecr].user=[your Azure registry username]
- spring.cloud.dataflow.container.registry-configurations[myazurecr].secret=[your Azure registry access password]
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          myazurecr:
            registry-host: tzolovazureregistry.azurecr.io
            authorization-type: basicauth
            user: [your Azure registry username]
            secret: [your Azure registry access password]
Customizations
- Overriding/Augmenting Volume Mounted Secrets
Properties can override or augment the configurations obtained through the registry secrets.
For example, if you have created a secret to access a registry running at my-private-registry:5000, you can disable SSL verification for this registry as follows:
- spring.cloud.dataflow.container.registry-configurations[myregistry].registry-host=my-private-registry:5000
- spring.cloud.dataflow.container.registry-configurations[myregistry].disableSslVerification=true
spring:
  cloud:
    dataflow:
      container:
        registry-configurations:
          myregistry:
            registry-host: my-private-registry:5000
            disableSslVerification: true
This is handy for testing registries with self-signed certificates.
- Connect via HTTP Proxy
You can route the traffic for selected registries through a preconfigured proxy. For example, to access a registry running at my-proxy-registry:5000 through a proxy configured at my-proxy.test:8080:
- spring.cloud.dataflow.container.http-proxy.host=my-proxy.test
- spring.cloud.dataflow.container.http-proxy.port=8080
- spring.cloud.dataflow.container.registry-configurations[myregistrywithproxy].registry-host=my-proxy-registry:5000
- spring.cloud.dataflow.container.registry-configurations[myregistrywithproxy].use-http-proxy=true
spring:
  cloud:
    dataflow:
      container:
        httpProxy:
          host: my-proxy.test
          port: 8080
        registry-configurations:
          myregistrywithproxy:
            registry-host: my-proxy-registry:5000
            use-http-proxy: true
The spring.cloud.dataflow.container.http-proxy properties let you configure a global HTTP proxy. Each registry can then opt in to the proxy by setting its use-http-proxy property. The proxy is not used by default.
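The opt-in behavior can be pictured with the JDK's own java.net.ProxySelector: a registry that sets use-http-proxy gets a selector pointing at the global proxy, and every other registry connects directly. This is a sketch of the idea only. RegistryProxyChoice is a hypothetical class, and it does not reflect how Data Flow wires its HTTP clients internally.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical helper illustrating a per-registry proxy opt-in.
final class RegistryProxyChoice {

    // Returns a selector that uses the global proxy only when the registry opted in.
    static ProxySelector selectorFor(boolean useHttpProxy, String proxyHost, int proxyPort) {
        if (!useHttpProxy) {
            return ProxySelector.of(null); // a null address disables proxying
        }
        // createUnresolved avoids a DNS lookup while the configuration is built.
        return ProxySelector.of(InetSocketAddress.createUnresolved(proxyHost, proxyPort));
    }

    public static void main(String[] args) {
        URI registry = URI.create("https://my-proxy-registry:5000/v2/");
        List<Proxy> viaProxy = selectorFor(true, "my-proxy.test", 8080).select(registry);
        List<Proxy> direct = selectorFor(false, "my-proxy.test", 8080).select(registry);
        System.out.println(viaProxy.get(0).type()); // HTTP
        System.out.println(direct.get(0).type());   // DIRECT
    }
}
```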