Tapping a Stream
In Spring Cloud Stream terms, a named destination is a specific destination name in the messaging middleware or the streaming platform.
It could be an exchange in RabbitMQ or a topic in Apache Kafka.
In Spring Cloud Data Flow, a named destination can be treated as either a direct source or a sink, based on whether it acts as a publisher or a consumer.
You can either consume data from a destination (for example, a Kafka topic) or produce data to a destination (for example, a Kafka topic).
Spring Cloud Data Flow lets you build an event-streaming pipeline from or to a destination on the messaging middleware by using this named destination support.
The stream DSL syntax requires the named destination to be prefixed with a colon (:).
Suppose you want to collect user click events from an HTTP web endpoint and send them to a Kafka topic named user-click-events.
In this case, your stream DSL in Spring Cloud Data Flow could be as follows:
http > :user-click-events
Now the user-click-events Kafka topic receives the user click events from the HTTP web endpoint and thereby acts as a sink.
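If you use the Spring Cloud Data Flow shell, a minimal sketch for creating and deploying this stream could look like the following (the stream name http-to-user-clicks is an assumption for illustration):

stream create --name http-to-user-clicks --definition "http > :user-click-events" --deploy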
Assume that you want to create another event-streaming pipeline that consumes these user click events and stores them in an RDBMS. The stream DSL in this case could be as follows:
:user-click-events > jdbc
Now the Kafka topic user-click-events acts as a source, producing the data that the jdbc application consumes.
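As a similar hedged shell sketch (the stream name user-clicks-to-jdbc is an assumption, and the jdbc sink would additionally need datasource configuration properties, which are omitted here):

stream create --name user-clicks-to-jdbc --definition ":user-click-events > jdbc" --deploy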
It is a common use case to construct parallel event-streaming pipelines by forking the same data from the event publishers of the primary stream processing pipeline. Consider the following primary stream:
mainstream=http | filter --expression='payload.userId != null' | transform --expression=payload.sensorValue | log
When the stream named mainstream is deployed, the Kafka topics that connect each of the applications are automatically created by Spring Cloud Data Flow by using Spring Cloud Stream.
Spring Cloud Data Flow names these topics based on the stream and application naming conventions, and you can override these names by using the appropriate Spring Cloud Stream binding properties (see the sketch after the list below).
In this case, three Kafka topics are created:
mainstream.http: The Kafka topic that connects the HTTP source’s output to the filter processor’s input
mainstream.filter: The Kafka topic that connects the filter processor’s output to the transform processor’s input
mainstream.transform: The Kafka topic that connects the transform processor’s output to the log sink’s input
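As a hedged illustration of the override mentioned above (the property path follows the Spring Cloud Stream bindings convention for classic output bindings, and the topic name my-http-events is an assumption), you could set a custom topic name at deployment time:

stream deploy --name mainstream --properties "app.http.spring.cloud.stream.bindings.output.destination=my-http-events"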
To create parallel stream pipelines that receive a copy of the data from the primary stream, you need to use these Kafka topic names to construct the event-streaming pipeline. For example, you may want to tap into the http application’s output to construct a new event-streaming pipeline that receives unfiltered data. The stream DSL could be as follows:
unfiltered-http-events=:mainstream.http > jdbc
You may also want to tap into the filter application’s output to get a copy of the filtered data for another downstream persistence pipeline, as follows:
filtered-http-events=:mainstream.filter > mongodb
In Spring Cloud Data Flow, the name of the stream is unique. Hence, it is used as the consumer group name for the application that consumes from the given Kafka topic. This lets multiple event-streaming pipelines get a copy of the same data instead of competing for messages.
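As a rough way to observe this behavior (assuming a local Kafka broker at localhost:9092 and the stream names used above), you could list the consumer groups with the standard Kafka CLI:

kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list

You would expect to see consumer groups corresponding to the stream names, such as mainstream, unfiltered-http-events, and filtered-http-events.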