Kafka is a popular choice among developers and architects when designing microservice-based applications. Its event streaming model and reliability are the main reasons for choosing it. As we all know, Kafka follows a producer, broker, consumer flow. When it comes to stream processing, we have two options:
- Write a separate consumer application to process the incoming stream and publish the results back to the broker.
- Use KSQL to produce streams by manipulating streams or topics.
In the second case, we do not need a separate consumer; instead, we can manipulate streams within KSQL.
KSQL comes with a few main components:
- KSQL Server
- KSQL CLI
- Schema Registry (Optional)
We can use the KSQL CLI to execute commands on the KSQL server. Schema Registry holds the schemas (for example Avro or JSON) that describe the messages. It works like a small database of schemas that clients can look up and cache for low-latency access.
Let's see how to set up KSQL with Kafka. Here is the docker-compose file:
For ksql-cli to operate, we have to wait until ksql-server has started.
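One way to handle that wait is a small polling loop that runs before the CLI is started. The sketch below is only an illustration under a few assumptions: the server is reachable at `http://ksql-server:8088` (the address used in this post), `curl` is available in the container, and the server's `/info` REST endpoint is used as the readiness check. The function name `wait_for_ksql` and the retry settings are hypothetical.

```shell
#!/bin/sh
# Poll the KSQL server until it answers, or give up after a number of attempts.
# wait_for_ksql is a hypothetical helper; URL, attempts and delay are assumptions.
wait_for_ksql() {
  url="${1:-http://ksql-server:8088}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: fail on HTTP errors, -o /dev/null: discard the response body
    if curl -sf -o /dev/null "$url/info"; then
      echo "ksql-server is ready"
      return 0
    fi
    echo "waiting for ksql-server ($i/$attempts)"
    sleep "$delay"
    i=$((i + 1))
  done
  echo "ksql-server did not become ready" >&2
  return 1
}
```

It could then be chained with the CLI, for example `wait_for_ksql http://ksql-server:8088 && ksql -- http://ksql-server:8088`, so the CLI only starts once the server responds.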
Note that we can execute SQL inside the CLI once ksql-server is ready, so we can use the command below to run a file of queries:
`ksql --file /ksql/queries.sql -- http://ksql-server:8088`
Or we can start the CLI and execute queries directly:
`ksql -- http://ksql-server:8088`
Queries
Create a new stream from a topic
CREATE STREAM new_stream(product_name VARCHAR, active_substance VARCHAR, route_of_administration VARCHAR, product_authorisation_country VARCHAR)
WITH (kafka_topic = 'topic_1', value_format = 'JSON');
Create a stream from an existing stream and transform the data
CREATE STREAM formatted_stream
AS SELECT product_name, split(route_of_administration, ' ')[1] AS route, split(active_substance, ',')[1] AS active_substance_1,
split(active_substance, ',')[2] AS active_substance_2 FROM new_stream;
Change value format on the fly
CREATE STREAM new_avro_stream WITH ( value_format = 'AVRO') AS SELECT * FROM formatted_stream;
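The three statements above can also be collected in a single file and run non-interactively with the `--file` option shown earlier. This is a minimal sketch that writes them to a local `queries.sql`. The local file name is an assumption, and the file would still need to be mounted into the CLI container at `/ksql/queries.sql` for the earlier command to find it.

```shell
#!/bin/sh
# Write the statements from this post into a local queries.sql file.
# The quoted 'EOF' stops the shell from expanding anything inside the SQL.
cat > queries.sql <<'EOF'
CREATE STREAM new_stream (product_name VARCHAR, active_substance VARCHAR, route_of_administration VARCHAR, product_authorisation_country VARCHAR)
  WITH (kafka_topic = 'topic_1', value_format = 'JSON');

CREATE STREAM formatted_stream AS
  SELECT product_name,
         split(route_of_administration, ' ')[1] AS route,
         split(active_substance, ',')[1] AS active_substance_1,
         split(active_substance, ',')[2] AS active_substance_2
  FROM new_stream;

CREATE STREAM new_avro_stream WITH (value_format = 'AVRO') AS
  SELECT * FROM formatted_stream;
EOF

# Quick sanity check: count the statements that were written (prints 3).
grep -c '^CREATE STREAM' queries.sql
```

Keeping the statements in order matters here, since each stream reads from the one created before it.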
There are many more operations you can try using KSQL. The complete reference is here: https://docs.ksqldb.io/en/latest/