Building a Stream Processing Application using open source libraries
by rajasekar Updated: Mar 2, 2022
Solution Kit
Today, data is generated constantly, and businesses need the latest data to drive decisions through intelligent applications. This requires processing data continuously, in a streaming fashion, to achieve lower latency. Streaming also makes better use of resources and keeps the data loaded into downstream systems up to date.
Stream processing applies multiple processing steps in near real time as data is produced, transported, and received at the target location. Examples of such data in motion include continuous streams from IT infrastructure sensors, machine sensors, health sensors, and stock trading activity.
To build end-to-end stream processing, you need components that perform different tasks, stitched together into a pipeline and workflow.
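As a rough illustration of components stitched into a pipeline, the sketch below wires a source, a processing step, and a sink together with in-memory queues, each stage on its own thread. It uses only the Java standard library and none of the libraries listed on this page; the class name `PipelineSketch`, the `run` method, and the empty-`Optional` end-of-stream marker are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical three-stage pipeline: source -> processor -> sink.
final class PipelineSketch {

    // Runs the pipeline and returns what the sink collected.
    // An empty Optional marks the end of the stream (a convention of this sketch).
    static List<String> run(List<String> input, Function<String, String> transform)
            throws InterruptedException {
        BlockingQueue<Optional<String>> raw = new LinkedBlockingQueue<>();
        BlockingQueue<Optional<String>> processed = new LinkedBlockingQueue<>();
        List<String> collected = new ArrayList<>();

        // Stage 1: the source publishes events, then the end marker.
        Thread source = new Thread(() -> {
            for (String event : input) {
                raw.add(Optional.of(event));
            }
            raw.add(Optional.empty());
        });

        // Stage 2: the processor transforms each event as it arrives.
        Thread processor = new Thread(() -> {
            try {
                while (true) {
                    Optional<String> event = raw.take();
                    if (!event.isPresent()) {
                        processed.add(Optional.empty()); // forward the end marker
                        return;
                    }
                    processed.add(Optional.of(transform.apply(event.get())));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stage 3: the sink stores results until it sees the end marker.
        Thread sink = new Thread(() -> {
            try {
                while (true) {
                    Optional<String> event = processed.take();
                    if (!event.isPresent()) {
                        return;
                    }
                    collected.add(event.get());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        source.start();
        processor.start();
        sink.start();
        source.join();
        processor.join();
        sink.join();
        return collected;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints [CPU=71, CPU=93]
        System.out.println(run(List.of("cpu=71", "cpu=93"), String::toUpperCase));
    }
}
```

In a real deployment the in-memory queues would typically be replaced by a messaging system and the stages would run as separate, independently scaled services; the sketch only shows the shape of the flow.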
Streaming
Using the libraries below, you can build your own correct, concurrent, and scalable streaming applications.
pulsar by apache
Apache Pulsar - distributed pub-sub messaging system
Java 12790 Version:v3.0.0 License: Permissive (Apache-2.0)
brooklin by linkedin
An extensible distributed system for reliable nearline data streaming at scale
Java 833 Version:5.1.0 License: Permissive (BSD-2-Clause)
streamline by hortonworks
StreamLine - Streaming Analytics
Java 156 Version:v0.6.0 License: Permissive (Apache-2.0)
stream-ops-java by nanosai
Stream Ops is a fully embeddable data streaming engine and stream processing API for Java.
Java 40 Version:0.7.0 License: No License
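Pulsar, the first library in this list, is described as a pub-sub messaging system. To show what publish-subscribe means, here is a minimal in-memory sketch in which every subscriber on a topic receives every message published to it. This is not Pulsar's client API: the class `PubSubSketch` and its `subscribe` and `publish` methods are hypothetical names for this example, and the sketch leaves out distribution, persistence, and acknowledgements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// In-memory illustration of publish-subscribe; not a real broker.
final class PubSubSketch {
    // Topic name -> handlers registered on that topic.
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    // Registers a handler that is called for every message on the topic.
    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // Delivers the message to each subscriber of the topic, in registration order.
    void publish(String topic, String message) {
        for (Consumer<String> handler : subscribers.getOrDefault(topic, List.of())) {
            handler.accept(message);
        }
    }

    public static void main(String[] args) {
        PubSubSketch broker = new PubSubSketch();
        List<String> dashboard = new ArrayList<>();
        List<String> alerts = new ArrayList<>();

        // Two independent consumers of the same topic.
        broker.subscribe("trades", dashboard::add);
        broker.subscribe("trades", alerts::add);
        broker.publish("trades", "ACME 100 @ 12.5");

        // prints [ACME 100 @ 12.5] [ACME 100 @ 12.5]
        System.out.println(dashboard + " " + alerts);
    }
}
```

The publisher never refers to its consumers directly, which is what lets producers and consumers be added or removed independently.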
Stream processing engine
The open-source framework below provides stream processing capabilities.
siddhi by siddhi-io
Stream Processing and Complex Event Processing Engine
Java 1426 Version:v5.1.27 License: Permissive (Apache-2.0)
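Complex event processing means detecting patterns across several events rather than reacting to each event in isolation. The plain-Java sketch below raises an alert when a run of consecutive sensor readings exceeds a threshold. It does not use Siddhi's query language or API; the class `ConsecutiveHighDetector`, the threshold of 90, and the run length of 3 are assumptions chosen for this example.

```java
// Plain-Java illustration of a simple event pattern; not Siddhi code.
final class ConsecutiveHighDetector {
    private final double threshold;
    private final int required;
    private int streak = 0; // current count of consecutive high readings

    ConsecutiveHighDetector(double threshold, int required) {
        this.threshold = threshold;
        this.required = required;
    }

    // Feeds one reading. Returns true when this reading completes a run of
    // `required` consecutive readings above the threshold, then starts a new run.
    boolean onEvent(double reading) {
        streak = reading > threshold ? streak + 1 : 0;
        if (streak == required) {
            streak = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ConsecutiveHighDetector detector = new ConsecutiveHighDetector(90.0, 3);
        double[] readings = {85, 92, 95, 97, 80, 91};
        for (int i = 0; i < readings.length; i++) {
            if (detector.onEvent(readings[i])) {
                System.out.println("alert at index " + i); // prints "alert at index 3"
            }
        }
    }
}
```

A dedicated engine lets you declare such patterns instead of hand-coding the state, which matters once patterns involve time windows or multiple streams.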
Data Pipeline
The libraries below help define both batch and parallel processing pipelines that run on distributed processing backends.
spring-cloud-dataflow by spring-cloud
A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
Java 992 Version:v2.10.3 License: Permissive (Apache-2.0)
streamflow by lmco
StreamFlow™ is a stream processing tool designed to help build and monitor processing workflows.
Java 241 Version:0.13.0 License: Permissive (Apache-2.0)
beam by apache
Apache Beam is a unified programming model for Batch and Streaming data processing.
Java 6930 Version:v2.48.0 License: Permissive (Apache-2.0)
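The idea behind a unified model for batch and streaming is that the processing logic is written once and applied to either a bounded or an unbounded source. The sketch below illustrates only that idea using `java.util.stream` from the standard library. It is not Beam's API, and it omits windowing, triggers, and distributed runners; the class `UnifiedTransformSketch` and the `NORMALIZE` transform are hypothetical names for this example.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// One transform definition, reused for a bounded and an unbounded source.
final class UnifiedTransformSketch {
    // Processing logic defined once, independent of where the data comes from:
    // trim whitespace, drop empty events, and upper-case the rest.
    static final UnaryOperator<Stream<String>> NORMALIZE =
            events -> events.map(String::trim)
                            .filter(e -> !e.isEmpty())
                            .map(String::toUpperCase);

    public static void main(String[] args) {
        // Bounded ("batch") source: a finite set of events.
        List<String> batch = NORMALIZE.apply(Stream.of(" buy ", "", "sell"))
                                      .collect(Collectors.toList());
        System.out.println(batch); // prints [BUY, SELL]

        // Unbounded ("streaming") source: an endless generator,
        // sampled here with limit() so the program terminates.
        Stream<String> ticks = Stream.iterate(0, i -> i + 1).map(i -> " tick" + i + " ");
        List<String> sample = NORMALIZE.apply(ticks).limit(3).collect(Collectors.toList());
        System.out.println(sample); // prints [TICK0, TICK1, TICK2]
    }
}
```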