transporter by compose
Sync data between persistence engines, like ETL only not stodgy
Go 1408 Updated: 2 y ago License: Permissive (BSD-3-Clause)
alpakka-kafka by akka
Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
Scala 1400 Updated: 2 y ago License: Proprietary (Proprietary)
docker-kafka by spotify
Kafka (and Zookeeper) in Docker
Shell 1396 Updated: 2 y ago License: Permissive (Apache-2.0)
FiloDB by filodb
Distributed Prometheus time series database
Scala 1378 Updated: 2 y ago License: Permissive (Apache-2.0)
marquez by MarquezProject
Collect, aggregate, and visualize a data ecosystem's metadata
Java 1360 Updated: 2 y ago License: Permissive (Apache-2.0)
HiBench by Intel-bigdata
HiBench is a big data benchmark suite.
Java 1351 Updated: 2 y ago License: Proprietary (Proprietary)
demo-scene by confluentinc
👾 Scripts and samples to support Confluent Demos and Talks. ⚠️ Might be rough around the edges ;-) 👉 For automated tutorials and QA'd code, see https://github.com/confluentinc/examples/
Shell 1332 Updated: 2 y ago License: Permissive (Apache-2.0)
TBase by Tencent
TBase is an enterprise-level distributed HTAP database. A single database cluster provides users with highly consistent distributed database services and high-performance data warehouse services, forming an integrated enterprise-level solution.
C 1321 Updated: 2 y ago License: Proprietary (Proprietary)
rust-bio by rust-bio
This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.
Rust 1311 Updated: 2 y ago License: Permissive (MIT)
Systemizer by honzaap
A system design tool that lets you simulate the data flow of distributed systems.
TypeScript 1305 Updated: 2 y ago License: Strong Copyleft (GPL-3.0)
dr-elephant by linkedin
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Java 1302 Updated: 2 y ago License: Permissive (Apache-2.0)
eventing by knative
Open source specification and implementation of Knative event binding and delivery
Go 1292 Updated: 2 y ago License: Permissive (Apache-2.0)
DDMQ by didi
DDMQ is a distributed messaging product with low latency, high throughput and high availability.
Java 1257 Updated: 2 y ago License: Permissive (Apache-2.0)
kazoo by python-zk
Kazoo is a high-level Python library that makes it easier to use Apache Zookeeper.
Python 1248 Updated: 2 y ago License: Permissive (Apache-2.0)
ruby-kafka by zendesk
A Ruby client library for Apache Kafka
Ruby 1238 Updated: 2 y ago License: Permissive (Apache-2.0)
rust-rdkafka by fede1024
A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka
Rust 1235 Updated: 2 y ago License: Permissive (MIT)
incubator-eventmesh by apache
EventMesh is a new-generation serverless event middleware for building distributed event-driven applications.
Java 1199 Updated: 2 y ago License: Permissive (Apache-2.0)
debezium-examples by debezium
Examples for running Debezium (configuration, Docker Compose files, etc.)
JavaScript 1196 Updated: 2 y ago License: Permissive (Apache-2.0)
pyspark-example-project by AlexIoannides
Example project implementing best practices for PySpark ETL jobs and applications.
Python 1195 Updated: 2 y ago License: No License (No License)
gnomock by orlangure
Test your code without writing mocks with ephemeral Docker containers 📦 Set up popular services with just a couple of lines of code ⏱️ No bash, no yaml, only code 💻
Go 1188 Updated: 2 y ago License: Permissive (MIT)
High performance MQTT broker
killrweather by killrweather
KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apache Spark Streaming, Apache Cassandra, Apache Kafka and Akka for fast, streaming computations on time series data in asynchronous event-driven environments.
Scala 1185 Updated: 2 y ago License: Permissive (Apache-2.0)
ringpop-node by uber-node
Scalable, fault-tolerant application-layer sharding for Node.js applications
JavaScript 1177 Updated: 2 y ago License: Permissive (MIT)
zookeeper by llohellohe
A learning journey through ZooKeeper, the distributed system service
Java 1161 Updated: 2 y ago License: No License (No License)
agent by grafana
Vendor-neutral programmable observability pipelines.
Go 1152 Updated: 2 y ago License: Permissive (Apache-2.0)
Dockerfiles by HariSekhon
50+ DockerHub public images for Docker & Kubernetes - DevOps, CI/CD, GitHub Actions, CircleCI, Jenkins, TeamCity, Alpine, CentOS, Debian, Fedora, Ubuntu, Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak
Shell 1147 Updated: 2 y ago License: Permissive (MIT)
datacollector by streamsets
StreamSets Data Collector - Continuous big data and cloud platform ingest infrastructure
Java 1145 Updated: 4 y ago License: Permissive (Apache-2.0)
cp-docker-images by confluentinc
[DEPRECATED] Docker images for Confluent Platform.
Python 1127 Updated: 2 y ago License: Permissive (Apache-2.0)
java-study by xuwujing
java-study is code I recorded while learning Java: from basic Java data types, JDK 1.8 Lambda, Stream, and date usage, I/O streams, collections, multithreading, concurrent programming, sample code for the 23 design patterns, and commonly used utility classes, to common frameworks such as netty, mina, springboot, kafka, storm, zookeeper, redis, elasticsearch, hbase, hive, and so on.
Java 1117 Updated: 2 y ago License: Permissive (Apache-2.0)
scrapy-cluster by istresearch
This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster.
Python 1114 Updated: 2 y ago License: Permissive (MIT)
pykafka by Parsely
Apache Kafka client for Python; high-level & low-level consumer/producer, with great performance.
Python 1111 Updated: 2 y ago License: Permissive (Apache-2.0)
TestDisk & PhotoRec
Nagios-Plugins by HariSekhon
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Python 1101 Updated: 2 y ago License: Proprietary (Proprietary)
auth0-socketio-jwt by auth0-community
Authenticate incoming socket.io connections with JWTs
JavaScript 1101 Updated: 4 y ago License: Permissive (MIT)
franz-go by twmb
franz-go contains a feature-complete, pure Go library for interacting with Kafka from 0.8.0 through 3.4+. Producing, consuming, transacting, administrating, etc.
Go 1101 Updated: 2 y ago License: Permissive (BSD-3-Clause)
faust by faust-streaming
Python Stream Processing. A Faust fork
Python 1059 Updated: 2 y ago License: Proprietary (Proprietary)
hazelcast-jet by hazelcast
Distributed Stream and Batch Processing
Java 1054 Updated: 2 y ago License: Proprietary (Proprietary)
kylo by Teradata
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Java 1041 Updated: 2 y ago License: Permissive (Apache-2.0)
meteor-collection2 by Meteor-Community-Packages
A Meteor package that extends Mongo.Collection to provide support for specifying a schema and then validating against that schema when inserting and updating.
JavaScript 1030 Updated: 2 y ago License: Permissive (MIT)
libconfig by hyperrealm
C/C++ library for processing configuration files
C 1001 Updated: 2 y ago License: Weak Copyleft (LGPL-2.1)
kudo by kudobuilder
Kubernetes Universal Declarative Operator (KUDO)
Go 997 Updated: 3 y ago License: Permissive (Apache-2.0)
data-algorithms-book by mahmoudparsian
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Java 996 Updated: 3 y ago License: Proprietary (Proprietary)
spring-cloud-dataflow by spring-cloud
Microservices-based streaming and batch data processing in Cloud Foundry and Kubernetes
Java 992 Updated: 2 y ago License: Permissive (Apache-2.0)
sidekiq-status by utgarda
An extension to Sidekiq message processing to track your jobs
Ruby 992 Updated: 2 y ago License: Permissive (MIT)
sarama-cluster by bsm
Cluster extensions for Sarama, the Go client library for Apache Kafka 0.9 [DEPRECATED]
Go 979 Updated: 4 y ago License: Permissive (MIT)
pitstop by EdwinVW
This repo contains a sample application based on a Garage Management System for Pitstop, a fictitious garage. The primary goal of this sample is to demonstrate several software-architecture concepts, such as Microservices, CQRS, Event Sourcing, Domain Driven Design (DDD), and Eventual Consistency.
JavaScript 977 Updated: 2 y ago License: Permissive (Apache-2.0)
kafka-rust by kafka-rust
Rust client for Apache Kafka
Rust 965 Updated: 2 y ago License: Permissive (MIT)
flume by cloudera
WE HAVE MOVED to Apache Incubator. https://cwiki.apache.org/FLUME/ . Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.
Java 941 Updated: 2 y ago License: Permissive (Apache-2.0)
wormhole by edp963
Wormhole is a SPaaS (Stream Processing as a Service) Platform
JavaScript 937 Updated: 3 y ago License: Proprietary (Proprietary)
stream-reactor by lensesio
Streaming reference architecture for ETL with Kafka and Kafka-Connect. You can find more on http://lenses.io on how we provide a unified solution to manage your connectors, the most advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.
Scala 937 Updated: 2 y ago License: Permissive (Apache-2.0)