Apache Metron
Apache HAWQ
Quantcast File System
💎🔥 Big data study notes
Mirror of Apache Giraph
A small RDBMS implementation for learning how an RDBMS works
A general-purpose distributed crawler framework based on scrapy-redis
dmlc-core by dmlc
A library of common building blocks for scalable and portable distributed machine learning.
C++ | 835 | Updated: 2 y ago | License: Permissive (Apache-2.0)

kafka-streams by nodefluent
Equivalent to kafka-streams 🐙 for Node.js ✨🐢🚀✨
TypeScript | 734 | Updated: 3 y ago | License: Permissive (MIT)

cp-all-in-one by confluentinc
docker-compose.yml files for cp-all-in-one, cp-all-in-one-community, cp-all-in-one-cloud, Apache Kafka Confluent Platform
Python | 674 | Updated: 2 y ago | License: No License (No License)

TonY by tony-framework
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Java | 672 | Updated: 3 y ago | License: Proprietary (Proprietary)

distributed-computing by happyer
distributed_computing includes MapReduce, kvstore, etc.
Go | 669 | Updated: 3 y ago | License: Permissive (MIT)

hadoopecosystemtable.github.io by hadoopecosystemtable
This page is a summary that keeps track of Hadoop-related projects and relevant projects around the Big Data scene, focused on the open source, free software environment.
HTML | 667 | Updated: 3 y ago | License: Permissive (Apache-2.0)

ozone by apache
Scalable, redundant, and distributed object store for Apache Hadoop
Java | 658 | Updated: 1 y ago | License: Permissive (Apache-2.0)

DevOps-Python-tools by HariSekhon
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr, etc.
Python | 657 | Updated: 2 y ago | License: Permissive (MIT)

TonY by linkedin
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Java | 640 | Updated: 3 y ago | License: Proprietary (Proprietary)

firebolt by digitalocean
Golang framework for streaming ETL, observability data pipelines, and event processing apps
Go | 629 | Updated: 1 y ago | License: Proprietary (Proprietary)

dist-keras by cerndb
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Python | 615 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)

spring-hadoop by spring-projects
Spring for Apache Hadoop is a framework for application developers to take advantage of the features of both Hadoop and Spring.
Java | 612 | Updated: 4 y ago | License: No License (No License)

orc by apache
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
HTML | 607 | Updated: 1 y ago | License: Permissive (Apache-2.0)

yanagishima by yanagishima
Web UI for Trino, Hive and SparkSQL
Java | 602 | Updated: 2 y ago | License: Permissive (Apache-2.0)

blueflood by rackerlabs
A distributed system designed to ingest and process time series data
Java | 592 | Updated: 3 y ago | License: Permissive (Apache-2.0)

datafu by linkedin
Hadoop library for large-scale data processing, now an Apache Incubator project
Java | 588 | Updated: 4 y ago | License: No License (No License)

storm-contrib by nathanmarz
A collection of spouts, bolts, serializers, DSLs, and other goodies to use with Storm
Java | 588 | Updated: 4 y ago | License: Proprietary (Proprietary)

datafu by LinkedInAttic
Hadoop library for large-scale data processing, now an Apache Incubator project
Java | 588 | Updated: 3 y ago | License: No License (No License)

kyuubi by NetEase
Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
Scala | 566 | Updated: 3 y ago | License: Permissive (Apache-2.0)

Kylin by KylinOLAP
This code base is retained for historical interest only; please visit the Apache Incubator repo for the latest one.
Java | 561 | Updated: 3 y ago | License: Permissive (Apache-2.0)

hadoop2x-eclipse-plugin by winghc
Eclipse plugin for Hadoop 2.2.0, 2.4.1
Java | 556 | Updated: 4 y ago | License: No License (No License)

aws-glue-libs by awslabs
AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
Python | 555 | Updated: 1 y ago | License: Proprietary (Proprietary)

juicesync by juicedata
A tool to move your data between any clouds or regions.
Go | 554 | Updated: 2 y ago | License: Permissive (Apache-2.0)

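bdp-platform by wlhbdp
Data platform for big data ecosystem solutions: a big data solution built on big data, data platforms, microservices, machine learning, an online store, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.
Java | 541 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)
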
elephantdb by nathanmarz
Distributed database specialized in exporting key/value data from Hadoop
Java | 540 | Updated: 3 y ago | License: Permissive (BSD-3-Clause)

metorikku by YotpoLtd
A simplified, lightweight ETL Framework based on Apache Spark
Scala | 539 | Updated: 2 y ago | License: Permissive (MIT)

HPCC-Platform by hpcc-systems
HPCC Systems (High Performance Computing Cluster) is an open source, massively parallel processing computing platform for big data processing and analytics.
C++ | 534 | Updated: 1 y ago | License: Proprietary (Proprietary)

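bdp-dataplatform by wlhbdp
Data platform for big data ecosystem solutions: a big data solution built on big data, data platforms, microservices, machine learning, an online store, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.
Java | 533 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)
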
hadoop-lzo by twitter
Refactored version of code.google.com/hadoop-gpl-compression for Hadoop 0.20
Shell | 533 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)

shc by hortonworks-spark
The Apache Spark - Apache HBase Connector is a library that lets Spark access HBase tables as an external data source or sink.
Scala | 531 | Updated: 3 y ago | License: Permissive (Apache-2.0)

benchmarks by fastify
Benchmarks for fastify, a fast and low-overhead web framework.
JavaScript | 505 | Updated: 2 y ago | License: Permissive (MIT)

spline by AbsaOSS
Data Lineage Tracking And Visualization Solution
Scala | 503 | Updated: 2 y ago | License: Permissive (Apache-2.0)

minos by XiaoMi
Minos is more than a Hadoop deployment system.
Python | 502 | Updated: 4 y ago | License: Permissive (Apache-2.0)

scoobi by NICTA
A Scala productivity framework for Hadoop.
Scala | 485 | Updated: 3 y ago | License: No License (No License)

riak-js by mostlyserious
Riak client for JavaScript
JavaScript | 479 | Updated: 3 y ago | License: No License (No License)

spring-hadoop-samples by spring-projects
Spring Hadoop Samples
Java | 476 | Updated: 3 y ago | License: No License (No License)

keystone by amplab
Simplifying robust end-to-end machine learning on Apache Spark.
Scala | 472 | Updated: 4 y ago | License: Permissive (Apache-2.0)

TensorFlow-Time-Series-Examples by hzy46
Time Series Prediction with tf.contrib.timeseries
Python | 462 | Updated: 4 y ago | License: No License (No License)

SparkStreaming by ljcan
Real-time log analysis and statistics with Spark Streaming + Flume + Kafka + HBase + Hadoop + Zookeeper; data visualization with SpringBoot + Echarts
Java | 461 | Updated: 1 y ago | License: No License (No License)

conjure-up by conjure-up
Deploying complex solutions, magically.
Python | 455 | Updated: 4 y ago | License: Permissive (MIT)

kafka-connect-hdfs by confluentinc
Kafka Connect HDFS connector
Java | 452 | Updated: 1 y ago | License: Proprietary (Proprietary)

marmaray by uber
Generic Data Ingestion & Dispersal Library for Hadoop
Java | 449 | Updated: 3 y ago | License: Proprietary (Proprietary)