Mirror of Apache Sqoop
Usage of the various Hadoop components, continuously updated
Apache Metron
spark-kotlin by perwendel
A Spark DSL in idiomatic Kotlin // dependency: com.sparkjava:spark-kotlin:1.0.0-alpha
Kotlin 955 Updated: 3 y ago License: Permissive (Apache-2.0)
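Coding-Now by josonle
Study notes, plus eBooks I have read, video resources, and blogs, websites, and tools I have collected and found useful. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.
Python 951 Updated: 1 y ago License: Strong Copyleft (GPL-2.0)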
adam by bigdatagenomics
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Scala 943 Updated: 1 y ago License: Permissive (Apache-2.0)
sparkling-water by h2oai
Sparkling Water provides H2O functionality inside a Spark cluster
Scala 943 Updated: 12 mo ago License: Permissive (Apache-2.0)
Mobius by microsoft
C# and F# language bindings and extensions to Apache Spark
C# 939 Updated: 1 y ago License: Permissive (MIT)
wormhole by edp963
Wormhole is a SPaaS (Stream Processing as a Service) Platform
JavaScript 937 Updated: 2 y ago License: Proprietary (Proprietary)
BigDataGuide by Dr11ft
Big data learning: learn big data from scratch, with study videos for each learning stage and interview materials
Java 935 Updated: 3 y ago License: No License (No License)
graphframes by graphframes
Scala 934 Updated: 11 mo ago License: Permissive (Apache-2.0)
LearningSparkV2 by databricks
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Scala 909 Updated: 1 y ago License: Permissive (Apache-2.0)
spark-redis by RedisLabs
A connector for Spark that allows reading from and writing to a Redis cluster
Scala 908 Updated: 12 mo ago License: Permissive (BSD-3-Clause)
sparklyr by sparklyr
R interface for Apache Spark
R 906 Updated: 11 mo ago License: Permissive (Apache-2.0)
flint by twosigma
A Time Series Library for Apache Spark
Scala 901 Updated: 3 y ago License: Permissive (Apache-2.0)
veles by Samsung
Distributed machine learning platform
C++ 893 Updated: 3 y ago License: Proprietary (Proprietary)
spark-nlp-workshop by JohnSnowLabs
Public runnable examples of using John Snow Labs' NLP for Apache Spark.
Jupyter Notebook 888 Updated: 11 mo ago License: Permissive (Apache-2.0)
hail by hail-is
Cloud-native genomic dataframes and batch computing
Python 881 Updated: 12 mo ago License: Permissive (MIT)
SparkCTR by wzhe06
CTR prediction model based on Spark (LR, GBDT, DNN)
Scala 872 Updated: 1 y ago License: Permissive (Apache-2.0)
loghouse by flant
Ready-to-use log management solution for Kubernetes that stores data in ClickHouse and provides a web UI.
Ruby 863 Updated: 2 y ago License: Permissive (Apache-2.0)
pgsync by toluaina
Postgres to Elasticsearch/OpenSearch sync
Python 860 Updated: 11 mo ago License: Weak Copyleft (LGPL-3.0)
tispark by pingcap
TiSpark is built for running Apache Spark on top of TiDB/TiKV
Scala 856 Updated: 11 mo ago License: Permissive (Apache-2.0)
hadoop_study by realguoshuai
Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus, in order: Flink, Solr, Spark SQL, ES, Scala, Kafka, HBase/Phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code; continuously updated.)
Java 853 Updated: 1 y ago License: No License (No License)
fluent-plugin-elasticsearch by uken
Ruby 853 Updated: 1 y ago License: Permissive (Apache-2.0)
frameless by typelevel
Expressive types for Spark.
Scala 851 Updated: 11 mo ago License: Permissive (Apache-2.0)
UserActionAnalyzePlatform by oeljeklaus-you
Big data platform for e-commerce user behavior analysis
Java 847 Updated: 12 mo ago License: Permissive (Apache-2.0)
datastream.io by MentatInnovations
An open-source framework for real-time anomaly detection using Python, Elasticsearch and Kibana
Python 836 Updated: 3 y ago License: Permissive (Apache-2.0)
brooklin by linkedin
An extensible distributed system for reliable nearline data streaming at scale
Java 833 Updated: 11 mo ago License: Permissive (BSD-2-Clause)
scala-logging by lightbend
Convenient and performant logging library for Scala wrapping SLF4J.
Scala 821 Updated: 3 y ago License: Permissive (Apache-2.0)
pyspark-style-guide by palantir
This is a guide to PySpark code style, presenting common situations and the associated best practices, based on the most frequently recurring topics across the PySpark repos we've encountered.
Python 813 Updated: 11 mo ago License: Permissive (MIT)
elasticsearch-spark-recommender by IBM
Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch
Jupyter Notebook 808 Updated: 12 mo ago License: Permissive (Apache-2.0)
spark-movie-lens by jadianes
An online movie recommender using Spark, Python Flask, and the MovieLens dataset
Jupyter Notebook 803 Updated: 1 y ago License: Proprietary (Proprietary)
pyspark-examples by spark-examples
PySpark RDD, DataFrame and Dataset examples in Python
Python 799 Updated: 12 mo ago License: No License (No License)
docker-spark by sequenceiq
Shell 769 Updated: 3 y ago License: Permissive (Apache-2.0)
tensorframes by databricks
[DEPRECATED] TensorFlow wrapper for DataFrames on Apache Spark
Scala 761 Updated: 2 y ago License: Permissive (Apache-2.0)
incubator-livy by apache
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Scala 735 Updated: 1 y ago License: Permissive (Apache-2.0)
sciblog_support by miguelgfierro
Support content for my blog
Jupyter Notebook 733 Updated: 1 y ago License: Proprietary (Proprietary)
notebooker by man-group
Productionise & schedule your Jupyter Notebooks as easily as you wrote them.
Python 731 Updated: 12 mo ago License: Strong Copyleft (AGPL-3.0)
kafka-storm-starter by miguno
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Scala 730 Updated: 3 y ago License: Proprietary (Proprietary)
bahir-flink by apache
Mirror of Apache Bahir Flink
Java 727 Updated: 1 y ago License: Permissive (Apache-2.0)
cookbook-2nd by ipython-books
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Python 722 Updated: 3 y ago License: Proprietary (Proprietary)
private-join-and-compute by google
C++ 720 Updated: 1 y ago License: Permissive (Apache-2.0)
Hive-JSON-Serde by rcongiu
Read/write JSON SerDe for Apache Hive.
Java 717 Updated: 1 y ago License: Proprietary (Proprietary)
thermostat by particle-iot
A place for all things related to ye olde Spark Thermostat Hackathon
Ruby 717 Updated: 3 y ago License: Permissive (MIT)
spark-daria by MrPowers
Essential Spark extensions and helper methods ✨😲
Scala 713 Updated: 1 y ago License: Permissive (MIT)
incubator-toree by apache
Mirror of Apache Toree (Incubating)
Scala 712 Updated: 1 y ago License: Permissive (Apache-2.0)
cdap by cdapio
An open source framework for building data analytic applications.
Java 706 Updated: 12 mo ago License: Proprietary (Proprietary)
JSAT by EdwardRaff
Java Statistical Analysis Tool, a Java library for Machine Learning
Java 693 Updated: 3 y ago License: Strong Copyleft (GPL-3.0)
scikit-multiflow by scikit-multiflow
A machine learning package for streaming data in Python. The other ancestor of River.
Python 691 Updated: 11 mo ago License: Permissive (BSD-3-Clause)