scala-logging by lightbend-labs
Convenient and performant logging library for Scala wrapping SLF4J.
Scala 890 Updated: 2 y ago License: Permissive (Apache-2.0)

spark-nlp-workshop by JohnSnowLabs
Public runnable examples of using John Snow Labs' NLP for Apache Spark.
Jupyter Notebook 888 Updated: 2 y ago License: Permissive (Apache-2.0)

tiflash by pingcap
The analytical engine for TiDB and TiDB Cloud. Try free: https://tidbcloud.com/free-trial
C++ 887 Updated: 2 y ago License: Permissive (Apache-2.0)

SparkCTR by wzhe06
CTR prediction model based on Spark (LR, GBDT, DNN)
Scala 872 Updated: 2 y ago License: Permissive (Apache-2.0)

Usage of the various Hadoop components, continuously updated.

pgsync by toluaina
Postgres to Elasticsearch/OpenSearch sync
Python 860 Updated: 2 y ago License: Weak Copyleft (LGPL-3.0)

tispark by pingcap
TiSpark is built for running Apache Spark on top of TiDB/TiKV
Scala 856 Updated: 2 y ago License: Permissive (Apache-2.0)

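hadoop_study by realguoshuai
Regularly updated documentation for big data components commonly used in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code; continuously updated.)
Java 853 Updated: 2 y ago License: No License (No License)
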
frameless by typelevel
Expressive types for Spark.
Scala 851 Updated: 2 y ago License: Permissive (Apache-2.0)

LightLDA by microsoft
Scalable, fast, and lightweight system for large-scale topic modeling
C++ 849 Updated: 2 y ago License: Permissive (MIT)

UserActionAnalyzePlatform by oeljeklaus-you
Big data platform for e-commerce user behavior analysis
Java 847 Updated: 2 y ago License: Permissive (Apache-2.0)

sha256-simd by minio
Accelerate SHA256 computations in pure Go using AVX512, SHA Extensions for x86 and ARM64 for ARM. On AVX512 it provides an up to 8x improvement (over 3 GB/s per core). SHA Extensions give a performance boost of close to 4x over native.
Go 837 Updated: 2 y ago License: Permissive (Apache-2.0)

dmlc-core by dmlc
A common bricks library for building scalable and portable distributed machine learning.
C++ 835 Updated: 2 y ago License: Permissive (Apache-2.0)

brooklin by linkedin
An extensible distributed system for reliable nearline data streaming at scale
Java 833 Updated: 2 y ago License: Permissive (BSD-2-Clause)

Apache Metron

Apache Metron

scala-logging by lightbend
Convenient and performant logging library for Scala wrapping SLF4J.
Scala 821 Updated: 4 y ago License: Permissive (Apache-2.0)

pyspark-style-guide by palantir
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
Python 813 Updated: 2 y ago License: Permissive (MIT)

pik by google
A new lossy/lossless image format for photos and the internet
C++ 810 Updated: 2 y ago License: Permissive (MIT)

elasticsearch-spark-recommender by IBM
Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch
Jupyter Notebook 808 Updated: 2 y ago License: Permissive (Apache-2.0)

storm-crawler by DigitalPebble
A scalable, mature and versatile web crawler based on Apache Storm
HTML 803 Updated: 2 y ago License: Permissive (Apache-2.0)

apisix-dashboard by apache
Dashboard for Apache APISIX
Go 802 Updated: 2 y ago License: Permissive (Apache-2.0)

arrow-ballista by apache
Apache Arrow Ballista Distributed Query Engine
Rust 801 Updated: 2 y ago License: Permissive (Apache-2.0)

asm by segmentio
Go library providing algorithms optimized to leverage the characteristics of modern CPUs
Go 784 Updated: 2 y ago License: Permissive (MIT)

tensorframes by databricks
[DEPRECATED] TensorFlow wrapper for DataFrames on Apache Spark
Scala 761 Updated: 3 y ago License: Permissive (Apache-2.0)

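wlhbdp by authorwlh
💥🔥 Big data ecosystem solution and data platform: a big data solution built around big data, data platforms, microservices, machine learning, an online store, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computing, development, and application building.
Java 760 Updated: 3 y ago License: Strong Copyleft (GPL-3.0)
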
SQL-Data-Analysis-and-Visualization-Projects by ptyadana
SQL data analysis & visualization projects using MySQL, PostgreSQL, SQLite, Tableau, Apache Spark and PySpark.
Jupyter Notebook 758 Updated: 2 y ago License: Permissive (MIT)

ranger by apache
Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
Java 756 Updated: 2 y ago License: Permissive (Apache-2.0)

base64 by aklomp
Fast Base64 stream encoder/decoder in C99, with SIMD acceleration
C 751 Updated: 2 y ago License: Permissive (BSD-2-Clause)

zingg by zinggAI
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Java 739 Updated: 2 y ago License: Strong Copyleft (AGPL-3.0)

incubator-livy by apache
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Scala 735 Updated: 2 y ago License: Permissive (Apache-2.0)

notebooker by man-group
Productionise & schedule your Jupyter Notebooks as easily as you wrote them.
Python 731 Updated: 2 y ago License: Strong Copyleft (AGPL-3.0)

libxsmm by libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
C 729 Updated: 2 y ago License: Permissive (BSD-3-Clause)

bahir-flink by apache
Mirror of Apache Bahir Flink
Java 727 Updated: 2 y ago License: Permissive (Apache-2.0)

private-join-and-compute by google
C++ 720 Updated: 2 y ago License: Permissive (Apache-2.0)

Hive-JSON-Serde by rcongiu
Read - Write JSON SerDe for Apache Hive.
Java 717 Updated: 2 y ago License: Proprietary (Proprietary)

thermostat by particle-iot
A place for all things related to ye olde Spark Thermostat Hackathon
Ruby 717 Updated: 4 y ago License: Permissive (MIT)

spark-daria by MrPowers
Essential Spark extensions and helper methods ✨😲
Scala 713 Updated: 2 y ago License: Permissive (MIT)

incubator-toree by apache
Mirror of Apache Toree (Incubating)
Scala 712 Updated: 2 y ago License: Permissive (Apache-2.0)

cdap by cdapio
An open source framework for building data analytic applications.
Java 706 Updated: 2 y ago License: Proprietary (Proprietary)

impyla by cloudera
Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Python 702 Updated: 2 y ago License: Permissive (Apache-2.0)

FastNoise2 by Auburn
Modular node graph based noise generation library using SIMD, C++17 and templates
C++ 702 Updated: 2 y ago License: Permissive (MIT)

HeavenMemoirs by SherlockQi
AR photo album (Photo Album For AR)
Swift 680 Updated: 2 y ago License: No License (No License)

cassandra by cassandra-rb
A Ruby client for the Cassandra distributed database
Ruby 677 Updated: 4 y ago License: Permissive (Apache-2.0)

cp-all-in-one by confluentinc
docker-compose.yml files for cp-all-in-one, cp-all-in-one-community, cp-all-in-one-cloud, Apache Kafka Confluent Platform
Python 674 Updated: 2 y ago License: No License (No License)

Apache HAWQ

portable-simd by rust-lang
The testing ground for the future of portable SIMD in Rust
Rust 673 Updated: 2 y ago License: Permissive (Apache-2.0)

TonY by tony-framework
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Java 672 Updated: 3 y ago License: Proprietary (Proprietary)

openshift-docs by openshift
OpenShift 3 and 4 product and community documentation
HTML 670 Updated: 2 y ago License: Permissive (Apache-2.0)

mongo-spark by mongodb
The MongoDB Spark Connector
Java 669 Updated: 2 y ago License: Permissive (Apache-2.0)