Spark Libraries

elasticsearch by elastic

Free and Open, Distributed, RESTful Search Engine

Java Updated: 7 d ago License: Proprietary

spark by apache

Apache Spark - A unified analytics engine for large-scale data processing

Scala Updated: 9 d ago License: Permissive

data-science-ipython-notebooks by donnemartin

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Python Updated: 4 mo ago License: Proprietary

kafka by apache

Mirror of Apache Kafka

Java Updated: 8 d ago License: Permissive

flink by apache

Apache Flink

Java Updated: 3 mo ago License: Permissive

luigi by spotify

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.

Python Updated: 5 mo ago License: Permissive

presto by prestodb

The official home of the Presto distributed SQL query engine for big data

Java Updated: 12 d ago License: Permissive

deeplearning4j by eclipse

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learning using automatic differentiation.

Java Updated: 6 d ago License: Permissive

hadoop by apache

Apache Hadoop

Java Updated: 3 mo ago License: Permissive

flink-learning by zhisheng17

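Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source-code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Readers are welcome to support the author's column 《大数据实时计算引擎 Flink 实战与性能优化》 (Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization).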

Java Updated: 6 d ago License: Permissive

BigData-Notes by heibaiying

A beginner's guide to big data ⭐

Java Updated: 3 mo ago License: No License

mlflow by mlflow

Open source platform for the machine learning lifecycle

Python Updated: 3 mo ago License: Permissive

spark by perwendel

A simple expressive web framework for java. Spark has a kotlin DSL https://github.com/perwendel/spark-kotlin

Java Updated: 4 mo ago License: Permissive

vaex by vaexio

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀

Python Updated: 6 d ago License: Permissive

industry-machine-learning by firmai

A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)

Jupyter Notebook Updated: 4 mo ago License: No License

h2o-3 by h2oai

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Jupyter Notebook Updated: 5 d ago License: Permissive

pentaho-kettle by pentaho

Pentaho Data Integration (ETL), a.k.a. Kettle

Java Updated: 5 d ago License: Permissive

zeppelin by apache

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Java Updated: 4 mo ago License: Permissive

alluxio by Alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Java Updated: 6 d ago License: Permissive

hazelcast by hazelcast

Open-source distributed computation and storage platform

Java Updated: 5 d ago License: Proprietary
