TensorFlowOnSpark | TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters

 by Yahoo | Python | Version: 2.2.5 | License: Apache-2.0

kandi X-RAY | TensorFlowOnSpark Summary

TensorFlowOnSpark is a Python library typically used in Big Data applications involving TensorFlow, Spark, and Hadoop. It has no reported bugs or vulnerabilities, provides a build file, carries a permissive license, and has medium support. You can install it with 'pip install tensorflowonspark' or download it from GitHub or PyPI.

TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters. By combining salient features from the TensorFlow deep learning framework with Apache Spark and Apache Hadoop, TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers.

            kandi-support Support

              TensorFlowOnSpark has a moderately active ecosystem.
              It has 3781 stars and 968 forks. There are 286 watchers for this library.
              It has had no major release in the last 12 months.
              There are 7 open issues and 355 have been closed. On average, issues are closed in 70 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of TensorFlowOnSpark is 2.2.5.

            kandi-Quality Quality

              TensorFlowOnSpark has 0 bugs and 0 code smells.

            kandi-Security Security

              TensorFlowOnSpark has no reported vulnerabilities, and neither do its dependent libraries.
              TensorFlowOnSpark code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              TensorFlowOnSpark is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              TensorFlowOnSpark releases are available to install and integrate.
              Deployable package is available in PyPI.
              Build file is available. You can build the component from source.
              Installation instructions, examples and code snippets are available.
              TensorFlowOnSpark saves you 2824 person hours of effort in developing the same functionality from scratch.
              It has 6192 lines of code, 374 functions and 54 files.
              It has high code complexity, which directly impacts the maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed TensorFlowOnSpark and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality TensorFlowOnSpark implements and to help you decide whether it suits your requirements.
            • The main function
            • Parse command line options
            • Get existing instances in the given cluster
            • Validate Spark version
            • Get the DNS name for a given instance
            • Runs a function in parallel
            • Return the reservations
            • Configure environment variables for Spark
            • Adds a meta
            • Feed inference data
            • Feed partitions into the shared queue
            • Terminate the queue
            • Train training data
            • Performs inference on a dataset
            • Create a keras model
            • Save a DataFrame as TFRecord
            • Wait for all reservations to complete
            • Load image
            • Load TFRecord files into Spark DataFrames
            • Run model tf
            • Parse command line options
            • Example function
            • Shutdown TensorFlow workers
            • Perform training
            • Feed partitions into partitions
            • Runs tf2
            • Gets next batch from the queue
            • Install external libraries

            TensorFlowOnSpark Key Features

            No Key Features are available at this moment for TensorFlowOnSpark.

            TensorFlowOnSpark Examples and Code Snippets

            GSoC: Holmes Automated Malware Relationships, Introduction
            Scala | Lines of Code: 12 | License: Permissive (Apache-2.0)

            PreProcessingConfig.scala
            get_VT_signatures.scala
            get_labels_from_VT_signatures.scala
            get_features_from_peinfo.scala
            get_features_from_objdump.scala
            get_labels_features_by_join.scala

            spark-submit \
              --master spark://master:7077 --py-files /Folder
            PySpark and argparse
            Python | Lines of Code: 5 | License: Strong Copyleft (CC BY-SA 4.0)

            export SPARK_HOME=/home/user/spark-2.4.0-bin-hadoop2.7/
            export PYSPARK_PYTHON=python3
            export PYSPARK_DRIVER_PYTHON=python3
            export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=python3"
            
            Using Python class methods on an RDD
            Python | Lines of Code: 7 | License: Strong Copyleft (CC BY-SA 4.0)
            def tokenize(x):
              return tok.tokenize(x[0])
            
            rdd1.map(tokenize).take(5)
            
            AttributeError: 'Tokenizer' object has no attribute '_Tokenizer__html2unicode'
            
            model.compile(loss='categorical_crossentropy',
                          optimizer=tf.train.RMSPropOptimizer(learning_rate=0.001),
                          metrics=['accuracy'])
            
            Dense(1, activation="Softmax")
            
            def generate_rdd_data(dataRDD):
                return dataRDD, keras.utils.to_categorical(dataRDD, num_classes=14)
            
            TensorFlow Multiprocessing; UnknownError: Could not start gRPC server
            Python | Lines of Code: 2 | License: Strong Copyleft (CC BY-SA 4.0)

            woker:1 log
            

            Community Discussions

            QUESTION

            ValueError: Error when checking target: expected dense_2 to have shape (1,) but got array with shape (14,)
            Asked 2019-Mar-09 at 07:16

            I am trying to train a classification model in a distributed way, using the TensorFlowOnSpark library developed by Yahoo. I am following the example at the github link.

            I am using a dataset other than the MNIST data used in that example. After preprocessing, my dataset has dimensions (260000, 28047), and the classes (labels) range from 0 to 13.

            ...

            ANSWER

            Answered 2019-Mar-09 at 07:16

            As pointed out in a comment by @Matias, you are using the wrong loss function.

            Sparse categorical cross entropy is used when your target is an integer label like 0, 1, 2, ..., 13. But your targets are one-hot encoded, e.g. [0, 0, ..., 1, 0].

            So use categorical cross entropy instead.
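            To see why the two losses expect different target encodings, here is a plain-Python sketch (my own illustration, not code from the answer; `probs` and the labels are made-up values) comparing the two formulations:

```python
import math

def categorical_ce(y_onehot, probs):
    """Cross entropy for a one-hot target vector: -sum(y_i * log(p_i))."""
    return -sum(y * math.log(p) for y, p in zip(y_onehot, probs))

def sparse_ce(label, probs):
    """Cross entropy for an integer class label: -log(probs[label])."""
    return -math.log(probs[label])

probs = [0.1, 0.7, 0.2]      # a model's predicted class probabilities
one_hot = [0.0, 1.0, 0.0]    # the same target for class 1, one-hot encoded

# Both formulations compute the same quantity when the encodings match:
print(abs(categorical_ce(one_hot, probs) - sparse_ce(1, probs)) < 1e-12)  # prints True
```

            In Keras terms, sparse_categorical_crossentropy applies the integer-label form, while categorical_crossentropy applies the one-hot form; with one-hot targets over 14 classes, the final layer should also produce 14 outputs.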

            Source https://stackoverflow.com/questions/54918672

            QUESTION

            Error in the first step of running this example: TensorFlowOnSpark on a Spark Standalone cluster
            Asked 2018-May-22 at 15:04

            I have a problem running this TensorFlowOnSpark example on a Spark Standalone cluster (single host):

            After executing the mnist_data_setup.py file, the MNIST zip files are extracted correctly. But when the extract_images(filename) function is called, an error occurs. Please see the error in the following:

            ...

            ANSWER

            Answered 2018-Feb-20 at 22:38

            I think that in the call to open, you provide a file object instead of a string for the name argument.

            Digging further:

            In images = numpy.array(mnist.extract_images(f)), f is a file object.

            But the line with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream: treats the argument passed in images = numpy.array(mnist.extract_images(f)) as a filename.

            This behavior does not appear in the latest version: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/mnist.py
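            The filename-versus-file-object mismatch can be reproduced with the standard library alone. This sketch (my own illustration, not code from the question) shows gzip.GzipFile rejecting a file object passed where a filename string is expected, while accepting the same object via fileobj=:

```python
import gzip
import io

payload = b"fake MNIST bytes"

# Build an in-memory gzip stream to stand in for a downloaded .gz file.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(payload)

# Passing the file object where a filename string is expected fails,
# because GzipFile tries to open() its first positional argument:
buf.seek(0)
try:
    gzip.GzipFile(buf)
except TypeError as err:
    print("TypeError:", err)

# Passing it explicitly as fileobj= works as intended:
buf.seek(0)
with gzip.GzipFile(fileobj=buf) as gz:
    assert gz.read() == payload
```

            The same kind of confusion arises in the question above: extract_images receives a file object, while the inner code path expects a filename.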

            Source https://stackoverflow.com/questions/48888762

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install TensorFlowOnSpark

            TensorFlowOnSpark is provided as a pip package, which can be installed on single machines via pip install tensorflowonspark. For distributed clusters, please see our wiki site for detailed documentation on specific environments, such as our getting started guides for single-node Spark Standalone, YARN clusters, and AWS EC2. Note: the Windows operating system is not currently supported due to a known issue.
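            As a rough sketch of a single-machine setup (the master URL, paths, script name, and flags below are hypothetical placeholders, not commands from this page; see the wiki guides for the real per-environment steps):

```shell
# Install the pip package on each machine
pip install tensorflowonspark

# Hypothetical submission against a local Spark Standalone master;
# substitute your own master URL, application script, and arguments.
spark-submit \
  --master spark://localhost:7077 \
  --py-files /path/to/shared_code.py \
  /path/to/training_script.py
```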

            Support

            Please join the TensorFlowOnSpark user group for discussions and questions. If you have a question, please review our FAQ before posting. Contributions are always welcome. For more information, please see our guide for getting involved.
            Find more information at:

            Install
          • PyPI

            pip install tensorflowonspark

          • CLONE
          • HTTPS

            https://github.com/yahoo/TensorFlowOnSpark.git

          • CLI

            gh repo clone yahoo/TensorFlowOnSpark

          • sshUrl

            git@github.com:yahoo/TensorFlowOnSpark.git
