HiBench | HiBench is a big data benchmark suite
kandi X-RAY | HiBench Summary
HiBench is a big data benchmark suite that helps evaluate different big data frameworks in terms of speed, throughput and system resource utilization. It contains a set of Hadoop, Spark and streaming workloads, including Sort, WordCount, TeraSort, Repartition, Sleep, SQL, PageRank, Nutch indexing, Bayes, Kmeans, NWeight and enhanced DFSIO. It also contains several streaming workloads for Spark Streaming, Flink, Storm and Gearpump.
Top functions reviewed by kandi - BETA
- Main method for testing
- Generate a local bitmask command file
- Configure the job1 stage1
- Configure stage 5
- The main method
- Configure the phase 2
- Run Saxpy
- Multiply a block vector
- Multiply a block vector by a vector
- Make a block of encoded data from an output file
- Load query node info
- Deserialize fields
- Demonstrates how to run the Mahout algorithm
- Main method to submit a Phoenix job
- Generate a page words and titles
- Reduce keys and values
- The main entry point
- Compute the dot product of two matrices
- Demonstrates how to submit a Phoenix job
- Run a map job
- Calculate min block vector
- Performs a bit-OR operation on a block vector
- Main entry point
- Main method
- Submit a map job
- Create a record writer
HiBench Key Features
HiBench Examples and Code Snippets
Community Discussions
Trending Discussions on HiBench
QUESTION
I am running some experiments to test the fault tolerance capabilities of Apache Flink. I am currently using the HiBench framework with the WordCount micro benchmark implemented for Flink.
I noticed that if I kill a TaskManager during an execution, the state of the Flink operators is recovered after the automatic redeploy, but many (all?) tuples sent from the benchmark to Kafka are missed (stored in Kafka but never received by Flink).
It seems that after the recovery, the FlinkKafkaConsumer (the benchmark uses FlinkKafkaConsumer08), instead of resuming from the last offset read before the failure, starts reading from the latest available offset, losing all the events sent while it was down.
Any suggestion?
Thanks!
ANSWER
Answered 2018-Apr-09 at 14:46
The problem was with the HiBench framework itself and the Flink version it used.
I had to update the version of Flink in the benchmark in order to use the "setStartFromGroupOffsets()" method in the Kafka consumer.
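A minimal sketch of what the fix looks like on the consumer side. This is an illustration, not HiBench's actual source: the topic name, group id, broker and ZooKeeper addresses are placeholders, and the checkpoint interval is arbitrary. The key points are that checkpointing must be enabled so Kafka offsets become part of Flink's recoverable state, and that `setStartFromGroupOffsets()` (available only in newer Flink releases, hence the version upgrade) controls where a fresh start resumes reading.

```java
import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaResumeSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing makes the consumer's Kafka offsets part of Flink's
        // fault-tolerant state, so a TaskManager failure resumes from the
        // last checkpointed offset instead of losing in-flight events.
        env.enableCheckpointing(5000); // interval is a placeholder

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("zookeeper.connect", "localhost:2181"); // needed by the 0.8 consumer
        props.setProperty("group.id", "hibench-wordcount");       // placeholder group id

        FlinkKafkaConsumer08<String> consumer = new FlinkKafkaConsumer08<>(
                "wordcount-topic",            // placeholder topic
                new SimpleStringSchema(),
                props);

        // On a start without a checkpoint, resume from the committed group
        // offsets rather than jumping to the latest record. When restoring
        // from a checkpoint, the checkpointed offsets take precedence.
        consumer.setStartFromGroupOffsets();

        DataStream<String> lines = env.addSource(consumer);
        lines.print();

        env.execute("kafka-resume-sketch");
    }
}
```

Running this requires a Flink cluster with the Kafka 0.8 connector on the classpath and a reachable Kafka broker, so it is a configuration sketch rather than a standalone program.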
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install HiBench
Run HadoopBench
Run SparkBench
Run StreamingBench (Spark streaming, Flink, Storm, Gearpump)